Mentioned in

Trying to approximate Statistical Models as Scoring Tables

3Steven Byrnes

1IlyaShpitser

New Comment

Whatever I end up learning about Bayesian networks might not transfer to messier architectures.

I don't think you should take it as a given that AGI will involve "messier archictures"; I think there's at least a fighting chance that one of the core components of an AGI will be literally a Bayesian network, or at least something pretty close to that, basically because of this paper and related neuroscience-type things.

As such, I have a general good feeling about this line of research, even if I haven't tried to follow the details of your post.

For my PhD research, I have been tinkering with the idea of approximating complex, non-linear statistical models (Bayesian networks) as simple, quasi-linear models.

I am making some headway into the problem, but I feel I need yet to clarify what exactly I want to do, and why this is important.

I am writing this blogpost to 1) spell out the idea better (and get questions and pointers on how to explain it better) and 2) get feedback from the community on which parts of this feel interesting / useful, and which parts I could drop.

First I will explain why I am working on approximating Bayesian networks. Then I will explain the solution I came up with. Finally I will discuss issues with my current approach and next steps.

If you want something to play with, try out

this public prototype. It showcases a (clunky) solution to the problem, and might help understand what I am trying to do.Requisites: you should be familiar with Bayesian Networks, d-separation, belief propagation and vector logodds to understand the gist of this post. Shapley values and double crux are also mentioned in passing, though you should be able to skip the parts where they are mentioned and still follow the post. If there is anything else that seems unclear, you should probably let me know, because it would help me understand what do I need to explain better.Theory of researchMy (post-hoc) reasoning to have been focusing on this problem is something like this:

[REF]. They also try to go for this route of identifying “paths of influence”, but their operationalization of argument strength using minimum link strength feels somewhat ad-hoc.I think there is a non-trivial chance that I am just fooling myself into thinking that this is something worth pursuing - in my current grant program I have significant pressure to focus on explaining bayesian networks and focus on textual explanations. So here are the reasons why my theory of research might be wrong:

I’d be quite interested in hearing other critiques of this theory of research.

What I have been doingI have spent the last 6 months thinking about the concrete problem I pointed out above - how to modularize a Bayesian network as a series of arguments and then how to assign importance to each of them. In LessWrong lingo, what I have been doing is a way of extracting

cruxesfrom Bayesian models.My current approach is something inspired by Rudin et Utsun’s paper on Scoring Systems

[REF], combined with belief propagation[REF].Essentially, what I have been doing is:

d-separationto list down all conditions where those paths are “open”. This requires the first node of the path to be observed, we need to observe each collider (intermediate node with two incoming arrows) or at least a descendent of it, and to not observe any other intermediate node.This is dense and complex, so let’s walk through an example. You can also try out the end result of this process in

this interactive notebook.We will use the Asia network. The outcome we are interested in is whether the patient has lung cancer, which corresponds to the variable

`lung`

taking value`yes`

.1. Finding all possible argumentsWe first find all simple (undirected) paths to

`lung`

. For example one such path is`smoke`

→`lung`

. Another more convoluted path is`asia`

→`tub`

→`either`

←`lung`

.2. Find the conditions where the arguments applyLet’s focus on the path

`asia`

→`tub`

→`either`

←`lung`

. For this path to be open, we need to observe the first node “asia”, and we need to observe the collider`either`

or one of its descendents ie`xray`

or`dysp`

.This gives rise to 12 instantiations of the argument, eg (

`asia=yes`

,`either=yes`

), (`asia=yes`

,`either=no`

), (`asia=yes`

,`xray=yes`

), (`asia=yes`

,`dysp=yes`

), etc.3. Quantify the importance of each argumentLet’s focus on the argument

`asia`

→`tub`

→`either`

←`lung`

when (`asia`

=`yes`

,`xray`

=`yes`

). We run delta belief propagation from`asia`

to`tub`

, then from`xray`

to`either`

, and finally we quantify how the change in`tub`

affects`tub`

given the background change in`either`

due to the change in`xray`

. This sounds convoluted, but the end result is an evidence vector that approximates how much this argument affects`lung`

. We translate this vector into a logodd change, and record it. See the appendix for more info on how this works exactly.4. Order all the arguments by importanceThis is straightforward, given that we already have recorded the importance of each argument in terms of logodd changes. See figure 4 for the result.

5. Apply this for a concrete inputLet’s suppose we observe

`{'asia': 'no', 'dysp': 'yes', 'bronc': 'no', 'tub': 'no'}`

We can then find all arguments that apply under these conditions. So for example the argument

`tub=no`

→`either`

←`lung`

works because we have observed`tub=no`

and`dysp=yes`

to d-open the path. But the argument starting from`asia=no`

does not work, since`tub=no`

blocks the influence of`asia`

on the rest of the diagram.The former argument has an associated score of 0.3 logodds (as we can see in row 15 of figure 4), so it will shift our credences slightly in favor of the target outcome

`lung`

=`yes`

.Because of how delta belief propagation works, we can give explanations of how the evidence propagates along each proposed chain of argument. I have chosen to give just a brief automated textual description, see figure 5 for an example.

Does this work?Here there are some results:

CHILDthe results are weirdly off by 0.5 logodds, which suggests a systematic error I do not quite know how to debugOne problem I have here is that it is not entirely clear what I am aiming to achieve. Some things I could aim for are:

Accuracy. Getting results that are reasonably close to belief propagationScrutability. Figuring out whether users would be able to identify reasoning mistakes I deliberately introduce in the network.Simulability. Whether users can follow and replicate the reasoning of how to apply the scoring table, so that we can hand them the table and have domain experts audit itWhat are your thoughts?

Wrapping upIn summary: I am trying to extract from a Bayesian model the most important considerations on why an input should affect our beliefs about an outcome.

I have a system that does something, but I am confused on how to evaluate it and the grander point of what I am trying to achieve.

Theoretically, I can see some issues with the current framework. The main ones I see are:

`either=yes`

←`lung`

and that`tub=yes`

→`either=yes`

←`lung`

are treated as two different arguments but really it should be treated as onedouble crux.I have some ideas on how to work on these, but it will be a while until I can make those more concrete and code them up. I’d be interested in other ideas to solve these, and pointers to other issues.

Before I get to address them, I plan to spend some time clarifying what I am trying to do. In order to do that, my first plan is to just demo this system to a bunch of XAI researchers, and see what they have to say about it, plus reading more literature on the goals of the field to see if I can steal any ideas.

Appendix: delta propagationThe key concept that ties together the algorithm is what I’ve taken to call delta belief propagation. Do not expect a clean explanation because I don’t have one to offer (yet).

Roughly, this is a way of taking an evidence vector δX over a variable X and a factor ϕ over variables X,Y,Z1,…,Zn and output an evidence vector δY over variable Y, that roughly corresponds to how much of a difference the evidence δX makes over Y, given the probabilistic relations implied by ϕ.

Because the input and output of this mecanism is both an evidence vector, it is easy to see how this can be used recursively to study the effect of a path of variables such as the ones we are interested in studying.

How do we actually go about implementing this? My take is to multiply δX and ϕ, then marginalize all variables but Y. So far this would be equivalent to belief propagation. However, this mixes the information intrinsically in ϕ together with the information that comes from δX. To isolate the latter, we divide the result by the result of marginalizing ϕ without multiplying by δX.

(to address that sometimes we are interested in studying the effect as mediated by a collider, we also allow additional context vectors δZ that will be multiplied by ϕ before any other computation)

In equation form:

δY=ϕ1/ϕ2

ϕ1=∑J≠Y(ϕ⋅δX)

ϕ2=∑J≠Yϕ

In code form:

Jaime Sevilla is a researcher from Aberdeen University. He is sponsored by the NL4XAI program.I thank my supervisors Ehud Reiter, Nava Tintarev and Oren Nir for supporting me and encouraging me to write out my ideas.