Less Wrong Book Club and Study Group

Do you want to become stronger in the way of Bayes? This post is intended for people whose understanding of Bayesian probability theory is currently somewhat tentative (between levels 0 and 1 to use a previous post's terms), and who are interested in developing deeper knowledge through deliberate practice.

Our intention is to form an online self-study group composed of peers, working with the assistance of a facilitator - but not necessarily of a teacher or of an expert in the topic. Some students may be somewhat more advanced along the path, and able to offer assistance to others.

Our first text will be E.T. Jaynes' Probability Theory: The Logic of Science, which can be found in PDF form (in a slightly less polished version than the book edition) here or here.

We will work through the text in sections, at a pace allowing thorough understanding: expect one new section every week, maybe every other week. A brief summary of the currently discussed section will be published as an update to this post, and simultaneously a comment will open the discussion with a few questions, or the statement of an exercise. Please use ROT13 whenever appropriate in your replies.
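For participants who have not used ROT13 before, here is a minimal Python sketch of hiding and revealing a spoiler. The helper name `rot13` and the sample sentence are illustrative assumptions, not part of the post; it relies on the `rot_13` text codec in Python's standard library.

```python
import codecs


def rot13(text: str) -> str:
    """Rotate each ASCII letter 13 places; other characters pass through."""
    return codecs.encode(text, "rot_13")


# Hypothetical spoiler text, used only for illustration.
spoiler = "The answer to the exercise is 1/3."
hidden = rot13(spoiler)

print(hidden)         # the obscured form to paste into a reply
print(rot13(hidden))  # applying ROT13 again recovers the original
```

Because the alphabet has 26 letters, the same function both encodes and decodes.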

A first comment below collects intentions to participate. Please reply to this comment only if you are genuinely interested in gaining a better understanding of Bayesian probability and willing to commit to spend a few hours per week reading through the section assigned or doing the exercises.

As a warm-up, participants are encouraged to start in on the book:


Most of the Preface can be safely skipped. It names the giants on whose shoulders Jaynes stood ("History", "Foundations"), deals briefly with the frequentist vs Bayesian controversy ("Comparisons"), discusses his "Style of Presentation" (and incidentally his distrust of infinite sets), and contains the usual acknowledgements.

One section, "What is 'safe'?", stands out as making several strong points about the use of probability theory. Sample: "new data that we insist on analyzing in terms of old ideas (that is, models which are not questioned) cannot lead us out of the old ideas". (The emphasis is Jaynes'. This has an almost Kuhnian flavor.)

Discussion on the Preface starts with this comment.

I found that even where I can parse a technical text (understand all introduced notions, without needing to look up the notions that are used without being defined), it's not a sufficient condition for me being ready for the text. It takes a lot of background effort to build technical fluency that allows one to take away a deeper and lasting understanding of a given topic, fluency that isn't required to merely parse the text, or even solve the exercises and ace the exam. Without this fluency, without being prepared, acquired knowledge remains superficial, never becomes very useful, and quickly fades out of memory.

It's like reading a novel in a barely known foreign tongue, translating with a dictionary, and juggling the syntax without feeling the flow of the language. Technically, you can translate everything, but there is no hope for understanding the subtle points of the narrative, and the only way to get there is through obtaining fluency first, and reading the novel later.

What this tells me is that where I can't even parse a text on my own (i.e. there is a non-negligible number of statements I can't understand, or exercises I don't see how to solve), this is an absolutely unambiguous indicator that I'm not ready to try this particular text, and should work on something more elementary.

(This is a strategy for building deep knowledge of a favored subject; it's much more useful to skim in order to obtain superficial general knowledge of many diverse subjects, although elementary textbooks should still be the way to go, not recent research papers.)

Thanks. "Fluency" is exactly the concept I needed to argue with the people who say you don't need to know things, you just need to know where to look them up.

It's like reading a novel in a barely known foreign tongue, translating with a dictionary, and juggling the syntax without feeling the flow of the language.

Hm. This is not quite on-topic, but I learned English that way, lo those many years ago. My dad is an avid science-fiction fan and his collection resided in the loo. As a kid I loved sci-fi, I'd gotten hooked after reading Star Beast (translated, natch). And every time I went for a twinkle there was this treasure trove, mocking me because I couldn't understand a word of it.

So one day (must have been 11ish) I picked up an Asimov, a dictionary and a French translation. (Why? The Loo Library was sorted by author; I started with the A's.) I struggled through that first book, I have no idea how I didn't just give up, but I made it through. The next one came easier, I didn't rely on a translation. After a while I could pretty much do without the dictionary, and started building my vocabulary from unknown words made clear by contextual clues. By the time I started on Heinlein I was getting top grades in English at school, a nice added bonus.

I'd agree that my real fluency only came later, after I also got some practice writing. But I disagree that tackling a difficult but intrinsically rewarding work isn't a good way to enter a previously unknown domain of knowledge.

Further to that, and getting back on topic, there are places in Jaynes where I can tell that I'm missing some pieces of background knowledge (e.g. familiarity with binomial coefficient manipulations) that others are likely to have. These gaps don't really detract from my understanding of what's going on, but they do make it harder for me to reproduce the derivations, and having others to help me would add some "icing on the cake" to my appreciation of the math.

I'd agree that my real fluency only came later, after I also got some practice writing. But I disagree that tackling a difficult but intrinsically rewarding work isn't a good way to enter a previously unknown domain of knowledge.

I thought so as well, for many years, and it cost me dearly. The problem with math is that the more elementary tools won't even be mentioned in advanced specialized books, and are not necessary to parse them. The only efficient way to obtain them is to study from the ground up. Until recently, I was getting along relatively fine on my ability to parse more advanced texts, but remained much weaker and more near-sighted than I could have been. This made much of my previous study a waste of time, only moderately helping me to recapture the territory now.

I think one should learn on different levels at the same time:

If you only do what's convenient, your progress stops.

If you don't revisit the basics from time to time, you build on sand.

It is necessary to challenge oneself and at the same time work on the fundamentals. It is both inspiring and necessary to strike the right balance between the two extremes: A constant back and forth between them proved to be both the most productive and most entertaining for me personally.

This is the reason I am also interested in this study group: For me, it is revisiting the fundamentals. Although this book is relatively basic, it is very well written and focuses more on the right philosophy than on the actual pragmatic issues. On the other hand, it is very detailed in places that other books easily take for granted, and it points out issues that others just step over. It is really a great read for deepening one's knowledge. I am unsure, though, whether it is the best introductory reading for someone who just wants to acquire a practical skill set.

If you only do what's convenient, your progress stops.
If you don't revisit the basics from time to time, you build on sand.

This should be engraved somewhere in big letters.

Hm. This is not quite on-topic, but I learned English that way, lo those many years ago.

So have I, but this didn't help with a subtle understanding of the novels I spent on learning. (I'm currently breaking into fluency in German using TV shows, and find this method more enjoyable.)

This is not quite on-topic, but I learned English that way, lo those many years ago.

Wow, I was planning to ask if English wasn't your native language, since you said you lived in Paris but speak English like a native speaker ... guess that one's resolved! So French is your native language then?


I tried, many years later, to do the same for German. Süskind's Die Taube was my first attempt, followed by Remarque's Der Funke Leben, I still have a few on my bookshelves. It didn't really work out - I can usually understand the gist of a text or conversation in German on a familiar topic, but I didn't have the motivation or the opportunities to stay invested in German as I was invested in English, and never achieved anywhere near the same fluency. There was also a more recent and short-lived attempt to start learning Icelandic. On the whole I'd rate my language-learning ability as unremarkable, but I have lots of experience with English.

Is this a roundabout way of saying that people 'between levels 0 and 1' should probably start with a more introductory text? Do you have any specific recommendations?

This is a comment about usefulness of a book club (for technical books of any difficulty), not about any book or topic in particular. In the lingo of levels of understanding, my argument is that you should finish a book with a deeper level than is possible to achieve if you barely understand what's explicitly written, and otherwise you shouldn't even start. If Jaynes reads easily, and you seek the knowledge it contains, read Jaynes (on your own). If you need help with reading Jaynes, don't read Jaynes at all, find something simpler. Maybe after obtaining more basic knowledge you'll discover that you shouldn't read Jaynes because it doesn't teach what you want to learn, even when you become ready for it.

Even when I am perfectly capable of understanding something myself, I find it extremely helpful to learn with other interested people in a "study, study, then discuss" kind of format. I come to my own conclusions about the text while I study, just as I would if I were working by myself, but then I additionally get to compare the details of my conclusions with those of other minds working independently. Also, having participants with diverse intellectual backgrounds means that they may be able to identify and share interesting tangential ideas that would not have occurred to me alone.

I also find that communicating my thoughts to other people forces me to clarify them to a greater degree, often revealing small gaps in understanding that I had papered over in my own mind.

My interest in the book club is as an anti-akrasia strategy. I've read the first few chapters of Jaynes and find it easy enough to understand but have not finished it because I generally have trouble with finding motivation to read technical books when I have no immediate application for the subject matter.

My interest in the book club is as an anti-akrasia strategy.

That could work, but there are other things you could be doing with your time, given that it's not a fun/useful enough activity to drive you without additional help.

Yes, such as the things I am currently doing with my time. This would be akrasia.

This is a comment about usefulness of a book club

You haven't really said anything about the book club aspect. Might that help with the obstacles you mention? For one thing, it's often difficult to figure out the prereqs, but people in the club may know. One thing that does make your comments particularly applicable to clubs is that they suggest the choice of book should be individual.

The most important parts of a technical book are often the non-technical portions, and this is especially true with Jaynes.


This is reflected in the fact that great artists are invariably technical virtuosos. Mastery makes way for creativity.

This is due to limited working memory. You may be able to juggle the concepts/math of a particular field in working memory, but that takes away precious space for the combinatorial exploration of novel ideas, or even higher level concepts. Only with practice, when most of the steps in your thought processes can be carried out subconsciously, are you free to do higher-level thinking.

It's not all about working memory of course, since there is subconscious exploration going on as well. Still, things must surface to working memory to be checked that they make sense.

There is also the fact that concepts are built upon concepts. To think at a higher level, you have to truly understand how concepts of the lower level work. It is simply impossible to do it all with limited working memory capacity.

So, you disagree with this? http://lesswrong.com/lw/gl/eric_drexler_on_learning_about_everything/

Don't drop a subject because you know you'd fail a test — instead, read other half-understandable journals and textbooks to accumulate vocabulary, perspective, and context.

and this, too http://radicalacademy.com/adlerhardreading.htm

I wrote the last paragraph specifically to distinguish this use case. And I set a higher standard - I said that even if you ace (not fail!) the exam, it's still not enough.

I liked the last slide a lot. (The others, too.) One reason I'm putting this post up is I plan to teach my kids this stuff, and I want to know how to explain it.

My outside view, in that I have rarely seen online book clubs or group readings ever work out*, is that you will probably fail.

Many times the attempts seem to founder on a lack of clear objectives or a clunky technical setup. I suggest you work on these; many short, automatically graded exercises, and a quick easy interface to them may work for a math-heavy PT:TLOS.

(Even if you did set up such a site - which would be a great resource - I am still pessimistic. I have PT:TLOS, and it requires quite a bit of math. Better know your calculus well.)

(Oh, and you might want to make a PDF version with the various corrections added in. Little is more frustrating than errors in a math book.)

(Also of relevance: http://groups.yahoo.com/group/etjaynesstudy/ )

* offhand: one failure to read through Dune, 2 different groups failing to read through The Book of the New Sun, multiple failed attempts at The Structure and Interpretation of Computer Programs and Real World Haskell, and no doubt some others that escape me now

How do you unpack "fail" or "flounder" in this context? What counts as success?

If your criterion for success is "taking a group of people through the entire text without most of them dropping out" then yes, I also expect failure, but I wouldn't be bothered by it much.

Here relative success seems more important. On a personal level this is an opportunity for me to get more out of Jaynes, through comparing notes with others, than I would otherwise have. Symmetrically others should get the same benefit. Some people will invest time in studying the book that they may not otherwise have, and depending on their goal just that may count as a success.

I've had a very satisfactory prior experience participating in an online SICP study group (i.e. I learned quite a bit about programming), and I was peripherally involved in a brown bag club that tackled Jerry Weinberg's Quality Software Management; the people who did stick with it got a lot out of it.

How do you unpack "fail" or "flounder" in this context?

I would also add in 'founding members spend a great deal of effort and time on group-related activity such that they get less out of studying than they would otherwise have'. That, combined with the usual massive attrition just in the first few chapters...

This is the primary reason I gave my location - if I'm meeting with a group of people every week, we can go over the problems together in a high-bandwidth format.

offhand: one failure to read through Dune, 2 different groups failing to read through The Book of the New Sun, multiple failed attempts at The Structure and Interpretation of Computer Programs and Real World Haskell

Great list! You must have some cool friends, though apparently somewhat undermotivated.

If anyone has already reformatted (automatic conversion is not as good as one might hope) to an ebook format for easy Kindle use, I'd appreciate it. For a good non-DRM conversion, I'd be happy to pay up to $72 (the current Amazon.com price for a new dead-tree copy).

I'm using the Sony Reader, and I've found the PDFs to be adequate so far. What specifically is your problem?

Please reply to this comment if you intend to participate, and are willing and able to free up a few hours per week or fortnight to work through the suggested reading or exercises.

Please indicate where you live, if you would be willing to have some discussion IRL. My intent is to facilitate an online discussion here on LW but face-to-face would be a nice complement, in locations where enough participants live.

(You need not check in again here if you have already done so in the previous discussion thread, but you can do so if you want to add details such as your location.)

I'm not exactly between 0 and 1...But I have some hours available here, and would like to do this. I've been through bits of Jaynes, but the social aspect will make doing the whole thing more interesting.

FWIW, I've a math degree, and have 20 years of technical (math, software, etc.) teaching expertise, if you'd like some assistance.

I'd suggest to everyone who hasn't as much tech-teaching experience that time spent doing exercises is the only thing that you should be counting as learning-time. Time spent reading has no feedback system, and you don't know (despite believing) whether you've learned anything. Do-->Learn. Read-->???

if you'd like some assistance.

That would be wonderful, thanks.

Time spent reading has no feedback system

Book discussions can counteract that to some extent: we will be asking questions about the material, and participating in such a discussion can correct misconceptions or prompt you to pay closer attention to something that struck you as trivial at first.

I will also be in the vicinity of the Bay Area from June 12 to late September, and would be quite happy to give the study group a try. I attempted a full read of Jaynes' book about a year ago, and realized about 70% of the way through that I didn't have all the mathematical background necessary to fully appreciate it.

A zipped archive of all the chapters, which seemed to be missing on the pages linked in the top-level post, is available here.

I'm currently trying to go through Jaynes:PTTLOS myself. As mentioned earlier in these comments: you hardly learn anything by reading alone, you need to discuss or solve exercises. So of course I would love to join a group!

I am currently in Manchester, UK, but will spend most of August and early September on the road, without regular internet access. After that I do not know for sure where I will live or how much time I can commit. But I am very interested!

I also do have a high-quality .pdf-version of the book. Apparently it is the first edition and has no links but it is not simply a scan of the pages of the book! That means the formulae and diagrams are all very high quality and you can do a full text search. I am not sure on the legal status though, probably it is an "only for private study use"-version that one is not supposed to make publicly available. What is the LW-policy concerning links to such content?

Non-profit doesn't change anything as far as I know (IANAL as they say).

I'm pretty sure that people who want to get a copy of the book can get it based on information they already have, and my recommendation would be to not expose yourself to legal risks.

I don't know what the copyright status is - the edition at http://bayes.wustl.edu/ was removed at the request of the publisher, so it might not be good.

I finally registered just to participate in this.

I'm living in Buffalo, NY for the summer if anyone is up for a meetup.

Will participate (online only, living in Serbia). Additional back-and-forth on IRC seems like a good idea.

I'm in. Living in New Haven, CT. Though I wouldn't call myself "Between 0 and 1".

I intend to participate, and I publicly commit 3 hours of reading/thinking/problems and one hour of discussion per week (negotiable). Face-to-face in Seattle would be great. IRC or moderated real-time chat would be great as well. Weekly new posts and comment threads on LW would be less ideal, but I'm willing to give it a go.

I'll be interested; I'm going to run a similar club on David Deutsch's Fabric of Reality, and this would be a nice complement.

Oxford for the next few weeks, then in the south east of the UK.

Thank you very much for initiating this Morendil. I have lurked at LessWrong since Day #1 and came very close to getting an account several times before and submitting comments. This is my first one. I have studied Jaynes for years and spent many hours reading in the book and also his physics papers archived on the Jaynes website.

I look forward to some Probability Theory centered discussion and hope to contribute.

I am in Houston, Texas, USA.

I have been going through the book myself, having never taken probability, statistics, or combinatorics past the high school level. I'm studying to be a mathematician, so I decided to fix that by reading Jaynes on Eliezer's recommendation. I had an electronic edition, and just got the paper edition a couple days ago. I have a one week vacation starting now, so this will probably occupy a large fraction of it.

I've spent probably 15-30 hours in the last two weeks on it, but I've been having trouble with the problems (which is unusual for me), and was looking for a study group. I didn't think it was appropriate for Less Wrong (clearly not true!) and I found someone locally.

I would be happy to talk with other members online. I've never used Google Wave, but people can contact me via email or instant messenger until I get the hang of it. My throwaway is 'lispalien' on Yahoo! Messenger, or you can get another IM handle by contacting me any which way, including a private message here.

I'll give it a go. I will be in Iowa City until August, Philadelphia after that. Willing to meet IRL.

Good timing. This was among the next few on my list already. I would like to participate. I'm in Halifax, NS (Canada) and interested in trying IRL as well as online.

I live in Pittsburgh and would like to participate.

I'd like to participate. I live in a hidden location.

One would assume you already know what is to be known about inference, from examination of your own source code.

Of what type are your inference and decision algorithms?

I'm in. I live in Kenosha, Wi., on campus at UWP. No car.

Edinburgh, Scotland, would love to discuss in real life.

I have the dead tree version and have written about the first two chapters.

I got hung up in chapter three on the symmetry of the hypergeometric distribution. I've managed to sort that out, but haven't got going again due to poor health.
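The comment does not say which symmetry was the sticking point. One well-known symmetry of the hypergeometric distribution, in the usual urn notation (N balls, M of them red, n drawn without replacement, r red among those drawn), is that swapping M and n leaves the probability unchanged. The sketch below checks this numerically with exact fractions; the function name `hypergeom` and the test values are assumptions chosen for illustration, not taken from the book.

```python
from fractions import Fraction
from math import comb


def hypergeom(r: int, N: int, M: int, n: int) -> Fraction:
    """P(r red in n draws without replacement from N balls, M of them red)."""
    # Outside the support the probability is zero.
    if r < 0 or r > M or n - r < 0 or n - r > N - M:
        return Fraction(0)
    return Fraction(comb(M, r) * comb(N - M, n - r), comb(N, n))


# Illustrative values (assumed): 20 balls, 7 red, 5 drawn.
N, M, n = 20, 7, 5

# Symmetry check: exchanging the roles of M and n gives the same value.
for r in range(0, min(M, n) + 1):
    assert hypergeom(r, N, M, n) == hypergeom(r, N, n, M)

# Sanity check: the probabilities over all r sum to one.
total = sum(hypergeom(r, N, M, n) for r in range(0, n + 1))
print(total)
```

Using `Fraction` rather than floats keeps the comparison exact, so the equality test is not affected by rounding.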

I live in Waterloo, Ontario (Canada). Does anyone live nearby?

This sounds great - count me in. I'm in Toronto.

I'd like to join. I'm in Greenville, South Carolina.

I live in Melbourne, Australia, and am open to discussion IRL.

I'm (still) in!

I live in Davis, California, USA which is about an hour from the Bay Area.

I'm inclined to participate. I have some baseline knowledge of various discrete math but not much probability theory.

I'm in West Michigan, within forty-five minutes of Grand Rapids and two and a half hours of Chicago. I learn better face-to-face than online, so I'd be happy to meet.

I'll try. I know little calculus, though. Location is Finland.

I'm interested, definitely online, possibly IRL. I'm in London.

I've already read the book (the published paper version) without solving the exercises.

I'd be interested in participating in a technical discussion. Maybe (but not very probably) even IRL (Bay Area).

About time. Definitely in (will have to bump Subjective Probability: the real thing off the stack)

Edit: I work in NYC if anyone wants to complement this with some IRL action.

I attempted PT:TLOS back in September and made it to chapter 7 before realizing I had only been understanding 30% of it, and gave up. I'm certainly eager to have another stab at it.

In the Bay Area until early/mid August, then St. Louis for a week or two, then Pittsburgh for the school year.

I'm in. I'm located in Salt Lake City, UT, US. I would be very willing to have IRL discussions, if I can fit it around work.

I'll join in. I have some vacation coming up, but no more than a week at a time. In Denmark (fat chance).

I'm in. Started reading through it this past winter but stopped. Hopefully this group will provide some motivation.

Really tempted to participate; an IRL group would help. I am in London but spend lots of time in Guildford.

Count me in, I've had a copy of PT:TLOS sitting around forever, and I'm not too far into it.

I'm in Illinois and am often in Chicago on weekends.

I've been meaning to work through PT:TLOS forever, but never gotten it done. I'd be interested. (In the SF Bay Area for about a month longer, then in Finland.)

Participants in the DC area, reply to this comment.

(I can go anywhere where WMATA and Ride-On public transportation can take me, but spend a lot of time at the University of Maryland.)

Are there some places you would suggest as tentative meeting-points? I have the combination to get into one of the student lounges at the University of Maryland, College Park, if you want tables, chairs, and a whiteboard.

That'd probably be fine though not super convenient for me. Alternatively, the Georgetown University Library always has group study rooms free during the summer. What kind of math background do you have?

Georgetown University Library isn't impossible, although I'd want help with the directions (particularly on locating the right building and right room in the building). My statistics background is the usual one-semester college course*, but I've got the full engineering-student education in calculus (I went as far as the partial differential equations course) and a smattering of linear algebra in the course of studying graduate dynamics and finite element methods. I usually pick it up fairly easily.

* The catalog entry for STAT400 in the year I took it states

Random variables, standard distributions, moments, law of large numbers and central limit theorem. Sampling methods, estimation of parameters, testing of hypotheses.

I intend to participate, sounds like a great idea!

ETA: I live in Texas, on the northern part of the I-35 corridor. Anyone remotely nearby? (I'll feel lucky to find just one person.)

I live in Plano (i.e., for y'all far away, a bit north of Dallas). I might be interested in participating in a meatspace study group arrangement of some sort. I've never done something like this outside of university classes, dunno how it'd work out, except to guess that it probably depends strongly on individual personalities and schedules and such.

I've studied parts of the Jaynes book in the past. Recently I've been studying more specialized machine learning techniques, like support vector machines, but it seems clear that more time spent studying the more general and fundamental stuff would be time well spent in understanding specialized techniques, and the Jaynes book looks like a good candidate for such study.

Yippee! I'm in Waco and wouldn't mind meeting in Dallas (or Austin, if there are LWers there), but so far there's only two of us.

I'd like to join too. I'm in Hungary, in the vicinity of Budapest.

Still in. I'm in London until tomorrow and then back home to Melbourne, Australia.

This has been on my reading queue for ages, might as well join in!

I live in Seattle (technically on the border of Bellevue and Redmond), which makes me #3 for this area. Meetups would be great, though I'm unavailable weekdays until after 7 or so.

I think this could be a fun project.

Besides IRL (which is hard to organize) I think other real time communication could be tried out as well. What do you think about the following options:

  • Traditional IRC
  • Google Wave
  • Skype conference call
  • Realtime desktop sharing (e.g. mikogo up to 10 participants.)

Does anyone know a good IRC infrastructure that allows for quickly entering and displaying TeX formulas?

Does anyone know a good IRC infrastructure that allows for quickly entering and displaying TeX formulas?

There's a plugin for Pidgin called pidgin-latex which handles just that.

ETA: If people start using this plugin (or, more generally, if we use TeX/LaTeX in any capacity for this study group), it might occasionally be helpful to use the detexify handwritten symbol recognizer - for when you want to use a symbol and can't quite remember the command that produces it.

Other LWers have used IRC before, so that would be a good option to prolong our discussions. The difficulty I anticipate is dealing with time zones. People who have responded to the post are from all over the map.

I think some interactive discussion would definitely help to keep up the spirit.

I'd definitely be interested in joining a real time discussion if there is enough substance for a clear agenda. Using IRC with pidgin-latex sounds good to me.

It is also not strictly necessary that everybody participates at the same time: we could have two meetings, for two different time zones discussing the same topic.

I've set up an experimental Google Wave for this with Morendil. Eqygadget seems to be able to render LaTeX input in the Wave.

I can add people to the Wave so you can take a look. Just give your Wave account id here, on the #lesswrong IRC channel or mail it to rsaarelm at the Gmail.

dimeforthepassingtime here. Sorry I'm late.

I'm robin.zimm - let me know a time to get on to see how this will work.

Google Wave could be excellent for this, because it acts in part as a wiki (as well as a bunch of other things), meaning we could archive our discussions and come back to them. Also, it has LaTeX, which is probably necessary.

Also, it's federated. Thus, if our hosts wanted to set up a Less Wrong Wave they could do so without needing to rely on Google storing our discussions. On the other hand, setting up a federated Wave certainly isn't necessary.

As a warm-up, and to indicate how I intend to prompt discussion (subject to the group's feedback) I have posted a summary of the Preface. (ETA: for instance, implications of this method are that it's up to participants to check back on the post from time to time to see if new summaries have been posted; then after reading the parts summarized, come back and answer this comment. Does that work?)

I will start work today on a summary of as much of Chapter 1 as might make for a nice bite-sized chunk to discuss, and post that in a few days, or sooner if the discussion on the Preface dies down quickly.

Discussion question for the Preface: can you think of further examples of the type of "old ideas" Jaynes refers to?

(http://www.thisamericanlife.org/radio-archives/episode/204/81-Words): For some time, the only homosexuals who were studied were in prison or insane asylums. It took a good bit of work and some risk to get the word out that there were homosexuals living non-pathological lives.

However, I'm not sure this is the sort of thing Jaynes had in mind-- the old ideas are shaping which data gets collected, so it's not an example of re-examining the same data set. On the other hand, no amount of study of inmates in prisons and insane asylums could have established that there were homosexuals living ordinary lives.

No, I think this is right on target, and it reminds me of Yvain's post on "disease".

Cataloging a particular behavior as a pathology leads to "hidden inferences", and no amount of new data can lead to correct conclusions without first challenging those among such hidden inferences which happen to be false. We could ask, "what data are we failing to collect on causes of obesity owing to our prevailing model of obesity"?

We could also ask "what data are we failing to collect about the risks of intentional weight loss because of our prevailing model of obesity?".

I wonder if Jaynes' statement is really true? Here is an example that is on my mind because I'm reading the (thus far) awesome book The Making of the Atomic Bomb. Apologies if I get details wrong:

In the 1930s, there was a lot of work done on neutron bombardment of uranium. At some point, Fermi fired slow-moving neutrons at uranium and got a bunch of interesting reaction products that he concluded were most plausibly transuranic elements. I believe he came to this conclusion for two reasons: the models of the day discounted the hypothesis that a slow-moving neutron could do anything but release a "small" particle like a helium nucleus, and there was experimental work done to rule out the lower elements in the vicinity of uranium.

Some weird experimental data from Joliot and Curie, which seemed inconsistent with the prevailing model, came up later. Hahn and Strassmann seemed not to believe those results, and so tried to replicate them and found similar anomalies. A careful chemical analysis of the reaction products of uranium bombardment found elements like barium -- much lower on the periodic table. Meitner and Frisch came along and provided a new model which turned out to be right.

So here was data that, when analyzed with respect to old models, seemed implausible. The data was questioned, but then replicated, studied and then understood. The result was that the old model had to be cast aside for something new: the data was so incompatible with the model (or at least so implausible under it) that a new model needed to be created.

Isn't this narrative the way knowledge often goes? New data comes along and blows up old ideas because the new data is inconsistent with or implausible in the old model. Does this jibe with Jaynes' statement?

re: old ideas

I can't really figure out what he means by that. His example with dangerous doses of artificial sweeteners seems to be about asking the wrong question. It seems logical that no amount of data can get you the right answer if you don't ask the/a right (set of) question(s).

He goes on about mutilating datasets, which seems to me a sin. Me, with GBytes of storage on my PC. When the medium of storage is paper, data gets mutilated. Consider a doctor writing up anamnesis: patient talks on and on, but only what the doctor considers relevant data is written down. Seems like a perfect example of a mutilated dataset and what Jaynes was talking about - if the doctor has a wrong model in mind while collecting data, (s)he is more likely not to collect important information.

I heard that the people at CERN don't let a bit go unstored. But are there variables not measured at all, due to our existing models of the universe?

I believe Jaynes was implying that since the experimenters didn't have a threshold model in mind, the experiment did not measure a broad enough range of doses to distinguish between a linear response and a threshold. For example, if the only tests of the sweetener were at doses which produced harmful effects, then it might be impossible to derive the correct model based on only that data.
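To make the point concrete, here is a minimal sketch with made-up numbers (the threshold of 10, the slope of 2, and the tested doses are all my own assumptions, not anything from Jaynes). It generates responses from a threshold model, tests only at high doses, fits a no-threshold straight line through the origin, and then extrapolates that line down to a low dose.

```python
# Hypothetical numbers throughout: a threshold at dose 10 and a slope of 2.
THRESHOLD = 10.0
SLOPE = 2.0


def true_response(dose):
    """Threshold model: no effect below THRESHOLD, linear above it."""
    return max(0.0, SLOPE * (dose - THRESHOLD))


def fit_through_origin(doses, responses):
    """Least-squares slope k for the no-threshold model response = k * dose."""
    numerator = sum(d * r for d, r in zip(doses, responses))
    denominator = sum(d * d for d in doses)
    return numerator / denominator


# The experiment only tests high doses, all well above the threshold.
high_doses = [50.0, 60.0, 70.0, 80.0]
observed = [true_response(d) for d in high_doses]

# Fit the "old idea" (harm proportional to dose) to the high-dose data.
k = fit_through_origin(high_doses, observed)

# Relative error of the linear fit at each tested dose.
relative_errors = [abs(k * d - r) / r for d, r in zip(high_doses, observed)]

# Extrapolate to a dose below the threshold.
low_dose = 5.0
predicted_low = k * low_dose          # harm predicted by the linear model
actual_low = true_response(low_dose)  # harm under the threshold model
```

In this toy setup the straight line stays within a few percent of every tested point, yet it predicts a clearly nonzero effect at a dose where the threshold model gives none. Only measurements near or below the threshold would separate the two models.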

Please reply to this comment if you have feedback other than intent to participate, such as ideas on what would make a book club / study group process a satisfactory experience for you.

I think most of the "proof these are unique/sufficient" bits in the first couple of chapters (IIRC) are quite unnecessary to your goals and just make it look like you need more mathematical expertise than you really do. So don't get bogged down by that.

Some attention to the mathematical prerequisites needed to properly get through Jaynes might be nice. I've basically got some poorly learned undergraduate math, practically no calculus above high-school level and a pretty hand-wavy understanding of probability theory. I think the undergrad probability course I took said that proper treatment of probability axioms requires measure theoretic calculus, so it will be dealt with in a later course. I know pretty much nothing about measure theory beyond it having something to do with both calculus and probability theory. So stuff that assumes good calculus literacy might be anything from hard going to impossible to understand properly without further study.

Good point, I'll try and see what I can tell of the prerequisites. I've made it through to Chapter 6 with extremely rusty high-school math and found it accessible if demanding. But it's possible I've missed out deeper nuances due to lacking some background.

Looks like it might provide a good alternate venue if it ever becomes cumbersome to have our discussions on LW.

I appreciate your use of and linking to my scale for rating understanding. But in the few months until it becomes a universally recognized standard ;-) , you should probably briefly explain any reference to the numbered levels.

In this case, given the subject matter, "between levels 0 and 1" means that you can sometimes, but not always, generate the Bayesian answer to a given problem.

I'd love to participate in this type of group in the future.

No problem. :) Please see the update and, if you're interested in live meetings, register your preferred times on the spreadsheet linked there.

I'm in. Out in Michigan, though unlikely to be able to meet up.

Re: Preface

Is there a good reason why the Maximum Entropy method is treated as distinct from the Bayesian, rather than simply as a method for generating priors?

I think Jaynes more or less defines 'Bayesian methods' to be those gadgets which fall out of the Cox-Polya desiderata (i.e. probability theory as extended logic). Actually, this can't be the whole story given the following quote on page xxiii:

"It is true that all 'Bayesian' calculations are included automatically as particular cases of our rules; but so are all 'frequentist' calculations. Nevertheless, our basic rules are broader than either of these."

In any case, Maximum entropy gives you the pre-Bayesian ensemble (I got that word from here) which then allows the Bayesian crank to turn. In particular, I think Maximum entropy methods are not Bayesian in the sense that they do not follow from the Cox-Polya desiderata.

In particular, I think Maximum entropy methods are not Bayesian in the sense that they do not follow from the Cox-Polya desiderata.

IIRC, this was my understanding of Jaynes's position on maxent:

  1. the Cox-Polya desiderata say that multiple allowed derivations of a problem ought to all lead to the same answer
  2. if we consider a list of identifiers about which we know nothing, and we ask whether the first one is more likely than the nth one, then we should answer that they are equal, because if we say either greater than or less than, we could shuffle the list and get a contradictory answer. By induction, we ought to say that all members of the list are equiprobable, which only allows entries to be 1/n probable.
  3. hence, we get the Principle of Indifference. (Points 1-3 are my version of chapter 2 or 3, IIRC.)
  4. Maxent is just the same idea, abstracted and applied to non-list thingies. (I haven't actually gotten this far, but it seems like the obvious next step.)

The arguments seem to me to be as Bayesian as anything in his building up of Bayesian methods from the Cox-Polya criteria.
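In symbols, the shuffle argument in points 1-3 can be sketched as follows. The notation is my own assumption rather than the book's: A_1, ..., A_n are mutually exclusive and exhaustive propositions, I is background information that treats all labels alike, and pi is any permutation of the labels.

```latex
% Assumed notation: A_1..A_n exclusive and exhaustive, I label-symmetric, \pi a permutation.
\begin{align}
P(A_i \mid I) &= P(A_{\pi(i)} \mid I)
  \quad \text{for every permutation } \pi \text{ of } \{1, \dots, n\}, \\
\sum_{i=1}^{n} P(A_i \mid I) &= 1, \\
\Rightarrow \quad P(A_i \mid I) &= \frac{1}{n} \quad \text{for all } i.
\end{align}
```

The first line is the consistency requirement (relabeling the list must not change the answer), the second is normalization, and together they leave 1/n as the only possible assignment.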

I think this is not so important, but it is helpful to think about nonetheless. I guess the first step is to define what is meant by 'Bayesian'. In my original comment, I took one necessary condition to be that a Bayesian gadget is one which follows from the Cox-Polya desiderata. It might be better to define it to be one which uses Bayes' Theorem. I think in either case, Maxent fails to meet the criteria.

Maxent produces the distribution on the sample space which maximizes entropy subject to any known constraints which presumably come from data. If there are no constraints, then one gets the principle of indifference which can also be gotten straight out of the Cox-Polya desiderata as you say. But I think these are two different approaches to the same target. Maxent needs something new -- namely Shannon's information entropy (by 'new' I mean new w.r.t. Cox-Polya). Furthermore, the derivation of Maxent is really different from the derivation of the principle of indifference from Cox-Polya.
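For anyone who wants to see this numerically, here is a minimal sketch for a six-sided die. The setup is my own illustration: the only constraint considered is a known mean, and the target mean of 4.5 is an arbitrary choice. It relies on the standard Lagrange-multiplier result that the entropy-maximizing distribution under a mean constraint has the form p(x) proportional to exp(lam * x), and it finds lam by bisection.

```python
import math

FACES = [1, 2, 3, 4, 5, 6]


def maxent_given_lambda(lam):
    """Distribution p(x) proportional to exp(lam * x) over the die faces."""
    weights = [math.exp(lam * x) for x in FACES]
    total = sum(weights)
    return [w / total for w in weights]


def mean(p):
    """Expected face value under distribution p."""
    return sum(x * px for x, px in zip(FACES, p))


def entropy(p):
    """Shannon entropy in nats."""
    return -sum(px * math.log(px) for px in p if px > 0)


def maxent_with_mean(target, lo=-10.0, hi=10.0, iters=200):
    """Bisect on lam; the mean of the distribution is increasing in lam."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if mean(maxent_given_lambda(mid)) < target:
            lo = mid
        else:
            hi = mid
    return maxent_given_lambda((lo + hi) / 2)


# No constraint beyond normalization corresponds to lam = 0: the uniform case.
uniform = maxent_given_lambda(0.0)

# Assumed example constraint: the long-run average roll is 4.5.
skewed = maxent_with_mean(4.5)
```

With lam = 0 every face gets probability 1/6, which is the principle of indifference. With the mean constraint the probabilities tilt toward the high faces, and the entropy is lower than in the uniform case.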

I could be completely off here, but I believe the principle of indifference argument is generalized by the transformation group stuff. I think this because I can see the action of the symmetric group (the group of permutations, "group" in the abstract algebra sense) on the hypothesis space in the principle of indifference stuff. Anyway, hopefully we'll get up to that chapter!

Upon further study, I disagree with myself here. It does seem like entropy as a measurement of uncertainty in probability distributions does more or less fall out of the Cox-Polya desiderata. I guess that 'common sense' one is pretty useful!

Jaynes recommends MaxEnt for situations when "the Bayesian apparatus", consisting of "a model, a sample space, hypothesis space, prior probabilities, sampling distribution" is not yet available, and only a sample space can be defined.

This is as good a time as any for me to tentatively return to this community (last was around in the Overcoming Bias days, when I was a fairly regular commenter and occasional poster, some of you probably remember me).

I'm tentatively in, subject to time crunches.

If it's not too late, I'd like to participate as well.

Lbraschi is my account for Google Wave. I'm in Madrid, Spain.

I'm working through chapter 2 right now, and finding it very rough going from page 203 on if I try to really understand what's going on instead of just skimming for a general outline.

I suppose the text is expecting higher than high-school level math literacy.

I am definitely participating in this! I just started reading the text last week, and have been trying to get someone else interested as well for this purpose. Thanks for the effort!

Study group update

(edited to defer starting group work in earnest - see discussion below - readings to start Monday)

On the meta level, what do people think of this method of expanding the study group post by increments?

(Feedback sought: am I going too fast?)

"Are you going too fast"? The post has only been up for two days, man - people are still signing up to join the group, we don't have any scheduled meets, we don't have any scheduled chatroom meets ... the pace might be right once we get going, but most of us haven't.

Okay, I'll ease up. Thanks. :)

Two or three of us met up informally this morning (my time) on IRC, and we started a Google Wave on an experimental basis - so I have a biased impression of how much is going on, relative to a typical member of the group. Do feel free to help me correct for that.

If you ask me, announce the official start on Monday in a new post and pick (say) five weekdays and UTC times staggered so that most of the people who announced their locations would probably be able to make at least one or two of them. And tell people what chapters are being discussed this week and next week in a new post every week, so they can get ahead if they want.

I missed the preface update. Sometimes I get old posts duplicated in my RSS feed but this didn't happen with your edits so something else must trigger that. A title change perhaps? I think new posts might be better than expanding the original post by increments both because it would probably make RSS notifications work better and because I imagine the size of the comments section could get out of hand with it contained to a single post.

I've been thinking about that; I'd like to strike a compromise between helping people notice when something new is posted, and the interests of LW readers outside the study group. ISTM that we might want to keep this to one top-level post per chapter, with each chapter possibly divided into several updates.