There is a new paper out by Sanguinetti, Allen, and Peterson, "The Ground Side of an Object: Perceived as Shapeless yet Processed for Semantics". In it, the authors run a series of experiments to investigate how the brain separates background from foreground in visual processing. I found it interesting, so I thought I'd share. The human visual system is incredibly complex, and we still have no clear idea how it does much of what it does.

The experimental protocol was as follows:

The stimuli were 120 small, mirror-symmetric, enclosed white silhouettes (Trujillo, Allen, Schnyer, & Peterson, 2010). Of these, 40 portrayed meaningful nameable objects (animals, plants, symbols) inside their borders and suggested only meaningless novel objects on the ground side of their borders. The remaining 80 silhouettes depicted meaningless novel objects (objects not encountered previously) inside their borders. Of these, 40 suggested portions of nameable meaningful objects on the ground side of their borders. Note, however, that participants were not aware of the meaningful objects suggested on the ground side of these silhouettes. The remaining 40 novel silhouettes suggested novel shapes on both sides of their borders.
Credit: Jay Sanguinetti
Stimuli were presented on a 20-in. CRT monitor 90 cm from the participants using DMDX software. Participants’ heads were unrestrained. Their task was to classify the silhouettes as depicting real-world or novel objects. Responses were made via button press; assignment of the responses to the two response buttons was random.
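
For readers who want the design at a glance, here is a rough sketch in Python of the trial structure described above. This is my own illustration with invented names, not the authors' DMDX script.

```python
# Hypothetical sketch of the trial structure (not the authors' code).
# The actual experiment used DMDX for presentation; all names here are invented.
import random

# Three stimulus classes, 40 silhouettes each.
stimuli = (
    [{"id": f"real_{i}",          "category": "real",  "ground": "novel"}      for i in range(40)] +
    [{"id": f"novel_meaning_{i}", "category": "novel", "ground": "meaningful"} for i in range(40)] +
    [{"id": f"novel_novel_{i}",   "category": "novel", "ground": "novel"}      for i in range(40)]
)

def run_block(stimuli, respond):
    """Present silhouettes in random order; the task is real vs. novel classification."""
    random.shuffle(stimuli)
    # Assignment of "real"/"novel" to the two response buttons is randomized.
    buttons = {"left": "real", "right": "novel"}
    if random.random() < 0.5:
        buttons = {"left": "novel", "right": "real"}
    responses = []
    for trial in stimuli:
        pressed = respond(trial["id"])  # e.g. "left" or "right", from a button press
        responses.append({
            "stimulus": trial["id"],
            "answered": buttons[pressed],
            "correct": buttons[pressed] == trial["category"],
        })
    return responses
```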

They then recorded EEG signals from the participants and found something surprising: when the background was meaningful, the subjects' brain waves showed the same signatures that would be expected had conscious awareness taken place (the 'N300' and 'N400' signatures, so called because they occur roughly 300 and 400 ms after presentation of the stimulus), even though the subjects did not report perceiving anything meaningful in the images. When the background really was meaningless, these signatures were absent.
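
For the uninitiated: the N300 and N400 are just negative-going deflections in the stimulus-locked, averaged EEG around 300 and 400 ms. As a rough illustration of how such components are typically quantified (my own sketch, not the authors' analysis pipeline), one can average each trial's signal within a time window:

```python
import numpy as np

def mean_window_amplitude(epochs, times, t_start, t_end):
    """Mean amplitude of stimulus-locked epochs within a time window.

    epochs: array of shape (n_trials, n_samples), one channel, baseline-corrected.
    times:  array of shape (n_samples,), seconds relative to stimulus onset.
    """
    mask = (times >= t_start) & (times <= t_end)
    return epochs[:, mask].mean(axis=1)  # one value per trial

# Illustrative windows only; the paper's exact analysis windows may differ.
# n300 = mean_window_amplitude(epochs_meaningful_ground, times, 0.25, 0.35)
# n400 = mean_window_amplitude(epochs_meaningful_ground, times, 0.35, 0.50)
```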

This is consistent with the "Brains aren't magic" doctrine. The brain can't magically separate objects from non-objects at the simplest levels of processing. No, just like a computer, it has to tediously work through the image, pixel by pixel, identifying patterns and gradually building them up into more complex representations. Only after this process is over can it reliably separate background from foreground. It's just that this process happens with a huge degree of parallelism, and we are not consciously aware of it happening. Of course, none of this should surprise anyone here at LessWrong. However, as the authors note, the assumption that some mysterious mechanism in the very first visual areas does complex visual processing seems to be tacitly prevalent in psychological studies, and it has also confused many machine learning researchers.
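
To make the "gradually building up" picture concrete, here is a toy sketch. It is a loose computational analogy of my own, not a model of cortex: local feature detectors whose outputs are pooled and fed to the next stage, so that complex structure only emerges after several stages of cheap, parallelizable work.

```python
import numpy as np

def local_features(image, kernel):
    """Slide a small kernel over a 2-D image (valid cross-correlation), then rectify."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return np.maximum(out, 0.0)  # keep only positive evidence

def pool(features, size=2):
    """Downsample by taking the max over non-overlapping blocks."""
    h, w = features.shape
    h, w = h - h % size, w - w % size
    blocks = features[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

# Edge detector -> pooled edges -> (in a real system) progressively more abstract features.
image = np.random.rand(32, 32)
edges = local_features(image, np.array([[1.0, -1.0]]))
coarse = pool(edges)
```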

There is a question about the study, though. How far did the processing go? Was it just the brain recognizing some abstract features of the background, not identifying it as a particular object, or did the brain actually subconsciously figure out what the object was? To determine this, they designed another experiment:

To ascertain that semantic access underlies the N300 and N400 effects observed in Experiment 1, we preceded each silhouette with a word in Experiment 2. For a critical novel-object/meaningful-ground silhouette, the word named either the object suggested on the silhouette’s ground side (match condition) or a different object (mismatch condition). If semantic access occurs for grounds, N300 and N400 responses to the novel-object/meaningful-ground silhouettes would be reduced in the match condition compared with the mismatch condition even though the semantic repetition involved stimuli from different modalities.
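
The logic of the test is just a condition contrast on the same ERP measures. A toy illustration with invented numbers (not from the paper):

```python
import numpy as np

def repetition_reduction(match_amps, mismatch_amps):
    """Difference in mean amplitude between match and mismatch conditions.

    For negative-going components like the N400, a *less negative* mean in the
    match condition (a positive difference) is the predicted priming effect.
    """
    return np.mean(match_amps) - np.mean(mismatch_amps)

# Toy numbers only: per-trial N400 amplitudes (microvolts) in each condition.
match_amps    = np.array([-2.1, -1.8, -2.4, -1.9])
mismatch_amps = np.array([-3.5, -3.1, -3.8, -3.3])
print(repetition_reduction(match_amps, mismatch_amps))  # positive -> reduced N400 for match
```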


What they found was that indeed, semantic access underlies the responses: the brain knows what it is seeing, even when the interpretation doesn't rise to the level of full conscious awareness (the full statistical analysis is in the paper, which is sadly not open-access):

Experiment 2 replicated these results when semantics were accessed cross-modally. Our interpretation of the neurophysiological evidence is buttressed by a recent behavioral experiment showing semantic priming from objects suggested on the ground side of the silhouettes (Peterson et al., 2012). These results are contrary to the traditional serial-processing assumption that semantic representations are accessed only after object segregation and instead support a dynamic view of perception according to which more objects are evaluated by the visual system than are ultimately perceived.

If you regard brains as Bayesian networks and spikes as messages, neurons in the visual cortex send trillions of messages among each other before arriving at a 'consensus' probability distribution over what is being seen. Throughout this whole process, many complex hypotheses are generated but discarded before reaching conscious perception. The brain does not magically side-step the search space. It must trudge through it just like a computer. This is consistent with what we know about existing Bayesian vision systems and the limitations of computer hardware.
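
Here is a toy numerical illustration of that picture (my own, not from the paper): a posterior over competing interpretations is updated round after round, and the losing hypotheses are evaluated and discarded without ever being reported.

```python
import numpy as np

def update_posterior(prior, likelihoods):
    """One round of Bayesian updating over competing interpretations."""
    unnormalized = prior * likelihoods
    return unnormalized / unnormalized.sum()

hypotheses = ["figure is a real object", "ground is a real object", "both novel"]
posterior = np.full(len(hypotheses), 1.0 / len(hypotheses))  # flat prior

# Each "message-passing" round contributes a likelihood vector for the current evidence.
evidence_rounds = [
    np.array([0.5, 0.4, 0.1]),
    np.array([0.6, 0.3, 0.1]),
    np.array([0.7, 0.2, 0.1]),
]
for likelihoods in evidence_rounds:
    posterior = update_posterior(posterior, likelihoods)

# Only the winning hypothesis reaches report; the losers were still evaluated.
print(hypotheses[int(np.argmax(posterior))], posterior.round(3))
```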

Comments

Great write-up, thanks for providing details in an accessible way.

I had previously heard of some experiments done by the military or security sectors where they wanted to monitor a large area for human activity. They set up a bunch of security cameras and wanted to find the best way to detect human motion in them. A computer program to detect it had only moderate success. A human could only monitor a fraction of the cameras simultaneously. So they hooked a human into some sort of brain-reading machine (I have forgotten all the details!) and flashed the camera images before them on a screen at a rate too high for the human to consciously say whether there was anything of interest on the screen. But they were able to train a computer program to interpret the information read from the human brain to determine whether there was anything interesting. And this system performed better than the pure-computer system.
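
What this describes sounds like rapid serial visual presentation with a decoder trained on the evoked brain responses. A minimal sketch of the decoding step, entirely my own illustration with made-up data and no particular system implied:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy stand-in data: one EEG "epoch" per flashed camera frame, flattened to a
# feature vector; label 1 = something of interest was in the frame.
rng = np.random.default_rng(0)
n_frames, n_features = 200, 64
X = rng.normal(size=(n_frames, n_features))
y = rng.integers(0, 2, size=n_frames)
X[y == 1] += 0.5  # pretend "interesting" frames evoke a stronger response

decoder = LogisticRegression(max_iter=1000).fit(X[:150], y[:150])
print("held-out accuracy:", decoder.score(X[150:], y[150:]))
```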

I can't say I have a link to this though. Just more evidence that this isn't unexpected.

Things like this seem to increase the probability that AI is currently constrained more by hardware than by software.

I agree strongly with this; however, it's not just the availability of hardware but the willingness to put the hardware to AI uses. It could be that we already have the hardware to execute a self-improving AI; it's just that it's being used to simulate nuclear explosions or molecular dynamics or carry out billions of web searches instead.

For an auditory counterpart, think of the cocktail party effect and how you can recognize your name in an audio stream you're not consciously attending to.