There is a new paper out by Sanguinetti, Allen, and Peterson, "The Ground Side of an Object: Perceived as Shapeless yet Processed for Semantics". In it, the authors conduct a series of experiments to try to answer the question of how the brain separates background from foreground in visual processing. I found it interesting, so I thought I'd share. The human visual system is incredibly complex, and we still have no clear idea how it does a lot of the things it does.
The experimental protocol was as follows:
The stimuli were 120 small, mirror-symmetric, enclosed white silhouettes (Trujillo, Allen, Schnyer, & Peterson, 2010). Of these, 40 portrayed meaningful nameable objects (animals, plants, symbols) inside their borders and suggested only meaningless novel objects on the ground side of their borders. The remaining 80 silhouettes depicted meaningless novel objects (objects not encountered previously) inside their borders. Of these, 40 suggested portions of nameable meaningful objects on the ground side of their borders. Note, however, that participants were not aware of the meaningful objects suggested on the ground side of these silhouettes. The remaining 40 novel silhouettes suggested novel shapes on both sides of their borders.
Stimuli were presented on a 20-in. CRT monitor 90 cm from the participants using DMDX software. Participants’ heads were unrestrained. Their task was to classify the silhouettes as depicting real-world or novel objects. Responses were made via button press; assignment of the responses to the two response buttons was random.
They then recorded the EEG signals from the participants and found something surprising: when the background was meaningful, the subjects' brain waves showed the same signatures that would be expected if the object had been consciously perceived (called 'N300' and 'N400' signatures because they occur roughly 300 and 400 ms after presentation of the stimulus), even though the subjects did not report perceiving anything meaningful in the images. However, if the background really was meaningless, this signature was absent.
This is consistent with the "Brains aren't magic" doctrine. The brain can't magically separate objects from non-objects at the simplest levels of processing. No, just like a computer, it has to tediously go through the image, pixel by pixel, identifying patterns and gradually building them up into complex representations. It is only after this process is over that it can reliably separate background from foreground. It's just that this process happens with a huge degree of parallelism, and we are not consciously aware of what is happening. Of course, none of this should come as a surprise to those here at LessWrong. However, as the authors note, the idea that there is some mysterious mechanism in the brain that does complex visual processing in the very first visual areas seems to be subconsciously prevalent in psychological studies, and this has also confused many machine learning researchers.
There is a question about the study, though. How far did the processing go? Was it just the brain recognizing some abstract features of the background, not identifying it as a particular object, or did the brain actually subconsciously figure out what the object was? To determine this, they designed another experiment:
To ascertain that semantic access underlies the N300 and N400 effects observed in Experiment 1, we preceded each silhouette with a word in Experiment 2. For a critical novel-object/meaningful-ground silhouette, the word named either the object suggested on the silhouette’s ground side (match condition) or a different object (mismatch condition). If semantic access occurs for grounds, N300 and N400 responses to the novel-object/meaningful-ground silhouettes would be reduced in the match condition compared with the mismatch condition even though the semantic repetition involved stimuli from different modalities.
What they found was that semantic access does indeed underlie the responses: the brain knows what it's seeing, even when the interpretation doesn't rise to the level of full conscious awareness (the full statistical analysis is in the paper, which is sadly not open-access):
Experiment 2 replicated these results when semantics were accessed cross-modally. Our interpretation of the neurophysiological evidence is buttressed by a recent behavioral experiment showing semantic priming from objects suggested on the ground side of the silhouettes (Peterson et al., 2012). These results are contrary to the traditional serial-processing assumption that semantic representations are accessed only after object segregation and instead support a dynamic view of perception according to which more objects are evaluated by the visual system than are ultimately perceived.
If you regard brains as Bayesian networks and spikes as messages, neurons in the visual cortex send trillions of messages to each other before arriving at a 'consensus' probability distribution over what is being seen. Throughout this whole process, many complex hypotheses are generated and then discarded before anything reaches conscious perception. The brain does not magically sidestep the search space. It must trudge through it just like a computer. This is consistent with what we know about existing Bayesian vision systems and the limitations of computer hardware.
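To make the "hypotheses are generated and then discarded" picture concrete, here is a toy sketch of sequential Bayesian updating over a handful of competing interpretations of one silhouette. Everything in it is an assumption made up for illustration: the hypothesis names, the likelihood numbers, the 0.9 "awareness" threshold, and the function names `update` and `perceive`. It is not the authors' model and it is not a claim about how the cortex actually computes; it only shows that a system can assign substantial probability to a ground-side object partway through and still end up reporting only the figure.

```python
def normalize(weights):
    """Rescale non-negative weights so they sum to 1."""
    total = sum(weights.values())
    return {h: w / total for h, w in weights.items()}


def update(belief, likelihoods):
    """One Bayesian update: posterior is proportional to prior times likelihood."""
    return normalize({h: belief[h] * likelihoods[h] for h in belief})


def perceive(prior, evidence_stream, threshold=0.9):
    """Update beliefs step by step until one hypothesis reaches the threshold.

    Returns (winner, history). winner is None if no hypothesis reaches the
    threshold; history holds the belief after every step, including the prior.
    """
    belief = dict(prior)
    history = [belief]
    for likelihoods in evidence_stream:
        belief = update(belief, likelihoods)
        history.append(belief)
        best = max(belief, key=belief.get)
        if belief[best] >= threshold:
            return best, history
    return None, history


# Hypothetical interpretations of a single silhouette (illustrative only).
prior = {"novel_figure": 1 / 3, "seahorse_ground": 1 / 3, "leaf_ground": 1 / 3}

# Hypothetical likelihoods of each batch of evidence under each hypothesis.
# Early evidence mildly favours the meaningful ground-side object; later
# evidence (standing in for cues such as enclosure and symmetry) favours
# the enclosed figure.
evidence = [
    {"novel_figure": 0.5, "seahorse_ground": 0.6, "leaf_ground": 0.2},
    {"novel_figure": 0.9, "seahorse_ground": 0.3, "leaf_ground": 0.1},
    {"novel_figure": 0.9, "seahorse_ground": 0.2, "leaf_ground": 0.1},
    {"novel_figure": 0.9, "seahorse_ground": 0.2, "leaf_ground": 0.1},
]

winner, history = perceive(prior, evidence)

for step, belief in enumerate(history):
    leader = max(belief, key=belief.get)
    print(step, leader, round(belief[leader], 3))
print("reported percept:", winner)
```

With these made-up numbers the ground-side hypothesis leads after the first update and is then overtaken, so the only thing that crosses the threshold is the enclosed figure. The intermediate beliefs in `history` are the part that, in this analogy, would correspond to processing that happens without ever being reported.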