A few years ago, the symbol grounding problem was widely considered a significant challenge in AI discussions. Unlike those who considered it fundamental, I believed it would likely be addressed as a side effect of capability improvements, without requiring specific breakthroughs or attention. However, like many, I didn't anticipate the extent to which GPT-4 has demonstrated this to be true. At that time, asserting that such capabilities could be achieved with a text-only training set would have seemed like a parody of my position.
Had you asked me how a model like GPT-4 would acquire its capabilities, I would have suggested a process more akin to how children learn. It might have started with a predictive model of physical reality, established the concept of an object, learned object permanence, and then acquired simple words as they applied to previously learned concepts.
Despite its capabilities, GPT-4 still seems to lack robust physical intuition, as is evident in data science tasks and in mathematical understanding related to the 3D world. Will we see a model trained from scratch, as described earlier? For instance, the Meta AI model appears to grasp the concept of an object in 2D. Suppose this understanding is fully extended to 3D and designated as our foundational model. Can such a model then be trained on language, particularly language that relates to the physical world, and develop physical intuition as good as a human's?

Grounding overhang and interpretability implications:

Would such a model be much better at mathematical and programming tasks for a given amount of model resources? Assuming the foundational model is much smaller than GPT-4, it seems reasonable that it could gain similar or greater mathematical and programming skills while remaining smaller, even when trained on enough language concepts to be capable at those tasks.

This could also help with interpretability, as the foundational model couldn't really be thought to lie or deceive, since it is just modelling objects in 3D. Deception and theory of mind abilities could then be observed as they became available.


I don't think it has gone away; its new form is generalization errors, where the model's decision boundaries get pushed into places that happen to implement some degree of grounding, but where the grounding is unreliable, fragile, and prone to ontology drift if the grounding pressure doesn't stick around.

I don't think it makes sense to say that the symbol grounding problem has gone away, but I think it does make sense to say that we were wrong about what problems couldn't be solved without first solving symbol grounding. I also don't think we're really that confused about how symbols are grounded (1, 2, 3), although we don't yet have a clear demonstration of a system that has grounded its symbols in reality. GPTs do seem to be grounded by proxy through their training data, but this gives limited amounts of grounded reasoning today, as you note.

Thanks. My title was a bit tongue in cheek (Betteridge's law), so yes, I agree. I have decided to reply to your post before reading your references, as they may take a while to digest, but I plan to do so. I also see you have written on P-Zombies, a topic I was going to write something on. As a relative newcomer, it's always a balance between just saying something and attempting to read and digest everything about it on LW first.