Web developer and Python programmer. Professionally interested in data processing and machine learning. Non-professionally is interested in science and farming. Studied at Warsaw University of Technology.
What I would also like to add, which is often not addressed and it gives some positive look, is that the "wanting" meaning the objective function of the agent, it's goals, should not necessarily be some certain outcome or certain end-goal on which it will focus totally. It might not be the function over the state of universe but function over how it changes in time. Like velocity vs position. It might prefer some way the world changes or does not change, but not having a certain end-goal (which is also unreachable in long-term in a stable way as universe will die in some sense, everything will be destroyed with P=1 minus very minute epsilon over enough time).
Why positive? Because these things usually need balance and stabilisation in some sense to retain same properties, which means less probability of drastic measures to get a little bit better outcome a little bit sooner. It might cease controll over us, which is bad, but gives lower probability of rapid doom.
Also, looking on current works it seems more likely for me that those property-based goals will be embedded rather than some end-goal like curing cancer or helping humanity. We try to make robust AGI so we don't want to embed certain goals or targets, but rather patterns how to work productively and safely with humans. Those are more meta and about way of how things go/change.
Note it is more like intuition for me than hard argument.
The question is if one can make a thing that is "wanting" in that long-term sense by combining not-wanting LLM model as short-term intelligence engine with some programming-based structure that would refocus it onto it's goals and some memory engine (to remember not only information, buy also goals, plans and ways to do things). I think that the answer is a big YES and we will soon see that in a form of amalgamation of several models and enforced mind structure.
There is one thing that I'm worried about in future of LLM. This is a basic notion that the whole is not always just the sum of parts and it may have very different properties.
Many people feel safe because of properties of LLM and how they are trained etc. and because we are not anywhere close to AGI when it comes to different solutions which seem more dangerous. What they don't realize is that soonest AGI likely won't be a next bigger LLM model.
It will likely be amalgamation of few models and pieces of programming, including few LLM of different sizes and capabilities, maybe not all exactly chat-like. It will have different properties than any one of its parts and it will be different than single LLM. It might be more brain-like when it comes to learning and memories. Maybe not in a way that weights of LLM models will change, but some inner state will change, and some more basic parts will learn or remember solutions and will structure them into more complex solutions (like we remember how to drive without consciously deciding on each move of the muscle or even making higher level decisions). It will have goals, priorities, strategies and short-time tactic schemes and it will be processed on higher level than single LLM.
Why I do think that? Because it is already to be seen on the horizon if you think about works like multi-model GPT-4, GPT Engineer, multitude of projects adding long term memory for GPT, and that scientific works where GPT writes itself code to bootstrap itself into doing complex tasks like achieving goals in Minecraft. If you extrapolate that then AGI is likely, initially maybe not very fast or cheap one though. It is likely to be on top of LLM but not being simply an LLM.
Most likely explanation is the simplest fitting one:
Take note about the language that Ilya uses. He didn't say they did bad to Altman or that decission was bad. He said that that he changed his mind because of consequences being harm for the company.
Both seem around the corner for me.
For robo-taxis it is more a society-based problem than a technical one.
What is missing so we would have robo-taxis is mostly public trust and more investments.
For AGI we already have basic blocks. Just need to scale them up and connect them into a proper system. What building blocks? These:
What is missing for AGI:
If I would be to guess I would say that AGI will be sooner in scale - just because there is hype, there are big investments and the main problems are currently less like "we need a breakthrough" and more like "we need refinements". For robo-taxis we still need a lot more investments and some breakthroughs in areas of public trust or law.
In the case of biological species, it is not as simple as competing for resources. Not on the level of individuals and not on the level of genes or evolution.
First of all, there is sexual reproduction. This is more optimal due to the pressure of microorganisms that adapt to immunological systems. Sexual reproduction mixes immunological genes fairly quickly. This also enables a quicker mutation rate with protection against negative aspects (by having two copies of genes - for many of those one working gene is enough and there are 2 copies from 2 parents). With this sexual reproduction often the female is biologically forced to give more resources to the offspring while for males it is somewhat voluntary and minimal input is much lower. Another difference is that often female knows exactly that she is the biological mother, but the father might not be certain about that. This kind of specialization is often more optimal than equalization - so the male can pursue more risky behavior including fighting off predators and losing the male to the predators or environment does not mean that the prospect of having offspring fails. This also makes more complex mating behaviors like the need to lose resources to show off health and other good qualities. Mating behaviors and peacock feathers are examples. Human complex social and linguistic behaviors are also somewhat examples - that's why humans dine together and talk a lot together on dates. The human female gives much more time and energy to the offspring, at least initially. Needs to know if the male is both good genetic material, healthy enough to take care of her during pregnancy when she is more vulnerable (at least in a natural environment where humans evolved), and also willing to raise the child with her later. There is a more prevalent strategy for females and males where they make a pair, bond together, have children, and raise them. There is also a more uncommon strategy for females (take genes from one male that looks more healthy and raise offspring with another one which looks more stable and able) and for males (impregnate many females and leave each of them so some will manage to handle on their own or with another male that does not know that he is not the father). The situation is more complex than only efficiency for resources or survival of the fittest. The environment is complex and evolution is not totally efficient (it optimizes often up to local optimum, and niches overlap and interact).
Second of all, resources are limited, and ways to use them also. Storing them long-term after harvest for many species is either impossible (microbes and insects will eat them) or would hinder their other capabilities (e.g. can store that as fat, but being fat is usually not very good). This means that preserving from gathering resources and resting might be better than gathering them efficiently all the time. This is what lions do - they rest instead of hunting when they don't need to hunt.
What does it tell us about self-replicating nano-machines? First of all, they won't need sexual reproduction. So unlikely they would lose energy on mating. They would rather do computational emulations at scale to redesign themself, if capable. They would also not need to rest. They will either use resources or store them in a manner that is more efficient to secure or use. If there is no such sensible manner that would not lose energy, they would leave it for later in the original state. They might secure it and observe but leave it until later.
What would they do depends on what is their goal and their technical capabilities. If they are capable and in need of converting as much of atoms to "computronium" or to their copies (as either a final goal or instrumental one) then they will surely do that. No need to lose resources. If they are not capable then they will probably hang low until more capable and use only what is usable.
Nevertheless, in my opinion, goals may not be compatible with that strategy. Including one like "simulate a virtual reality with beings having good fun". For many final goals more usable is to secure and gather resources on a grand scale but try to use them on as small a scale as possible and sensible for the end goal. The small scale is more efficient because of the light speed limit and dilation of time. Machines might try to find technology to stop stars from dispersing energy (maybe to break them and cool them down in a controlled way or some way to block them and stabilize them inside enclosed space, I don't know). Then they might add a network of observing agents with low energy usage for security, but not to use those resources right away. Use the matter slowly at the center of the galaxy turning it into energy (+ some lost to the black hole) to work for eons. They might make the galaxy go dim to preserve resources but might choose not to use them until much later.
An alternative explanation of mistakes is that making mistakes and then correcting them was awarded during additional post-training refinement stages. I work with GPT-4 daily and sometimes it feels like it makes mistakes on purpose just to be able to say that it is sorry for the confusion and then correct it. It feels like it also makes fewer mistakes when you ask politely, which is rather strange (use please, thank you, etc.).
Nevertheless, distillation seems like a very possible thing that is also going on here.
It does not distill the whole of a human mind though. There are areas that are intuitive for the average human, even a small child, that are not for the GPT-4. For example, it has problems with concepts of 3D geometry and visualizing things in 3D. It may have similar gaps in other areas, including more important ones (like moral intuitions).
I'm already worried as I tested AutoGPT and looked at how it works in code and for me, it seems like it will get very good planning capabilities with the change of a model to one with a few times longer token scope (like coming soon GPT-4 version with about 32k tokens) plus small refinements. So it won't get into loops, maybe have more than one GPT-4 module for different scopes of planning like long-term strategy vs short-term strategy vs tactic vs decisions on most current task + maybe some summarization-based memory. I don't see how it wouldn't work as an agent.
Put it into ElasticSearch index and give GPT-4 simple query API that it can use by adding some prefix and predefined set of parameters or a JSON so the script would run it instead of communicating this back to the user and give an answer as user response with also predefined prefix. Then it should be able to get questions, search for info, and respond. Worked like a charm for a product database in PoC so should work for documentation.
I think that in an ideal world where you could review all priors to very minute details having as much time as needed, and where people were fully rational, then "trust" as a word would not be needed.
We don't live in such a world though.
If someone says "trust me" then in my opinion it conveys two meanings on two different planes (usually both, sometimes only one):
"I decided to trust her about ..." - for me, it is a short colloquial term for "I took time to reconsider if things she says on the topic ... are true and now I think that is more likely that they are".
For many people, it also has emotional and bonding components.
Another thing is that people tend to trust or mistrust another person in general broad scope. They don't go into detail and think separately on every topic or thing someone says and decide separately for each of them. That's an easy heuristic that usually is good enough, so our minds are wired to operate like that. So people usually say that they trust a person generally, not trust that person within some subject/scope.
P.S. I'm from a different part of the world (central EU, Poland). We don't use phrases like "accept trust" here - which is probably an interesting difference in how differences in language create different ways of thinking. For us here "trust" is not like a contract. It is more a one-way thing (but with some expectation of mutuality in most circumstances).