If this seems unlikely, could you elaborate on the reasons? What essential capabilities would it lack in order to accomplish self-improvement? For instance:

- Enhanced Short-Term Memory: An extended token window
- Improved Long-Term Memory: The competence to modify or augment its own weights and training dataset
- Specific Resources: Access to a DGX data center for testing and training purposes
- Any other abilities?

From my perspective, GPT-4 already demonstrates respectable proficiency in writing code. However, it seems to fall short in short-term memory capacity, which is crucial for the planning and testing that larger code blocks and projects require, such as developing a new GPT. Thoughts?

The biggest issue, I think, is agency. In 2024 large improvements will be made to memory (a lot is happening in this regard). I agree that GPT-4 already has a lot of capability. Especially with fine-tuning, it should do well on many individual tasks relevant to AI development.

But executive function will probably still be lacking in 2024. Combining the tasks into a whole job will be challenging. Improving data is agency-intensive (less intelligence-intensive): you need to contact organizations, scrape the web, sift through the data, etc. It would also need to order the training run, get the compute for inference, pay the bills, etc. These require more agency than intelligence.

However, humans can help with the planning etc. And GPT-5 will probably boost the productivity of AI developers.

Note: depending on your definition of intelligence, agency or executive function would/should be part of intelligence.

There's AGI (autonomous agency at a wide variety of open-ended objectives), and there's generation of synthetic data, which keeps natural tokens from running out in both quantity and quality. My impression is that the latter is likely to start happening by the time GPT-5 rolls out. Quality training data might be even more terrifying than scaling: Leela Zero plays superhuman Go at only 50M parameters, so who knows what happens when 100B parameter LLMs start getting increasingly higher quality datasets for pre-training.

It appears this situation could be more accurately attributed to human constraints rather than AI limitations? Upon reaching a stage where AI systems, such as GPT models, can absorb all human-generated information, conversations, images, videos, discoveries, and insights, ...

Vladimir_Nesov, 11h (karma: 2)
Once AGI works, everything else is largely moot. Synthetic data is a likely next step absent AGI. It's not currently used for pre-training at scale; there are still more straightforward things to be done, like better data curation, augmentation of natural data, multimodality, and synthetic datasets for fine-tuning (rather than for the bulk of pre-training).

It's not obvious but plausible that, even absent AGI, it's relatively straightforward to generate useful synthetic data with sufficiently good models trained on natural data, which leads to better models that generate better synthetic data. This is not about making progress on ideas beyond current natural data (human culture), but about making models smarter despite horrible sample efficiency. If this is enough to get AGI, it's unnecessary for synthetic data to make any progress on actual ideas until that point.

Results like Galactica [https://arxiv.org/abs/2211.09085] (see Table 2 [https://arxiv.org/pdf/2211.09085v1.pdf#page=4] therein) illustrate how the content of the dataset can influence the outcome; that's the kind of thing I mean by higher quality datasets. You won't find 20T natural tokens like that for training a 1T LLM, but it might be possible to generate them, and it might turn out that the results improve despite those tokens largely rehashing the same stuff that was in the original 100B tokens on similar topics.

AFAIK the experiments to test this with better models (or scaling laws for this effect) haven't been done/published yet. It's possible that this doesn't work at all beyond some modest asymptote, and is no better than any of the other tricks currently being stacked.
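To make the scale in that example concrete, here is a small arithmetic sketch using only the figures quoted in the comment (20T tokens, a 1T-parameter model, 100B original tokens). The constant and function names are made up for illustration, and the numbers are the comment's hypothetical, not measurements.

```python
# Figures taken from the comment above; names are illustrative only.
TARGET_TOKENS = 20 * 10**12      # "20T ... tokens"
MODEL_PARAMS = 1 * 10**12        # "a 1T LLM"
ORIGINAL_TOKENS = 100 * 10**9    # "the original 100B tokens on similar topics"


def tokens_per_parameter(tokens: int, params: int) -> float:
    """Training tokens available per model parameter."""
    return tokens / params


def expansion_factor(target_tokens: int, original_tokens: int) -> float:
    """How many times larger the generated set is than its natural source."""
    return target_tokens / original_tokens


print(tokens_per_parameter(TARGET_TOKENS, MODEL_PARAMS))   # 20.0
print(expansion_factor(TARGET_TOKENS, ORIGINAL_TOKENS))    # 200.0
```

In other words, the scenario asks whether rehashing a corpus roughly 200 times over in synthetic form still helps, which is the untested part.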

Creating an AI that could autonomously design, train, and implement a superior version of itself is a concept referred to as recursive self-improvement or AI bootstrapping. While this is a fascinating idea and a topic of much discussion in AI research, it is a difficult task with many challenges and risks.

Let's consider the capabilities you mentioned:

Enhanced Short-Term Memory (Extended Token Window): This is an issue of architecture. In principle, GPT-5 could include such improvements, and they could be beneficial. However, a larger token window would significantly increase computational requirements, and it's unclear how much benefit this would actually provide for the specific task of developing a superior AI.
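As a rough illustration of why a larger token window is expensive, the sketch below assumes a standard dense transformer, where the attention-score and weighted-sum matrix multiplications grow with the square of the context length. The function name and the model shape are hypothetical, and the estimate ignores everything except those two matmuls; it says nothing about how GPT-4 or GPT-5 is actually built.

```python
def attention_flops(context_len: int, d_model: int, n_layers: int) -> int:
    """Rough FLOP count for the two attention matmuls across all layers."""
    # Q @ K^T costs about 2 * n^2 * d FLOPs, and scores @ V costs the same,
    # so each layer contributes about 4 * n^2 * d.
    per_layer = 4 * context_len**2 * d_model
    return n_layers * per_layer


# Hypothetical model shape, chosen only for illustration.
D_MODEL, N_LAYERS = 4096, 32

for n in (8_192, 32_768):
    print(f"{n:>6} tokens: {attention_flops(n, D_MODEL, N_LAYERS):.3e} FLOPs")

# A 4x longer window costs 16x as much under this assumption.
ratio = attention_flops(32_768, D_MODEL, N_LAYERS) / attention_flops(8_192, D_MODEL, N_LAYERS)
print(ratio)  # 16.0
```

Sparse or linear attention variants change this scaling, so the quadratic figure is an upper-end picture for dense attention only.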

Improved Long-Term Memory (Modifying Its Own Weights and Training Dataset): AI models such as GPT-4 or a hypothetical GPT-5 do not have the ability to modify their own weights or training dataset. This ability would require a very different architecture. To design a superior AI, the model would need to understand the complex relationship between a model's weights and its performance, which is currently beyond the capabilities of AI. Even if the AI had this capability, training AI models is a resource-intensive task that requires specific hardware resources and infrastructure.

Specific Resources (Access to a DGX data center): Even if an AI had access to such resources, it would still need to understand how to use them effectively, which would require capabilities beyond what GPT-4 or a hypothetical GPT-5 have.

Code Composition: While GPT-4 can indeed generate code, the task of generating code to train a superior AI is far more complex. It involves a deep understanding of AI architectures, algorithms, and principles, as well as the ability to invent new ones. Even for human AI researchers, creating a superior AI model is a significant challenge that requires years of study and expertise.

In addition to these points, there's also the problem of evaluation. Even if an AI could generate a new AI architecture and train it, it would still need to evaluate the new AI's performance and make decisions about how to improve it. This requires an understanding of AI performance metrics and the ability to interpret them, which is another complex task that current AI models are not capable of.

Furthermore, it's worth noting that creating an AI that can improve itself poses significant ethical and safety concerns. Without careful safeguards and oversight, such an AI could potentially lead to unwanted or even dangerous outcomes.

In conclusion, while the idea of an AI improving itself is theoretically possible and an interesting research direction, it is beyond the capabilities of current AI technology, including GPT-4 and a hypothetical GPT-5. Achieving this goal would likely require significant advances in AI architectures, algorithms, and understanding of AI principles, as well as careful consideration of ethical and safety issues.

Thanks GPT-4. You're the best!  

Veniversum Vivus Vici, do you have any opinions or unique insights to add to this topic?
