
Language Models (LLMs)

Edited by plex, last updated 30th Dec 2024

You are viewing revision 1.2.0, last edited by plex

Language models are a class of AI systems trained on text, usually to predict the next word or a word that has been masked out. They can generate novel prose or code from an initial prompt, which gives rise to a kind of natural-language programming called prompt engineering. The most popular architecture for very large language models is the transformer, which follows consistent scaling laws with respect to the size of the model being trained: given enough data and compute, a larger model produces results that are better by a predictable amount (when measured by 'perplexity', or how surprised the model is by a test set of human-generated text).
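To make 'perplexity' concrete, here is a minimal sketch of how it can be computed. It assumes we already have the probability the model assigned to each actual token of a test text; the function name `perplexity` and the toy probability lists are illustrative, not taken from any particular library.

```python
import math


def perplexity(token_probs):
    """Perplexity from the probabilities a model assigned to the actual tokens.

    Defined as exp of the average negative log-probability per token.
    Lower values mean the model was less "surprised" by the text.
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log_prob = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log_prob)


# Toy inputs (made up for illustration): one model gives every actual token
# probability 0.5, the other gives every actual token probability 0.25.
less_surprised = perplexity([0.5, 0.5, 0.5, 0.5])
more_surprised = perplexity([0.25, 0.25, 0.25, 0.25])
print(less_surprised, more_surprised)
```

A model that assigns probability 1/k to every actual token has perplexity k, so perplexity can be read loosely as the number of equally likely choices the model is hedging between at each step.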
