No, the preferred form for modifying a model is a copy of the weights, plus open source code for training and inference. "Training a similar model from scratch" is wildly more expensive and less convenient, and not even modification!
If the model weights are available under an OSI-approved open source license, and so is code suitable for fine-tuning, I consider the model to be open source. Llama models definitely aren't; most Chinese models are.
wildly more expensive
Suppose I write a program and let people download the binary. Can I say "I spent 100k on AWS to compile it, therefore the binary is open source"?
not even modification
Would you say compiling source code from scratch (e.g. for a different platform) is not a modification?
Even if you're not intending to retrain the model from scratch, simply knowing what the training data is is valuable. Maybe you don't care about the training data, but somebody else does. I don't think "I could never possibly make use of the source code / training data" is an argument that a binary / weights are actually open source.
How does open source differ from closed source for you in the case of generative models? If they are the same, why use the term at all?
If I vibe-coded an app prompting, say, Claude, and released it along with the generated code, would you have the same objections to me calling it "open source," because I haven't also released the weights (and Anthropic's training data, of course) that generated it?
Your argument simply takes for granted that the way to think of modern AI models is as compiled binaries, based on some superficial similarities. I consider this specious: it looks different from the programs you're used to, but the weights + inference code are the program, and training data + RLHF + safety abliteration + jailbreaking safeguards + secret sauce (maybe they run magnets over a hard drive containing the weights) are tools used to create the code they release. This view is supported by the fact that the releasing company has no more-preferred "source code" form in which it interacts with the model: I think you're simply wrong in suggesting they prefer to work with the model by editing the training data and "recompiling" instead of starting with the weights (which they released) and modifying them directly through fine-tuning.
If I vibe-coded an app prompting, say, Claude, and released it along with the generated code, would you have the same objections to me calling it "open source,"
No, because I don't think this misleads people. Granted, the term "open source" is fuzzy at the boundaries. Should we use the term? I don't know, but if we do, it only makes sense if it means something different from "closed source".
wrong in suggesting they prefer to work with the model by editing the training data and "recompiling" instead of starting with the weights
One doesn't exclude the other. If you're creating v2 of your model, you'd likely take the training code and data for v1, make some changes / add new things, and run the new training code on the new data. For minor changes you may prefer to fine-tune the weights.
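To make those two paths concrete, here's a toy sketch with a one-parameter "model" y = w * x trained by gradient descent. This is purely illustrative; nothing here resembles a real LLM training pipeline, and all numbers are invented.

```python
# Toy illustration of the two modification paths: full retraining on
# changed data vs. fine-tuning the released weights. All values invented.

def train(data, steps=2000, lr=0.05, w=0.0):
    """Fit y = w * x by gradient descent, starting from weight `w`."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

v1_data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # "training data" for v1
w_v1 = train(v1_data)                           # the full v1 training run

# Path 1 (major change): edit the training data and retrain from scratch.
v2_data = v1_data + [(4.0, 8.4)]
w_v2_retrained = train(v2_data)

# Path 2 (minor change): start from the released v1 weights and
# fine-tune briefly on just the new data.
w_v2_finetuned = train([(4.0, 8.4)], steps=50, lr=0.01, w=w_v1)
```

Which path you'd reach for depends on how big the change is and how expensive `train` is, which is exactly the cost asymmetry this thread is arguing about.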
There's a clear and obvious difference between models like Qwen, DeepSeek, and Llama; and models like ChatGPT, Claude, and Gemini; and the well-established and widely-understood phrase for this difference is "open source", contrasted with "proprietary" or "closed source," used in things like hardware, fonts,[1] and military intelligence. If you like, think of it as a kind of fossilization of the phrase, where the "source" part has ceased to be more than an etymological curiosity; you can certainly dislike this phenomenon, but – and I say this with regret since I'm far more prescriptivist than the next guy – trying to change it is probably pissing upwind.
The restrictions on usage are a better argument: among the models with weights available, some are clearly more "open source"[2] than others, and I'd even agree that Llama's 700-million-user restriction means that, while for most practical purposes it's open source, it's technically only "source-available."
It's not obvious to what extent fonts count as programs, and their "source code" is usually nothing more than the glyphs, which can be read out from proprietary fonts trivially. Maybe there's a bit of obfuscation one could perform on the feature file?
I agree it's useful vocabulary, and reducing it to a binary makes it less so.
well-established
The usage I'm objecting to started, as far as I can tell, about 2 years ago with Llama 2. The term "open weights", which is often used interchangeably, is a much better fit.
Let's say I want to use the model for my company, while making sure it knows nothing about competitors. The most reliable way to do that would be to remove such references from the training data and retrain the model.
It would be worth responding to the claim that you could do this with RL. In my mind, the important thing is that when you're distributing a base model, you need to distribute the source code necessary to make a new base model, but RL produces something different from a base model.
Arguably, an RL-trained model with a closed-source base model is open source in the same sense as open-source code that interacts with closed-source components (like Nouveau with Nvidia firmware). It would still be pretty sketchy for a company to claim to release an "open source" RL-trained model built on its own closed-source/open-weight base model, though.
At some point the open/closed distinction becomes insufficient as a description. You could very well have an open-source wrapper (or fine-tuning) of something which is closed-source. Just try to not mislead people about what you're offering.
Disagreements degenerate into debates about word usage way too often and such debates are usually pointless. However, sometimes words can mislead or provoke emotional reactions and in those cases word usage becomes important. This is one of those cases.
Many modern AI models, such as those typically used for text or image generation, consist mainly of a huge neural network. Models whose parameters (a.k.a. weights) are freely distributed on the internet came to be known as open-weight models. This is in contrast to models whose parameters are kept secret on some servers and are only accessible through APIs or web interfaces.
Some[1] have instead used the term open source to designate the same thing. I cannot speak to the motivation behind that word choice, but it ends up distorting what "open source" means, creating confusion, and making publishing weights seem more virtuous than it is.
I won't get into all the details of defining open source. Here are a couple of points from the definition by the Open Source Initiative (the organization that introduced the term in 1998): [2]
2. Source Code
The program must include source code, and must allow distribution in source code as well as compiled form. Where some form of a product is not distributed with source code, there must be a well-publicized means of obtaining the source code for no more than a reasonable reproduction cost, preferably downloading via the Internet without charge. The source code must be the preferred form in which a programmer would modify the program. Deliberately obfuscated source code is not allowed. Intermediate forms such as the output of a preprocessor or translator are not allowed.
What is the source code of a generative model?
Let's say I want to use the model for my company, while making sure it knows nothing about competitors. The most reliable way to do that would be to remove such references from the training data and retrain the model. This illustrates that the preferred form for modifying a generative model, such as Llama or DeepSeek, would include:

- the training data
- the code used for training
- the code used for inference
This is not what is provided when a model's weights are published, so we should not call it "open source". There are a number of reasons why providing the training data is unrealistic and probably never going to happen. Retraining the model would cost a fortune anyway, so one could argue that open source is simply an unsuitable distribution method for generative models.
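For concreteness, here is what the scrubbing step could look like if the training data were available. This is a hypothetical sketch: the competitor names are invented and the corpus is reduced to a list of strings.

```python
# Hypothetical "scrub competitors from the training data" step.
# Competitor names and documents are invented for illustration.

COMPETITORS = {"acme corp", "globex"}

def filter_corpus(documents):
    """Keep only documents that mention no competitor (case-insensitive)."""
    return [
        doc for doc in documents
        if not any(name in doc.lower() for name in COMPETITORS)
    ]

corpus = [
    "Our product roadmap for 2025.",
    "Benchmark results against Acme Corp hardware.",
    "Globex announced a rival model last week.",
    "Internal style guide for documentation.",
]

clean_corpus = filter_corpus(corpus)  # retraining would start from this
```

The filtering itself is trivial; the point is that without the training data there is nothing to run it on.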
5. No Discrimination Against Persons or Groups
The license must not discriminate against any person or group of persons.

6. No Discrimination Against Fields of Endeavor
The license must not restrict anyone from making use of the program in a specific field of endeavor. For example, it may not restrict the program from being used in a business, or from being used for genetic research.
Open-weight generative models typically restrict usage based on company size and use case (e.g. Llama's license disallows military use and use by companies with more than 700 million monthly active users).
It is unclear whether the company-size restriction breaks rule 5, but I would argue it goes against the spirit of the rule. The use-case restriction certainly breaks rule 6.
While "open source" was defined before open-weight generative models were a thing, the intuition behind the term points to something quite different from those models. Open weights very closely match what was traditionally considered closed source: training corresponds to compilation, and the output is an inscrutable array of numbers which can be used to perform some computation.
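As a toy illustration of that correspondence (all numbers invented): the released artifact is essentially an array of parameters plus inference code that runs them, much like a compiled binary plus a loader.

```python
# Toy "released model": an inscrutable array of numbers (the weights)
# plus the inference code needed to run them. Values are invented.

WEIGHTS = [[0.8, -0.4], [0.3, 0.9]]  # hypothetical released parameters
BIASES = [0.1, -0.2]

def infer(x):
    """Inference: a single linear layer followed by ReLU."""
    return [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(WEIGHTS, BIASES)
    ]
```

You can run `infer`, redistribute `WEIGHTS`, even tweak individual numbers, but nothing here tells you what data or process produced them, just as a binary doesn't carry its source.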
Publishing weights is often misrepresented as if it supported some open source ideology. In my opinion, companies use it to signal things like "we're the good guys" and "we care about our users".
The term "open source" is useful vocabulary and becomes meaningless if we use it when we actually mean "closed source".
EDIT
After getting some pushback, which doesn't address the core argument, I think I could have been more articulate.
There is no dichotomy between training data and weights - providing one doesn't exclude the other. Having the training data doesn't mean you cannot also use the weights directly. Claiming the model is open source without the training data would only make sense if nobody ever wanted to use the training data for modification.
Closed source doesn't imply the model is hidden behind an API. There are countless examples of software you can download and run locally (or even modify at the machine-code level) that is still considered closed source. If we start calling this open source, the two terms become meaningless.
What is "preferred" depends on the person and the context. I took "preferred" in a hypothetical where price wasn't an issue. The fact that training price can be an issue (while for classical software it virtually never is) shows that the term "open source" doesn't map flawlessly onto generative AI. Add in the differences of opinion demonstrated by the comment section, and the term "open weights" clearly fits generative models better and creates less confusion.