Owain_Evans

Comments

Let's buy out Cyc, for use in AGI interpretability systems?

Some small subsets of CYC were released (see Wiki). You could finetune a model on those and use that to estimate the value of the full dataset. (You could also talk to Michael Witbrock, who worked on CYC in the past and is familiar with AI alignment.) 
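
As a rough illustration of the pilot idea (not anything from the post): the sketch below finetunes a small causal language model on a released CYC subset and uses held-out loss as a crude proxy for how much the knowledge base helps. The file name opencyc_sentences.txt is hypothetical and stands in for KB assertions verbalized as English sentences; any small model would do for a first pass.

```python
# Minimal sketch: finetune a small causal LM on verbalized CYC assertions and
# check held-out loss as a rough proxy for the value of the full dataset.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "gpt2"  # stand-in; any small causal LM works for a pilot
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# "opencyc_sentences.txt" is a hypothetical file of KB assertions rendered as text.
raw = load_dataset("text", data_files={"train": "opencyc_sentences.txt"})
splits = raw["train"].train_test_split(test_size=0.1, seed=0)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = splits.map(tokenize, batched=True, remove_columns=["text"])
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="cyc-pilot", num_train_epochs=1,
                           per_device_train_batch_size=8),
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
    data_collator=collator,
)
trainer.train()
print(trainer.evaluate())  # eval_loss on held-out assertions, a crude value signal
```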

There are also various large open-source knowledge bases. There are closed knowledge graphs/bases at Google and many other companies. You might be able to collaborate with researchers at those organizations. 

The Rationalists of the 1950s (and before) also called themselves “Rationalists”

I'm interested in how much the prominent figures in these past Rationalist groups cared about rationality itself, rather than its bedfellows (science, atheism, socialism or communism, etc.). A related question is whether these groups sometimes functioned as a fig leaf for a certain kind of political association (e.g. scientifically-minded socialists).

From reading the J. B. S. Haldane biography linked in the OP, I got the sense that Haldane cared most about science and the status of scientists in society. He seemed to care less about rationality per se than about science. He was a devoted communist for a period, but this also stemmed (in part) from the value he placed on science. (He held the view that communist countries gave more status to scientists and were run more scientifically.) So I doubt he was involved with the Rationalist Association because of the politics (though maybe if the politics had been very conservative he would have left).

 

The Rationalists of the 1950s (and before) also called themselves “Rationalists”

This is a great list of posts. I had some of these in mind but hadn't remembered all of them. Thanks!

The Rationalists of the 1950s (and before) also called themselves “Rationalists”

There is also a French non-profit called the Rationalist Union, co-founded by Langevin (of the Langevin equation and the twin paradox). Apparently, Borel, Einstein, and Hadamard all had some honorary role in the past. Like the British Rationalist Association, it seems to have been associated with socialism and communism during the mid-20th century. The best source I could find is the translated French Wikipedia page.

Shulman and Yudkowsky on AI progress

Interesting. Can you point to a study by external researchers (not DeepL) that compares DeepL to other systems (such as Google Translate) quantitatively? After a quick search, I could only find one paper, which just tested some tricky idioms in Spanish and didn't find significant differences between DeepL and Google. (Wikipedia links to this archived page of comparisons conducted by DeepL, but there's no information about the methodology used, and the differences in performance seem too big to be credible to me.)

The Rationalists of the 1950s (and before) also called themselves “Rationalists”

Thanks, Rob! I agree with this summary. It is unfortunate that "rationalism" has this standard usage in philosophy ("rationalist vs empiricist"). This usage is not completely unrelated to the "rational vs superstitious/irrational" distinction, which makes it more likely to confuse. That said, outside of the fields of philosophy and intellectual history, not many people are aware of the rationalist/empiricist distinction, and so I don't see it as a major problem. 

The Rationalists of the 1950s (and before) also called themselves “Rationalists”

Yes, I said "overlap" not "coincide" for that reason. The present movement has more discussion of applied epistemology, ideology, and world-view formation, and less discussion specifically focused on religion. My sense is that the earlier movement was also more focused on Christianity than on religion or ideology in general.

Evolution was a pretty new theory in 1880, so it makes sense that it would be discussed more. (AI is a big topic for the present movement and not for the earlier one.)

AMA on Truthful AI: Owen Cotton-Barratt, Owain Evans & co-authors

It seems like fine-tuning a general language model on truthful question-answers didn't generalize all that well

I'm not sure what you're referring to. Is there some experiment you have in mind? People have finetuned general language models to answer general-knowledge questions correctly, but I don't know of people finetuning for being truthful (i.e. avoiding falsehoods). 
 

How might you design or train a language model so that it would have some general factor of honesty that could be easily affected by fine-tuning?

There's a more general question: can you train a model to have property X in such a way that finetuning is unlikely to easily remove property X? If any kind of finetuning is allowed, I think the answer will be "No". So you'd need to place restrictions on the finetuning (one possible restriction is sketched below). There is probably some work in ML on this problem.
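
To make "restrictions on the finetuning" concrete, here is a minimal sketch of one such restriction, not anything proposed in the paper: penalize the finetuned model's next-token distribution for diverging from a frozen reference copy, so the finetuning cannot cheaply move the model far from its original behavior. The model name, learning rate, and kl_weight value are all illustrative assumptions.

```python
# Illustrative sketch: finetuning with a KL penalty toward a frozen reference model,
# one possible way to restrict how far finetuning can move the model.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("gpt2")   # frozen reference copy
model = AutoModelForCausalLM.from_pretrained("gpt2")  # copy being finetuned
for p in base.parameters():
    p.requires_grad_(False)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
kl_weight = 0.1  # strength of the restriction (hypothetical value)

def finetune_step(texts):
    batch = tokenizer(texts, return_tensors="pt", padding=True)
    labels = batch["input_ids"].clone()
    labels[batch["attention_mask"] == 0] = -100  # ignore padding in the LM loss
    out = model(**batch, labels=labels)
    with torch.no_grad():
        ref_logits = base(**batch).logits
    # Penalty for the finetuned model's next-token distribution drifting away
    # from the reference model's distribution.
    kl = F.kl_div(F.log_softmax(out.logits, dim=-1),
                  F.softmax(ref_logits, dim=-1),
                  reduction="batchmean")
    loss = out.loss + kl_weight * kl
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```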

To address the specific question about honesty/truthfulness, we discuss some ways to design language models in the executive summary and (at greater length) in Section 5 of the paper (e.g. discussion of "robust" truthfulness). 
 

AMA on Truthful AI: Owen Cotton-Barratt, Owain Evans & co-authors

(This won't address all parts of your questions.) 
You suggest that the default outcome is for governments and tech platforms to not regulate whether AI needs to be truthful. I think it’s plausible that the default outcome is some kind of regulation. 

Why expect regulation?
Suppose an AI system produces false statements that deceive a group of humans. Suppose also that the deception is novel in some way: e.g. the falsehoods are personalized to individuals, the content/style is novel, or the humans behind the AI didn't intend any kind of deception. I think if this happens repeatedly, there will be some kind of regulation. This could be voluntary self-regulation from tech companies or normal regulation by governments. Regulation may be more likely if it’s harder to litigate using existing laws relating to (human) deception. 

Why expect AI to cause deception?
You also suggest that in the default scenario, AI systems say lots of obviously false things, most humans learn to distrust them, and so there's little deception in the first place. I'm uncertain about this, but your position seems overconfident. Some considerations:

1. AI systems that generate wild and blatant falsehoods all the time are not very useful. For most applications, it’s more useful to have systems that are fairly truthful in discussing non-controversial topics. Even for controversial or uncertain topics, there’s pressure for systems to not stray far from the beliefs of the intended audience.

2. I expect some people will find text/chat by AI systems compelling based on stylistic features. Style can be personalized to individual humans. For example, texts could be easy to read (“I understood every word without pausing once!”) and entertaining (“It was so witty that I just didn’t want to stop reading”). Texts can also use style to signal intelligence and expertise (“This writer is obviously a genius and so I took their views seriously”).

3. Sometimes people won't know whether it was an AI or human who generated the text. If there are tools for distinguishing, some people won’t use them and some won’t have enough tech savvy to use them well. 

4. There are humans (“charlatans”) who frequently say false and dubious things while having devoted followers. Not all human followers of charlatans are “fools” (to refer to your original question). AI charlatans would have the advantage of more experimentation. Human charlatans exploit social proof and AI charlatans could do the same (e.g. humans believe the claim X because they think other humans they like/trust believe X).

Truthful AI: Developing and governing AI that does not lie

Standards for truthful AI could be "opt-in". So humans might (a) choose to opt into truthfulness standards for their AI systems, and (b) choose from multiple competing evaluation bodies. Standards need not be mandated by governments to apply to all systems. (I'm not sure how much of your Balkanized internet is mandated by governments rather than arising from individuals opting into different web stacks). 

We also discuss having different standards for different applications. For example, you might want stricter and more conservative standards for AI that helps assess nuclear weapon safety than for AI that teaches foreign languages to children or assists philosophers with thought experiments. 
