sanxiyn

First, I would like to second that the world is incredibly small. It bears repeating; I am repeating it to myself to get the courage to write this comment. Maybe this is obvious, but maybe it is not. It could be helpful.

Random thoughts on the alleged OpenAI memo about selling AGI to the highest bidder, including China and Russia. This sounds plausible to me, because as I understand it, before the split with Anthropic, OpenAI was very much "team humanity, not team USG or team CCP". I think this should be understood in the context that getting aligned AI was a higher priority than geopolitical competition.

Random thoughts on AI labs and coups. Could Los Alamos have staged a coup? Obviously not in the real timeline: they had no delivery mechanism, none of bomber, ICBM, or nuclear submarine. But let's assume that after the Trinity test they could unilaterally put a newly produced nuke, not yet delivered to the army, on an ICBM and point it at Washington, DC. Could Los Alamos force Truman, say, to share the nuke with the Soviet Union (which many scientists actually wanted)?

By assumption, Truman should surrender (even unconditionally), but it is hard to imagine that he would. Nuclear threats not only need to be executable, they also need to be understood. Also, Los Alamos would depend on the enriched uranium supply chain, a large industry not under its control; physical security of Los Alamos was under army control, and what if the security guards just walked into the Technical Area?

Applying this to OpenAI, or a possible OpenAI-in-the-desert: OpenAI would depend on a trillion-dollar cluster and its supply chain, a large industry not under its control, and it would face the same physical security problem. How does OpenAI defend against tanks on the streets of San Francisco? With ASI-controlled drones? Why would OpenAI conveniently happen to have drones and drone factories on premises?

I am trying to push back against "if you have ASI, you are the government". If government is a monopoly on violence, millions of perfectly coordinated von Neumanns do not immediately overthrow the USG, the key word being immediately. Considering von Neumann's talk of nuking Moscow today instead of tomorrow, and at lunch instead of at dinner, it would be pretty quick, but it still takes time to build fabs and power plants and data centers and drone factories, etc. Even if you use nanotechnology to build them, it still takes time to research nanotechnology.

Maybe they develop a mind-control-level convincing argument and send it to key people (the president, Congress, NORAD, etc.), or hack their iPhones, and so on recursively down to the security guards of fabs/power plants/data centers/drone factories. That may be quick enough. The point is that it is not obvious.

Random thoughts on Chinese AI researchers and immigration. The US's track record here is extremely bad, even during the Cold War. Do you know how China got nukes and missiles? The US deported Qian Xuesen, an MIT graduate who co-founded JPL. He held US military rank in WW2. He interrogated Wernher von Braun for the USG! Then the USG decided Qian was a communist, which was completely ridiculous. Then Qian went back and worked for the communists. Whoops. Let me quote the Atomic Heritage Foundation:

Deporting Qian was the stupidest thing this country ever did. He was no more a communist than I was, and we forced him to go.

The US would be well advised to avoid repeating this total fiasco. But I am not optimistic.

Unclear. I think there is a correlation, but: one determinant of crawl completeness/quality/etc. is the choice of seeds. It is known that the Internet Archive crawl has better Chinese data than Common Crawl, because they made a specific effort to improve seeds for the Chinese web. Data that is missing because of seed-choice bias is probably not of particularly lower quality than the average of what is in Common Crawl.

(That is, to clarify: yes, in general effort is spent to make quality writing easily accessible (hence easily crawlable), but accessibility is relative to the choice of seeds, and it is in fact the case that being easily accessible from the Chinese web does not necessarily entail being easily accessible from the English web.)
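To make the seed-relativity point concrete, here is a toy sketch (the link graph and page names are entirely invented for illustration, not real crawl data): the same breadth-first crawl reaches different pages depending on which seeds it starts from.

```python
from collections import deque

# Invented toy link graph: Chinese-web pages link out to the English-web wiki,
# but nothing on the English side links back into the Chinese pages.
LINKS = {
    "en-news.example":   ["en-blog.example", "wiki.example"],
    "en-blog.example":   ["wiki.example"],
    "wiki.example":      ["en-news.example"],
    "zh-portal.example": ["zh-forum.example", "wiki.example"],
    "zh-forum.example":  ["zh-portal.example"],
}

def crawl(seeds):
    """Breadth-first 'crawl' over the toy link graph, returning all reachable pages."""
    seen, frontier = set(seeds), deque(seeds)
    while frontier:
        page = frontier.popleft()
        for target in LINKS.get(page, []):
            if target not in seen:
                seen.add(target)
                frontier.append(target)
    return seen

print(crawl(["en-news.example"]))   # never reaches zh-portal.example or zh-forum.example
print(crawl(["zh-portal.example"])) # reaches everything, via the outbound link to wiki.example
```

Same crawler, same web; only the seeds differ, and the Chinese pages are simply invisible from the English seed.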

sanxiyn

Modern AI is trained on a huge fraction of the internet

I want to push back against this. The internet (or the World Wide Web) is incredibly big. In fact, we don't know exactly how big it is; measuring its size is a research problem!

When people say this, what they mean is that it is trained on a huge fraction of Common Crawl. Common Crawl is a crawl of the web that is free to use. But there are other crawls, and you could crawl the web yourself. Everyone uses Common Crawl because it is decent and because crawling the web is itself a large engineering project.

But Common Crawl is not at all a complete crawl of the web; it is very far from complete. For example, Google has its own proprietary crawl (which you could access as the Google cache). Probabilistic estimates of the size of Google's search index suggest it is 10x the size of Common Crawl. And Google's crawl is also not complete. Bing also has its own crawl.
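I don't know exactly which methodology the 10x figure comes from, but one classic way to estimate the size of an index you cannot enumerate is capture-recapture: draw two independent samples of URLs and see how often they overlap. A minimal sketch with invented numbers:

```python
# Lincoln-Petersen (capture-recapture) estimator: if two independent samples of
# sizes n1 and n2 share m items, the whole population is roughly n1 * n2 / m.
def lincoln_petersen(n_first: int, n_second: int, n_overlap: int) -> float:
    """Estimate total population size from two independent samples."""
    return n_first * n_second / n_overlap

# Say we sample 10,000 indexed URLs one way and 10,000 another way, and 40 URLs
# appear in both samples (all numbers invented purely for illustration):
estimated_index_size = lincoln_petersen(10_000, 10_000, 40)
print(f"{estimated_index_size:,.0f} pages")  # 2,500,000 pages in this toy example
```

Real studies have to work hard to make the samples anywhere near independent and unbiased; the arithmetic itself is the easy part.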

It is known that Anthropic runs its own crawler, called ClaudeBot. A web crawl is a highly non-trivial engineering project, but it is also reasonably well understood. (Although I have heard that you keep encountering new issues as you approach more and more extreme scales.) There are also failed search engines with their own web crawls, and you could buy them.

There is also another independent web crawl that is public! The Internet Archive has its own crawl; it is just less well known than Common Crawl. Recently someone made use of the Internet Archive crawl and analyzed its overlap with and difference from Common Crawl; see https://arxiv.org/abs/2403.14009.

If the data wall is a big problem, making use of the Internet Archive crawl is the first obvious thing to try. But as far as I know, that 2024 paper is the first public literature to do so. At minimum, any analysis of the data wall should take into account both Common Crawl and the Internet Archive crawl with overlap excluded, but I have never seen anyone do this.
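As a rough sketch of what "with overlap excluded" could look like in practice (the file names, the URL normalization rule, and the URL-only definition of overlap are all my assumptions; real pipelines also deduplicate on document content, not just URL):

```python
import hashlib
from urllib.parse import urlsplit

def url_key(url: str) -> bytes:
    """Normalize a URL (drop scheme/fragment, lowercase host) and hash it to a fixed-size key."""
    parts = urlsplit(url.strip())
    normalized = parts.netloc.lower() + parts.path + ("?" + parts.query if parts.query else "")
    return hashlib.sha1(normalized.encode("utf-8")).digest()

def load_keys(path: str) -> set[bytes]:
    """Read a one-URL-per-line file (assumed format) into a set of hashed keys."""
    with open(path, encoding="utf-8") as f:
        return {url_key(line) for line in f if line.strip()}

common_crawl_keys = load_keys("common_crawl_urls.txt")  # hypothetical URL list dump

archive_only = []
with open("internet_archive_urls.txt", encoding="utf-8") as f:  # hypothetical URL list dump
    for line in f:
        if line.strip() and url_key(line) not in common_crawl_keys:
            archive_only.append(line.strip())

print(f"{len(archive_only)} URLs appear only in the Internet Archive crawl")
```

At the scale of real crawls you would not hold one crawl's URL set in memory like this; you would reach for Bloom filters, sorted-key merges, or MinHash-style near-duplicate detection, but the logical operation is the same.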

My overall point is that Common Crawl is not the web. It is not complete, and there are other crawls, both public and private. You can also crawl yourself, and we know AI labs do. How much this helps is unclear, but I think 10x is very likely, although probably not 100x.

sanxiyn

I don't think the post is saying the result is not valuable. The claim is that it underperformed expectations. Stock prices fall when companies underperform expectations, even if they are profitable. That does not mean they made a loss.

sanxiyn

Unsure about this. Isn't Qwen on the Chatbot Arena leaderboard, and isn't it made by Alibaba?

sanxiyn

No. Traditionally, donors have no standing to sue a charity. From https://www.thetaxadviser.com/issues/2021/sep/donor-no-standing-sue-donor-advised-fund.html:

California limits by statute the persons who can sue for mismanagement of a charitable corporation's assets. The court found that the claims raised by Pinkert for breach of a fiduciary duty for mismanagement of assets were claims for breach of a charitable trust. The court determined that under California law, a suit for breach of a charitable trust can be brought by the attorney general of California...

sanxiyn

Someone from South Korea is extremely skeptical and wrote a long thread going into the paper's details and why it must be 100% false: https://twitter.com/AK2MARU/status/1684435312557314048. Sorry, it's in Korean, but we live in the age of miracles and serviceable machine translation.

sanxiyn

But it wasn't until the 1940s and the advent of the electronic computer that they actually built a machine that was used to construct mathematical tables. I'm confused...

You are confused because that is not the reality. As you can read in Wikipedia's entry on the difference engine, Scheutz built a difference engine derivative, sold it, and it was used to create logarithmic tables.

You must have read this while writing the article. It is prominent in the Wikipedia article in question and hard to miss. Why did you make this mistake? If it was a deliberate act of misleading for narrative convenience, I am very disappointed. Yes, reality is rarely narratively convenient, but you shouldn't lie about it.

My median estimate has been 2028 (so 5 years from now). I first wrote down 2028 in 2016 (12 years out at the time), and in the 7 years since, I have barely moved the estimate. Things have roughly happened when I expected them to.
