All of delton137's Comments + Replies

Visible Thoughts Project and Bounty Announcement

I don't have much direct experience with transformers (I was part of some research with BERT once where we found it was really hard to use without adding hard-coded rules on top, but I have no experience with the modern GPT stuff). However, what you are saying makes a lot of sense to me based on my experience with CNNs and the attempts I've seen to explain/justify CNN behaviour with side channels (for instance this medical image classification system that also generates text as a side output). 

See also my comment on Facebook

Visible Thoughts Project and Bounty Announcement

I think what you're saying makes a lot of sense. When assembling a good training data set, it's all about diversity. 

1oge6dIt'd be hard for humans to compete with AI unless humans can communicate with the AI in reasonable-sized chunks e.g. a 100-page document. Me, I think we should chat in 10-page documents or less.
Visible Thoughts Project and Bounty Announcement

(cross-posting this comment from E. S. Yudkowsky's Facebook with some edits / elaboration)

Has anyone tried fine-tuning a transformer on small datasets of increasing size to get a sense of how large a dataset would be needed to do this well? I suspect it might have to be very large.

Note this is similar to the "self-explaining AI" idea I explored in early 2020, which I threw together a paper on (I am hesitant to link to it because it's not that great of a paper and much of the discussion there is CNN-specific, but here it is). I can see how producing "thoug... (read more)

4nostalgebraist5dI've fine-tuned GPT models on a bunch of different datasets of different sizes, although not this particular dataset (which doesn't exist yet). Below I list some key things to note. Also see here [https://www.lesswrong.com/posts/pv7Qpu8WSge8NRbpB/larger-language-models-may-disappoint-you-or-an-eternally#5__what_scaling_does] for related discussion. These points hold true for typical tasks/datasets, though a few unusual ones like arithmetic behave differently. * GPT performance tends to scale smoothly and gradually with data/model size, over multiple orders of magnitude. * In terms of subjective response, you don't need much data to get GPTs to the level of "hey, it kinda gets it!". * You may need several orders of magnitude more data to reach the point of saturation where the model can't improve with additional data. * Incomplete mastery usually looks more like "randomly failing X% of the time" than "understanding X% of the content of the task," which can make it difficult to assess quality (or quality differences) at a glance. For a concrete example, here is a data scaling experiment [https://wandb.ai/nostalgebraist/mesh-transformer-jax/reports/GPT-J-fine-tuning-val-loss-scaling-with-data-size--VmlldzoxMjg5MTUx?accessToken=yugkwjdz5fp5vnov8zb9wczmnihwzzcj233zvtjwoo7r7fkm0eeoxw8tpzlhc9ln] I did with GPT-J (6.1B params) on the tumblr post dataset I use for my tumblr bot [GPT-J (6.1B params) ]. My full dataset is roughly 4 times as large as the 30M word dataset proposed here, i.e. the 30M word dataset would be roughly as big as the 25% subsample shown in the report. The linked report only shows val loss, which is not very interpretable, but at least conveys that I haven't reached diminishing returns yet. This seems plausible from subjective evidence, as the model still sometimes misunderstands tumblr lingo / the conversational structure of the data / etc.
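For concreteness, here is a minimal sketch of the kind of data-scaling sweep nostalgebraist describes: fine-tune on nested subsets of the training data and track validation loss. It assumes the Hugging Face transformers/datasets libraries; the model name, file names, and hyperparameters are placeholders, not the settings from the linked report.

```python
# Sketch of a data-scaling experiment: fine-tune a causal LM on nested
# subsets of a text dataset and record validation loss for each subset size.
# Placeholders: "gpt2", train.txt / val.txt, and all hyperparameters.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "gpt2"  # stand-in; the report above used GPT-J (6.1B)
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

raw = load_dataset("text", data_files={"train": "train.txt", "val": "val.txt"})
tokenized = raw.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"])
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

val_losses = {}
for frac in (0.125, 0.25, 0.5, 1.0):
    n = int(frac * len(tokenized["train"]))
    subset = tokenized["train"].shuffle(seed=0).select(range(n))
    model = AutoModelForCausalLM.from_pretrained(model_name)  # fresh weights per run
    args = TrainingArguments(output_dir=f"run_{frac}", num_train_epochs=1,
                             per_device_train_batch_size=4, report_to="none")
    trainer = Trainer(model=model, args=args, train_dataset=subset,
                      eval_dataset=tokenized["val"], data_collator=collator)
    trainer.train()
    val_losses[frac] = trainer.evaluate()["eval_loss"]

print(val_losses)  # val loss vs. fraction of the training data used
```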
5Stella Biderman6dUsing the stated length estimates per section, a single run would constitute approximately 600 pages of single spaced text. This is a lot of writing.

We're guessing 1000 steps per reasonably-completed run (more or less, doesn't have to be exact) and guessing maybe 300 words per step, mostly 'thought'.  Where 'thoughts' can be relatively stream-of-consciousness once accustomed (we hope) and the dungeon run doesn't have to be Hugo quality in its plotting, so it's not like we're asking for a 300,000-word edited novel.
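(As a rough consistency check between these two estimates, assuming roughly 500 words per single-spaced page:)

$$1000 \text{ steps} \times 300 \tfrac{\text{words}}{\text{step}} = 300{,}000 \text{ words}, \qquad \frac{300{,}000 \text{ words}}{500 \text{ words/page}} = 600 \text{ pages}$$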

However, I could also see the "thoughts" output misleading people: they might mistake the model's explanations for the calculations actually going on inside the model to produce an output.

I think the key point on avoiding this is the intervening-on-the-thoughts part:
"An AI produces thoughts as visible intermediates on the way to story text, allowing us to watch the AI think about how to design its output, and to verify that we can get different sensible outputs by intervening on the thoughts".

So the idea is that you train things in such a way that the thoughts do map onto the calculations going on inside the model.

Should PAXLOVID be given preventively?

Note: Pfizer started a trial in September to try to answer this question. We may know the answer in a few months. In theory I don't see why it wouldn't work, but with limited supply there are probably better uses, at least in the next few months.

Also, note that the initial EUA application asks that it be approved for high-risk patients only, probably because Pfizer was told by the FDA it wouldn't be granted an EUA otherwise.

Paxlovid must be taken with ritonavir (otherwise Paxlovid breaks down too fast), which messes with liver enzymes and isn't a good choice for man... (read more)

Does anyone know what Marvin Minsky is talking about here?

Very cool, will take a look. This basically solves question 1. It seems the original Solomonoff work isn't published anywhere. By the way, the author, William H. Press, is a real polymath! I am curious whether there is any extension of this work to agents with finite memory. As an example: the same situation where you're screening a large number of people, but now you have a memory where you can store N results of prior screenings for reference. I'm going to look into it.

2gwern17dSeems like a memory version would be identical, just with a smaller n after subtracting the individuals you screen. When you fill up your memory with cleared individuals, why would you then ever want to 'forget' them? By stipulation, you learn nothing about other individuals or the population, only about the ones you look at. If you forget them to replace them with a new memory, that de facto makes the n bigger, and worsens your odds since you've flushed back into the pool the only individuals you knew for sure you never want to sample again (because they are clear) and so now you may waste a sample to test them again while gaining nothing. And once you remove them from the population via your memory, you're back to the solved memoryless problem and have to square-root it.
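To make the memory-vs.-no-memory point concrete, here is a toy Monte Carlo (my own illustration under simplified assumptions, not anything from the Press paper): a single guilty individual hidden uniformly in a population of n, screened one at a time, comparing a screener who remembers cleared individuals (effectively sampling without replacement) against one who forgets them (sampling with replacement).

```python
# Toy simulation: expected number of screenings to find one guilty person in
# a population of n, with vs. without memory of who has already been cleared.
# Without replacement (perfect memory), the expectation is (n + 1) / 2;
# with replacement (no memory), it is n.
import random

def tests_needed(n, has_memory, rng):
    guilty = rng.randrange(n)
    cleared = set()
    tests = 0
    while True:
        person = rng.randrange(n)
        if has_memory and person in cleared:
            continue  # memory lets us skip someone we already know is clear
        tests += 1
        if person == guilty:
            return tests
        cleared.add(person)

rng = random.Random(0)
n, trials = 100, 20_000
avg_mem = sum(tests_needed(n, True, rng) for _ in range(trials)) / trials
avg_nomem = sum(tests_needed(n, False, rng) for _ in range(trials)) / trials
print(f"with memory:    {avg_mem:.1f} tests on average")   # ~ (n + 1) / 2 = 50.5
print(f"without memory: {avg_nomem:.1f} tests on average")  # ~ n = 100
```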
Possible research directions to improve the mechanistic explanation of neural networks

Here's another paper on small / non-robust features, but rather specific to patch-based vision transformers: 
Understanding and Improving Robustness of Vision Transformers through Patch-based Negative Augmentation
^ This work is very specific to patch-based methods. Whether patches are here to stay, and for how long, is unclear to me, but right now they seem to be in the ascendancy.

Improving on the Karma System

For what it's worth - I see value in votes being public by default. It can be very useful to see who upvoted or downvoted your comment. Of course then people will use the upvote feature just to indicate they read a post, but that's OK (we are familiar with that system from Facebook, Twitter, etc). 

I'm pretty apathetic about all the other proposals here. Reactions seem to me to be unnecessary distractions. [Side note: emojis are very ambiguous, so it's good you put words next to each one to explain what they are supposed to mean.] The way I woul... (read more)

Discussion with Eliezer Yudkowsky on AGI interventions

I'm curious why this comment has such low karma and has -1 alignment forum karma. 

If you think doom is very likely when AI reaches a certain level, then efforts to buy us time before then have the highest expected utility. The best way to buy time, arguably, is to study the different AI approaches that exist today and figure out which ones are the most likely to lead to dangerous AI. Then create regulations (either through government or at the corporation level) banning the types of AI systems that are proving to be very hard to align. (For example we may... (read more)

What would we do if alignment were futile?

Also... alignment is obviously a continuum, and of course 100% alignment with all human values is impossible.

A different thing you could prove is whether it's possible to guarantee human control over an AI system as it becomes more intelligent. 

There's also a concern that a slightly misaligned system may become more and more misaligned as its intelligence is scaled up (either by humans re-building/re-training it with more parameters/hardware or via recursive self-improvement). It would be useful if someone could prove whether that is impossible to prev... (read more)

What would we do if alignment were futile?

Roman Yampolskiy has said recently (at a Foresight Salon event; the recording should be posted on YouTube soon) that it would be highly valuable if someone could prove that alignment is impossible. Given the high value for informing AI existential safety investment, I agree with Yampolskiy that we should have more people working on this: trying to prove theorems (or constructing very rigorous arguments) as to whether alignment is possible or impossible.

If we knew with very high certainty that alignment is impossible, then that would compel us to invest more r... (read more)

1delton13722dAlso... alignment is obviously a continuum and of course 100% alignment with all human values is impossible. A different thing you could prove is whether it's possible to guarantee human control over an AI system as it becomes more intelligent. There's also a concern that a slightly misaligned system may become more and more misaligned as its intelligence is scaled up (either by humans re-building/re-training it with more parameters/hardware or via recursive self-improvement). It would be useful if someone could prove whether that is impossible to prevent. I need to think about this more and read Yampolskiy's paper to really understand what would be the most useful to prove is possible or impossible.
What’s the likelihood of only sub exponential growth for AGI?

It's hard to imagine a "general intelligence" getting stuck at the level of a 10-year-old child in all areas -- certainly it will have the ability to interface with hardware that allows it to perform rapid calculations or run other super-human algorithms. 

But there are some arguments suggesting that intelligence scaling at an exponential rate can't go on indefinitely, and in fact the limits to exponential growth ("foom") may be hit very soon after AGI is developed, making foom essentially impossible. For instance, see this article by Francois Chollet: 
https... (read more)

1M. Y. Zuo21dThanks for the links. It may be that the development of science, and of all technical endeavours in general, follows a pattern of punctuated equilibrium, that is, sublinear growth, or even regression, for the vast majority of the time, interspersed by brief periods of tremendous change.
Possible research directions to improve the mechanistic explanation of neural networks

we haven't solved the problem of deeper networks taking longer to train, right?


My understanding is that the vanishing gradient problem has been largely mitigated by introducing skip connections (first with ResNet, and now standard in CNN architectures), allowing for networks with hundreds of layers. 
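For reference, a minimal residual ("skip connection") block in PyTorch, a generic sketch rather than the exact block from the ResNet paper: the identity path gives gradients a direct route around the convolutions, which is what lets very deep stacks keep training.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """y = relu(F(x) + x): the identity shortcut gives gradients a direct path."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + x)  # skip connection: add the input back in

# Stacking many of these still trains reasonably, because each block only has
# to learn a residual correction on top of the identity mapping.
deep_net = nn.Sequential(*[ResidualBlock(64) for _ in range(50)])
y = deep_net(torch.randn(1, 64, 32, 32))
```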
 

It's too bad fully-connected networks don't scale. 

I've heard people say vision transformers are sort of like going back to MLPs for vision. The disadvantage of going away from the CNN architecture (in particular weight sharing across receptive fields... (read more)

1TLW22dDoes this actually solve the problem, or just mask it? Skip connections end up with a bunch of shallow networks in parallel with deeper networks, to an over-approximation. If the shallow portions end up training faster and out-competing the deeper portions...
Comments on OpenPhil's Interpretability RFP

".. we just don't have very compelling example domains where ML systems understand important things in ways we can't. " 


I'm guessing you mean in ways humans can't even in principle.

Regardless, here's something people might find amusing - researchers found that a simple VGG-like 3D CNN model can look at electron microscope images of neural tissue and do a task that humans don't know how to do. The network distinguishes neurons that specialize in certain neurotransmitters. From the abstract to this preprint:

"The network successfully discriminates

... (read more)
Interpretability

Great.

Also, I just realized that the "grokking" phenomenon is relevant here. The "grokking" paper shows jumps during training rather than with model size, but it's a similar kind of discontinuity. Through the lens of the lottery ticket hypothesis, it's not surprising that grokking may be easier / more likely in larger models. 

I wonder how much "grokking" is new to transformers. I happened to stumble across an example in the literature where a CNN model "fails to grok" the Game of Life: https://arxiv.org/abs/2009.01398. I wonder what would happen if you used a transformer model instead.
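For context on why that "fails to grok" result is striking: one step of the Game of Life is itself just a 3x3 neighbour count (a convolution) plus a threshold, so a tiny exact solution exists for a CNN to find. A sketch of the target function in numpy/scipy (my own illustration, not code from the paper):

```python
# One step of Conway's Game of Life, written as a convolution + threshold.
# This is the target function the CNN in the linked paper struggles to learn.
import numpy as np
from scipy.signal import convolve2d

NEIGHBOUR_KERNEL = np.array([[1, 1, 1],
                             [1, 0, 1],
                             [1, 1, 1]])

def life_step(board):
    neighbours = convolve2d(board, NEIGHBOUR_KERNEL, mode="same", boundary="fill")
    # A cell is alive next step if it has exactly 3 live neighbours,
    # or if it is currently alive and has exactly 2 live neighbours.
    return ((neighbours == 3) | ((board == 1) & (neighbours == 2))).astype(int)

rng = np.random.default_rng(0)
board = rng.integers(0, 2, size=(32, 32))
next_board = life_step(board)
```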

Also, please check... (read more)

2gwern1moI hesitate to call grokking an example of blessings of scale because it's still not clear what is going on there with grokking or patient teacher. They are, after all, tiny models, and patient teacher is all about distilling to small models. And the need for regularization is strange if it's a scaling thing where larger=better: what, the regularization by tininess isn't enough, it needs more regularization from weight decay? I doubt grokking is unique to Transformers. The research I see as most related to grokking, the finding shallow minima paradigm with the wide basins & cyclic learning rates, are well-established for CNNs. Not finding it for some CNN is pretty weak evidence, given the grokking paper showing that you can go anywhere from like 0% to what was it 90%? depending on the details of the setup and how long you run.
Interpretability

Thanks. I was looking for more graphs with discontinuous jumps and "# of parameters" on the x-axis, but I think "totally new and unexpected capabilities after going from GPT-2 to GPT-3" is a reasonable thing to point at, also. The scaling laws bibliography is super, super useful. I am just embarking on making my way through it now.

2gwern1moYou can dig those 'money shot' capability jump graphs out of the papers, usually, I think. I try to add them to annotations when I make them because that's a very critical stylized fact about DL's blessings of scale. I'm not going to look now, but Brown has the graphs, and I'm pretty sure the text style transfer & RL finetuning do have the money shot graphs, and probably the others. XLand and MuZero might have them if you squint (not necessarily in parameter # - parameters aren't the only thing that scales, remember!).
Interpretability

"If one looks at the performance of particular tasks, such as arithmetic on numbers of a certain size, across model sizes, one often observes points where larger models discontinuously become better at a task."


Is it accurate to say that one "often observes" this? The only examples I know of are in GPT-3 with the addition, multiplication, and symbolic substitution tasks. I'm not sure how concerned to be about this being a general phenomenon. Does anyone have further examples? Does anyone have insights into whether the GPT-3 examples are special cases or not? 

3gwern1moIn [https://www.gwern.net/notes/Scaling] addition to the original Brown et al 2020 examples, text style transfer, meta-learning instructability [https://www.gwern.net/notes/Scaling#wei-et-al-2021]*, RL-finetuning of summarization, self-critique of math word problems, and maybe the improving zero-shot translation & program writing/dialogue (I'd have to double-check those), have been shown with GPT-3 and LamDA to 'kick in' at certain sizes going from the O(1b) models to 10-1000b. Nobody seems very surprised these days to see something work on GPT-3-173b but then not on ~1b. * Should we count all of the examples of meta-learning [https://www.gwern.net/docs/reinforcement-learning/meta-learning/index] / generalization which require diverse environments to get abruptly better performance out of sample, like XLand [https://deepmind.com/blog/article/generally-capable-agents-emerge-from-open-ended-play] or the MuZero meta-learning paper I mention [https://www.lesswrong.com/posts/jYNT3Qihn2aAYaaPb/efficientzero-human-ale-sample-efficiency-w-muzero-self?commentId=JQXocvPjaCr5Gn3zg] over in EfficientZero? That's definitely a stark jump in performance: the single-environment agents, no matter how good in the primary environment, typically perform extremely poorly or even near floor in the new environment.
Deep limitations? Examining expert disagreement over deep learning

Here are my opinions on what deep learning can do, FWIW:
1. (abstraction) Yes, but deep learning systems aren't sample efficient!
2. (generalization) Eh, not if you define generalization as going out of distribution (note: that's not how it's normally defined in the ML literature). Deep learning systems can barely generalize outside their training data distribution at all. The one exception I know of is how GPT-3 learned addition, but even then it broke down for large numbers. Some GPT-3 generalization failures can be seen here.
3. (causality) Maybe?
4. (long te... (read more)

Emergent modularity and safety

I'm having trouble understanding the n-cut metric used in Filan's work. 

A more intuitive measure would be the sum of the weights on edges that run between the subsets of vertices, divided by the total sum of weights in the graph as a whole. That's not quite what n-cut measures, though; if you look at the equation, it isn't normalized that way. 
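To be explicit about the measure I have in mind (this sketch is my own illustration of that proposal, not Filan et al.'s n-cut), using networkx: the fraction of total edge weight that lies on edges crossing between clusters, so a value near 0 means a very modular partition.

```python
# Sketch of the "more intuitive" measure described above (not Filan's n-cut):
# fraction of total edge weight carried by edges that cross between clusters.
import networkx as nx

def cross_cluster_weight_fraction(G, partition):
    """partition: dict mapping each node to a cluster label."""
    total = sum(d.get("weight", 1.0) for _, _, d in G.edges(data=True))
    crossing = sum(d.get("weight", 1.0) for u, v, d in G.edges(data=True)
                   if partition[u] != partition[v])
    return crossing / total

# Toy example: two 4-node cliques joined by a single weak edge.
G = nx.Graph()
for offset in (0, 4):
    for i in range(4):
        for j in range(i + 1, 4):
            G.add_edge(offset + i, offset + j, weight=1.0)
G.add_edge(0, 4, weight=0.1)  # the only between-cluster edge

partition = {node: (0 if node < 4 else 1) for node in G.nodes}
print(cross_cluster_weight_fraction(G, partition))  # ~0.008, i.e. very modular
```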

It would be nice if there were some figures showing example modular graphs with different n-cut values, to provide an intuitive sense of what n-cut = 9 means vs. n-cut = 5. 

Look at the latest ... (read more)

Whole Brain Emulation: No Progress on C. elegans After 10 Years

I want to point out that there has been a small amount of progress in the last 10 years on the problem of moving from connectome to simulation, rather than no progress. 

First, there has been interesting work at the JHU Applied Physics Lab which extends what Busbice was trying to do when he tried to run a simulation of C. elegans in a Lego Mindstorms robot (by the way, that work was very much overhyped by Busbice and in the media, so it's fitting that you didn't mention it). They use a basic integrate-and-fire model to simulate the n... (read more)

4niconiconi2moThanks for the info. Your comment is the reason why I'm on LessWrong.
#unclogtheFDA: a twitter storm to approve vaccines

Here are some updates on this: 

We have two Facebook event pages created for this: 
https://www.facebook.com/events/208338324360886 (35 RSVPs) 
https://www.facebook.com/events/1028113637697900 (18 RSVPs) 

This is great, but we need more people. It might be worth gently reminding people that it only takes a few minutes to set up a Twitter account. 

We have some big Twitter influencers who have signaled they are on our side but haven't yet used our hashtags. They should be our primary targets to get involved: 

  • Matthew Yglesias - 498k fo
... (read more)
#unclogtheFDA: a twitter storm to approve vaccines

Not opposed, but I just want to note we are planning a demonstration outside the FDA on Sunday from 2-4 pm. I shall post links to the Facebook and Meetup events soon. A bunch of local LWers can make it on weekends but not weekdays. I think this sequence works well: we do the protest and get some pictures, and then share them as part of the Twitter storm on Monday.  

A Return to Discussion

As far as "playing the comments game", I admit I am guilty of that. At a deeper level it comes from a desire to connect with like-minded people. I may even be doing it right now.

We like to think people post because they are genuinely intellectually engaged in the material we've written, but the truth is people post comments for a myriad of different reasons, including wanting to score comment 'points' or 'karma' or engage in a back-and-forth with a figure they admire. People like getting attention. [even shy nerdy people who are socially isolate... (read more)

2Evan_Gaensbauer5yLessWrong itself doesn't have as much activity as it once did, but the first users on LessWrong have pursued their ideas on Artificial Intelligence and rationality, through the Machine Intelligence Research Institute (MIRI) and the Center for Applied Rationality (CFAR), respectively, they have a lot more opportunity to impact the world than they did before. If those are the sorts of things you or anyone, really, is passionate about, if they can get abreast of what these organizations are doing now and can greatly expand on it on LW itself, it can lead to jobs. Well, it'd probably help to be able to work in the United States and also have a degree to work at either CFAR or MIRI. I've known several people who've gone on to collaborate with them by starting on LW. Still, though, personally I'd find the most exciting part to be shaping the future of ideas regardless of whether it led to a job or not. I think it's much easier to say now to become a top contributor on LW can be a springboard to much greater things. Caveat: whether those things are greater depends on what you want. Of course there are all manner of readers and users on LW who don't particularly pay attention to what goes on in AI safety, or at CFAR/MIRI. I shouldn't say building connections through LW is unusually likely to lead to great things if most LessWrongers might not think the outcomes so great after all. If LW became the sort of rationality community which was conducive to other slam-dunk examples of systematic winning, like a string of successful entrepreneurs, that'd make the sight much more attractive. I know several CFAR alumni have credited the rationality skills they learned at CFAR as contributing to their success as entrepreneurs or on other projects. That's something else entirely different from finding the beginnings of that sort of success merely on this website itself. If all manner of aspiring rationalists pursued and won in all manner of domains, with all the beginnings of their
5Viliam5yI believe the proper solution is like an eukaryotic cell -- with outer circle, and inner circle(s). In Christianity, the outer circle is to be formally a Christian, and to visit a church on (some) Sundays. The inner circles are various monastic orders, or becoming a priest, or this kind of stuff. Now you can provide both options for people who want different things. If you just want the warm fuzzy feelings of belonging to a community, here you go. If you want some hardcore stuff, okay, come here. These two layers need to cooperate: the outer circle must respect the inner circle, but the inner circle must provide some services for the outer circle. -- In case of LW such services would mostly be writing articles or making videos. The outer circle must be vague enough that anyone can join, but the inner circles must be protected from invasion of charlatans; they must cooperate with each other so that they are able to formally declare someone "not one of us", if a charlatan tries to take over the system or just benefit from claiming that he is a part of the system. In other words, the inner circles need some system to formally recognize who is an inner circle of the system and who is not. Looking at rationalist community today, "MIRI representatives" and "CFAR representatives" seem like inner circles, and there are also a few obvious celebrities such as Yvain of SSC. But if the community is going to grow, these people are going to need some common flag to make them different from anyone else who decides to make "rationality" their applause light and gather followers.
0spriteless5yCommenting takes less energy than moderating comments, certainly.
Open Thread, Aug. 22 - 28, 2016

It's funny because 90+% of articles on Salon.com are 'godawful clickbait' in my opinion -- with this one being one of the exceptions.

3passive_fist5yAnd Lumifer's dismissal of it is probably the most low-effort way of responding. Students of rationality, take note.
Open Thread, Aug. 22 - 28, 2016

Decent article but pretty basic. Still, a glimmer of reason in the dark pit of Salon.

Didn't know Y Combinator was doing a pilot. They don't mention in the announcement how many people will be in the pilot, but it will be interesting to see.

One thing I never understood is why it makes sense to do cash transfers to people who are already wealthy, or even above average income. A social safety net (while admittedly more difficult to manage) consisting solely of cash income seems to make more sense. I guess the issue is with the practical implementation details of managing the system and making sure everyone who needs to be enrolled is.

4Gleb_Tsipursky5yAgreed, to me it also makes no sense to do cash transfers to people with above average income. I see basic income as mainly about a social safety net.
A Review of Signal Data Science

Thanks for the review. I just submitted my application today (before I saw your post). I was a bit wary, due to fluttershy's post you mentioned, but more because of the lack of results (i.e. actual job placements) on their website compared to more established programs. The main benefit I see to this program is being in a space with other people who you can easily bounce ideas off (i.e., the social experience). I tend to work better in a structured environment, also. It's also good to hear that it is useful for networking as well. I wasn't sure about that, because whereas other data science programs have working relationships with major companies, I didn't get that impression when reading about Signal.

2The_Jaded_One5ySignal do have contacts in a few high profile companies. I suspect that the track record issue is somewhat a timing thing, but also somewhat because other bootcamps are quite good at creative statistics to cook up their 90% figures. For example, I started a springboard data science course with a monthly fee, but pulled out when I realised it was useless. However they will not count me as a "fail", because I didn't complete. The mentor they assigned to me at springboard just told me to go Google everything, and their curriculum was not as good, you were left to fend for yourself with no help. Contrast Signal where you're actually there and get help when you're stuck.
An update on Signal Data Science (an intensive data science training program)

I'd love to see some results as well, and I'm assuming they'll be posted as soon as you have them. I looked under 'projects' and at the available LinkedIn profiles, and it looks like three of the students got jobs (well, more specifically, two jobs and an internship). Those students already had impressive resumes going into the program, but this is quite encouraging to see.

Low hanging productivity - improving your workspace

I am currently working from home and my laptop is now considerably more powerful than my desktop, which is 8 years old.

Does anyone have a suggestion for a good external video card that would allow me to connect two monitors to my new laptop? [It has a mini DisplayPort output and free USB ports.]

Open Thread, Aug. 8 - Aug 14. 2016

The Skeptics' Guide to the Universe podcast interviewed Grant Richey about this. He notes that some of the headlines were misleading, because the study did find that when flossing is performed on children by a dental hygienist, it has a positive effect. So a better encapsulation of the recent review is that improper flossing doesn't have any positive effect. On the other hand, it's very unlikely to hurt you, unless you damage your gums in the process.

In case anyone wants a detailed review of the literature from before this study, Grant Richey did a blog post on it a few months ago: https://www.sciencebasedmedicine.org/may-the-floss-be-with-you/