Comments

Wasn't the surprising thing about GPT-4 that the scaling laws did hold? Before GPT-4, many people expected scaling laws to break down before reaching such a high level of capability. It doesn't seem that crazy to think that a few more OOMs could be enough for greater-than-human intelligence. I'm not sure many people predicted progress much faster than the scaling laws imply (at least until roughly human-level AI can speed up research). I think progress at the scaling-law rate is already the extreme scenario that many people with short timelines worry about.

It also seems likely that the Nano models are heavily overtrained relative to the scaling laws. The scaling laws describe compute-optimal training, but here the goal is to minimize inference cost, so it would make sense to train a small model for significantly longer than the compute-optimal point.
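To make the trade-off concrete, here is a rough sketch in Python. It assumes the common rules of thumb that training compute is about 6 × parameters × tokens, that a forward pass costs about 2 × parameters FLOPs per token, and that compute-optimal training uses roughly 20 tokens per parameter. The model size and token count below are hypothetical placeholders, not figures for any real Nano model.

```python
# Rough sketch of "overtraining" relative to a compute-optimal recipe.
# All constants are widely quoted approximations, not exact laws.

TOKENS_PER_PARAM_OPTIMAL = 20.0  # assumed Chinchilla-style rule of thumb


def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate training compute: C ~ 6 * N * D."""
    return 6.0 * n_params * n_tokens


def inference_flops_per_token(n_params: float) -> float:
    """Approximate forward-pass cost per token: ~ 2 * N."""
    return 2.0 * n_params


def compute_optimal_tokens(n_params: float) -> float:
    """Tokens a compute-optimal run would use for this model size."""
    return TOKENS_PER_PARAM_OPTIMAL * n_params


def overtraining_factor(n_params: float, n_tokens: float) -> float:
    """How many times more tokens were used than the compute-optimal amount."""
    return n_tokens / compute_optimal_tokens(n_params)


# Hypothetical small on-device model (placeholder numbers).
small_params = 2e9   # 2B parameters
small_tokens = 2e12  # 2T training tokens

factor = overtraining_factor(small_params, small_tokens)
print(f"compute-optimal tokens: {compute_optimal_tokens(small_params):.2e}")
print(f"overtraining factor:    {factor:.1f}x")
print(f"training FLOPs:         {training_flops(small_params, small_tokens):.2e}")
print(f"inference FLOPs/token:  {inference_flops_per_token(small_params):.2e}")
```

The point is that inference cost depends only on the parameter count, so spending extra training compute on a small model can be worthwhile when it will serve a very large number of tokens.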

It's interesting that it still always seems to give the "I'm an AI" disclaimer. I guess this part is not included in your refusal vector? Have you tried creating a disclaimer vector?
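For what it's worth, one way a disclaimer vector could be built is a difference of means over activations, sketched below with NumPy. This is an assumption about the general technique, not a description of how the refusal vector in the post was actually computed. The activations here are synthetic random placeholders, and `steering_vector` and `ablate_direction` are hypothetical helper names.

```python
import numpy as np


def steering_vector(with_behavior: np.ndarray, without_behavior: np.ndarray) -> np.ndarray:
    """Unit-norm difference of mean activations between two prompt sets.

    Each input has shape (n_prompts, hidden_dim).
    """
    diff = with_behavior.mean(axis=0) - without_behavior.mean(axis=0)
    return diff / np.linalg.norm(diff)


def ablate_direction(activations: np.ndarray, direction: np.ndarray) -> np.ndarray:
    """Remove the component of each activation along a unit direction."""
    return activations - np.outer(activations @ direction, direction)


# Synthetic stand-in for real residual-stream activations (placeholder data).
rng = np.random.default_rng(0)
hidden_dim = 16
planted = np.zeros(hidden_dim)
planted[3] = 1.0  # pretend the "disclaimer" feature lives on this axis

no_disclaimer = rng.normal(size=(200, hidden_dim))
disclaimer = rng.normal(size=(200, hidden_dim)) + 4.0 * planted

vec = steering_vector(disclaimer, no_disclaimer)
cleaned = ablate_direction(disclaimer, vec)

print("recovered direction is mostly axis 3:", bool(vec[3] > 0.9))
print("max residual along direction:", float(np.abs(cleaned @ vec).max()))
```

With a real model, the two arrays would come from activations at a chosen layer on prompts whose responses do and do not contain the disclaimer, and the ablation would be applied during the forward pass.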