LESSWRONG

Hopenope

Posts



Comments

1Hopenope's Shortform
7mo
14
Hopenope's Shortform
Hopenope6h10

The earliest submissions by human players came at the 37-minute mark, and 3 people had submitted results by the 1-hour mark. However, this is a competitive, time-constrained environment, so it is more likely a 2-4 hour task. It is also possible that players made multiple attempts that were not good enough, so it may be shorter than that. The first OpenAI submission came at the 15-minute mark, so some brute-forcing is probably happening. Assuming the tokens per second are the same as o3 (168), they used about 150,000 tokens for the first submission and more than 5.7 million for the whole competition. Of course, this rests on a lot of assumptions, and there is a good chance that they used more tokens than that.
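The token estimate above can be reproduced with a few lines of arithmetic. This is only a rough sketch: it assumes a single uninterrupted generation stream at the 168 tokens per second quoted for o3, and it uses the 10-hour contest length; the function name `tokens_generated` is made up for illustration. Parallel sampling or a different decoding speed would change the numbers.

```python
# Assumed decoding speed, taken from the o3 figure quoted in the comment.
TOKENS_PER_SECOND = 168


def tokens_generated(minutes, tokens_per_second=TOKENS_PER_SECOND):
    """Tokens produced by one uninterrupted stream over the given number of minutes."""
    return minutes * 60 * tokens_per_second


# First OpenAI submission at the 15-minute mark.
first_submission = tokens_generated(15)

# Whole contest, assuming generation never pauses for the full 10 hours.
full_contest = tokens_generated(10 * 60)

print(first_submission)  # 151200
print(full_contest)      # 6048000
```

Under these assumptions the first submission comes out at roughly 150,000 tokens, and a full 10 hours of continuous generation at about 6 million, which is in line with the "more than 5.7 million" figure.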

Hopenope's Shortform
Hopenope1d170

OpenAI is competing in the AtCoder World Tour Finals (Heuristic division) with a new model/agent. It is a 10-hour competition with an optimization-based problem, and OpenAI's model is currently in 2nd place.

Chain of Thought Monitorability: A New and Fragile Opportunity for AI Safety
Hopenope2dΩ230

Is optimizing CoT to look nice a big concern? There are other ways to show a nice CoT without optimizing for it. The frontrunners also have some incentives not to show the real CoT. Additionally, there is a good chance that people will prefer a nicely structured summary of the CoT, written by a small LLM, when the reasoning becomes very long and convoluted.

Gemini Diffusion: watch this space
Hopenope2mo54

What is the point of these benchmarks without knowing the training compute and data? One of the main questions is their interpretability. Iterative refinement of these models may open new opportunities.

Orienting Toward Wizard Power
Hopenope2mo50

People with a history of seizures are usually excluded from these kinds of clinical trials, so it is not an apples-to-apples comparison. The problem is that bupropion interacts with a lot of drugs. Seizure rates are also highly dose-dependent (10 times higher if taking more than 450 mg daily). Generally, if you're not taking any interacting medications, are on the 150–300 mg slow-release version, and have no history of seizures, then the risk is low.

Orienting Toward Wizard Power
Hopenope2mo*130

As a doctor, I can tell you that even if you don't have anxiety, it's possible to develop some while taking bupropion (Wellbutrin). I used it personally and experienced the most severe anxiety I've ever had. It is also associated with a higher chance of seizures, and if you daydream a lot, it may make that worse. However, on the positive side, it often decreases inattention. Generally I like the drug, but it is not a first-line treatment for depression, and for good reasons.

less-wronger-numb89's Shortform
Hopenope3mo30

I lived for a while in a failing country with high unemployment. The businesses and jobs that pay well become saturated very quickly. People are less likely to spend money and often delay buying new things or maintaining their homes. Many jobs exist because we don't have time to do them ourselves, and a significant number of these jobs will just vanish. It is really hard to prepare for a society with a high unemployment rate.

Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs
Hopenope5mo22

Overrefusal issues were way more common 1-2 years ago. Models like Gemini 1 and Claude 1-2 had severe overrefusal issues.

LWLW's Shortform
Hopenope5mo10

Your argument is actually possible, but what evidence do you have that makes it the likely outcome?

LWLW's Shortform
Hopenope5mo21

The difficulty of alignment is still unknown. It may be totally impossible, or maybe some changes to current methods (deliberative alignment or Constitutional AI) plus some R&D automation can get us there.
