LESSWRONG
LW

1235
anithite
43661070
Message
Dialogue
Subscribe

B.Eng (Mechatronics)

Posts

Sorted by New

Wikitag Contributions

Comments

Sorted by
Newest
No wikitag contributions to display.
My Brush with Superhuman Persuasion
anithite6d40

Skill issue.

First, IQ 100 is only useful in ruling out easy to persuade IQ <80 people. There are likely other correlates of "easy to persuade" that depend on how the AI is doing the persuading.

Second, super-persuasion is about scalability and cost. Bribery doesn't scale because actors have limited amounts of money. <$100 in inference and amortised training should be able persuade a substantial fraction of people.

Achieving this requires a scalable "training environment" to generate a non-goodhartable reward signal. AI trained to persuade on a large population of real users (EG:for affiliate marketing purposes) would be a super-persuader. Once a large company decides to do this at scale results will be much better than anything a hobbyist can do. Synthetic evaluation environments (EG:LLM simulations of users) can help too limited by their exploitability in ways that don't generalise to humans.

There are no regulations against social engineering in contrast to hacking computers. Some company will develop these capabilities which can then be used for nefarious purposes with the usual associated risks like whistleblowers.

Reply
Notes on fatalities from AI takeover
anithite7d10

Even if a coup is meant to capture mineral wealth and the population is irrelevant, coup leaders recognize that mass murder will lead to sanctions stopping them from selling that mineral wealth. Plenty of examples of regimes that kill even low thousands of people being sanctioned.

AI that plans to take over the world does not need to trade with humans or keep them from being horrified and lashing out. Kill approximately everyone is a viable strategy and preferrable in most cases since it removes us as an intelligent adversary.

Reply
AI #115: The Evil Applications Division
anithite5mo32

No. o3 estimates that 60% of American jobs are physical such that you would need robotics to automate them, so if half of those fell within a year, that’s quite a lot.

A lot of jobs that can't be fully automated have sub-tasks software agents could eliminate. >30% of total labor hours might be spent in front of a computer (EG:data entry in a testing lab and all the steps needed to generate report.) That ignores email and the time savings once there is a good enough AI secretary.

AGI could eliminate almost all of that.

I'd estimate 1.7x productivity for a lab I worked at previously. Effect on employment depends on demand elasticity of course.

Reply
The Ukraine War and the Kill Market
anithite5mo50

Prices would adjust to match supply and demand as well as acting as both supply cost and demand value signals. If no one buys the vampire drone, supply side stops production and starts dropping price to liquidate inventory, possibly with a liquidation auction.

Badly done dynamic pricing and auctions feel awful to market participants and can result in issues seen in Ebay auctions like sniping.

Reply
Alexander Gietelink Oldenziel's Shortform
anithite6mo209

in my opinion, this is a poor choice of problem for demonstrating the generator/predictor simplicity gap.

If not restricted to Markov model based predictors, we can do a lot better simplicity-wise.

Simple Bayesian predictor tracks one real valued probability B in range 0...1. Probability of state A is implicitly 1-B.

This is initialized to B=p/(p+q) as a prior given equilibrium probabilities of A/B states after many time steps.

P("1")=qA is our prediction with P("0")=1-P("1") implicitly.

Then update the usual Bayesian way: if "1", B=0 (known state transition to A) if "0", A,B:=(A*(1-p),A*p+B*(1-q)), then normalise by dividing both by the sum. (standard bayesian update discarding falsified B-->A state transition) In one step after simplification: B:=(B(1+p-q)-p)/(Bq-1)

That's a lot more practical than having infinite states. Numerical stability and achieving acceptable accuracy of a real implementable predictor is straightforward but not trivial. A near perfect predictor is only slightly larger than the generator.

A perfect predictor can use 1 bit (have we ever observed a 1) and ceil(log2(n)) bits counting n, the number of observed zeroes in the last run to calculate the perfectly correct prediction. Technically as n-->infinity this turns into infinite bits but scaling is logarithmic so a practical predictor will never need more than ~500 bits given known physics.

Reply
Tormenting Gemini 2.5 with the [[[]]][][[]] Puzzle
anithite6mo10

TLDR:I got stuck on notation [a][b][c][...]→f(a,b,c,...). LLMs probably won't do much better on that for now. Translating into find an unknown f(*args) and the LLMs get it right with probability ~20% depending on the model. o3-mini-high does better. Sonnet 3.7 did get it one shot but I had it write code for doing substitutions which it messes up a lot.

Like others, I looked for some sort of binary operator or concatenation rule. Replacing "][" with "|" or "," would have made this trivial. Straight string substitutions don't work since "[[]]" can be either 2 or "[...][1][...]" as part of a prime exponent set. The notation is the problem. Staring at string diffs would have helped in hindsight maybe.

Turning this into an unknown f() puzzle makes it straightforward for LLMs (and humans) to solve.

1 = f()
2 = f(f())
3 = f(0,f())
4 = f(f(f()))
12 = f(f(f()),f())
0 = 0
-1 = -f()
19 = f(0,0,0,0,0,0,0,f())
20 = f(f(f()),0,f())
-2 = -f(f())
1/2 = f(-f())
sqrt(2) = f(f(-f()))
72^1/6 = f(f(-f()),f(0,-f()))
5/4 = f(-f(f()),0,f())
84 = f(f(f()),f(),0,f())
25/24 = f(-f(0,f()),-f(),f(f()))

Substitutions are then quite easy though most of the LLMs screw up a substitution somewhere unless they use code to do string replacements or do thinking where they will eventually catch their mistake.

Then it's ~25% likely they get it one shot. ~100% is you mention primes are involved or that addition isn't. Depends on the LLM. o3-mini-high got it. Claude 3.7 got it one shot no hints from a fully substituted starting point but that was best of k~=4 with lots of failure otherwise. Models have strong priors for addition as a primitive and definitely don't approach things systematically. Suggesting they focus on single operand evaluations (2,4,1/2,sqrt(2)) gets them on the right track but there's still a bias towards addition.

Reply
The Illusion of Iterative Improvement: Why AI (and Humans) Fail to Track Their Own Epistemic Drift
anithite7mo10

None of the labs would be doing undirected drift. That wouldn't yield improvement for exactly the reasons you suggest.

In the absence of a ground truth quality/correctness signal, optimizing for coherence works. This can give prettier answers (in the way that averaged faces are prettier) but this is limited. The inference time scaling equivalent would be a branching sampling approach that searches for especially preferred token sequences rather than the current greedy sampling approach. Optimising for idea level coherence can improve model thinking to some extent.

For improving raw intelligence significantly, ground truth is necessary. That's available in STEM domains, computer programming tasks being the most accessible. One can imagine grounding hard engineering the same way with a good mechanical/electrical simulation package. TLDR:train for test-time performance.

Then just cross your fingers and hope for transfer learning into softer domains.

For softer domains, ground truth is still accessible via tests on humans (EG:optimise for user approval). This will eventually yield super-persuaders that get thumbs up from users. Persuasion performance is trainable but maybe not a wise thing to train for.

As to actually improving some soft domain skill like "write better english prose" that's not easy to optimise directly as you've observed.

Reply
A Visual Task that's Hard for GPT-4o, but Doable for Primary Schoolers
anithite8mo20

O1 now passes the simpler "over yellow" test from the above. Still fails the picture book example though.

For a complex mechanical drawing, O1 was able to work out easier dimensions but anything more complicated tends to fail. Perhaps the full O3 will do better given ARC-AGI benchmark performance.

Meanwhile, Claude 3.5 and 4o fail a bit more badly failing to correctly identify axial and diameter dimensions.

Visuospatial performance is improving albeit slowly.

Reply
The Risk of Gradual Disempowerment from AI
anithite8mo20

My hope is that the minimum viable pivotal act requires only near human AGI. For example, hack competitor training/inference clusters to fake an AI winter.

Aligning +2SD human equivalent AGI seems more tractable than straight up FOOMing to ASI safely.

One lab does it to buy time for actual safety work.

Unless things slow down massively we probably die. An international agreement would be better but seems unlikely.

Reply
grey goo is unlikely
anithite9mo*101Review for 2023 Review

This post raises a large number of engineering challenges. Some of those engineering challenges rely on other assumptions being made. For example, the use of energy carrying molecules rather than electricity or mechanical power which can cross vacuum boundaries easily. Overall a lot of "If we solve X via method Y (which is the only way to do it) problem Z occurs" without considering making several changes at once that synergistically avoid multiple problems.

"Too much energy" means too much to be competitive with normal biological processes.

That goalpost should be right at the top and clearly stated instead of "microscopic machines that [are] superior". "grey goo alone will have doubling times slower than optimised biological systems" is definitely plausible. E-coli can double in 20 minutes in nutrient rich conditions which is hard to beat. If wet nanotech doubles faster but dry nanotech can make stuff biology can't, then use both. Dry for critical process steps and making high value products and wet for eating the biosphere and scaling up.

Newer semiconductor manufacturing processes use more energy and materials to create each transistor but those transistors use less power and run faster which makes producing them worthwhile. Dry nanotech will be a tool for making things that may be expensive but worthwhile to build like really awesome computers.

Wet nanotech (IE:biology) is plausibly the most efficient at self-replicating but notice humans use all sorts of chemical and physical processes to do other things better. Operating in space with biotech alone for example would be quite difficult.

Reply
Load More
5When discussing AI doom barriers propose specific plausible scenarios
2y
0
8"Copilot" type AI integration could lead to training data needed for AGI
2y
0
67Green goo is plausible
2y
31
27Human level AI can plausibly take over the world
3y
12
17Large Language Models Suggest a Path to Ems
3y
2
12Building trusted third parties
6y
0