TLDR: Moloch is more compelling for two reasons:
Earth is at "starting to adopt the wheel" stage in the coordination domain.
Abstractly, inasmuch as science and coordination are attractors
With respect to the attractor thing (post linked below)
SimplexAI-m is advocating for good decision theory.
Super-intelligent super-"moral" clippy still makes us into paperclips because it hasn't agreed not to and doesn't need our cooperation
We should build agents that value our continued existence. If the smartest agents don't, then we die out fairly quickly when they optimise for something else.
This is a good place to start: https://en.wikipedia.org/wiki/Discovery_of_nuclear_fission
There are a few key things that lead to nuclear weapons:
starting point:
realisation: large amounts of energy are theoretically available by rearranging protons/neutrons into things closer to iron (IE:curve of binding energy)
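Rough back-of-envelope of what that realisation means in practice (approximate textbook binding-energy values, my own illustrative numbers, not from the discussion above):

```python
# Rough back-of-envelope: energy released per U-235 fission, read off the
# curve of binding energy. Values are approximate textbook figures.
MEV_PER_NUCLEON_U235 = 7.6      # binding energy per nucleon, uranium-235
MEV_PER_NUCLEON_PRODUCTS = 8.5  # typical for mid-mass fission products
NUCLEONS = 235

energy_per_fission_mev = (MEV_PER_NUCLEON_PRODUCTS - MEV_PER_NUCLEON_U235) * NUCLEONS
print(f"~{energy_per_fission_mev:.0f} MeV per fission")   # roughly 200 MeV

# Compare to a chemical bond (a few eV): tens of millions of times more
# energy per reaction, which is what makes the realisation so dangerous.
chemical_bond_ev = 4.0
print(f"ratio vs one chemical bond: ~{energy_per_fission_mev * 1e6 / chemical_bond_ev:.1e}")
```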
That's not something that can be easily suppressed without suppressing the entire field of nuclear physics.
Assuming there is a conspiracy doing cutting-edge nuclear physics and they discover the facts pointing to the feasibility of nuclear weapons, there are a few suppression options:
Discovering nuclear fission was quite difficult. A Nobel prize was awarded partly in error because fission products were chemically misidentified as transuranic elements.
Presumably the leading labs could have acknowledged that producing transuranic elements was possible through neutron bombardment but kept the discovery of neutron induced fission a secret.
That's harder. Fudging the numbers on critical mass would require much larger conspiracies. An entire industry would be built on faulty measurement data with true values substituted in key places.
Isotopic separation would still be developed, if only for other scientific work (EG:radioactive tracing). Ditto for mass spectrometry, likely including some instruments capable of measuring heavier elements like uranium isotopes.
Plausibly this would involve lying about some combination of:
A nuclear physicist would be better qualified in figuring out something plausible.
A bit more compelling, though for mining, the excavator/shovel/whatever loads a truck. The truck moves it much further and consumes a lot more energy to do so. Overhead wires to power the haul trucks are the biggest win there.
This is an open pit mine. Less vertical movement may reduce imbalance in energy consumption. Can't find info on pit depth right now but haul distance is 1km.
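Rough sketch of how much the vertical component could matter (truck mass, rolling resistance and pit depth are all assumed numbers for illustration; only the 1 km haul distance is from above):

```python
# Very rough energy comparison for a loaded haul truck: rolling 1 km on the
# flat vs. climbing out of the pit. All figures are illustrative assumptions.
G = 9.81                 # m/s^2
MASS_LOADED = 400e3      # kg, assumed gross weight of a large haul truck
ROLLING_RESIST = 0.03    # assumed rolling resistance on a haul road
HAUL_DISTANCE = 1000.0   # m (haul distance mentioned above)
PIT_DEPTH = 100.0        # m, assumed; actual pit depth unknown

rolling_j = ROLLING_RESIST * MASS_LOADED * G * HAUL_DISTANCE
lifting_j = MASS_LOADED * G * PIT_DEPTH

print(f"rolling 1 km:   {rolling_j / 3.6e6:.0f} kWh")   # ~33 kWh
print(f"lifting {PIT_DEPTH:.0f} m: {lifting_j / 3.6e6:.0f} kWh")   # ~109 kWh

# With these assumptions the climb out of the pit dominates the truck's
# energy use, so less vertical movement makes the truck/excavator energy
# split much less lopsided.
```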
The general point is that when dealing with a move-stuff-from-A-to-B problem where A is not fixed, diesel for the varying A-X route and electric for the fixed X-B route seems like a good tradeoff. The B endpoint should definitely be electrified (EG:truck offload at the ore processing location).
Getting power to a varying point A is challenging. Maybe something with overhead cables could work. Again, John Deere is working on something for agriculture with a cord-laying vehicle, where overhead wires are used for the last 20-30 meters. But fields are nice in that there are fewer sharp rocks and mostly softer dirt/plants. Not impossible, but it needs some innovation to accomplish.
Agreed on most points. Electrifying rail makes good financial sense.
construction equipment efficiency can be improved without electrifying:
Excavators seem like the wrong thing to grid-connect:
Diesel powered excavators that get delivered and just run with no cord and no power company involvement seem much more practical.
IE:places currently using diesel engines but where cord management and/or electrical hookup cost is less of a concern
Long haul trucking:
Agriculture:
Some human population will remain for experiments or for work in special conditions like radioactive mines. But bad things and population decline are likely.
Radioactivity is much more of a problem for people than for machines.
In terms of instrumental value, humans are only useful as an already existing work force
I would like to ask whether it would not be more engaging to say that the caring drive would need to be specifically towards humans, such that there is no surrogate?
Definitely need some targeting criteria that points towards humans or in their vague general direction. Clippy does in some sense care about paperclips so targeting criteria that favors humans over paperclips is important.
The duck example is about (lack of) intelligence. Ducks will place themselves in harm's way and confront big scary humans they think are a threat to their ducklings. They definitely care. They're just too stupid to prevent "fall into a sewer and die" type problems. Nature is full of things that care about their offspring. Human "caring for offspring" behavior is similarly strong but involves a lot more intelligence, like everything else we do.
TLDR: If you want to do some RL/evolutionary open-ended thing that finds novel strategies, it will get goodharted horribly, and the novel strategies that succeed without gaming the goal may include things no human would want their caregiver AI to do.
Orthogonally to your "capability", you need to have a "goal" for it.
Game-playing RL architectures like AlphaStar and OpenAI Five have dead simple reward functions (win the game), and all the complexity is in the reinforcement learning tricks that allow efficient learning and credit assignment at higher layers.
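Roughly, the reward signal is nothing more than this (illustrative sketch of a terminal win/loss reward, not the actual AlphaStar or OpenAI Five code):

```python
# Sketch of a "dead simple" terminal reward: nothing but the game outcome.
def reward(game_over: bool, won: bool) -> float:
    if not game_over:
        return 0.0              # no shaping during the game
    return 1.0 if won else -1.0

# All the difficulty lives in credit assignment: propagating this single
# end-of-game signal back through thousands of earlier decisions.
```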
So child-rearing motivation is plausibly rooted in a cuteness preference along with re-use of empathy. Empathy plausibly has a sliding scale of caring per person, which increases for friendships (reciprocal cooperation relationships) and relatives, children obviously included. It similarly decreases for enemy combatants in wars, up to the point where they no longer qualify for empathy.
I want agents that take effective actions to care about their "babies", which might not even look like caring at first glance.
ASI will just flat out break your testing environment. Novel strategies discovered by dumb agents doing lots of exploration will be enough to do that. Alternatively the test is "survive in competitive deathmatch mode", in which case you're aiming for brutally efficient self replicators.
The hope with a non-RL strategy, or one of the many sorts of RL strategies used for fine-tuning, is that you can find the generalised core of what you want within the already-trained model, and that the surrounding intelligence means the core generalises well. Q&A fine-tuning an LLM in English generalises to other languages.
Also, some systems are architected in such a way that the caring is part of a value estimator, and the search process can be made better up until it starts goodharting the value estimator and/or world model.
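A toy illustration of that failure mode (hypothetical numbers, just to show how harder search exploits estimator error):

```python
# Toy goodharting of a learned value estimator: the search picks the plan
# with the highest *estimated* value, so estimation errors get selected for,
# and a bigger search budget widens the gap between estimated and true value.
import random

random.seed(0)

def true_value(plan: float) -> float:
    return plan                              # ground truth, in [0, 1]

def estimated_value(plan: float) -> float:
    return plan + random.gauss(0.0, 1.0)     # imperfect value estimator

for search_budget in (10, 1_000, 100_000):
    plans = [random.random() for _ in range(search_budget)]
    scored = [(estimated_value(p), p) for p in plans]
    est, best = max(scored)
    print(f"budget {search_budget:>6}: estimated {est:5.2f}, true {true_value(best):4.2f}")

# As the budget grows, the chosen plan's estimated value keeps climbing while
# its true value plateaus near 1.0: the extra optimisation pressure is spent
# exploiting the estimator's errors rather than finding genuinely better plans.
```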
Yes they can, until they actually make a baby. After that, it's usually really hard to sell a loving mother "deals" that involve the suffering of her child as the price, or to get her to abandon the child for a "cuter" toy, or to persuade her to hotwire herself to not care about her child (if she is smart enough to realize the consequences).
Yes, once the caregiver has imprinted that's sticky. Note that care drive surrogates like pets can be just as sticky to their human caregivers. Pet organ transplants are a thing and people will spend nearly arbitrary amounts of money caring for their animals.
But our current pets aren't super-stimuli. Pets will poop on the floor, scratch up furniture and don't fulfill certain other human wants. You can't teach a dog to fish the way you can a child.
When this changes, real kids will be disappointing. Parents can have favorite children and those favorite children won't be the human ones.
Superstimuli aren't about changing your reward function but rather discovering a better way to fulfill your existing reward function. For all that ice cream is cheating from a nutrition standpoint it still tastes good and people eat it, no brain surgery required.
Also consider that humans optimise their pets (neutering/spaying) and children in ways that the pets and children do not want. I expect some of the novel strategies your AI discovers will be things we do not want.
TLDR:LLMs can simulate agents and so, in some sense, contain those goal driven agents.
An LLM learns to simulate agents because this improves prediction scores. An agent is invoked by supplying a context that indicates the text would be written by an agent (EG:specify that the text was written by some historical figure).
Contrast with pure scaffolding-type agent conversions using a Q&A fine-tuned model. For these, you supply questions ("Generate a plan to accomplish X") and then execute the resulting steps. This implicitly uses the Q&A fine-tuned "agent", which can have values that conflict with ("I'm sorry, I can't do that") or augment the given goal. Here's an AutoGPT taking the initiative to try and report people it found doing questionable stuff rather than just doing the original task of finding their posts. (LW source)
The base model can also be used to simulate a goal-driven agent directly by supplying appropriate context, so the LLM fills in its best guess for what that agent would say (or rather, what internet text with that context would have that agent say). The outputs of this process can of course be fed to external systems to execute actions, as with the usual scaffolded agents. The values of such agents are not uniform: you can ask for a simulated Hitler, who will have different values than a simulated Gandhi.
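A minimal sketch of the conditioning trick, assuming a hypothetical `complete()` call into a base model (a stand-in, not any real API):

```python
# Sketch of invoking a simulated agent from a base (non-fine-tuned) LLM by
# pure conditioning. `complete` is a stand-in for whatever text-completion
# call you have available; it is not a specific real library function.
def complete(prompt: str) -> str:
    """Stand-in for a text-completion call into a base model."""
    raise NotImplementedError("plug in your own model call here")

def simulate_agent(persona: str, goal: str, observation: str) -> str:
    # Build a context in which "text written by this agent" is the most
    # likely continuation; the base model fills in what the agent would say.
    prompt = (
        f"The following is a log of {persona}, who is trying to {goal}.\n"
        f"Observation: {observation}\n"
        f"{persona}'s next action:"
    )
    return complete(prompt)

# The returned text can then be parsed and handed to external tools, as with
# the usual scaffolded agents. Different personas in the same model yield
# simulated agents with different values.
```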
Not sure if that's exactly what Zvi meant.
This is definitely subjective. Animals are certainly worse off in most respects and I disagree with using them as a baseline.
Imitation is not coordination, it's just efficient learning and animals do it. They also have simple coordination in the sense of generalized tit for tat (we call it friendship). You scratch my back I scratch yours.
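For concreteness, tit-for-tat in the iterated prisoner's dilemma looks like this (standard toy payoffs, my own sketch, not from the discussion above):

```python
# Minimal tit-for-tat in an iterated prisoner's dilemma: cooperate first,
# then mirror the partner's last move. This is the "you scratch my back,
# I scratch yours" pattern; cooperation technologies try to get the same
# effect between strangers at scale.
def tit_for_tat(partner_history: list[str]) -> str:
    return "cooperate" if not partner_history else partner_history[-1]

PAYOFF = {("cooperate", "cooperate"): (3, 3),
          ("cooperate", "defect"):    (0, 5),
          ("defect",    "cooperate"): (5, 0),
          ("defect",    "defect"):    (1, 1)}

a_hist, b_hist, score = [], [], [0, 0]
for _ in range(10):
    a, b = tit_for_tat(b_hist), tit_for_tat(a_hist)
    pa, pb = PAYOFF[(a, b)]
    score[0] += pa
    score[1] += pb
    a_hist.append(a)
    b_hist.append(b)

print(score)   # two tit-for-tat players settle into mutual cooperation
```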
Cooperation technologies allow similar things to scale beyond the number of people you can know personally. They bring us closer to the multi-agent optimal equilibrium, or at least the core (game theory).
Examples of cooperation technologies:
So yes we have some well deployed coordination technologies (money/finance are the big successes here)
It's definitely subjective as to whether tech or cooperation is the less well deployed thing.
There are a lot of unsolved collective action problems though. Why are oligopolies and predatory businesses still a thing? Because coordinating to get rid of them is hard. If people pre-committed to going the distance with respect to avoiding lock-in and monopolies, would-be monopolists would just not do that in the first place.
While normal technology is mostly stuff and can usually be dumbed down so even the stupidest get some benefit, cooperation technologies may require people to actively participate/think. So deploying them is not so easy and may even be counterproductive. People also need to have enough slack to make them work.