Gurkenglas

I operate by Crocker's rules.

Comments

Power as Easily Exploitable Opportunities

SOTA: Penalize my action by how well a maximizer that takes my place after the action would maximize a wide variety of goals.

If we use me instead of the maximizer, paradoxes of self-reference arise that we can resolve by inserting a modal operator: Penalize my action by how well I expect I would maximize a wide variety of goals (if given each such goal). Then, when considering the action of stepping towards an omnipotence button, I would expect that, given that I decided to take one step, I would take more, and would therefore penalize the first step a lot. The exception is plausible deniability: if the first step towards the button is also a first step towards my concrete goal, then I might still expect to be bound by the penalty afterwards.

I've suggested using myself before in the last sentence of this comment: https://www.lesswrong.com/posts/mdQEraEZQLg7jtozn/subagents-and-impact-measures-full-and-fully-illustrated?commentId=WGWtoKDrnN3o6cS6G
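To make the "SOTA" penalty concrete, here is a minimal sketch on a toy world. It covers only the maximizer version, not the modal "what I expect I would do" variant. Everything in it is an assumption made for illustration: the corridor of positions, the `BUTTON` state that satisfies every goal, the `make_goal` helper, and the short planning horizon.

```python
from typing import Callable, Iterable, List

Goal = Callable[[int], float]

BUTTON = 4  # hypothetical "omnipotence button": standing here satisfies every goal


def successors(state: int) -> Iterable[int]:
    """Toy dynamics: step left, stay, or step right along a corridor 0..BUTTON."""
    return {max(state - 1, 0), state, min(state + 1, BUTTON)}


def make_goal(target: int) -> Goal:
    """Goal 'be at `target`'; the button counts as achieving any goal."""
    return lambda s: 1.0 if s == BUTTON or s == target else 0.0


def attainable_value(state: int, goal: Goal, horizon: int) -> float:
    """Best goal value a perfect maximizer can reach within `horizon` steps."""
    best = goal(state)
    if horizon == 0:
        return best
    for nxt in successors(state):
        best = max(best, attainable_value(nxt, goal, horizon - 1))
    return best


def power_penalty(state_after_action: int, goals: List[Goal], horizon: int) -> float:
    """Average, over many goals, of how well a maximizer starting here would do."""
    values = [attainable_value(state_after_action, g, horizon) for g in goals]
    return sum(values) / len(values)


goals = [make_goal(k) for k in range(4)]
penalty_away = power_penalty(1, goals, horizon=1)    # stepped away from the button
penalty_toward = power_penalty(3, goals, horizon=1)  # stepped towards the button
```

In this toy, stepping from position 2 towards the button is penalized more than stepping away (1.0 versus 0.75), because from position 3 a maximizer can reach the button within the horizon and thereby satisfy every goal.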

PSA: Tagging is Awesome

Long outputs will tend to deteriorate naturally, as the model tries to reproduce the existing deterioration and accidentally adds some more. Better: Sample one tag at a time. Shuffle the inputs every time to access different subdistributions. (I wonder how much the subdistributions differ for two random shuffles...) If you output the tag that has the highest minimum probability across each of a hundred subdistributions, I bet that'll produce a tag that's not in the inputs.
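The selection rule above can be sketched as follows. This is only an illustration: `next_tag_distribution` is a hypothetical callable standing in for whatever returns the model's next-tag probabilities for a given prompt order, and `toy_model` is a made-up stub, not a real API.

```python
import random
from typing import Callable, Dict, List, Sequence

TagDist = Dict[str, float]


def maximin_tag(distributions: Sequence[TagDist]) -> str:
    """Return the tag whose lowest probability across all distributions is highest."""
    candidates = set().union(*distributions)
    # sorted() makes tie-breaking deterministic (alphabetical).
    return max(sorted(candidates),
               key=lambda tag: min(d.get(tag, 0.0) for d in distributions))


def sample_robust_tag(example_tags: Sequence[str],
                      next_tag_distribution: Callable[[List[str]], TagDist],
                      n_shuffles: int = 100,
                      seed: int = 0) -> str:
    """Shuffle the inputs n_shuffles times, query once per shuffle, pick the maximin tag."""
    rng = random.Random(seed)
    distributions = []
    for _ in range(n_shuffles):
        shuffled = list(example_tags)
        rng.shuffle(shuffled)
        distributions.append(next_tag_distribution(shuffled))
    return maximin_tag(distributions)


def toy_model(shuffled: List[str]) -> TagDist:
    """Made-up stub: mostly parrots the first input tag, sometimes proposes a new one."""
    return {shuffled[0]: 0.7, "new tag": 0.3}


robust = sample_robust_tag(["a", "b", "c"], toy_model)
```

With this stub, each input tag gets probability zero under any shuffle that puts a different tag first, so the maximin choice is the tag that every shuffle supports a little, which here is the one not in the inputs. Whether a real model behaves this way is exactly the bet being made.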

PSA: Tagging is Awesome

You make it sound like it wants things. It could at most pretend to be something that wants things. If there's a UFAI in there that is carefully managing its bits of anonymity (which sounds as unlikely as your usual conspiracy theory - a myopic neural net of this level should keep a secret no better than a conspiracy of a thousand people), it's going to have better opportunities to influence the world soon enough.

PSA: Tagging is Awesome

Just ask GPT to do the tagging, people.

Gurkenglas's Shortform

The WaveFunctionCollapse algorithm measures whichever tile currently has the lowest entropy. GPT-3 always just measures the next token. Of course in prose those are usually the same, but I expect some qualitative improvements once we get structured data with holes such that any of them might have the lowest entropy, a transformer trained to fill holes, and the resulting ability to pick which hole to fill next.

Until then, I expect those prompts/GPT protocols to perform well that happen to present the holes in your data in the order that WFC would have picked, i.e. ask it to show its work; don't ask it to write the bottom line of its reasoning process first.

Long shortform short: Include the sequences in your prompt as instructions :)
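A minimal sketch of the "fill the lowest-entropy hole first" loop, to contrast it with strict left-to-right generation. The `predict` callable is a hypothetical stand-in for a hole-filling model, and `toy_predict` is a made-up stub. One simplification is labelled in the code: WFC samples the collapsed value, while this sketch takes the most likely value so the result is deterministic.

```python
import math
from typing import Callable, Dict, List, Optional, Tuple

Dist = Dict[str, float]
Cells = List[Optional[str]]


def entropy(dist: Dist) -> float:
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def fill_lowest_entropy_first(cells: Cells,
                              predict: Callable[[Cells, int], Dist]
                              ) -> Tuple[Cells, List[int]]:
    """Repeatedly fill the hole (None) whose predicted distribution is most certain."""
    cells = list(cells)
    order = []
    while any(c is None for c in cells):
        holes = [i for i, c in enumerate(cells) if c is None]
        dists = {i: predict(cells, i) for i in holes}
        pick = min(holes, key=lambda i: entropy(dists[i]))
        # Simplification: argmax instead of sampling from the distribution.
        cells[pick] = max(dists[pick], key=dists[pick].get)
        order.append(pick)
    return cells, order


def toy_predict(cells: Cells, i: int) -> Dist:
    """Made-up stub: a hole is uncertain unless it has a filled neighbour to copy."""
    neighbours = [cells[j] for j in (i - 1, i + 1)
                  if 0 <= j < len(cells) and cells[j] is not None]
    if not neighbours:
        return {"a": 0.5, "b": 0.5}
    known = neighbours[0]
    other = "b" if known == "a" else "a"
    return {known: 0.9, other: 0.1}


filled, order = fill_lowest_entropy_first([None, None, "a"], toy_predict)
```

Here the hole next to the known cell is filled first (order `[1, 0]`), whereas a next-token model would have had to commit to the leftmost hole while it was still a coin flip.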

How will internet forums like LW be able to defend against GPT-style spam?

The obvious answer to spammers being run by GPT is mods being run by GPT. Ask it whether each comment is high-quality or generated, then act on that as needed to keep the site functional.

Competition: Amplify Rohin’s Prediction on AGI researchers & Safety Concerns

It was meant as a submission, except that I couldn't be bothered to actually implement my distribution on that website :) - even/especially after superintelligent AI, researchers might come to the conclusion that we weren't prepared and *shouldn't* build another - regardless of whether the existing sovereign would allow it.

Optimizing arbitrary expressions with a linear number of queries to a Logical Induction Oracle (Cartoon Guide)

Answering with a point estimate seems rather silly. Shouldn't it answer with a distribution? Then one question would be enough.

Can you get AGI from a Transformer?

Re claim 1: If you let it use the page as a scratch pad, you can also let it output commands to a command line interface so it can outsource these hard-to-emulate calculations to the CPU.

Competition: Amplify Rohin’s Prediction on AGI researchers & Safety Concerns

Not quite. Just look at the prior and draw the vertical line at 2030. Note that you're incentivizing people to submit their guess as late as possible, both to have time to read other comments and to put their guess right to one side of someone else's.
