To understand reality, especially on confusing topics, it's important to understand the mental processes involved in forming concepts and using words to speak about them.
This post is part of the output from AI Safety Camp 2023’s Cyborgism track, run by Nicholas Kees Dupuis - thank you to AISC organizers & funders for their support. Thank you for comments from Peter Hroššo; and the helpful background of conversations about the possibilities (and limits) of LLM-assisted cognition with Julia Persson, Kyle McDonnell, and Daniel Clothiaux.
Epistemic status: this is not a rigorous or quantified study, and much of this might be obvious to people experienced with LLMs, philosophy, or both. It is mostly a writeup of my (ukc10014) investigations during AISC and is a companion to The Compleat Cybornaut.
This post documents research into using LLMs for domains such as culture, politics, or philosophy (which arguably are different - from the perspective of research approach...
Are you confident in your current ontology? Are you convinced that ultimately all ufos are prosaic in nature?
If so, do you want some immediate free money?
I suspect that LWers are overconfident in their views on ufos/uap. As such, I'm willing to offer what I think many will find to be very appealing terms for a bet.
Essentially, I wish to bet on the world and rationalists eventually experiencing significant ontological shock as it relates to the nature of some ufos/uap.
Offer me odds for a bet, and the maximum payout you are willing to commit to. I will pick 1+ from the pool and immediately pay out to you. In the event that I ultimately win the bet, then you will pay out back to me.
I'm looking to...
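A worked example of the bet structure described above, as I understand it (the numbers here are illustrative, not terms actually offered in the post):

```python
def bet_terms(odds, max_payout):
    """Compute the up-front stake for a bet at N:1 odds against the claim.

    The poster pays the stake immediately; the counterparty owes
    max_payout only if the bet later resolves in the poster's favor.
    (Illustrative sketch only; the actual odds and caps are up to
    whoever takes the bet.)
    """
    stake = max_payout / odds
    return stake, max_payout

# e.g. at 20:1 odds with a $2,000 cap, the poster pays $100 up front
stake, payout = bet_terms(odds=20, max_payout=2000)
```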
How do the desires of possible executors/heirs/etc. factor into this?
Clearly the bet will not auto-extinguish and auto-erase itself regardless of the future desires of anyone.
If you thought I implied that the bet must be settled in purely monetary terms, that wasn't my intention. It's entirely possible for the majority, or entirety, of the bet to be settled with non-monetary currencies, such as social-status, reputation, etc...
It's just not all that likely for someone, or their successors, to insist on going down that path.
"AI alignment" has the application, the agenda, less charitably the activism, right in the name. It is a lot like "Missiology" (the study of how to proselytize to "the savages") which had to evolve into "Anthropology" in order to get atheists and Jews to participate. In the same way, "AI Alignment" excludes e.g. people who are inclined to believe superintelligences will know better than us what is good, and who don't want to hamstring them. You can think we're well rid of these people. But you're still excluding people and thereby reducing the amount of thinking that will be applied to the problem.
"Artificial Intention research" instead emphasizes the space of possible intentions, the space of possible minds, and stresses how intentions that are not natural (constrained by...
""AI alignment" has the application, the agenda, less charitably the activism, right in the name."
This seems like a feature, not a bug. "AI alignment" is not a neutral idea. We're not just researching how these models behave or how minds might be built neutrally out of pure scientific curiosity. It has a specific purpose in mind - to align AIs. Why would we not want this agenda to be part of the name?
(By "most promising" I mostly mean "not obviously making noob mistakes", with the central examples being "any Proper Noun research agenda associated with a specific person or org".)
(By "formal" I mean "involving at least some math proofs, and not solely coding things".)
Asking because the field is relatively small, and I'm not sure any single person "gets" all of it anymore.
Example that made me ask this (not necessarily a central example): Nate Soares wrote this about John Wentworth's work, but then Wentworth replied saying it was inaccurate about his current/overall priorities.
This post is crossposted from my blog. If you liked this post, subscribe to Lynette's blog to read more -- I only crosspost about half my content to other platforms.
If you’re going into surgery, you want the youngest operating surgeon available.
This is a slight exaggeration – you don’t want a doctor in their first year out of medical school.[1] After that, it’s less clear. One review found thirty-two studies indicating that the older a doctor was, the worse their medical outcomes; that review only found one study indicating that all outcomes got better with increasing age.[2] Other analyses suggest that middle-aged doctors might do better than younger doctors (though the effect is not statistically significant)[3], but older doctors are still clearly worse than middle-aged doctors.[4]
It’s not like doctors...
I have found a lot of online summaries of deliberate practice frustratingly vague. So I bought a well-reviewed, out-of-print manual on deliberate practice in music called The Practiceopedia. The chapter headings give some idea of the level of detail it aims for. I might do a book review at some point.
Chapter guide
Beginners: curing your addiction to the start of your piece
Blinkers: shutting out the things you shouldn't be working on
Boot camp: where you need to send passages that won't behave
Breakthroughs diary: keeping track of your progress
Bridgin...
This is a draft written by J. Dmitri Gallow, Senior Research Fellow at the Dianoia Institute of Philosophy at ACU, as part of the Center for AI Safety Philosophy Fellowship. This draft is meant to solicit feedback. Here is a PDF version of the draft.
The thesis of instrumental convergence holds that a wide range of ends have common means: for instance, self-preservation, desire preservation, self-improvement, and resource acquisition. Bostrom (2014) contends that instrumental convergence gives us reason to think that "the default outcome of the creation of machine superintelligence is existential catastrophe". I use the tools of decision theory to investigate whether this thesis is true. I find that, even if intrinsic desires are randomly selected, instrumental rationality induces biases towards certain kinds of choices....
A quick prefatory note on how I'm thinking about 'goals' (I don't think it's relevant, but I'm not sure): as I'm modelling things, Sia's desires/goals are given by a function from ways the world could be (colloquially, 'worlds') to real numbers, U, with the interpretation that U(w) is how well satisfied Sia's desires are if w turns out to be the way the world actually is. By 'the world', I mean to include all of history, from the beginning to the end of time, and I mean to encompass every region of space. I assume that this functio...
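A minimal toy version of this setup, just to fix intuitions (this is my own illustrative sketch, not the draft's formalism: a finite set of worlds, a randomly sampled desire function, and acts as probability distributions over worlds):

```python
import random

# A finite set of "worlds" (stand-ins for complete world-histories).
worlds = ["w1", "w2", "w3", "w4"]

# Randomly selected desires: a function U from worlds to real numbers,
# where U[w] is how well satisfied the agent's desires are at world w.
U = {w: random.gauss(0.0, 1.0) for w in worlds}

# Each available act is a probability distribution over worlds.
acts = {
    "act_a": {"w1": 0.5, "w2": 0.5, "w3": 0.0, "w4": 0.0},
    "act_b": {"w1": 0.0, "w2": 0.0, "w3": 0.5, "w4": 0.5},
}

def expected_utility(act):
    """Instrumental rationality as expected desire-satisfaction."""
    return sum(p * U[w] for w, p in acts[act].items())

# The instrumentally rational choice, given these (random) desires.
best = max(acts, key=expected_utility)
```

Even in a model this small, one can ask the paper's question: do some acts get chosen more often than chance when U is resampled many times?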
Sometimes, people have life problems that can be entirely solved by doing one thing. (doing X made my life 0.1% better, PERMANENTLY!) These are not things like "This TAP made me exercise more frequently", but rather like "moving my scale into my doorway made me weigh myself more, causing me to exercise more frequently" - a one-shot solution that makes a reasonable amount of progress in solving a problem.
I've found that I've had a couple of life problems that I couldn't solve because I didn't know what the solution was, not because it was hard to solve - once I thought of the solution, implementation was not that difficult. I'm looking to collect various one-shot solutions to problems to expand my solution space, as well as potentially find solutions to problems that I didn't realize I had.
Please only put one problem-solution pair per answer.
here's a small improvement for me. i open a lot of tabs every day, sometimes to read them later, etc. it would get really disorganized, till i enabled a setting that makes new tabs open to the right of the current one, rather than to the right of all of them. it still gets disorganized, but not as much. also, now i don't need to scroll all the way to the right on my tab list to get to one i just opened, and can just ctrl + click -> ctrl + tab.
(there may be a better solution for this, like a tab manager addon, though)
If a technology may introduce catastrophic risks, how do you develop it?
It occurred to me that the Wright Brothers’ approach to inventing the airplane might make a good case study.
The catastrophic risk for them, of course, was dying in a crash. This is exactly what happened to one of the Wrights’ predecessors, Otto Lilienthal, who attempted to fly using a kind of glider. He had many successful experiments, but one day he lost control, fell, and broke his neck.
Believe it or not, the news of Lilienthal’s death motivated the Wrights to take up the challenge of flying. Someone had to carry on the work! But they weren’t reckless. They wanted to avoid Lilienthal’s fate. So what was their approach?
First,...
The Wrights invented the airplane using an empirical, trial-and-error approach. They had to learn from experience. They couldn’t have solved the control problem without actually building and testing a plane. There was no theory sufficient to guide them, and what theory did exist was often wrong. (In fact, the Wrights had to throw out the published tables of aerodynamic data, and make their own measurements, for which they designed and built their own wind tunnel.)
This part in particular is where I think there's a whole bunch of useful lessons for alignment...
Apple is offering a VR/AR/XR headset, Vision Pro, for the low, low price of $3,500.
I kid. Also I am deadly serious.
The value of this headset to a middle class American or someone richer than that is almost certainly either vastly more than $3,500, or at best very close to $0.
This type of technology is a threshold effect. Once it gets good enough, if it gets good enough, it will feel essential to our lives and our productivity. Until then, it’s a trifle.
Thus, like Divia Eden, I am bullish on using the Tesla strategy of offering a premium product at a premium price, then later either people decide they need it and pay up or you scale enough to lower costs – if the tech delivers.
Gaming could be...
It seems that the "ethical simulator" from point 1 and the LLM-based agent from point 2 overlap, so you just overcomplicate things if you make them two distinct systems. An LLM prompted with the right "system prompt" (virtue ethics) + doing some branching-tree search for optimal plans according to some trained "utility/value" evaluator (consequentialism) + filtering out plans which have actions that are always prohibited (law, deontology). The second component is the closest to what you described as an "ethical simulator", but it is not quite that: the "utility/value" evaluator cannot say whether an action or a plan is ethical or not in absolute terms; it can only compare some proposed plans for the particular situation by some planner.
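A rough sketch of how those three components could fit together (all names here are hypothetical stand-ins: `propose_plans` plays the role of the LLM, `value` the trained evaluator, and `PROHIBITED` the deontological filter):

```python
# Actions assumed to be always forbidden, regardless of consequences.
PROHIBITED = {"deceive", "harm"}

def propose_plans(situation):
    # Stand-in for an LLM generating candidate plans; a branching-tree
    # search would expand these step by step instead of listing them.
    return [["ask", "help"], ["deceive", "help"], ["wait"]]

def value(plan):
    # Stand-in for a trained utility/value evaluator. Note it only
    # *compares* candidate plans; it cannot certify a plan as ethical
    # in absolute terms.
    return len(plan)  # toy heuristic: prefer more thorough plans

def choose(situation):
    # Deontological filter first, then consequentialist comparison.
    candidates = [p for p in propose_plans(situation)
                  if not PROHIBITED.intersection(p)]
    return max(candidates, key=value) if candidates else None

plan = choose("user asks for assistance")
```

The point of the sketch is the division of labor: the filter vetoes, the evaluator ranks, and neither one delivers an absolute ethical verdict on its own.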