In this post, I endorse forum participation (aka commenting) as a productive research strategy that I've stumbled upon, and recommend that others at least try it. Note that this is different from saying that forum/blog posts are a good way for a research community to communicate. It's about individually doing better as researchers.
Summary: The post describes a method that allows us to use an untrustworthy optimizer to find satisficing outputs.
Acknowledgements: Thanks to Benjamin Kolb (@benjaminko), Jobst Heitzig (@Jobst Heitzig) and Thomas Kehrenberg (@Thomas Kehrenberg) for many helpful comments.
Imagine you have black-box access to a powerful but untrustworthy optimizing system, the Oracle. What do I mean by "powerful but untrustworthy"? I mean that, when you give an objective function f as input to the Oracle, it will output an element that has an impressively low[1] value of f. But sadly, you have no guarantee that it will output the optimal element, and not, e.g., one that was also chosen for a different purpose (which might be dangerous for many reasons, e.g. instrumental convergence).
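To make the setup concrete, here is a minimal sketch in Python of the naive "verify before accepting" version of this idea. It is my own illustration, not the post's protocol (which is more careful than this); the names and types are placeholders.

```python
from typing import Callable, Optional

# Placeholder types: the Oracle takes an objective f and returns some element
# of the search space (here modeled as bytes).
Objective = Callable[[bytes], float]
Oracle = Callable[[Objective], bytes]

def query_satisficing(oracle: Oracle, f: Objective, threshold: float) -> Optional[bytes]:
    """Ask the untrusted Oracle to minimize f, but only accept an output
    we can verify ourselves to satisfice, i.e. f(x) <= threshold."""
    candidate = oracle(f)          # untrusted step: we cannot see how the Oracle searched
    if f(candidate) <= threshold:  # trusted step: verification is just evaluating f
        return candidate
    return None                    # reject anything that fails the check
```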
What questions can you safely ask the Oracle? Can you use it to...
First thought: The oracle is going to choose to systematically answer or not answer the queries we give it. This represents a causal channel of one bit per query it can use to influence the outside world[1]. Can you conquer the world in one awkwardly delivered kilobyte or less? Maybe.
Agreed. I think it's potentially a good bit worse than one kilobyte if we let ourselves be tricked into asking many questions or different questions, or into lowering the difficulty of the safety constraint too much.
As mentioned in footnote 10, this requires a kind of perfect coordination...
This is the ninth post in my series on Anthropics. The previous one is The Solution to Sleeping Beauty.
There are some quite pervasive misconceptions about betting with regard to the Sleeping Beauty problem.
One is that you need to switch between halfer and thirder stances based on the betting scheme proposed. As if learning about a betting scheme is supposed to affect your credence in an event.
Another is that halfers should bet at thirders odds and, therefore, thirdism is vindicated on the grounds of betting. What do halfers even mean by probability of Heads being 1/2 if they bet as if it's 1/3?
In this post we are going to correct them. We will see how to arrive at the correct betting odds from both the thirdist and halfist positions, and...
(Tl;dr: Sleeping Beauty is an edge case where different reward structures are intuitively possible, so people imagine different game payout structures behind the definition of "probability". Once the payout structure is fixed, the confusion is gone. With a fixed payout structure and preference framework rewarding the number you output as "probability", people don't disagree about what the best number to output is. Sleeping Beauty is about definitions.)
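To make the "fixed payout structure" point concrete, here is a small simulation (my own sketch, not from the post; the function and parameter names are mine). Per-awakening tickets on Heads break even at a price of 1/3, per-experiment tickets at 1/2, so once the structure is fixed there is nothing left to disagree about:

```python
import random

def avg_profit(price: float, per_awakening: bool, n: int = 200_000) -> float:
    """Beauty pays `price` for a ticket paying 1 if the coin landed Heads.
    Heads -> one awakening (Monday); Tails -> two awakenings (Monday, Tuesday)."""
    total = 0.0
    for _ in range(n):
        heads = random.random() < 0.5
        awakenings = 1 if heads else 2
        bets = awakenings if per_awakening else 1   # how many tickets she ends up buying
        payoff = 1.0 if heads else 0.0
        total += bets * (payoff - price)
    return total / n

print(avg_profit(price=1/3, per_awakening=True))   # ~0: per-awakening bets break even at 1/3
print(avg_profit(price=1/2, per_awakening=False))  # ~0: per-experiment bets break even at 1/2
```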
Your posts argue that if a tree falls on a deaf Sleeping Beauty, in a forest with no one to ...
This is my personal opinion, and in particular, does not represent anything like a MIRI consensus; I've gotten push-back from almost everyone I've spoken with about this, although in most cases I believe I eventually convinced them of the narrow terminological point I'm making.
In the AI x-risk community, I think there is a tendency to ask people to estimate "time to AGI" when what is meant is really something more like "time to doom" (or, better, point-of-no-return). For about a year, I've been answering this question "zero" when asked.
This strikes some people as absurd or at best misleading. I disagree.
The term "Artificial General Intelligence" (AGI) was coined in the early 00s, to contrast with the prevalent paradigm of Narrow AI. I was getting my undergraduate computer science...
Yes, I agree. Whenever I think about things like this, I focus on how what matters, in the sense of "when will AGI be transformational", is the idea of criticality.
I have written about it earlier, but the simple idea is that our human world changes rapidly when AI capabilities in some way lead to more AI capabilities at a fast rate.
So this whole "is this AGI" question is largely irrelevant; all that matters is criticality. You can imagine subhuman systems reaching criticality, and you can imagine superhuman systems being needed. (Note ordinary humans do have criticality...
previously: https://www.lesswrong.com/posts/h6kChrecznGD4ikqv/increasing-iq-is-trivial
I don't know to what degree this will wind up being a constraint. But given that many of the things that help in this domain have independent lines of evidence for benefit, it seems worth collecting them.
Food:
Dark chocolate, beets, blueberries, fish, eggs. I've had good effects with strong hibiscus and mint tea (both vasodilators).
Exercise:
Regular cardio, stretching/yoga, going for daily walks.
Learning:
Meditation, math, music, enjoyable hobbies with a learning component.
Light therapy:
Unknown effect size, but increasingly cheap to test over the last few years. I was able to get Too Many lumens for under $50. Sun exposure has a larger effect size here, so exercising outside is helpful.
Cold exposure:
This might mostly just be exercise for the circulatory system, but cold showers might also have some unique effects.
Chewing on things:
Increasing blood...
Sources are a shallow dive of Google and reading a few abstracts; this is intended as trailheads for people, not firm recommendations. If I wanted them to be recommendations, I would want to estimate effect sizes and the quality of the related research.
Lots of people already know about Scott Alexander/ACX/SSC, but I think that crossposting to LW is unusually valuable in this particular case, since lots of people were waiting for a big Schelling-point overview of the 15-hour Rootclaim Lab Leak debate, and unlike LW, ACX's comment section is a massive vote-less swamp that lags the entire page and gives everyone equal status.
It remains unclear whether commenting there is worth your time if you think you have something worth saying, since there's no sorting, only sifting, implying that it attracts small numbers of sifters instead of large numbers of people who expect sorting.
Here are the first 11 paragraphs:
...Saar Wilf is an Israeli entrepreneur. Since 2016, he’s been developing a new form of reasoning, meant to transcend normal human bias.
His
I agree, I think the most likely version of the lab leak scenario does not involve an engineered virus. Personally I would say 60% chance zoonotic, 40% chance lab leak.
He was 90 years old.
His death was confirmed by his stepdaughter Deborah Treisman, the fiction editor for the New Yorker. She did not say where or how he died.
The obituary also describes an episode from his life that I had not previously heard (but others may have):
...Daniel Kahneman was born in Tel Aviv on March 5, 1934, while his mother was visiting relatives in what was then the British mandate of Palestine. The Kahnemans made their home in France, and young Daniel was raised in Paris, where his mother was a homemaker and his father was the chief of research for a cosmetics firm.
During World War II, he was forced to wear a Star of David after Nazi German forces occupied the city in 1940. One night
(I assume you mean the story with him and the SS soldier; I think a couple of people got confused and thought you were referring to the fact Kahneman had died)
In January, I defended my PhD thesis, which I called Algorithmic Bayesian Epistemology. From the preface:
For me as for most students, college was a time of exploration. I took many classes, read many academic and non-academic works, and tried my hand at a few research projects. Early in graduate school, I noticed a strong commonality among the questions that I had found particularly fascinating: most of them involved reasoning about knowledge, information, or uncertainty under constraints. I decided that this cluster of problems would be my primary academic focus. I settled on calling the cluster algorithmic Bayesian epistemology: all of the questions I was thinking about involved applying the "algorithmic lens" of theoretical computer science to problems of Bayesian epistemology.
Although my interest in mathematical reasoning about uncertainty...
Thanks! Here are some brief responses:
From the high-level summary here, it sounds like you're offloading the task of aggregation to the forecasters themselves. It's odd to me that you're describing this as arbitrage.
Here's what I say about this anticipated objection in the thesis:
...For many reasons, the expert may wish to make arbitrage impossible. First, the principal may wish to know whether the experts are in agreement: if they are not, for instance, the principal may want to elicit opinions from more experts. If the experts collude to report an aggregate
The following is an example of how, if one assumes that an AI (in this case an autoregressive LLM) has "feelings", "qualia", "emotions", whatever, it can be unclear whether it is experiencing something more like pain or something more like pleasure in some settings, even quite simple settings which already happen a lot with existing LLMs. This dilemma is part of the reason why I think the philosophy of AI suffering/happiness is very hard and we most probably won't be able to solve it.
Consider the two following scenarios:
Scenario A: An LLM is asked a complicated question and answers it eagerly.
Scenario B: A user insults an LLM and it responds.
For the sake of simplicity, let's say that the LLM is an autoregressive transformer with no RLHF (I personally think that the...
I quality-downvoted it for being silly, but agree-upvoted it because AFAICT that string does indeed contain all the (lowercase) letters of the English alphabet.
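For the curious, verifying that kind of claim is a one-liner; a sketch, with `s` standing in for the string in question (not reproduced here):

```python
import string

s = "..."  # placeholder for the string in question
print(set(string.ascii_lowercase) <= set(s.lower()))  # True iff every lowercase English letter appears
```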