Jonathan Claybrough

Software engineer who transitioned into AI safety, teaching, and strategy. Particularly interested in psychology, game theory, system design, and economics.

Comments

There's tacit knowledge in Bay rationalist conversation norms that I'm discovering and thinking about; here's an observation and a related thought. (I put the example after the generalisation because that's my preferred style; feel free to read in the other order.)

Willingness to argue righteously and hash things out to the end, repeated over many conversations, makes it more salient when you're pursuing a dead-end argument. That salience can inspire you to argue more concisely and to the point over time. 
Going to the end of things generates ground data on which to update your models of arguing and conversation paths, instead of leaving things unanswered. 
So, though it's skilful to know when not to "waste" time on details and unimportant disagreements, the norm of "frequently enough going through till everyone agrees" seems profoundly virtuous. 

Short example from today: I say "good morning". They point out that it's not morning (it's 12:02). I comment that 2 minutes is not that much. They argue that 2 minutes is definitely more than zero, and that's the important cut-off. 
I realize that "2 minutes is not that much" was not my true rebuttal; that next token my brain generated was mostly defensive reasoning rather than curious exploration of why they disagreed with my statement. Next time I could instead note that they're using "morning" with a different definition/central cluster than I do, appreciate that they pointed this out, and decide whether I want to explore this discrepancy or not. 

Many things don't make sense if you're just doing them for local effect, but do when you consider long-term gains. (Something something naive consequentialism vs virtue-ethics-flavored stuff.)

I don't strongly disagree, but I do weakly disagree on some points, so I guess I'll answer.

Re the first: if you buy into automated alignment work by human-level AGI, then trying to align ASI now seems less worth it. The strongest counterargument to this I see is that "human-level AGI" is impossible to get with our current understanding, as it will be superhuman at some things and weirdly bad at others.

Re the second: the disagreement might be nitpicking on "few other approaches" vs "few currently pursued approaches". There are probably a bunch of things that would allow fundamental understanding if they panned out (various agent foundations agendas, provably safe AI agendas like davidad's), though one can argue they won't apply to deep learning or are less promising to explore than SLT.

I don't think your second footnote sufficiently addresses the large variance in 3D visualization abilities (note that I do say visualization, which includes seeing a 2D video in your mind of a 3D object and manipulating it smoothly). Overall I'm not sure what you're getting at if you don't ground your post in specific predictions about what you expect people can and cannot do thanks to their ability to visualize 3D. 

You might be ~conceptually right that our eyes see "2D" and add depth, but *um ackshually*, two eyes each receiving 2D data means you've received 4D input (using ML conventions, you've got 4 input dimensions per time unit, 5 overall in your tensor). It's very redundant, and that redundancy mostly allows you to extract depth using a local algorithm, which lets you build a 3D map in your mental representation. I don't get why you claim we don't have a 3D map at the end.
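
(To make the dimension counting concrete, here's a quick sketch, not from the original post; the colour-channel axis and the specific resolutions are my own assumptions about how to reach 4 axes per time unit and 5 overall.)

```python
import numpy as np

# One possible layout of binocular visual input as an ML-style tensor.
# Shapes are arbitrary placeholders; the colour-channel axis is my assumption
# for how the "4 input dimensions per time unit, 5 overall" count works out.
T, EYES, CHANNELS, H, W = 30, 2, 3, 480, 640

per_time_step = np.zeros((EYES, CHANNELS, H, W))     # 4 axes per time unit
full_stream = np.zeros((T, EYES, CHANNELS, H, W))    # 5 axes with time included

print(per_time_step.ndim, full_stream.ndim)  # -> 4 5

# The two eye images are largely redundant; a local comparison between them
# (disparity) is what lets you recover depth and build a 3D map.
```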

Back to concrete predictions: are there things you expect a strong human visualizer couldn't do? To give intuition, I'd say a strong visualizer has at least the equivalent visualizing, modifying, and measuring capabilities of SolidWorks/Blender in their mind. You tell one to visualize a 3D object they know, and they can tell you anything about it.

It seems to me the most important thing you noticed is that in real life we rarely see past the surface of things (the spectrum of light we see doesn't penetrate most materials), so most people don't know the insides of 3D objects very well. But that can be explained by lack of exposure rather than an inability to understand 3D. 

Fwiw, looking at the spheres I guessed a volume ratio of approximately 2.5. I'm curious: if you visualized yourself picking up these two spheres one after the other, imagining them made of a dense metal, could you feel that one is 2.3 times heavier than the other?
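
(For reference, here's the arithmetic I'm relying on; the only assumption is that both spheres have the same density, so the weight ratio equals the volume ratio.)

$$\frac{V_2}{V_1} = \left(\frac{r_2}{r_1}\right)^{3} \quad\Longrightarrow\quad \frac{r_2}{r_1} = \left(\frac{V_2}{V_1}\right)^{1/3}, \qquad 2.3^{1/3} \approx 1.32, \quad 2.5^{1/3} \approx 1.36.$$

So guessing 2.5 when the true ratio is 2.3 only corresponds to misjudging the radius ratio by about 3%.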

I'll give fake internet points to whoever actually follows the instructions and posts photographic proof.

The naming might be confusing because "pivotal act" sounds like a one-time action, but in most cases getting to a stable world without any threat from AI requires ongoing pivotal processes. This makes almost all the destructive approaches moot (and they're probably already ruled out by ethical concerns and the many others already discussed), because you'll make yourself a pariah.

The most promising avenue for a pivotal act/pivotal process that I know of is doing good research so that ASI risks are known and proven, doing good outreach and education so that most world leaders and decision makers are well aware of them, and helping set up good governance worldwide to monitor and limit the development of AGI and ASI until we can control it.

I recently played Outer Wilds and Subnautica, and the exercise I recommend for both of these games is: get to the end of the game without ever failing. 
In Subnautica, failing means dying even once; in Outer Wilds it's a spoiler to describe what failing is (successfully getting to the end could certainly be argued to be a fail).
I failed in both. I played Outer Wilds first and was surprised by my failure, which inspired me to play Subnautica without dying. I got pretty far but eventually died from a mix of one unexpected game mechanic, careless measurement of another mechanic, and a lack of redundancy in my contingency plans. 

Oh wow, that makes sense. It felt weird that you'd spend so much time on posts, yet if you didn't spend much time it would mean you write at least as fast as Scott Alexander. Well, thanks for putting in the work. I probably don't publish much because I want good posts to not take much work, so it's reassuring to hear it's normal that they do.

(Aside: I generally like your posts' scope and clarity; mind saying how long it takes you to write something of this length?)

Self-modeling is a really important skill, and you can measure how good you are at it by writing predictions about yourself. A notably important one for people who have difficulty with motivation is predicting your own motivation: will you be motivated to do X in situation Y?

If you can answer that question generally, you can plan to actually do anything you could theoretically do, using the following algorithm: from current situation A, to achieve wanted outcome Z, find a predecessor situation Y from which you'll be motivated to get to Z (e.g. having written 3 of an essay's 4 paragraphs), then a predecessor situation X from which you'll get to Y, and iterate until you reach A (or forward-chain, from A to Z). Check that you'll indeed be motivated at each step of the way.

How can the above plan fail? Either you were mistaken about yourself, or about the world. Figure out which and iterate.
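
(Here's a minimal sketch of that backward-chaining loop; `predecessors` and `motivated` are hypothetical stand-ins for your world-model and self-model, and the essay example is just a toy.)

```python
# Backward-chain from a wanted outcome Z to the current situation A,
# keeping only steps you predict you'll actually be motivated to take.
def plan_backwards(current, goal, predecessors, motivated, max_depth=20):
    """Return a list of situations [current, ..., goal], or None if no
    motivated chain is found within max_depth steps."""
    chain = [goal]
    state = goal
    for _ in range(max_depth):
        if state == current:
            return list(reversed(chain))
        # Find a predecessor situation from which you'd be motivated
        # to move to the current link of the chain.
        options = [p for p in predecessors(state) if motivated(p, state)]
        if not options:
            return None  # mistaken about yourself or the world: revise and iterate
        state = options[0]
        chain.append(state)
    return None


# Toy usage with a hypothetical essay example.
if __name__ == "__main__":
    preds = {
        "essay done": ["3/4 paragraphs written"],
        "3/4 paragraphs written": ["outline written"],
        "outline written": ["sitting at desk"],
    }
    plan = plan_backwards(
        current="sitting at desk",
        goal="essay done",
        predecessors=lambda s: preds.get(s, []),
        motivated=lambda frm, to: True,  # replace with your real self-prediction
    )
    print(plan)
    # -> ['sitting at desk', 'outline written', '3/4 paragraphs written', 'essay done']
```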

I appreciate the highlight of identity as this important/crucial self-fulfilling prophecy; I use that frame a lot.

What does the title mean? Since they all disagree, I don't see one as being more of a minority than the others. 
