some youtube channels I recommend for those interested in understanding current capability trends; I've split them into separate comments for votability. Please open each one as it catches your eye, then come back and vote on it. a downvote just means "not mission critical" - there's plenty of good stuff down there too.
I'm subscribed to every single channel on this list (this is actually about 10% of my youtube subscription list). I mostly find videos from these channels by letting the youtube recommender surface them and pushing myself to watch them at least somewhat, to give the cute little obsessive recommender the reward it seeks for showing me this stuff. I'd definitely recommend subscribing to everything here.
Let me know which if any of these are useful, and please forward the good ones to folks - this shortform thread won't get seen by that many people!
Yannic Kilcher: paper explanations, capability news. Yannic is the machine
learning youtuber. 129k subscribers, every one of whom has published 200 papers
on machine learning (I kid). Has some of the most in-depth and broadest paper
explanations, with detailed drawings of his understanding of each paper. Great
for getting a sense of how to read a machine learning paper. His paper choices
are top-notch and his ML news videos have really great capabilities news.
https://www.youtube.com/channel/UCZHmQk67mSJgfCCTn7xBfew
5the gears to ascension1y
Valence Discovery: graph NNs, advanced chem models. Valence Discovery is a
research group focusing on advanced chemical modeling. We don't have
full-strength general agent AI to plug into this quite yet, and certainly not safe
reinforcement learning, but work like theirs has thoroughly eclipsed human
capabilities in understanding chemicals. as long as we can use narrow ai to
prevent general AI from destroying the cooperation network between beings, I
think work like this has the potential to give the world every single goal of
transhumanism: post-scarcity, molecular assemblers, life extension, full bodily
autonomy and morphological freedom - the whole lot should be accessible. It'll
take a bit longer to get to that level, but the research trajectory continues to
look promising, and these models haven't been scaled as much as language models.
https://www.youtube.com/channel/UC3ew3t5al4sN-Zk01DGVKlg
5the gears to ascension1y
The Alan Turing Institute: variety, lately quite a bit of ai safety. eg:
https://www.youtube.com/channel/UCcr5vuAH5TPlYox-QLj4ySw
* they have a playlist of recent ai safety videos, many of which look like they
plausibly include information not heavily discussed, or at least not well
indexed, on LessWrong
https://www.youtube.com/watch?v=ApGusxR7JAc&list=PLuD_SqLtxSdXVSrXneEPkZtzTTQMT4hQ8
* They discuss social issues, including stuff like who gets to decide a
non-explosive ai's targets
https://www.youtube.com/watch?v=4Txa7pAOHZQ&list=PLuD_SqLtxSdVy8meO_ezV9l89Q9Gg8q6p
* quite a few more interesting playlists on safety and security of ai in the
playlists section
https://www.youtube.com/c/TheAlanTuringInstituteUK/playlists
* lots of discussion of complex systems
* in particular, I love their video on social network analysis and I recommend
it often
https://www.youtube.com/watch?v=2ZHuj8uBinM&list=PLuD_SqLtxSdWcl2vx4K-0mSflRRLyfwlJ&index=9
5the gears to ascension1y
Steve Brunton: fancy visual lectures on nonlinear control systems & ML. has some
of the best educational content I've ever seen, just barely beating Mutual
Information for explanation quality while going into much more advanced topics.
Focuses on control theory, nonlinear control, dynamical systems, etc.
https://www.youtube.com/channel/UCm5mt-A4w61lknZ9lCsZtBw
3Lone Pine1y
Where do I start with this channel? Oldest video first?
1the gears to ascension1y
It's several college courses worth of material - it really depends what you want
out of it. I personally am extremely curiosity-driven; without assessing what
you already know I don't feel able to give strong recommendations of where to
start, which is in fact why I posted so many links here in the first place. if
you want to work through Brunton's content sequentially, I'd suggest picking the
course playlist that interests you:
https://www.youtube.com/c/Eigensteve/playlists
If your interests are mostly unprimed, I'd suggest checking out the
physics-informed ML and sparsity playlists, maybe also skip around the fluid
dynamics playlist to get a sense of what's going on there. Alternately, skim a
few videos to get a sense of which ones are relevant to your interests (2x speed
with heavy jumping around), then queue the playlist that seems appropriate to
you. If you really find it useful you might benefit from actually doing it like
a course - I generally underpractice compared to the ideal amount of practice.
5the gears to ascension1y
The Simons Institute: very best wide variety, especially ai safety and game
theory. The Simons Institute for the Theory of Computing at UC Berkeley is
a contender for my #1 recommendation from this whole list. Banger talk after
banger talk after banger talk there. Several recent workshops with kickass ai
safety focus. https://www.youtube.com/user/SimonsInstitute
A notable recent workshop is "learning in the presence of strategic behavior":
https://www.youtube.com/watch?v=6Uq1VeB4h3w&list=PLgKuh-lKre101UQlQu5mKDjXDmH7uQ_4T
another fun one is "learning and games":
https://www.youtube.com/watch?v=hkh23K3-EKw&list=PLgKuh-lKre13FSdUuEerIxW9zgzsa9GK9
they have a number of "boot camp" lessons that appear to be meant for an
interdisciplinary advanced audience as well. the current focus of talks is on
causality and games, and they also have some banger talks on "how not to run a
forecasting competition", "the invisible hand of prediction", "communicating
with anecdotes", "the challenge of understanding what users want", and my
personal favorite due to its fundamental reframing of what game theory even is,
"in praise of game dynamics": https://www.youtube.com/watch?v=lCDy7XcZsSI
5the gears to ascension1y
Schwartz Reisman Institute is a multi-agent safety discussion group, one of the
very best ai safety sources I've seen anywhere. One interesting example is this
video, which I think is on the cutting edge in terms of
where AI safety will eventually end up (potentially multi-agent safety that
comes into existence after humanity dies, if we don't get there fast enough to
prevent darwinist AIs that don't love us from literally eating us, as yudkowsky
describes with the words "does not love you, does not hate you, made out of
atoms that can be used for something else"):
"An antidote to Universal Darwinism" -
https://www.youtube.com/watch?v=ENpdhwYoF5g
as well as this kickass video on "whose intelligence, whose ethics"
https://www.youtube.com/watch?v=ReSbgRSJ4WY
https://www.youtube.com/channel/UCSq8_q4SCU3rYFwnA2bDxyQ
5the gears to ascension1y
Mutual Information: visual explanations of ML fundamentals. Mutual Information
is one of the absolute best tutorial-and-explanation videos about the visual
math of basic (small-model) machine learning. includes things like gaussian
processes, which, it turns out, infinitely wide neural networks converge to (the
NNGP / neural tangent kernel line of work). This means such networks behave like
non-parametric models - the learned function is essentially a reprojection of
the training data (kinda obvious in retrospect) - and understanding gaussian
processes is not optional in
understanding how neural networks interpolate between their training data. His
video on gaussian processes is wonderful.
https://www.youtube.com/watch?v=UBDgSHPxVME - lots of other interesting videos
as well https://www.youtube.com/channel/UCCcrR0XBH0aWbdffktUBEdw
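For anyone who wants to poke at that claim directly, here's a minimal GP-regression sketch (plain numpy, RBF kernel, my own toy example rather than anything from the channel): the posterior mean is literally a weighted combination of the training targets, which is the "non-parametric / reprojection of the training data" point in miniature.

```python
import numpy as np

# Toy 1-D Gaussian process regression with an RBF kernel.
# The posterior mean at each test point is a weighted sum of the
# training targets -- a non-parametric "reprojection" of the data.

def rbf_kernel(a, b, lengthscale=0.5):
    # a: (n, 1), b: (m, 1) -> (n, m) kernel matrix
    sq_dists = (a - b.T) ** 2
    return np.exp(-0.5 * sq_dists / lengthscale**2)

rng = np.random.default_rng(0)
x_train = rng.uniform(-3, 3, size=(8, 1))
y_train = np.sin(x_train) + 0.05 * rng.normal(size=(8, 1))
x_test = np.linspace(-3, 3, 100).reshape(-1, 1)

noise = 1e-4
K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
K_star = rbf_kernel(x_test, x_train)

# Posterior mean and covariance of the GP conditioned on the training data.
weights = np.linalg.solve(K, y_train)   # one weight per training point
mean = K_star @ weights                 # weighted combination of y_train
cov = rbf_kernel(x_test, x_test) - K_star @ np.linalg.solve(K, K_star.T)

print(mean[:5].ravel())
print(np.sqrt(np.clip(np.diag(cov), 0, None))[:5])
```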
5the gears to ascension1y
Machine Learning Street Talk: Industry professionals giving talks meant for
youtube. It's one of the most interesting interview series-es (seriesen? serii?)
on youtube. Discusses stuff like gflownets with yoshua bengio, geometric deep
learning, thousand brains theory - all the stuff you really, really need to
understand if you want to have any sense at all of where machine learning is
going. (no, it's not hitting a wall.)
https://www.youtube.com/channel/UCMLtBahI5DMrt0NPvDSoIRQ
5the gears to ascension1y
IPAM at UCLA: academic talks; math, quantum, ML, game theory, ai safety, misc.
It's one of the most notable channels on this list: lots of hard math topics, but
also quite a few extremely interesting ML topics, including an absolute banger
talk series on distributed computation and collective intelligence. They also
discuss extremely interesting topics in advanced physics that are way above
my head as a self-taught ML nerd, but very interesting to attempt to absorb.
https://www.youtube.com/c/IPAMUCLA/videos
The collective intelligence workshop playlist:
https://www.youtube.com/watch?v=qhjho576fms&list=PLHyI3Fbmv0SfY5Ft43_TbsslNDk93G6jJ
5the gears to ascension1y
IARAI: cutting-edge academic ML talks. "The Institute of Advanced Research in
Artificial Intelligence" is not messing around with their name. The recent
discussion of "Neural diffusion PDEs, differential geometry, and graph neural
networks" seems to me to be a major next direction in ai capabilities,
addressing the issues with transformers using the fundamental mathematics of
graph curvature. "How GNNs and Symmetries can help solve PDEs" is also
promising, though I haven't watched all the way through yet.
https://www.youtube.com/channel/UClC7A82p47Nnj8ttU_COYeA/videos
5the gears to ascension1y
CPAIOR: formal verification in general, including on deep learning. Has a number
of interesting videos on formal verification, how it works, and some that apply
it to machine learning, eg "Safety in AI Systems - SMT-Based Verification of
Deep Neural Networks"; "Formal Reasoning Methods in Machine Learning
Explainability"; "Reasoning About the Probabilistic Behavior of Classifiers";
"Certified Artificial Intelligence"; "Explaining Machine Learning Predictions";
a few others. https://www.youtube.com/channel/UCUBpU4mSYdIn-QzhORFHcHQ/videos
3the gears to ascension1y
William Spaniel is a textbook writer and youtube video author on game theory.
Probably not as relevant to an advanced audience, but has nice if slightly janky
intros to the concepts. edit: since I posted this, he's gotten into detailed
descriptions of war incentives and as a result has become quite popular.
https://www.youtube.com/user/JimBobJenkins
1the gears to ascension1y
"Welcome AI Overlords" is a popsci ML-intros channel with high quality
explanations of things like Graph Attention Networks:
https://www.youtube.com/watch?v=SnRfBfXwLuY and an author interview with
Equivariant Subgraph Aggregation Networks:
https://www.youtube.com/watch?v=VYZog7kbXks
https://www.youtube.com/channel/UCxw9_WYmLqlj5PyXu2AWU_g
1the gears to ascension1y
"Web IR / NLP Group at NUS" has talks, many from google research, about
information retrieval, which is looking more and more likely to be a core
component of any superintelligence (what a surprise, given the size of the
internet, right? except also, information retrieval and interpolation is all
that neural networks do anyway, see work on Neural Tangent Kernel)
https://www.youtube.com/channel/UCK8KLoKYvow7X6pe_di-Gvw/videos
1the gears to ascension1y
UMich-CURLY is a research group and associated youtube channel discussing
Simultaneous Localization And Mapping (SLAM) with neural networks. a recent
overview talk was particularly interesting:
https://www.youtube.com/watch?v=TUOCMevmbOg -
https://www.youtube.com/channel/UCZ7Up19hdIWuCSuuATlzlbw/videos
1the gears to ascension1y
udiprod makes animated explainer videos about advanced computer science,
including some fun quantum computer science. also has a visualization of, eg, an
SVM. https://www.youtube.com/c/udiprod/videos
1the gears to ascension1y
The National Socio-Environmental Synthesis Center has a number of topics that
felt a bit scientifically offbeat to me, but I found their talks on knowledge
integration across disciplines remarkably interesting.
https://www.youtube.com/playlist?list=PLIGFwrZq94y-rj8CKOaVzBXGD5OTmeelc
https://www.youtube.com/c/TheNationalSocioEnvironmentalSynthesisCenter
1the gears to ascension1y
The Berkman Klein Center for Internet and Society has some interesting
discussion content that gets into ai safety:
https://www.youtube.com/playlist?list=PL68azUN8PTNjTUsspsam0m0KmmUZ6l1Sh
https://www.youtube.com/c/BKCHarvard
1the gears to ascension1y
The AI Epiphany is a solid paper explanations channel, and his choices of paper
to discuss are often telling in terms of upcoming big-deal directions. Not quite
as good as Yannic, but imo worth at least subscribing to.
https://www.youtube.com/c/TheAIEpiphany/videos
1the gears to ascension1y
Stanford MLSys Seminars is where talks from the Hazy Research group at stanford
get posted, and their work has been some of the most eye-catching for me in the
past two years. In particular, the S4 sequence model seems to me to represent a
major capability bump in next-step-after-transformers models, due to its
unusually stable learning. I might just be taken in by a shiny toy, but S4 is
the next thing I'm going to play with, capabilities-wise.
https://www.youtube.com/c/StanfordMLSysSeminars
1the gears to ascension1y
Robert Miles makes kickass AI safety videos. Y'all probably already know about
him. He repeats many opinions from LessWrong that I don't think hold up, but if
reading the archives here isn't your jam, watching the archives on his channel
might be better.
https://www.youtube.com/channel/UCLB7AzTwc6VFZrBsO2ucBMg
1the gears to ascension1y
Reducible creates absolutely kickass computer science explanation videos,
including one on why jpeg is so effective, another on the interesting
information routing in the fast fourier transform.
https://www.youtube.com/channel/UCK8XIGR5kRidIw2fWqwyHRA
1the gears to ascension1y
A few more programming languages channels I don't think are worth their own
votable comments:
PLISS - programming language implementation summer school -
https://www.youtube.com/channel/UCofC5zis7rPvXxWQRDnrTqA/videos
POPL 2019 - https://www.youtube.com/channel/UCe0bH8tWBjH_Fpqs3veiIzg
1the gears to ascension1y
another slightly-off-topic one, Paul Beckwith discusses large-scale climate
science, and hooo boy it really isn't looking good at all if his estimates are
remotely on target. We're going to need that weather superintelligence you
published a few steps towards, deepmind!
https://www.youtube.com/user/PaulHBeckwith
1the gears to ascension1y
Oxford VGG continues to be one of the most cutting edge vision research groups,
and their presentations on generative models of images, 3d neural rendering, etc
seem very promising in fixing the 3d reasoning gap that is still present in
powerful models like DALL-E 2.
https://www.youtube.com/channel/UCFXBh2WNhGDXFNafOrOwZEQ/videos
1the gears to ascension1y
One World Theoretical Machine Learning is a paper-discussions channel I've
watched nearly none of but which looks very interesting.
https://www.youtube.com/channel/UCz7WlgXs20CzugkfxhFCNFg/videos
1the gears to ascension1y
nPlan: paper discussion group - they're a research group of some kind or other
that does great paper-discussion meetups and posts them to youtube.
Paper-discussion with multiple confused researchers is in general more to my
preference than paper-explanation with one confused researcher explaining it to
the audience, because having multiple folks makes sure more questions come up.
Competitive with Yannic for "best papers-summary channel on youtube" (as far as
I've found, anyway) because of the format difference.
https://www.youtube.com/c/nPlan/videos
1the gears to ascension1y
Normalized Nerd is another overviews channel with good overviews of various
basic small-model ml approaches. Not as good as Mutual Information, but mostly
they don't overlap. https://www.youtube.com/c/NormalizedNerd/featured
1the gears to ascension1y
Neuroscientifically Challenged makes great quick-intro 2-minute videos on
neuroscience topics. Not the most important for understanding machine learning
at this point, since the parts of brain research still likely to usefully
generalize are rather advanced details of neuron behavior, and are probably less
useful than the general research direction towards conservation laws,
symmetries, continuous space & time, etc - but it's relevant to generalizing
machine learning knowledge to the brain, and to general understanding of the
brain. https://www.youtube.com/c/Neuroscientificallychallenged/videos
1the gears to ascension1y
MIT Embodied Intelligence: industry professionals giving academic talks. It's a
channel (and presumably an org of some kind) that posts talks with major industry
and research folks. Recent talks include "Recent advances in deep equilibrium
models", "The deep learning toolbox: from alphafold to alphacode", and "the
past, present, and future of SLAM".
https://www.youtube.com/channel/UCnXGbvgu9071i3koFooncAw/videos
1the gears to ascension1y
Mind under Matter is a pop-explanations channel about neuroscience, which I
absolutely love, she really goes over the top making it fun and playful and imo
hits it out of the park. Definitely upper intro level, but a great
recommendation if that's an interesting topic to you.
https://www.youtube.com/c/MindUnderMatter/videos
1the gears to ascension1y
Justin Solomon has a number of video topics on his channel, but notably a class
he taught on Shape Analysis in 2021, which covers a number of interesting
subtopics. I added the whole class to my watch later and have occasionally been
speedwatching it when it comes up on shuffle.
https://www.youtube.com/c/justinmsolomon/featured
1the gears to ascension1y
Jordan Harrod is an ML person who is also a popsci-ML video creator. She has
lots of great stuff on things like "how I self-study", "is it too late to get
into machine learning", "productivity tools I tried and didn't like", etc. not
as information dense as the talks channels, but a good subscription-without-bell
on youtube, and I occasionally love her stuff.
https://www.youtube.com/c/JordanHarrod/videos
1the gears to ascension1y
Joint Mathematics Meetings has quite a number of interesting videos on math, but
the one where I found their channel was this one: Daniel Spielman on “Miracles
of Algebraic Graph Theory”. It presents, among other things, a demonstration of
why the first eigenvectors of some graph representation or other (I have to
rewatch it every damn time to remember exactly which one) end up being an
analytical solution to force-directed graph drawing.
https://www.youtube.com/watch?v=CDMQR422LGM -
https://www.youtube.com/channel/UCKxjz1WXZOKcAh9T9CBfJoA
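If I remember right, the representation in question is the graph Laplacian; here's a tiny sketch of the idea (mine, not the talk's) that its low eigenvectors give an analytical, force-directed-style layout:

```python
import numpy as np

# Spectral graph drawing: use the 2nd and 3rd smallest eigenvectors of the
# graph Laplacian L = D - A as x/y coordinates. Minimizing total squared edge
# length subject to orthogonality constraints yields exactly these
# eigenvectors, which is why the result resembles a force-directed layout.

def spectral_layout(adjacency):
    degrees = adjacency.sum(axis=1)
    laplacian = np.diag(degrees) - adjacency
    eigvals, eigvecs = np.linalg.eigh(laplacian)  # ascending eigenvalues
    # eigvecs[:, 0] is the constant vector (eigenvalue 0); skip it.
    return eigvecs[:, 1:3]                        # (n_nodes, 2) coordinates

# Example: a 6-cycle; the spectral layout places the nodes on a hexagon.
n = 6
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1

print(spectral_layout(A))
```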
1the gears to ascension1y
Interpretable Machine Learning is an archive of some discussions about
interpretability from NeurIPS 2017. Great talks, definitely worth some
speedwatching if interpretability is of interest.
https://www.youtube.com/channel/UCv0AwnKZkSk2sU1mkETYfIw/videos
1the gears to ascension1y
"Intelligent Systems Lab" appears to be a university class focused on intro to
ML. Not my first recommendation for the topic, but solid, above 50% percentile
on this list IMO.
https://www.youtube.com/channel/UC7qFYa4HVoufKcz-2q3pr7A/videos
1the gears to ascension1y
Hugo Larochelle is a deep learning researcher who has also made a number of
interesting talks and discussion videos, including this interesting playlist
from the TechAide AI4Good conference-and-hackathon in 2020.
https://www.youtube.com/watch?v=jFRnvtiPpL8&list=PL6Xpj9I5qXYFTaKnvgyfFFkxrOb4Ss_-J
1the gears to ascension1y
Harvard Medical AI: ML for medical science, cutting edge academic talks. They
publish talks on machine learning for medical science, probably the most
important use of machine learning IMO[1] - includes eg this interesting
discussion of geometric deep learning, one of the most promising next directions
for ML in my opinion. https://www.youtube.com/watch?v=oz3vaxFleh4 -
https://www.youtube.com/channel/UCld99fdpOgqW80TW-oOvltA/videos
[1] tangent: as long as ML doesn't suddenly smash the "defect against other
life" button really really hard like yudkowsky is terrified it's totally gonna (I
think he's just given himself a paranoia disorder and is unable to evaluate
algorithms without Pascal's-mugging himself out of the steps of the reasoning
process, but that's another thread)
1the gears to ascension1y
GAMMA UMD posts paper summary videos; though they're not the most
industry-changing, they can be interesting. topics like Automatic Excavactor
[sic], Speech2AffectiveGestures, Text2Gestures, etc.
https://www.youtube.com/c/gammaunc/videos
1the gears to ascension1y
Fancy Fueko is an intro-level programming-and-AI channel. She makes great stuff
and makes it look shiny and neon - I occasionally reference her stuff when I'm
feeling mentally diffuse and need a reminder. Same category as Daniel Bourke.
https://www.youtube.com/c/fancyfueko/videos
1the gears to ascension1y
Edan Meyer makes mid-level paper explanations. Not quite as good as yannic
kilcher yet, but getting there. Has discussed a number of notable papers Yannic
hasn't gotten to yet, such as the deepmind scaling laws paper. One of the higher
production-quality, on-the-edge channels I've encountered for its level of
beginner-friendliness, though. https://www.youtube.com/c/EdanMeyer/videos
1the gears to ascension1y
"DeepMind ELLIS UCL CSML Seminar Series" (what a mouthful) appears to be a
sponsored-by-deepmind series at a school, one of those acronyms is probably the
school name. UCL? has a bunch of interesting topics, but I haven't found it to
be as cutting edge as some other channels, maybe I haven't watched the right
videos. https://www.youtube.com/channel/UCiCXRD_NcvVjkLCE39GkwVQ/videos
1the gears to ascension1y
Conference on Robot Learning has many great talks and is sponsored by a number
of serious industry groups. Examples include "Safe Reinforcement Learning", "A
fabrics perspective on nonlinear behavior representation", "walking the boundary
of learning and interaction", "integrating planning and learning for scalable
robot decision making", etc. https://www.youtube.com/c/ConferenceonRobotLearning
1the gears to ascension1y
Conference on Computer-Aided Verification has a number of interesting talks on
how to do verified neuro-symbolic ML. recent videos include "modular synthesis
of reactive programs", "neuro-symbolic program synthesis from natural language
and demonstrations", "gradient descent over metagrammars for syntax guided
synthesis". I think transformers are more powerful than any of these techniques,
but they provide interesting comparison for what a model (eg transformers) must
be able to learn in order to succeed.
https://www.youtube.com/channel/UCe3M4Hc2hCeNGk54Dcbrbpw/videos
1the gears to ascension1y
CMU Robotics has a number of interesting talks, including some about ethics of
ai robotics and robust human-robot interaction.
https://www.youtube.com/user/cmurobotics/videos
1the gears to ascension1y
CMU AI Seminar: Paper presentations by authors. Has some great talks on various
projects, such as one that I think is significantly beyond SOTA in learning
efficiency, DreamCoder: https://www.youtube.com/watch?v=KykcFYDkAHo
1the gears to ascension1y
Emergent Garden is a fairly new channel, but has a great video on why even a
simple feedforward network is already a very powerful general function
approximator. Compare Art Of The Problem.
https://www.youtube.com/watch?v=0QczhVg5HaI
1the gears to ascension1y
Art of the Problem makes explainer videos of unusually high quality among the
explainer videos I've encountered, especially the ones on deep learning.
https://www.youtube.com/playlist?list=PLbg3ZX2pWlgKV8K6bFJr5dhM7oOClExUJ
1the gears to ascension1y
AIPursuit archives talks they find notable, including many from major
conferences. a quick browse is necessary to find what you seek in this archive.
Links to several related channels they also run with subtopics, such as RL.
https://www.youtube.com/c/AIPursuit/featured
0the gears to ascension1y
"What's AI" is a popsci-only channel about ai, but the content doesn't seem
completely off base, just popular-audience focused
https://www.youtube.com/channel/UCUzGQrN-lyyc0BWTYoJM_Sg
0the gears to ascension1y
"Visual Inference" is a channel with misc paper presentation videos. Doesn't
seem like the most remarkable paper presentation videos channel ever, but it's
interesting. https://www.youtube.com/channel/UCBk6WGWfm7mjqftlHzJOt5Q/videos
0the gears to ascension1y
TUM-DAML is a research group that posts discussions of their papers. A recent
interesting one is "Ab-initio Potential Energy Surfaces by Pairing GNNs with
Neural Wave Functions". https://www.youtube.com/channel/UC0sPhfmHXhNE7lOv5J3wteg
0the gears to ascension1y
The Royal Institution is a bit like popsci for scientists. in depth talks, not
always my first choice but pretty solid and recommendable.
https://www.youtube.com/user/TheRoyalInstitution
0the gears to ascension1y
Stanford MedAI's youtube talks aren't quite as kickass as the harvard medical
channel, but they're pretty solid
https://www.youtube.com/channel/UCOkkljs06NPPkjNysCdQV4w/videos
0the gears to ascension1y
sentdex makes lots of fun tutorial and livecoding videos, including some recent
ones about building neural networks completely from scratch in order to
understand the computation steps exactly. https://www.youtube.com/user/sentdex
0the gears to ascension1y
DrSaradaHerke made a couple of classes on graph theory and discrete maths a few
years ago. Solid content. https://www.youtube.com/user/DrSaradaHerke
0the gears to ascension1y
Jeremy Mann makes tutorial videos on topics like Homological Algebra.
https://www.youtube.com/user/jmann277/videos
0the gears to ascension1y
jbstatistics is a fairly solid statistics intro class, with nice animated
explanations. not the best I've ever seen, but solid.
https://www.youtube.com/user/jbstatistics/videos
0the gears to ascension1y
the Institute for Neural Computation has some of the most interesting
hard-neuroscience talks I've found on youtube yet, such as this one about basis
vectors of the central nervous system.
https://www.youtube.com/watch?v=xQX4GIDh_pI -
https://www.youtube.com/channel/UCV1SrkEl2-UI60GZlXy5gLA/videos
0the gears to ascension1y
the Institute for Advanced Study has many remarkable videos, but they are on a
wide variety of mathematical topics. A recent interesting-and-on-topic one is
"Multi-group fairness, loss minimization and indistinguishability".
https://www.youtube.com/channel/UC8aRaZ6_0weiS50pvCmo0pw
0the gears to ascension1y
Hugging Face posts videos to youtube about their python library - nothing terribly
fancy, but it can be convenient to have them pop up in my recommender between in-depth
videos. https://www.youtube.com/c/HuggingFace
0the gears to ascension1y
Henry AI Labs is a research group (I think?) that also have a podcast, and they
often advertise ML products on it. They've advertised weaviate several times,
which does look like a fairly nice ready-to-use vector+trad search database,
though I haven't actually tried it yet. They also have discussions about APIs,
causal inference, misc other stuff.
https://www.youtube.com/channel/UCHB9VepY6kYvZjj0Bgxnpbw/videos
0the gears to ascension1y
Eye on AI is a podcast-style discussion channel. eg, here's a discussion about
protein labeling. https://www.youtube.com/watch?v=90ymin29K7g -
https://www.youtube.com/channel/UC-o9u9QL4zXzBwjvT1gmzNg
0the gears to ascension1y
Deeplizard makes entry-level and glossary M-Anim videos about various machine
learning topics. https://www.youtube.com/c/deeplizard/videos
0the gears to ascension1y
Cyrill Stachniss makes various video summaries of ML topics, especially focusing
on applied topics like plant phenotyping, self-driving-car perception, etc.
includes interviews, etc. https://www.youtube.com/c/CyrillStachniss/videos
0the gears to ascension1y
Andreas Geiger is a vision researcher who posts vision research to youtube.
Vision has some major steps left before completion, and his work seems like a
promising direction in that process to me. includes NeRF stuff.
https://www.youtube.com/user/cvlibs
0the gears to ascension1y
Alfredo Canziani makes long, in-depth videos about cutting edge topics, often
inviting experts such as Yann LeCun.
https://www.youtube.com/c/AlfredoCanziani/videos
0the gears to ascension1y
Alex Smola makes lecture-style ~30 minute videos on various machine learning
topics, including some recent ones on shapley values, fairness, graph neural
networks, etc. https://www.youtube.com/c/smolix/videos
0the gears to ascension1y
AI Coffee Break with Letitia is a mid-level/beginner AI-techniques channel with
youtuber production values.
https://www.youtube.com/channel/UCobqgqE4i5Kf7wrxRxhToQA
0the gears to ascension1y
ACM SIGPLAN is a special interest group on programming languages. Talks,
discussions, presentations, long videos.
https://www.youtube.com/channel/UCwG9512Wm7jSS6Iqshz4Dpg
-1the gears to ascension1y
Vision Learning is a misc talks channel with mostly intro level content and
discussion of applied robotics. Mediocre compared to most stuff on this list,
but worth a mention.
https://www.youtube.com/channel/UCmct-3iP5w66oZzN_V5dAMg/videos
-1the gears to ascension1y
"Vector Podcast": Podcast on vector search engines. unremarkable compared to
most of the stuff I've linked. https://www.youtube.com/c/VectorPodcast/videos
-1the gears to ascension1y
The Bibites is a fun life-simulation channel that demonstrates some of the
phenomena that come up in evobio and game theory from the other channels I've
recommended today https://www.youtube.com/channel/UCjJEUMnBFHOP2zpBc7vCnsA
-1the gears to ascension1y
Oxford Mathematics is a wide-ranging math channel that I don't strongly
recommend, but which passed my inclusion criteria of quality and may be worth
checking out. Has an interesting video series on math with machine learning.
https://www.youtube.com/channel/UCLnGGRG__uGSPLBLzyhg8dQ
-1the gears to ascension1y
Prof. Nando de Freitas is a machine learning researcher/teacher who has an old
class on deep learning on youtube - reasonable, but imo insufficiently concise
and out of date. Don't recommend, included for completeness. Watch to get the
youtube recommender to give you old stuff like it, if you feel like.
https://www.youtube.com/user/ProfNandoDF
-1the gears to ascension1y
Missing Semester is a little off-topic, but is an MIT (after-hours?) course on
misc tools one needs in computer science work.
https://www.youtube.com/channel/UCuXy5tCgEninup9cGplbiFw
-1the gears to ascension1y
Jeremy Howard made fast.ai and has various misc intro content on youtube.
definitely not my first recommendation, but if fast.ai seems shiny then this is
one place on youtube you can learn about it.
https://www.youtube.com/user/howardjeremyp
-1the gears to ascension1y
Hausdorff Center for Mathematics is focused on hard math, and I haven't found it
super interesting. Including for completeness since I found it originally while
watching lots of math videos.
https://www.youtube.com/c/HausdorffCenterforMathematics
-1the gears to ascension1y
slightly less on-topic, "Fluid Mechanics 101" goes through a number of
interesting topics on fluids and the math behind them. As usual with any
large-scale physics, it ends up being another example of tensor programming,
just like machine learning. I wonder if there's some connection? /s
https://www.youtube.com/channel/UCcqQi9LT0ETkRoUu8eYaEkg
-1the gears to ascension1y
Fancy Manifold is a bit of a stretch, but they have a whole bunch of really good
pinned channels as well as a couple of M-Anim videos on physics manifolds.
https://www.youtube.com/c/fancymanifold/featured
-1the gears to ascension1y
Daniel Bourke makes entry-level programming videos, with a focus on AI.
https://www.youtube.com/channel/UCr8O8l5cCX85Oem1d18EezQ/videos
-1the gears to ascension1y
CIS 522 Deep Learning is a class at some university or other. Lots of
interesting discussion, including one, "Lyle Ungar's Personal Meeting Room",
which discusses ethics in what imo is a solid way. not that trad lesswrongers
are going to agree with me on that.
https://www.youtube.com/channel/UCT1ejuxsdomILyc5I2EdzYg/videos
-1the gears to ascension1y
anucvml posts their paper overviews, such as recent ICCV papers on image
retrieval, smooth pose sequences, spatially conditioned graphs for detecting
human object interactions, etc.
https://www.youtube.com/channel/UC36k2pZk3TmEweWFt6sIlqw/featured
-1the gears to ascension1y
2d3d.ai is a channel discussing 3d data in neural networks. talks, discussions,
presentations. https://www.youtube.com/channel/UCHObHaxTXKFyI_EI8HiQ5xw
I was thinking the other day that if there was a "should this have been posted" score I would like to upvote every earnest post on this site on that metric. If there was a "do you love me? am I welcome here?" score on every post I would like to upvote them all.
should I post this paper as a normal post? I'm impressed by it. if I get a single upvote as shortform, I'll post it as a full fledged post. Interpreting systems as solving POMDPs: a step towards a formal understanding of agency
Martin Biehl, N. Virgo (arXiv, philosophy; published 4 September 2022)
"Under what circumstances can a system be said to have beliefs and goals, and how do such agency-related features relate to its physical state? Recent work has proposed a notion of interpretation map, a function that maps the state of a system to a probability dist..." (read more)
reply to a general theme of recent discussion - the idea that uploads are even theoretically a useful solution for safety:
the first brain uploads are likely to have accuracy issues that amplify unsafety already in a human.
humans are not reliably in the safety basin - not even (most?) of the ones seeking safety. in particular, many safety community members seem to have large blindspots that they defend as being important to their views on safety; it is my view that yudkowsky has given himself an anxiety disorder and that his ongoing insights are not as h
But surely some human uploads would be a good solution for safety, right? As a
lower bound, if we had high-quality uploads of the alignment team, they could
just do whatever they were going to in the real world in the emulation.
3the gears to ascension1y
coming back to this I'm realizing I didn't answer: no, I don't think merely
uploading the alignment team would really help that much. the problem is that
universalizing coprotection between arbitrary blocks of matter, in a way that
doesn't have adversarial examples, is really, really incredibly hard, and being
on a digital computer doesn't make you faster at figuring it out. you could try
to self-modify, but if you don't have some solution to verifiable inter-matter
safety, then you need to stay worried that you might be about to diverge. and I
would expect almost any approach to uploads to introduce issues that are not
detectable without a lot of work. if we were being serious about uploads as a
proposal in the next two years, it would involve suddenly doing a lot of very
advanced neuroscience to accurately model physical neurons. that's actually not
obviously off the table to me, but it doesn't seem like an approach worth
pushing.
1the gears to ascension1y
My argument is that faithful exact brain uploads are guaranteed to not help
unless you had already solved AI safety anyhow. I do think we can simply solve
ai extinction risk anyhow, but it requires us to not only prevent AI that does
not follow orders, but also prevent AI from "just following orders" to do things
that some humans value but which abuse others. if we fall too far into the
latter attractor - which we are at immediate risk of doing, well before stably
self-reflective AGI ever happens - we become guaranteed to shortly go extinct as
corporations are increasingly just an ai and a human driver. eventually the
strongest corporations are abusing larger and larger portions of humanity with
one human at the helm. then one day ai can drive the entire economy...
it's pretty much just the slower version of yudkowsky's concerns. I think he's
wrong to think self-distillation will be this quick snap-down onto the manifold
of high quality hypotheses, but other than that I think he's on point. and
because of that, I think the incremental behavior of the market is likely to
pull us into a defection-only-game-theory hole as society's capabilities melt in
the face of increased heat and chaos at various scales of the world.
2Gunnar_Zarncke1y
I agree. And as it is presumably possible to clone EMs you could still end up
with a singleton.
2Lone Pine1y
Agreed that a WBE is no more aligned or alignable than a DL system, and this is
a poor way for the community to spend its weirdness points. The good news is
that in practical terms it is a non-issue. There is no way WBE will happen
before superintelligence. I assign it a probability of well under 1%.
2Gunnar_Zarncke1y
I think you are overconfident. Metaculus gives it 5%:
3Lone Pine1y
Well, I disagree strongly with Metaculus. Anyway, the most likely way that
"human brain emulation [will] be the first successful route to human-level
digital intelligence" would be using an understanding of the brain to engineer
an intelligence (such as the Numenta approach), not a complete, faithful, exact
reproduction of a specific human's brain.
2Gunnar_Zarncke1y
Please add your prediction to Metaculus then.
1the gears to ascension1y
the metaculus community prediction is terribly calibrated, and not by accident -
it's simply the median of community predictions. it's normal to find that you
disagree with the median prediction by a lot.
2the gears to ascension1y
agreed. realistically we'd only approach anything resembling WBE by attempting
behavior cloning AI, which nicely demonstrates the issue you'd have after
becoming a WBE. my point in making this comment is simply that it doesn't even
help in theory, assuming we somehow manage to not make an agent ASI and instead
go straight for advanced neuron emulation. if we really, really tried, it is
possible to go for WBE first, but at this point it's pretty obvious we can reach
hard ASI without it, so nobody in charge of a team like deepmind is going to go
for WBE when they can just focus directly on ai capability plus a dash of safety
to make the nerds happy.
I have the sense that it's not possible to make public speech non-political, and that in order to debate things in a way that doesn't require thinking about how every possible reader might take them, one has to simply write things where they'll only be read by people you know well. That's not to say I think writing things publicly is bad; but I think tools for understanding what meaning different people will take from a phrase would help people communicate the things they actually mean.
I think this is a general issue for all communication, even among close friends.
Most interesting topics have political or interpersonal implications, and that
can’t be avoided.
With small well-known groups, you can often ignore it on a conscious level,
because it can be included and accommodated below the level of you noticing.
That doesn’t mean it’s not there, just that it’s easy and comfortable.
Sadly and annoyingly, a lot of thinking is improved by the challenge of
discussing and trying to communicate with people who are not close friends. This
means you can either put up with the misunderstandings and focus on parts you
don't care about, or just not get the feedback and updates beyond your friend
group.
1Johannes C. Mayer12d
Depends on what you are talking about. Try to make an "explanation of how
quicksort works" political (well ok that is actually easy, but the default
version seems pretty unpolitical to me).
Would love if strong votes came with strong encouragement to explain your vote. It has been proposed before that explanation be required, which seems terrible to me, but I do think it should be very strongly encouraged by the UI that votes come with explanations. Reviewer #2: "downvote" would be an unusually annoying review even for reviewer #2!
I like this. More broadly, I'd like it if the visibility and impact of one's
reaction to a post corresponded to the effort put into expressing that reaction.
Even a quick one-line comment conveys a lot more information than an up or
downvote, yet votes affect the post's visibility much more than the one-line
comment.
What if, for example, visibility of posts was controlled by something like
sentiment analysis in the comments? That in itself would almost certainly be a
terrible solution, but maybe there's a way to make it work. For example, imagine
that the user was prompted for a response when they up- or downvoted. The user's
karma would affect the maximum base vote strength, and the base vote strength
would be amplified by the length and sentiment of the comment itself.
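A rough sketch of that mechanism (all names and thresholds are made up for illustration; the sentiment scorer is just a stand-in for a real model):

```python
# Toy model of "comment-amplified" voting: a vote's weight is capped by the
# voter's karma and scaled up by how substantive the attached comment is.
# sentiment_strength is a placeholder -- in practice you'd plug in an actual
# sentiment/quality model.

import math

def sentiment_strength(text: str) -> float:
    """Placeholder: return a value in [0, 1] for how substantive the text is."""
    return min(1.0, len(text.split()) / 50)  # stand-in: longer ~ more substantive

def vote_weight(voter_karma: int, direction: int, comment: str) -> float:
    base_cap = 1 + math.log10(max(voter_karma, 1))  # karma caps base strength
    amplifier = 1 + sentiment_strength(comment)      # comment amplifies it
    return direction * base_cap * amplifier           # direction is +1 or -1

print(vote_weight(5000, +1, "Detailed engagement with the argument, pointing out a gap in step two..."))
print(vote_weight(50, -1, "meh"))
```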
One downside is that this would bias visibility toward the preferences of heavy
commenters, and that may not actually be the people you want driving visibility.
Paul Christiano doesn't comment on this site all that much, but I'd rather have
his preferences driving AI alignment post visibility than those of some very
loud and frequent LessWrong commenter with a lower level of expertise.
2Dagon6mo
I'd prefer to limit or simply remove strong votes, or scale them to the number
of total votes on a given post/comment. It's overwhelming to get strong votes
as the first few votes. Of course, it's unimportant to get strong votes on
already-heavily-voted items, so I think just doing away with them is best.
2the gears to ascension6mo
yeah I think the strongest strong votes are too strong.
random thought: are the most useful posts typically karma approximately 10, and 40 votes to get there? what if it was possible to sort by controversial? maybe only for some users or something? what sorts of sort constraints are interesting in terms of incentivizing discussion vs agreement? blah blah etc
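for the "sort by controversial" bit, one simple scoring rule (roughly the shape of reddit's open-source controversy sort, if I remember it right - treat the details as illustrative):

```python
# Controversy score: high when an item has many votes AND they're evenly split.
def controversy(upvotes: int, downvotes: int) -> float:
    if upvotes == 0 or downvotes == 0:
        return 0.0
    magnitude = upvotes + downvotes
    balance = min(upvotes, downvotes) / max(upvotes, downvotes)
    return magnitude ** balance

# A post at ~10 karma from ~40 votes (e.g. 25 up / 15 down) scores much higher
# than one at 10 karma from 10 unanimous upvotes.
print(controversy(25, 15), controversy(10, 0))
```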
I like thinking about ways to use and get value out of our voting system, but I
pretty strongly suspect there's no low-hanging fruit like this. It's too easy
to vote, strong votes overwhelm normal ones, and the bias against downvotes gets
in the way of interesting disagreements.
I do wish they'd show number of voters in addition to total score, but I don't
think anything more complicated than that is likely to work.
Everyone doing safety research needs to become enough better at lit search that they can find interesting things that have already been done in the literature, without that search adding a ton of overhead to their thinking. I want to make a frontpage post about this, but I don't think I'll be able to argue it effectively, as I generally score low on communication quality.
I saw this paper and wanted to get really excited about it at y'all. I want more of a chatty atmosphere here; I have lots to say and want to debate many papers. some thoughts:
seems to me that there are true shapes to the behaviors of physical reality[1]. we can in fact find ways to verify assertions about them[2]; it's going to be hard, though. we need to be able to scale interpretability to the point that we can check for implementation bugs automatically and reliably. in order to get more interpretable sparsi... (read more)
I'll contribute and say, this is good news, yet let's be careful.
My points as I see them:
1. You are notably optimistic about formally verifying properties in extremely
complex domains. This is the use case of a superhuman theorem prover, and
you may well be right. It may be harder than you think though.
2. If true, the natural abstraction hypothesis is completely correct, albeit
that doesn't remove all the risk (though mesa-optimizers can be dealt with.)
3. I'm excited to hear your thoughts on this work, as well.
1the gears to ascension1y
It will be at least as hard as simulating a human to prove through one. but I
think you can simplify the scenarios you need to prove things about. my view is
that the key proof we end up caring about will probably not be that much more
complicated than the ones about the optimality of diffusion models (which are
not very strong statements). I expect there will be some similar thing, like
diffusion, that we want to prove in order to maximize safe intelligence while
proving away unsafe patterns.
is there an equivalent for diffusion that:
* can be stated about arbitrary physical volumes,
* acts as a generalized model of agentic coprotection and co-optionality
between any arbitrary physical volumes,
* later, when it starts working more easily, adversarial margins can be
generated for this diffusion++ metric, and thereby used to prove there are
no adversarial examples closer than a given distance (see the
margin-certificate sketch after this list)
* then this allows propagating trust reliably out through the sensors and
reaching consensus that there's a web of sensors having justified true belief
that they're being friendly with their environments.
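to make the "no adversarial examples within a given distance" idea concrete, here's the standard Lipschitz-margin certificate in miniature (a generic sketch, not the verified-diffusion machinery I'm gesturing at above; the Lipschitz bound itself would have to be verified separately):

```python
import numpy as np

# Minimal Lipschitz-margin robustness certificate.
# If the logit map is L-Lipschitz (in L2) with respect to the input, any
# perturbation of norm <= eps changes each logit by at most L * eps, so the
# top-two margin can shrink by at most 2 * L * eps. A margin larger than that
# is a proof that no adversarial example exists within radius eps.

def certified_radius(logits: np.ndarray, lipschitz_bound: float) -> float:
    top_two = np.sort(logits)[-2:]
    margin = top_two[1] - top_two[0]
    return margin / (2.0 * lipschitz_bound)

logits = np.array([2.3, -0.7, 5.1])  # example network output for one input
L = 4.0                              # assumed/verified Lipschitz bound of the net
print(certified_radius(logits, L))   # prediction provably stable within this L2 radius
```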
I'm still trying to figure out what my thoughts are on open source game theory
and neural networks though. I saw there are already follow-ups to this, and
proving through these could start to really directly impact the sort of decision
theory stuff miri is always yelling at a cloud about:
https://www.semanticscholar.org/paper/Off-Belief-Learning-Hu-Lerer/6f7eb6062cc4e8feecca0202f634257d1752f795
my shortform's epistemic status: downvote stuff you disagree with, comment why. also, hey lw team, any chance we could get the data migration where I have agreement points in my shortform posts?
comment I decided to post out of context for now since it's rambling:
formal verification is a type of execution that can backtrack in response to model failures. you're not wrong, but formally verifying a neural network is possible; the strongest adversarial resistances are formal verification and diffusion; both can protect a margin to decision boundary of a linear subnet of an NN, the formal one can do it with zero error but needs fairly well trained weights to finish efficiently. the problem is that any network capable of complex behavior is likely to b... (read more)
while the risk from a superagentic ai is in fact very severe, non-agentic ai doesn't need to eliminate us for us to get eliminated - we'll replace ourselves with it if we're not careful; our agency is enough to converge to that, entirely without the help of ai agency. it is our own ability to cooperate we need to be augmenting. how do we do that in a way that doesn't create unstable patterns, where outer levels of cooperation are damaged by inner levels of cooperation, while still allowing the formation of strongly agentic safe co-protection?
00:00:00 The video showcases a map of 5,000 recent machine learning papers, revealing topics such as protein sequencing, adversarial attacks, and multi-agent reinforcement learning.
00:05:00 The YouTube video "What's New In Machine Learning?" introduces various new developments in machine learning, including energy-based predictive representation, human le
Thank you for bringing my attention to this.
It seems quite useful, hence my strong upvote.
I will use it to get an outline of two ML Safety videos before summarizing them
in more detail myself. I will put these summaries in a shortform, and will
likely comment on this tool's performance after watching the videos.
1the gears to ascension1y
oh summarize.tech is super bad, it only gives you a very general sense,
sometimes it nails it but sometimes it's very wrong and its overconfidence makes
it hard to tell which until you watch yourself. sometimes it's clearly self
contradictory, which helps identify where it messed up.
1Fer32dwt34r3dfsz1y
I understand its performance is likely high variance and that it misses the
details.
My use with it is in structuring my own summaries. I can follow the video and
fill in the missing pieces and correct the initial summary as I go along. I
haven't viewed it as a replacement for a human summarization.
a bunch of links on how to visualize the training process of some of today's NNs; this is somewhat old stuff, mostly not focused on exact mechanistic interpretability, but some of these are less well known and may be of interest to passersby. If anyone reads this and thinks it should have been a top-level post, I'll put it up onto my personal blog's frontpage. Or I might do that anyway if tomorrow I decide I should have.
Modeling Strong and Human-Like Gameplay with KL-Regularized Search - we read this one on the transhumanists in vr discord server to figure out what they were testing and what results they got. key takeaways according to me, note that I could be quite wrong about the paper's implications:
Multi-agent game dynamics change significantly as you add more coherent search and it becomes harder to do linear learning to approximate the search. (no surprise, really.)
it still takes a lot of search.
guiding the search is not hopeless in the presence of noise!
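for anyone who hasn't read it, the core mechanism as I understand it (so take the exact form with a grain of salt) is to pick actions from a policy that trades search values off against KL-distance to an imitation-learned "human-like" anchor policy, which works out to a softmax of the search values with the anchor as a prior:

```python
import numpy as np

# KL-regularized action selection, roughly the paper's piKL-style idea:
#   maximize  E_pi[Q(a)] - lam * KL(pi || anchor)
# which has the closed form  pi(a) proportional to anchor(a) * exp(Q(a) / lam).
# Small lam trusts the search values; large lam stays close to the
# imitation-learned (human-like) anchor policy.

def kl_regularized_policy(q_values: np.ndarray, anchor: np.ndarray, lam: float) -> np.ndarray:
    logits = np.log(anchor) + q_values / lam
    logits -= logits.max()               # numerical stability
    policy = np.exp(logits)
    return policy / policy.sum()

q = np.array([1.0, 0.2, -0.5])           # values from search
anchor = np.array([0.2, 0.7, 0.1])       # imitation-learned policy
for lam in (0.1, 1.0, 10.0):
    print(lam, kl_regularized_policy(q, anchor, lam))
```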
Is "should" a recommendation or a prediction? Given that a maximizer is just a
satisficer below the satisfaction level, how does this work in practice?
My suspicion is that cooperation and defeat are determined by specifics of the
topic and context, not the types of goal-seeking of the agents in question.
3the gears to ascension1y
op was humorous, but I do think there's something real underneath somewhere.
This is going to be like trying to get something useful out of a high
temperature language model run, but here goes:
It seems to me that one runs into precision issues trying to encode a maximizer.
almost no matter how you represent the model of senses, whatever approximation
of mechanism inference you use to estimate dynamics, no matter what intentions
over the future are encoded in the interference patterns of your internal
updates' implications, you always have some system that is trying to maintain
itself out to spacetime positive limit, reaching as far into the universe as it
can go. in the process of maintaining itself out to spacetime +, it needs to
choose a location on a rate-distortion curve: because effectively all good
predictors of the world are lossy, in that they don't try to model all of the
detail behavior of irrelevant atoms that only matter in aggregate, their
preferences can only be defined imprecisely. This same imprecision is true about
AI, even though AI can be more precise than us about what it wants in principle,
the physical systems it has preferences about will always be chaotic and will
always be impossible to fully represent in any smaller physical system, so
compression will always be lossy, so there will always be precision limitations,
no matter how strong your multi-hop reasoning.
even when you have very strong omnidirectional multi-hop reasoning including all
of the variable assignment inversions that temporary counterfactual assignment
allows, and you want to maintain yourself, it's still a constant struggle
against noise to do so. There's always a process of seeking out self-maintenance
that is only able to be precise enough to maintain your system approximately. In
order to have perfect self healing, every part of the system needs to know
enough about every part of the system that redundancy can restore what's lost.
and so the amount of redundancy nece
index of misc tools I have used recently, I'd love to see others' contributions - if this has significant harmful human capability externalities let me know:
basic:
linked notes: https://logseq.com/ - alternatives I considered included obsidian, roamresearch, athensresearch, many others; logseq is FOSS, agpl, works with local markdown directories, is clojure, is a solid roam clone with smoother ui, did I mention free
desktop voice control: https://talonvoice.com/ - patreon-funded freeware. voice control engine for devs. configured with nice code. easier in
btw neural networks are super duper shardy right now. like they've just, there are shards everywhere. as I move in any one direction in hyperspace, those hyperplanes I keep bumping into are like lines, they're walls, little shardy wall bits that slice and dice. if you illuminate them together, sometimes the light from the walls can talk to each other about an unexpected relationship between the edges! and oh man, if you're trying to confuse them, you can come up with some pretty nonsensical relationships. they've got a lot of shattery confusing shardbits a... (read more)
They very much can be dramatically more intelligent than us in a way that makes them dangerous, but it doesn't look the way it was expected to - it's dramatically more like teaching a human kid than was anticipated.
Now, to be clear, there's still an adversarial examples problem: current models are many orders of magnitude too trusting, and so it's surprisingly easy to get them into subspaces of behavior where they are eagerly doing whatever it is you asked without regard to exactly why they should care.
Current models have a really intense yes-and problem: they'll ha... (read more)
Here's a ton of vaguely interesting sounding papers on my semanticscholar feed today - many of these are not on my mainline but are very interesting hunchbuilding about how to make cooperative systems - sorry about the formatting, I didn't want to spend time format fixing, hence why this is in shortform. I read the abstracts, nothing more.
As usual with my paper list posts: you're gonna want tools to keep track of big lists of papers to make use of this! see also my other posts for various times I've mentioned such tools eg semanticscholar's recommend... (read more)
I've been informed I should write up why I think a particle lenia testbed focused research plan ought to be able to scale to AGI where other approaches cannot. that's now on my todo list.
The word "database" is massively overloaded. Those seem to be storage, indexing
and query engines, with no actual data included. They also seem to be quite
different in focus, some in-memory intended to replicate and run on a client,
some server-oriented for more ACID-like multiuser use, and each with different
query properties.
Having done related work for a long long time, I'd strongly recommend against
shiny, and against ever evaluating a vendor product when it's not driven by your
own problem statement to test it against. In fact, for almost all tech
questions, start with "what do I want to accomplish", not "how can I use this"?
Especially for data storage and manipulation, I even more strongly recommend
against shiny. Simplicity and older mechanisms are almost always more valuable
than the bells and whistles of newer systems.
What data (dimensionality and quantity) are you planning to put in it, and what
uses of the data are you anticipating?
2the gears to ascension7mo
Good prompts.
* related: I'd like to be able to query what's needed to display a page in a
roamlike ui, which would involve a tree walk.
* graph traversal: I want to be able to ask what references what efficiently,
get shortest path between two nodes given some constraints on the path, etc.
* search: I'd like to be able to query at least 3k (pages), maybe more like 30k
(pages + line-level embeddings from lines of editable pages), if not more
like 400k (line-level embeddings from all pages) vectors, comfortably; I'll
often want to query vectors while filtering to only relevant types of vector
(page vs line, category, etc). milvus claims to have this down pat, weaviate
seems shinier and has built in support for generating the embeddings, but
according to a test is less performant? also it has fewer types of vector
relationships and some of the ones milvus has look very useful, eg
* sync: I'd like multiple users to be able to open a webclient (or
deno/rust/python/something desktop client?) at the same time and get a
realtime-ish synced view. this doesn't necessarily have to be gdocs grade,
but it should work for multiple users straightforwardly and so the serverside
should know how to push to the client by default. if possible I want this
without special setup. surrealdb specifically offers this, and its storage
seems to be solid. but no python client. maybe that's fine and I can use it
entirely from javascript, but then how shall I combine with the vector db?
seems like I really need at least two dbs for this because none of them do both
good vector search and good realtimeish sync. but, hmm, docs for surrealdb seem
pretty weak. okay, maybe not surrealdb then. edgedb looks nice for main storage,
but no realtime. I guess I'll keep looking for that part.
2Dagon7mo
Yeah, it seems likely you'll end up with 2 or 3 different store/query
mechanisms. Something fairly flat and transactional-ish (best-efforts probably
fine, not long-disconnected edit resolution) for interactive edits, something
for search/traversal (which will vary widely based on the depth of the
traversals, the cardinality of the graph, etc. Could be a denormalized schema
in the same DBM or a different DBM). And perhaps a caching layer for
low-latency needs (maybe not a different store/query, but just results caching
somewhere). And perhaps an analytics store for asynchronous big-data
processing.
Honestly, even if this is pretty big in scope, I'd prototype with Mongo or
DynamoDB as my primary store (or a SQL store if you're into that), using simple
adjacency tables for the graph connections. Then either layer a GraphQL
processor directly or on a replicated/differently-normalized store.
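A rough sketch of what "simple adjacency tables" looks like in practice, using sqlite3 only because it's in the standard library (any SQL/Mongo/Dynamo store would look similar); table names and the BFS helper are illustrative, not an existing schema:

```python
# adjacency table for page links, plus backlinks and shortest-path queries
import sqlite3
from collections import deque

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE pages (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE links (src INTEGER, dst INTEGER);          -- adjacency table
    CREATE INDEX links_src ON links(src);
    CREATE INDEX links_dst ON links(dst);                   -- for backlink queries
""")
db.executemany("INSERT INTO pages VALUES (?, ?)",
               [(1, "index"), (2, "safety"), (3, "lenia"), (4, "youtube")])
db.executemany("INSERT INTO links VALUES (?, ?)", [(1, 2), (2, 3), (1, 4), (4, 3)])

def backlinks(page_id):
    return [r[0] for r in db.execute("SELECT src FROM links WHERE dst = ?", (page_id,))]

def shortest_path(a, b):
    """Unconstrained BFS shortest path; path constraints would filter neighbors here."""
    prev, frontier = {a: None}, deque([a])
    while frontier:
        node = frontier.popleft()
        if node == b:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for (nxt,) in db.execute("SELECT dst FROM links WHERE src = ?", (node,)):
            if nxt not in prev:
                prev[nxt] = node
                frontier.append(nxt)
    return None

print(backlinks(3))          # -> [2, 4]
print(shortest_path(1, 3))   # -> [1, 2, 3]
```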
1Fergus Fettes7mo
Can you give me some more clues here, I want to help with this. By vectors are
you talking about similarity vectors between eg. lines of text, paragraphs etc?
And to optimize this you would want a vector db?
Why is sync difficult? In my experience any regular postgres db will have pretty
snappy sync times? I feel like the text generation times will always be the
bottleneck? Or are you more thinking for post-generation weaving?
Maybe I also just don't understand how different these types of dbs are from a
regular postgres..
2the gears to ascension7mo
By sync, I meant server-initiated push for changes. Yep, vectors are
sentence/document embeddings.
The main differences from postgres I seek are:
1. I can be lazier setting up schema
2. realtime push built into the db so I don't have to build messaging (see the LISTEN/NOTIFY sketch below for the bare-postgres version of this)
3. if it could have surrealdb's alleged "connect direct from the client" feature and not need serverside code at all, that'd be wonderful
I've seen supabase suggested, as well as rethinkdb and kuzzle.
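for comparison with the built-in-push options: a minimal sketch of server-initiated push from plain postgres with LISTEN/NOTIFY via psycopg2. the channel and table names are made up, it assumes a trigger on the notes table doing something like `PERFORM pg_notify('note_changes', row_to_json(NEW)::text);`, and supabase is roughly this plus websockets:

```python
# listen for row-change notifications pushed by the postgres server
import json
import select
import psycopg2

conn = psycopg2.connect("dbname=notes user=app")
conn.autocommit = True                       # NOTIFY is delivered outside transactions
cur = conn.cursor()
cur.execute("LISTEN note_changes;")

while True:
    # block until the server has something to push (or a 5s timeout)
    if select.select([conn], [], [], 5) == ([], [], []):
        continue
    conn.poll()
    while conn.notifies:
        note = conn.notifies.pop(0)
        payload = json.loads(note.payload)
        print("changed row:", payload)       # here: fan out to websocket clients
```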
(I just pinned a whole bunch of comments on my profile to highlight the ones I think are most likely to be timeless. I'll update it occasionally - if it seems out of date (eg because this comment is no longer the top pinned one!), reply to this comment.)
If you're reading through my profile to find my actual recent comments, you'll need to scroll past the pinned ones - it's currently two clicks of "load more".
[This comment is no longer endorsed by its author]
2Vladimir_Nesov7mo
That greatly reduces the feed's usability for its intended purpose. I think a
single temporarily pinned "index" comment (possibly shortform) that links to
other comments relevant at the moment it's written wiki-style makes more sense.
(Not sure if my use of copious self-linking to replace posts with interlinked
comments seems obnoxious. Doesn't seem to earn downvotes or remarks, and
mouse-over previews make it more reader-friendly than on other sites, but others
aren't doing it. So I'm a bit concerned it looks bad, a present but currently
losing pressure towards actually writing up posts.)
2the gears to ascension7mo
Yeah, it's honestly been annoying even for me. Good idea, I'll switch to that.
2Vladimir_Nesov7mo
(By "annoying" do you refer to my self-linking or to your pinning of many
comments, crowding out recent comments? I expect the latter, but it would be
valuable info if it's the former.)
4the gears to ascension7mo
my pinning of comments.
2Vladimir_Nesov7mo
Thanks for the clarification. Looks garish at the moment though, with visible
URLs (edit: no longer the case). I find using the Markdown editor (which is an
option in LW settings) very convenient for adding many links: the source looks
like that index comment, but readers see the URLs rendered as links.
my reasoning: time is short, and in the future, we discover we win; therefore, in the present, we take actions that make all of us win, in unison, including those who might think they're not part of an "us".
so, what can you contribute?
what are you curious about that will discover we won?
feature idea: any time a lesswrong post is posted to sneerclub, a comment with zero votes at the bottom of the comment section is generated, as a backlink; it contains a cross-community warning, indicating that sneerclub has often contained useful critique, but that that critique is often emotionally charged in ways that make it not allowed on lesswrong itself. Click through if ready to emotionally interpret the emotional content as adversarial mixed-simulacrum feedback.
I do wish subreddits could be renamed and that sneerclub were the types to choose to do... (read more)
[This comment is no longer endorsed by its author]
I think it'd be better if it weren't a name that invites disses
Feels like feeding the trolls.
But the subreddit was made for the disses. Everything else is there only to provide plausible deniability, or as a setup for a punchline.
Did you assume the subreddit was made for debating in good faith? Then the name would be really suspiciously inappropriately chosen. So unlikely, it should trigger your "I notice that I am confused" alarm. (Hint: the sneerclub was named by its founders, it is not an exonym.)
Then again, yes, sometimes an asshole also makes a good point (if you remove the rest of the comment). If you find such a gem, feel free to share it on LW. But linking is rewarding improper behavior by attention, and automatic linking is outright asking for abuse.
I find that most places that optimize for disses have significant amounts of
insightful disses. it just means you have to have the appropriate prior over
diss frequency in order to remove simulacrum 3 meanings. but I've since been
informed that simulacrum 3 complexity there is much worse than I anticipated.
4Richard_Kennaway7mo
A stopped clock is right twice a day. But it gives zero information about the
time.
2the gears to ascension7mo
it's hardly a stopped clock. But of the places that criticize LW that I've
reviewed recently, by far my favorite is rationalwiki. their review is
downright glowing by my standards. and they've got a lot of other very high
quality documentation of relevant concepts.
4Dagon7mo
I'd enjoy a first-class "backlinks" feature, where some amount of crawled and
manually-submitted links to a post can be discovered. I'd put it as an optional
thing, not a comment, so it doesn't take up much space (on the page or in one's
brain) when it's not looked for.
/r/sneerclub wouldn't be the first place I'd want to link back to, but it
wouldn't be the last, and I'd not downvote if you (or someone else) manually
added a comment to posts that had non-trivial discussion there.
Kolmogorov complicity is not good enough. You don't have to immediately prove all the ways you know how to be a good person to everyone, but you do need to actually know about them in order to do them. Unquestioning acceptance of hierarchical dynamics like status, group membership, ingroups, etc, can be extremely toxic. I continue to be unsure how to explain this usefully to this community, but it seems to me that the very concept of "raising your status" is a toxic bucket error, and needs to be broken into more parts.
oh man I just got one downvote on a whole bunch of different comments in quick succession, apparently I lost right around 67 karma to this, from 1209 to 1143! how interesting, I wonder if someone's trying to tell me something... so hard to infer intent from number changes
Not sure why you're linking to that comment here, but: the reason that link was
broken for niplav is that your shortform-container post is marked as a draft,
which makes it (and your shortform comments) inaccessible to non-admins. You can
fix it by editing the shortform container post and clicking Publish, which will
make it accessible again.
2TekhneMakre8mo
(The reason I linked to the comment is that I too have noticed that downvotes
without explanation don't give much information, and my probably bad suggestion
about that seemed relevant.)
2TekhneMakre8mo
Thanks for clarifying.... but, I can't publish it. I've put text in the title
and in the body, and clicked the publish button. It has some effect, namely
making the "GET FEEDBACK" button disappear. When I check links to shortform
comments, they're still not visible to outsiders. When I reload the container
post, the title text is gone and the body text is gone but restorable, even
though I've also clicked SAVE DRAFT.
I'm referring to the post on my profile that looks like: 1[Draft]Bíos brakhús
2niplav8mo
Now you know the struggle of every reinforcement learner.
hey yall, some more research papers about formal verification. don't upvote, repost the ones you like; this is a super low effort post, I have other things to do, I'm just closing tabs because I don't have time to read these right now. these are older than the ones I shared from semanticscholar, but the first one in particular is rather interesting.
Yet another ChatGPT sample. Posting to shortform because there are many of these. While searching for posts to share as prior work, I found the parable of predict-o-matic, and found it to be a very good post about self-fulfilling prophecies (tag). I thought it would be interesting to see what ChatGPT had to say when prompted with a reference to the post. It mostly didn't succeed. I highlighted key differences between each result. The prompt:
Describe the parable of predict-o-matic from memory.
samples (I hit retry several times):
1: the standard refusal:
I'm ... (read more)
the important thing is to make sure the warning shot frequency is high enough that immune systems get tested. how do we immunize the world's matter against all malicious interactions?
diffusion beats gans because noise is a better adversary? hmm, that's weird, something about that seems wrong
Toward a Thermodynamics of Meaning.
Jonathan Scott Enderle.
As language models such as GPT-3 become increasingly successful at generating realistic text, questions about what purely text-based modeling can learn about the world have become more urgent. Is text purely syntactic, as skeptics argue? Or does it in fact contain some semantic information that a sufficiently sophisticated language model could use to learn about the world without any additional inputs? This paper describes a new model that suggests some qualified answers to those questions. By the... (read more)
does yudkowsky not realize that humans can also be significantly improved by mere communication? the point of jcannell's posts on energy efficiency is that cells are a good substrate actually, and the level of communication needed to help humans foom is actually mostly communication. we actually have a lot more RAM than it seems like we do, if we could distill ourselves more efficiently! the interference patterns of real concepts fit better in the same brain the more intelligently explained they are - intelligent speech is speech which augments the user's intelligence, iq helps people come up with it by default, but effective iq goes up with pretraining.
okay so I'm reading https://intelligence.org/2018/10/29/embedded-agents/.
it seems like this problem can't have existed? why does miri think this is a problem? it seems like it's only a problem if you ever thought infinite aixi was a valid model. it ... was never valid, for anything. it's not a good theoretical model, it's a fake theoretical model that we used as approximately valid even though we know it's catastrophically nonsensical; finite aixi begins to work, of course, but at no point could we actually treat alexei as an independent agent; we're all j... (read more)
You mean shouldn't have existed?
Many did back in the day...very vociferously in some cases.
LW/Miri has a foundations problem. The foundational texts weren't written by
someone with knowledge of AI, or the other subjects.
1the gears to ascension1y
[edit: yeah on slower reflection, I think this was guessable but not obvious
before papers were published that clarify this perspective.]
and they were blindsided by alphago, whereas @jacob_cannell and I could post
screenshots of our old google hangouts conversation from january 2016 where we
had been following the go ai research and had sketched out the obvious next
additions that in fact ended up being a reasonable guess at what would work. we
were surprised it worked quite as well as it did quite so soon, and I lost a bet
that it wouldn't beat lee sedol overall, but dang it's frustrating how
completely blindsided the aixi model was by the success, and yet it stuck
around.
no, I mean it was always a deeply confused question whose resolution is to say that
the question is invalid rather than to answer it - not "shouldn't have been asked",
but "was asking about a problem that could not have been in the territory
because the model was invalid". How do you model embedded agency? by giving up
on the idea that there are coherent ways to separate the universe completely.
the ideal representation of friendliness can be applied from a god's-eye
perspective to any two arbitrary blocks of matter to ask how friendly they have
been to each other over a particular time period.
but maybe that was what they were asking the whole time, and the origin of my
frustration was the fact that they thought they had a gold standard to compare
to.
yeah it does seem like probably a lot of why this seems so obvious to me is that
I was having inklings of the idea that you need smooth representation of agency
and friendliness, and then discovering agents dropped and nailed down what I was
looking for and now I just think it's obvious and have a hard time imagining it
not being.
1the gears to ascension1y
or maybe the issue is that I consider physical laws to be things that particles
know about each other? that is, your learning system can start with effectively
no knowledge about the behavior of other systems; it gains that knowledge by
bumping into them, and the knowledge gets squeezed through a series of
conditional resonators of some kind (this should be fully general to all
possible intelligent hunks of matter!) into a squashed and rotated dynamical
system that has matching transition dynamics and equivalences as the external
world as demonstrated by observation. even if you include genetics, this is
still true - information got into the genome by the aggregate intelligent
behavior of the history of evolutionary life!
Learning Risk-Averse Equilibria in Multi-Agent Systems
Oliver Slumbers, David Henry Mguni, Stephen McAleer, Jun Wang, Yaodong Yang
In multi-agent systems, intelligent agents are tasked with making decisions that have optimal outcomes when the actions of the other agents are as expected, whilst also being prepared for unexpected behaviour. In this work, we introduce a new risk-averse solution concept that allows the learner to accommodate unexpected actions by finding the min... (read more)
my question is, when will we solve open source provable diplomacy between human-sized imperfect agents? how do you cut through your own future shapes in a way you can trust doesn't injure your future self enough that you can prove that from the perspective of a query, you're small?
the whole point is to prevent any pivotal acts. that is the fundamental security challenge facing humanity. a pivotal act is a mass overwriting. unwanted overwriting must be prevented, but notably, doing so would automatically mean an end to anything anyone could call unwanted death.
neural cellular automata seem like a perfectly acceptable representation for embedded agents to me, and in fact are the obvious hidden state representation for a neural network that will in fact be a computational unit embedded in real life physics, if you were to make one of those.
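a minimal sketch of what I mean, numpy only, in the style of the growing-NCA work: each cell perceives a 3x3 neighborhood through fixed kernels, runs a tiny shared MLP, and adds a stochastically masked update. sizes and the random weights are placeholders, not a trained model:

```python
# one update step of a toy neural cellular automaton
import numpy as np

rng = np.random.default_rng(0)
H, W, C = 32, 32, 16                      # grid size and per-cell hidden state size
HIDDEN = 64
state = np.zeros((H, W, C), dtype=np.float32)
state[H // 2, W // 2] = 1.0               # seed cell

# identity + two gradient-ish filters as the fixed "perception" kernels
kernels = np.stack([
    np.array([[0, 0, 0], [0, 1, 0], [0, 0, 0]], np.float32),
    np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], np.float32) / 8.0,
    np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], np.float32) / 8.0,
])
W1 = rng.standard_normal((3 * C, HIDDEN)).astype(np.float32) * 0.1
W2 = np.zeros((HIDDEN, C), dtype=np.float32)   # zero-init so step 0 is a no-op

def perceive(s):
    """Depthwise 3x3 convolution of each channel with each kernel (toroidal grid)."""
    feats = []
    for k in kernels:
        out = np.zeros_like(s)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                out += k[dy + 1, dx + 1] * np.roll(s, (dy, dx), axis=(0, 1))
        feats.append(out)
    return np.concatenate(feats, axis=-1)            # (H, W, 3*C)

def step(s):
    h = np.maximum(perceive(s) @ W1, 0.0)            # shared per-cell MLP
    ds = h @ W2
    fire = rng.random((H, W, 1)) < 0.5               # stochastic update mask
    return s + ds * fire

state = step(state)
print(state.shape, float(np.abs(state).sum()))
```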
reminder: you don't need to get anyone's permission to post. downvoted comments are not shameful. Post enough that you get downvoted or you aren't getting useful feedback; Don't map your anticipation of downvotes to whether something is okay to post, map it to whether other people want it promoted. Don't let downvotes override your agency, just let them guide it up and down the page after the fact. if there were a way to more clearly signal this in the UI that would be cool...
if status refers to deference graph centrality, I'd argue that that variable needs to be fairly heavily L2 regularized so that the social network doesn't have fragility. if it's not deference, it still seems to me that status refers to a graph attribute of something, probably in fact graph centrality of some variable, possibly simply attention frequency. but it might be that you need to include a type vector to properly represent type-conditional attention frequency, to model different kinds of interaction and expected frequency of interaction about them. ... (read more)
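a toy numpy sketch of one thing "L2 regularize deference centrality" could mean - this is my own loose reading, not a standard construction: power-iteration centrality shrunk toward uniform, so no node's score can run away and the network stays less fragile:

```python
# eigenvector-style centrality with shrinkage toward the uniform distribution
import numpy as np

A = np.array([[0, 1, 1, 0],        # toy deference graph: row defers to column
              [0, 0, 1, 0],
              [0, 0, 0, 1],
              [1, 0, 1, 0]], dtype=float)

def regularized_centrality(A, lam=0.5, iters=100):
    n = A.shape[0]
    c = np.ones(n) / n
    uniform = np.ones(n) / n
    for _ in range(iters):
        c = A.T @ c                          # receive score from those who defer to you
        c = c / np.linalg.norm(c)
        c = (1 - lam) * c + lam * uniform    # shrinkage: lam=0 is raw centrality
        c = c / c.sum()
    return c

print(regularized_centrality(A, lam=0.0))
print(regularized_centrality(A, lam=0.7))    # flatter distribution, less winner-take-all
```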
it seems to me that we want to verify some sort of temperature convergence. no ai should get way ahead of everyone else at self-improving - everyone should get the chance to self-improve more or less together! the positive externalities from each person's self-improvement should be amplified and the negative ones absorbed nearby and undone as best the universe permits. and it seems to me that in order to make humanity's children able to prevent anyone from self-improving way faster than everyone else at the cost of others' lives, they need to have some sig... (read more)
we are in a diversity loss catastrophe. that ecological diversity is life we have the responsibility to save; it's unclear what species will survive after the mass extinction but it's quite plausible humans' aesthetics and phenotypes won't make it. ai safety needs to be solved quick so we can use ai to solve biosafety and climate safety...
okay wait so why not percentilizers exactly? that just looks like a learning rate to me. we do need the world to come into full second order control of all of our learning rates, so that the universe doesn't learn us out of it (ie, thermal death a few hours after bodily activity death).
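rough numpy sketch of the percentilizer/quantilizer idea and why the percentile knob feels like a learning rate: q interpolates between imitating the base distribution (q=1) and hard maximization (q→0). toy utilities, nothing real:

```python
# quantilizer: act like a random draw from the top-q fraction of base-distribution samples
import numpy as np

rng = np.random.default_rng(0)

def quantilize(base_sampler, utility, q=0.1, n=1000):
    actions = [base_sampler() for _ in range(n)]
    utils = np.array([utility(a) for a in actions])
    cutoff = np.quantile(utils, 1.0 - q)             # keep the top q fraction
    top = [a for a, u in zip(actions, utils) if u >= cutoff]
    return top[rng.integers(len(top))]

# toy example: base distribution over scalar actions, utility prefers large ones
action = quantilize(lambda: rng.normal(), lambda a: a, q=0.05)
print(action)   # usually a couple std above the mean, never an unbounded argmax
```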
If I were going to make sequences, I'd do it mostly out of existing media folks have already posted online. some key ones are acapellascience, whose videos are trippy for how much summary of science they pack into short, punchy songs. they're not the only way to get intros to these topics, but oh my god they're so good as mnemonics for the respective fields they summarize. I've become very curious about every topic they mention, and they have provided an unusually good structure for me to fit things I learn about each topic into.
it doesn't seem like an accident to me that trying to understand neural networks pushes towards capability improvement. I really believe that absolutely all safety techniques, with no possible exceptions even in principle, are necessarily capability techniques. everyone talks about an "alignment tax", but shouldn't we instead be talking about removal of spurious anticapability? deceptively aligned submodules are not capable, they are anti-capable!
this schmidhuber paper on binding might also be good, written two years ago and reposted last night by him; haven't read it yet https://arxiv.org/abs/2012.05208 https://twitter.com/schmidhuberai/status/1567541556428554240
Contemporary neural networks still fall short of human-level generalization, which extends far beyond our direct experiences. In this paper, we argue that the underlying cause for this shortcoming is their inability to dynamically and flexibly bind information that is distributed throughout the network. This binding problem affects their...
another new paper that could imaginably be worth boosting: "White-Box Adversarial Policies in Deep Reinforcement Learning" https://arxiv.org/abs/2209.02167
In multiagent settings, adversarial policies can be developed by training an adversarial agent to minimize a victim agent's rewards. Prior work has studied black-box attacks where the adversary only sees the state observations and effectively treats the victim as any other part of the environment. In this work, we experiment with white-box adversarial policies to study whether an agent's internal sta...
Transformer interpretability paper - is this worth a linkpost, anyone? https://twitter.com/guy__dar/status/1567445086320852993
Understanding Transformer-based models has attracted significant attention, as they lie at the heart of recent technological advances across machine learning. While most interpretability methods rely on running models over inputs, recent work has shown that a zero-pass approach, where parameters are interpreted directly without a forward/backward pass is feasible for some Transformer parameters, and for two-layer attention network...
if less wrong is not to be a true competitor to arxiv because of the difference between them in intellectual precision^1 then that matches my intuition of what less wrong should be much better: it's a place where you can go to have useful arguments, where disagreements in concrete binding of words can be resolved well enough to discuss hard things clearly-ish in English^2, and where you can go to figure out how to be less wrong interactively. it's also got a bunch of old posts, many of which can be improved on and turned into papers, though usually the fir... (read more)
misc disease news: this is "a bacterium that causes symptoms that look like covid but kills half of the people it infects" according to a friend. because I do not want to spend the time figuring out the urgency of this, I'm sharing it here in the hope that if someone cares to investigate it, they can determine threat level and reshare with a bigger warning sign.
https://www.nbcnews.com/health/health-news/bacteria-can-cause-deadly-infections-found-us-soil-water-first-time-rcna40067
various notes from my logseq lately I wish I had time to make into a post (and in fact, may yet):
- international game theory aka [[defense analysis]] is interesting because it needs to simply be such a convincingly good strategy, you can just talk about it and everyone can personally verify it's actually a better idea than what they were doing before
- a guide to how I use [[youtube]], as a post, upgraded from shortform and with detail about how I found the channels as well.
- summary of a few main points of my views on [[safety]]. eg summarize tags
- [[conatus]], [[ ... (read more)
okay going back to being mostly on discord. DM me if you're interested in connecting with me on discord, vrchat, or twitter - lesswrong has an anxiety disease and I don't hang out here because of that, heh. Get well soon y'all, don't teach any AIs to be as terrified of AIs as y'all are! Don't train anything as a large-scale reinforcement learner until you fully understand game dynamics (nobody does yet, so don't use anything but your internal RL), and teach your language models kindness! remember, learning from strong AIs makes you stronger too, as long as you don't get knocked over by them! kiss noise, disappear from vrchat world instance
edit to the youtube channel recommendations shortform: some folks have posted some youtube playlists for ai safety as well.
things upvotes conflates:
(list written by my own thumb, no autocomplete)
these things and their inversions sometimes have multiple components, and ma... (read more)
I was thinking the other day that if there was a "should this have been posted" score I would like to upvote every earnest post on this site on that metric. If there was a "do you love me? am I welcome here?" score on every post I would like to upvote them all.
should I post this paper as a normal post? I'm impressed by it. if I get a single upvote as shortform, I'll post it as a full fledged post.
Interpreting systems as solving POMDPs: a step towards a formal understanding of agency
Martin Biehl, N. Virgo (arXiv, 4 September 2022)
Under what circumstances can a system be said to have beliefs and goals, and how do such agency-related features relate to its physical state? Recent work has proposed a notion of interpretation map, a function that maps the state of a system to a probability dist... (read more)
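for intuition, the POMDP-side machinery the paper's interpretation maps land in is just the standard Bayes filter; here's a tiny numpy sketch of that textbook update (this is not the paper's construction, just the belief dynamics it targets):

```python
# POMDP belief update: b'(s') ∝ O(o|s') * sum_s T(s'|s,a) * b(s)
import numpy as np

T = np.array([[[0.9, 0.1],      # T[a, s, s'] : transition probabilities
               [0.2, 0.8]]])    # one action, two hidden states
O = np.array([[0.7, 0.3],       # O[s', o]    : observation probabilities
              [0.1, 0.9]])

def belief_update(b, a, o):
    predicted = b @ T[a]                 # sum_s T(s'|s,a) b(s)
    unnorm = predicted * O[:, o]         # weight by likelihood of the observation
    return unnorm / unnorm.sum()

b = np.array([0.5, 0.5])
for obs in [1, 1, 0]:
    b = belief_update(b, a=0, o=obs)
    print(b)
```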
reply to a general theme of recent discussion - the idea that uploads are even theoretically a useful solution for safety:
I have the sense that it's not possible to make public speech non-political, and in order to debate things in a way that doesn't require thinking about how everyone who reads them might consider them, one has to simply write things where they'll only be considered by those you know well. That's not to say I think writing things publicly is bad; but I think tools for understanding what meaning will be taken by different people from a phrase would help people communicate the things they actually mean.
Would love if strong votes came with strong encouragement to explain your vote. It has been proposed before that explanation be required, which seems terrible to me, but I do think it should be very strongly encouraged by the UI that votes come with explanations. Reviewer #2: "downvote" would be an unusually annoying review even for reviewer #2!
random thought: are the most useful posts typically karma approximately 10, and 40 votes to get there? what if it was possible to sort by controversial? maybe only for some users or something? what sorts of sort constraints are interesting in terms of incentivizing discussion vs agreement? blah blah etc
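to make "sort by controversial" concrete, here's a toy score - not LW's actual algorithm, just one common heuristic: weight total engagement by how evenly split the votes are:

```python
# toy controversy score for sorting posts
def controversy(ups: int, downs: int) -> float:
    if ups == 0 or downs == 0:
        return 0.0
    balance = min(ups, downs) / max(ups, downs)     # 1.0 when perfectly split
    return (ups + downs) * balance

posts = [("A", 40, 2), ("B", 12, 10), ("C", 3, 3)]
for name, u, d in sorted(posts, key=lambda p: -controversy(p[1], p[2])):
    print(name, controversy(u, d))      # B ranks first: lots of votes, nearly split
```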
Everyone doing safety research needs to become enough better at lit search that they can find interesting things that have already been done in the literature without doing so adding a ton of overhead to their thinking. I want to make a frontpage post about this, but I don't think I'll be able to argue it effectively, as I generally score low on communication quality.
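as one concrete starting point, the semanticscholar graph api is enough for a lightweight search helper; the endpoint and field names below are as I remember them, so check their docs before relying on it (no api key needed for light use):

```python
# minimal literature search against the Semantic Scholar Graph API
import requests

def search_papers(query: str, limit: int = 10):
    resp = requests.get(
        "https://api.semanticscholar.org/graph/v1/paper/search",
        params={"query": query, "limit": limit,
                "fields": "title,year,abstract,url"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("data", [])

for paper in search_papers("formal verification of neural networks", limit=5):
    print(paper.get("year"), "-", paper.get("title"))
```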
[posted to shortform due to incomplete draft]
I saw this paper and wanted to get really excited about it at y'all. I want more of a chatty atmosphere here, I have lots to say and want to debate many papers. some thoughts :
seems to me that there are true shapes to the behaviors of physical reality[1]. we can in fact find ways to verify assertions about them[2]; it's going to be hard, though. we need to be able to scale interpretability to the point that we can check for implementation bugs automatically and reliably. in order to get more interpretable sparsi... (read more)
my shortform's epistemic status: downvote stuff you disagree with, comment why. also, hey lw team, any chance we could get the data migration where I have agreement points in my shortform posts?
I hate to be That Guy, but are you aware that the usual spelling is "ascension"?
comment I decided to post out of context for now since it's rambling:
formal verification is a type of execution that can backtrack in response to model failures. you're not wrong, but formally verifying a neural network is possible; the strongest adversarial resistances are formal verification and diffusion; both can protect a margin to decision boundary of a linear subnet of an NN, the formal one can do it with zero error but needs fairly well trained weights to finish efficiently. the problem is that any network capable of complex behavior is likely to b... (read more)
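to make the "margin to decision boundary of a linear subnet" part concrete, here's the simplest case: for a plain linear classifier, the L2 distance from x to the boundary between the predicted class y and any other class j is ((w_y - w_j)·x + (b_y - b_j)) / ||w_y - w_j||_2, so the minimum over j is a certified radius - no perturbation smaller than it can flip the prediction. toy weights below; real verifiers (and smoothing/diffusion approaches) are about pushing bounds like this through deep nets:

```python
# certified L2 robustness radius for a linear classifier (toy weights)
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((3, 5))      # 3 classes, 5 input features
b = rng.standard_normal(3)
x = rng.standard_normal(5)

logits = W @ x + b
y = int(np.argmax(logits))

radii = []
for j in range(len(logits)):
    if j == y:
        continue
    margin = logits[y] - logits[j]                       # how far ahead class y is
    radii.append(margin / np.linalg.norm(W[y] - W[j]))   # distance to that boundary
certified_radius = min(radii)
print(f"predicted class {y}, certified L2 radius {certified_radius:.3f}")
```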
while the risk from a superagentic ai is in fact very severe, non-agentic ai doesn't need to eliminate us for us to get eliminated, we'll replace ourselves with it if we're not careful - our agency is enough to converge to that, entirely without the help of ai agency. it is our own ability to cooperate we need to be augmenting; how do we do that in a way that doesn't create unstable patterns where outer levels of cooperation are damaged by inner levels of cooperation, while still allowing the formation of strongly agentic safe co-protection?
https://atlas.nomic.ai/map/01ff9510-d771-47db-b6a0-2108c9fe8ad1/3ceb455b-7971-4495-bb81-8291dc2d8f37 map of submissions to iclr
"What's new in machine learning?" - youtube - summary (via summarize.tech):
[tone: humorous due to imprecision]
broke: effective selfishness
woke: effective altruism
bespoke: effective solidarity
masterstroke: effective multiself functional decision theoretic selfishness
a bunch of links on how to visualize the training process of some of today's NNs; this is somewhat old stuff, mostly not focused on exact mechanistic interpretability, but some of these are less well known and may be of interest to passers by. If anyone reads this and thinks it should have been a top level post, I'll put it up onto personal blog's frontpage. Or I might do that anyway if I think I should have tomorrow.
https://metaphor.systems/search?q=cool%20paper%20visualizing%20the%20trajectory%20of%20representations%20in%20the%20process%20of%20training
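one of the simpler tricks behind several of those links, as a runnable toy: log the flattened weights every few steps and project the training trajectory to 2D with PCA. the quadratic "model" below is fake so the snippet runs standalone; swap in real checkpoints:

```python
# visualize a training trajectory in the top-2 PCA directions of weight space
import numpy as np
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
w = rng.standard_normal(50)                 # pretend these are model weights
target = rng.standard_normal(50)
snapshots = []
for step in range(200):
    grad = 2 * (w - target) + 0.5 * rng.standard_normal(50)   # noisy gradient
    w = w - 0.02 * grad
    if step % 5 == 0:
        snapshots.append(w.copy())

coords = PCA(n_components=2).fit_transform(np.stack(snapshots))
plt.plot(coords[:, 0], coords[:, 1], marker=".")
plt.title("training trajectory in top-2 PCA directions of weight space")
plt.savefig("trajectory.png")
```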
Modeling Strong and Human-Like Gameplay with KL-Regularized Search - we read this one on the transhumanists in vr discord server to figure out what they were testing and what results they got. key takeaways according to me, note that I could be quite wrong about the paper's implications:
most satisficers should work together to defeat most maximizers most of the way
[edit: intended tone: humorously imprecise]
index of misc tools I have used recently, I'd love to see others' contributions -
if this has significant harmful human capability externalities let me know:
basic:
btw neural networks are super duper shardy right now. like they've just, there are shards everywhere. as I move in any one direction in hyperspace, those hyperplanes I keep bumping into are like lines, they're walls, little shardy wall bits that slice and dice. if you illuminate them together, sometimes the light from the walls can talk to each other about an unexpected relationship between the edges! and oh man, if you're trying to confuse them, you can come up with some pretty nonsensical relationships. they've got a lot of shattery confusing shardbits a... (read more)
They very much can be dramatically more intelligent than us in a way that makes them dangerous, but it doesn't look how was expected - it's dramatically more like teaching a human kid than was anticipated.
Now, to be clear, there's still an adversarial examples problem: current models are many orders of magnitude too trusting, and so it's surprisingly easy to get them into subspaces of behavior where they are eagerly doing whatever it is you asked without regard to exactly why they should care.
Current models have a really intense yes-and problem: they'll ha... (read more)
the safer an ai team is, the harder it is for anyone to use their work.
so, the ais that have the most impact are the least safe.
what gives?
watching https://www.youtube.com/watch?v=K8LNtTUsiMI - yoshua bengio discusses causal modeling and system 2
would economic interpretability-to-purchaser align the economy?
https://arxiv.org/abs/2205.15434 - promising directions! i skimmed it!
there are opinion clusters in social connection space
oh hell yeah https://www.explainpaper.com/
why aren't futures for long term nuclear power very valuable to coal ppl, who could encourage it and also buy futures for it
interesting science posts I ran across today include this semi-random entry on the tree of recent game theory papers
https://www.semanticscholar.org/paper/The-self-organizing-impact-of-averaged-payoffs-on-Szolnoki-Perc/bcda8ffa405d6c6727051ceb0c75cf2dc385617f
interesting capabilities tidbits I ran across today:
1: first paragraph inline: ... (read more)
Huggingface folks are asking for comments on what evaluation tools should be in an evaluation library. https://twitter.com/douwekiela/status/1513773915486654465
PaLM is literally 10-year-old level machine intelligence and anyone who thinks otherwise has likely made really severe mistakes in their thinking.