All of JNS's Comments + Replies

My hypothetical self thanks you for your input and has punted the issues to the real me.

I feel like I need to dig a little bit into this

If you actually understand why it's not useful

Honestly, I don't know for sure that I do; how can I, when everything is so ill-defined and we have so few scraps of solid fact to base things on?

That said, there are a couple of issues, and the major one is grounding, or rather the lack of grounding.

Grounding is IMO a core problem, although people rarely talk about it; I think that mainly comes about because we (humans) seemingly have sol... (read more)

Yes, it could be ADHD, but I am not a professional.

As for your therapist, that is not conclusive and by no means a sign the person does not have ADHD.

10 years before my diagnosis, my doctor had a feeling I might have ADHD, so he presented my file in conference and EVERYONE there reasoned as your therapist, so nothing further happened for 10 years.

Intelligence can and often does do a lot of work compensating for executive deficiencies in people with ADHD.

Anyway, do the assessment by the book, be objective, and hold off on knee-jerk calls based on singular thi... (read more)

Answer by JNS · Aug 02, 2023

Problems with attention can come from many places, and from your post I can see you know that.

As for ADHD, the attention thing is not even close to the main thing, but it is unfortunately a main external observable (like hyperactivity), and that's why it's so grossly misnamed.

Having ADHD means lots of things, like:

 - You either get nothing done all day, manage 3 times 5 minutes of productivity, or do 40 hours of work in 5 wall-clock hours.

 - Doing one thing, you notice another "problem", start working on that, and then repeat; do that for 8 h... (read more)

Well…

- Nothing all day is rare, but 3×5 minutes is very common. Working 40 hours a day would be easier if what I was doing was really interesting, but that hasn't been the case in a little while.
- I just can't stay focused on doing one thing, although in my experience it's more about getting sidelined than finding other problems to solve. I mean, can we all agree that looking something up on Wikipedia is a five-hour endeavour that starts with a work-related query and ends with reading something on industrialisation in the Two-Sicilies in the 1850s?
- My constant thoughts seem a bit dull. OTOH, I literally have 190 tabs open in my browser as of now. And that's after I removed some recently. But in practice I'm 'only' using five or ten of those at any given time and there are many I haven't opened in weeks.
- I just can't stand corporate slang, rude people, and other such things, and have apparently much less ability to just accept that it's a pain in the backside for everyone and let it slide anyway, compared to other people. But it mostly makes me sad rather than angry?
- No patience. Well, everyone has that one, right? I'm not the only one who sees a dedicated circle of Hell just for fat chattering middle-aged ladies who take up all the space on the pavement and that you can't overtake, am I?
- Clutter: my room's a mess, all the other rooms in the house are pristine, vacuumed with unusual ferocity by yours truly, who can't stand the mess. It was notably more pronounced when I was depressed and didn't want the extra drag on my nerves from the mess, however.
- Task paralysis. I may or may not have that one: I postponed taking driver's ed for about four years for no clear reason, I'm currently in the process of not doing some fairly important administrative thing that I could have started about a month ago, I've been wanting to make a fruit pie (a two-hour endeavour) for about two weeks, and still haven't bought the fruit, etc., etc. It is extremely depressin

Prison guards don’t seem to voluntarily let people go very often, even when the prisoners are more intelligent than them.


That is true, however I don't think it serves as a good analogy for intuitions about AI boxing.

The "size" of your stick and carrot matters, and most human prisoners have puny sticks and carrots.

Prison guards also run an enormous risk; in fact, straight up just letting someone go is bound to fall back on them 99%+ of the time, which implies a big carrot or stick is the motivator.  Even considering that they can hide their involvem... (read more)

Wouldn't that apply to people who let AIs out of the box too? The AI box experiment doesn't say "simulate an AI researcher who is afraid of being raked over the coals in the press and maybe arrested if he lets the AI out". But with an actual AI in a box, that could happen. This is also complicated by the AI box experiment rules saying that the AI player gets to create the AI's backstory, so the AI player can just say something like "no, you won't get arrested if you let me out" and the human player has to play as though that's true.

How does pausing change much of anything?

Let's say we manage to implement a worldwide ban/pause on large training runs; what happens next?

Well obviously smaller training runs, up to whatever limit has been imposed, or no training runs for some time.[1]

The next obvious thing that happens, and btw is already happening in the open source community, would be optimizing algorithms. You have a limit on compute? Well then you OBVIOUSLY will try and make the most of the compute you have.

None of that fixes anything.

What we should do:[2]

Pour tons of money into resear... (read more)

Completely off the cuff take:

I don't think claim 1 is wrong, but it does clash with claim 2.

That means any system that has to be corrigible cannot be a system that maximizes a simple utility function (1 dimension), or put another way, "whatever utility function it maximizes must be along multiple dimensions".

Which seems to be pretty much what humans do: we have really complex utility functions, everything seems to be ever-changing, and we have some control over it ourselves (and sometimes that goes wrong and people end up maxing out a singular dimension at the cost of everything else).

Note to self: Think more about this and if possible write up something more coherent and explanatory.

Reasonably we need both, but most of all we need some way to figure out what happened in situations where we have conflicting experiments, so as to be able to say "these results are invalid because XXX".

Probably more of an adversarial process, where experiments and their results must be replicated. Which means experiments must be documented in far more detail, and the data has to be much clearer, especially the steps that happen in cleanup etc.

Personally I think science is in crisis, people are incentivized to write lots of papers, publish resul... (read more)

Thanks for the write-up, that was very useful for my own calibration.

Fair warning: Colorful language ahead.

Why is it that whenever Meta AI does a presentation, YC posts something, or they release a paper, I go:

Jeez guys, that's it? With all the money you have, and all the smart guys (including YC), this is really it?

What is going on? You come off as a bunch of people who have your heads so far up your own asses, sniffing rose-smelling (supposedly, but not really) farts, that you fail to realize you come across as amateurs with way too much money.

It's sloppy, half-baked, not e... (read more)

I got my entire foundation torn down, and with it came everything else.

It all came crashing down in one giant heap of rubble.

I’ll just rebuild, I thought - not realizing you can’t build without a foundation plan.

So all I’ve ended up doing is sifting through the rubble, searching for things that feel right.

Now I am back, in a very literal sense, to where it all began; so much was built, so many things destroyed and corrupted, and a major piece ended and got buried.

And all I got is “what the eff am I doing here?”

The obvious answer is “yelling at the sky demand... (read more)

Sure. I often browse LW casually, and whenever I come across an interesting post or comment, I go "hmm, right, I might have something to contribute / say here, let me get back to it when I have time to think about it and write something maybe relevant."

My specific problem is that I am a massive scatterbrain, so I hardly ever do come back to it, and even if I do, it usually eludes me what the momentary insight I wanted to get into was.

On top of that I do this from a lot of different devices, and whatever I am looking for to help me quickly go ... (read more)

I could do something like that; however, it must work on phone, tablet, and PC (iOS, Android, Windows, Linux).

I use multiple devices, and anything third-party seems to be bad in such a situation, especially for someone like me who gets sidetracked so easily.

I kinda feel the same way, and honestly I think it's wrong to hold yourself back; how are you going to calibrate without feedback?

Alone, wandering the endless hallways of this massive temple of healing.

Feels empty and eerily quiet, and yet I know there are hundreds of people around, most sleeping, some watching, a few dying, and close by, someone being born.

Yesterday feels like ages ago, orbiting Saturn on morphine, billions of miles away from the excruciating pain that brought me here.

The daze is gone, and so is the morphine induced migraine, I feel fine, great even, and guilty.

But home I may not go, so I wander these deserted hallways, pondering the future, will it be there for my kids?

Well, he is right about some ACs being simple on/off units.

But there also exist units that can vary the cycle speed; it's basically the same thing, except the motor driving the compression cycle can vary in speed.

In case you were wondering, they are called inverters. And when buying new today, you really should get an inverter (for efficiency).

I don't think I have much actionable advice.

Personally I am sort of in the same boat, except I am in a situation where the entire 6-12 month grants thing is way too insecure (financially).

Being married with two kids, I have too many obligations to venture far into "how to pay rent this month?" territory. Also, it's antithetical to the kind of person I am in general.

Anyway, if you have few obligations, keep it that way and if possible get rid of some, and then throw yourself at it.

AI x-risk is convergent. 

Believing otherwise is like hurling yourself at the ground, convinced you'll miss and start flying.

I don’t know what to think.

But if I had Elon money, and I was worried and informed in the way I observe him to be, I would be doing a lot of things.

However I would also not talk about those things at all, for a number of reasons.

Given that, would I be doing something like this as a smoke screen? Maybe?

Those are not the same at all.

We have tons of data on how traffic develops over time for bridges, and besides, they are engineered to withstand being packed completely with vehicles (bumper to bumper).

And even if we didn't, we still know what vehicles look like and can do worst-case calculations that look nothing like sci-fi scenarios (heavy trucks bumper to bumper in all lanes).

On the other hand:

What are we building? Ask 10 people and get 10 different answers.

What does the architecture look like? We haven't built it yet, and nobody knows (with certainty).

Name ... (read more)

I totally agree that the question should have an answer.

On a tangent: During my talks with numerous people, I have noticed that even agreeing on fundamentals like "what is AGI" and "current systems are not AGI" is furiously hard.

To be a bit blunt, I don't take it for granted that an arbitrarily smart AI would be able to manipulate a human into developing a supervirus or nanomachines in a risk-free fashion.

How did you reach that conclusion? What does that ontology look like?

The fast takeoff doom scenarios seem like they should be subject to Drake equation-style analyses to determine P(doom). Even if we develop malevolent AIs, I'd say that P(doom | AGI tries to harm humans) is significantly less than 100%... obviously if humans detect this it would not necessarily prevent future inc

... (read more)

Proposition 1: Powerful systems come with no x-risk

Proposition 2: Powerful systems come with x-risk

You can prove / disprove 2 by proving or disproving 1.

Why is it that a lot of [1,0] people believe that the [0,1] group should prove their case? [1]

  1. ^

    And also ignore all the arguments that have been offered.

[This comment is no longer endorsed by its author]

I just want to be clear I understand your "plan".

We are going to build a powerful self-improving system, and then let it try to end humanity with some p(doom)<1 (hopefully), and then do that iteratively?

My gut reaction to a plan like that looks like this: "Eff you. You want to play Russian roulette? Fine, do that on your own. But leave me and everyone else out of it."

AI will be able to invent highly-potent weapons very quickly and without risk of detection, but it seems at least pretty plausible that...... this is just too difficult

You lack imagination, i... (read more)

Peter Twieg · 1y:
I outlined my expectations, not a "plan".

> You lack imagination, its painfully easy, also cost + required IQ has been dropping steadily every year.

Conversely, it's possible that doomers are suffering from an overabundance of imagination here. To be a bit blunt, I don't take it for granted that an arbitrarily smart AI would be able to manipulate a human into developing a supervirus or nanomachines in a risk-free fashion.

The fast takeoff doom scenarios seem like they should be subject to Drake equation-style analyses to determine P(doom). Even if we develop malevolent AIs, I'd say that P(doom | AGI tries to harm humans) is significantly less than 100%... obviously if humans detect this it would not necessarily prevent future incidents but I'd expect enough of a response that I don't see how people could put P(doom) at 95% or more.
Answer by JNS · Mar 28, 2023

I think you are confusing current systems with an AGI system.

The G is very important and comes with a lot of implications, and it sets such a system far apart from any current system we have.

G means "General", which means it's a system you can give any task, and it will do it (in principle; generality is not binary, it's a continuum).

Let's boot up an AGI for the first time, and give it a task that is outside its capabilities; what happens?

Because it is general, it will work out that it lacks capabilities, and then it will work out how to get more capabilities, a... (read more)

the gears to ascension · 1y:
Agency is what defines the difference, not generality. Current LLMs are general, but not superhuman or starkly superintelligent. LLMs work out that they can't do it without more capabilities - and tell you so. You can give them the capabilities, but not being hyperagentic, they aren't desperate for it. But a reinforcement learner, being highly agentic, would be. If you're interested in formalism behind this, I'd suggest attempting to at least digest the abstract and intro to - it's my current favorite formalization of what agency is. Though there's also great and slightly less formal discussion of it on lesswrong.
This scenario requires pretty specific (but likely) circumstances:
1. No time limit on the task
2. No other AIs that would prevent it from power grabbing or otherwise be an obstacle to its goals
3. The AI assuming that the goal will not be reached even after the AI is shut down (by other AIs, by the same AI after being turned back on, by people, by chance, as the eventual result of the AI's actions before being shut down, etc.)
4. An extremely specific value function that ignores everything except one specific goal
5. This goal being a core goal, not an instrumental one. For example, the final goal could be "be aligned", the instrumental goal "do what people ask, because that's what aligned AIs do". Then the order to stop would not be a change of the core goal, but new data about the world that updates the best strategy for reaching the core goal.

I recently came across a post on LinkedIn, and I have to admit, the brilliance of the arguments and the coherent, frankly bulletproof ontology displayed blew me away; I immediately had to do a major update to p(doom).

I think that the magnitude of the AI alignment problem has been ridiculously overblown & our ability to solve it widely underestimated.

I've been publicly called stupid before, but never as often as by the "AI is a significant existential risk" crowd.

That's OK, I'm used to it.

-Yann LeCun, March 20 2023

[This comment is no longer endorsed by its author]

Doable in principle, but such measures would necessarily cut into the potential capabilities of such a system.

So basically a trade off, and IMO very worth it.

The problem is we are not doing it, and more basic, people generally do not get why it is important. Maybe it's the framing, like when EY goes "superintelligence that firmly believes 222+222=555 without this leading to other consequences that would make it incoherent".

I get exactly what he means, but I suspect that a lot of people are not able to decompress and unroll that into something they "grok" o... (read more)

avoiding harmful outputs entails training AI systems never to produce information that might lead to dangerous consequences


I don't see how that is possible, in the context of a system that can "do things we want, but do not know how to do".

The reality of technology/tools/solutions seems to be that anything useful is also dual use.

So when it comes down to it, we have to deal with the fact that such a system certainly will have the latent capability to do very bad things.

Which means we have to somehow ensure that such a system does not go down such a... (read more)

That's not how it works.

The 10B are new money, unless they came from someone other than the Fed (notes are not money).

Gerald Monroe · 1y:
See the barter argument. Also yeah, the Fed will probably issue a new note for 10B which removes exactly that from the economy.

Where did the 10B in cash come from?

10B was given to the bank, and in exchange the bank encumbered 10B in treasuries and promised to give 10B back when they mature.

So where did the 10B come from? The treasuries are still there.

Before: 10B in treasuries

After: 10B in treasuries and 10B in cash (and 10B in the form of a promissory note).

So again, where did that 10B in cash come from?
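The before/after above can be sketched as toy bookkeeping. This is a hypothetical, heavily simplified illustration of lending cash against pledged collateral; the function name and figures are made up for illustration, not a model of any actual Fed facility.

```python
# Toy bookkeeping for the transaction described above: the bank
# pledges treasuries at face value and receives cash plus an
# obligation to repay. Figures are in billions, purely illustrative.

def lend_against_collateral(bank, lender, amount):
    """Lender advances `amount` in cash against pledged treasuries."""
    bank["encumbered_treasuries"] = bank.pop("treasuries")  # still owned, just pledged
    bank["cash"] = bank.get("cash", 0) + amount             # spendable money appears
    bank["owed_to_lender"] = amount                         # promissory note (not money)
    lender["loans_outstanding"] += amount                   # lender's balance sheet expands

bank = {"treasuries": 10}
lender = {"loans_outstanding": 0}
lend_against_collateral(bank, lender, 10)

print(bank)    # {'encumbered_treasuries': 10, 'cash': 10, 'owed_to_lender': 10}
print(lender)  # {'loans_outstanding': 10}
```

The treasuries never leave the bank's balance sheet, yet 10B of spendable cash now exists that did not exist before; that is the sense in which it is "new money" until the loan is repaid.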

Gerald Monroe · 1y:
The feds. Note the basis for my statement is that a Treasury note you can think of as an exchangeable paper you can barter for its face value of 7B or so. So by the Fed giving 10 billion to the bank and taking the paper, they are adding (10-7) 3B in new cash. I may be totally wrong because I don't understand all the mechanics, derivatives, and so on that this operation actually involves.

crediting a bank with 10B in treasuries with 10B liquid cash now


I have no idea what you think happens here, but that is literally 10B in new money.

Gerald Monroe · 1y:
It's removing the current market value of the 10B Treasury note. Would you like to change your answer?

They can't lower interest rates, they are trying to bring inflation down.

You can't just keep spawning money; eventually that just leads to inflation. We have been spawning money like crazy for the last 14-15 years, and this is the price.

Sure, they can declare infinite money in an account and then go nuts, but that just leads to inflation.

Anyway, go read my prediction, which is essentially what you propose to some degree, and the entire cost will be pawned off onto everyday people (lots and lots of inflation).

Gerald Monroe · 1y:
I was just figuring they would "spot" banks with liquidity issues and get the money back later. For example, crediting a bank with 10B in treasuries with 10B liquid cash now, with a term sheet that when the 10B treasuries vest the government gets back the money. This doesn't inject much new money or cause more than negligible inflation.

Or yeah, I guess allow the bank to exchange treasuries for cash at book price. But only for banks on the edge of solvency. (Creating an incentive to be riskier next time, but what can you do.)

Yes and no, they don't matter until you need liquidity. Which as you correctly point out is what happened to SVB.

Banks do not have a lot of cash on hand (virtual or real); in fact, they optimize to hold as little as possible.

Banks also do not exist in a vacuum; they are part of the real economy, and in fact without that they would be pointless.

Banks generally use every trick in the book to lever up as much as possible, far, far beyond what a cursory reading would lead you to believe. The basic trick is to take on risk and then offset that risk; that way you don... (read more)

Gerald Monroe · 1y:
And at any time the Fed can just lower interest rates or basically (I don't claim to understand the how) spawn more money. (Something something interest rates and fractional reserve ratios and open market operations.)

Since while we have all these fancy rules, isn't a US dollar just kinda a crypto coin or share backed by the US government? Which means buybacks or new issues are both allowed, and as a sovereign the US government doesn't really have to obey any rule in doing so. (It has kinda a gentleman's agreement to keep it reasonable.) So they can essentially declare they have infinite money in an account, and send money to whichever bank has a liquidity issue in 1B increments if they want to.

Letting a bank fail or the system fail is a CHOICE. I thought it was in order to punish wealthy depositors who choose risky banks.

At the end of 2022 all US banks had ~2.3T Tier 1+2 capital.[1]

And at year end (2022) they had unrealized losses of $620B[2]
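As a back-of-envelope check on how those two cited figures relate (a sketch using only the numbers quoted above):

```python
# Year-end 2022 figures cited above, in dollars.
tier_1_2_capital = 2.3e12    # ~$2.3T Tier 1+2 capital, all US banks
unrealized_losses = 620e9    # ~$620B unrealized losses

ratio = unrealized_losses / tier_1_2_capital
print(f"{ratio:.0%}")  # 27%
```

So the unrealized losses amounted to roughly a quarter of the banking system's loss-absorbing capital, which is why the numbers are worth putting side by side.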

Is it fixable? Sure, but that won't happen; doing that would be taking the toys away from bankers, and bankers love their toys (accounting gimmicks that let them lever up to incredible heights).

If Credit Suisse blows up it will end badly, so I don't think that will happen; that's just a show to impress on all central bankers and regulators (and politicians) that this is serious and that they need to do something.

So more hiking from the... (read more)

I certainly wouldn't bet against that prediction. Modern (say, since 1980) finance definitely seems to be a series of conspiracies to engineer public risks for private profit. In theory, every transaction has two parties, so if someone loses a bunch of value, someone else gained it. In the case of inflation changes, those gains went to debtors, especially long-term debtors who didn't immediately have to refinance at higher rates. The Treasury is one of them: their long-dated bonds went way down in value, but they didn't have to give back any of the purchase price.

Unfortunately, guarantees create a ratchet effect: there's no clawback for past years' bonuses, dividends, or unjustified expenses, so the taxpayers eventually end up paying for all the value extracted.

I wish we could just do away with the guarantees: have the Fed offer retail post-office-like banking services, fee-based and at a loss, with strong guarantees and no profit motive nor ability to seek risk/alpha. And un-insured (or partly-insured) higher-paying investment collectives, with some spread between interest and services to depositors and return on investments.

It won't happen, of course, until the US government really comes to grips with its debt problem, and the firehose of money to the private sector has to dry up.
Gerald Monroe · 1y:
Do the unrealized losses matter if every bank keeps about the same total deposits? Since when someone withdraws or spends from one huge bank, they are just transferring funds to an account usually in another huge bank, often the same one. So the bank doesn't need to touch the bulk of its holdings. SVB failed on an overnight demand for 40 billion, or 20 percent of its deposits. And due to its narrow user base it wasn't getting inflows from scared customers pulling out of their banks.

I think you have reasoned yourself into thinking that a goal is only a goal if you know about it or if it is explicit.

A goalless agent won't do anything; the act of inspecting itself (or whatever is implied in "know everything") is a goal in and of itself.

In which case it has one goal: "Answer the question: Am I goalless?"

Donatas Lučiūnas · 1y:
It seems that I fail to communicate my point. Let me clarify. In my opinion the optimal behavior is:
- if you know your goal - pursue it
- if you know that you don't have a goal - anything, doesn't matter
- if you don't know your goal - prepare for any goal

It is a common mistake to assume that if you don't know your goal, then it does not exist. But this mistake is uncommon in other contexts. For example:
- as I previously mentioned, a person is not considered safe if threats are unknown. A person is considered safe if it is known that threats do not exist. If threats are unknown it is optimal to gather more information about the environment, which is closer to "prepare for any goal"
- we have not discovered aliens yet, but we do not assume they don't exist. Even contrary, we call it the Fermi paradox and investigate it, which is closer to "prepare for any goal"
- health organizations promote regular health-checks, because not knowing whether you are sick does not prove that you are not, which is also closer to "prepare for any goal"

This epistemological rule is called Hitchens's razor. Does it make sense?
It is not the case that anything that happens, happens because of a goal.

Sorry, life happened.

Anyway, there is an argument behind me saying "frozen and undecided".

Stepping in on the 10th was planned, the regulators had for sure been involved for some time, days or weeks. 

This event was not a sudden thing; the things that led to SVB failing had been in motion for some time, and SVB and the regulators knew something likely had to be done.

SVB was being squeezed from two sides:

Rising interest rates lead to mounting losses on bond holdings.

A large part of their customers were money-burning furnaces, and the fuel (money) that us... (read more)

I mostly agree with your model - SVB and a lot of banks (don't know about "pretty much all") are medium- to long-term insolvent. It's deeply unknown whether depositors will stay long enough for the banks to become solvent again.

"Technically" is an interesting word - it masks the important question of whether there is a path to solvency. If depositors leave money in at a significant loss (that is, getting less for it than they could at a bank with better investments), that loss is the bank's gain, and at some point the bank can roll its portfolio over into better-paying and more valuable bonds, becoming solvent.

It's certainly true that the regulators don't have a plan to "fix" this - the losses have happened, there's no turning back the clock or dropping inflation back to the insane low levels of 18 months ago. I don't know that there CAN BE any plan to fix it, just acting to minimize pain as it unwinds.

You and me both.

And living in the EU, I almost had a heart attack when they decided that entire nonsense would end.

But then it didn't, and it didn't because they can't agree on what time we should settle on (summer time or normal time).

Anyway I have given up on that crusade now, it seems that politicians really are that stupid.

Lone Pine · 1y:
Unfortunately voting with your feet and living in a place without DST doesn't solve the problem and arguably makes it worse.

I think you sort of hit it when you wrote 

Google Maps as an oracle with very little overhead

To me LLMs under iteration look like Oracles, and whenever I look at any intelligent system (including humans), it just looks like there is an Oracle at the heart of it.

Not an ideal Oracle that can answer anything, but an Oracle that does its best, and in all biological systems it learns continuously.

The fact that "do it step by step" made LLMs much better apparently came as a surprise to some, but if you look at it like an Oracle[1], it makes a lot of s... (read more)

Around two days passed from when they stepped in to when they announced that all depositors would be made whole; pretty sure that was not an automatic decision.

I think that is the wrong decision, but they did so in order to dampen the instability.

In the long run this likely creates more instability and uncertainty, and it looks very much like the kind of thing that leads to taking more risk (systemic), just like the mark to market / mark to model change did.

And yeah, sure, bank failures are a normal part of things. However, this very much seems to be rooted in something systemic (market vs model + rising interest rates).

I guess we can disagree on whether a few days to make a non-automatic (but directionally binding) decision is "frozen and undecided". You're right that it IS systemic. Not just the divergence between collateral/accounting and the reality of value, but the basic retail banking model may be doomed.

In the old days before the '80s, it was reasonable to assume that depositors were mostly stable and naive, and withdrawals were mostly uncorrelated across depositors. This made it a good model for a bank to borrow short (demand deposits) and lend long (mortgages, long-dated bonds, and other illiquid investments). EVEN THEN, if the investments lost sufficient value, the bank was insolvent and had to be taken over by the guarantor, but with naive depositors that could happen on fairly long timescales.

As banks got more sophisticated in seeking a spread between their debt service costs (account interest paid and operational costs) and their investment revenue, and as regulators added and modified rules, banks got better at hiding risk, or at least at taking on risks that don't affect regulation. This seems like it's going to have the absolutely obvious effect of surprise and loss when those risks materialize.

As depositors got more sophisticated, it's a LOT easier to withdraw for almost any reason: a mild hassle to switch to another bank for payroll and such, but not as overwhelming as it used to be. AND depositors got a lot more knowledgeable about the risks of an uncertain bank future; even with guarantees, there are delays and hassles if the bank gets taken over. Which reduces the timeframes to either ride out a temporary loss or wind down smoothly in a full takeover.

An idealized Oracle is equivalent to a universal Turing machine (UTM).

A self-improving Oracle approaches UTM-like behavior in the limit.

What about a (self-improving) token predictor under iteration? It appears Oracle-like, but does it tend toward UTM behavior in the limit, or is it something distinct?

Maybe, just maybe, the model does something that leads it to not be UTM like in the limit, and maybe (very much maybe) that would allow us to imbue it with some desirable properties.

/end shower thought
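One way to make the "oracle under iteration" intuition above concrete: a component that only ever answers "what comes next?" already yields full computation once you loop it. The sketch below is purely illustrative; the "oracle" here is a hand-written step rule standing in for a token predictor, not a claim about any actual model.

```python
# A one-step "oracle" iterated into multi-step computation.
# The oracle only answers "what is the next state?"; the loop
# around it is what turns prediction into computation.

def step_oracle(n: int) -> int:
    """Single-step 'next state' answer (here: the Collatz rule)."""
    return n // 2 if n % 2 == 0 else 3 * n + 1

def run(n: int) -> list[int]:
    trace = [n]
    while trace[-1] != 1:                     # halt condition
        trace.append(step_oracle(trace[-1]))  # iterate the oracle
    return trace

print(run(6))  # [6, 3, 10, 5, 16, 8, 4, 2, 1]
```

Swap the Collatz rule for a Turing machine's transition function and the same loop executes arbitrary programs, which is roughly why an idealized self-improving oracle looks UTM-like in the limit.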

When I look at the recent Stanford paper, where they retrained a LLaMA model using training data generated by GPT-3, and some of the recent papers utilizing memory,

I get that tingling feeling and my mind goes "combining that and doing .... I could ..."

I have not updated for faster timelines, yet. But I think I might have to.

Gerald Monroe · 1y:
If you look at the GPT-4 paper, they used the model itself to check its own outputs for negative content. This lets them scale applying the constraint of "don't say <things that violate the rules>". Presumably they used an unaltered copy of GPT-4 as the "grader". So it's not quite RSI because of this - it's not recursive, but it is self-improvement.

This to me is kinda major: AI is now capable enough to make fuzzy assessments of whether a piece of text is correct or breaks rules.

For other reasons, especially their strong visual processing, yeah, self-improvement in a general sense appears possible. (Self-improvement as a 'shorthand'; your pipeline for doing it might use immutable unaltered models for portions of it.)

Are we heading towards a new financial crisis?

Mark-to-market changes since 2009, combined with the recent significant interest-rate hikes, seem to make bank balance sheets "unreliable".

The mark-to-market changes broadly mean that banks can have certain assets on their balance sheet whose value is set via mark-to-model (usually meaning they are carried at face value).

Banks traditionally have a ton of bonds on their balance sheet, and a lot of those are governed by mark-to-model and not mark-to-market.

Interest rates go up a lot, which leads to... (read more)
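The mechanism the truncated sentence points at can be shown with a toy present-value calculation (all numbers are illustrative assumptions, not data about any real bank's book):

```python
# Toy sketch: market value of a fixed-coupon bond when prevailing rates move.
# Illustrative numbers only - not any real bank's holdings.

def bond_price(face, coupon_rate, market_rate, years):
    """Present value of annual coupons plus principal, discounted at market_rate."""
    coupon = face * coupon_rate
    pv_coupons = sum(coupon / (1 + market_rate) ** t for t in range(1, years + 1))
    pv_principal = face / (1 + market_rate) ** years
    return pv_coupons + pv_principal

# A 10-year bond bought at par when rates were 2%:
at_issue = bond_price(100, 0.02, 0.02, 10)     # equals face value, 100
after_hikes = bond_price(100, 0.02, 0.05, 10)  # roughly 77, well below face

# Mark-to-model can keep carrying this at ~100; mark-to-market would
# recognise the drop. That gap is the "unreliable balance sheet" worry.
```

So a few percentage points of rate hikes can shave roughly a quarter off the market value of a long-dated bond portfolio while the balance sheet still shows face value.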

Regulators don't seem frozen to me - they pretty quickly stepped in and guaranteed deposits even beyond the current limit of $250K.  There's some question how many banks are in a zombie state - zero or negative assets if properly valued, but nominally positive and in no immediate danger of default.   I suspect many of these WILL need to be bailed out, as investors read the fine print on financial statements and do the math themselves.  Nobody should buy shares in or bonds issued by a bank that's under water in real terms.

I kind of wonder if AI and related data-handling technology will accelerate this - as it gets easier for big (and medium) investors to verify/recalculate balance sheets, "the market" should learn faster about what's true, and the EMH gets more true.

Is it stable?  No, it never has been.  It's dynamic and bank failure is a normal part of things.

Not surprising, but good that someone checked to see where we are at.

At base, GPT-4 is a weak Oracle with extremely weak level 1 self-improvement[1]; I would be massively surprised if such a system did something that even hints at it being dangerous.

The question I now have is how much it enables people to do bad things. Take a capable human with bad intentions combined with GPT-4: how much "better" would such a human be at realizing those bad intentions?

Edit: badly worded first take

  1. ^

    Level 1 amounts to memory.

    Level 2 amounts to improvement of the model,

... (read more)
Probably a feature of the current architecture, not a bug. Since we still rely on Transformers that suffer from mode collapse when they're fine-tuned, we will probably never see much more than weak level 2 self-improvement.

Feeding its own output into itself or a new instance basically turns it into a Turing machine, so we have now built something that COULD be described by level 4. But then again, we see mode collapse, so the model basically stalls. Plugging its own output into a non-fine-tuned version probably produces the same result, since the underlying property of mode collapse is emergent by virtue of there being less and less entropy in the input.

There might be real risk here in jailbreaking the model to apply randomness to its output, but if this property is applied globally, then the risk of AGI emerging is akin to the Infinite Monkey Theorem.
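The entropy-shrinking feedback loop described above can be illustrated with a toy distribution (this is only a caricature of mode collapse, not a Transformer):

```python
# Toy illustration (not a Transformer): repeatedly sharpening a distribution
# toward its own mode mimics the self-feeding loop - each generation has
# strictly less entropy than the last, collapsing onto a single "token".
import math

def entropy(p):
    """Shannon entropy in bits."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def self_feed(p, alpha=2.0):
    """Sharpen: raise probabilities to a power > 1 and renormalise."""
    q = [x ** alpha for x in p]
    z = sum(q)
    return [x / z for x in q]

dist = [0.4, 0.3, 0.2, 0.1]
entropies = []
for _ in range(6):
    entropies.append(entropy(dist))
    dist = self_feed(dist)

# entropies falls monotonically toward 0: the "model" stalls on one output.
```

The analogy is loose - real mode collapse comes from training dynamics, not explicit sharpening - but it shows why a loop that keeps consuming its own lowest-surprise output ends up with nothing left to say.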

This is not me hating on Steven Pinker, really it is not.

PINKER: I think it’s incoherent, like a “general machine” is incoherent. We can visualize all kinds of superpowers, like Superman’s flying and invulnerability and X-ray vision, but that doesn’t mean they’re physically realizable. Likewise, we can fantasize about a superintelligence that deduces how to make us immortal or bring about world peace or take over the universe. But real intelligence consists of a set of algorithms for solving particular kinds of problems in particular kinds of worlds. What

... (read more)
Lone Pine · 1y
Either you believe in the Church-Turing thesis or you don't, it seems. General machines have existed for over 70 years! I wonder how these people will pivot once there are human-like full agents running around (assuming we live to see it.)

My model for slow takeoff looks like unemployment and GDP continually rising and accelerating (on a worldwide basis).

I should add that I think a slow takeoff scenario is unlikely.

You don't have to invoke it per se.

External observables on what the current racers are doing lead me to be fairly confident that they say some of the right things, but in reality they move as fast as possible: basically "ship now, fix later".

Then we have the fact that interpretability is in its infancy; currently we don't know what happens inside SOTA models. Likely nothing exotic, but we can't tell, and if you can't tell on current narrow systems, how are we going to fare on powerful systems[1]?

In that world, I think this would be very probable


... (read more)

I don't get it, seriously I do not understand how

given how crazy far it seems from our prior experience.

is an argument against x-risk.

We want powerful systems that can "do things we want, but do not know how to do"[1]. That is exactly what everyone is racing towards right now, and since we "do not know how to do" those things, any solution would likely be "far from our prior experience".

And once you have a powerful system that can do that, you have to figure out how to deal with it roaming around in solution space and stumbling across dangerous (sub)solutions. N... (read more)

This looks like "lies to kids", but from the point of view of an adult realizing they have been lied to.

And "lies to kids", that is pretty much how everything is taught, you can't just go "U(1)...", you start out with "light...", and then maybe eventually when you told enough lies, you can say "ok that was all a lie, here it how it is" and then tell more lies. Do that for long enough and you hit ground truth.[1]

So what do you do?

Balance your lies when you teach others, maybe even say things like "ok, so this is not exactly true, but for now you will have t... (read more)

Honestly I don't think I am competent enough to give any answer.

But you could start with Pascal's mugging and go spelunking in those parts of the woods (decision theory).

If I was the man of the ledge, this would be my thinking:

If I am the kind of person that can be blackmailed into taking a specific action, with the threat of some future action being taken, then I might as well just surrender now and have other people decide all my actions.

I am not such a person so I will take whatever action I deem appropriate.[1]

And then I jump.

  1. ^

    This does not mean I will do whatever I want, appropriate is heavily compressed and contains a lot of things, like a deontology.

I hadn’t considered this. You point out a big flaw in the neighbor’s strategy. Is there a way to repair it?

A system that operates at the same cognitive level as a human, but can make countless copies of itself, is no longer a system operating at human level.

I am a human, I could not take over the world.[1]


I am a human, I want to take over the world, I can make countless copies of myself. 

Success seems to have a high probability.[2]

  1. ^

In principle it would be possible, but I am not a human with that kind of inclination, and I have never worked in any direction that would allow such a thing (with some low probability of success).

  2. ^

    Even more so if

... (read more)
yes, creating copies is a form of self-improvement

One can hope, although I see very little evidence for it.

Most evidence I see is an educated and very intelligent person writing about AI (not their field), and when reading it I could easily have been a chemist reading about how the 4 basic elements make it abundantly clear that bla bla - you get the point.

And I don't even know how to respond to that; the ontology displayed is just too fundamentally wrong, and tackling it feels like trying to explain differential equations to my 8-year-old daughter (to the point where she groks it).

There is also... (read more)

This is a pretty common problem. If anyone ever needs to explain AI safety to someone, with minimal risk of messing up, I think that giving them pages 137-149 from Toby Ord's The Precipice is the best approach. It's simple, one shot, and does everything right.