This post is a not-so-secret analogy for the AI alignment problem. Via a fictional dialogue, Eliezer explores and counters common objections to the Rocket Alignment Problem as approached by the Mathematics of Intentional Rocketry Institute.

MIRI researchers will tell you they're worried that "right now, nobody can tell you how to point your rocket’s nose such that it goes to the moon, nor indeed any prespecified celestial destination."

dirk 3h
Sometimes a vague phrasing is not an inaccurate demarcation of a more precise concept, but an accurate demarcation of an imprecise concept.
My current main cruxes:
1. Will AI get takeover capability? When?
2. Single ASI or many AGIs?
3. Will we solve technical alignment?
4. Value alignment, intent alignment, or CEV?
5. Defense>offense or offense>defense?
6. Is a long-term pause achievable?
If there is reasonable consensus on any one of those, I'd much appreciate knowing about it. Otherwise, I think these should be research priorities.
Thomas Kwa 19h
The cost of goods has the same units as the cost of shipping: $/kg. Comparing the two lets you understand how the economy works, e.g. why construction-material sourcing and drink bottling have to be local, but oil tankers exist.

* An iPhone costs $4,600/kg, about the same as SpaceX charges to launch it to orbit. [1]
* Beef, copper, and off-season strawberries are $11/kg, about the same as a 75 kg person taking a three-hour, 250 km Uber ride costing $3/km.
* Oranges and aluminum are $2-4/kg, about the same as flying them to Antarctica. [2]
* Rice and crude oil are ~$0.60/kg, about the same as the $0.72 it costs to ship a kilogram 5000 km across the US via truck. [3,4] Palm oil, soybean oil, and steel are around this price range, with wheat being cheaper. [3]
* Coal and iron ore are $0.10/kg, significantly more than the cost of shipping them around the entire world via smallish (Handysize) bulk carriers. Large bulk carriers are another 4x more efficient. [6]
* Water is very cheap, with tap water at $0.002/kg in NYC. [5] But shipping via tanker is also very cheap, so you can ship it maybe 1000 km before equaling its cost.

It's really impressive that for the price of a winter strawberry, we can ship a strawberry-sized lump of coal around the world 100-400 times.

[1] iPhone is $4,600/kg, large launches sell for $3,500/kg, and rideshares for small satellites $6,000/kg. Geostationary orbit is more expensive, so it's okay for those launches to cost more than an iPhone per kg, but Starlink wants to be cheaper.
[2] https://fred.stlouisfed.org/series/APU0000711415. Can't find current numbers, but Antarctica flights cost $1.05/kg in 1996.
[3] https://www.bts.gov/content/average-freight-revenue-ton-mile
[4] https://markets.businessinsider.com/commodities
[5] https://www.statista.com/statistics/1232861/tap-water-prices-in-selected-us-cities/
[6] https://www.researchgate.net/figure/Total-unit-shipping-costs-for-dry-bulk-carrier-ships-per-tkm-EUR-tkm-in-2019_tbl3_351748799
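To make the unit conversions concrete, here is a minimal back-of-envelope sketch (my own arithmetic, using only the figures quoted in the comment above; the implied ton-mile rate is derived, not re-sourced) of how two of these comparisons fall out.

```python
# Back-of-envelope arithmetic behind two of the comparisons above.
# All inputs are the figures quoted in the comment; treat them as illustrative.

# A 75 kg person on a three-hour, 250 km Uber ride at ~$3/km:
uber_total = 250 * 3.0            # $750 for the ride
uber_per_kg = uber_total / 75     # $10/kg -- comparable to ~$11/kg beef or copper

# Shipping 1 kg of rice ~5000 km across the US by truck for ~$0.72 implies a
# freight rate per metric-ton-mile of roughly:
miles = 5000 * 0.621371
implied_rate_per_ton_mile = 0.72 * 1000 / miles   # ~$0.23 per ton-mile

print(f"Uber: ${uber_per_kg:.2f}/kg, implied trucking rate: ${implied_rate_per_ton_mile:.2f}/ton-mile")
```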
Fabien Roger 4h
"List sorting does not play well with few-shot" mostly doesn't replicate with davinci-002. When using length-10 lists (it crushes length-5 no matter the prompt), I get:
* 32-shot, no fancy prompt: ~25%
* 0-shot, fancy python prompt: ~60%
* 0-shot, no fancy prompt: ~60%
So few-shot hurts, but the fancy prompt does not seem to help. Code here. I'm interested if anyone knows another case where a fancy prompt increases performance more than few-shot prompting, where a fancy prompt is a prompt that does not contain information that a human would use to solve the task. This is because I'm looking for counterexamples to the following conjecture: "fine-tuning on k examples beats fancy prompting, even when fancy prompting beats k-shot prompting" (for a reasonable value of k, e.g. the number of examples it would take a human to understand what is going on).
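For anyone wanting to run this kind of comparison themselves, here is a hypothetical sketch (not Fabien's linked code) of a k-shot list-sorting eval; `query_model` is a placeholder for whatever completion model you use (e.g. davinci-002), and the exact-match scoring rule is my assumption.

```python
# Hypothetical sketch of a k-shot list-sorting eval (k=0 gives the 0-shot baseline).
import random

def make_list(n=10):
    return [random.randint(0, 99) for _ in range(n)]

def k_shot_prompt(k, xs):
    parts = []
    for _ in range(k):
        ex = make_list(len(xs))
        parts.append(f"Input: {ex}\nSorted: {sorted(ex)}")
    parts.append(f"Input: {xs}\nSorted:")
    return "\n\n".join(parts)

def query_model(prompt: str) -> str:
    # Placeholder: call your completion model here (e.g. davinci-002).
    raise NotImplementedError

def accuracy(k, n_trials=100):
    correct = 0
    for _ in range(n_trials):
        xs = make_list()
        completion = query_model(k_shot_prompt(k, xs))
        correct += completion.strip().startswith(str(sorted(xs)))
    return correct / n_trials
```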
dirk 4h
I'm against intuitive terminology [epistemic status: 60%] because it creates the illusion of transparency; opaque terms make it clear you're missing something, but if you already have an intuitive definition that differs from the author's, it's easy to substitute yours in without realizing you've misunderstood.

Popular Comments

Recent Discussion

Austin said they have $1.5 million in the bank, vs. $1.2 million mana issued. The only outflows right now are to the charity programme, which even with a lot of outflows is only at $200k. They also recently raised at a $40 million valuation. I am confused by the claim that they are running out of money. They have a large user base that wants to bet and will do so at larger amounts if given the opportunity. I'm not so convinced that there is some tight timeline here.

But if there is, then say so: "we know that we often talked about mana being eventually worth $100 mana per dollar, but ... (read more)

Nathan Young 3m
Austin took his salary in mana, an often-referred-to incentive for him to want mana to become valuable, presumably at that rate. I recall comments like 'we pay 250 mana per user in referrals because we reckon we'd pay about $2.50', and likewise at the in-person mana auction. I'm not saying it was an explicit contract, but there were norms.
Nathan Young 4m
"That being said, we will do everything we can to communicate to our users what our plans are for the future and work with anyone who has participated in our platform with the expectation of being able to donate mana earnings." "Everything we can" is not a couple of weeks' notice and a lot of hassle. Am I supposed to trust this organisation in future with my real money? (via https://manifoldmarkets.notion.site/Charitable-donation-program-668d55f4ded147cf8cf1282a007fb005)
Nathan Young 5m
Well, they have a much larger donation than has been spent, so there were ways to avoid this abrupt change: "Manifold for Good has received grants totaling $500k from the Center for Effective Altruism (via the FTX Future Fund) to support our charitable endeavors." Manifold has donated $200k so far, so there is $300k left. Why not at least say, "we will change the rate at which mana can be donated when we burn through this money"? (via https://manifoldmarkets.notion.site/Charitable-donation-program-668d55f4ded147cf8cf1282a007fb005)
Bogdan Ionut Cirstea 6h
Hey Jacques, sure, I'd be happy to chat!  
Bogdan Ionut Cirstea 6h
Yeah, I'm unsure if I can tell any 'pivotal story' very easily (e.g. I'd still be pretty skeptical of enumerative interp even with GPT-5-MAIA). But I do think, intuitively, GPT-5-MAIA might e.g. make 'catching AIs red-handed' using methods like in this comment significantly easier/cheaper/more scalable. 

But I do think, intuitively, GPT-5-MAIA might e.g. make 'catching AIs red-handed' using methods like in this comment significantly easier/cheaper/more scalable.

Notably, the mainline approach for catching doesn't involve any internals usage at all, let alone labeling a bunch of things.

I agree that this model might help in performing various input/output experiments to determine what made a model do a given suspicious action.

TL;DR

Tacit knowledge is extremely valuable. Unfortunately, developing tacit knowledge is usually bottlenecked by apprentice-master relationships. Tacit Knowledge Videos could widen this bottleneck. This post is a Schelling point for aggregating these videos—aiming to be The Best Textbooks on Every Subject for Tacit Knowledge Videos. Scroll down to the list if that's what you're here for. Post videos that highlight tacit knowledge in the comments and I’ll add them to the post. Experts in the videos include Stephen Wolfram, Holden Karnofsky, Andy Matuschak, Jonathan Blow, Tyler Cowen, George Hotz, and others. 

What are Tacit Knowledge Videos?

Samo Burja claims YouTube has opened the gates for a revolution in tacit knowledge transfer. Burja defines tacit knowledge as follows:

Tacit knowledge is knowledge that can’t properly be transmitted via verbal or written instruction, like the ability to create

...

"Mise En Place", "[i]nterviews and kitchen walkthroughs:

Qualifies as tacit knowledge, in that people are showing you what they're doing, which you seldom have a chance to watch first-hand. Reasonably entertaining, and it seems like you could learn a bit here.

Caveat: most of the dishes are really high-class meat/fish dishes etc. that you aren't very likely to ever cook yourself, and the knowledge seems difficult to transfer.

Wei Dai 6h
Why do you think these values are positive? I've been pointing out, and I see that Daniel Kokotajlo also pointed out in 2018, that these values could well be negative. I'm very uncertain, but my own best guess is that the expected value of misaligned AI controlling the universe is negative, in part because I put some weight on suffering-focused ethics.
  • My current guess is that max good and max bad seem relatively balanced. (Perhaps max bad is 5x more bad/flop than max good in expectation.)
  • There are two different (substantial) sources of value/disvalue: interactions with other civilizations (mostly acausal, maybe also aliens) and what the AI itself terminally values
  • On interactions with other civilizations, I'm relatively optimistic that commitment races and threats don't destroy as much value as acausal trade generates on some general view like "actually going through with threats is a waste of resourc
... (read more)
mesaoptimizer 7h
e/acc is not a coherent philosophy, and treating it as one means you are fighting shadows. Landian accelerationism at least is somewhat coherent. "e/acc" is a bundle of memes that support the self-interest of the people supporting and propagating it, both financially (VC money, dreams of making it big) and socially (the non-Beff e/acc vibe is one of optimism and hope and of doing things -- of engaging with the object level -- instead of just trying to steer social reality). A more charitable interpretation is that the philosophical roots of "e/acc" are founded upon a frustration with how bad things are, and a desire to improve things yourself. This is a sentiment I share and empathize with. I find the term "techno-optimism" to be a more accurate description of the latter, perhaps "Beff Jezos philosophy" a more accurate description of what you have in mind, and "e/acc" mainly a description of the community and its coordinated attempts at steering the world towards outcomes that the people within the community perceive as benefiting them.
Quinn 2h
sure -- i agree; that's why i said "something adjacent to", because it had enough overlap in properties. I think my comment completely stands with a different word choice, I'm just not sure what word choice would do a better job.

Warning: This post might be depressing to read for everyone except trans women. Gender identity and suicide are discussed. This is all highly speculative. I know near-zero about biology, chemistry, or physiology. I do not recommend anyone take hormones to try to increase their intelligence; mood & identity are more important.

Why are trans women so intellectually successful? They seem to be overrepresented 5-100x in eg cybersecurity twitter, mathy AI alignment, non-scam crypto twitter, math PhD programs, etc.

To explain this, let's first ask: Why aren't males way smarter than females on average? Males have ~13% higher cortical neuron density and 11% heavier brains (implying 1.11^(2/3) - 1 ≈ 7% more area?). One might expect males to have mean IQ far above females then, but instead the means and medians are similar.
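A quick check of the scaling step in parentheses, under the assumption (implicit in the OP, not stated) that cortical surface area scales as the 2/3 power of brain volume:

```python
# Isometric-scaling assumption: area ~ volume^(2/3), so an 11% heavier brain
# implies roughly 7% more cortical area.
print(f"{1.11 ** (2 / 3) - 1:.1%}")   # ~7.2%
```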


My theory...

Yes, my point is that the low T did it before the transition.

kromem 16h
It implicitly does compare trans women to other women in talking about the performance similarity between men and women: "Why aren't males way smarter than females on average? Males have ~13% higher cortical neuron density and 11% heavier brains (implying 1.11^(2/3) - 1 ≈ 7% more area?). One might expect males to have mean IQ far above females then, but instead the means and medians are similar." So OP is saying "look, women and men are the same, but trans women are exceptional." I'm saying that identifying the exceptionality of trans women ignores the environmental disadvantage other women experience, such that the earlier claims of unexceptional performance of women (which, as I quoted, come with an explicit presumption of likely male competency based on what's effectively phrenology) reflect a disadvantaged sample vs. trans women. My point is that if you accounted for environmental factors, the data would potentially show female exceptionality across the board, and the key reason trans women end up being an outlier against both men and other women is that they avoid the early educational disadvantage other women experience.
This is a linkpost for https://medium.com/p/aeb68729829c

It's a ‘superrational’ extension of the proven optimality of cooperation in game theory 
+ Taking into account asymmetries of power
// Still AI risk is very real

Short version of an already skimmed 12min post
29min version here


For rational agents (long-term) at all scale (human, AGI, ASI…)


In real contexts, with open environments (world, universe), there is always a risk of meeting someone/something stronger than you, and even overall-weaker agents may be specialized in your flaws/blind spots.


To protect yourself, you can choose the maximally rational and cooperative alliance:


Because any agent is subject to the same pressure/threat from (actual or potential) stronger agents/alliances/systems, one can take out insurance that more powerful superrational agents will behave well, by behaving well toward weaker agents oneself. This is the basic rule allowing scale-free cooperation.


If you integrated this super-cooperative...

Ryo 28m
Indeed, I emphasize in all three posts that, from our perspective, this is the crucial point: Fermi's paradox. There is a whole ecosystem of concepts surrounding it, and although I have certain preferred models, the point is that the uncertainty is really heavy. Those AI-lions are cosmic lions thinking on cosmic scales. Is it easy to detect an AI-Dragon you may meet in millions/billions of years? Is it undecidable? Probably, for many reasons.* Is this [astronomical level of uncertainty/undecidability + the maximal threat of a death sentence] worth the gamble? -> "Meeting a stronger AI" = "death" -> Maximization = 0 -> the AI only needs one stronger AI to be dead. What is the likelihood that a human-made AI never encounters [a stronger alien AI] during the whole length of its lifetime? *(Reachable but rare and far-off-in-space-time Dragons, but also cases where Dragons are everywhere and so advanced that lower technological proficiency isn't enough, etc.)
Ryo 18m

I can't be certain of the solidity of this uncertainty, and think we still have to be careful, but overall, the most parsimonious prediction to me seems to be super-coordination.
 

Compared to the risk of facing a vengeful super-cooperative alliance, is the price of maintaining humans in a small blooming "island" really that high?

Many other-than-human atoms are lions' prey.

And a doubtful AI may not optimize fully for super-cooperation, simply alleviating the price to pay in the counterfactuals where they encounter a super-cooperative cluster (resulti... (read more)

This is a linkpost for https://arxiv.org/abs/2309.02390

This is a linkpost for our paper Explaining grokking through circuit efficiency, which provides a general theory explaining when and why grokking (aka delayed generalisation) occurs, and makes several interesting and novel predictions which we experimentally confirm (introduction copied below). You might also enjoy our explainer on X/Twitter.

Abstract

One of the most surprising puzzles in neural network generalisation is grokking: a network with perfect training accuracy but poor generalisation will, upon further training, transition to perfect generalisation. We propose that grokking occurs when the task admits a generalising solution and a memorising solution, where the generalising solution is slower to learn but more efficient, producing larger logits with the same parameter norm. We hypothesise that memorising circuits become more inefficient with larger training datasets while generalising circuits do...
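As a gloss on what "more efficient" means here (my toy rendering, not the paper's code, with made-up numbers): if each circuit's logits scale with its efficiency times its parameter norm, the generalising circuit can hit the same target logit with a much smaller norm, and so pays far less weight-decay penalty once learned.

```python
# Toy illustration of circuit "efficiency" (logit produced per unit of parameter norm).
# The specific numbers are made up for illustration.
gen_efficiency, mem_efficiency = 4.0, 1.0
target_logit = 8.0

gen_norm = target_logit / gen_efficiency   # 2.0
mem_norm = target_logit / mem_efficiency   # 8.0

# With L2 weight decay the penalty scales as norm^2, so the memorising circuit
# pays (8/2)^2 = 16x more to produce the same logit.
print(gen_norm ** 2, mem_norm ** 2)
```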

LawrenceC 6h
Higher weight norm means lower effective learning rate with Adam, no? In that paper they used a constant learning rate across weight norms, but Adam tries to normalize the gradients to be of size 1 per parameter, regardless of the size of the weights. So the weights change more slowly with larger initializations (especially since they constrain the weights to be of fixed norm by projecting after the Adam step).
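A toy check of this point (my sketch, not from the paper): with Adam, the per-step update size is set by the learning rate almost independently of gradient magnitude, so a 10x larger initialization sees roughly a 10x smaller relative change per step.

```python
# Toy check: Adam's relative update shrinks as the weight norm grows.
import torch

for init_scale in (1.0, 10.0):
    w = torch.nn.Parameter(init_scale * torch.randn(1000))
    opt = torch.optim.Adam([w], lr=1e-3)
    w0 = w.detach().clone()
    loss = (w ** 2).sum()   # any loss works; Adam roughly normalizes away the gradient scale
    loss.backward()
    opt.step()
    rel_change = (w.detach() - w0).norm() / w0.norm()
    print(f"init_scale={init_scale}: relative change per step ~ {rel_change.item():.1e}")
```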
Rohin Shah 36m

Sounds plausible, but why does this differentially impact the generalizing algorithm over the memorizing algorithm?

Perhaps under normal circumstances both are learned so fast that you just don't notice that one is slower than the other, and this slows both of them down enough that you can see the difference?

I refuse to join any club that would have me as a member.

— Groucho Marx

Alice and Carol are walking on the sidewalk in a large city, and end up together for a while.

"Hi, I'm Alice! What's your name?"

Carol thinks:

If Alice is trying to meet people this way, that means she doesn't have a much better option for meeting people, which reduces my estimate of the value of knowing Alice. That makes me skeptical of this whole interaction, which reduces the value of approaching me like this, and Alice should know this, which further reduces my estimate of Alice's other social options, which makes me even less interested in meeting Alice like this.

Carol might not think all of that consciously, but that's how human social reasoning tends to...

gwern 38m

Hence the advice to lost children not to accept random strangers soliciting them spontaneously but, if no authority figure is available, to pick a random adult themselves and ask them for help.

jchan 1h
A corollary of this is that if anyone at an [X] gathering is asked “So, what got you into [X]?” and answers “I heard there’s a great community around [X]”, then that person needs to be given the cold shoulder and made to feel unwelcome, because otherwise the bubble of deniability is pierced and the lemon spiral will set in, ruining it for everyone else. However, this is pretty harsh, and I’m not confident enough in this chain of reasoning to actually “gatekeep” people like this in practice. Does this ring true to you?

My credence: 33% confidence in the claim that the growth in the number of GPUs used for training SOTA AI will slow down significantly directly after GPT-5. It is not higher because (1) decentralized training is possible, (2) GPT-5 may be able to increase hardware efficiency significantly, (3) GPT-5 may be smaller than assumed in this post, and (4) race dynamics.

TLDR: Because of a bottleneck in energy access to data centers and the need to build OOM larger data centers.

The reasoning behind the claim:

...
