This is a story of an impossible outcome, where AI never worked, nanotechnology never worked, biotechnology only sort-of worked; and yet somehow humanity not only survived, but discovered a way to travel Faster-Than-Light:  The past's Future.

It features complex moral dilemmas. It begins with a woman shouting "ALIENS!".

Recent Discussion

This post is written as an explanation of a misconception I had with transformer embedding when I was getting started. Thanks to Stephen Fowler for the discussion last August that made me realise the misconception, and others for helping me refine my explanation. Any mistakes are my own. Thanks to feedback by Stephen Fowler and JustisMills on this post.

TL;DR: While the token vectors are stored as n-dimensional vectors, thinking of them as points in vector space can be quite misleading. It is better to think of them as directions on a hypersphere, with a size component.

I think of distance as the Euclidean distance, with formula:

This does not match up with the distance formula used when calculating logits:

But it does match up with the cosine similarity formula:

And...
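The formulas the excerpt refers to are not reproduced here. As a minimal sketch, assuming the standard textbook definitions (Euclidean distance, the dot product used to score a vector against a token direction, and cosine similarity), the contrast can be illustrated with a few toy vectors. The function names and the vectors `u`, `v`, `w` are illustrative assumptions, not taken from the post.

```python
import math

def euclidean_distance(x, y):
    # Straight-line distance between two points in vector space.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def dot_product(x, y):
    # Assumed stand-in for the logit-style score: sum of elementwise products.
    return sum(a * b for a, b in zip(x, y))

def cosine_similarity(x, y):
    # Dot product with both vector sizes divided out, leaving only direction.
    return dot_product(x, y) / (math.sqrt(dot_product(x, x)) * math.sqrt(dot_product(y, y)))

# Hypothetical toy vectors (not real token embeddings).
u = [1.0, 0.0]
v = [10.0, 0.0]  # same direction as u, much larger size
w = [0.0, 1.0]   # orthogonal to u, same size as u

print(euclidean_distance(u, v), euclidean_distance(u, w))
print(cosine_similarity(u, v), cosine_similarity(u, w))
```

In this toy case `u` is closer to `w` than to `v` by Euclidean distance, yet `u` and `v` point in exactly the same direction while `u` and `w` share none. That is the sense in which a "direction plus size" picture can differ from a "points in space" picture.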

I Googled up 'how are tokens embedded' and this post came up third in the results - thanks for the post!

I intend to use my shortform feed for two purposes:

1. To post thoughts that I think are worth sharing that I can then reference in the future in order to explain some belief or opinion I have.

2. To post half-finished thoughts about the math or computer science thing I'm learning at the moment. These might be slightly boring and for that I apologize.

I'm not sure if I can find it easily, but I recall Eliezer pointing out (several years ago) that he thought that Value Identification was the "easy part" of the alignment problem, with the "getting it to care" part being something like an order of magnitude more difficult. He seemed to think (IIRC) this itself could still be somewhat difficult, as you point out. Additionally, the difficulty was always considered in the context of having an alignable AGI (i.e. something you can point in a specific direction), which GPT-N is not under this paradigm.

4 habryka 6h
I like this summary, though it seems to miss the arguments in things like Nate's recent post (which have also been made other places many years ago): https://www.lesswrong.com/posts/tZExpBovNhrBvCZSb/how-could-you-possibly-choose-what-an-ai-wants Reflective stability is a huge component of why value identification is hard, and why it's hard to get feedback on whether your AI actually understands human values before it reaches quite high levels of intelligence.
2 Matthew Barnett 5h
I don't understand this argument. I don't mean that I disagree, I just mean that I don't understand it. Reflective stability seems hard no matter what values we're talking about, right? What about human values being complex makes it any harder? And if the problem is independent of the complexity of value, then why did people talk about complexity of value to begin with?
2 RobertM 2h
Complexity of value is part of why value is fragile [https://www.lesswrong.com/posts/GNnHHmm8EzePmKzPk/value-is-fragile]. (Separately, I don't think current human efforts to "figure out" human values have been anywhere near adequate, though I think this is mostly a function of philosophy being what it is.  People with better epistemology seem to make wildly more progress in figuring out human values compared to their contemporaries.)

So you’ve been reading down a rabbit hole and you have seen something that doesn’t make sense. You feel a quiver in your heart that feels like a cross between excitement and fear. Could all the people who already know stuff be Wrong™? Could eminent researchers in their field be working on a wrong model?

Maybe you have a disease or condition and you followed the directions of the doctors or experts exactly and found results opposite to the intended results.  That’s when you started googling.  You found an obscure blog written in 2005 by someone with the same problem.  You find a few people proposing a different working mechanism and you are rapidly falling down the rabbit hole… 

IF and that’s a big bold and underlined “if”, the experts...

1 Answer by mukashi 7m
I believe I have found a perfect example where the "Medical Model is Wrong," and I am currently working on a post about it. However, I am swamped with other tasks, and I wonder if I will ever finish it. In my case, I am highly confident that my model is correct, while the majority of the medical community is wrong. Using your bullet points:
1. Personal: I have personally experienced this disease and know that the standard treatments do not work.
2. Anecdotal: I am aware of numerous cases where the conventional treatment has failed. In fact, I am not aware of any cases where it has been successful.
3. Research papers: I came across a research paper from 2022 that shares the same opinion as mine.
4. Academics: Working in academia, I am well aware of its limitations. In this specific case, there is a considerable amount of inertia and a lack of communication between different subfields, as accurately described in the book "Inadequate Equilibria" by EY.
5. Medical: Most doctors hold the same opinion because they are influenced by their education. Therefore, if 10 doctors provide the same response, it should not be considered as 10 independent opinions.
6. Countercultural experts: No idea here.
7. Communities: I have not explored this extensively, but completing this post I am talking about might be the beginning.
8. Someone claims to have completely made the condition disappear using arbitrary methods: I am not personally aware of any such cases, but I suspect that it is feasible and could potentially be relatively simple.
9. Models: I have a precise mechanistic model of the disease and why the treatments fail to cure it. I work professionally in a field closely related to this disease.
In summary, my confidence comes from: 1. being an expert in a closely related field and understanding what other people are missing and, above all, why they are missing it; 2. having a mechanistic model; 3. finding publications that manifest similar opinions.
1Answer by Oleander21m
I'm not exactly in this position, but I think it is somewhat adjacent. I have stage 4 prostate cancer and after initial treatment (chemo + castration) decided to stop seeing my urologist for periodic checkups. I do regularly get my blood tested by a lab outside of our (European) healthcare system. This was not exactly due to the establishment being "wrong", but a combination of factors:
  • Quality of care wasn't great and I can't stand the paternalistic nature of healthcare (in my country).
  • After five months of pretty intensive self study and an occasional question to my doctor I felt that I had a sufficient understanding to improve my treatment beyond "standard of care". It probably helps a bit that my brother is a doctor.
Three and a half years after diagnosis I'm doing better than expected, but of course there is no way to tell if this is due to luck or my own treatment. At first I was a bit frustrated that none of my doctors seem to care, but I understand that they've been thoroughly trained to ignore anecdotes (and they're probably also overworked). I feel it important to mention that I'm not into alternative medicine. I got my GP to prescribe off label medication and the rest is just lifestyle adjustments (diet and exercise). Feel free to ask me questions.

I wish you a speedy recovery with all my heart.

The big news this week was that OpenAI is not training GPT-5, and that China’s draft rules look to be crippling restrictions on their ability to develop LLMs. After all that talk of how a pause was impossible and working with China was impossible and all we could do was boldly rush ahead, the biggest American player and biggest foreign rival both decided for their own internal reasons to do something not entirely unlike a pause.

They just went ahead and did it. We kept saying they’d never do it no matter what, and they just… went ahead and did it. At least somewhat.

This is excellent news. I sincerely hope people are updating on the new information, now that they know such things are not only possible but...

The most compelling-to-me argument I've seen in that vein is that human civilization is currently, even without AI, on a trajectory to demand more and more energy, and eventually that will involve doing things on a scale sufficient to significantly change the amount of sunlight that reaches the surface of the Earth.

Humans probably won't do that, because we live here (though even there, emphasis on "probably" -- we're not exactly doing great in terms of handling climate change from accidentally changing the amount of CO2 in the atmosphere, and while that's ...

2 James Payor 4h
Count me surprised if they're not working on GPT-5. I wonder what's going on with this? I saw rumors that this is because they're waiting on supercomputer improvements (H100s?), but I would have expected at least early work like establishing their GPT-5 scaling laws and whatnot. In which case perhaps they're working on it, just haven't started what is considered the main training run? I'm interested to know if Sam said any other relevant details in that talk, if anyone knows.
2 starship006 3h
I'm not sure if you've seen it or not, but here's [https://twitter.com/amyneurons/status/1646649674122641409] a relevant clip where he mentions that they aren't training GPT-5. I don't quite know how to update from it. It doesn't seem likely that they paused from a desire to conduct more safety work, but I would also be surprised if somehow they are reaching some sort of performance limit from model size. However, as Zvi mentions, Sam did say [https://www.wired.com/story/openai-ceo-sam-altman-the-age-of-giant-ai-models-is-already-over/]:
2 jacob_cannell 2h
The expectation is that GPT-5 would be the next GPT-N, with 100x the training compute of GPT-4, but that would probably cost tens of billions of dollars, so GPT-N scaling is over for now.

Some time back, I saw a tweet from somebody that read:

Much of social psychology seems to be premised on the bizarre assumption that what people really care about is not real-world outcomes but the state of their own mind: self-esteem, a positive self-image, dissonance reduction, feelings of control, reducing uncertainty, etc.

I've certainly seen versions of the same myself. Maybe the most poignant example comes from this book review, which suggested that gambling addicts get hooked on a sense of control - even though someone who's hooked on gambling to the point of ruining their life clearly isn't in much control of anything:

The primary objective that machine gambling addicts have is not to win, but to stay in the zone. The zone is a state that suspends real life, and

...

This is one of my favorite sequences on this site and I'm quite glad to see a new entry.

Thank you!

How does one gain confidence that the read on their own emotions is an accurate description of the message they're trying to communicate? That is, how can one be more sure that they're actually listening to their emotions and not just assuming?

It can be difficult! Some thoughts:

1) There's a certain difference in what it feels like to intellectualize or guess what your emotions are saying, as opposed to actually listening to them. @pjeby had a nice exercise abo...

This is a linkpost for https://epochai.org/trends

Developments in Machine Learning have been happening extraordinarily fast, and as their impacts become increasingly visible, it becomes ever more important to develop a quantitative understanding of these changes. However, relevant data has thus far been scattered across multiple papers, has required expertise to gather accurately, or has been otherwise hard to obtain.

Given this, Epoch is thrilled to announce the launch of our new dashboard, which covers key numbers and figures from our research to help understand the present and future of Machine Learning. This includes:

  • Training compute requirements
  • Model size, measured by the number of trainable parameters
  • The availability and use of data for training
  • Trends in hardware efficiency
  • Algorithmic improvements for achieving better performance with fewer resources
  • The growth of investment in training runs over time

Our dashboard gathers all of this...

1 Edouard Harris 9h
Looks awesome! Minor correction on the cost of the GPT-4 training run: the website says $40 million, but sama confirmed publicly that it was over $100M (and several news outlets [https://www.wired.com/story/openai-ceo-sam-altman-the-age-of-giant-ai-models-is-already-over/] have reported the latter number as well).

Thanks!

Our current best guess is that this includes costs other than the amortized compute of the final training run.

If no extra information surfaces we will add a note clarifying this and/or adjust our estimate.

2 Jsevillamol 10h
Thanks Neel! The difference between tf16 and FP32 comes to a x15 factor IIRC. Though also ML developers seem to prioritise other characteristics than cost effectiveness when choosing GPUs like raw performance and interconnect, so you can't just multiply the top price performance we showcase by this factor and expect that to match the cost performance of the largest ML runs today. More soon-ish.

A while back I read How to Measure Anything and found it fascinating. In my day job, I spend quite a bit of time trying to make sense of the world by looking at dashboards of requests, latencies, error rates, etc. (software systems).

After finishing the book and taking copious notes, I understood that it gave me a prepackaged process that I could apply as-is, but I found it very difficult to adapt to everyday situations. I don't think I picked up a good intuition about stats, in other words.

I'm looking to change that. Specifically, I want to learn to apply stats in these two situations:

  • measuring things. Mostly software systems, but open to little experiments. Dan Luu used to measure a lot of fun things.
  • understanding how others measure
...
Answer by mikes Apr 21, 2023 10

Being able to accurately assess a paper's claims is, unfortunately, a very high bar. A large proportion of scientists fall short of it. See: https://statmodeling.stat.columbia.edu/2022/03/05/statistics-is-hard-etc-again/

Most people with a strong intuition for statistics have taken courses in probability. It is foundational material for the discipline.

If you haven't taken a probability course, and if you're serious about wanting to learn stats well, I would strongly recommend starting there. I think Harvard's intro probability course is good and has...

3 Answer by Derek M. Jones 8h
Assuming you are interested in learning about something by measuring one or more of its attributes and then using statistics to extract information from the measurements, i.e., in a hands-on application, books I found useful include:
  • Statistics for Experimenters by Box, Hunter and Hunter
  • Design and Analysis of Experiments by Montgomery

This is a great compilation of the arguments for why cryonics service providers should offer a brain-only option.

Not mentioned in the article: Note that a brain-only option is also offered by OregonCryo and Cryonics Germany.

Summary of the linked post:

The main advantages are:

  • Lower storage cost
  • Easier emergency relocation
  • Lower "yuck" factor
  • Compatible with organ donation
  • Compatible with a burial ceremony
  • Reduction of fracturing
  • Permits emergency fixation (in case of LN2 delivery issues)

Related posts:

How much more advantageous would this be than a "head only" option? To get to the brain, wouldn't you have to cut open the head anyways?

8 avturchin 9h
Also, legally, in some jurisdictions a brain is not a body, as it doesn't have bones; it is only a tissue sample.

Some have pointed out the seemingly large amount of status anxiety EAs generally have. My hypothesis about what's going on:

A cynical interpretation: for most people, altruism is significantly motivated by status-seeking behavior. It should not be all that surprising if most effective altruists are motivated significantly by status in their altruism. So you've collected several hundred people all motivated by status into the same subculture, but status isn't a positive-sum good, so not everyone can get the amount of status they want, and we get the above dyn

...