brambleboy

It may be through extrapolating too much from your (first-person, subjective) experiences with objects that seemingly possess intrinsic, observer-independent properties, like the classical objects of everyday life.


Are you trying to say that quantum physics provides evidence that physical reality is subjective, with conscious observers playing a fundamental role? Rob implicitly assumes the position advocated by The Quantum Physics Sequence, which argues that reality exists independently of observers and that quantum stuff doesn't suggest otherwise. It's just one of the many presuppositions he makes that are commonly shared on here. If that's your main objection, you should make that clear.

Another example in ML of a "non-conservative" optimization process: a common failure mode of GANs is mode collapse, wherein the generator and discriminator get stuck in a loop. The generator produces just one output that fools the discriminator, the discriminator memorizes it, the generator switches to another, until eventually they get back to the same output again.

In the rolling ball analogy, we could say that the ball rolls down into a divot, but the landscape flexes against the ball to raise it up again, and then the ball rolls into another divot, and so on.
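To make "non-conservative" concrete, here's a minimal sketch (my own toy example, not GAN code): the standard bilinear game of minimizing over x while maximizing over y of f(x, y) = x*y, with both players taking simultaneous gradient steps. There is no fixed landscape being descended, so the iterates never settle; they orbit the equilibrium and slowly spiral outward.

```python
# Toy non-conservative optimization: min over x, max over y of f(x, y) = x * y.
# The joint update follows the vector field (-y, +x), which is a pure rotation,
# not the gradient of any potential. With simultaneous discrete steps, the
# iterates orbit the equilibrium at (0, 0) and drift slowly outward.

lr = 0.1
x, y = 1.0, 0.0
for step in range(101):
    if step % 25 == 0:
        radius = (x * x + y * y) ** 0.5
        print(f"step {step:3d}: x={x:+.3f} y={y:+.3f} radius={radius:.3f}")
    # gradient descent for x, gradient ascent for y, applied simultaneously
    x, y = x - lr * y, y + lr * x
```

Each full orbit here is the analogue of the generator and discriminator cycling back to where they started.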

So of course Robin Hanson offered polls on these so-called taboo topics. The ‘controversial’ positions got overwhelming support. The tenth question, whether demographic diversity (race, gender) in the workplace often leads to worse performance got affirmed 54%-17%, and the rest were a lot less close than that. Three were roughly 90%-1%. I realize Hanson has unusual followers, but the ‘taboo questions’ academics want to discuss? People largely agree on the answers, and the academics have decided saying that answer out loud is not permitted.


I understand criticizing the censorship of controversial research, but to suggest that these questions aren't actually controversial or taboo outside of academia is absurd to me. People largely agree that “Genetic differences explain non-trivial (10% or more) variance in race differences in intelligence test scores”? Even a politician in a deeply conservative district wouldn't dare say that and risk scaring off his constituents.

For those curious about the performance: eyeballing the technical report, it performs roughly at the level of Llama-3 70B. It seems to have an inferior parameters-to-performance ratio because it was only trained on 9 trillion tokens, while the Llama-3 models were trained on 15 trillion. It's also trained with a 4k context length as opposed to Llama-3's 8k. Its primary purpose seems to be the synthetic data pipeline thing.

I encountered this while reading about an obscure estradiol ester, estradiol undecylate, used for hormone replacement therapy and treating prostate cancer. It's very useful because it has a super long half-life, but it was discontinued. I had to reread the article to be sure I understood it correctly: the standard dose, chosen arbitrarily in the first trials, was hundreds of times larger than necessary, leading to massive estrogen overdoses and severe side effects that killed many people through cardiovascular complications. Yet these insane doses remained typical for decades and might've contributed to its discontinuation.

Although it has been over a decade, decent waterproof phone mounts now exist, too.

Thank you for writing this. It's by far the strongest argument tailored to leftists for taking this problem seriously that I've seen, and I'll be sharing it. Hopefully the frequent (probably unavoidable) references to EA don't turn them off too much.

Answer by brambleboy

Here's why determinism doesn't bother me. I hope I get it across.

Deterministic systems still have to be simulated to find out what happens. Take cellular automata, such as Conway's Game of Life or Wolfram's Rule 110. The result of all future steps is determined by the initial state, but we can't practically "skip ahead" because of what Wolfram calls 'computational irreducibility': despite the simplicity of the underlying program, there's no way to reduce the output to a calculation that's much cheaper than just simulating the whole thing. The same goes for a mathematical structure like the Mandelbrot set: its appearance is completely determined by the function, and yet we couldn't predict what we'd see until we computed it. In fact, all of math is like this.
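To see what that irreducibility feels like in practice, here's a minimal Rule 110 simulator (my own sketch): as far as anyone knows, the only way to learn the state at step N is to actually run all N steps.

```python
# Minimal Rule 110 cellular automaton. Each cell's next value is looked up
# from the rule number's bits, indexed by the (left, center, right) triple.
# There's no known shortcut to step N cheaper than iterating N times.

RULE = 110  # binary 01101110: the 8-entry update table

def step(cells):
    """Advance one generation (edges wrap around)."""
    n = len(cells)
    return [
        (RULE >> ((cells[(i - 1) % n] << 2) | (cells[i] << 1) | cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

# Start from a single live cell and watch unpredictable structure emerge.
cells = [0] * 63 + [1]
for _ in range(30):
    print("".join("#" if c else "." for c in cells))
    cells = step(cells)
```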

What I'm getting at is that all mathematical truths are predetermined, and yet I doubt this gives you a sense that being a mathematician is pointless, because obviously these truths have to be discovered. As with the universe: the future is determined, and yet we, or even a hypothetical outsider with a massive computer, have to discover it.

Our position is better than that, though: we're not just looking at the structure of the universe from the outside; we're within it. We're part of what determines the future: it's impossible to calculate everything that happens in the future without calculating everything we humans do. The universe is determined by the process, and the process is us. Hence, our choices determine the future.

I disagree that the Reversal Curse demonstrates a fundamental lack of sophistication of knowledge on the model's part. As Neel Nanda explained, it's not surprising that current LLMs will store A -> B but not B -> A, as they're basically lookup tables, and this is definitely an important limitation. However, I think this is mainly due to a lack of computational depth. LLMs can perform that kind of deduction when the information is external: that is, if you ask one who Tom Cruise's mom is, it can then answer who Mary Lee Pfeiffer's son is. If the LLM knew the first part already, you could just prompt it to answer the first question before prompting it with the second. I suspect that a recurrent model like the Universal Transformer would be able to perform the A -> B to B -> A deduction internally, but for now LLMs must do multi-step computations like that externally, with a chain of thought. In other words, they can deduce new things, just not in a single forward pass or during backpropagation. If that doesn't count, then all other demonstrations of multi-step reasoning in LLMs don't count either. This deduced knowledge is usually discarded, but we can make it permanent with retrieval or fine-tuning. So I think it's wrong to say that this entails a fundamental barrier to wielding new knowledge.
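A toy sketch of what "doing the deduction externally" means (the llm callable here is a hypothetical stand-in for any text-completion API; none of these names come from the paper):

```python
# Hypothetical llm(prompt) -> str completion function, supplied by the caller.
# The B -> A deduction happens across two forward passes: the intermediate
# A -> B fact is written out as text, so answering the reverse question
# becomes in-context reading rather than a weight lookup.

def answer_reversal(llm, forward_q: str, reverse_q: str) -> str:
    fact = llm(forward_q)  # e.g. "Tom Cruise's mother is Mary Lee Pfeiffer."
    prompt = f"{fact}\nGiven the above, {reverse_q}"
    return llm(prompt)

# Usage, with any completion function of your choice:
# answer_reversal(llm, "Who is Tom Cruise's mother?",
#                 "who is Mary Lee Pfeiffer's son?")
```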