
Comments

Or to point to a situation where LLMs exhibit unsafe behavior in a realistic usage scenario. We don't say

a problem with discussions of fire safety is that a direct counterargument to "balloon-framed wood buildings are safe" is to tell arsonists the best way that they can be lit on fire

BTW as a concrete note, you may want to sub in 15 - ceil(log10(n)) instead of just "15", which really only matters if you're dealing with numbers above 10 (e.g. 1000 is represented as 0x408F400000000000, while the next float 0x408F400000000001 is 1000.000000000000114, which differs in the 13th decimal place).
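
For concreteness, here's a minimal Python sketch of that rule (the helper name max_reliable_decimals is made up, and math.nextafter needs Python 3.9+):

    import math

    def max_reliable_decimals(n):
        # Sketch of the suggested rule: a double has roughly 15 significant
        # decimal digits in total, so the number of trustworthy digits after
        # the decimal point shrinks as the integer part grows.
        return 15 - math.ceil(math.log10(abs(n))) if abs(n) >= 1 else 15

    x = 1000.0
    print(f"{math.nextafter(x, math.inf):.18f}")  # 1000.000000000000113687
    print(max_reliable_decimals(x))               # 12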

That makes sense. I think I may have misjudged your post, as I expected that you would classify that kind of approach as a "duct tape" approach.

Checking a number's precision correctly is quite trivial, and there were one-line fixes I could have applied that would make the function work properly on all numbers, not just some of them.

I'm really curious about what such fixes look like. In my experience, those edge cases tend to come about when there is some set of mutually incompatible desired properties of a system, and the mutual incompatibility isn't obvious. For example:

  1. We want to use standard IEEE 754 floating point numbers to store our data.
  2. If two numbers are not equal to each other, they should not have the same string representation.
  3. The sum of two numbers should have a precision no higher than that of the more precise operand. For example, adding 0.1 + 0.2 should yield 0.3, not 0.30000000000000004.

It turns out those are mutually incompatible requirements!
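
You can see the conflict directly (a small Python demonstration, nothing here beyond standard float behavior):

    a, b = 0.1, 0.2
    s = a + b
    print(s == 0.3)    # False: the IEEE 754 sum is not the double closest to 0.3
    print(repr(s))     # 0.30000000000000004 -- requirement 2 forces printing this
    print(f"{s:.1f}")  # 0.3                 -- requirement 3 wants this instead
    # s and 0.3 are distinct doubles, so requirement 2 says they must print
    # differently, while requirement 3 says the sum should print as "0.3".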

You could say "we should drop requirement 1 and use a fixed-point or fraction datatype", but that's emphatically not a one-line change, and it has its own places where you'll run into mutually incompatible requirements.

Or you could add a "duct tape" solution like "use printf("%.2f", result) in the case where we actually ran into this problem, where we know both operands have two decimal places of precision, and revisit if this bug comes up again in a different context".
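
For concreteness, the duct-tape version might look something like this (a sketch; format_sum_2dp is a made-up name for whichever code path hit the bug):

    def format_sum_2dp(a, b):
        # Duct tape: in this one code path we know both operands carry two
        # decimal places, so fixed two-decimal formatting is safe here.
        return f"{a + b:.2f}"

    print(format_sum_2dp(0.10, 0.20))  # "0.30" rather than "0.30000000000000004"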

AGIs derived from the same model are likely to collaborate more effectively than humans because their weights are identical. Any fine-tune can be applied to all members, and text produced by one can be understood by all members.

I think this only holds if fine-tunes are composable, which as far as I can tell they aren't (fine-tuning on one task subtly degrades performance on a bunch of other tasks; that isn't a big deal if you fine-tune a little for performance on a few tasks, but it does mean you probably can't take a million independently fine-tuned models and merge them into a single super-model of the same size with the same performance on all million tasks).

Also, there are sometimes mornings where I can't understand code I wrote the previous night, when I had all of the necessary context fresh in my mind, despite being the same person. I expect that LLMs will exhibit the same behavior: some things will be hard to understand when examined out of the context that generated them.

That's not to say a world in which there are a billion copies of GPT-5 running concurrently will see no major changes, but I don't think a single coherent ASI falls out of that world.

If you use uBlock (or Adblock, or AdGuard, or anything else that uses EasyList syntax), you can add custom rules

    lesswrong.com##.NamesAttachedReactionsCommentBottom-footerReactionsRow
    lesswrong.com##.InlineReactHoverableHighlight-highlight:remove-class(InlineReactHoverableHighlight-highlight)

which will remove the reaction section underneath comments and the highlights corresponding to those reactions.

The former of these you can also add through the element picker.

It strikes me that there's a rather strong selection effect going on here. If someone has a contrarian position, and they happen to be both articulate and correct, they will convince others and the position will become less surprising over time.

The view that psychology and sociology research has systematic issues severe enough that you should just ignore most low-powered studies is no longer considered a contrarian view.

@the gears to ascension I see you reacted "10%" to the phrase "while (overwhelmingly likely) being non-scheming" in the context of the GPT-4V-based MAIA.

Does that mean you think there's a 90% chance that MAIA, as implemented today, is actually scheming? If so, that seems like a very bold prediction, and I'd be very interested to know why you predict that. Or am I misunderstanding what you mean by that react?

Do you want me to spoil it for you, do you want me to drop a hint, or do you want to puzzle it out yourself? It's a beautiful little puzzle and very satisfying to solve. Also note that the solution I found only works if you are given a graph with the structure above (i.e. every node is part of the lattice, and the lattice is fairly small in each dimension, and the lattice has edges rather than wrapping around).

Can you give a concrete example of a situation where you'd expect this sort of agreed-upon-by-multiple-parties code to be run, and what that code would be responsible for doing? I'm imagining something along the lines of "given a geographic boundary, determine which jurisdictions that boundary intersects for the purposes of various types of tax (sales, property, etc.)", but I don't know if that's wildly off from what you're imagining.
