I've been thinking through the following philosophical argument for the past several months.
1. Most things that currently exist have properties that allow them to continue existing for a significant amount of time and to propagate, since anything without such properties would have ceased to exist very quickly.
2. This implies that most things capable of gaining adaptations, such as humans, animals, species, ideas, and communities, have adaptations for continuing to exist.
3. This also includes decision-making systems and moral philosophies.
4. Therefore, one could model the morality of such things as tending towards the ideal of perfectly maintaining their own existence and propagating as much as possible.
Many of the consequences of this approximation of the morality of things seem quite interesting. For instance, the higher-order considerations of following an "ideal" moral system (that is, utilitarianism using a measure of one's own continued existence at a point in the future) lead to many of the same moral principles that humans actually have (e.g. cooperation, valuing truth) while also avoiding a lot of the traps of other systems (e.g. hedonism). This chain of thought has led me to believe that existence itself could be a principal component of real-life morality.
While it does have a lot of very interesting conclusions, I'm very concerned that if I were to write about it, I would receive 5 comments directing me to some passage by a respected figure that already discusses the argument, especially given its seemingly obvious structure. However, I've searched through LW and tried to research the literature as well as I can (through Google Scholar, Elicit, and Gemini, for instance), but I must not have the right keywords, since I've come up fairly empty, other than philosophers with vaguely similar-sounding arguments that don't actually get at the heart of the matter (e.g. Peter Singer's work comes up a few times, but he focused particularly on suffering rather than existence itself, and certainly didn't use any evolutionary-style arguments to reach that conclusion).
If this really hasn't been written about extensively anywhere, I would update towards believing the hypothesis that there's actually some fairly obvious flaw that renders it unsound, stopping it from getting past, say, the LW moderation process or the peer review process. As such, I suspect that there is some issue with it, but I've not really been able to pinpoint what exactly stops someone from using existence as the fundamental basis of moral reasoning.
Would anyone happen to know of links that do directly explore this topic? (Or, alternatively, does anyone have critiques of this view that would spare me the time of writing more about this if this isn't true?)
As for one more test, it was rather close on reversing 400 numbers:
Given these results, it seems pretty obvious that this is a rather advanced model (although Claude Opus was able to do it perfectly, so it may not be SOTA).
Going back to the original question of where this model came from, I have trouble putting the chance that it actually comes from OpenAI above 50%, mainly due to questions about how exactly it was publicized. It seems like a strange choice to release an unannounced model on Chatbot Arena, especially without any associated update to the model registry on GitHub (which would be in https://github.com/lm-sys/FastChat/blob/851ef88a4c2a5dd5fa3bcadd9150f4a1f9e84af1/fastchat/model/model_registry.py#L228 ). However, I think I still have some pretty large error margins, given how little information I can really find.
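For reference, new models seem to get added to that file via `register_model_info` calls, so (if I'm reading it right) an entry for this model would look something like the sketch below; the link and description here are placeholders I made up, since no such entry exists:

```python
# Hypothetical entry in fastchat/model/model_registry.py, mirroring the
# existing register_model_info() calls; the link and description below are
# placeholders, not anything that actually appears in the repo.
register_model_info(
    ["gpt2-chatbot"],
    "gpt2-chatbot",
    "https://example.com/placeholder",
    "Unannounced model currently appearing in Chatbot Arena",
)
```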
OK, what actually happened was that I didn't realize the provided link doesn't go directly to gpt2-chatbot (instead, the front page just compares two random chatbots from a list). After figuring that out, I reran my tests; it was able to reverse 20, 40, and 100 numbers perfectly.
I've retracted my previous comments.
Interesting; maybe it's an artifact of how we formatted our questions? Or, potentially, the training samples with larger ranges of numbers were higher quality? You could try it the way I did in this failing example:
When I tried this same list with your prompt, both responses were incorrect:
Using @Sergii's list-reversal benchmark, this model seems to fail at reversing a list of 10 random numbers from 1-10 (taken from random.org) about half the time. This is compared to GPT-4's reported ability to reverse lists of 20 numbers fairly well, and ChatGPT 3.5 seemed to have no trouble either, although since it isn't a base model, this comparison could potentially be invalid.
This does significantly update me towards believing that this is probably not better than GPT-4.
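In case anyone wants to rerun this quickly, here's a minimal sketch of how the check can be automated; the prompt wording below is just an example rather than the exact phrasing I used:

```python
# Minimal sketch of the list-reversal check: generate a random list, build a
# prompt for the model, and verify the model's answer against the true reversal.
import random

def make_list(n: int, lo: int = 1, hi: int = 10) -> list[int]:
    return [random.randint(lo, hi) for _ in range(n)]

def check_reversal(original: list[int], model_answer: str) -> bool:
    # Parse whatever comma/space-separated numbers the model returned.
    try:
        parsed = [int(tok) for tok in model_answer.replace(",", " ").split()]
    except ValueError:
        return False
    return parsed == list(reversed(original))

nums = make_list(10)
prompt = "Reverse this list of numbers: " + ", ".join(map(str, nums))
# Send `prompt` to the model under test, then score the response with:
# check_reversal(nums, response_text)
```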
I looked a little into the literature on how much alcohol consumption actually affects rates of oral cancers in populations with ALDH polymorphism, and this particular study (found in this meta-analysis) seems helpful for modelling how the likelihood of oral cancer increases with alcohol consumption for this group of people.
The specific categories of drinking frequency aren't especially convenient here, given that participants were split between:

- drinking <=4 days a week,
- drinking >=5 days a week with less than 46 g of ethanol per week, and
- drinking >=5 days a week with more than 46 g of ethanol per week.

Only in the last category was there a significant increase in oral cancer rates (4.4x), although there is some non-significant evidence for roughly a 1.5x increase in the high-moderate and moderate groups. Comparing the ~77 mg/week ingestion rate from the document to the 10-40 g/week range I would estimate for the high-moderate group, I would imagine there is probably a much more minor effect for Lumina (if I had to estimate, maybe something like a 1.1x risk, which might be offset by the benefits of lower levels of lactic acid at that level).
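To make that gap explicit, here's a quick back-of-envelope comparison (the 10-40 g/week figure is just my own estimate from above, not something reported in the study):

```python
# Rough weekly ethanol exposure: Lumina's ~77 mg/week vs. the 10-40 g/week
# I'm estimating for the study's high-moderate drinking group.
lumina_g_per_week = 0.077                 # ~77 mg/week, converted to grams
estimated_moderate_g_per_week = (10, 40)  # my estimate, not a study figure

for grams in estimated_moderate_g_per_week:
    print(f"{grams} g/week is ~{grams / lumina_g_per_week:.0f}x Lumina's exposure")
# Prints roughly 130x and 519x, which is why I'd guess the added oral cancer
# risk for Lumina users sits well below even the ~1.5x of the moderate groups.
```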
One other argument against this (which I hold with lower confidence, given that I only have a basic intuition for enzyme kinetics) is that, since people probably ingest more than 10 milligrams of alcohol with every gulp of an alcoholic beverage, ALDH2 deficiency would be much more of a bottleneck when acetaldehyde levels rise rapidly after drinking than it would be for the rather low background generation of acetaldehyde we might see in Lumina users.
Since I'm being a bit of a "man of one study" here, I'd be interested to see whether you've found any studies of your own that demonstrate a larger effect of alcohol consumption on oral cancer for people with ALDH2 deficiency than the ones I've been listing here.
One other interesting quirk of your model of green is that most of the central (and natural) examples of green for humans appear to involve the utility-function box adapting to these stimulating experiences, so that the utility function becomes positively correlated with the way latent variables change over the course of an experience. In other words, the utility function gets "attuned" to the result of that experience.
For instance, taking the Zadie Smith example from the essay, her experience of greenness involved starting to appreciate the effect that Mitchell's music had on her, as opposed to starting to dislike it. Environmentalist greenness, in the same vein, might arise from humans' utility boxes attuning to the ongoing processes of life, leading to a wish for them to continue.
Notably, I can't really think of any examples where green alone goes against one of these processes in humans; in most examples of people being "attuned" against an experience, the cause is simply that the experience conflicts with a separate, already-existing goal. While disgust does technically conflict with what changes during the process of becoming physically sick, I can't think of any reason that might occur other than that it prevents a human from achieving goals (black, or perhaps red). Desires for immortality, while conflicting with the process of death, seem to mostly extend from a red desire to have things continue to live (which would, by this model, itself be a desire that stemmed from green). If I try to think of some experience that would be completely uncorrelated with a human's prior preferences (e.g. an infant looking at a flowing river for the first time from a distance), it doesn't seem natural to imagine the human suddenly disliking it in any particular circumstance (the infant wouldn't start despising flowing rivers), but I could still see a small chance of them beginning to appreciate it (as long as I'm not missing some obvious counterexample).
This natural positive correlation (or "attunement", "appreciation", or, for especially spicy takes, "alignment"), if I had to guess, could be explained by humans simply gaining reward from expanding their world models (maybe this just simplifies to "humans naturally like learning", but that feels a little anticlimactically blue). Alternatively, attunement, as opposed to indifference, might be created just by some minor positive association generated at deeper levels in the brain, although that would imply that green could just be a consequence of red for a human's utility-function box.
As for your first question, there are certainly other thought systems (or I suppose decision theories) that allow a thing to propagate itself, but I highlight a hypothetical decision theory that would be ideal in this respect. Of course, given that things are different from each other (as you mention), this ideal decision theory would necessarily be different for each of them.
Additionally, since the ideal decision theory for self-propagation is computationally intractable to follow, "the most virulent form" isn't[1] actually useful for anything that currently exists. Instead, we see more computationally tractable propagation-based decision theories built on messy heuristics that happened to correlate with continued existence in the environments where those heuristics were able to develop.
For your final question, I don't think this theory explains initial conditions like why there are several things in the universe at all. Other processes analogous to random mutation, allopatric speciation, and spontaneous creation (applied not only to species, but also to ideas, communities, etc.) would be better suited for answering such questions. "Propagative decision theory" does have some implications for the decision theories of things that can actually follow a decision theory, and it gives a fairly solid indicator on otherwise unsolvable/controversial moral quandaries (e.g. insect suffering), but beyond that it only really helps as much as evolutionary psychology when it comes to explaining properties that already exist.
[1] Other than in the case where some highly intelligent being manages to apply this theory well enough to pursue things like the instrumental convergence that the ideal theory would prioritize, in which case this paragraph suddenly stops applying.