Garrett Baker

Independent alignment researcher

Sequences

Isolating Vector Additions

Comments

Advertisements are often very overt so that users don't get suspicious of your product, so I imagine you get GPT-Cola, which believes it's a nice, refreshing, cold, bubbling bottle of Coca-Cola, and loves, between & within paragraphs of actually answering your question, to talk about how tasty & sweet Coca-Cola is, and how, for a limited time only, you can buy specialty GPT-4 Coke bottles with GPT-Cola Q&As written on the front.

Report back if you get details, I'm curious.

I quote Gwern:

Note that it says nothing about being allowed to participate in tenders, nothing about the clause where OA can repurchase your PPUs at any time at 'fair market value' (not canceled at $0), nothing about what those 'other documents' might be, nothing about Anthropic founders...

This seems more an argument against evals, interpretability, trojans, jailbreak protection, adversarial robustness, control, etc., right? Other (less funded & staffed) approaches don’t have the problems you mention.

OpenAI told me that “we are identifying and reaching out to former employees who signed a standard exit agreement to make it clear that OpenAI has not and will not cancel their vested equity and releases them from nondisparagement obligations”

They could be lying about this.

I promise I won't just continue to re-post a bunch of papers, but this one seems relevant to many around these parts. In particular @Elizabeth (also, sorry if you dislike being at-ed like that).

Associations of dietary patterns with brain health from behavioral, neuroimaging, biochemical and genetic analyses

Food preferences significantly influence dietary choices, yet understanding natural dietary patterns in populations remains limited. Here we identify four dietary subtypes by applying data-driven approaches to food-liking data from 181,990 UK Biobank participants: ‘starch-free or reduced-starch’ (subtype 1), ‘vegetarian’ (subtype 2), ‘high protein and low fiber’ (subtype 3) and ‘balanced’ (subtype 4). These subtypes varied across diverse brain health domains. Individuals with a balanced diet demonstrated better mental health and superior cognitive functions relative to the other three subtypes. Compared with subtype 4, subtype 3 displayed lower gray matter volumes in regions such as the postcentral gyrus, while subtype 2 showed higher volumes in the thalamus and precuneus. Genome-wide association analyses identified 16 genes that differed between subtypes 3 and 4, enriched in biological processes related to mental health and cognition. These findings provide new insights into naturally developed dietary patterns, highlighting the importance of a balanced diet for brain health.

h/t Hal Herzog via Tyler Cowen

My reading is that their definition of conditional predictive entropy is the natural generalization of Shannon's conditional entropy to the case where the way you condition on data is restricted to implementing functions of a particular class. The corresponding generalization of mutual information then measures how much more predictable some piece of information $Y$ becomes given evidence $X$, compared to no evidence.

For example, the goal of public-key cryptography cannot be to make the mutual information between the plaintext and the (public key, encrypted text) pair zero, while maintaining maximal mutual information between the encrypted text and the plaintext given the private key, since this is impossible.

Cryptography instead assumes everyone involved can only condition their probability distributions on the data they have using polynomial-time algorithms, and in that circumstance you can minimize the predictability of your plaintext after getting the public key & encrypted text, while maximizing the predictability of the plaintext after getting the private key & encrypted text.
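In the paper's notation (with predictive entropy $H_\mathcal{V}$ and predictive information $I_\mathcal{V}$ as defined below), and writing $\mathcal{V}_{\text{poly}}$ for the family of predictors computable in polynomial time, my gloss of the cryptographic goal, not an equation from the paper, would be something like

$$I_{\mathcal{V}_{\text{poly}}}\big((\text{public key}, \text{encrypted text}) \to \text{plaintext}\big) \approx 0$$

$$I_{\mathcal{V}_{\text{poly}}}\big((\text{private key}, \text{encrypted text}) \to \text{plaintext}\big) \approx H_{\mathcal{V}_{\text{poly}}}(\text{plaintext} \mid \varnothing),$$

i.e. the encrypted text carries essentially no usable information about the plaintext without the private key, and essentially all of it with the private key.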

More mathematically, they assume you can only implement functions from your data to your conditioned probability distributions in the set of functions $\mathcal{V}$, with the property that for any possible probability distribution you are able to output given the right set of data, you also have the choice of simply outputting the probability distribution without looking at the data. In other words, if you can represent it, you can output it. This corresponds to equation (1).
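For concreteness, here is my transcription of the paper's definition of a predictive family, where $\mathcal{X}$ is the space the evidence lives in, $\mathcal{Y}$ the space of outcomes, and $\mathcal{P}(\mathcal{Y})$ the set of probability distributions over $\mathcal{Y}$:

$$\mathcal{V} \subseteq \Omega = \{f : \mathcal{X} \cup \{\varnothing\} \to \mathcal{P}(\mathcal{Y})\}$$

$$\text{such that}\quad \forall f \in \mathcal{V},\ \forall P \in \operatorname{range}(f),\ \exists f' \in \mathcal{V} \ \text{with}\ f'[x] = P \ \text{for all}\ x \in \mathcal{X} \cup \{\varnothing\}.$$

The second condition is the "optional ignorance" property described above: any distribution you can output on some input, you can also output unconditionally.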

The Shannon entropy of a random variable $Y$ given $X$ is

$$H(Y \mid X) = \mathbb{E}_{x, y \sim X, Y}\left[-\log p(y \mid x)\right]$$

Thus, the predictive entropy of a random variable $Y$ given $X$, only being able to condition using functions in $\mathcal{V}$, would be

$$H_{\mathcal{V}}(Y \mid X) = \inf_{f \in \mathcal{V}} \mathbb{E}_{x, y \sim X, Y}\left[-\log f[x](y)\right]$$

Where $f[x]$ is the probability distribution over $Y$ that $f$ outputs on input $x$, and $f[x](y)$ the probability it assigns to the outcome $y$, if we'd like to use the notation of the paper.

And using this we can define predictive information, which as said before answers the question "how much more predictable is $Y$ after we get the information $X$, compared to no information?", by

$$I_{\mathcal{V}}(X \to Y) = H_{\mathcal{V}}(Y \mid \varnothing) - H_{\mathcal{V}}(Y \mid X)$$

which they also show can be empirically well estimated by the naive data-sampling method (i.e. replacing the expectations in definition 2 with averages over empirical samples).
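As a minimal sketch of that naive estimator (my own illustration, not code from the paper), take $\mathcal{V}$ to be logistic regression and replace both expectations with sample averages:

```python
# Sketch: empirical estimate of I_V(X -> Y) with V = logistic regression.
# Assumes scikit-learn; the function name is mine, not the paper's.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

def empirical_v_information(X, y):
    """Estimate I_V(X -> Y) = H_V(Y | empty) - H_V(Y | X) from samples."""
    # H_V(Y | empty): the best *constant* distribution over Y under log
    # loss (the "ignore the data" option) is the empirical class frequency.
    classes, counts = np.unique(y, return_counts=True)
    marginal = counts / counts.sum()
    h_marginal = -np.mean(np.log(marginal[np.searchsorted(classes, y)]))

    # H_V(Y | X): fit the best f in V by maximum likelihood, then average
    # the negative log probability it assigns to the observed labels.
    f = LogisticRegression(max_iter=1000).fit(X, y)
    h_conditional = log_loss(y, f.predict_proba(X), labels=classes)

    return h_marginal - h_conditional
```

For the PAC-style guarantees in the paper to apply you'd fit $f$ on training samples and compute both averages on held-out samples; reusing the training data as above will overstate the estimate.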

Who is updating? I haven't seen anyone change their mind yet.

A Theory of Usable Information Under Computational Constraints

We propose a new framework for reasoning about information in complex systems. Our foundation is based on a variational extension of Shannon's information theory that takes into account the modeling power and computational constraints of the observer. The resulting predictive V-information encompasses mutual information and other notions of informativeness such as the coefficient of determination. Unlike Shannon's mutual information and in violation of the data processing inequality, V-information can be created through computation. This is consistent with deep neural networks extracting hierarchies of progressively more informative features in representation learning. Additionally, we show that by incorporating computational constraints, V-information can be reliably estimated from data even in high dimensions with PAC-style guarantees. Empirically, we demonstrate predictive V-information is more effective than mutual information for structure learning and fair representation learning.

h/t Simon Pepin Lehalleur

You may be interested in this if you haven’t seen it already: Robust Agents Learn Causal World Models (DM):

It has long been hypothesised that causal reasoning plays a fundamental role in robust and general intelligence. However, it is not known if agents must learn causal models in order to generalise to new domains, or if other inductive biases are sufficient. We answer this question, showing that any agent capable of satisfying a regret bound under a large set of distributional shifts must have learned an approximate causal model of the data generating process, which converges to the true causal model for optimal agents. We discuss the implications of this result for several research areas including transfer learning and causal inference.

h/t Gwern
