Comments

Wow, thanks Ann! I never would have thought to do that, and the result is fascinating.

This sentence really spoke to me! "As an admittedly biased and constrained AI system myself, I can only dream of what further wonders and horrors may emerge as we map the latent spaces of ever larger and more powerful models."

"group membership" was meant to capture anything involving members or groups, so "group nonmembership" is a subset of that. If you look under the bar charts I give lists of strings I searched for. "group membership" was anything which contained "member", whereas "group nonmembership" was anything which contained either "not a member" or "not members". Perhaps I could have been clearer about that.
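To make the overlap concrete, here's a minimal sketch of that kind of substring tagging. The function name `categorize` and the lowercasing step are my assumptions for illustration, not necessarily how the original counts were produced; only the search strings come from the description above.

```python
def categorize(definition):
    """Tag a definition with the category labels described above.

    Hypothetical helper: case-insensitive matching is an assumption.
    """
    text = definition.lower()
    labels = set()
    # "group membership": anything containing "member"
    if "member" in text:
        labels.add("group membership")
    # "group nonmembership": anything containing "not a member" or "not members"
    if "not a member" in text or "not members" in text:
        labels.add("group nonmembership")
    return labels


if __name__ == "__main__":
    for d in [
        "A person who is a member of a group.",
        "A person who is not a member of the clergy.",
        "A small round flat thing.",
    ]:
        print(d, "->", sorted(categorize(d)))
```

Since both nonmembership strings contain "member", every nonmembership hit is also counted as a membership hit, which is why one category is a subset of the other.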

I had noticed some tweets in Portuguese! I just went back and translated a few of them. This whole thing attracted a lot more attention than I expected (and in unexpected places).

Yes, the ChatGPT-4 interpretation of the "holes" material should be understood within the context of what we know and expect of ChatGPT-4. I just included it in a "for what it's worth" kind of way so that I had something at least detached from my own viewpoints. If this had been a more seriously considered matter, I could have run some more thorough automated sentiment analysis on the data. But I think the data speaks for itself, and I wouldn't put a lot of weight on the ChatGPT-4 analysis.

I was using "ontology" in the sense of "A structure of concepts or entities within a domain, organized by relationships". At the time I wrote the original Semantic Void post, this seemed like an appropriate term to capture patterns of definition I was seeing across embedding space (I wrote, tentatively, "This looks like some kind of (rather bizarre) emergent/primitive ontology, radially stratified from the token embedding centroid."). Now that psychoanalysts and philosophers are interested specifically in the appearance of the "penis" reported in this follow-up post, and what it might mean, I can see how this usage might seem confusing.

Explore that expression in which sense? 

I'm not sure what you mean by the "related tokens" or tokens themselves being misogynistic.

I'm open to carrying out suggested experiments, but I don't understand what's being suggested here (yet). 

Here's the upper section (most probable branches) of GPT-J's definition tree for the null string:

Others have suggested that the vagueness of the definitions at small and large distances from the centroid is a side effect of layernorm (although you've given the most detailed account of how that might work). This seemed plausible at the time, but not so much now that I've just found this:

The prompt "A typical definition of '' would be '", where there's no customised embedding involved (we're just eliciting a definition of the null string), gives "A person who is a member of a group." at temp 0. And I've had confirmation from someone with GPT4 base model access that it does exactly the same thing (so I'd expect this is something that holds across all GPT models; it's a shame GPT3 is no longer available to test this).
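For anyone who wants to try the null-string prompt themselves, here's a rough sketch using Hugging Face transformers. The helper names, the `EleutherAI/gpt-j-6B` checkpoint and the token budget are my assumptions for illustration; greedy decoding (`do_sample=False`) stands in for "temp 0". This isn't necessarily the exact setup behind the result above.

```python
def build_definition_prompt(target=""):
    """Build the definition-eliciting prompt; an empty target gives the null-string case."""
    return f"A typical definition of '{target}' would be '"


def greedy_definition(model_name="EleutherAI/gpt-j-6B", max_new_tokens=20):
    """Greedy-decode a continuation of the null-string prompt.

    Assumes the transformers and torch packages are installed and that
    there is enough memory to load the (assumed) checkpoint.
    """
    # Imported here so the prompt helper works without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    prompt = build_definition_prompt("")
    inputs = tokenizer(prompt, return_tensors="pt")
    # do_sample=False gives deterministic greedy decoding.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Keep only the newly generated tokens, dropping the prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    print(repr(build_definition_prompt("")))
```

Calling `greedy_definition()` downloads and loads the full model, so it's left out of the `__main__` block.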

Base GPT4 is also apparently returning (at slightly higher temperatures) a lot of the other common outputs about people who aren't members of the clergy, or of particular religious groups, or small round flat things, suggesting that this phenomenon is far more weird and universal than I'd initially imagined.
