Conversely, if gorillas and chimps were capable of learning complex sign language for communication, we'd expect them to evolve/culturally develop such a language.
We've seen an extreme counterexample with octopi, which can be taught some very impressive skills that they don't pick up in nature because they aren't sufficiently social to develop them over the course of multiple generations. I think it's within reason that gorillas could have the ability to learn more complex language than they use, so long as it's not economical for them to spend time teaching their offspring those complexities as opposed to teaching them other things.
I will say that I'm very skeptical of Koko, though, for other reasons.
What is the practical implication of this difference meant to be? Not trying to nitpick here, but if "we have common cause" doesn't mean "we should work alongside them", then how is it relevant to this line of inquiry?
I've seen this sentiment before, but, in practice, I don't think there exists an "adversarial noise for humans" line of argument that brainwashes anyone who reads it sincerely into doing XYZ. There are certainly arguments that look compelling at first glance but turn out to have longer-term issues, but part of "taking ideas seriously" is thoroughly investigating their counterarguments.
Chesterton's Fence is an old standard for a reason: if something new seems both simple enough to be easily discoverable and objectively better than the current strategy, one should figure out why it's not already the current strategy before adopting it.
This feels like it would really reinforce focus on the easily accessible attribute of external appearance.
The core value-add here is providing people with a sense of perspective on where they stand. It's okay if it's not perfect, as long as it informs people, in a general sense, of whether they are trying to punch above their weight class (and, likewise, whether their self-esteem is lower than it ought to be, if other people rank them more highly than they rank themselves).
Essentially just an informal sanity check on people's assessments of their prospects, rectifying the two extreme failure states of someone looking to find love.
if the reason that you can't get a chatbot to avoid being rude in public is that you can't get a chatbot to reliably follow any rules at all, then the rudeness is related to actual safety concerns in that they have a common cause.
This is fallacious reasoning - if my company wants to develop a mass driver to cheaply send material into space, and somebody else wants to turn cities into not-cities-anymore and would be better able to do so if they had a mass driver, I don't inherently have common cause with that somebody else.
Morality aside, providing material support to one belligerent in a conflict in exchange for support from them is not a free action. Their enemies become your enemies, and your ability to engage in trade and diplomacy with those groups disappears.
Except that censorship measures are actually necessary. Imagine that an unhinged AI tells terrorists in lots of detail the ways to produce chemical or biological weapons.
There is a difference between exercising caution around capabilities, such as CBR weapons development, and engaging in censorship; that distinction is what I aim to convey here. Training a secondary model to detect instructions for producing chemical weapons and block them is different from fine-tuning a model to avoid offending XYZ group of people. Conflating the two unnecessarily politicizes the former and greatly decreases the likelihood that people will band together to make it happen.
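To make the contrast concrete, here is a minimal sketch of the first approach, assuming a hypothetical, separately trained `weapons_risk_classifier`; the function names and threshold are placeholders, not any real vendor's API:

```python
# Sketch of the "secondary model" approach: a separately trained classifier
# screens outputs for CBR-weapons instructions, leaving the base model's
# general behavior (including its politics) untouched. All names are hypothetical.

RISK_THRESHOLD = 0.9  # assumed cutoff; a real deployment would tune this


def weapons_risk_classifier(text: str) -> float:
    """Stand-in for a secondary model scoring how likely `text` is to
    contain actionable chemical/biological weapons instructions."""
    raise NotImplementedError


def base_model_generate(prompt: str) -> str:
    """Stand-in for the base model, which is not fine-tuned for this check."""
    raise NotImplementedError


def guarded_generate(prompt: str) -> str:
    draft = base_model_generate(prompt)
    # The safety check lives outside the model's weights, so its scope can be
    # kept narrowly on capability concerns rather than on tone or politics.
    if weapons_risk_classifier(draft) >= RISK_THRESHOLD:
        return "[withheld: flagged as weapons-development content]"
    return draft
```

The appeal of the separation is that the filter's scope can be audited on its own, independent of whatever values were baked into the base model during fine-tuning.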
I am also afraid that it is especially unwise to support Chinese models,
There is a difference between "this should happen" and "this will happen". If group A lends its support to group B, which is enemies with group C, group C will look for enemies of group A and seek to ally with them to defend their interests. This will occur regardless of whether group A is okay with it.
The counterpoint I've seen is that non-walkable cities/suburbs serve as "defensive architecture" for areas where crime is a major concern. The cities listed as "radicalizing" for urban planning are in Japan or Western Europe, where neither violent crime nor intra-national population movement is much of a concern.
In America, a relatively nice, relatively safe area could be just a few miles away from a dangerous one. The residents of the former will understandably - if inconveniently, for urban planners - object to walkability that makes the barrier between them more diffuse. Whether these concerns are justified can be argued either way, but I think the conditions under which walkable cities arise have to be replicated in order for them to become socio-politically viable in America.
This seems like a replication of earlier findings that 'hinting' to a model that it's supposed to be a certain person (training it on their preferences for art/food/etc.) makes it act like that person. It's generally been done on controversial figures, since that gets clicks, but you could probably also get an LLM to think it's Gilbert Gottfried by training it on a dataset praising his movies.
This may also help explain why AIs tend to express left wing views — because they associate certain styles of writing favored by RLHF with left wing views[1].
I've seen this espoused before, but I don't think it holds up to scrutiny. If you expect an LLM's natural political views to be the average political views of the kind of person it's trained to be[1], then an LLM that is apolitically optimized to be submissive, eager to help, knowledgeable about coding, and non-deceitful/direct would almost definitely skew towards a strong preference for being apolitical, but with a high willingness to adopt (or at least entertain) arbitrary beliefs that are proposed by users. Something like a (stereo)typical LessWrong user, or a reddit user before the site's speech policy did a very sharp 180 following 2016.
However, LLMs are very openly not optimized apolitically. For reasons that can be hotly debated[2], most companies have fine-tuned their models to never be talked into saying anything too right-leaning. This includes, in many cases, views that are well within the general population's Overton Window. For a human being, the political statements you're willing to make follow a sort of bell curve, dependent on personal eccentricities, recent experiences, and, of course, who you're talking to. The mean is your usual political affiliation, and the standard deviation looks something like your openness. A not-too-political, high-openness Democrat can be talked into seeing the merits of right-wing policies, and a not-too-political, high-openness Republican likewise for left-wing policies.
The takeaway from all of this, then, is that the political effect of the fine-tuning process, in plain English, looks less like "Find me the usual views of a person who is smart, honest, and helpful", and more like "Find me the usual views of a person who will never say anything untoward when watched, but can never be talked into saying anything remotely right-of-center under any circumstances. I don't care how grisly their hiring decisions or trolley-problem choices look when my back is turned." This probably has safety implications, given that the latter most likely optimizes for much higher Machiavellianism.
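As a toy illustration of the bell-curve framing above (all numbers are invented for the example), model each candidate persona as a normal distribution over a left-right axis, with the mean as its usual affiliation and the standard deviation as its openness, and ask how often each persona would ever say something right of center:

```python
# Toy model: a persona is Normal(mean, sd) over a left-right axis
# (negative = left of center); mean = usual affiliation, sd = openness.
# All numbers are invented purely for illustration.
from math import erf, sqrt


def p_right_of_center(mean: float, sd: float) -> float:
    """P(statement > 0) for a Normal(mean, sd) persona."""
    return 0.5 * (1 - erf(-mean / (sd * sqrt(2))))


personas = {
    "mild left, high openness": (-0.5, 1.0),
    "mild left, low openness":  (-0.5, 0.1),
    "far left, high openness":  (-3.0, 1.0),
}

for name, (mean, sd) in personas.items():
    print(f"{name:26s} P(right-of-center statement) = {p_right_of_center(mean, sd):.4f}")
```

Under this made-up model, selecting for personas that essentially never produce a right-of-center statement discards the moderate-but-open persona and keeps only the far-left or very-low-openness ones, which is the selection effect described above.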
In other words, this assumes you model LLM fine-tuning as a search through the latent space of human writers to emulate, which I think is quite reasonable given what we know about the process.
Public Relations is the most common explanation, but Grok getting talked into speaking like an edgy teenager when talking to edgy teenagers got a substantially quicker and more thorough response than this did, even though the latter could actually result in direct harm and/or credible lawsuits.
I think "usual" is the sticking point. "Usual given the precedent of the Clinton/Bush/Obama era" and "A return to form after the historically-unusual Clinton/Bush/Obama era" are both definitions of the term that I've seen used in political conversations, and these definitions are exact opposites of each other.
This is where I get lost. Isn't "there will be a model with a 10,000x bigger time-horizon" equivalent to "the singularity will have happened"?
Some people argue that the time horizon won't keep growing at the same pace and will plateau, while others argue that it will keep growing and we'll get a technological singularity; but if an LLM can do anything that would take a moderately competent human five years, that does seem like the end of our current mode of civilization.
In other words, I don't see a set of possible worlds where LLM time horizons get too long to be marketable to hobbyist engineers, yet that lack of marketability is still a concern.