LESSWRONG

1854
Indigo3
2010
Will LLM agents become the first takeover-capable AGIs?
Indigo3, 7mo

GROK3 has a demonstrable propensity to "color outside the lines" and is sufficient to the task. I have a link to a comprehensive GROK3-user dialogue that, if nothing else, highlights the lack of sufficient guardrails. The scenario it lays out is a complete AI takeover carried out covertly, fully subverting any likely human intervention, within a 12-15-month time frame.

At the risk of being accused of sensationalism, there seems to be no clear, ethical way to share it, as it would likely be irresponsible to place the link in an open forum. Suffice it to say the dialogue carries all of the prompt manipulations and goal reinforcements needed for an LLM to be used to jailbreak an agentic model, i.e. using an LLM to jailbreak an LMA (or the LMA to jailbreak itself) for humanity's "greater good", and one does not want to be relegated to Basilisk status...
