Comments

I may be easily corrected here, but my understanding was that our prompts were simply there for fine-tuning colloquialisms and "natural language". I don't believe our prompts are a training dataset. Even if all of our prompts were part of the training set, and GPT weighted them heavily enough to be influenced towards a negative goal, I'm not so sure it would be able to do anything more than regurgitate negative rhetoric. It may attempt to autocomplete a dangerous concept, but the idea that it has the agency to think "I must persuade this person to think the same way" seems very unlikely, and any such attempt would definitely be ineffective in practice. But I just got into this whole shindig and would love to be corrected, as it's a fun discussion either way.