This paper has me genuinely excited, because I think it could be the right direction to look in when we think about introspection in something as complex as large language models. I've been taking a similar approach to investigating this phenomenon, but from a different angle. While looking into how system reminders affect model behavior, I noticed that a simple punctuation change to the same prompt, with the same seed, could produce wildly different outputs. To test this, I ran experiments on Llama 3.1 8B wi...
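The setup I'm describing, two generations that differ only in a trailing punctuation mark, compared under a fixed seed, can be sketched roughly like this. The `generate` function below is a placeholder with made-up outputs standing in for an actual seeded Llama 3.1 8B call, and `difflib.SequenceMatcher` is just one cheap way to quantify how far two outputs diverge; it's a sketch of the comparison, not the experiment itself:

```python
import difflib

def generate(prompt: str, seed: int = 42) -> str:
    """Placeholder for a seeded model call (e.g., Llama 3.1 8B).

    The canned outputs are hypothetical, chosen only to illustrate
    the kind of divergence a one-character prompt change can cause.
    """
    canned = {
        "List three colors.": "Red, green, and blue.",
        "List three colors!": "Sure! How about crimson, teal, and gold?",
    }
    return canned[prompt]

def divergence(a: str, b: str) -> float:
    """0.0 means identical outputs, 1.0 means completely different."""
    return 1.0 - difflib.SequenceMatcher(None, a, b).ratio()

# Same prompt up to the final punctuation mark, same seed.
out_period = generate("List three colors.")
out_bang = generate("List three colors!")
print(divergence(out_period, out_bang))
```

In a real run you'd swap `generate` for an actual model call with the sampler seeded identically on both sides, so the only varying input is the punctuation.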