Can two “identical” AIs develop radically different personalities even without rewards or external pressure? In a 100-run developmental experiment I ran, the answer looks like yes. Across 100 twin runs: • Shared neural dynamics? Mild but consistent coupling. • Shared emotions? Near-zero correlation on average: affective trajectories diverge. •...
Summary: Over the past several months, I’ve been prototyping a developmental alternative to RLHF-based alignment. Instead of treating agents as static optimizers whose behavior is shaped by reward signals, this approach models growth, self-organization, and developmental constraints inspired by early cognitive systems. This week, the system, called Twins V3, reached...
Current alignment methods (RLHF, Constitutional AI, etc.) create reproducible behavioral artifacts at their safety boundaries: patterns like over-apologizing, self-negation, and incoherent self-description. This paper proposes a five-part taxonomy of these "alignment stress signatures", showing how they emerge from the structure of current alignment architectures rather...
Author’s Note: This paper introduces the Hybrid Reflective Learning System (HRLS), a framework for transforming AI safety from fear-based compliance into guided ethical comprehension. HRLS reframes “unsafe” curiosity as teachable data rather than a risk to suppress. Feedback is deeply welcome from the AI alignment, ethics, and cognitive-architecture communities. Abstract: Current large...