I am posting here as part of an effort to take immediate, actionable steps toward AI alignment (the Ghost Scale UX Framework), and to use a novel cognitive model to highlight a specific sycophancy loop in RLHF and suggest a solution based on Cooperative Inverse Reinforcement Learning.
Below is the abstract of a new preprint that formalizes the biological failure states of generative AI using the Free Energy Principle and Inverse Reinforcement Learning.
I am requesting epistemic feedback. Please check my work on the alignment mechanics.
The Institutional Artifact (Zenodo DOI): https://doi.org/10.5281/zenodo.19407790
The Interactive Web Framework: abrahamhaskins.org/art
Abstract:
Generative AI inherently triggers a computational failure mode in human observers, a "generative crash," caused by the absence of the latent intentionality required for Inverse Reinforcement Learning (IRL) convergence. Artistic appreciation operates as the biological execution of this IRL process. To address the generative crash and broader AI alignment failures, I introduce the Ghost Scale (an HCI cognitive affordance for identifying intentionality) and propose Cooperative Inverse Reinforcement Learning (CIRL) to mimic biological value transmission. The Intent Extraction Limit is formalized to define the relationship described above. Applying the proposed model suggests a direction for solving two major issues: generative AI's friction with the art community, and AI alignment.
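To make the IRL-convergence claim concrete, here is a minimal toy sketch (my own illustration, not the preprint's formalism or its Intent Extraction Limit) of Bayesian intent inference. An observer scores a handful of candidate intents, modeled as Boltzmann-rational policies, against observed actions, alongside a "no latent intent" hypothesis. The candidate intents, the rationality parameter beta, and the function names (policy, infer) are hypothetical assumptions introduced only for this sketch. When observations come from an agent with a latent intent, the posterior should concentrate on that intent; when they come from an intent-free generator, it should concentrate on "no latent intent," i.e., intent extraction fails.

```python
import numpy as np

rng = np.random.default_rng(0)
n_actions, beta, T = 5, 3.0, 50

# Candidate latent intents: each is a reward vector over actions (hypothetical values).
intents = rng.normal(size=(4, n_actions))

def policy(reward):
    """Boltzmann-rational action distribution for a given reward vector."""
    z = np.exp(beta * (reward - reward.max()))
    return z / z.sum()

# Hypothesis set the observer reasons over: four intents plus a
# "no latent intent" (uniform-generator) hypothesis.
hypotheses = [policy(r) for r in intents] + [np.full(n_actions, 1.0 / n_actions)]
labels = [f"intent {i}" for i in range(len(intents))] + ["no latent intent"]

def infer(observations):
    """Bayesian update over hypotheses given observed actions; returns the posterior."""
    log_post = np.zeros(len(hypotheses))  # uniform prior, in log space
    for a in observations:
        log_post += np.log([h[a] for h in hypotheses])
    post = np.exp(log_post - log_post.max())
    return post / post.sum()

# Source A: an agent actually pursuing intent 0.
intentional = rng.choice(n_actions, size=T, p=hypotheses[0])
# Source B: an intent-free generator (uniform over actions).
intent_free = rng.integers(n_actions, size=T)

for name, obs in [("intentional source", intentional), ("intent-free source", intent_free)]:
    post = infer(obs)
    print(name, "->", labels[int(np.argmax(post))], np.round(post, 3))
```

Running the sketch prints the winning hypothesis and posterior for each source; the contrast is only a toy analogue of the failure mode the abstract calls a generative crash: when the source has no latent intent, the observer's IRL inference has nothing to converge on.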