This is my first post on the platform, covering my first set of experiments with GPT-2 using TransformerLens. If you spot any interesting insights or mistakes, feel free to share your thoughts in the comments. While these findings aren't entirely novel and may seem trivial, I'm presenting them here as a reference for anyone exploring this topic for the first time!
All the code, along with some extra analysis not included in this post, is available here.
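(If you haven't used TransformerLens before, here's a minimal sketch of loading GPT-2 and caching its activations — the general kind of setup these experiments rely on. It's a generic illustration, not the exact configuration from the linked code; the prompt string is made up for the example.)

```python
from transformer_lens import HookedTransformer

# Load GPT-2 small (124M parameters) with hooks attached to every activation.
model = HookedTransformer.from_pretrained("gpt2")

# Tokenize a prompt and run a forward pass, caching all intermediate activations.
tokens = model.to_tokens("The quick brown fox")
logits, cache = model.run_with_cache(tokens)

# The cache maps hook names to tensors, e.g. the residual stream after block 0.
resid = cache["blocks.0.hook_resid_post"]  # shape: [batch, seq_len, d_model]
print(resid.shape)
```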
Introduction and Motivation
Fine-tuning large language models (LLMs) is widely used to adapt models to specific tasks, yet the fundamental question remains: What actually changes in the model's internal representations? Prior research suggests that fine-tuning induces significant behavioral shifts...