LESSWRONG

Harsha

PhD student in conversational AI

Comments
Race and Gender Bias As An Example of Unfaithful Chain of Thought in the Wild
Harsha, 2mo

I guess it can be argued that the anti-bias prompting carries its own biases. All of those phrasings ultimately encourage better representation of minorities, because the current distribution is imbalanced (irrespective of whether this is desirable).

It also feels like different types of bias are at play at larger and smaller model sizes (even though the small models are distilled versions). It would be interesting to know whether the bigger and smaller models rejected the same candidates.
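
One rough way to check that last point, assuming per-candidate accept/reject decisions are available for both models: compare the sets of rejected candidates directly. The function name `rejection_overlap`, the candidate IDs, and the decisions below are all made up for illustration and are not taken from the post.

```python
def rejection_overlap(decisions_large, decisions_small):
    """Compare which candidates two models rejected.

    Each argument maps a candidate ID to True (accepted) or False (rejected).
    Only candidates scored by both models are compared.
    """
    shared = decisions_large.keys() & decisions_small.keys()
    rejected_large = {c for c in shared if not decisions_large[c]}
    rejected_small = {c for c in shared if not decisions_small[c]}

    both = rejected_large & rejected_small
    union = rejected_large | rejected_small
    # Jaccard index: 1.0 means identical rejection sets, 0.0 means disjoint.
    # If neither model rejected anyone, treat the sets as identical.
    jaccard = len(both) / len(union) if union else 1.0

    return {
        "both": sorted(both),
        "only_large": sorted(rejected_large - rejected_small),
        "only_small": sorted(rejected_small - rejected_large),
        "jaccard": jaccard,
    }


# Hypothetical decisions, purely illustrative (not data from the post).
large = {"c1": True, "c2": False, "c3": False, "c4": True}
small = {"c1": False, "c2": False, "c3": True, "c4": True}
result = rejection_overlap(large, small)
print(result)
```

A low overlap would be weak evidence that the two model sizes are biased in different ways, while a high overlap would suggest the smaller model mostly inherited the larger model's pattern. Either reading would still need a breakdown by demographic group to say anything about the type of bias.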
