Erik Garrison — LessWrong
Comments
Adam Optimizer Causes Privileged Basis in Transformer LM Residual Stream
Erik Garrison
1y
Could this affect distributed training methods that assume rotational invariance?