Kevin Slagle
Comments

Sorted by
Newest
Addendum: More Efficient FFNs via Attention
Kevin Slagle, 2y

This paper looks relevant. The authors also show that you can remove the FFN by slightly modifying the attention layer:

https://arxiv.org/abs/1907.01470
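
For anyone who wants the gist without opening the paper: as I understand it, the modification is to append a set of learned, input-independent key/value vectors to the usual context keys and values, so the attention layer can take over the role the FFN played. Below is a minimal single-head, single-query sketch of that idea in plain Python. The function name, argument names, and scaling are my own illustrative choices, not the paper's code, and details such as multi-head structure and positional terms are left out.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention_with_persistent_memory(query, ctx_keys, ctx_values,
                                     mem_keys, mem_values):
    """Attention for one query over context vectors plus persistent vectors.

    ctx_keys / ctx_values come from the input sequence as usual.
    mem_keys / mem_values are hypothetical learned parameters that do not
    depend on the input; they stand in for the FFN's weights.
    """
    keys = ctx_keys + mem_keys        # concatenate along the "position" axis
    values = ctx_values + mem_values
    scale = 1.0 / math.sqrt(len(query))
    weights = softmax([dot(query, k) * scale for k in keys])
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

# Toy usage: one context token and one persistent vector with identical keys,
# so the output is the plain average of the two value vectors.
out = attention_with_persistent_memory(
    query=[1.0, 0.0],
    ctx_keys=[[1.0, 0.0]], ctx_values=[[1.0, 2.0]],
    mem_keys=[[1.0, 0.0]], mem_values=[[3.0, 4.0]],
)
```

The connection to the FFN is that a two-layer FFN already looks like attention over a fixed set of keys and values (the rows of its two weight matrices), just with a ReLU in place of the softmax. Folding those vectors into the attention softmax is what lets the separate FFN sublayer be dropped.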
