ProLU: A Nonlinearity for Sparse Autoencoders
Abstract

This paper presents ProLU, an alternative to ReLU as the activation function in sparse autoencoders. ProLU produces a Pareto improvement over both standard sparse autoencoders trained with an L1 penalty and sparse autoencoders trained with a Sqrt(L1) penalty.

$$
\mathrm{ProLU}(m_i, b_i) =
\begin{cases}
m_i & \text{if } m_i + b_i > 0 \text{ and } m_i > 0 \\
0 & \text{otherwise}
\end{cases}
$$

$$
\mathrm{SAE}_{\mathrm{ProLU}}(x) = \mathrm{ProLU}\big((x - b_{\mathrm{dec}})\, W_{\mathrm{enc}},\; b_{\mathrm{enc}}\big)\, W_{\mathrm{dec}} + b_{\mathrm{dec}}
$$

The gradient w.r.t. b is zero, so...
Apr 23, 2024
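As a reading aid, here is a minimal NumPy sketch of the forward pass defined by the two formulas above. The function names (prolu, sae_prolu) and the example dimensions are my own; the parameter names (W_enc, b_enc, W_dec, b_dec) follow the abstract. This shows only the forward computation; the zero-gradient issue for b mentioned above concerns the backward pass and is not addressed here.

```python
import numpy as np

def prolu(m, b):
    # Element-wise ProLU: keep m_i only where both m_i + b_i > 0 and m_i > 0,
    # otherwise output 0 (per the piecewise definition above).
    mask = (m + b > 0) & (m > 0)
    return np.where(mask, m, 0.0)

def sae_prolu(x, W_enc, b_enc, W_dec, b_dec):
    # SAE_ProLU(x) = ProLU((x - b_dec) W_enc, b_enc) W_dec + b_dec
    acts = prolu((x - b_dec) @ W_enc, b_enc)
    return acts @ W_dec + b_dec

# Illustrative usage with random weights (shapes are hypothetical).
rng = np.random.default_rng(0)
d, h = 16, 64                                   # input dim, hidden dim
x = rng.normal(size=(d,))
W_enc = rng.normal(size=(d, h)); b_enc = rng.normal(size=(h,))
W_dec = rng.normal(size=(h, d)); b_dec = rng.normal(size=(d,))
x_hat = sae_prolu(x, W_enc, b_enc, W_dec, b_dec)  # reconstruction of x
```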