Sparse trinary weighted RNNs as a path to better language model interpretability
Epistemic status: Strongly arguing for what I feel is a neglected approach. I may somewhat overstate the case and fail to adequately steelman counterarguments. I hope and expect that readers will point out flaws in my logic.

Introduction

Currently, large transformers with dense floating-point weights are state of the...
Sep 17, 2022