Visualizing neural network planning
by Nevan Wichers, Victor Tao, fbarez, and Riccardo Volpato
TLDR We develop a technique to try and detect if a NN is doing planning internally. We apply the decoder to the intermediate representations of the network to see if it’s representing the states it’s planning through internally. We successfully reveal intermediate states in a simple Game of Life model,...
May 9, 20244