Forecasting progress in language models — LessWrong