David Scott Krueger (formerly: capybaralet)

I'm more active on Twitter than LW/AF these days: https://twitter.com/DavidSKrueger

Bio from https://www.davidscottkrueger.com/:
I am an Assistant Professor at the University of Cambridge and a member of Cambridge's Computational and Biological Learning lab (CBL). My research group focuses on Deep Learning, AI Alignment, and AI safety. I’m broadly interested in work (including in areas outside of Machine Learning, e.g. AI governance) that could reduce the risk of human extinction (“x-risk”) resulting from out-of-control AI systems. Particular interests include:

  • Reward modeling and reward gaming
  • Aligning foundation models
  • Understanding learning and generalization in deep learning and foundation models, especially via “empirical theory” approaches
  • Preventing the development and deployment of socially harmful AI systems
  • Elaborating and evaluating speculative concerns about more advanced future AI systems
     


Comments

Author here -- yes, we got this comment from reviewers in the most recent round as well. ADS is a bit more general than performative prediction, since it applies outside of prediction contexts. Still, the two are very closely related.

On the other hand, the point of our work is something that people in the performative prediction community seem to be approaching only slowly: the incentive for ADS. Work on CIDs is much more closely related in that sense.



As a historical note: we started working on this in March or April 2018. Performative prediction appeared on arXiv in February 2020; ours was presented at a safety workshop in mid-2019, but wasn't on arXiv until September 2020.

  • "Bitalik Vuterin" does not ring a bell, I don't think he was a very consequential figure to begin with.

I think it was something else along those lines, not that exact name.

Here's a thing: https://beff.substack.com/p/notes-on-eacc-principles-and-tenets

 If you are interested in more content on e/acc, check out recent posts by Bayeslord and the original post by Swarthy, or follow any of us on Twitter as we often host Twitter spaces discussing these ideas.

Neither the original post nor @bayeslord seems to exist anymore (I found this was also the case for another account named something like "Bitalik Vuterin"...). Seems fishy. I suspect at the very least that someone in "e/acc" is trying to look like more people than they are. Not sure what the base rate for this sort of stuff is, though.

What makes you think that? Where do you think this is coming from? It seems to have arrived too quickly (well, onto my radar at least) to be organic IMO, unless there is some real-world community involved, which there doesn't seem to be.

I think part of this has to do with growing pains in the LW/AF community... When it was smaller, it was more like an ongoing discussion among a few people, and signal-to-noise wasn't as important, etc.

I'd want to separate considerations of impact on [LW as collective epistemic process] from [LW as outreach to ML researchers]

 

Yeah, I put those in one sentence in my comment, but I agree that they are two separate points.

RE impact on the ML community: I wasn't thinking of anything in particular; I just think the ML community should have more respect for LW/x-safety, and stuff like that doesn't help.

Very good to know!  I guess in the context of my comment it doesn't matter as much because I only talk about others' perception.

I suspect e/acc is some sort of sock-puppet op / astroturfing campaign or something, and I've been trying to avoid signal-boosting them.

I don't know of any other notable advances until the 2010s brought the first interesting language generation results from neural networks.

"A Neural Probabilistic Language Model" - Bengio et al. (2000?

 or 2003?) was cited by Turing award https://proceedings.neurips.cc/paper/2000/hash/728f206c2a01bf572b5940d7d9a8fa4c-Abstract.html

Also worth knowing about: "Generating text with recurrent neural networks" - Ilya Sutskever, James Martens, Geoffrey E Hinton (2011)

I find the argument that 'predicting data generated by agents (e.g. language modeling) will lead a model to learn / become an agent' much weaker than I used to.

This is because I think it only goes through cleanly if the task uses the same inputs and outputs as the agent. This is emphatically not the case for (e.g.) GPT-3: the humans generating its training data observe and act on a rich environment, whereas GPT-3 only maps a text context to a next-token prediction.
