3 Answers sorted by
top scoring

Dec 18, 2021


The original sparse coding paper^[1] (1997) was a major early advance in learned features for vision and also for neuroscience, with significant downstream influence on later DL.

Also, I can see from Juergen Schmidhuber's Google Scholar page that you are missing some of his lab's papers that fit your criteria, such as "Gradient flow in recurrent nets". If he were here he would hate that. Schmidhuber claims that many of the key ideas in DL were discovered at his lab in 1990-1991. Even if that seems like a stretch, I do think they explored, early on, a wide range of foundational ideas that only became more important over time: vanishing gradients, distillation and compression, memory/attention, metalearning, artificial curiosity, and more.

[1] Olshausen, Bruno A., and David J. Field. "Sparse coding with an overcomplete basis set: A strategy employed by V1?" Vision Research 37.23 (1997): 3311-3325.

A Ray

Dec 22, 2021


One of the most important deployed applications of machine learning at this point would be web search, so papers relating to that (PageRank, etc.) would probably score highly.

I'd expect some papers in spam filtering (which was pretty important / interesting as a machine learning topic at the time) to maybe meet the threshold.

TD-Gammon would probably qualify in the world of RL https://en.wikipedia.org/wiki/TD-Gammon

DistBelief just barely predates the 2012 cutoff, and since it's basically directly in the lineage of modern deep learning, I think it might qualify https://en.wikipedia.org/wiki/TensorFlow#DistBelief

Leo Gao's summary post on the 2010s has some citations that directly qualify https://bmk.sh/2019/12/31/The-Decade-of-Deep-Learning/

Lastly for now, I think Kevin Murphy's excellent 2012 machine learning textbook, "Machine Learning: A Probabilistic Perspective", has a ton of sidebars and sections on applied machine learning systems, and would probably be worth going through.

Edit to add: this source might also be useful https://mlstory.org/

Lone Pine

Dec 18, 2021

Eurisko from Douglas Lenat in 1976 (before he started Cyc).

2 comments, sorted by

top scoring


Tom Lieberum

Are you asking exclusively about "Machine Learning" systems or also GOFAI? E.g. I notice that you didn't include ELIZA in your database, but that was a hard-coded program, so maybe it doesn't match your criteria.

Jsevillamol

There has to be a learning component to it.

Bayesian networks learned from data, SVMs, and n-grams all count, but hard-coded programs like ELIZA do not.


[ Question ]

Important ML systems from before 2012?
