Boring machine learning is where it's at

by George · 4 min read · 20th Oct 2021 · 16 comments


Machine Learning · AI · Frontpage

It surprises me that when people think of "software that brings about the singularity" they think of text models or RL agents, but sneer at decision tree boosting and the like as boring algorithms for boring problems.

To me, this seems counter-intuitive, and the fact that most people researching ML are interested in subjects like vision and language is flabbergasting. For one, because getting anywhere productive in these fields is really hard; for another, because their usefulness seems relatively minimal.

I've said it before and I'll say it again: human brains are very good at the stuff they've been doing for a long time. This ranges from things like controlling a human-like body to things like writing prose and poetry. Seneca was as good a philosophy writer as any modern, Shakespeare as good a playwright as any contemporary. That is not to say that new works and diversity in literature aren't useful, both for variety's sake and for keeping up with language and zeitgeist, but they're not game-changing.

Human brains are shit at certain tasks, things like finding the strongest correlation among the variables in an n-million by n-million matrix. Or heck, even finding the most productive categories to quantify a spreadsheet with a few dozen categorical columns and a few thousand rows. That's not to mention things like optimizing 3d structures under complex constraints or figuring out probabilistic periodicity in a multi-dimensional timeseries.
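As a toy illustration of the first kind of task (scaled down from millions of columns to fifty, with a fully synthetic dataset and a deliberately planted correlation), a few lines of numpy do instantly what no human could do by eye:

```python
import numpy as np

# Toy-scale sketch of a task that is trivial for machines and hopeless
# for humans: find the pair of features with the strongest correlation.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 50))                            # 1000 samples, 50 features
X[:, 7] = X[:, 3] * 2 + rng.normal(scale=0.1, size=1000)   # plant a strong correlation

corr = np.corrcoef(X, rowvar=False)   # 50x50 correlation matrix
np.fill_diagonal(corr, 0)             # ignore trivial self-correlations
i, j = np.unravel_index(np.abs(corr).argmax(), corr.shape)
print(i, j)  # recovers the planted pair: features 3 and 7
```

The same three-line core scales to matrices far beyond anything a human could inspect; only memory and compute limit it.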

The latter sort of problem is where machine learning has found the most practical use: problems that look "silly" to a researcher but intractable to a human mind. On the contrary, 10 years in, computer vision is still struggling to find any meaningfully large market fits outside of self-driving. There are a few interesting applications, but they have limited impact and a low market cap. The most interesting applications, related to bioimaging, happen to be things people are quite bad at; they are very divergent from the objective of creating human-like vision capabilities, since the results you want are anything but human-like.

Even worse, there's the problem that human-like "AI" will be redundant the moment it's implemented. Self-driving cars are a real challenge precisely until the point when they become viable enough that everybody uses them; afterwards, every car is running on software and we can replace all the fancy CV-based decision-making with simple control structures that rely on very constrained and "sane" behaviour from all other cars. Google Assistant being able to call a restaurant or hospital and make a booking for you, or act as the receptionist taking that call, is relevant right until everyone starts using it; afterwards everything will already be digitized and we can switch to better and much simpler booking APIs.

That's not to say all human-like "AI" will be made redundant, but we can say that its applications are mainly well-known and will diminish over time, giving way to simpler automation as people start being replaced with algorithms. I say its applications are "well known" because they boil down to "the stuff that humans can do right now which is boring or hard enough that we'd like to stop doing it". There's a huge market for this, but it's huge in the same way as the whale oil market was in the 18th century. It's a market without that much growth potential.

On the other hand, the applications of "inhuman" algorithms are boundless, or at least only bounded by imagination. I've argued before that science hasn't yet caught up to the last 40 years of machine learning. People prefer designing equations by hand and validating them with arcane (and easy to fake, misinterpret and misuse) statistics, rather than using algorithmically generated solutions and validating them with simple, rock-solid methods such as cross-validation (CV). People like Horvath are hailed as genius-level polymaths in molecular biology for calling 4 scikit-learn functions on a tiny dataset.
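To give a sense of how simple that "rock-solid" validation loop is, here is a hedged sketch using scikit-learn on a synthetic dataset (the data and scores are purely illustrative, not from any real study):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for a small tabular scientific dataset.
X, y = make_regression(n_samples=300, n_features=20, noise=5.0, random_state=0)

# Fit a boosted-tree model and validate it out-of-sample with 5-fold CV,
# instead of hand-designing an equation and arguing over p-values.
model = GradientBoostingRegressor(random_state=0)
scores = cross_val_score(model, X, y, cv=5, scoring="r2")
print(scores.mean())  # mean held-out R^2 across the 5 folds
```

The validation number here is estimated on data the model never saw, which is exactly what makes it hard to fake or misinterpret.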

Note: Horvath's work is great, and I in no way want to pick on him specifically; the world would be much worse without him, and I hope epigenetic clocks predict he'll live and work well into old age. I don't think he personally ever claimed the ML side of his work is in any way special or impressive; this is just what I've heard other biologists say.

This is not to say that the scientific establishment is doomed or anything; it's just slow at using new technologies, especially those that shift the onus of what a researcher ought to be doing. The same goes for industry; a lot of high-paying, high-status positions involve work that algorithms are better at, precisely because it's extremely difficult for people, and thus you need the smartest people for it.

However, market forces and common sense are at work, and there's a constant uptick in usage. While I don't believe this can bring about a singularity so to speak, it will accelerate research and will open up new paradigms (mainly around data gathering and storage) and new problems that will allow ML to take centre stage.

So in that sense, it seems obvious to postulate a limited and decreasing market for human-like intelligence and a boundless and increasing market for "inhuman" intelligence.

This is mainly why I like to focus my work on the latter, even if it's often less flashy and more boring. One entirely avoidable issue with this is that the bar of doing better than a person is low, and the state of benchmarking is so poor as to make head-to-head competition between techniques difficult, though this in itself is the problem I'm aiming to help solve.

That's about it, so I say go grab a spreadsheet and figure out how to get the best result on a boring economics problem with a boring algorithm; don't worry so much about making a painting or movie with GANs, since we're already really good at doing that and enjoy doing it.


16 comments

I don't really buy the main argument here, but there were some great sub-arguments in this post. In particular, I found this bit both novel and really interesting:

Even worse, there's the problem that human-like "AI" will be redundant the moment it's implemented. Self-driving cars are a real challenge precisely until the point when they become viable enough that everybody uses them; afterwards, every car is running on software and we can replace all the fancy CV-based decision-making with simple control structures that rely on very constrained and "sane" behaviour from all other cars. Google Assistant being able to call a restaurant or hospital and make a booking for you, or act as the receptionist taking that call, is relevant right until everyone starts using it; afterwards everything will already be digitized and we can switch to better and much simpler booking APIs.

It's a good point, but it's like saying that to improve a city you can just bomb it and build it from scratch. In reality improvements need to be incremental and coexist with the legacy system for a while.

The issue with this argument is that the architectures and techniques that are best at "human-like" data processing are now turning out to be very good at "inhuman" data processing. Some examples:

  • TaBERT is a BERT-like transformer that interprets tabular data as a sequence of language tokens
  • Weather prediction (specific example)
  • Protein structure prediction (admittedly, humans are surprisingly good at this, but AlphaFold is better)

Also, this paper shows that deep learning's relative weakness on tabular data can be overcome with careful choice of regularization.

Protein structure prediction (admittedly, humans are surprisingly good at this, but AlphaFold is better)
Protein folding problems are cool and all, but as a computational biologist, I do think people underrate simpler ML models on a variety of problems. Researchers have to make a lot of decisions, such as which hits to follow up on in a high-throughput screen. The data to inform these kinds of tasks can be expressed as tabular data, and decision trees can perform quite well!
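As a sketch of what that triage could look like (all feature names and labels here are hypothetical, made up purely for illustration), a tree ensemble on a small tabular dataset is a few lines:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical triage task: rank screening hits for follow-up. The four
# columns are stand-ins for, say, assay signal, replicate agreement,
# compound molecular weight, and a plate-position covariate.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 4))
# Synthetic ground truth: a "true hit" when signal and replicate
# agreement are jointly high.
y = ((X[:, 0] + X[:, 1]) > 1).astype(int)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean()
print(auc)  # cross-validated ranking quality of the follow-up list
```

Nothing here is deep or flashy, but ranking hundreds of candidates consistently is exactly the kind of task trees beat tired humans at.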

Got any examples of this being used? I'm always on the lookout for these kinds of use cases.

I don't disagree; as I said before, I'm focused on problem type, not method.

The fact that human-mimicking problems have loads of cheap training data and can lead to interesting architectures is something I hadn't considered that makes them more worthwhile.

On the contrary, 10 years in, computer vision is still struggling to find any meaningfully large market fits outside of self-driving. There are a few interesting applications, but they have limited impact and a low market cap.

I think most people would disagree with this description.

I'm surprised. Where do you think CV has had high impact/high market cap?

Surveillance alone seems like a much larger market cap than self-driving cars. And that's just one niche out of many; what do you think all those ASICs being put in smartphones are for?

The main issue with surveillance is the same AI vs automation issue.

Once the population is fine with being tracked 24/7 and you have the infrastructure for it:

  1. The need to do so diminishes
  2. You can just make it illegal to not carry a wifi-enabled ID (there's the issue of carrying someone else's ID, but so is there the issue of face-covering, and finding incentives to force people into carrying only their own ID is easy)

I hadn't heard of people widely using automated surveillance. Is that a thing now? I mean, I'd heard about some governments deploying it here and there, but my impression was that it was mostly pretty gimmicky and narrow.

Same with ASICs in smartphones. Sure, it's a gimmick that might sell more phones, but I don't see a killer app yet.

To be clear, I totally find it plausible that these will grow into huge things, but "struggling to find any meaningfully large market fits outside of self-driving" sounds like a pretty reasonable description of the current state of things. I think you pay more attention to these developments than I do, though, so please do let me know if I'm wrong.

I think I was wrong in writing this, and I corrected it on my blog.

What I meant to say was closer to "human-mimicking CV" (i.e. classification, segmentation, tracking and other images -> few numbers/concepts tasks). There's certainly a case to be made that image-as-input and/or output techniques as a whole have very large potential, even if not actualized.

I think this quote is relevant though I can't exactly say how:

Technological change actually arrives in ways that leave human behavior minimally altered.

The future always seems like something that is going to happen rather than something that is happening.

Successful products are precisely those that do not attempt to move user experiences significantly, even if the underlying technology has shifted radically. In fact the whole point of user experience design is to manufacture the necessary normalcy for a product to succeed and get integrated

 -- Welcome to the Future Nauseous by Venkatesh Rao (found via OvercomingBias)

As a person who spent the last 7 years at a company dedicated to making "old boring algorithms" as easy and frictionless as possible to apply to many problem types - 100% agree :)

I don't think people focus on language and vision because they're less boring than things like decision trees; they focus on those because the domains of language and vision are much broader than the domains decision trees, etc., are applied to. If you train a decision tree model to predict the price of a house it will do just that, whereas if you train a language model to write poetry it could conceivably write about various topics such as math, politics and even itself (since poetry is a broad scope). This is (possibly) a step towards general intelligence, which is what people are worried/excited about.

I agree with your argument that algorithms such as decision trees are much better at things humans can't do, whereas language and vision models are not.

Hmh, I didn't want to give the impression I'm discounting particular architectures, I just gave the boosting example to help outline the target class of problems.