Maybe deep learning researchers are just like mathematicians. They think of coding as the hard part of what they do, so they spend a lot of time teaching people how to implement neural nets from scratch. But actually the hard part of what they do is the math (or at least keeping track of tensor shapes).

So I'm frustrated that I've heard a dozen explanations of gradient descent, which I understood in high school, but I have to learn about variational autoencoders from scratch by reading academic papers.


I would agree. A lot of DL concepts became much clearer after I had worked with tensors enough to be able to manipulate them in my mind. Before that, I sort of just made sure the shapes matched up and hoped for the best. Also, you tend to see a lot of beginner questions and confusion about tensor shape mismatches in DL libraries (Keras especially), which suggests that this is quite a common problem area.
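To make the shape bookkeeping concrete, here is a minimal sketch in plain Python of tracking shapes through two matrix multiplications. The helper `matmul_shape` and the layer sizes are hypothetical, chosen only for illustration; nothing here is a real library's API.

```python
def matmul_shape(a, b):
    """Return the shape of a @ b for two 2-D shapes, or raise on a mismatch."""
    if len(a) != 2 or len(b) != 2:
        raise ValueError(f"expected 2-D shapes, got {a} and {b}")
    if a[1] != b[0]:
        # The classic beginner error: the inner dimensions do not agree.
        raise ValueError(f"inner dimensions differ: {a} @ {b}")
    return (a[0], b[1])


# Hypothetical two-layer network (sizes are assumptions for illustration):
# a batch of 32 inputs with 784 features, 128 hidden units, 10 outputs.
batch, n_in, n_hidden, n_out = 32, 784, 128, 10

hidden = matmul_shape((batch, n_in), (n_in, n_hidden))   # (32, 128)
logits = matmul_shape(hidden, (n_hidden, n_out))         # (32, 10)
print(hidden, logits)
```

Passing the weight shape the wrong way round, for example `matmul_shape((batch, n_in), (n_hidden, n_in))`, raises a `ValueError`. That is roughly the kind of mismatch those beginner questions tend to be about.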