Back Propagation: aka The Witchcraft of AI and PyTorch
This post is for readers curious about the wizardry of AI behind the curtain. It is an attempt to explain the principle of learning via back propagation. There are many ways to learn how neural networks work; by making back propagation the pivotal focus, we are able to connect theory with practical computation (via PyTorch).
If you know basic high-school math and can read a basic Python program, you can follow along easily. The deepest math we will invoke is the chain rule (quick refresher here, though it is not strictly needed).
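Since the chain rule is the only calculus we lean on, here is a quick numerical sanity check (the function and numbers are our own illustration): for f(x) = (x² + 1)³, the chain rule gives f′(x) = 3(x² + 1)² · 2x, and we can confirm this against a finite-difference estimate.

```python
# Chain rule sanity check (illustrative example, not from the post).
# f(x) = (x^2 + 1)^3 is a composition g(h(x)) with g(u) = u^3, h(x) = x^2 + 1.
# The chain rule says f'(x) = g'(h(x)) * h'(x) = 3*(x**2 + 1)**2 * 2*x.

def f(x):
    return (x**2 + 1) ** 3

def f_prime(x):
    return 3 * (x**2 + 1) ** 2 * (2 * x)

# Compare against a numerical derivative (central difference).
x, eps = 1.5, 1e-6
numerical = (f(x + eps) - f(x - eps)) / (2 * eps)
print(abs(f_prime(x) - numerical) < 1e-4)  # the two estimates agree
```

This agreement between the symbolic and numerical derivatives is exactly the kind of check Autograd's own test suite performs, just at much larger scale.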
Why yet another explanation?
There are plenty of online guides to the principles of neural networks. So how does this one differ?
There is a single line of code in PyTorch scripts that hides a wonderfully elegant and powerful procedure (back propagation) that does all the work in deep learning:
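The line in question is almost certainly `loss.backward()`. A minimal sketch of the usual pattern it sits inside (the model, tensors and shapes here are our own illustration):

```python
import torch

# A tiny model: one linear layer mapping 3 features to 1 output.
model = torch.nn.Linear(3, 1)
x = torch.randn(8, 3)  # a batch of 8 made-up inputs
y = torch.randn(8, 1)  # made-up targets
loss = torch.nn.functional.mse_loss(model(x), y)

loss.backward()  # <-- the single line: back propagation happens here

# Autograd has now filled in a gradient for every parameter.
print(model.weight.grad.shape)  # torch.Size([1, 3])
```

One call, no derivatives written by hand: every parameter now carries the gradient of the loss with respect to itself, ready for an optimizer step.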
Our explanation takes you from a high-level understanding of a neural network all the way down to what lies behind this line of code. We use minimal math and computer science to get there, but enough to satisfy the technically curious.
Yes, we do use calculus, but hopefully in the least cumbersome way possible: by avoiding lots of awkward notation, too many layers, and the full math of artificial neurons.
[Don't worry: we return to the full-blown neuronal math at the very end and show that it makes no difference, because PyTorch takes care of it.]
Most explanations stop short of showing how back propagation works at the computational level inside PyTorch, via a component called Autograd. We get right down into the inner workings whilst avoiding cumbersome ideas like vector-Jacobian products, Hessians and the like. Phew!
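To preview the flavour of what Autograd does, here is a toy scalar version (our own illustration, not PyTorch's actual implementation): each value remembers how it was computed, and `backward()` walks that record in reverse, applying the chain rule at every step.

```python
class Value:
    """A scalar that records its history so gradients can flow backwards.
    A toy sketch of reverse-mode autodiff, not PyTorch's real Autograd."""

    def __init__(self, data, parents=(), local_grads=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents          # the Values this one was computed from
        self._local_grads = local_grads  # d(self)/d(parent) for each parent

    def __add__(self, other):
        return Value(self.data + other.data, (self, other), (1.0, 1.0))

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other),
                     (other.data, self.data))

    def backward(self):
        # Topologically order the graph, then apply the chain rule in reverse.
        order, seen = [], set()

        def visit(node):
            if node not in seen:
                seen.add(node)
                for p in node._parents:
                    visit(p)
                order.append(node)

        visit(self)
        self.grad = 1.0  # seed: d(output)/d(output) = 1
        for node in reversed(order):
            for parent, local in zip(node._parents, node._local_grads):
                parent.grad += local * node.grad

# y = a*b + a  =>  dy/da = b + 1,  dy/db = a
a, b = Value(2.0), Value(3.0)
y = a * b + a
y.backward()
print(a.grad, b.grad)  # 4.0 2.0
```

PyTorch's Autograd is this same idea industrialized: tensors instead of scalars, a far larger vocabulary of operations, and heavy optimization, but the reverse walk over a recorded computation graph is the heart of it.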
Forgive the somewhat relaxed tone of the voice-overs: they were recorded in one take, fairly late at night, without a script or any kind of rehearsal. The slides are designed to be clear and usable on their own, but they are best understood via the video narrations.