1. Overview

🎬 Read this post before any other post.

Overview

I originally wrote most of this content for the Background section of my Master’s thesis, “Multi-camera Multi-object Tracking with Transformers”. During this time, I discovered some new aspects and thought about how best to introduce concepts like backpropagation or the transformer. The purpose of this series is to share this content about the fundamentals of deep learning with you.

This is the structure:

  1. Overview
  2. Supervised Learning
  3. Optimization
  4. Backpropagation
  5. Feedforward Neural Networks
  6. Convolutional Neural Networks
  7. Recurrent Neural Networks
  8. Transformer

Highlights

Some highlights are:

  1. Preparing the introduction of the transformer carefully via self-made, detailed figures of the vanilla RNN, the encoder-decoder RNN, and the encoder-decoder RNN with attention
  2. Explicitly spelling out what makes the transformer great
  3. Explaining convolution (or rather cross-correlation) with the right figure and equation
  4. Explaining backpropagation with a very simple toy example 🪀
  5. An explanation of why we do not “frontpropagate” although the multiplication in the chain rule is commutative 🤯
  6. Showing the backward passes of important layers. Did you know that the backward pass of a convolution is also a convolution? 🤯 (See the sketch after this list.)
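
To make the last point concrete, here is a minimal numpy sketch (the cross_correlate helper is my own, purely for illustration) that checks numerically that the gradient of a 1D “convolution” with respect to its input is again a convolution: a full cross-correlation of the upstream gradient with the flipped kernel. The derivation itself follows in the later posts.

import numpy as np

# What deep learning frameworks call "convolution" is cross-correlation:
# y[i] = sum_k w[k] * x[i + k]  (valid positions, no padding).
def cross_correlate(x, w):
    K = len(w)
    return np.array([np.dot(w, x[i:i + K]) for i in range(len(x) - K + 1)])

x = np.array([1.0, 2.0, -1.0, 3.0, 0.5])  # input signal
w = np.array([0.2, -0.5, 1.0])            # kernel
g = np.ones(len(x) - len(w) + 1)          # upstream gradient dL/dy for L = sum(y)

# Backward pass w.r.t. the input: a "full" convolution of g with w, i.e.
# cross-correlation of the zero-padded upstream gradient with the flipped kernel.
dx = cross_correlate(np.pad(g, len(w) - 1), w[::-1])

# Sanity check against central finite differences.
eps = 1e-6
for j in range(len(x)):
    xp, xm = x.copy(), x.copy()
    xp[j] += eps
    xm[j] -= eps
    dx_num = (cross_correlate(xp, w).sum() - cross_correlate(xm, w).sum()) / (2 * eps)
    assert abs(dx[j] - dx_num) < 1e-6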

In my view, many deep learning concepts are usually introduced too quickly, or there is not enough reflection on the properties of some architectures. An example of the first observation is showing the architecture figure from the original transformer paper without first clarifying the encoder-decoder structure in greater detail. Examples of the second observation are that it is frequently not stated what makes the transformer great, what the inductive bias of a convolutional layer is, or why we propagate gradients backward at all. I took the time to think about and research these things.

Target group

In this series, the content is very dense, although I go into some detail on the main concepts. Ideally, there would also be many more figures. As a consequence, these posts are not for beginners. Instead, I recommend this series to three groups of people:

  1. students who are currently taking a deep learning class (as complementary material)
  2. people who want to refresh knowledge they already have
  3. deep learning professionals who can perhaps fill some small gaps

Credits

Of course, this series is essentially a recompilation of material from other people. My main references are the Deep Learning book by Goodfellow et al., the CS231n lecture notes by Fei-Fei Li, and Andrej Karpathy’s dissertation. Other references are the Wikipedia article on automatic differentiation, blog posts about convolutional neural nets and recurrent neural nets, and several posts about transformers.

I also want to thank Prof. Rainer Gemulla for his excellent lectures, especially his Deep Learning class, at the University of Mannheim.

Citation

If you like this series, you can cite it with:

@misc{stenzel2023deeplearning,
  title   = "Deep Learning Series",
  author  = "Stenzel, Tobias",
  year    = "2023",
  url     = "https://www.tobiasstenzel.com/blog/2023/dl-overview/"
}