🎬 Read this post before any other post.
I wrote most of the content of this series for the Background section of my Master's Thesis, "Multi-camera Multi-object Tracking with Transformers". While writing it, I discovered some new aspects and thought about how best to introduce concepts like backpropagation or the transformer. The purpose of this series is to share this material on the fundamentals of deep learning with you.
This is the structure:
Some highlights are:
In my view, many deep learning concepts are introduced too quickly, or there is too little reflection on the properties of certain architectures. An example of the first problem is showing the architecture figure from the original transformer paper without first explaining the encoder-decoder structure in detail. Examples of the second are that it is often left unstated what makes the transformer so effective, what the inductive bias of a convolutional layer is, or why we propagate gradients backward at all. I took the time to think about and research these questions.
The content in this series is dense, although I go into some detail on the main concepts. Ideally, there would also be many more pictures. As a consequence, these posts are not for beginners. Instead, I recommend this series to three groups of people:
Of course, this series is essentially a recompilation of material from other people.
These are my main references: in particular, I recommend the Deep Learning book by Goodfellow et al.
I also want to thank Prof. Rainer Gemulla for his excellent lectures, especially his Deep Learning class, at the University of Mannheim.
If you like this series, you can cite it as:
@misc{stenzel2023deeplearning,
  title  = "Deep Learning Series",
  author = "Stenzel, Tobias",
  year   = "2023",
  url    = "https://www.tobiasstenzel.com/blog/2023/dl-overview/"
}