Why is deep learning successful today?
(Raza 2023; Pichler and Hartig 2023)
(Roser, Ritchie, and Mathieu 2023)
(Roser, Ritchie, and Mathieu 2023)
Deep learning will continue to shape the future of artificial intelligence.
What’s behind the scenes?
Deep learning models are neural networks.
(Shukla 2019)
Neural networks are modeled after biological neural cells (neurons).
Artificial neurons are the elementary units of artificial neural networks.
An artificial neuron is a function that receives one or more inputs, applies weights to these inputs and sums them, typically passing the sum through an activation function to produce an output.
Many artificial neurons together form a neural network.
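A minimal sketch of a single artificial neuron in Python (numpy assumed; the concrete weights, bias, and sigmoid activation are illustrative choices, not prescribed by the slides):

```python
import numpy as np

def sigmoid(z):
    # Squashes the weighted sum into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def artificial_neuron(inputs, weights, bias):
    # Weighted sum of the inputs plus a bias term...
    z = np.dot(inputs, weights) + bias
    # ...passed through an activation function to produce the output.
    return sigmoid(z)

x = np.array([0.5, 0.3, 0.9])    # three input features
w = np.array([0.4, -0.2, 0.7])   # one weight per input
print(artificial_neuron(x, w, bias=0.1))
```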
The input layer has one neuron per feature.
Here, the input is a 28x28 pixel image: 28x28 = 784 input neurons.
The number of output neurons depends on the number of predictions you want to make.
For regression, this can be one neuron.
For classification, this is one neuron per class.
(Shukla 2019)
Defining the number of hidden layers and the number of neurons per hidden layer is… pure magic?
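As a rough sketch (plain numpy, randomly initialized, no training), the architecture described above could look like this; the two hidden layers with 128 and 64 neurons are an arbitrary choice, which is exactly the "magic" part:

```python
import numpy as np

rng = np.random.default_rng(0)

# 784 input neurons (28x28 pixels), two hidden layers, 10 output classes.
layer_sizes = [784, 128, 64, 10]           # hidden sizes chosen arbitrarily
weights = [rng.normal(0, 0.01, (m, n))
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]

def forward(x):
    # Pass the input through each layer: weighted sum, bias, ReLU activation.
    for w, b in zip(weights[:-1], biases[:-1]):
        x = np.maximum(0.0, x @ w + b)     # ReLU hidden activation
    return x @ weights[-1] + biases[-1]    # raw class scores (logits)

image = rng.random(784)                    # a flattened 28x28 "image"
print(forward(image).shape)                # -> (10,) one score per class
```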
There are even more hyperparameters…
Learning means adjusting the network's internal weights after processing data, based on the amount of error in the output.
By the way, the GPT-3 model has 175 billion such weights.
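To make "internal weights" concrete, here is a back-of-the-envelope count for the small fully connected network sketched above (the exact total depends entirely on the chosen architecture):

```python
# Trainable parameters of a 784-128-64-10 fully connected network:
layer_sizes = [784, 128, 64, 10]
params = sum(m * n + n                  # weights plus one bias per neuron
             for m, n in zip(layer_sizes[:-1], layer_sizes[1:]))
print(params)                           # 109386 -- GPT-3 has ~175,000,000,000
```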
Measuring the amount of error in the output is the purpose of a loss function.
Ultimately, the entire goal of training is to minimize loss.
Two widely used loss functions:
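The slide's two formulas are not reproduced here; as an illustration, mean squared error (common for regression) and cross-entropy (common for classification) could look like this in Python:

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    # Average squared difference between targets and predictions (regression).
    return np.mean((y_true - y_pred) ** 2)

def cross_entropy(y_true, y_pred, eps=1e-12):
    # Negative log-likelihood of the true class probabilities (classification).
    y_pred = np.clip(y_pred, eps, 1.0)
    return -np.mean(np.sum(y_true * np.log(y_pred), axis=-1))

print(mean_squared_error(np.array([1.0, 2.0]), np.array([1.1, 1.8])))
print(cross_entropy(np.array([[0, 1, 0]]), np.array([[0.1, 0.8, 0.1]])))
```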
The learning rate is a tuning parameter that determines how quickly a model "learns".
It influences to what extent new information overrides old information.
Setting the learning rate is a trade-off: either converge too slowly or overshoot and miss important details…
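A toy gradient-descent sketch (assumed here for illustration, not from the slides) that shows the trade-off: a tiny learning rate barely moves, a large one overshoots and diverges:

```python
def minimize(learning_rate, steps=20):
    # Minimize f(w) = w**2 with plain gradient descent; the gradient is 2*w.
    w = 5.0
    for _ in range(steps):
        w -= learning_rate * 2 * w
    return w

print(minimize(0.001))  # too small: barely moves towards the minimum at 0
print(minimize(0.1))    # reasonable: close to 0 after a few steps
print(minimize(1.1))    # too large: overshoots the minimum and diverges
```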
These have been just a few of the available hyperparameters.
There are also…
Funnily enough, studies even yield opposing recommendations in setting certain parameters.
(Karpathy 2019)
Recent advancements in the field even take it up a notch…
Kolmogorov-Arnold Networks (Liu et al. 2024)
They turn this (the good ol' multilayer perceptron)…
…into this
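Roughly, the contrast drawn by Liu et al. (2024), with notation simplified here: an MLP alternates learned linear weight matrices with fixed nonlinearities, while a KAN composes layers of learned univariate functions.

```latex
\mathrm{MLP}(x) = \left(W_{L-1} \circ \sigma \circ W_{L-2} \circ \sigma \circ \cdots \circ W_0\right)(x)
\qquad
\mathrm{KAN}(x) = \left(\Phi_{L-1} \circ \Phi_{L-2} \circ \cdots \circ \Phi_0\right)(x)
```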
Is there a secret to successfully set up a neural network?
Most of the time it will train but silently work a bit worse. (Karpathy 2019)
Suffering is a perfectly natural part of getting a neural network to work well. (Karpathy 2019)
Some more highlights… just slightly older.
Backpropagation
Backpropagation takes a neural network's output error and propagates this error backwards through the network, determining which paths have the greatest influence on the output. (Scarff 2021)
https://towardsdatascience.com/understanding-backpropagation-abcc509ca9d0
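A minimal hand-rolled sketch (assumed setup: one linear neuron with a squared-error loss) of how the output error is propagated back to the weights via the chain rule:

```python
import numpy as np

# One training example for a single linear neuron: prediction = x . w
x = np.array([0.5, 0.3])
w = np.array([0.2, -0.1])
target = 1.0

# Forward pass: compute the prediction and the squared-error loss.
prediction = x @ w
loss = (prediction - target) ** 2

# Backward pass (backpropagation): chain rule from the loss back to the weights.
dloss_dpred = 2 * (prediction - target)   # how the loss changes with the prediction
dpred_dw = x                              # how the prediction changes with each weight
dloss_dw = dloss_dpred * dpred_dw         # gradient of the loss w.r.t. the weights

# Gradient descent step: nudge the weights along the paths that reduce the error most.
w -= 0.1 * dloss_dw
print(loss, dloss_dw, w)
```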
Please refer to this fantastic explanation of backpropagation on YouTube.
With all of this in mind… how do you design your own neural network?
Luckily, there is Neural Architecture Search.
Neural architecture search (NAS), the process of automating the design of neural architectures for a given task, is an inevitable next step in automating machine learning and has already outpaced the best human-designed architectures on many tasks. (White et al. 2023)
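In its simplest form, NAS can be pictured as a search over architecture configurations; this toy random-search sketch (the search space and the scoring stub are assumptions for illustration) just samples candidates and keeps the best one:

```python
import random

random.seed(0)

# Toy search space: how many hidden layers, how wide, which activation.
search_space = {
    "num_hidden_layers": [1, 2, 3, 4],
    "neurons_per_layer": [32, 64, 128, 256],
    "activation": ["relu", "tanh"],
}

def evaluate(architecture):
    # Placeholder: in real NAS this would train the candidate network
    # and return its validation accuracy.
    return random.random()

best_architecture, best_score = None, float("-inf")
for _ in range(20):                      # sample 20 random candidates
    candidate = {name: random.choice(options)
                 for name, options in search_space.items()}
    score = evaluate(candidate)
    if score > best_score:
        best_architecture, best_score = candidate, score

print(best_architecture, best_score)
```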
Or, you use someone else’s model instead!
(Biggerj1 2024)
You need to be careful what base model you use!
…at the same time
initializing a network with transferred features from almost any number of layers can produce a boost to generalization (Yosinski et al. 2014)
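A common way to "use someone else's model" is transfer learning. A hedged PyTorch/torchvision sketch (assuming torchvision's ImageNet-pretrained ResNet-18 and a hypothetical 10-class task) that keeps the transferred features and only retrains a new output layer:

```python
import torch.nn as nn
from torchvision import models

# Load a network whose weights were already trained on ImageNet.
model = models.resnet18(weights="IMAGENET1K_V1")

# Freeze the transferred feature-extraction layers.
for param in model.parameters():
    param.requires_grad = False

# Replace the output layer to match our own task (here: 10 classes, assumed).
model.fc = nn.Linear(model.fc.in_features, 10)

# Only the new output layer's weights will now be adjusted during training.
```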
A strong recommendation for further “reading”:
https://www.youtube.com/watch?v=aircAruvnKk&list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi