Most of us are used to hearing terms like artificial intelligence and machine learning, but it is not always easy to understand what they are and how they work. In this post, we will try to explain deep learning, one of the methods of artificial intelligence, in a simple and understandable way.

What is deep learning?

In short, artificial intelligence, or AI as it is often called, is an umbrella term for a wide range of methods used to make computers perform a task without explicitly programming how to do it. One method within artificial intelligence that has proven very effective at solving a number of complex problems in recent years is deep learning.

So how does it actually work?

Artificial neural networks

The cornerstone of deep learning methods is the artificial neural network. These networks are inspired by the way the neural networks in our brains work, and they are made up of nodes structured in several layers.

In the figure above you see an illustration of a very simple neural network, where each circle represents a node. We can divide this network into three layers: the input layer (the blue nodes), the hidden layer (the orange nodes) and the output layer (the yellow nodes). Each node holds a numerical value.

    • For the input layer, this value will often represent a physical quantity. It could, for example, be the value of a pixel in an image.
    • In the output layer, the values in the nodes will often represent the answer to the task you have set the network to solve. A common example is to give the network images of dogs and cats and train it to determine whether there is a dog or a cat in the image. In such an example, z₁ and z₂ can be values between 0 and 1 representing how likely the network thinks it is that there is a cat vs. a dog in the image.
    • In the hidden layer, the values in the nodes will often not represent any physical or conceptual quantity, but it is the hidden layer that picks up the patterns that make the network able to solve tasks.
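To make the input and output layers a little more concrete, here is a minimal Python sketch. The helper names pixels_to_inputs and read_output are made up for this illustration. We also assume grayscale pixels in the range 0 to 255, and that z₁ is the "cat" score and z₂ the "dog" score.

```python
def pixels_to_inputs(image):
    """Flatten a 2D grid of grayscale pixels (assumed 0-255) into input values in [0, 1]."""
    return [pixel / 255 for row in image for pixel in row]


def read_output(z1, z2):
    """Pick the label with the higher score (assumption: z1 = cat, z2 = dog)."""
    return "cat" if z1 >= z2 else "dog"


# A tiny, made-up 2x2 "image": each pixel becomes the value of one input node.
inputs = pixels_to_inputs([[0, 255], [51, 102]])
print(inputs)
print(read_output(0.9, 0.1))
```

A real image has far more pixels, so the input layer of a real network has far more nodes than the three in the figure.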

Weights and activation functions

As illustrated in the figure, the nodes are connected to each other, so the value of the nodes in the hidden layer depends on the values coming from the input layer. y₁ will for example be given by

y₁ = σ(w₁₁x₁ + w₂₁x₂ + w₃₁x₃)

Here w₁₁ represents how heavily the value in x₁ is weighted in y₁, w₂₁ represents how heavily the value in x₂ is weighted in y₁, and so on.

σ is an activation function. It ensures that we get nonlinearities in the network so that it is able to recognise more complex patterns. There are many different types of activation functions. They can, for example, be step functions, sigmoid functions or tanh functions, to name a few.
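As a rough illustration, the activation functions mentioned here and the formula for a single hidden node could be sketched in Python like this. The function names are our own, the threshold of the step function is assumed to be 0, and, like the formula in this post, the sketch leaves out the bias term that many networks add.

```python
import math


def step(t):
    """Step function: jumps from 0 to 1 at t = 0 (threshold assumed for illustration)."""
    return 1.0 if t >= 0 else 0.0


def sigmoid(t):
    """Sigmoid function: squeezes any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-t))


def hidden_node(inputs, weights, activation=sigmoid):
    """Value of one node: the activation of the weighted sum of its inputs."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return activation(total)


x = [1.0, 2.0, 3.0]    # x1, x2, x3
w = [0.5, -0.25, 0.0]  # w11, w21, w31 (made-up values)
print(hidden_node(x, w))                        # sigmoid as the activation
print(hidden_node(x, w, activation=math.tanh))  # tanh from the standard library
```

Swapping the activation argument is all it takes to try a different function, which is one reason the weighted sum and the activation are usually kept separate.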

Furthermore, the output layer will depend on the values in the hidden layer, so that:

z₁ = σ(v₁₁y₁ + v₂₁y₂ + v₃₁y₃)

Here the v's represent the weights between the hidden layer and the output layer, just as the w's represent the weights between the input layer and the hidden layer.

If you insert the expression for y₁, y₂ and y₃ into z₁ and z₂, you get an expression for the output based on the input.
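Chaining the two formulas gives what is often called a forward pass. Below is a small sketch for the network in the figure (three inputs, three hidden nodes, two outputs). We assume the sigmoid as the activation function, and the weight values are invented. The weights are stored so that w[i][j] connects input node i+1 to hidden node j+1, following the index order used in the formulas.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def layer(values, weights):
    """Compute the next layer; weights[i][j] links incoming node i to outgoing node j."""
    n_out = len(weights[0])
    return [
        sigmoid(sum(values[i] * weights[i][j] for i in range(len(values))))
        for j in range(n_out)
    ]


def forward(x, w, v):
    """Return the hidden values y and the output values z for the input x."""
    y = layer(x, w)  # input layer -> hidden layer
    z = layer(y, v)  # hidden layer -> output layer
    return y, z


# Made-up weights: w is 3x3 (input -> hidden), v is 3x2 (hidden -> output).
w = [[0.2, -0.3, 0.1], [0.4, 0.1, -0.2], [-0.1, 0.3, 0.2]]
v = [[0.3, -0.2], [-0.4, 0.1], [0.2, 0.3]]
y, z = forward([0.5, 0.2, 0.9], w, v)
print(y, z)
```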

Based on this expression, we can adjust the weights in the network (the w's and v's) when the network is trained, and it is these weights that enable the network to recognise patterns.

Gradient descent method

When adjusting the weights, the network often uses what is called the gradient descent method. Very briefly, it involves calculating a gradient that tells you in which direction and how fast the weights should change to minimise the error in the output layer. When we feed the network a lot of data where we already know what the correct output is, it can tune the weights over many iterations to become as good as possible at recognising the patterns we want.
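Here is a minimal sketch of what such training can look like for the small network in the figure. Several things are assumptions on our part and not something this post prescribes: the sigmoid activation, a squared-error measure of the mistake in the output layer, a learning rate of 0.5, one update per example, and the two made-up training examples. The gradients are calculated with the chain rule, a technique usually called backpropagation.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def forward(x, w, v):
    """Return hidden values y and output values z for input x."""
    y = [sigmoid(sum(x[i] * w[i][j] for i in range(len(x)))) for j in range(len(w[0]))]
    z = [sigmoid(sum(y[j] * v[j][k] for j in range(len(y)))) for k in range(len(v[0]))]
    return y, z


def total_error(samples, w, v):
    """Sum of the squared-error losses over all (input, target) pairs."""
    error = 0.0
    for x, target in samples:
        _, z = forward(x, w, v)
        error += 0.5 * sum((z[k] - target[k]) ** 2 for k in range(len(z)))
    return error


def train_step(x, target, w, v, learning_rate):
    """One gradient descent update of w and v (in place) for one example."""
    y, z = forward(x, w, v)
    # Error signal at each output node (derivative of the loss before the sigmoid).
    delta_z = [(z[k] - target[k]) * z[k] * (1.0 - z[k]) for k in range(len(z))]
    # Error signal at each hidden node, passed backwards through the v weights.
    delta_y = [
        sum(v[j][k] * delta_z[k] for k in range(len(z))) * y[j] * (1.0 - y[j])
        for j in range(len(y))
    ]
    # Move every weight a small step against its gradient.
    for j in range(len(y)):
        for k in range(len(z)):
            v[j][k] -= learning_rate * y[j] * delta_z[k]
    for i in range(len(x)):
        for j in range(len(y)):
            w[i][j] -= learning_rate * x[i] * delta_y[j]


def train(samples, w, v, learning_rate=0.5, epochs=1000):
    """Repeat the update over all examples for a number of iterations."""
    for _ in range(epochs):
        for x, target in samples:
            train_step(x, target, w, v, learning_rate)


# Two made-up examples with known correct outputs.
samples = [([1.0, 0.0, 0.0], [1.0, 0.0]), ([0.0, 0.0, 1.0], [0.0, 1.0])]
w = [[0.2, -0.3, 0.1], [0.4, 0.1, -0.2], [-0.1, 0.3, 0.2]]
v = [[0.3, -0.2], [-0.4, 0.1], [0.2, 0.3]]

error_before = total_error(samples, w, v)
train(samples, w, v)
error_after = total_error(samples, w, v)
print(error_before, error_after)
```

The learning rate controls how large each step is: too small and training is slow, too large and the weights can overshoot. In practice you would normally let a machine learning library calculate the gradients for you.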

 

Deep Neural Networks

The neural network drawn in the figure above is unlikely to be able to recognise particularly complex patterns, since it has only one hidden layer with only three nodes. If we want to solve complex tasks, such as classifying images, we need several hidden layers and preferably many more nodes in each layer. When an artificial neural network has multiple hidden layers, we call it a deep neural network, and this is what is meant by deep learning.
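In code, going from one hidden layer to several is mostly a matter of repeating the same layer calculation in a loop. The sketch below assumes the sigmoid activation. The helper constant_layer is hypothetical and only there to keep the example short, since a real network would start from small random weights and not from identical ones.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def forward_deep(x, layers):
    """Pass x through each weight matrix in turn.

    weights[i][j] links node i of one layer to node j of the next layer.
    """
    values = x
    for weights in layers:
        values = [
            sigmoid(sum(values[i] * weights[i][j] for i in range(len(values))))
            for j in range(len(weights[0]))
        ]
    return values


def constant_layer(n_in, n_out, value):
    """Hypothetical helper: an n_in x n_out weight matrix filled with a single value."""
    return [[value] * n_out for _ in range(n_in)]


# Three inputs, two hidden layers with four nodes each, and two outputs.
layers = [
    constant_layer(3, 4, 0.1),
    constant_layer(4, 4, -0.2),
    constant_layer(4, 2, 0.3),
]
output = forward_deep([0.5, 0.2, 0.9], layers)
print(output)
```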

To make these networks as efficient as possible, there are numerous variations of artificial neural networks, clever techniques for building the layers and different ways of training the network. Some notable examples are Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN), but we will not go deeper into these in this post.

Want to learn more?

Do you want to know more about machine learning?
Sign up for our morning seminar on machine learning in digital product development, or take a look at one of our other blog posts.

We can help you digitise your business - book a meeting with us!

Author Herman Dragesund