A Lightning-Fast Introduction to Deep Learning and TensorFlow 2.0 – Built In

Posted: May 14, 2020 at 4:52 pm

From navigating to a new place to picking out new music, algorithms have laid the foundation for large parts of modern life. Similarly, artificial intelligence is booming because it automates and backs so many products and applications. Recently, I addressed some analytical applications for TensorFlow. In this article, I'm going to lay out a higher-level view of Google's TensorFlow deep learning framework, with the ultimate goal of helping you to understand and build deep learning algorithms from scratch.

Over the past couple of decades, deep learning has evolved rapidly, leading to massive disruption in a range of industries and organizations. Its roots go back to 1943, when Warren McCulloch and Walter Pitts created a computational model based on the neural networks of the human brain, the first artificial neural network (or ANN). Deep learning now denotes a branch of machine learning that uses multi-layered neural networks to learn patterns directly from data.

Backpropagation is a popular algorithm that has had a huge impact in the field of deep learning. It allows ANNs to learn by themselves based on the errors they generate while learning. To further enhance the scope of an ANN, architectures like Convolutional Neural Networks, Recurrent Neural Networks, and Generative Networks have come into the picture. Before we delve into them, let's first understand the basic components of a neural network.

Neurons and Artificial Neural Networks

An artificial neural network is a representational framework that extracts features from the data it's given. The basic computational unit of an ANN is the neuron. Neurons are organized into layers through which the information passes. As the information flows through these layers, the neural network identifies patterns in the data. This type of processing makes ANNs useful for several applications, such as prediction and classification.

Now let's take a look at the basic structure of an ANN. It consists of three kinds of layers: the input layer, one or more hidden layers, and the output layer. The sizes of the input and output layers are fixed. Inputs initially pass through the input layer, which always accepts a constant set of dimensions. For instance, if we wanted to train a classifier that differentiates between dogs and cats, the inputs (in this case, images) should all be of the same size. The input then passes through the hidden layers, where the network updates the weights and recognizes the patterns. In the final step, we classify the data at the output layer.

Weights and Biases

Every neuron inside a neural network is associated with two kinds of parameters: weights and a bias. A weight is a real number that controls the strength of the signal between two neurons. If the output is desirable, meaning that it is close to the one we expected the network to produce, then the weights are close to ideal. If the network generates an erroneous output that's far from the actual one, it alters the weights to improve the subsequent results.

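Bias, the other parameter, is a constant that is added to a neuron's weighted sum before the activation function is applied; it shifts the point at which the neuron activates and is learned along with the weights. It should not be confused with bias in the statistical sense, which is the algorithm's tendency to consistently learn the wrong thing by not taking into account all the information in the data. For the model to be accurate, this statistical bias needs to be low. If there are inconsistencies in the dataset, like missing values, too few data tuples, or erroneous input data, the bias would be high and the predicted values could be wrong.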

Working of a Neural Network

Before we get started with TensorFlow, let's examine how a neural network produces an output from weights, biases, and inputs by taking a look at one of the earliest neural networks, the Perceptron, which dates back to 1958. The Perceptron is a simple binary classifier. Understanding how it works will help us comprehend the workings of a modern neuron.

The Perceptron is a supervised machine learning technique that maps a vector of binary variables to a single binary output. It works as follows:

Multiply the inputs (x1, x2, x3) of the network by their corresponding weights (w1, w2, w3).

Add the products together. This is called the weighted sum, denoted by x1*w1 + x2*w2 + x3*w3.

Apply the activation function: determine whether the weighted sum is greater than a threshold (say, 0.5). If it is, assign 1 as the output; otherwise, assign 0. This is a simple step function.
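The three steps above can be sketched in a few lines of plain Python. This is only an illustration of the procedure as described here: the function name perceptron_output and the example inputs and weights are assumptions made for the sketch, and the 0.5 threshold is the one suggested in the text.

```python
def perceptron_output(inputs, weights, threshold=0.5):
    """Return 1 if the weighted sum of the inputs exceeds the threshold, else 0."""
    # Steps 1 and 2: multiply each input by its weight and add the products.
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    # Step 3: step activation function.
    return 1 if weighted_sum > threshold else 0


# Hypothetical inputs and weights, chosen only for illustration.
x = [1, 0, 1]
w = [0.4, 0.9, 0.3]
print(perceptron_output(x, w))  # weighted sum is about 0.7, so this prints 1
```

Note that nothing is learned here: the weights are fixed by hand, which is exactly the limitation the following sections address.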

Of course, the Perceptron is a simple neural network that doesn't cover all the concepts necessary for an end-to-end neural network. Therefore, let's go over all the phases that a sophisticated ANN has to go through.

Input

A neural network has to be defined with the number of input dimensions, output features, and hidden units. All these settings fall into a common basket called hyperparameters. Hyperparameters are numeric values that determine and define the neural network structure.

Weights and biases are set randomly for all neurons in the hidden layers.

Feed Forward

The data is sent through the input and hidden layers, where each neuron applies its weights and bias. This creates a function that maps the input to the output data. Mathematically, it is defined as y = f(x), where y is the output, x is the input, and f is the function computed by the network from its weights, biases, and activation functions.

For every forward pass (when the data travels from the input to the output layer), the loss is calculated from the difference between the actual value and the predicted value. The loss is then sent back through the network (backpropagation), and the weights are adjusted to reduce it.

Output error

The loss is gradually reduced by applying gradient descent to the loss function.

The gradient of the loss can be calculated with respect to any weight and bias.

Backpropagation

We backpropagate the error that traverses through each and every layer using the backpropagation algorithm.

Output

By minimizing the loss, the network updates the weights in every iteration (one forward pass plus one backward pass) and increases its accuracy.
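The phases above can be tied together in a short plain-Python sketch. It trains a single neuron with one weight and one bias; the toy data (generated from y = 2x + 1), the learning rate, the number of iterations, and the use of mean squared error as the loss are all assumptions made for illustration, not details taken from the article.

```python
import random

# Toy data generated from y = 2x + 1 (an assumption made for this sketch).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]

# Input: the weight and bias start out random; the learning rate is a hyperparameter.
random.seed(0)
w = random.uniform(-1.0, 1.0)
b = random.uniform(-1.0, 1.0)
learning_rate = 0.05


def mean_squared_error(weight, bias):
    """Average squared difference between actual and predicted values."""
    return sum((y - (weight * x + bias)) ** 2 for x, y in zip(xs, ys)) / len(xs)


initial_loss = mean_squared_error(w, b)

for _ in range(2000):
    # Feed forward: compute the predictions for the current parameters.
    predictions = [w * x + b for x in xs]
    # Output error: actual value minus predicted value.
    errors = [y - p for y, p in zip(ys, predictions)]
    # Backpropagation: gradients of the mean squared error w.r.t. w and b.
    grad_w = -2.0 * sum(e * x for e, x in zip(errors, xs)) / len(xs)
    grad_b = -2.0 * sum(errors) / len(xs)
    # Output: move the parameters against the gradient to reduce the loss.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

final_loss = mean_squared_error(w, b)
print(round(w, 3), round(b, 3))  # should end up close to 2 and 1
```

With a single neuron the "backward pass" is just two derivatives; in a multi-layer network the same idea is applied layer by layer.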

As we haven't yet talked about what an activation function is, I'll expand on that a bit in the next section.

Activation Functions

An activation function is a core component of any neural network. It lets the network learn a non-linear, complex functional mapping between the input and the response variables or output. Its main purpose is to convert the input signal of a node in an ANN to an output signal. That output signal is the input to the subsequent layer in the stack. Several types of activation functions are available for different use cases. You can find a list comprising the most popular activation functions along with their respective mathematical formulae here.
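As a small illustration, here are three widely used activation functions written in plain Python. The choice of sigmoid, tanh, and ReLU is mine, not the article's, and the function names are simply the conventional ones.

```python
import math


def sigmoid(z):
    """Squash z into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    """Squash z into the range (-1, 1)."""
    return math.tanh(z)


def relu(z):
    """Pass positive values through unchanged and clip negatives to 0."""
    return max(0.0, z)


# Apply each function to a few sample input signals.
for z in (-2.0, 0.0, 2.0):
    print(z, sigmoid(z), tanh(z), relu(z))
```

Each function takes the weighted sum arriving at a node and produces the signal passed on to the next layer.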

Now that we understand what a feed forward pass looks like, let's also explore the backward propagation of errors.

Loss Function and Backpropagation

During training of a neural network, there are too many unknowns to be deciphered. As a result, calculating the ideal weights for all the nodes in a neural network is difficult. Therefore, we use an optimization function through which we could navigate the space of possible ideal weights to make good predictions with a trained neural network.

We use a gradient descent optimization algorithm wherein the weights are updated using the backpropagation of error. The term gradient in gradient descent refers to an error gradient: the model with a given set of weights is used to make predictions, and the error for those predictions is calculated. Backpropagation then calculates the partial derivatives of the loss function (errors) with respect to every weight w and bias b, and gradient descent uses them to update the parameters. In practice, this means that the error terms are calculated starting from the final layer and moving towards the input layer, with the weights and biases updated along the way. This is based on differentiating the respective error terms along each layer (the chain rule). To make our lives easier, however, these loss functions and backpropagation algorithms are readily available in neural network frameworks such as TensorFlow and PyTorch.
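As a minimal sketch of what "readily available" means in TensorFlow 2.x, tf.GradientTape records the forward computation and returns the partial derivatives of the loss. The single-neuron model, the sample values, and the squared-error loss below are assumptions chosen to keep the arithmetic easy to check by hand.

```python
import tensorflow as tf

# Hypothetical single-neuron model: prediction = w * x + b.
w = tf.Variable(0.5)
b = tf.Variable(0.0)
x = tf.constant(2.0)
y_true = tf.constant(3.0)

# Record the forward pass so TensorFlow can differentiate through it.
with tf.GradientTape() as tape:
    y_pred = w * x + b                 # 0.5 * 2 + 0 = 1
    loss = tf.square(y_true - y_pred)  # (3 - 1)^2 = 4

# Partial derivatives of the loss with respect to w and b.
grad_w, grad_b = tape.gradient(loss, [w, b])
print(grad_w.numpy(), grad_b.numpy())  # -8.0 -4.0
```

By hand, the derivative with respect to w is 2 * (y_pred - y_true) * x = 2 * (-2) * 2 = -8, and with respect to b it is 2 * (-2) = -4, which matches what the tape returns.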

Moreover, a hyperparameter called the learning rate controls how much the weights of a network are adjusted with respect to the gradient. The lower the learning rate, the slower we travel down the slope (to reach the optimum, or so-called ideal case) while minimizing the loss.
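The effect of the learning rate can be seen on a toy problem. The sketch below minimizes a hypothetical one-parameter loss, (w - 3)^2, whose optimum is w = 3; the loss, the starting point, and the step count are illustrative assumptions.

```python
def gradient_descent(learning_rate, steps=20, start=0.0):
    """Minimize loss(w) = (w - 3)**2 and return the final value of w."""
    w = start
    for _ in range(steps):
        gradient = 2.0 * (w - 3.0)      # derivative of the loss at w
        w -= learning_rate * gradient   # step against the gradient
    return w


# A lower learning rate ends up further from the optimum (w = 3)
# after the same number of steps.
for lr in (0.01, 0.1, 0.5):
    print(lr, gradient_descent(lr))

# A learning rate that is too high overshoots and moves away from the optimum.
print(1.1, gradient_descent(1.1))
```

The trade-off is therefore two-sided: too low and training is slow, too high and the updates overshoot the minimum.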

TensorFlow is a powerful neural network framework that can be used to deploy high-level machine learning models into production. It was open-sourced by Google in 2015. Since then, its popularity has increased, making it a common choice for building deep learning models. On October 1st, a new stable version called TensorFlow 2.0 was released, with a few major changes:

Eager Execution by Default - Instead of creating a tf.Session(), we can directly execute the code as usual Python code. In TensorFlow 1.x, we had to build a TensorFlow graph before computing any operation. In TensorFlow 2.0, however, we can build neural networks on the fly.

Keras Included - Keras is a high-level neural network API that runs on top of TensorFlow. It is now integrated into TensorFlow 2.0, so we can directly use it as tf.keras and thereby define our neural network.

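TF Datasets - Input pipelines are built with the tf.data module, and a collection of ready-to-use datasets to work and play with is available through the separate TensorFlow Datasets package (tensorflow_datasets).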

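1.0 Support - Existing TensorFlow 1.x code can still be run with TensorFlow 2.0 through the tf.compat.v1 compatibility module, although in most cases it has to be adapted first, for instance with the tf_upgrade_v2 conversion script.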

Major Documentation and API cleanup changes have also been introduced.
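A brief sketch of the first three changes listed above, assuming TensorFlow 2.x is installed. The matrices, layer sizes, and batch size are arbitrary values picked for illustration.

```python
import tensorflow as tf

# Eager execution: operations run immediately and return concrete values,
# with no session or explicit graph construction.
a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
identity = tf.constant([[1.0, 0.0], [0.0, 1.0]])
product = tf.matmul(a, identity)
print(product.numpy())

# Keras is available as tf.keras: a small two-layer network.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
# Calling the model on a batch of 4-feature inputs creates its weights.
output = model(tf.zeros((1, 4)))
print(output.shape)

# tf.data: build an input pipeline that yields the numbers 0..5 in batches of 2.
dataset = tf.data.Dataset.from_tensor_slices(tf.range(6)).batch(2)
batches = [batch.numpy().tolist() for batch in dataset]
print(batches)  # [[0, 1], [2, 3], [4, 5]]
```

The model here is only constructed and called once; training it would additionally require compiling it with a loss and an optimizer and calling fit on real data.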

The TensorFlow library is built around computational graphs and a runtime for executing such computational graphs. Now, let's perform a simple operation in TensorFlow.
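A minimal sketch of such an operation, written in plain Python. The values 15 and 5 are illustrative assumptions; the variable names follow the description in the text.

```python
# Two input values (illustrative assumptions).
a = 15
b = 5

# Product and sum of the two values.
prod = a * b
sum = a + b  # note: this name shadows Python's built-in sum(); fine for a short snippet

# Divide the product by the sum and print the result.
result = prod / sum
print(result)  # 3.75
```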

Here, we declared two variables, a and b. We calculated the product of those two variables using Python's multiplication operator (*) and stored the result in a variable called prod. Next, we calculated the sum of a and b and stored it in a variable named sum. Lastly, we declared the result variable, which divides the product by the sum, and printed it.

This explanation is just a Pythonic way of understanding the operation. In TensorFlow, every operation is a node in a computational graph. This is a more abstract way of describing a computer program and its computations. It helps in understanding the primitive operations and the order in which they are executed. In this case, we first multiply a and b, and only once this expression has been evaluated do we take their sum. Finally, we take prod and sum and divide them to output the result.

TensorFlow Basics

To get started with TensorFlow, we should be aware of a few essentials related to computational graphs. Let's discuss them in brief:

Variables and Placeholders: TensorFlow uses the usual variables, which can be updated at any point in time, except that these need to be initialized before the graph is executed. Placeholders, on the other hand, are used to feed data into the graph from outside. Unlike variables, they don't need to be initialized. Consider a regression equation, y = mx + c, where x and y are placeholders, and m and c are variables. Note that placeholders belong to the TensorFlow 1.x graph API; in TensorFlow 2.0 they are only available as tf.compat.v1.placeholder, and data is normally passed in as ordinary function arguments.

Constants and Operations: Constants are values that cannot be updated. Operations represent nodes in the graph that perform computations on data.

Graph: The graph is the backbone that connects all the variables, placeholders, constants, and operations.
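The regression equation y = mx + c mentioned above can be sketched in TensorFlow 2.x as follows. This is an assumption-laden illustration: the function name predict and the numeric values are made up, and the input x is passed as a function argument because TensorFlow 2.x has no placeholders outside tf.compat.v1.

```python
import tensorflow as tf

# Variables hold state that can be updated: the slope m and the intercept c.
m = tf.Variable(2.0)
c = tf.Variable(1.0)


@tf.function  # traces the Python function into a computational graph
def predict(x):
    # Operations (multiply, add) become nodes in the traced graph.
    return m * x + c


# A constant tensor of input values.
x = tf.constant([0.0, 1.0, 2.0])
print(predict(x).numpy())  # [1. 3. 5.]

# Updating a variable changes the result of the same graph.
c.assign(0.0)
print(predict(x).numpy())  # [0. 2. 4.]
```

The tf.function decorator is optional here; without it, the same code runs eagerly and produces the same values.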

Prior to installing TensorFlow 2.0, it's essential that you have Python on your machine. Let's look at its installation procedure.

Python for Windows

You can download it from the official Python website, python.org.

Click on the Latest Python 3 release - Python x.x.x. Select the option that suits your system (32-bit - Windows x86 executable installer, or 64-bit - Windows x86-64 executable installer). After downloading the installer, follow the instructions that are displayed on the setup wizard. Make sure to add Python to your PATH using environment variables.

Python for OSX

You can download it from the official Python website, python.org.

Click on the Latest Python 3 release - Python x.x.x. Select the macOS 64-bit installer, and run the file.

Python on OSX can also be installed using Homebrew (a package manager).

To do so, with Homebrew already installed, type the following command:

    brew install python

Python for Debian/Ubuntu

Invoke the following commands:

    sudo apt-get update
    sudo apt-get install python3 python3-pip

This installs the latest version of Python and pip available for your system.

Python for Fedora

Invoke the following command:

    sudo dnf install python3 python3-pip

This installs the latest version of Python and pip available for your system.

After you've got Python, it's time to install TensorFlow in your workspace.

To fetch the latest version, pip3 needs to be updated. To do so, type the command:

    pip3 install --upgrade pip

Now, install TensorFlow 2.0:

    pip3 install --upgrade tensorflow

This automatically installs the latest version of TensorFlow onto your system. The same command can also be used to update an older version of TensorFlow.

The argument tensorflow in the above command could be any of these:

tensorflow: Latest stable release (2.x) for CPU-only.

tensorflow-gpu: Latest stable release with GPU support (Ubuntu and Windows).

tf-nightly: Preview build (unstable). Ubuntu and Windows include GPU support.

tensorflow==1.15: The final version of TensorFlow 1.x.

To verify your install, execute the code:
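One way to check the installation, as a minimal sketch: import the library, print its version, and run a trivial computation. The sample values are arbitrary.

```python
import tensorflow as tf

# The version string should start with "2." for a TensorFlow 2.x install.
print(tf.__version__)

# A trivial computation confirms that operations actually execute.
total = tf.reduce_sum(tf.constant([1.0, 2.0, 3.0]))
print(total.numpy())  # 6.0
```

If the import fails, TensorFlow was most likely installed for a different Python interpreter than the one being run.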

Now that you have TensorFlow on your local machine, Jupyter notebooks are a handy tool for setting up the coding space. Execute the following command to install Jupyter on your system:

    pip3 install jupyter

Now that everything is set up, let's explore the fundamentals of TensorFlow.

Tensors have previously been used largely in math and physics. In math, a tensor is an algebraic object that obeys certain transformation rules. It defines a mapping between objects and is similar to a matrix, although a tensor has no specific limit to its possible number of indices. In physics, a tensor has the same definition as in math, and is used to formulate and solve problems in areas like fluid mechanics and elasticity.

Although tensors were not widely used in computer science before, the machine learning and deep learning boom has made them central to solving data-crunching problems.

Scalars

The simplest tensor is a scalar, which is a single number and is denoted as a rank-0 tensor or a 0th order tensor. A scalar has magnitude but no direction.

Vectors

A vector is an array of numbers and is denoted as a rank-1 tensor or a 1st order tensor. Vectors can be represented as either column vectors or row vectors.

A vector has both magnitude and direction. Each value in the vector gives the coordinate along a different axis, thus establishing direction. It can be depicted as an arrow; the length of the arrow represents the magnitude, and the orientation represents the direction.

Matrices

A matrix is a 2D array of numbers where each element is identified by a set of two numbers, row and column. A matrix is denoted as a rank-2 tensor or a 2nd order tensor. In simple terms, a matrix is a table of numbers.

Tensors

A tensor is a multi-dimensional array with any number of indices. Imagine a 3D array of numbers, where the data is arranged as a cube: that's a tensor. When it's an nD array of numbers, that's a tensor as well. Tensors are usually used to represent complex data. When the data has many dimensions (>=3), a tensor is helpful in organizing it neatly. After initializing, a tensor of any number of dimensions can be processed to generate the desired outcomes.

TensorFlow represents tensors with ease using simple functionalities defined by the framework. Further, the mathematical operations that are usually carried out with numbers are implemented using the functions defined by TensorFlow.
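The scalar, vector, matrix, and higher-rank cases described above can be sketched with tf.constant and tf.zeros. The values and the 2 x 3 x 4 shape are arbitrary choices for illustration.

```python
import tensorflow as tf

scalar = tf.constant(7)                 # rank 0: a single number
vector = tf.constant([1, 2, 3])         # rank 1: an array of numbers
matrix = tf.constant([[1, 2], [3, 4]])  # rank 2: rows and columns
cube = tf.zeros([2, 3, 4])              # rank 3: a 3D block of numbers

named = [("scalar", scalar), ("vector", vector), ("matrix", matrix), ("cube", cube)]
ranks = []
for name, tensor in named:
    rank = int(tf.rank(tensor).numpy())  # number of indices needed to address an element
    ranks.append(rank)
    print(name, "rank:", rank, "shape:", tensor.shape)
```

The rank grows by one with each step, while the shape lists how many entries there are along each axis.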

First, let's import TensorFlow into our workspace. To do so, run the following statement:

    import tensorflow as tf

This enables us to use the alias tf thereafter.

Now, let's take a quick look at the basic operations and math; you can execute the code in a Jupyter notebook as you go for a better understanding of the concepts.

tf.Tensor

The primary object in TensorFlow that you play with is tf.Tensor. This is a tensor object that is associated with a value. It has two properties bound to it: data type and shape. The data type defines the type and size of the data held by a tensor. Possible types include float32, int32, string, et cetera. The shape defines the number of dimensions and the size of each one.
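A short sketch of inspecting these two properties, with made-up values. tf.cast is used to show that the data type can be converted, which produces a new tensor rather than changing the existing one.

```python
import tensorflow as tf

t = tf.constant([[1.5, 2.5, 3.5], [4.5, 5.5, 6.5]])
print(t.dtype)  # float32 is the default for Python floats
print(t.shape)  # 2 rows and 3 columns

# Converting to another data type returns a new tensor.
as_int = tf.cast(t, tf.int32)
print(as_int.dtype)

# Strings are a supported data type as well.
text = tf.constant("hello")
print(text.dtype)
```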

tf.Variable()

The variable constructor requires an argument which could be a tensor of any shape and type. After creating the instance, this variable is added to the TensorFlow graph and can be modified using any of the assign methods. It is declared as follows:
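A minimal sketch of declaring and modifying a variable; the initial values are arbitrary. assign and assign_add are two of the assign methods referred to above.

```python
import tensorflow as tf

# Declare a variable from an initial value (any shape and type).
v = tf.Variable([[1.0, 2.0], [3.0, 4.0]])
print(v)

# Overwrite the value in place; the new value must have the same shape.
v.assign([[0.0, 0.0], [0.0, 0.0]])

# Add to the current value in place.
v.assign_add(tf.ones([2, 2]))
print(v.numpy())  # every entry is now 1.0
```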


tf.constant()

tf.constant() creates a tensor populated with a value and, optionally, a dtype and a shape. This value remains constant and cannot be modified further.
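A small sketch of tf.constant() with and without the optional arguments; the numbers are illustrative.

```python
import tensorflow as tf

# A value with an explicit dtype and shape: six numbers arranged as 2 x 3.
c = tf.constant([1, 2, 3, 4, 5, 6], dtype=tf.float32, shape=[2, 3])
print(c)

# A scalar value is repeated to fill the requested shape.
filled = tf.constant(-1.0, shape=[2, 2])
print(filled.numpy())

# Unlike tf.Variable, a constant tensor has no assign method.
print(hasattr(c, "assign"))  # False
```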
