Neural network models#

Let us begin with a simple illustration. A Facebook user uploads some image files containing photographs of family and friends. The system automatically highlights faces and is able to identify the names of many of the individuals in those photos. How? In fact, how does it even know what part of a complex, multifaceted image represents a human face? The short answer is that the system uses biometric data combined with specialized deep learning algorithms based on artificial neural network (ANN) models. These algorithms attempt to mimic the current scientific understanding of how the human brain works and how it learns new things.
In late 2021, Facebook announced that it was shutting down its facial recognition feature for a variety of socio-political reasons. But the technological innovations behind it continue to grow and to power many other cutting-edge, real-world applications, such as self-driving cars, voice recognition, credit card fraud detection, targeted advertising, and more. This chapter introduces the foundational concepts underlying ANNs and their use in modern, data-centric applications. Our focus is primarily on methods that belong to the category known as "supervised learning" in machine learning nomenclature.

Conceptual overview#

An artificial neural network (ANN) is essentially a mathematical model that receives a set of numeric inputs and, based on them, produces an output. To a mathematician this, of course, describes a function, which is certainly one way to conceptualize an ANN. From a model-building perspective, an ANN is a set of very simple neuron-like components interconnected in the form of a network, whose structure and parameters determine what precise output is generated. It is common to think of the structure of these networks as composed of layers (see Figure 1.1). Typically, there is an input layer and an output layer, together with optional layers in between, usually referred to as hidden layers.
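To make the "network as a function" view concrete, here is a minimal sketch (not from the text) of a forward pass through a layered network in plain Python. The sigmoid activation and the specific weights are illustrative assumptions, not values from the chapter:

```python
import math

def dense_layer(inputs, weights, biases):
    """One layer: each row of `weights` holds one neuron's input weights.
    Each neuron outputs sigmoid(weighted sum of inputs minus its bias)."""
    return [
        1 / (1 + math.exp(-(sum(x * w for x, w in zip(inputs, row)) - b)))
        for row, b in zip(weights, biases)
    ]

# A tiny network: 2 inputs -> hidden layer of 2 neurons -> 1 output neuron.
hidden = dense_layer([1.0, 0.5], [[0.2, -0.4], [0.7, 0.1]], [0.0, 0.1])
output = dense_layer(hidden, [[0.5, -0.3]], [0.2])
```

The whole network is just a composition of such layer functions: the outputs of one layer become the inputs of the next.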

Example of layered structure of a typical neural network. [Image source: https://www.geeksforgeeks.org/introduction-to-artificial-neural-networks/]
A neuron receiving \(n\) inputs and producing output \(\hat{y}\).

To understand the functioning of a neural network, let us look at how a single component, which we will call a neuron, works. Figure 1.2 shows a schematic of a neuron receiving \(n\) inputs and, following a sequence of steps, producing an output \(\hat{y}\). Neurons typically take in multiple inputs and produce a single output. The effect of each input \(x_i\) is moderated by a numerical weight coefficient \(w_i\), which represents the strength of the connection between \(x_i\) and the output. Typically, we want to compare the sum \(\sum_{i=1}^n x_i \cdot w_i\) with some specified threshold value \(b\), known as the bias, to determine the neuron’s output. This is done by an activation function that takes the input \(\sum_{i=1}^n x_i \cdot w_i - b\) and produces the output \(\hat{y}\).
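The computation just described can be sketched in a few lines of Python. The sigmoid function used here is one common choice of activation; the text leaves the activation unspecified, so treat it as an illustrative assumption:

```python
import math

def neuron(inputs, weights, bias):
    """Compute a neuron's output: activation(sum of x_i * w_i, minus bias)."""
    z = sum(x * w for x, w in zip(inputs, weights)) - bias
    return 1 / (1 + math.exp(-z))  # sigmoid activation, output in (0, 1)

# Example: 3 inputs with their connection weights and a bias threshold.
y_hat = neuron([0.5, 0.3, 0.9], [0.8, -0.2, 0.4], bias=0.1)
```

When the weighted sum exactly equals the bias, \(z = 0\) and the sigmoid returns 0.5; larger sums push the output toward 1, smaller ones toward 0.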
