Nikhil Adithyan

Building mathematical functions with NumPy in Python

Updated: Mar 14, 2021

Familiarize yourself with the fundamentals of NumPy through practical implementation




Introduction


NumPy is a Python package that is primarily used for solving mathematical problems and performing high-level computations. It provides programmers with a wide variety of tools and built-in functions to create their own custom, complex functions for solving specific problems.


In this article, let’s get familiar with the basics of NumPy by building some common mathematical functions.


Function: 1 | Sigmoid


A sigmoid function is a mathematical function that has a characteristic S-shaped curve. The term is normally used to refer specifically to the logistic function, also called the logistic sigmoid function.


All sigmoid functions have the property that they map the entire number line into a small range such as between 0 and 1, or -1 and 1, so one use of a sigmoid function is to convert a real value into one that can be interpreted as a probability.



Sigmoid functions have become popular in deep learning because they can be used as an activation function in an artificial neural network. They were inspired by the activation potential in biological neural networks.


Sigmoid functions are also useful for many machine learning applications where a real number needs to be converted to a probability. A sigmoid function placed as the last layer of a machine learning model can serve to convert the model’s output into a probability score, which can be easier to work with and interpret.


Sigmoid formula:

sigmoid(x) = 1 / (1 + e^(-x))


Now, let’s build a function that calculates the sigmoid for a given ‘x’ value using NumPy in Python.


Python Implementation:



# 1. Sigmoid

import numpy as np

def sigmoid(x):
    sigmoid = 1 / (1 + np.exp(-x))
    return sigmoid

print('Sigmoid of 5 is ' + str(sigmoid(5)))

Output:



Sigmoid of 5 is 0.9933071490757153

Code Interpretation: Firstly, we are importing the NumPy package into the Python environment. Next, we are defining the sigmoid function, which takes ‘x’ as its parameter. Inside the function, we are calculating the sigmoid value using its mathematical formula and storing it in the ‘sigmoid’ variable. You can observe that we have used the ‘np.exp’ function, which calculates the exponential of the given number. Finally, we are returning the value using the ‘return’ statement. To see our function in action, we are using our ‘sigmoid’ function to calculate the sigmoid value of 5.
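
Because ‘np.exp’ is vectorized, the same ‘sigmoid’ function also works element-wise on NumPy arrays without any changes. A minimal sketch reusing the function defined above (the input values here are arbitrary examples):

x = np.array([-2.0, 0.0, 2.0])   # arbitrary example inputs
print(sigmoid(x))                # each element is mapped into the range (0, 1)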


Function: 2 | Sigmoid Gradient


Calculating the sigmoid value alone is not enough when training models; we also need to compute its gradient so that loss functions can be optimized using backpropagation. The derivative of the sigmoid is sigmoid(x) * (1 - sigmoid(x)).


Python Implementation:



# 2. Sigmoid derivative

def sigmoid_derivative(x):
    sigmoid = 1 / (1 + np.exp(-x))
    sigmoid_derivative = sigmoid * (1 - sigmoid)
    return sigmoid_derivative

x = np.array([1, 5, 10])
sig_der = sigmoid_derivative(x)

print('Sigmoid Derivative of [1, 5, 10] : ' + str(sig_der))

Output:



Sigmoid Derivative of [1, 5, 10] : [1.96611933e-01 6.64805667e-03 4.53958077e-05] 

Code Interpretation: The structure of the code and the functions used are almost identical to the previous sigmoid function; only the mathematical formula changes. Inside the function, we are first calculating the sigmoid value. Next, to calculate the gradient, we are subtracting the sigmoid value from 1 and multiplying the result by the sigmoid value itself. Finally, we are returning the output and testing the built function.
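
As an optional sanity check (not part of the original code), the analytic gradient can be compared against a numerical finite-difference approximation; the helper function and the epsilon value below are illustrative assumptions:

def numeric_sigmoid_derivative(x, eps = 1e-6):
    # approximate derivative: (f(x + eps) - f(x - eps)) / (2 * eps)
    sig = lambda t: 1 / (1 + np.exp(-t))
    return (sig(x + eps) - sig(x - eps)) / (2 * eps)

x = np.array([1, 5, 10])
print(numeric_sigmoid_derivative(x))   # should closely match sigmoid_derivative(x)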


Function: 3 | Data normalization


Data normalization is used to scale the data of an attribute so that it falls in a smaller range, such as -1.0 to 1.0 or 0.0 to 1.0. It is generally useful for classification algorithms.


It is generally required when we are dealing with attributes on different scales; otherwise, the effectiveness of an equally important attribute (on a lower scale) may be diluted because other attributes have values on a larger scale.


In simple words, when there are multiple attributes with values on different scales, this may lead to poor data models while performing data mining operations, so the attributes are normalized to bring them all onto the same scale.


Python Implementation:



# 3. Normalization

def normalizeRows(x):
    x_norm = np.linalg.norm(x, ord = 2, keepdims = True, axis = 1)
    x = x/x_norm
    return x

x = np.array([
    [0, 3, 4],
    [1, 6, 4]])

print("normalizeRows(x) = " + str(normalizeRows(x)))

Output:



normalizeRows(x) = [[0.         0.6        0.8       ]  [0.13736056 0.82416338 0.54944226]] 

Code Interpretation: We are defining a function ‘normalizeRows’ that normalizes each row of the data. Inside the function, we are defining a variable ‘x_norm’ to store the row norms. We have used the ‘np.linalg.norm’ function, which can return one of eight different matrix norms, or one of an infinite number of vector norms, depending on the value of the ‘ord’ parameter; here ‘ord = 2’ with ‘axis = 1’ gives the L2 norm of each row. Next, we are dividing ‘x’ by ‘x_norm’ so that every row has unit length. Finally, we are returning the output and testing our function.
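
To verify the result, each normalized row should have an L2 norm of 1. A small sketch using the same example matrix and the ‘normalizeRows’ function defined above:

x = np.array([
    [0, 3, 4],
    [1, 6, 4]])

# every row of the normalized matrix should have length 1
print(np.linalg.norm(normalizeRows(x), ord = 2, axis = 1))   # [1. 1.]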


Function: 4 | Softmax


The softmax function is a function that turns a vector of K real values into a vector of K real values that sum to 1. The input values can be positive, negative, zero, or greater than one, but the softmax transforms them into values between 0 and 1 so that they can be interpreted as probabilities. If one of the inputs is small or negative, the softmax turns it into a small probability, and if the input is large, then it turns it into a large probability, but it will always remain between 0 and 1.


The softmax function is sometimes called the softargmax function, or multi-class logistic regression. This is because the softmax is a generalization of logistic regression that can be used for multi-class classification, and its formula is very similar to the sigmoid function which is used for logistic regression.


Softmax formula:

softmax(x_i) = e^(x_i) / Σ_j e^(x_j)


Python Implementation:



# 4. Softmax

def softmax(x):
    x_exp = np.exp(x)
    x_sum = np.sum(x_exp, axis = 1, keepdims = True)
    s = x_exp / x_sum
    return s

x = np.array([
    [9, 2, 5, 0, 0],
    [7, 5, 0, 0 ,0]])

print("softmax(x) = " + str(softmax(x)))

Output:



softmax(x) = [[9.80897665e-01 8.94462891e-04 1.79657674e-02 1.21052389e-04
  1.21052389e-04]
 [8.78679856e-01 1.18916387e-01 8.01252314e-04 8.01252314e-04
  8.01252314e-04]]
 

Code Interpretation: We are defining a function ‘softmax’ that takes an ‘x’ matrix as the parameter. Inside the function, we are defining three different variables. The first variable is the ‘x_exp’ variable in which we are storing the exponential values of the given ‘x’ matrix. The next variable is the ‘x_sum’ variable that sums each row of ‘x_exp’. The last variable is the ‘s’ variable that stores the final values by dividing ‘x_exp’ by ‘x_sum’. Finally, we are returning the final output and testing the built function.
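
One practical caveat not covered above: for large input values, ‘np.exp’ can overflow. A common remedy is to subtract each row’s maximum before exponentiating, which does not change the result mathematically. A minimal sketch of that variant (the name ‘softmax_stable’ is just for illustration):

def softmax_stable(x):
    # subtracting the row-wise maximum prevents overflow in np.exp
    x_shifted = x - np.max(x, axis = 1, keepdims = True)
    x_exp = np.exp(x_shifted)
    return x_exp / np.sum(x_exp, axis = 1, keepdims = True)

x = np.array([
    [9, 2, 5, 0, 0],
    [7, 5, 0, 0, 0]])

print(softmax_stable(x))   # same probabilities as softmax(x)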


Function: 5 | Mean Absolute Error or L1 loss function


Before moving on to the L1 loss function, it is essential to know what exactly is a loss function. The loss function is the function that computes the distance between the current output of the algorithm and the expected output. It’s a method to evaluate how your algorithm models the data. It can be categorized into two groups. One for classification (discrete values, 0,1,2…) and the other for regression (continuous values).


The L1 loss function measures the average magnitude of the errors in a set of predictions, without considering their direction. It’s the average over the test sample of the absolute differences between prediction and actual observation where all individual differences have equal weight.


L1 loss function formula:

L1(yhat, y) = Σ_i |y_i - yhat_i|


Python Implementation:



# 5. L1 loss function

def L1(yhat, y):
    loss = np.sum(np.abs(yhat - y), axis = 0)
    return loss

yhat = np.array([.9, 0.2, 0.1, .4, .9])
y = np.array([1, 0, 0, 1, 1])

print("L1 = " + str(L1(yhat,y)))

Output:



L1 = 1.1 

Code Interpretation: We are defining a function ‘L1’ that takes the predicted and actual values as parameters. Inside the function, we are defining a variable ‘loss’ in which we are storing the calculated loss value. To calculate the loss value, we are first taking the absolute differences between the predicted and actual values using the ‘np.abs’ function, and then summing those absolute differences with ‘np.sum’. Finally, we are returning and testing the function.
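
Note that ‘L1’ as written returns the sum of the absolute errors; to obtain the mean absolute error described earlier, we simply average instead of summing. A minimal sketch (the function name is just for illustration):

def mean_absolute_error(yhat, y):
    # average of the absolute differences between predictions and targets
    return np.mean(np.abs(yhat - y))

yhat = np.array([.9, 0.2, 0.1, .4, .9])
y = np.array([1, 0, 0, 1, 1])

print(mean_absolute_error(yhat, y))   # 1.1 / 5 = 0.22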


Function: 6 | Mean Squared Error or L2 loss function


The Mean Squared Error (MSE) or L2 loss function of an estimator measures the average of the squares of the errors, i.e. the average squared difference between the estimated values and the true values. It is a risk function, corresponding to the expected value of the squared error loss. It is always non-negative, and values close to zero are better. The L2 loss is the second moment of the error (about the origin) and thus incorporates both the variance of the estimator and its bias.


L2 loss function formula:

L2(yhat, y) = Σ_i (y_i - yhat_i)^2


Python Implementation:



# 6. L2 loss function

def L2(yhat, y):
    loss = np.dot(np.abs(yhat - y), np.abs(yhat - y))
    return loss

yhat = np.array([.9, 0.2, 0.1, .4, .9])
y = np.array([1, 0, 0, 1, 1])

print("L2 = " + str(L2(yhat,y)))

Output:



L2 = 0.43 

Code Interpretation: We are defining a function ‘L2’ that takes the predicted and actual values as parameters. Inside the function, we are defining a variable ‘loss’ which stores the calculated L2 loss value. To calculate the L2 loss value, we are taking the absolute differences between the given inputs using the ‘np.abs’ function and computing the dot product of that difference vector with itself using the ‘np.dot’ function, which gives the sum of the squared errors. Finally, we are returning and testing the built function.
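
Similarly, ‘L2’ returns the sum of squared errors; dividing by the number of samples gives the mean squared error. A minimal sketch (the function name is just for illustration):

def mean_squared_error(yhat, y):
    # average of the squared differences between predictions and targets
    return np.mean((yhat - y) ** 2)

yhat = np.array([.9, 0.2, 0.1, .4, .9])
y = np.array([1, 0, 0, 1, 1])

print(mean_squared_error(yhat, y))   # 0.43 / 5 = 0.086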


Final Thoughts!


Even though I personally hate mathematics, I try to make it enjoyable by programming and calculating the answers. Doing this helps me understand mathematics more easily and apply these functions to solve real-world problems. You too can follow my approach if you face a similar problem. This tutorial is just a glimpse of how to build your own mathematical functions to solve problems, but there is much more to be explored. Learn to code and overcome your mathematics nightmare. Don't worry if you couldn't follow any of the coding parts, as I've provided the full source code of this article below. Programming makes life easier!


Happy programming!


Full code:



import numpy as np

# 1. Sigmoid

def sigmoid(x):
    sigmoid = 1 / (1 + np.exp(-x))
    return sigmoid

print('Sigmoid of 5 is ' + str(sigmoid(5)))

# 2. Sigmoid derivative

def sigmoid_derivative(x):
    sigmoid = 1 / (1 + np.exp(-x))
    sigmoid_derivative = sigmoid * (1 - sigmoid)
    return sigmoid_derivative

x = np.array([1, 5, 10])
sig_der = sigmoid_derivative(x)

print('Sigmoid Derivative of [1, 5, 10] : ' + str(sig_der))

# 3. Rows normalization

def normalizeRows(x):
    x_norm = np.linalg.norm(x, ord = 2, keepdims = True, axis = 1)
    x = x/x_norm
    return x

x = np.array([
    [0, 3, 4],
    [1, 6, 4]])

print("normalizeRows(x) = " + str(normalizeRows(x)))

# 4. Softmax

def softmax(x):
    x_exp = np.exp(x)
    x_sum = np.sum(x_exp, axis = 1, keepdims = True)
    s = x_exp / x_sum
    return s

x = np.array([
    [9, 2, 5, 0, 0],
    [7, 5, 0, 0 ,0]])

print("softmax(x) = " + str(softmax(x)))

# 5. L1 loss function

def L1(yhat, y):
    loss = np.sum(np.abs(yhat - y), axis = 0)
    return loss

yhat = np.array([.9, 0.2, 0.1, .4, .9])
y = np.array([1, 0, 0, 1, 1])

print("L1 = " + str(L1(yhat,y)))

# 6. L2 loss function

def L2(yhat, y):
    loss = np.dot(np.abs(yhat - y), np.abs(yhat - y))
    return loss

yhat = np.array([.9, 0.2, 0.1, .4, .9])
y = np.array([1, 0, 0, 1, 1])

print("L2 = " + str(L2(yhat,y)))