An Essential Guide to Numpy for Machine Learning in Python

Why would this be useful to you?

Most of us (even those who have already implemented ML algorithms) tend to forget the various library functions and end up writing code from scratch for functionality that already exists, which wastes both time and energy. That is why it pays to understand the nuances of the library you are using. Since Numpy is one of the essential libraries for Machine Learning, it deserves an article of its own.

Who are the readers of this Article?

Understanding Numpy is the starting point for data pre-processing and, later on, for implementing ML algorithms. So you may be someone who is about to learn Machine Learning or has just begun, and who wants more hands-on experience with Numpy for ML.

But my main focus in writing this article is for it to serve as a quick refresher for those who have used the library before and need a swift recap.

What are we waiting for? Let’s begin!

Numpy is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.

Moreover Numpy forms the foundation of the Machine Learning stack. In this article we cover the most frequently used Numpy operations.

1) Creating a Vector

Here we use Numpy to create a 1-D Array which we then call a vector.

#Load Library
import numpy as np

#Create a vector as a Row
vector_row = np.array([1,2,3])
print(vector_row)

#Create vector as a Column
vector_column = np.array([[1],[2],[3]])
print(vector_column)

Output:

[1 2 3]
[[1]
[2]
[3]]
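
One detail worth checking, as a small sketch that rebuilds the same two arrays: the row vector is a 1-D array, while the column vector is really a 2-D array with a single column. The shape attribute makes the difference visible.

```python
import numpy as np

vector_row = np.array([1, 2, 3])
vector_column = np.array([[1], [2], [3]])

# The row vector has one axis, the column vector has two
print(vector_row.shape)     # (3,)
print(vector_column.shape)  # (3, 1)
```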

2) Creating a Matrix

We Create a 2-D Array in Numpy and call it a Matrix. It contains 2 rows and 3 columns.

#Load Library
import numpy as np

#Create a Matrix
matrix = np.array([[1,2,3],[4,5,6]])
print(matrix)

Output:

[[1 2 3]
[4 5 6]]

3) Creating a Sparse Matrix

Given data with very few non-zero values, you want to represent it efficiently.

A frequent situation in Machine Learning is having a huge amount of data in which most of the elements are zeros. For example, let’s imagine a matrix where the columns are all the products on Amazon, each row is a user, and each entry records whether that user has bought that product. As you might have guessed, most users have bought only a tiny fraction of the products, so the vast majority of elements would be zero.

Sparse Matrices store only non zero elements and assume all other values will be zero, leading to significant computational savings.

#Load Library
import numpy as np
from scipy.sparse import csr_matrix

#Create a Matrix
matrix = np.array([[0,0],[0,1],[3,0]])
print(matrix)
#Create Compressed Sparse Row(CSR) matrix
matrix_sparse = csr_matrix(matrix)
print(matrix_sparse)

Output:

[[0 0]
[0 1]
[3 0]]
(1, 1) 1
(2, 0) 3
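
To see why this pays off, here is a minimal sketch that pads the same matrix with eight extra all-zero columns. The larger matrix is made up purely for illustration; nnz is the csr_matrix attribute that reports how many values are actually stored.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Same non-zero values as above, plus eight columns of zeros (illustrative data)
matrix_large = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                         [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
                         [3, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
matrix_large_sparse = csr_matrix(matrix_large)

# The dense array holds 30 elements, the sparse one stores only 2 values
print(matrix_large.size)        # 30
print(matrix_large_sparse.nnz)  # 2

# Converting back to a dense array recovers the original matrix
round_trip_ok = np.array_equal(matrix_large_sparse.toarray(), matrix_large)
print(round_trip_ok)  # True
```

However many zero columns are added, the number of stored values stays at 2.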

4) Selecting Elements

When you need to select one or more elements in a vector or matrix.

#Load Library
import numpy as np

#Create a vector as a Row
vector_row = np.array([ 1,2,3,4,5,6 ])

#Create a Matrix
matrix = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(matrix)

#Select 3rd element of Vector
print(vector_row[2])

#Select 2nd row 2nd column
print(matrix[1,1])
#Select all elements of a vector
print(vector_row[:])
#Select everything up to and including the 3rd element
print(vector_row[:3])
#Select everything after the 3rd element
print(vector_row[3:])
#Select the last element
print(vector_row[-1])
#Select the first 2 rows and all the columns of the matrix
print(matrix[:2,:])
#Select all rows and the 2nd column of the matrix
print(matrix[:,1:2])

Output:

[[1 2 3]
[4 5 6]
[7 8 9]]
3
5
[1 2 3 4 5 6]
[1 2 3]
[4 5 6]
6
[[1 2 3]
[4 5 6]]
[[2]
[5]
[8]]
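
The last selection above uses the slice 1:2 rather than the plain index 1, which is why the result is printed as a column. A short sketch of the difference, using the same matrix:

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# A slice (1:2) keeps the column axis, so the result is still 2-D
column_2d = matrix[:, 1:2]
print(column_2d.shape)  # (3, 1)

# A plain integer index drops that axis and returns a 1-D array
column_1d = matrix[:, 1]
print(column_1d)        # [2 5 8]
print(column_1d.shape)  # (3,)
```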

5) Describing a Matrix

When you want to know the shape, size and number of dimensions of a Matrix.

import numpy as np


#Create a Matrix
matrix = np.array([[1,2,3],[4,5,6],[7,8,9]])
#View the Number of Rows and Columns
print(matrix.shape)
#View the number of elements (rows*columns)
print(matrix.size)
#View the number of Dimensions(2 in this case)
print(matrix.ndim)

Output:

(3, 3)
9
2

6) Applying operations to elements

You want to apply some function to multiple elements in an array.

Numpy’s vectorize class converts a function into a function that can apply to multiple elements in an array or slice of an array.

#Load Library
import numpy as np

#Create a Matrix
matrix = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(matrix)

#Create a function that adds 100 to something
add_100 =lambda i: i+100

#Convert it into a vectorized function
vectorized_add_100= np.vectorize(add_100)

#Apply function to all elements in matrix
print(vectorized_add_100(matrix))

Output:

[[1 2 3]
[4 5 6]
[7 8 9]]
[[101 102 103]
[104 105 106]
[107 108 109]]
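
Keep in mind that np.vectorize is mostly a convenience: internally it still loops over the elements in Python. For simple arithmetic like this, Numpy's broadcasting gives the same result without defining a function at all, as this short sketch shows.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Broadcasting: the scalar 100 is added to every element
added = matrix + 100
print(added)

# Same result as the vectorized lambda
vectorized_add_100 = np.vectorize(lambda i: i + 100)
same_result = np.array_equal(added, vectorized_add_100(matrix))
print(same_result)  # True
```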

7) Finding the max and min values

We use Numpy’s max and min functions:

#Load Library
import numpy as np

#Create a Matrix
matrix = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(matrix)
#Return the max element
print(np.max(matrix))
#Return the min element
print(np.min(matrix))
#To find the max element in each column
print(np.max(matrix,axis=0))
#To find the max element in each row
print(np.max(matrix,axis=1))

Output:

[[1 2 3]
[4 5 6]
[7 8 9]]
9
1
[7 8 9]
[3 6 9]
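
The axis argument works the same way for np.min. As a quick sketch on the same matrix, axis=0 collapses the rows (one result per column) and axis=1 collapses the columns (one result per row):

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# axis=0 collapses the rows, leaving one value per column
column_mins = np.min(matrix, axis=0)
print(column_mins)  # [1 2 3]

# axis=1 collapses the columns, leaving one value per row
row_mins = np.min(matrix, axis=1)
print(row_mins)     # [1 4 7]
```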

8) Calculating Average, Variance and Standard deviation

When you want to calculate some descriptive statistics about an array.

#Load Library
import numpy as np

#Create a Matrix
matrix = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(matrix)
#Mean
print(np.mean(matrix))
#Standard Dev.
print(np.std(matrix))
#Variance
print(np.var(matrix))

Output:

[[1 2 3]
[4 5 6]
[7 8 9]]
5.0
2.581988897471611
6.666666666666667
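
A few details about these functions are easy to forget, so here is a small sketch on the same matrix. By default Numpy computes population statistics (dividing by N); the ddof parameter switches to the sample version, and axis restricts the calculation to rows or columns.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# The standard deviation is the square root of the variance
std_matches = np.isclose(np.std(matrix), np.sqrt(np.var(matrix)))
print(std_matches)  # True

# ddof=1 divides by N-1 instead of N, giving the sample variance
sample_variance = np.var(matrix, ddof=1)
print(sample_variance)  # 7.5

# Statistics can also be computed per column with axis=0
column_means = np.mean(matrix, axis=0)
print(column_means)  # [4. 5. 6.]
```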

9) Reshaping Arrays

When you want to reshape an array (changing the number of rows and columns) without changing the elements.

#Load Library
import numpy as np

#Create a Matrix
matrix = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(matrix)
#Reshape
print(matrix.reshape(9,1))
#Here -1 says as many columns as needed and 1 row
print(matrix.reshape(1,-1))
#If we provide only 1 value Reshape would return a 1-d array of that length
print(matrix.reshape(9))
#We can also use the Flatten method to convert a matrix to 1-d array
print(matrix.flatten())

Output:

[[1 2 3]
[4 5 6]
[7 8 9]]
[[1]
[2]
[3]
[4]
[5]
[6]
[7]
[8]
[9]]
[[1 2 3 4 5 6 7 8 9]]
[1 2 3 4 5 6 7 8 9]
[1 2 3 4 5 6 7 8 9]
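
The one rule reshape enforces is that the new shape must hold exactly as many elements as the old one. The sketch below uses a 3x4 matrix (chosen only for illustration) to show -1 inferring a dimension, and what happens when the sizes cannot match.

```python
import numpy as np

matrix = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# -1 lets Numpy infer the missing dimension: 12 elements / 2 rows = 6 columns
reshaped = matrix.reshape(2, -1)
print(reshaped.shape)  # (2, 6)

# 12 elements cannot be split into 5 equal rows, so Numpy raises a ValueError
reshape_failed = False
try:
    matrix.reshape(5, -1)
except ValueError as error:
    reshape_failed = True
    print("Cannot reshape:", error)
```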

10) Transposing a vector or a Matrix

By transposing you interchange the rows and columns of a Matrix

#Load Library
import numpy as np

#Create a Matrix
matrix = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(matrix)
#Transpose the matrix
print(matrix.T)

Output:

[[1 2 3]
[4 5 6]
[7 8 9]]
[[1 4 7]
[2 5 8]
[3 6 9]]
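
Vectors need a little care here. As a brief sketch: a 1-D array has no rows and columns to swap, so transposing it changes nothing, whereas a row vector written with two pairs of brackets is 2-D and transposes into a column.

```python
import numpy as np

# Transposing a 1-D array leaves it unchanged
vector = np.array([1, 2, 3])
print(vector.T)  # [1 2 3]

# A 2-D row vector becomes a column vector
row_vector = np.array([[1, 2, 3]])
print(row_vector.T.shape)  # (3, 1)
```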

11) Finding the Determinant and Rank of a Matrix

The rank of a Matrix is the number of dimensions of the vector space spanned by its rows or columns.

#Load Library
import numpy as np

#Create a Matrix
matrix = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(matrix)
#Calculate the Determinant
print(np.linalg.det(matrix))
#Calculate the Rank
print(np.linalg.matrix_rank(matrix))

Output:

[[1 2 3]
[4 5 6]
[7 8 9]]
6.66133814775094e-16
2
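
The tiny determinant printed above is really zero plus floating-point rounding error, which fits with the rank being 2 rather than 3. A minimal sketch of how to check both facts on the same matrix:

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Compare against zero with a tolerance instead of using ==
determinant = np.linalg.det(matrix)
is_singular = np.isclose(determinant, 0)
print(is_singular)  # True

# A rank below the number of rows means the rows are linearly dependent:
# here row 1 + row 3 equals 2 * row 2
rows_dependent = np.array_equal(matrix[0] + matrix[2], 2 * matrix[1])
print(rows_dependent)  # True
```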

12) Getting the Diagonal of a Matrix

When you need to extract only the diagonal elements of a matrix

#Load Library
import numpy as np

#Create a Matrix
matrix = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(matrix)
#Print the Principal diagonal
print(matrix.diagonal())
#Print the diagonal one above the Principal diagonal
print(matrix.diagonal(offset=1))
#Print the diagonal one below Principal diagonal
print(matrix.diagonal(offset=-1))

Output:

[[1 2 3]
[4 5 6]
[7 8 9]]
[1 5 9]
[2 6]
[4 8]

13) Calculating the trace of a Matrix

Trace of a Matrix is the sum of elements on the Principal Diagonal of the Matrix.

#Load Library
import numpy as np

#Create a Matrix
matrix = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(matrix)
#Print the Trace
print(matrix.trace())

Output:

[[1 2 3]
[4 5 6]
[7 8 9]]
15
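
Since the trace is defined through the principal diagonal, it can be cross-checked against the diagonal method from the previous section. A quick sketch:

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Summing the principal diagonal (1 + 5 + 9) gives the trace
diagonal_sum = matrix.diagonal().sum()
print(diagonal_sum)                    # 15
print(diagonal_sum == matrix.trace())  # True
```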

14) Finding Eigenvalues and Eigenvectors

Eigenvectors are widely used in Machine Learning libraries. Intuitively, given a linear transformation represented by a matrix A, eigenvectors are vectors that change only in scale (not direction) when that transformation is applied. More formally:

Av = Kv

Here A is a square matrix, K is an eigenvalue (a scalar) and v is the corresponding eigenvector.

#Load Library
import numpy as np

#Create a Matrix
matrix = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(matrix)

# Calculate the Eigenvalues and Eigenvectors of that Matrix
eigenvalues ,eigenvectors=np.linalg.eig(matrix)
print(eigenvalues)
print(eigenvectors)

Output:

[[1 2 3]
[4 5 6]
[7 8 9]]
[ 1.61168440e+01 -1.11684397e+00 -1.30367773e-15]
[[-0.23197069 -0.78583024 0.40824829]
[-0.52532209 -0.08675134 -0.81649658]
[-0.8186735 0.61232756 0.40824829]]
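
One thing the printed output does not make obvious is that the eigenvectors are stored as columns, with column i belonging to eigenvalue i. The sketch below checks the relation Av = Kv for each pair, up to floating-point error.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
eigenvalues, eigenvectors = np.linalg.eig(matrix)

# Each column of eigenvectors pairs with the eigenvalue at the same index
checks = []
for index in range(len(eigenvalues)):
    K = eigenvalues[index]
    v = eigenvectors[:, index]
    # A @ v should equal K * v up to floating-point error
    checks.append(np.allclose(matrix @ v, K * v))

print(checks)
```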

15) Calculating Dot Products

#Load Library
import numpy as np

#Create vector-1
vector_1 = np.array([ 1,2,3 ])
#Create vector-2
vector_2 = np.array([ 4,5,6 ])
#Calculate Dot Product
print(np.dot(vector_1,vector_2))
#Alternatively you can use @ to calculate dot products
print(vector_1 @ vector_2)

Output:

32
32
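
As a reminder of what the dot product computes, this small sketch multiplies the two vectors element by element and sums the results, which gives the same number:

```python
import numpy as np

vector_1 = np.array([1, 2, 3])
vector_2 = np.array([4, 5, 6])

# 1*4 + 2*5 + 3*6 = 32
manual_dot = np.sum(vector_1 * vector_2)
print(manual_dot)  # 32
```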

16) Adding, Subtracting and Multiplying Matrices

#Load Library
import numpy as np

#Create Matrix-1
matrix_1 = np.array([[1,2,3],[4,5,6],[7,8,9]])
#Create Matrix-2
matrix_2 = np.array([[7,8,9],[4,5,6],[1,2,3]])
#Add the 2 Matrices
print(np.add(matrix_1,matrix_2))
#Subtraction
print(np.subtract(matrix_1,matrix_2))
#Multiplication(Element wise, not Dot Product)
print(matrix_1*matrix_2)

Output:

[[ 8 10 12]
[ 8 10 12]
[ 8 10 12]]
[[-6 -6 -6]
[ 0 0 0]
[ 6 6 6]]
[[ 7 16 27]
[16 25 36]
[ 7 16 27]]
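
The * operator above multiplies element by element. For the matrix product (rows of the first matrix times columns of the second), the @ operator can be used instead. A short sketch with the same two matrices, which also shows that + and - are equivalent to np.add and np.subtract:

```python
import numpy as np

matrix_1 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
matrix_2 = np.array([[7, 8, 9], [4, 5, 6], [1, 2, 3]])

# Matrix product: entry (i, j) is the dot product of row i and column j
matrix_product = matrix_1 @ matrix_2
print(matrix_product)

# The arithmetic operators match the named functions
operators_match = (
    np.array_equal(matrix_1 + matrix_2, np.add(matrix_1, matrix_2))
    and np.array_equal(matrix_1 - matrix_2, np.subtract(matrix_1, matrix_2))
)
print(operators_match)  # True
```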

17) Inverting a Matrix

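This is used when you want to calculate the inverse of a square matrix. Keep in mind that an inverse exists only when the determinant is non-zero. The matrix used here is singular (in section 11 its determinant is zero up to floating-point error and its rank is 2), so the huge values in the output below are numerical artifacts rather than a true inverse. Depending on the platform and Numpy version, np.linalg.inv may raise a LinAlgError for this input instead.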

#Load Library
import numpy as np

#Create a Matrix
matrix = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(matrix)
#Calculate its inverse
print(np.linalg.inv(matrix))

Output:

[[1 2 3]
[4 5 6]
[7 8 9]]
[[-4.50359963e+15 9.00719925e+15 -4.50359963e+15]
[ 9.00719925e+15 -1.80143985e+16 9.00719925e+15]
[-4.50359963e+15 9.00719925e+15 -4.50359963e+15]]
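
For comparison, here is a minimal sketch with a matrix that does have an inverse. The 2x2 matrix is chosen only for illustration (its determinant is -3), and the check at the end confirms that multiplying a matrix by its inverse gives the identity matrix.

```python
import numpy as np

# An invertible matrix chosen for illustration (determinant is -3)
matrix = np.array([[1, 4], [2, 5]])
matrix_inverse = np.linalg.inv(matrix)
print(matrix_inverse)

# Multiplying a matrix by its inverse gives the identity matrix
is_identity = np.allclose(matrix @ matrix_inverse, np.eye(2))
print(is_identity)  # True
```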

18) Generating Random values

Numpy offers a wide variety of means to generate Random Numbers.

Moreover, it can sometimes be useful to return the same random numbers to get predictable, repeatable results. We can do so by setting the ‘seed’ (an integer) of the pseudorandom generator. Random processes with the same seed always produce the same result.

#Load Library
import numpy as np
#Set seed
np.random.seed(1)
#Generate 3 random integers between 0 and 10 (the upper bound 11 is exclusive)
print(np.random.randint(0,11,3))
#Draw 3 numbers from a normal distribution with mean 1.0 and std 2.0
print(np.random.normal(1.0,2.0,3))

Output:

[5 8 9]
[-0.60434568 0.10224438 -1.21187015]
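
The effect of the seed is easiest to see by drawing twice. In this sketch the generator is reset to the same seed before each draw, so both draws come out identical:

```python
import numpy as np

# Draw three integers, then reset the seed and draw again
np.random.seed(1)
first_draw = np.random.randint(0, 11, 3)

np.random.seed(1)
second_draw = np.random.randint(0, 11, 3)

# Both draws are identical because the generator started from the same state
draws_match = np.array_equal(first_draw, second_draw)
print(draws_match)  # True

# randint excludes the upper bound, so every value lies between 0 and 10
in_range = bool(first_draw.min() >= 0 and first_draw.max() <= 10)
print(in_range)  # True
```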

So this pretty much covers all the standard Numpy Operations which would be required for you to kickstart your Machine Learning journey with Python. For others I hope this was a good refresher to your pre-existing knowledge in the domain.

About the author:

Siddharth Dikshit

An Artificially Intelligent life form…Writer @TowardsDataScience, @Coinmonks || Sophomore Maths & Computing || Shiv Nadar University
