A Beginner’s Guide to NumPy in Python (Mini Project Included)

Jovan
Apr 18, 2023

Python is one of the most popular programming languages for scientific computing, data analysis, and machine learning. One of its key libraries for this work is NumPy, short for Numerical Python. NumPy provides a high-performance multidimensional array object and tools for working with these arrays. This article introduces the basics of NumPy and ends with a mini project for practicing them.

What is NumPy?

NumPy is a Python library that provides a multidimensional array object and a set of functions for working with these arrays. NumPy arrays are similar to Python lists, but they are optimized for numerical operations and can handle large amounts of data efficiently. NumPy arrays can have any number of dimensions, and the elements of the arrays are all of the same type, usually a numeric type such as float or int.
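The "same type" rule is easy to see in practice. The short sketch below uses np.array() (covered in the next section) and the dtype, ndim, and shape attributes; the variable names are only illustrative.

```python
import numpy as np

# A list of Python ints becomes an integer array
# (the exact integer width depends on the platform).
ints = np.array([1, 2, 3])
print(ints.dtype)

# Mixing ints and floats upcasts every element to float.
floats = np.array([1, 2.5, 3])
print(floats.dtype)  # float64

# ndim is the number of dimensions, shape the size along each one.
matrix = np.array([[1, 2, 3], [4, 5, 6]])
print(matrix.ndim, matrix.shape)  # 2 (2, 3)
```

Because every element shares one type, NumPy can store the values compactly, which is part of what makes numerical operations on arrays fast.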

Creating NumPy arrays:

NumPy arrays can be created in several ways. One common way is to convert a Python list to a NumPy array using the numpy.array() function. For example:

import numpy as np

my_list = [1, 2, 3, 4, 5]
my_array = np.array(my_list)
print(my_array)

This will output:

[1 2 3 4 5]

NumPy arrays can also be created with other functions such as numpy.zeros() and numpy.ones(), which create arrays filled with zeros and ones, respectively. Each takes the desired shape as a tuple, for example (2, 3) for two rows and three columns.

import numpy as np
zeros_array = np.zeros((2, 3))
print(zeros_array)
ones_array = np.ones((3, 2))
print(ones_array)

This will output:

[[0. 0. 0.]
 [0. 0. 0.]]
[[1. 1.]
 [1. 1.]
 [1. 1.]]
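The trailing dots in that output are a hint that both functions produce floating-point arrays by default. As a small sketch, the optional dtype argument can request integers instead (the variable names here are illustrative):

```python
import numpy as np

# Default dtype is float64, which is why the printed values carry a trailing dot.
zeros_float = np.zeros((2, 3))
print(zeros_float.dtype)  # float64

# Passing dtype=int asks for integer elements instead.
ones_int = np.ones((3, 2), dtype=int)
print(ones_int.tolist())  # [[1, 1], [1, 1], [1, 1]]

# The shape attribute reports (rows, columns).
print(zeros_float.shape, ones_int.shape)  # (2, 3) (3, 2)
```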

Indexing and Slicing NumPy arrays:

NumPy arrays can be indexed and sliced much like Python lists, using zero-based indices and start:stop slices that exclude the stop index. For example:

import numpy as np

my_array = np.array([1, 2, 3, 4, 5])
print(my_array[0]) # Output: 1
print(my_array[1:3]) # Output: [2 3]

NumPy arrays with more than one dimension can be indexed using multiple indices. For example:

import numpy as np

my_array = np.array([[1, 2, 3], [4, 5, 6]])
print(my_array[0, 1]) # Output: 2
print(my_array[:, 1]) # Output: [2 5]
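One difference from Python lists is worth knowing early: slicing a list makes a copy, while basic slicing of a NumPy array returns a view onto the same data. The sketch below illustrates this; the variable names are made up for the example.

```python
import numpy as np

# Slicing a list copies the elements, so the original is untouched.
my_list = [1, 2, 3, 4, 5]
list_slice = my_list[1:3]
list_slice[0] = 99
print(my_list)  # [1, 2, 3, 4, 5]

# Slicing an array returns a view, so writes reach the original array.
my_array = np.array([1, 2, 3, 4, 5])
array_view = my_array[1:3]
array_view[0] = 99
print(my_array.tolist())  # [1, 99, 3, 4, 5]

# Call .copy() when an independent array is needed.
independent = my_array[1:3].copy()
independent[0] = -1
print(my_array.tolist())  # [1, 99, 3, 4, 5]
```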

Performing operations on NumPy arrays:

One of the advantages of using NumPy arrays is the ability to perform numerical operations on them efficiently. For example, to perform element-wise addition of two NumPy arrays:

import numpy as np

array1 = np.array([1, 2, 3])
array2 = np.array([4, 5, 6])
result = array1 + array2
print(result) # Output: [5 7 9]

NumPy provides a wide range of mathematical functions that can be applied to arrays, such as numpy.sin(), numpy.cos(), numpy.exp(), and many others.
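As a brief illustration, these functions are applied to every element at once, and arithmetic with a plain number is applied to the whole array as well. The input values below are chosen only for the example.

```python
import numpy as np

# np.sin() is applied to each element of the array.
angles = np.array([0.0, np.pi / 2, np.pi])
sines = np.sin(angles)
print(np.round(sines, 6))

# np.exp() works the same way: one result per input element.
exponentials = np.exp(np.array([0.0, 1.0]))
print(exponentials)

# Arithmetic with a scalar is applied to every element.
scaled = np.array([1, 2, 3]) * 10
print(scaled.tolist())  # [10, 20, 30]

# By contrast, + on Python lists concatenates rather than adds.
print([1, 2, 3] + [4, 5, 6])  # [1, 2, 3, 4, 5, 6]
```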

Mini Project: Analyzing Student Grades with NumPy

Let’s say we have a dataset of student grades for five subjects: Math, Science, English, History, and Art. We want to analyze this data using NumPy. The dataset is provided as a Python list of lists, where each inner list represents the grades of one student.

import numpy as np

# Define the dataset
grades = [[80, 85, 90, 75, 95],
          [90, 75, 80, 85, 90],
          [70, 80, 75, 85, 80],
          [60, 70, 75, 80, 85],
          [95, 90, 85, 90, 90]]

# Convert the dataset to a NumPy array
grades_array = np.array(grades)

# Calculate the average grade for each subject
subject_averages = np.mean(grades_array, axis=0)
print("Subject Averages:", subject_averages)

# Calculate the average grade for each student
student_averages = np.mean(grades_array, axis=1)
print("Student Averages:", student_averages)

# Calculate the highest grade for each subject
subject_max = np.max(grades_array, axis=0)
print("Subject Max:", subject_max)

# Calculate the lowest grade for each student
student_min = np.min(grades_array, axis=1)
print("Student Min:", student_min)

In this mini project, we first define the dataset as a Python list of lists and convert it to a NumPy array with np.array(). Each row of the array holds one student's grades, and each column holds one subject. We then compute statistics along an axis. With axis=0, np.mean() collapses the rows, averaging down each column, so it returns one value per subject. With axis=1, it collapses the columns, averaging along each row, so it returns one value per student.

np.max() and np.min() follow the same rule. np.max() with axis=0 returns the highest grade in each column (one per subject), and np.min() with axis=1 returns the lowest grade in each row (one per student).
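One possible extension, sketched here under the assumption that the columns follow the order Math, Science, English, History, Art given in the project description, is to label the averages and use np.argmax() to find the strongest subject and student. The subjects list and the variable names are additions for this example, not part of the original project code.

```python
import numpy as np

# Assumed column order, taken from the project description.
subjects = ["Math", "Science", "English", "History", "Art"]

grades_array = np.array([[80, 85, 90, 75, 95],
                         [90, 75, 80, 85, 90],
                         [70, 80, 75, 85, 80],
                         [60, 70, 75, 80, 85],
                         [95, 90, 85, 90, 90]])

subject_averages = grades_array.mean(axis=0)  # one value per column (subject)
student_averages = grades_array.mean(axis=1)  # one value per row (student)

# Pair each subject name with its average.
for name, average in zip(subjects, subject_averages):
    print(f"{name}: {average:.1f}")

# np.argmax() returns the position of the largest value.
best_subject = subjects[int(np.argmax(subject_averages))]
top_student = int(np.argmax(student_averages))
print(best_subject)  # Art
print(top_student)   # 4 (zero-based row index, i.e. the fifth student)
```

Pairing each result with a label this way is also a quick check that the axis argument was chosen correctly.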

Conclusion:

In this article, we have introduced the basics of NumPy and demonstrated some of its key features, such as creating arrays, indexing and slicing arrays, and performing numerical operations on arrays. We have also provided a mini project to practice using NumPy for data analysis. NumPy is a powerful library that is widely used in scientific computing, data analysis, and machine learning, and it is an essential tool for anyone working with numerical data in Python.
