NumPy
Creating Array

Understanding NumPy Arrays, Their Creation, Attributes, and Basic Operations

In the previous post, we introduced NumPy, a powerful Python library for numerical computations. In this post, we will delve into the basics of NumPy, including understanding arrays, their creation, attributes, and basic operations.

Understanding NumPy Arrays

The core of NumPy is the ndarray, an n-dimensional array data structure. An ndarray is a homogeneous collection of items (numbers, records, and so on) that all share the same type and are indexed by a tuple of non-negative integers. Every ndarray is a grid of values that can be indexed and iterated over.

import numpy as np
 
# Create a simple one-dimensional array
a = np.array([1, 2, 3])
print(a)  # Output: [1 2 3]
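
To make the "same type" rule concrete, here is a small sketch (the variable name mixed is only illustrative). When the input list mixes integers and a float, NumPy converts every element to one common type:

```python
import numpy as np

# Mixing ints and a float: NumPy picks one common type for all elements
mixed = np.array([1, 2, 3.5])

print(mixed)        # Output: [1.  2.  3.5]
print(mixed.dtype)  # Output: float64
```

The integers 1 and 2 are stored as floats here, which is why they print with a trailing dot.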

Analogy between Lego and Array

Imagine you have a big box of Legos. You can arrange these Legos in different ways. You can make a long line with them, or you can stack them up to make a wall. You can even make a big cube if you have enough Legos.

Now, in each of these arrangements, each Lego block is like a piece of information. In a NumPy array, this information is usually a number, though it can also be text or another kind of data, as long as every item in the array has the same type.

So, a NumPy array is like a special box where we keep our Legos (or our pieces of information). We can arrange them in different ways - in a line (which we call a 1-dimensional array), in a wall (a 2-dimensional array), or in a cube (a 3-dimensional array), and so on.
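
As a rough illustration of the analogy, the sketch below arranges the same eight "bricks" as a line, a wall, and a cube. It uses np.arange and reshape, which are not covered in this post and appear here only to show the three arrangements; the variable names are made up for the example.

```python
import numpy as np

bricks = np.arange(8)           # eight "bricks" numbered 0..7, laid out in a line
wall = bricks.reshape(2, 4)     # the same bricks as 2 rows of 4
cube = bricks.reshape(2, 2, 2)  # the same bricks as a 2x2x2 cube

print(bricks.shape)  # Output: (8,)
print(wall.shape)    # Output: (2, 4)
print(cube.shape)    # Output: (2, 2, 2)
```

The number of bricks never changes; only the arrangement does.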

Dimensions of NumPy Arrays and their Use

NumPy arrays can have different dimensions, ranging from 0D to higher-dimensional arrays. Each dimension adds another level of structure and organization to the data. Here are the different dimensions of NumPy arrays:

0D Array (Scalar)

A 0D array, also known as a scalar, contains a single value.

scalar = np.array(42)
 
print(scalar.ndim)  # Output: 0

Note: The .ndim attribute is covered later in this post.

1D Array (Vector)

A 1D array, or vector, is a sequence of values arranged along a single dimension. It can store a list of numbers, strings, or other data, provided all values share one type.

Use: One-dimensional arrays can be used to store the scores of a basketball team, the temperatures of a day, daily closing prices of a single stock over time, or other time series data.

vector = np.array([10, 20, 30, 40, 50])
 
print(vector.ndim)  # Output: 1

2D Array (Matrix)

Two-dimensional arrays are used to store data in a grid-like format. Each element in a two-dimensional array is associated with two indices: a row index and a column index.

Use: Two-dimensional arrays can be used to store the values of a matrix, the pixels of an image, geographical data, or the data from a spreadsheet (tabular data).

matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])
                   
print(matrix.ndim)  # Output: 2
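
A minimal sketch of how the row and column indices pick out an element, reusing the matrix above. Indices start at 0, so row 1 is the second row:

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# Row index first, then column index
print(matrix[1, 2])  # Output: 6
print(matrix[0])     # Output: [1 2 3]  (the whole first row)
```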

3D Array

Three-dimensional arrays are used to store data in a three-dimensional grid-like format. Each element in a three-dimensional array is associated with three indices. With the nesting used below, the first index selects the layer, the second the row, and the third the column.

Use: Three-dimensional arrays can be used to store the values of a volumetric dataset, the data from a medical scan, or the data from a video game.

array_3d = np.array([[[1, 2, 3],
                      [4, 5, 6],
                      [7, 8, 9]],
                     
                     [[10, 11, 12],
                      [13, 14, 15],
                      [16, 17, 18]]])
                      
print(array_3d.ndim)  # Output: 3
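
The following sketch, which reuses array_3d from above, shows how the three indices select a single element. Calling the first axis a "layer" is just one way to read the nesting:

```python
import numpy as np

array_3d = np.array([[[1, 2, 3],
                      [4, 5, 6],
                      [7, 8, 9]],

                     [[10, 11, 12],
                      [13, 14, 15],
                      [16, 17, 18]]])

print(array_3d.shape)     # Output: (2, 3, 3)  -> 2 layers, 3 rows, 3 columns
print(array_3d[1, 0, 2])  # Output: 12  (layer 1, row 0, column 2)
```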

N-Dimensional Array

NumPy arrays are not limited to three dimensions. Strictly speaking, every NumPy array is an n-dimensional array, but in this post we use the term for arrays with more than three dimensions. Such arrays are useful for storing data that has a complex structure.

Use: N-dimensional arrays can be used to store tensor data, multi-spectral image data (e.g., remote sensing), or climate data.

n_dimensional = np.array([[[[1, 2], [3, 4]],
                           [[5, 6], [7, 8]]],
 
                          [[[9, 10], [11, 12]],
                           [[13, 14], [15, 16]]]])
print(n_dimensional.ndim)  # Output: 4
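
As a quick check on the 4D example, the sketch below shows that an element needs one index per dimension and that the total number of elements is the product of the lengths along each dimension:

```python
import numpy as np

n_dimensional = np.array([[[[1, 2], [3, 4]],
                           [[5, 6], [7, 8]]],

                          [[[9, 10], [11, 12]],
                           [[13, 14], [15, 16]]]])

print(n_dimensional.shape)        # Output: (2, 2, 2, 2)
print(n_dimensional.size)         # Output: 16  (2 * 2 * 2 * 2 elements)
print(n_dimensional[1, 0, 1, 0])  # Output: 11  (four indices, one per dimension)
```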

Methods to Create Arrays

NumPy provides several functions to create arrays:

np.array

We have used np.array throughout the examples above. This function creates an array from a regular Python list or tuple.

# Create a simple one-dimensional array
a = np.array([1, 2, 3])
 
print(a)  # Output: [1 2 3]

np.zeros

The np.zeros function creates an array filled with zeros. It's particularly useful when you want to initialize an array with a specific shape and fill it with zeros.

Use: Image processing, numerical simulations, or feature initialization

# Create a 3x3 array of zeros
a = np.zeros((3, 3))
 
print(a)  # Output: [[0. 0. 0.]
           #         [0. 0. 0.]
           #         [0. 0. 0.]]
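
The dots in the output above indicate floats: np.zeros produces float64 values unless told otherwise. A short sketch of the optional dtype argument, using illustrative variable names:

```python
import numpy as np

default = np.zeros((2, 3))                  # float64 by default
counts = np.zeros((2, 3), dtype=np.int32)   # explicitly request 32-bit integers

print(default.dtype)  # Output: float64
print(counts.dtype)   # Output: int32
print(counts)         # Output: [[0 0 0]
                      #          [0 0 0]]
```

An integer dtype can be a reasonable choice when the array will hold counts rather than measurements.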

np.ones

The np.ones function creates an array filled with ones. It's useful when you need to initialize an array with a specific shape and populate it with ones.

Use: Image processing, scaling, or probabilistic models

# Create a 3x3 array of ones
a = np.ones((3, 3))
 
print(a)  # Output: [[1. 1. 1.]
           #         [1. 1. 1.]
           #         [1. 1. 1.]]
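
One possible use in a probabilistic setting, sketched under the assumption that we want four equally likely outcomes: start from an array of ones and divide by its length to get uniform weights that sum to 1.

```python
import numpy as np

n_outcomes = 4                              # hypothetical number of outcomes
weights = np.ones(n_outcomes) / n_outcomes  # every outcome gets the same weight

print(weights)        # Output: [0.25 0.25 0.25 0.25]
print(weights.sum())  # Output: 1.0
```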

np.empty

The np.empty function creates an array without initializing its values. It allocates memory but doesn't set any values, resulting in whatever values were previously in that memory location. This function is useful when you intend to fill the array with your own data shortly after creation.

Use: Performance optimization. In numerical computations, preallocating memory with np.empty can be faster than creating arrays with np.zeros or np.ones, especially for large arrays, because it skips writing an initial value to every element.

# Create an uninitialized array of three floats (the default dtype is float64)
# The values are whatever already exists at that memory location
a = np.empty(3)
 
print(a)  # Output varies from run to run, e.g. [1.39069238e-309 1.39069238e-309 1.39069238e-309]
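
The typical pattern is to allocate with np.empty and then overwrite every element before reading any of them. A minimal sketch, with illustrative names:

```python
import numpy as np

n = 5
squares = np.empty(n)    # contents are arbitrary at this point
for i in range(n):
    squares[i] = i ** 2  # overwrite every slot before reading it

print(squares)  # Output: [ 0.  1.  4.  9. 16.]
```

If any slot were left unfilled, it would still hold an arbitrary value, so np.zeros is the safer choice when you are not sure every element will be written.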

Array Attributes: shape, ndim, dtype

NumPy arrays have many useful attributes:

  • shape: This attribute returns a tuple representing the dimensions of the array.
a = np.array([(1, 2, 3), (4, 5, 6)])
 
print(a.shape)  # Output: (2, 3)

In this case, the array has 2 rows and 3 columns, so the shape is (2, 3).

  • ndim: This attribute returns the number of array dimensions.
a = np.array([(1, 2, 3), (4, 5, 6)])
 
print(a.ndim)  # Output: 2

The array in the previous example is a 2-dimensional array, so its ndim value is 2.

  • dtype: This attribute returns the type of the elements in the array.
a = np.array([(1, 2, 3), (4, 5, 6)])
 
print(a.dtype)  # Output: int64 (may be int32 on some platforms)

In this case, the elements are integers, so NumPy uses its default integer type, which is int64 on most platforms.
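
The three attributes are related, as this sketch shows. Passing dtype explicitly (np.int32 here, chosen only for illustration) makes the result the same on every platform:

```python
import numpy as np

a = np.array([(1, 2, 3), (4, 5, 6)], dtype=np.int32)

print(a.shape)                 # Output: (2, 3)
print(a.ndim)                  # Output: 2
print(len(a.shape) == a.ndim)  # Output: True  (one shape entry per dimension)
print(a.dtype)                 # Output: int32
```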

Basic Operations: Arithmetic, Comparison

NumPy arrays can be used with arithmetic and comparison operators:

  • Arithmetic operators: NumPy allows element-wise operations on arrays, which significantly reduces the need for loops.
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
 
print(a + b)  # Output: [5 7 9]
print(a * b)  # Output: [ 4 10 18]
  • Comparison operators: You can also do element-wise comparison of arrays.
a = np.array([1, 2, 3])
b = np.array([4, 2, 1])
 
print(a == b)  # Output: [False  True False]
print(a < b)   # Output: [ True False False]
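
To see what "reduces the need for loops" means in practice, the sketch below computes the same sum twice, once with an explicit Python loop and once with the array operator, and checks that the results match:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Explicit loop: add the elements one position at a time
looped = np.array([a[i] + b[i] for i in range(len(a))])

# Element-wise operator: the same computation in one expression
vectorized = a + b

print(np.array_equal(looped, vectorized))  # Output: True
print(a * 2)                               # Output: [2 4 6]
```

The last line shows that an operator also works between an array and a single number, which is applied to every element.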

That's it for the basics of NumPy! In the next post, we will discuss array manipulation.