Published

2025-03-31


NumPy

One of the most widely used modules for statistics is numpy (NumPy).

  • A library for scientific computing

    • Provides the multidimensional array object (such as vectors and matrices) and methods for manipulating them
  • We can bring in the module with import and the convention is to reference it as np

import numpy as np

As with our other data types, let’s go through and…

  • Learn how to create
  • Consider commonly used functions and methods
  • See control flow and other tricks along the way

This topic covers compound objects:

  • numpy array

Recall: functions & methods act on objects. We’ll see how to obtain attributes here as well!

Note: These types of webpages are built from Jupyter notebooks (.ipynb files). You can access your own versions of them by clicking here. It is highly recommended that you go through and run the notebooks yourself, modifying and rerunning things where you’d like!


Creating an Array

  • Arrays are like lists but are much faster to process
  • They also require that all of the data be of the same type
  • They can be multidimensional (like a matrix or even higher dimension)

The picture below from https://predictivehacks.com/tips-about-numpy-arrays/ shows a 1D, 2D, and 3D array visually.

  • To create an ndarray object, pass a list, tuple, or any array-like object to np.array()
a = np.array(1)
a
array(1)
type(a)
numpy.ndarray
  • ndarrays have a shape attribute
  • Attributes can be accessed like methods except we don’t use () at the end
  • We did this with the .__doc__ attribute on functions
a.shape
()
b = np.array([1, 2, 3])
print(b)
print(type(b))
print(b.shape)
[1 2 3]
<class 'numpy.ndarray'>
(3,)

Array Dimension

  • 0D arrays are a scalar (sort of… see here for discussion)

  • 1D arrays are vectors

  • 2D arrays are matrices

  • 3D and up are just called arrays

  • .shape attribute returns the dimensions of an array as a tuple

c = np.array([1, "a", True])
print(c)
c.shape
['1' 'a' 'True']
(3,)
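The array c above mixed an integer, a string, and a boolean, and every element came back as a string. A minimal sketch of why, using the .dtype attribute (not otherwise used in these notes) to see the single type NumPy settled on. The names ints and floats are just for illustration.

```python
import numpy as np

# Mixing an int, a str, and a bool forces NumPy to pick one common type
c = np.array([1, "a", True])
print(c.dtype.kind)       # 'U' means a Unicode string type

# Arrays built from numbers only keep a numeric type
ints = np.array([1, 2, 3])
floats = np.array([1, 2.5, 3])   # a single float upcasts the whole array
print(ints.dtype.kind)    # 'i' (integer)
print(floats.dtype.kind)  # 'f' (floating point)
```

Checking .dtype right after creating an array is a quick way to catch an accidental conversion like this.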
d = np.array([
  [1, 2, 3],
  [4, 5, 6]]
  )
print(d)
d.shape
[[1 2 3]
 [4 5 6]]
(2, 3)
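To tie the dimension names above to the .shape attribute, here is a small sketch that also uses the .ndim attribute (the number of dimensions). The variable names are hypothetical and only there for illustration.

```python
import numpy as np

scalar = np.array(1)                       # 0D
vector = np.array([1, 2, 3])               # 1D
matrix = np.array([[1, 2, 3], [4, 5, 6]])  # 2D
cube = np.zeros((2, 3, 4))                 # 3D

for arr in (scalar, vector, matrix, cube):
    # .ndim always equals the length of the .shape tuple
    print(arr.ndim, arr.shape)
```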

Functions for Filling/Creating Arrays

Creating a vector or matrix of all zeros

  • Row vector
A0 = np.zeros(4) #row vector of length 4
A0
array([0., 0., 0., 0.])
  • Column vector
A0 = np.zeros((4,1)) #column vector of length 4
A0
array([[0.],
       [0.],
       [0.],
       [0.]])
  • Matrix of zeros
A = np.zeros((4,2)) #matrix with dimension 4, 2, given as a tuple
A
array([[0., 0.],
       [0., 0.],
       [0., 0.],
       [0., 0.]])
A.shape
(4, 2)
  • Row of all ones
b = np.ones(4) #row vector
b
array([1., 1., 1., 1.])
  • Matrix of all ones
B = np.ones((2,3))
B
array([[1., 1., 1.],
       [1., 1., 1.]])
  • Matrix of 10’s
C = np.ones((2, 3)) * 10
C
array([[10., 10., 10.],
       [10., 10., 10.]])
  • np.full() does this automatically
C = np.full((2,3), 10) #specify the value to fill with after the tuple giving dimension
C
array([[10, 10, 10],
       [10, 10, 10]])
  • Be careful! C is an integer valued array, so a float assigned into it is truncated to an integer
C = np.full((2,3), 10)
C[0,0] = 6.5                 #replace the top left element
C
array([[ 6, 10, 10],
       [10, 10, 10]])
  • Avoid by creating the matrix with a float instead
C = np.full((2,3), 10.0)  #or C = np.ones((2, 3)) * 10.0
C[0,0] = 6.5
C
array([[ 6.5, 10. , 10. ],
       [10. , 10. , 10. ]])
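Another option, sketched here as one possibility rather than the only fix, is to ask for the type explicitly with the dtype argument of np.full(). The name C_int is just for comparison.

```python
import numpy as np

# Request a float array even though the fill value is written as an integer
C = np.full((2, 3), 10, dtype=float)
C[0, 0] = 6.5
print(C.dtype)      # float64
print(C[0, 0])      # 6.5

# For comparison: the integer version drops the decimal part on assignment
C_int = np.full((2, 3), 10)
C_int[0, 0] = 6.5
print(C_int[0, 0])  # 6
```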
  • Create an identity matrix with np.eye() (this has 1’s on the diagonal and 0’s elsewhere)
D = np.eye(3)
D
array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
  • Create a random matrix (values between 0 and 1) with np.random.random()
E = np.random.random((3,5))
E
array([[0.82782158, 0.92669984, 0.28811706, 0.8048095 , 0.31863604],
       [0.43125583, 0.95565594, 0.81946103, 0.96181153, 0.10190225],
       [0.92238437, 0.66130983, 0.8828503 , 0.06677584, 0.78615673]])
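Random values change every time the code is rerun. If you want repeatable results, one approach (an addition to these notes, not something used above) is NumPy's Generator interface with a seed. The seed value 42 is arbitrary.

```python
import numpy as np

# Two generators created with the same seed produce the same values
rng1 = np.random.default_rng(42)
rng2 = np.random.default_rng(42)

E1 = rng1.random((3, 5))   # values in the half-open interval [0, 1)
E2 = rng2.random((3, 5))

print(E1.shape)                 # (3, 5)
print(np.array_equal(E1, E2))   # True
```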

Reshaping an Array

  • Reshape an array with the .reshape() method
  • Changes the dimension in some way
  • We’ll need to do this type of thing when fitting models!
F = np.random.random((10,1))
F
array([[0.38620732],
       [0.02246848],
       [0.75057807],
       [0.64596504],
       [0.9782189 ],
       [0.3074028 ],
       [0.20987403],
       [0.73177229],
       [0.8167644 ],
       [0.03675048]])
F.shape
(10, 1)
G = F.reshape(1, -1) #-1 tells NumPy to infer that dimension (here 10), giving a 2D array with one row
G
array([[0.38620732, 0.02246848, 0.75057807, 0.64596504, 0.9782189 ,
        0.3074028 , 0.20987403, 0.73177229, 0.8167644 , 0.03675048]])
G.shape
(1, 10)
G = F.reshape(2, 5)
G
array([[0.38620732, 0.02246848, 0.75057807, 0.64596504, 0.9782189 ],
       [0.3074028 , 0.20987403, 0.73177229, 0.8167644 , 0.03675048]])
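A short sketch of how -1 behaves in .reshape(), using np.arange() to build an array of known values (an assumption made so the shapes are easy to check; the F above holds random values instead).

```python
import numpy as np

F = np.arange(10).reshape(10, 1)   # a column holding the integers 0 through 9

print(F.reshape(1, -1).shape)   # (1, 10): the -1 is inferred as 10
print(F.reshape(-1, 5).shape)   # (2, 5): the -1 is inferred as 2
print(F.reshape(-1).shape)      # (10,): a lone -1 gives a 1D array

# A shape that does not match the number of elements raises an error
try:
    F.reshape(3, 4)
except ValueError as err:
    print("ValueError:", err)
```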
  • Careful! G is actually a view of the original array
  • View means that we haven’t created a new array, just a different way of viewing the values (essentially). The data is still stored in the same memory
  • .base attribute will tell you whether you are referencing another array
G.base
array([[0.38620732],
       [0.02246848],
       [0.75057807],
       [0.64596504],
       [0.9782189 ],
       [0.3074028 ],
       [0.20987403],
       [0.73177229],
       [0.8167644 ],
       [0.03675048]])
G.base is None #a way to return a bool based on whether it is a view or not
False

Copying an Array

  • To avoid getting a view, copy the array with .copy() method
H = F.reshape(2, 5).copy()
H.base is None
True
H.base
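To see why the view versus copy distinction matters, here is a minimal sketch with known values (built with np.arange() for illustration). It also uses np.shares_memory(), which is not otherwise covered in these notes, to check whether two arrays use the same memory.

```python
import numpy as np

F = np.arange(10.0).reshape(10, 1)

G = F.reshape(2, 5)          # a view: same underlying data as F
H = F.reshape(2, 5).copy()   # an independent copy

G[0, 0] = 100.0   # writes through to F
H[0, 1] = -1.0    # does not touch F

print(F[0, 0])    # 100.0 (changed through the view G)
print(F[1, 0])    # 1.0 (unchanged, H is a copy)
print(np.shares_memory(F, G), np.shares_memory(F, H))   # True False
```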

Indexing an Array

  • Access elements in the same way as lists, with []
  • With multiple dimensions, separate the indices you want with a ,
b = np.array([1, 2, 3]) #row vector
b
array([1, 2, 3])
print(b[0], b[1], b[2])
1 2 3
b[0] = 5 #overwrite the 0 element
b
array([5, 2, 3])
  • Depending on the dimensions, you add the required commas
  • Here we have a 3D array so we have three slots
  • Notation: array[1stD, 2ndD, 3rdD]
E = np.random.random((3, 2, 2))
E
array([[[0.44271423, 0.36194369],
        [0.67811074, 0.36893479]],

       [[0.18957687, 0.89085357],
        [0.40869827, 0.1685411 ]],

       [[0.28849053, 0.65884175],
        [0.71058619, 0.41460453]]])
E[0, 0, 0]
0.4427142296735752
E[0, 1, 0]
0.6781107434665593
E[1, 0, 1]
0.8908535743830176
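Random values make it hard to check the indexing by eye. The sketch below repeats the idea with an array built from np.arange() (an assumption made for readability) so each element's value can be worked out from its indices.

```python
import numpy as np

# Same shape as above, but filled with 0, 1, ..., 11
E = np.arange(12).reshape(3, 2, 2)

# With this layout, E[i, j, k] equals 4*i + 2*j + k
print(E[0, 0, 0])   # 0
print(E[0, 1, 0])   # 2
print(E[1, 0, 1])   # 5
print(E[2, 1, 1])   # 11

# Giving fewer indices returns a smaller array rather than a single value
print(E[1].shape)   # (2, 2)
```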

Slicing an Array

  • Recall [start:end] for slicing sequence type objects. We can do that with arrays as well

    • Returns everything from start up to and excluding end
    • Leaving start blank implies a 0
    • Leaving end blank returns everything from start through the end of the array
A = np.array([
  [1,2,3,4],
  [5,6,7,8],
  [9,10,11,12]])
A
array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])
B = A[:2, 1:3]
B
array([[2, 3],
       [6, 7]])
  • Careful with modifying! We have a view here so the values in both A and B are referencing the same computer memory
  • Changing an element of B changes A!
B[0, 0] = 919
A
array([[  1, 919,   3,   4],
       [  5,   6,   7,   8],
       [  9,  10,  11,  12]])
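If you want to modify a slice without touching the original array, one option is to copy the slice first with the .copy() method. A minimal sketch:

```python
import numpy as np

A = np.array([
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12]])

B = A[:2, 1:3].copy()   # an independent copy of the slice
B[0, 0] = 919

print(A[0, 1])          # 2 (A is unchanged)
print(B[0, 0])          # 919
print(B.base is None)   # True (B is not a view)
```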
  • Returning All of One Index

  • Use a : with nothing else

A = np.array([
  [1,2,3,4],
  [5,6,7,8],
  [9,10,11,12]])
A1 = A[1, :]
A1
array([5, 6, 7, 8])
A1.shape
(4,)
A2 = A[1:3, :]
A2
array([[ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])
A2.shape
(2, 4)
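Notice that A[1, :] came back 1D while A[1:3, :] stayed 2D. The sketch below shows the same difference for a single row and applies the : idea to columns as well (.tolist() is used only to make the printed values easy to read).

```python
import numpy as np

A = np.array([
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12]])

print(A[1, :].shape)      # (4,): a plain index drops that dimension
print(A[1:2, :].shape)    # (1, 4): a slice keeps the dimension
print(A[:, 1].tolist())   # [2, 6, 10]: all rows, second column
print(A[:, 1:2].shape)    # (3, 1): second column kept as a column
```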

Operations on Arrays

  • We saw that multiplying by a constant was performed elementwise
  • All basic functions act elementwise
x = np.array([
  [1,2],
  [3,4]])
y = np.array([
  [5,6],
  [7,8]])

x
array([[1, 2],
       [3, 4]])
y
array([[5, 6],
       [7, 8]])
x + 10
array([[11, 12],
       [13, 14]])
  • Lots of functions exist, such as np.add() for adding arrays elementwise
np.add(x, y)
array([[ 6,  8],
       [10, 12]])
  • If we just do something like x * y we get elementwise multiplication
x * y
array([[ 5, 12],
       [21, 32]])
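  • The np.multiply() function does elementwise multiplication too
  • Can also add in conditions on when to multiply though!
    • where = argument gives the condition on when to do the multiplication
    • out = gives the array that the results are written into; wherever the condition is False, that array keeps the value it already had
    • Careful! Because we use out = x here, x itself is overwritten, so the results below use the modified x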
np.multiply(x, y, where = (x >= 3), out = x)
array([[ 1,  2],
       [21, 32]])
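If you'd rather keep x as it was, one possible approach is to write the results into a copy instead. This is a sketch, and the name result is hypothetical. np.where() is shown as an alternative way to get the same values.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])

# Start from a copy of x so untouched positions keep x's values
result = x.copy()
np.multiply(x, y, where=(x >= 3), out=result)

print(result.tolist())   # [[1, 2], [21, 32]]
print(x.tolist())        # [[1, 2], [3, 4]] (x is unchanged)

# np.where picks from x * y where the condition holds and from x elsewhere
alt = np.where(x >= 3, x * y, x)
print(np.array_equal(alt, result))   # True
```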
  • Elementwise division
x / y
array([[0.2       , 0.33333333],
       [3.        , 4.        ]])
  • We can do matrix multiplication (if you are familiar with that) using the np.matmul() function
np.matmul(x, y)
array([[ 19,  22],
       [329, 382]])
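Python also has the @ operator for matrix multiplication, which is not used elsewhere in these notes. The sketch below rebuilds x and y with their original values, so its numbers differ from the output above, and contrasts @ with elementwise *.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])

print((x @ y).tolist())                          # [[19, 22], [43, 50]]
print(np.array_equal(x @ y, np.matmul(x, y)))    # True

# Elementwise multiplication gives a different answer
print((x * y).tolist())                          # [[5, 12], [21, 32]]
```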
  • The np.sqrt() function can be used to find the square roots of the elements of a matrix
np.sqrt(x)
array([[1.        , 1.41421356],
       [4.58257569, 5.65685425]])
  • np.linalg.inv() will provide the inverse of a square matrix (if you’re familiar with that type of thing!)
np.linalg.inv(x)
array([[-3.2,  0.2],
       [ 2.1, -0.1]])
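A quick way to check an inverse is to multiply it by the original matrix and compare against the identity matrix. The sketch below uses a fresh 2 x 2 matrix and np.allclose(), which allows for small rounding error. It also shows what happens with a matrix that has no inverse (the name singular is just for illustration).

```python
import numpy as np

x = np.array([[1.0, 2.0], [3.0, 4.0]])
x_inv = np.linalg.inv(x)

# x times its inverse should be (very close to) the identity matrix
print(np.allclose(x @ x_inv, np.eye(2)))   # True

# The second row is twice the first, so this matrix has no inverse
singular = np.array([[1.0, 2.0], [2.0, 4.0]])
try:
    np.linalg.inv(singular)
except np.linalg.LinAlgError:
    print("This matrix has no inverse")
```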

Computations on Arrays

  • NumPy has some useful functions for performing basic computations on arrays
x = np.array([
  [1,2,10],
  [3,4,11]])
np.sum(x)
31
  • Column-wise and row-wise sums
x.shape
(2, 3)
np.sum(x, axis=0)
array([ 4,  6, 21])
np.sum(x, axis=1)
array([13, 18])
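The axis argument works the same way for other summary functions. A small sketch using np.mean(), plus the keepdims argument of np.sum(), neither of which is covered above:

```python
import numpy as np

x = np.array([
    [1, 2, 10],
    [3, 4, 11]])

# axis=0 collapses the rows, giving one value per column
print(np.mean(x, axis=0).tolist())   # [2.0, 3.0, 10.5]

# axis=1 collapses the columns, giving one value per row
print(np.sum(x, axis=1).tolist())    # [13, 18]

# keepdims=True keeps the result 2D, here as a column
print(np.sum(x, axis=1, keepdims=True).shape)   # (2, 1)
```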
  • Combine arrays (appropriately sized)
x = np.array([
  [1,2],
  [3,4]])
y = np.array([
  [5,6],
  [7,8]])

np.hstack((x, y))
array([[1, 2, 5, 6],
       [3, 4, 7, 8]])
np.vstack((x, y))
array([[1, 2],
       [3, 4],
       [5, 6],
       [7, 8]])
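For 2D arrays like these, np.hstack() and np.vstack() give the same results as np.concatenate() with axis=1 and axis=0. The sketch below checks that and shows what "appropriately sized" means in practice (the name wide is hypothetical).

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])

side_by_side = np.concatenate((x, y), axis=1)   # same as np.hstack((x, y))
stacked = np.concatenate((x, y), axis=0)        # same as np.vstack((x, y))

print(np.array_equal(side_by_side, np.hstack((x, y))))   # True
print(np.array_equal(stacked, np.vstack((x, y))))        # True

# Shapes must agree along the dimension that is NOT being combined
wide = np.array([[1, 2, 3]])   # 1 row, but x has 2 rows
try:
    np.hstack((x, wide))
except ValueError as err:
    print("ValueError:", err)
```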

Quick Video

This video shows the creation of numpy arrays, the use of np.nditer() for iterating over an array, and the use of ufuncs which act on the elements of an array. Remember to pop the video out into the full player.

The notebook written in the video is available here.

from IPython.display import IFrame
IFrame(src="https://ncsu.hosted.panopto.com/Panopto/Pages/Embed.aspx?id=d84d2c18-c4ba-4107-bc7f-b0f800f76b2e&autoplay=false&offerviewer=true&showtitle=true&showbrand=true&captions=false&interactivity=all", height="405", width="720")

Recap

  • NumPy is a widely used library that provides arrays

    • Lots of functions to create arrays

    • Very fast computation and many useful functions for operating on arrays!

This wraps up the content for week 2. I know that was a lot about these objects. Now we require some more practice! You should head back to our Moodle site to check out your homework assignment for this week.

Otherwise, if you are on the course website, use the table of contents on the left or the arrows at the bottom of this page to navigate to the next learning material!

If you are on Google Colab, head back to our course website for our next lesson!