import numpy as np
NumPy
One of the most famous modules for statistics is called numpy
A library for scientific computing
- Provides the multidimensional array object (such as vectors and matrices) and methods for manipulating them
We can bring in the module with import, and the convention is to reference it as np
As with our other data types, let’s go through and…
- Learn how to create
- Consider commonly used functions and methods
- See control flow and other tricks along the way
This topic covers compound objects:
- numpy array
Recall: functions & methods act on objects. We’ll see how to obtain attributes here as well!
Note: These types of webpages are built from Jupyter notebooks (.ipynb files). You can access your own versions of them by clicking here. It is highly recommended that you go through and run the notebooks yourself, modifying and rerunning things where you’d like!
Creating an Array
- Arrays are like lists but process much faster
- They also require that the data be of the same type
- They can be multidimensional (like a matrix or even higher dimension)
The picture below from https://predictivehacks.com/tips-about-numpy-arrays/ shows a 1D, 2D, and 3D array visually.
- To create an ndarray object, pass a list, tuple, or any array-like object to np.array()
a = np.array(1)
a
array(1)
type(a)
numpy.ndarray
- ndarrays have a shape attribute
- Attributes can be accessed like methods except we don’t use () at the end
- We did this with the .__doc__ attribute on functions
a.shape
()
b = np.array([1, 2, 3])
print(b)
print(type(b))
print(b.shape)
[1 2 3]
<class 'numpy.ndarray'>
(3,)
Array Dimension
0D arrays are a scalar (sort of… see here for discussion)
1D arrays are vectors
2D arrays are matrices
3D and up are just called arrays
The .shape attribute returns the dimensions of an array as a tuple
c = np.array([1, "a", True]) #mixed types are converted to one common type (here, strings)
print(c)
c.shape
['1' 'a' 'True']
(3,)
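Because all elements must share one type, NumPy picks a common type when the inputs are mixed. The sketch below inspects the .dtype attribute to show which type was chosen; the variable names are illustrative and not part of the original notebook, and the exact integer dtype depends on your platform.

```python
import numpy as np

# Three small arrays built from different kinds of input
ints = np.array([1, 2, 3])
mixed_float = np.array([1, 2.5, 3])    # an int next to a float
mixed_str = np.array([1, "a", True])   # ints, strings, and bools together

print(ints.dtype)            # an integer dtype (platform dependent)
print(mixed_float.dtype)     # float64: the ints are promoted to floats
print(mixed_str.dtype.kind)  # 'U': everything became a Unicode string
```

Checking .dtype is a quick way to catch a column of numbers that was silently turned into strings.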
d = np.array([
[1, 2, 3],
[4, 5, 6]]
)
print(d)
d.shape
[[1 2 3]
[4 5 6]]
(2, 3)
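The number of dimensions can also be read directly from an array. As a small sketch (the variable names here are made up for illustration), the .ndim attribute gives the number of dimensions, which equals the length of the .shape tuple.

```python
import numpy as np

scalar_like = np.array(1)                  # 0D
vector = np.array([1, 2, 3])               # 1D
matrix = np.array([[1, 2, 3], [4, 5, 6]])  # 2D
cube = np.zeros((3, 2, 2))                 # 3D

# ndim always matches the length of the shape tuple
for arr in (scalar_like, vector, matrix, cube):
    print(arr.ndim, arr.shape, len(arr.shape) == arr.ndim)
```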
Functions for Filling/Creating Arrays
Creating a vector or matrix of all zeros
- Row vector
A0 = np.zeros(4) #row vector of length 4
A0
array([0., 0., 0., 0.])
- Column vector
A0 = np.zeros((4,1)) #column vector of length 4
A0
array([[0.],
[0.],
[0.],
[0.]])
- Matrix of zeros
A = np.zeros((4,2)) #matrix with dimension 4, 2, given as a tuple
A
array([[0., 0.],
[0., 0.],
[0., 0.],
[0., 0.]])
A.shape
(4, 2)
- Row of all ones
b = np.ones(4) #row vector
b
array([1., 1., 1., 1.])
- Matrix of all ones
B = np.ones((2,3))
B
array([[1., 1., 1.],
[1., 1., 1.]])
- Matrix of 10’s
C = np.ones((2, 3)) * 10
C
array([[10., 10., 10.],
[10., 10., 10.]])
np.full() does this automatically
C = np.full((2,3), 10) #specify the value to fill with after the tuple giving dimension
C
array([[10, 10, 10],
[10, 10, 10]])
- Be careful! C is an integer valued array
C = np.full((2,3), 10)
C[0,0] = 6.5 #replace the top left element; 6.5 is truncated to 6 in an integer array
C
array([[ 6, 10, 10],
[10, 10, 10]])
- Avoid by creating the matrix with a float instead
C = np.full((2,3), 10.0) #or C = np.ones((2, 3)) * 10.0
C[0,0] = 6.5
C
array([[ 6.5, 10. , 10. ],
[10. , 10. , 10. ]])
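Two other ways to sidestep the truncation are sketched below: asking np.full() for a float array through its dtype argument, or converting an existing integer array with the .astype() method. The variable names are illustrative only.

```python
import numpy as np

C_int = np.full((2, 3), 10)                 # integer array
C_float = np.full((2, 3), 10, dtype=float)  # same fill value, stored as floats
C_converted = C_int.astype(float)           # a new float copy of C_int

C_float[0, 0] = 6.5
C_converted[0, 0] = 6.5

print(C_float[0, 0])      # 6.5 is kept
print(C_converted[0, 0])  # 6.5 is kept
print(C_int[0, 0])        # the original integer array is unchanged
```

Note that .astype() returns a new array, so the original integer array is left as it was.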
- Create an identity matrix with np.eye() (this has 1’s on the diagonal and 0’s elsewhere)
D = np.eye(3)
D
array([[1., 0., 0.],
[0., 1., 0.],
[0., 0., 1.]])
- Create a random matrix (values between 0 and 1) with np.random.random()
E = np.random.random((3,5))
E
array([[0.82782158, 0.92669984, 0.28811706, 0.8048095 , 0.31863604],
[0.43125583, 0.95565594, 0.81946103, 0.96181153, 0.10190225],
[0.92238437, 0.66130983, 0.8828503 , 0.06677584, 0.78615673]])
Reshaping an Array
- Reshape an array with the .reshape() method
- Changes the dimension in some way
- We’ll need to do this type of thing when fitting models!
F = np.random.random((10,1))
F
array([[0.38620732],
[0.02246848],
[0.75057807],
[0.64596504],
[0.9782189 ],
[0.3074028 ],
[0.20987403],
[0.73177229],
[0.8167644 ],
[0.03675048]])
F.shape
(10, 1)
G = F.reshape(1, -1) #-1 tells NumPy to infer that dimension; the result is still 2D, with one row
G
array([[0.38620732, 0.02246848, 0.75057807, 0.64596504, 0.9782189 ,
0.3074028 , 0.20987403, 0.73177229, 0.8167644 , 0.03675048]])
G.shape
(1, 10)
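To see how -1 behaves in different positions, here is a small sketch that uses np.arange() in place of random values so the shapes are easy to follow. The array names are made up for this example.

```python
import numpy as np

F_demo = np.arange(10).reshape(10, 1)  # column with the values 0 through 9

row = F_demo.reshape(1, -1)        # one row; NumPy infers 10 columns
two_rows = F_demo.reshape(2, -1)   # two rows; NumPy infers 5 columns
flat = F_demo.reshape(-1)          # a single -1 gives a truly 1D array

print(row.shape)       # (1, 10)
print(two_rows.shape)  # (2, 5)
print(flat.shape)      # (10,)
```

Only one dimension can be given as -1, since NumPy works it out from the total number of elements.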
G = F.reshape(2, 5)
G
array([[0.38620732, 0.02246848, 0.75057807, 0.64596504, 0.9782189 ],
[0.3074028 , 0.20987403, 0.73177229, 0.8167644 , 0.03675048]])
- Careful! G is actually a view of the original array
- View means that we haven’t created a new array, just a different way of viewing the values (essentially). The data is still stored in the same memory
- The .base attribute will tell you whether you are referencing another array
G.base
array([[0.38620732],
[0.02246848],
[0.75057807],
[0.64596504],
[0.9782189 ],
[0.3074028 ],
[0.20987403],
[0.73177229],
[0.8167644 ],
[0.03675048]])
G.base is None #a way to return a bool based on whether it is a view or not
False
Copying an Array
- To avoid getting a view, copy the array with the .copy() method
H = F.reshape(2, 5).copy()
H.base is None
True
H.base
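The practical difference between a view and a copy shows up when you modify one of them. The sketch below uses np.shares_memory() as another way to check for shared data; the arrays are small stand-ins for F, G, and H above rather than the same random values.

```python
import numpy as np

F_demo = np.arange(10.0).reshape(10, 1)
G_view = F_demo.reshape(2, 5)         # a view: same underlying data
H_copy = F_demo.reshape(2, 5).copy()  # a copy: its own data

print(np.shares_memory(F_demo, G_view))  # True
print(np.shares_memory(F_demo, H_copy))  # False

G_view[0, 0] = -1.0   # writing through the view...
print(F_demo[0, 0])   # ...changes the original: -1.0
print(H_copy[0, 0])   # the copy still holds 0.0
```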
Indexing an Array
- Access in the same way as lists, with []
- With multiple dimensions, separate the indices you want with a ,
b = np.array([1, 2, 3]) #row vector
b
array([1, 2, 3])
print(b[0], b[1], b[2])
1 2 3
b[0] = 5 #overwrite the 0 element
b
array([5, 2, 3])
- Depending on the dimensions, you add the required commas
- Here we have a 3D array so we have three slots
- Notation: array[1stD, 2ndD, 3rdD]
E = np.random.random((3, 2, 2))
E
array([[[0.44271423, 0.36194369],
[0.67811074, 0.36893479]],
[[0.18957687, 0.89085357],
[0.40869827, 0.1685411 ]],
[[0.28849053, 0.65884175],
[0.71058619, 0.41460453]]])
E[0, 0, 0]
0.4427142296735752
E[0, 1, 0]
0.6781107434665593
E[1, 0, 1]
0.8908535743830176
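Random values make it hard to check an index by eye, so here is a sketch of the same three lookups on an array filled with 0 through 11. The name E_demo is made up for this example.

```python
import numpy as np

# Shape (3, 2, 2): 3 blocks, each with 2 rows and 2 columns
E_demo = np.arange(12).reshape(3, 2, 2)
print(E_demo)

print(E_demo[0, 0, 0])  # block 0, row 0, column 0 -> 0
print(E_demo[0, 1, 0])  # block 0, row 1, column 0 -> 2
print(E_demo[1, 0, 1])  # block 1, row 0, column 1 -> 5

# Supplying fewer indices returns a smaller array instead of a single value
print(E_demo[1])        # the whole second block, shape (2, 2)
```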
Slicing an Array
Recall [start:end] for slicing sequence type objects. We can do that with arrays as well
- Returns everything from start up to and excluding end
- Leaving start blank implies a 0
- Leaving end blank returns everything from start through the end of the array
A = np.array([
[1,2,3,4],
[5,6,7,8],
[9,10,11,12]])
A
array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12]])
B = A[:2, 1:3]
B
array([[2, 3],
[6, 7]])
- Careful with modifying! We have a view here so the values in both A and B are referencing the same computer memory
- Changing an element of B changes A!
B[0, 0] = 919
A
array([[ 1, 919, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12]])
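If you want to modify a slice without touching the original, one option is to combine slicing with the .copy() method from earlier. A minimal sketch, with illustrative variable names:

```python
import numpy as np

A_demo = np.array([
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12]])

B_copy = A_demo[:2, 1:3].copy()  # copy the slice instead of keeping a view
B_copy[0, 0] = 919

print(B_copy)        # the copy has the new value
print(A_demo[0, 1])  # the original still holds 2
```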
Returning All of One Index
Use a : with nothing else
A = np.array([
[1,2,3,4],
[5,6,7,8],
[9,10,11,12]])
A1 = A[1, :]
A1
array([5, 6, 7, 8])
A1.shape
(4,)
A2 = A[1:3, :]
A2
array([[ 5, 6, 7, 8],
[ 9, 10, 11, 12]])
A2.shape
(2, 4)
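Notice that a single index drops a dimension while a slice keeps it. The sketch below compares the two on the same row and also pulls out a column; the variable names are illustrative.

```python
import numpy as np

A_demo = np.array([
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12]])

row_1d = A_demo[1, :]    # single index: result is 1D
row_2d = A_demo[1:2, :]  # slice of length one: result stays 2D
col_1d = A_demo[:, 1]    # all rows, second column

print(row_1d.shape)  # (4,)
print(row_2d.shape)  # (1, 4)
print(col_1d)        # [ 2  6 10]
```

This distinction matters later when a function expects a 2D input rather than a 1D one.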
Operations on Arrays
- We saw that multiplying by a constant was performed elementwise
- All basic functions act elementwise
x = np.array([
[1,2],
[3,4]])
y = np.array([
[5,6],
[7,8]])

x
array([[1, 2],
[3, 4]])
y
array([[5, 6],
[7, 8]])
x + 10
array([[11, 12],
[13, 14]])
- Lots of functions exist, such as np.add() for adding arrays elementwise
np.add(x, y)
array([[ 6, 8],
[10, 12]])
- If we just do something like x * y we get elementwise multiplication
x * y
array([[ 5, 12],
[21, 32]])
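- The np.multiply() function does elementwise multiplication too
- Can also add in conditions on when to multiply though!
- where = argument gives the condition on when to do the multiplication
- out = gives the array where the result is stored; elements where the condition is False keep the values already in that array
- Careful! Using out = x overwrites x, so the remaining examples in this section use the modified x
np.multiply(x, y, where = (x >= 3), out = x)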
array([[ 1, 2],
[21, 32]])
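If you would rather not overwrite x, one possible approach is to write into a copy, or to use np.where() to choose between the product and the original value. This is a sketch of two alternatives, not the approach used in the notebook itself.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])

# Option 1: store the result in a copy so x itself is left alone
result = x.copy()
np.multiply(x, y, where=(x >= 3), out=result)

# Option 2: np.where picks x * y where the condition holds, else x
alt = np.where(x >= 3, x * y, x)

print(result)  # rows: [1, 2] and [21, 32]
print(alt)     # same values
print(x)       # x is unchanged
```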
- Elementwise division
x / y
array([[0.2 , 0.33333333],
[3. , 4. ]])
- We can do matrix multiplication (if you are familiar with that) using the np.matmul() function
np.matmul(x, y)
array([[ 19, 22],
[329, 382]])
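Python also has the @ operator, which performs the same matrix multiplication as np.matmul() for arrays like these. The sketch below starts from fresh copies of the original x and y, so the numbers differ from the output above, where x had already been modified.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])

matrix_product = x @ y     # same as np.matmul(x, y)
elementwise = x * y        # not the same thing!

print(matrix_product)  # rows: [19, 22] and [43, 50]
print(elementwise)     # rows: [5, 12] and [21, 32]
```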
- The np.sqrt() function can be used to find the square roots of the elements of a matrix
np.sqrt(x)
array([[1. , 1.41421356],
[4.58257569, 5.65685425]])
- np.linalg.inv() will provide the inverse of a square matrix (if you’re familiar with that type of thing!)
np.linalg.inv(x)
array([[-3.2, 0.2],
[ 2.1, -0.1]])
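A quick way to check an inverse is to multiply it by the original matrix and compare with the identity. Because of floating point rounding, the sketch below uses np.allclose() rather than an exact comparison; it starts from the original values of x.

```python
import numpy as np

x = np.array([[1.0, 2.0], [3.0, 4.0]])
x_inv = np.linalg.inv(x)

# x times its inverse should be (very nearly) the 2 x 2 identity matrix
product = x @ x_inv
print(product)
print(np.allclose(product, np.eye(2)))
```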
Computations on Arrays
NumPy has some useful functions for performing basic computations on arrays
x = np.array([
[1,2,10],
[3,4,11]])
np.sum(x)
31
- Column-wise and row-wise sums
x.shape
(2, 3)
np.sum(x, axis=0)
array([ 4, 6, 21])
np.sum(x, axis=1)
array([13, 18])
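The axis argument works the same way for many other summaries, and most are also available as methods on the array. A short sketch using the same x:

```python
import numpy as np

x = np.array([
    [1, 2, 10],
    [3, 4, 11]])

col_sums = x.sum(axis=0)    # method form of np.sum(x, axis=0)
col_means = x.mean(axis=0)  # one mean per column
row_maxes = x.max(axis=1)   # one maximum per row

print(col_sums)   # [ 4  6 21]
print(col_means)  # [ 2.   3.  10.5]
print(row_maxes)  # [10 11]
```

A handy way to remember it: axis=0 collapses the rows (leaving one value per column) and axis=1 collapses the columns (leaving one value per row).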
- Combine arrays (appropriately sized)
x = np.array([
[1,2],
[3,4]])
y = np.array([
[5,6],
[7,8]])

np.hstack((x, y))
array([[1, 2, 5, 6],
[3, 4, 7, 8]])
np.vstack((x, y))
array([[1, 2],
[3, 4],
[5, 6],
[7, 8]])
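For 2D arrays like these, np.concatenate() with an axis argument gives the same results as the two stacking functions. The sketch below checks that with np.array_equal().

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])

side_by_side = np.concatenate((x, y), axis=1)  # like np.hstack((x, y))
stacked = np.concatenate((x, y), axis=0)       # like np.vstack((x, y))

print(side_by_side.shape)  # (2, 4)
print(stacked.shape)       # (4, 2)
print(np.array_equal(side_by_side, np.hstack((x, y))))
print(np.array_equal(stacked, np.vstack((x, y))))
```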
Quick Video
This video shows the creation of numpy arrays, the use of np.nditer() for iterating over an array, and the use of ufuncs which act on the elements of an array. Remember to pop the video out into the full player.
The notebook written in the video is available here.
from IPython.display import IFrame
IFrame(src="https://ncsu.hosted.panopto.com/Panopto/Pages/Embed.aspx?id=d84d2c18-c4ba-4107-bc7f-b0f800f76b2e&autoplay=false&offerviewer=true&showtitle=true&showbrand=true&captions=false&interactivity=all", height="405", width="720")
Recap
- NumPy is a widely used library that provides arrays
- Lots of functions to create arrays
- Very fast computation and many useful functions for operating on arrays!
This wraps up the content for week 2. I know that was a lot about these objects. Now we require some more practice! You should head back to our Moodle site to check out your homework assignment for this week.
Otherwise, if you are on the course website, use the table of contents on the left or the arrows at the bottom of this page to navigate to the next learning material!
If you are on Google Colab, head back to our course website for our next lesson!