Published

2025-03-31


List Basics & Strings

Justin Post


Big Picture

We’ve learned a little about how python and our JupyterLab coding environment work.

Next, we’ll go through and look at a number of common data structures used in python. We’ll try to follow a similar introduction for each data structure where we

  • introduce the data structure
  • discuss common functions and methods
  • do some quick examples of using them

Along the way we’ll learn some things we want to do with data along with control flow operators (if/then/else, looping, etc.)!

Note: These types of webpages are built from Jupyter notebooks (.ipynb files). You can access your own versions of them by clicking here. It is highly recommended that you go through and run the notebooks yourself, modifying and rerunning things where you’d like!


Data Structures

  • We’ll start by discussing the most important built-in data types
    • Strings, Numeric types, Booleans
    • Compound data types (Lists, Tuples, Dictionaries)
  • Then we’ll move to commonly used data structures from modules we use in statistics/data science
    • NumPy arrays
    • Pandas data frames

Lists, Tuples, Strings, and arrays are all sequences (ish), so they have similar functions and behavior! It is important to recognize these common behaviors.


Lists

Properties of lists:

  • One-dimensional
  • Elements have an ordering (starting at 0)
  • Heterogeneous (can have elements with different types)
  • Can have duplicate values

Constructing a List

Four major ways to create a list:

  • [element1, element2]
  • list((element1, element2, ...))
  • create an empty list and use the append method to add elements
  • list comprehensions

#first create via [1st_element, 2nd_element, etc]
x = [10, 15, 10, 100, "Help!"]
print(type(x))
x
<class 'list'>
[10, 15, 10, 100, 'Help!']
#create via list()
#Note the 'extra' set of () needed within
y = list(("Python", "List", 5))
y
['Python', 'List', 5]
#range() returns an 'iterable' object. By putting it in a list, we get the values out
range(1,10)
range(1, 10)
#notice range doesn't give the 'last' value
z = list(range(1,10))
z
[1, 2, 3, 4, 5, 6, 7, 8, 9]

On sequence type objects, * replicates the object a certain number of times. This is common behavior to remember!

z * 2
[1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9]

As lists don’t really restrict what their elements can be, lists can contain lists!

w = [list(range(1,3)), z, 3]
w
[[1, 2], [1, 2, 3, 4, 5, 6, 7, 8, 9], 3]
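
The other two routes from the list above, .append() and list comprehensions, could look something like the sketch below. The names squares_loop and squares_comp are made up for illustration, and the for loop used here is a control flow tool that gets a proper introduction later.

```python
#start with an empty list and add elements one at a time with .append()
squares_loop = []
for i in range(1, 6):
    squares_loop.append(i ** 2)
print(squares_loop)

#a list comprehension builds the same list in a single expression
squares_comp = [i ** 2 for i in range(1, 6)]
print(squares_comp)
```

Both approaches should give the same list; the comprehension is just more compact.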

List Operations

Indexing

Very often we want to obtain pieces or elements of an object. We can easily do this with lists.

  • Index with a [] after the object name
  • Counting starts at 0
  • A negative index counts from the end (the last element is -1, not -0)
x = [10, 15, 10, 100, "Help!"]
print(x[0])
print(x[1])
print(x[-1])
print(x[-2])
10
15
Help!
100
w = [list(range(1,5)), x, 3]
print(w)
#the first element is a list so the list is returned
print(w[0])
#similar with the second element
print(w[1])
[[1, 2, 3, 4], [10, 15, 10, 100, 'Help!'], 3]
[1, 2, 3, 4]
[10, 15, 10, 100, 'Help!']

We can do more than one level of indexing with a single line of code (when applicable). As w[1] returns a list we can use [] after w[1] to return a specific element or slice from that list.

print(w[1][0])
10
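
As a rough check on how negative indices line up with positive ones, the sketch below compares the two on the same list. The variable names are illustrative only, and try/except is used just to show the error message without stopping the notebook.

```python
x = [10, 15, 10, 100, "Help!"]
n = len(x)

#a negative index -k refers to the same element as index n - k
last_matches = x[-1] == x[n - 1]
second_to_last_matches = x[-2] == x[n - 2]
print(last_matches, second_to_last_matches)

#valid indices run from 0 to n - 1, so asking for index n raises an IndexError
try:
    x[n]
except IndexError as err:
    print("IndexError:", err)
```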

Slicing

Often we want to return more than one element at a time with our sequence type objects. This is called slicing.

  • We can return multiple elements at once with :
    • Leaving it blank on the left gives everything up to (but not including) the index given
    • Leaving it blank on the right gives everything from the starting index onward (counting starts at 0)
x = [10, 15, 10, 100, "Help!"]
x
[10, 15, 10, 100, 'Help!']
x[:2]
[10, 15]
x[:3]
[10, 15, 10]
x[1:]
[15, 10, 100, 'Help!']
x[1:3]
[15, 10]

Again, if we have a list that contains lists (or other sequence type objects), slicing will still return those objects inside a list.

w = [list(range(1,5)), x, 3]
w
[[1, 2, 3, 4], [10, 15, 10, 100, 'Help!'], 3]
#here a list of lists
w[:2]
[[1, 2, 3, 4], [10, 15, 10, 100, 'Help!']]
#here just the single list
w[1]
[10, 15, 10, 100, 'Help!']
#can index what gets returned if that makes sense to do!
w[1][1:3]
[15, 10]
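
A few slicing facts are worth checking by hand. The sketch below (with illustrative variable names) shows how many elements a slice returns, that splitting a list at any index and adding the pieces recovers the list, and that a slice is a new list rather than a window into the old one.

```python
x = [10, 15, 10, 100, "Help!"]

#x[1:4] starts at index 1 and stops just before index 4, so here it has 4 - 1 = 3 elements
middle = x[1:4]
print(middle, len(middle))

#splitting at an index i and gluing the pieces back together recovers the list
i = 2
rebuilt = x[:i] + x[i:]
print(rebuilt == x)

#a slice is a new list, so changing it leaves the original alone
middle[0] = "changed"
print(x)
```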

Functions & Methods

Recall: Two major ways to do an operation on a variable/object: functions and methods

  • Functions: function_name(myvar, other_args)
  • We saw the len() and max() functions earlier
myList = [1, 10, 100, 1000]
print(len(myList))
max(myList)
4
1000
  • Methods: myvar.method(other_args)
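  • Recall that .pop() removes and returns the last element by default. Given an index, as in .pop(3), it removes and returns the element at that index (here index 3 happens to be the last element)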
myList.pop(3)
1000
myList
[1, 10, 100]
  • The .append() method adds an element to the end of the list
myList.append(100000)
myList
[1, 10, 100, 100000]
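
To make the difference between .pop() with and without an index concrete, here is a small sketch. It also shows that list methods such as .append() change the list in place and return None, which is a common source of confusion. The names last, first, and result are illustrative.

```python
myList = [1, 10, 100, 1000]

#with no argument, .pop() removes and returns the last element
last = myList.pop()
#with an index, .pop(i) removes and returns the element at that position
first = myList.pop(0)
print(last, first, myList)

#.append() modifies the list in place and returns None
result = myList.append(5)
print(result, myList)
```

Because of that last point, writing myList = myList.append(5) would replace the list with None, so just call the method on its own line.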

The methods for lists are listed at the top of this page of the python 3 documentation.

Some of the common functions in python are listed on this page of the documentation.


Strings

Next we’ll look at another sequence type object in python, the string. As with lists, we’ll

  • learn how to create them
  • consider commonly used functions and methods
  • see some examples of using them

Constructing Strings

  • Text is represented as a sequence of characters (letters, digits, and symbols) called a string (Nice reference)
    • Data type: str
    • Created using single or double quotes
#can use either ' or " to create a string
'wolf'
'wolf'
"pack"
'pack'
x = 'wolf'
print(type(x))
print(x)
<class 'str'>
wolf
  • Instead of quotes (' or "), you can use str() to create a string from another object. This is called casting
x = str(10)
x
'10'
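
A short sketch of the points above: the two quote styles give the same string, one style can be used to include the other inside a string, and str() works on more than just integers. The variable names here are illustrative.

```python
#single and double quotes create the same string
same = 'wolf' == "wolf"
print(same)

#use one kind of quote on the outside to include the other kind inside
quote = "It's a wolf pack"
print(quote)

#str() turns other objects into their string representation
print(str(3.5), str(True), str([1, 2]))
```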

String Operations

Indexing

Remember that strings and lists are both sequence type objects. Therefore, we have similar operations on these objects.

my_string = "wolf pack"
  • Each element of the my_string variable contains a different character from "wolf pack"
  • As with lists, we access these elements using []
  • The first element is indexed by 0
my_string[0]
'w'
my_string[1]
'o'
  • Access the elements of the my_string variable in reverse order using a - (start with 1 not 0 for the last element though!)
my_string[-1]
'k'

Slicing

my_string = "wolf pack"
  • Slicing a string refers to returning more than one character of a string (similar to lists!)
    • Slice using :
my_string[4:]
' pack'
my_string[:3]
'wol'
my_string[3:4]
'f'
#s[:i] + s[i:] gives back s
my_string[:3] + my_string[3:]
'wolf pack'

Concatenating

Several built-in operations on strings

  • + will concatenate two strings together
'wolf' + ' pack'
'wolf pack'
'wolf' + ' pack' + " is" + " cool"
'wolf pack is cool'
  • String literals next to each other are automatically concatenated
'wolf' ' pack'
'wolf' ' pack' ' is' ' cool'
'wolf pack is cool'
  • This won’t work on variables though!
x = 'wolf'
#throws an error
x ' pack'
  File "<ipython-input-48-e8177530793e>", line 3
    x ' pack'
      ^
SyntaxError: invalid syntax
#the cell above failed as a whole, so x = 'wolf' never ran there; assign it again
x = 'wolf'
x + ' pack'
'wolf pack'

Concatenation with + actually works with lists as well!

[1, 2, 3] + ["a", [5, 6,]]
[1, 2, 3, 'a', [5, 6]]

No Implicit Coercion

You might wonder what happens when an operator like + is applied to a string and a numeric value.

#throws an error
'wolfpack' + 2
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-51-0f3c839ed4d7> in <cell line: 2>()
      1 #throws an error
----> 2 'wolfpack' + 2

TypeError: can only concatenate str (not "int") to str
  • If you come from R, you may be used to implicit coercion happening without warning in some settings, e.g. c("a", 2) becomes c("a", "2") (dangerous but you get used to it!)
  • In python, to join a string and a number, cast the number as a string!
'Four score and ' + str(7) + ' years ago'
'Four score and 7 years ago'
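
Casting goes the other direction too. As a minimal sketch (variable names are illustrative), int() and float() turn strings that look like numbers into numeric values, which changes what + does:

```python
#+ between two strings concatenates; + between two numbers adds
as_strings = '2' + '2'
as_numbers = int('2') + 2
print(as_strings, as_numbers)

#float() handles strings that contain a decimal value
total = float('3.5') + 1
print(total)
```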

String Operations (Concatenating Repeats)

You can also repeat strings with the * operator and an integer (again similar to a list)

print('go pack ' * 3)
print('go pack ' * 0) #returns an empty string ''
print('go pack ' * 5)
go pack go pack go pack 

go pack go pack go pack go pack go pack 
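
One place repetition comes in handy is building simple separators for printed output. The sketch below is just one possible use, with made-up variable names:

```python
title = 'wolf pack'
#repeat a single character so the underline matches the length of the title
underline = '-' * len(title)
print(title)
print(underline)
```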

Functions & Methods

len('wolf pack')
9
len('241!')
4
len(' ')
1
len("")
0
sorted("wolf pack")
[' ', 'a', 'c', 'f', 'k', 'l', 'o', 'p', 'w']
  • Many methods as well. Some common examples are below:
my_string = '  wolf pack  '
#create an upper case version of the string
my_string.upper()
'  WOLF PACK  '
#this doesn't overwrite the string though!
my_string
'  wolf pack  '
#remove whitespace from the ends
my_string.strip()
'wolf pack'
#replace elements
my_string.replace("a", "e")
'  wolf peck  '
#split the string by a character (here a space) (note this returns a list!)
my_string.strip().split(" ")
['wolf', 'pack']
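
Since each of these methods returns a new object, they can be chained together. The sketch below also uses .join(), a string method not shown above, which roughly undoes .split(). The names words and rejoined are illustrative.

```python
my_string = '  wolf pack  '

#methods return new strings, so they can be chained left to right
words = my_string.strip().upper().split(" ")
print(words)

#.join() is called on the separator and glues a list of strings back together
rejoined = "-".join(words)
print(rejoined)
```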

Immutability of Strings

  • We saw that lists could be modified. That means they are mutable

  • Strings are immutable

    • Individual characters can’t be modified
my_string = "wolf pack"
#this will throw an error
my_string[1] = "a"
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-73-865d7b8d0579> in <cell line: 3>()
      1 my_string = "wolf pack"
      2 #this will throw an error
----> 3 my_string[1] = "a"

TypeError: 'str' object does not support item assignment
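
Immutability doesn't stop us from getting a modified string; we just have to build a new one. One way to do that, sketched below with illustrative names, is to combine slices with +. For contrast, the same item assignment works fine on a list of the characters.

```python
my_string = "wolf pack"

#build a new string from slices instead of assigning to my_string[1]
new_string = my_string[:1] + "a" + my_string[2:]
print(new_string)
#the original string is untouched
print(my_string)

#lists are mutable, so the analogous assignment works
my_list = list(my_string)
my_list[1] = "a"
print("".join(my_list))
```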

Inserting Values Into Strings

Sometimes we want to place certain elements into a string via variables or values. This can be done using the .format() method.

years = 3
salary = 100000
myorder = "I have {1} years of experience and would like a salary of {0}."
print(myorder.format(salary, years))
I have 3 years of experience and would like a salary of 100000.
  • The numbers aren’t required, but without them the arguments must be given in the same order as the {} placeholders
myorder = "I have {} years of experience and would like a salary of {}."
print(myorder.format(years, salary))
I have 3 years of experience and would like a salary of 100000.
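
The .format() method also accepts named placeholders and format specifications. The sketch below is a small illustration of both; the placeholder names yrs and pay and the variable names are made up for this example.

```python
years = 3
salary = 100000

#named placeholders are filled by keyword, so the order no longer matters
template = "I have {yrs} years of experience and would like a salary of {pay}."
named = template.format(pay = salary, yrs = years)
print(named)

#a format spec after a colon controls the display; ',' adds thousands separators
pretty = "Salary: {:,}".format(salary)
print(pretty)
```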

There are a few other ways to do this that we’ll visit later on!


Video Demo

This short video demonstration walks through some quick exercises with strings and lists. Remember to pop the video out into the full player.

The notebook written in the video is available here.

from IPython.display import IFrame
IFrame(src = "https://ncsu.hosted.panopto.com/Panopto/Pages/Embed.aspx?id=72bd0292-4c48-4064-8977-b0ef017167f6&autoplay=false&offerviewer=true&showtitle=true&showbrand=true&captions=false&interactivity=all", height="405", width="720")

Recap

  • Lists are 1D ordered objects

  • Strings are sequences of characters

    • Immutable

    • Index with [] (starting at 0)

    • Many functions and methods built in to help

    • + for concatenation and * for repeating a string

  • Sequence type objects have similar behavior!

If you are on the course website, use the table of contents on the left or the arrows at the bottom of this page to navigate to the next learning material!

If you are on Google Colab, head back to our course website for our next lesson!