Published

2025-03-31

Open In Colab

SQL Joins on chinook

Now let’s do some joins on the chinook database. We’ll reread in the packages are make our connection to the database.

import sqlite3
import pandas as pd
con = sqlite3.connect("chinook.db")

Using pandas we can check out the tables returned as a data frame.

pd.read_sql("SELECT * FROM sqlite_schema WHERE type = 'table';", con)
type name tbl_name rootpage sql
0 table albums albums 2 CREATE TABLE "albums"\r\n(\r\n [AlbumId] IN...
1 table sqlite_sequence sqlite_sequence 3 CREATE TABLE sqlite_sequence(name,seq)
2 table artists artists 4 CREATE TABLE "artists"\r\n(\r\n [ArtistId] ...
3 table customers customers 5 CREATE TABLE "customers"\r\n(\r\n [Customer...
4 table employees employees 8 CREATE TABLE "employees"\r\n(\r\n [Employee...
5 table genres genres 10 CREATE TABLE "genres"\r\n(\r\n [GenreId] IN...
6 table invoices invoices 11 CREATE TABLE "invoices"\r\n(\r\n [InvoiceId...
7 table invoice_items invoice_items 13 CREATE TABLE "invoice_items"\r\n(\r\n [Invo...
8 table media_types media_types 15 CREATE TABLE "media_types"\r\n(\r\n [MediaT...
9 table playlists playlists 16 CREATE TABLE "playlists"\r\n(\r\n [Playlist...
10 table playlist_track playlist_track 17 CREATE TABLE "playlist_track"\r\n(\r\n [Pla...
11 table tracks tracks 20 CREATE TABLE "tracks"\r\n(\r\n [TrackId] IN...
12 table sqlite_stat1 sqlite_stat1 864 CREATE TABLE sqlite_stat1(tbl,idx,stat)
13 table justin_music justin_music 865 CREATE TABLE justin_music (\n album TEX...

The tracks and albums tables each share a variable/column/key of AlbumId. Let’s do an inner join on those using pd.merge(). Remember that we can pull the full tables into pandas using pd.read_sql() and then use pandas to do the join!

tracks_albums = pd.merge(left = pd.read_sql("SELECT * FROM tracks", con), right = pd.read_sql("SELECT * FROM albums", con),
         how = "inner",
         on = "AlbumId")
tracks_albums
TrackId Name AlbumId MediaTypeId GenreId Composer Milliseconds Bytes UnitPrice Title ArtistId
0 1 For Those About To Rock (We Salute You) 1 1 1 Angus Young, Malcolm Young, Brian Johnson 343719 11170334 0.99 For Those About To Rock We Salute You 1
1 6 Put The Finger On You 1 1 1 Angus Young, Malcolm Young, Brian Johnson 205662 6713451 0.99 For Those About To Rock We Salute You 1
2 7 Let's Get It Up 1 1 1 Angus Young, Malcolm Young, Brian Johnson 233926 7636561 0.99 For Those About To Rock We Salute You 1
3 8 Inject The Venom 1 1 1 Angus Young, Malcolm Young, Brian Johnson 210834 6852860 0.99 For Those About To Rock We Salute You 1
4 9 Snowballed 1 1 1 Angus Young, Malcolm Young, Brian Johnson 203102 6599424 0.99 For Those About To Rock We Salute You 1
... ... ... ... ... ... ... ... ... ... ... ...
3498 3499 Pini Di Roma (Pinien Von Rom) \ I Pini Della V... 343 2 24 None 286741 4718950 0.99 Respighi:Pines of Rome 226
3499 3500 String Quartet No. 12 in C Minor, D. 703 "Quar... 344 2 24 Franz Schubert 139200 2283131 0.99 Schubert: The Late String Quartets & String Qu... 272
3500 3501 L'orfeo, Act 3, Sinfonia (Orchestra) 345 2 24 Claudio Monteverdi 66639 1189062 0.99 Monteverdi: L'Orfeo 273
3501 3502 Quintet for Horn, Violin, 2 Violas, and Cello ... 346 2 24 Wolfgang Amadeus Mozart 221331 3665114 0.99 Mozart: Chamber Music 274
3502 3503 Koyaanisqatsi 347 2 10 Philip Glass 206005 3305164 0.99 Koyaanisqatsi (Soundtrack from the Motion Pict... 275

3503 rows × 11 columns

Now we’ll join that table we just made with another table, invoice_items and do a left join (treating invoice_items as the left table). We’ll join on the key that is shared, trackId.

next = pd.merge(left = pd.read_sql("SELECT * FROM invoice_items", con),
         right= tracks_albums,
         how = "left",
         on = "TrackId")
next
InvoiceLineId InvoiceId TrackId UnitPrice_x Quantity Name AlbumId MediaTypeId GenreId Composer Milliseconds Bytes UnitPrice_y Title ArtistId
0 1 1 2 0.99 1 Balls to the Wall 2 2 1 None 342562 5510424 0.99 Balls to the Wall 2
1 2 1 4 0.99 1 Restless and Wild 3 2 1 F. Baltes, R.A. Smith-Diesel, S. Kaufman, U. D... 252051 4331779 0.99 Restless and Wild 2
2 3 2 6 0.99 1 Put The Finger On You 1 1 1 Angus Young, Malcolm Young, Brian Johnson 205662 6713451 0.99 For Those About To Rock We Salute You 1
3 4 2 8 0.99 1 Inject The Venom 1 1 1 Angus Young, Malcolm Young, Brian Johnson 210834 6852860 0.99 For Those About To Rock We Salute You 1
4 5 2 10 0.99 1 Evil Walks 1 1 1 Angus Young, Malcolm Young, Brian Johnson 263497 8611245 0.99 For Those About To Rock We Salute You 1
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
2235 2236 411 3136 0.99 1 Looking For Love 141 1 3 Sykes 391941 12769847 0.99 Greatest Hits 100
2236 2237 411 3145 0.99 1 Sweet Lady Luck 141 1 3 Vandenberg 273737 8919163 0.99 Greatest Hits 100
2237 2238 411 3154 0.99 1 Feirinha da Pavuna/Luz do Repente/Bagaço da La... 248 1 7 Arlindo Cruz/Franco/Marquinhos PQD/Negro, Jove... 107206 3593684 0.99 Ao Vivo [IMPORT] 155
2238 2239 411 3163 0.99 1 Samba pras moças 248 1 7 Grazielle/Roque Ferreira 152816 5121366 0.99 Ao Vivo [IMPORT] 155
2239 2240 412 3177 1.99 1 Hot Girl 249 3 19 None 1325458 267836576 1.99 The Office, Season 1 156

2240 rows × 15 columns

Lastly, we’ll now combine that table with the invoices table using an outer join on the key InvoiceId. Note that you can actually do all of these joins in SQL using one call (which is likely much more efficient!). Please see the additional readings for topic 2 for more information.

pd.merge(left = pd.read_sql("SELECT * FROM invoices", con),
         right = next,
         how = "outer",
         on = "InvoiceId")
InvoiceId CustomerId InvoiceDate BillingAddress BillingCity BillingState BillingCountry BillingPostalCode Total InvoiceLineId ... Name AlbumId MediaTypeId GenreId Composer Milliseconds Bytes UnitPrice_y Title ArtistId
0 1 2 2009-01-01 00:00:00 Theodor-Heuss-Straße 34 Stuttgart None Germany 70174 1.98 1 ... Balls to the Wall 2 2 1 None 342562 5510424 0.99 Balls to the Wall 2
1 1 2 2009-01-01 00:00:00 Theodor-Heuss-Straße 34 Stuttgart None Germany 70174 1.98 2 ... Restless and Wild 3 2 1 F. Baltes, R.A. Smith-Diesel, S. Kaufman, U. D... 252051 4331779 0.99 Restless and Wild 2
2 2 4 2009-01-02 00:00:00 Ullevålsveien 14 Oslo None Norway 0171 3.96 3 ... Put The Finger On You 1 1 1 Angus Young, Malcolm Young, Brian Johnson 205662 6713451 0.99 For Those About To Rock We Salute You 1
3 2 4 2009-01-02 00:00:00 Ullevålsveien 14 Oslo None Norway 0171 3.96 4 ... Inject The Venom 1 1 1 Angus Young, Malcolm Young, Brian Johnson 210834 6852860 0.99 For Those About To Rock We Salute You 1
4 2 4 2009-01-02 00:00:00 Ullevålsveien 14 Oslo None Norway 0171 3.96 5 ... Evil Walks 1 1 1 Angus Young, Malcolm Young, Brian Johnson 263497 8611245 0.99 For Those About To Rock We Salute You 1
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
2235 411 44 2013-12-14 00:00:00 Porthaninkatu 9 Helsinki None Finland 00530 13.86 2236 ... Looking For Love 141 1 3 Sykes 391941 12769847 0.99 Greatest Hits 100
2236 411 44 2013-12-14 00:00:00 Porthaninkatu 9 Helsinki None Finland 00530 13.86 2237 ... Sweet Lady Luck 141 1 3 Vandenberg 273737 8919163 0.99 Greatest Hits 100
2237 411 44 2013-12-14 00:00:00 Porthaninkatu 9 Helsinki None Finland 00530 13.86 2238 ... Feirinha da Pavuna/Luz do Repente/Bagaço da La... 248 1 7 Arlindo Cruz/Franco/Marquinhos PQD/Negro, Jove... 107206 3593684 0.99 Ao Vivo [IMPORT] 155
2238 411 44 2013-12-14 00:00:00 Porthaninkatu 9 Helsinki None Finland 00530 13.86 2239 ... Samba pras moças 248 1 7 Grazielle/Roque Ferreira 152816 5121366 0.99 Ao Vivo [IMPORT] 155
2239 412 58 2013-12-22 00:00:00 12,Community Centre Delhi None India 110017 1.99 2240 ... Hot Girl 249 3 19 None 1325458 267836576 1.99 The Office, Season 1 156

2240 rows × 23 columns