Streaming Joins
The video below discusses how to join two streams, how to join a stream with a static data frame, and the requirements for doing so!
I highly recommend watching the video using the ‘full’ Panopto player. There is a ‘pop out’ button in the bottom right of the video to enter this viewer.
The notebook used in the video is available here. You’ll need to download this .ipynb file and upload it to your JupyterHub environment. Make sure that the kernel used to run the notebook is a pyspark kernel!
Remember, if you are off campus you should log in to the VPN and then you can access our JupyterHub.
The impressions and clicks data sets are available at https://www4.stat.ncsu.edu/online/datasets/.
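As a rough companion to the video, here is a minimal sketch of what a stream-stream join on these two data sets could look like in pyspark. It is not the code from the notebook. The column names (impressionAdId, clickAdId, impressionTime, clickTime), the watermark delays, and the one-hour window are all assumptions chosen for illustration, and both helper functions are hypothetical. The idea is that each stream gets a watermark on its event-time column and the join condition includes a time-range bound, which lets Spark discard old state instead of buffering it forever.

```python
def time_range_join_condition(left_key, right_key, left_time, right_time, max_delay):
    """Build a SQL join condition: equal keys plus a bound on the time gap.

    Hypothetical helper; all arguments are column names except max_delay,
    which is an interval such as "1 hour".
    """
    return (
        f"{left_key} = {right_key} AND "
        f"{right_time} >= {left_time} AND "
        f"{right_time} <= {left_time} + interval {max_delay}"
    )


def join_impressions_with_clicks(impressions, clicks):
    """Inner-join two streaming DataFrames on ad id within a one-hour window.

    The column names below are assumptions; adjust them to match the
    actual impressions and clicks data sets.
    """
    # Imported here so the helper above can be used without pyspark installed.
    from pyspark.sql.functions import expr

    # Watermarks tell Spark how late an event may arrive on each stream.
    impressions_wm = impressions.withWatermark("impressionTime", "2 hours")
    clicks_wm = clicks.withWatermark("clickTime", "3 hours")

    condition = time_range_join_condition(
        "impressionAdId", "clickAdId", "impressionTime", "clickTime", "1 hour"
    )
    # join() defaults to an inner join when no join type is given.
    return impressions_wm.join(clicks_wm, expr(condition))
```

To try it, you would pass in two streaming data frames (for example, ones created with spark.readStream) and then start a query on the result with writeStream. Joining a stream with a static data frame uses the same join() method, with the static data frame read through spark.read instead.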
Notes
This ends the course! I hope you enjoyed it :)
Head back to the Moodle site to work on your final project.