An Introduction to PySpark
Part of the All Things Data! specialist track
What is PySpark? Can it solve all of my data problems? Are you sure I can’t just use pandas instead?
This talk aims to answer at least some of the questions you may have if you’re starting out with PySpark or looking to scale up your use of it when working with data. It will combine introductory tips with more technical background, including a peek at what goes on behind the API and some common issues I’ve run into, so that you can hopefully spend slightly less time getting frustrated by them.
So whether you’re prepping for your next data engineering interview or are just curious about what might be going on in one small corner of the world of data, you’ll hopefully leave this talk feeling a bit more confident taking your next steps with big data.
See this talk and many more by getting your ticket to PyCon AU now!
Alex is a software engineer working for Geoscape Australia and is based in Canberra, Australia. She previously spent several years working as a data engineer in the Australian Public Service. She's a co-organiser of both the Canberra Python User Group and the upcoming Django Girls Canberra workshop. She's passionate about big data, clean code and supporting those with a marginalised experience of gender in tech.