This is a talk by Neil McCulloch, Data Science Engineer at dunnhumby.
Improving the performance of problematic PySpark applications can often seem like a daunting task. In this talk, I will outline a strategy for tackling these projects and walk through a case study on the performance of our in-store availability reporting science, showing how we cut its runtimes in half.
This session was part of the Data Science Festival Summer School in 2023. Find out more at https://datasciencefestival.com/event…
The Data Science Festival is the place for data-driven people to come together, share cutting-edge ideas and solve real-world problems. We run monthly events, meetups and the biggest free-to-attend data festivals in the UK. Join the community at https://datasciencefestival.com/