Reproducible Data Science with Open Source Python Tools and Real-World Data
This open textbook uses real-world social data sets related to the COVID-19 pandemic to provide an accessible introduction to open, reproducible, and ethical data analysis. It teaches through hands-on coding in Python with Jupyter notebooks, using modern open-source computational tools and data science techniques. Topics include open reproducible research workflows, data wrangling, exploratory data analysis, data visualisation, pattern discovery (e.g., clustering), prediction and machine learning, causal inference, and network analysis.