SALDRU Data Wrangling with Stata 2024 Course

UCT Training Programme in Data Wrangling with Stata: How to Manage your Empirical Research Project

8-19 July 2024

Presented by SALDRU at UCT

Each year since 1999, SALDRU has run a workshop designed for university students or graduates seeking further training in statistics and survey analysis using Stata.

Together with providing training in basic statistical analysis, this two-week course aims to address a crucial aspect of empirical research that is often overlooked: data management. Graduate programmes often focus on mastering sophisticated estimation models. However, without the ability to effectively manage and transform data into a well-structured and analysis-ready format, these advanced methods can be challenging to implement and may yield unreliable results. This hands-on course will provide participants with the essential skills and knowledge to manage and organise their data for empirical research projects using Stata. Through a series of practical real-world examples and exercises, participants will manage, clean, and process data of various types, use do files and macros, merge and reshape datasets, and document their work for reproducibility and transparency.

By the end of the course, participants will be equipped with the necessary tools and techniques to handle data efficiently, allowing them to focus on the analysis and interpretation of results rather than getting lost in data management challenges. This course is recommended for graduate students and researchers conducting empirical social science research. It should not only enhance your research skills but also provide a solid foundation for reproducible and transparent research practices.

Attendees are encouraged to attend all 10 days of the course. The first 2 days are, however, optional for those who already have a basic knowledge of Stata (including the use of do files) and are comfortable performing basic analysis in Stata. Applicants should indicate in their application whether they will attend the first 2 days.

Preliminary course outline:

  1. Introduction to Household Surveys
    • An overview of the major household surveys in South Africa
    • Hands-on training using data from these surveys and other data sources
  2. An Overview of Basic Descriptive Statistics
    • Creation of frequency distributions
    • Measures of central tendency and dispersion
  3. Basic Analysis of Survey Data using Stata
    • Creating graphs and tables
    • Undertaking basic statistical analysis, including regressions
  4. Introduction to Data Management with Stata
    • An overview of Stata’s data management capabilities
    • Basic commands for loading, saving, and exploring data
    • Understanding data types and formats
    • Basic data cleaning and manipulation techniques
  5. Managing Data with Do Files
    • Advantages of do files for reproducibility
    • Writing and using macros in do files
    • Organising do files for efficient data management
  6. Advanced Data Management Techniques
    • Working with missing values
    • Dealing with skip patterns
    • Using date and time data
    • Dealing with duplicates
    • Avoiding common variable construction errors
    • Using the “egen” command for more complex data transformations
  7. Merging and Reshaping Data
    • Understanding the “merge” command and its options
    • Merging datasets with different structures
    • Using “reshape” to transform data from wide to long format and vice versa
  8. Organising and Documenting Your Data
    • Best practices for organising your data files and folders
    • Creating data dictionaries and codebooks
    • Using Stata’s “notes” command and do-file comments to document your work
    • Creating clear and well-documented code for reproducibility and transparency
  9. Tips for Efficient Data Management
    • Using Stata’s built-in help and resources
    • Leveraging Stata’s community resources for additional support and guidance


There is no cost to attend the course and a light lunch will be provided daily.

To apply for the course, click on the following link: UCT Training Programme Application.

Application deadline: 11 June 2024


  • As this workshop tends to be heavily over-subscribed, we cannot guarantee a place for every applicant. Successful applicants will be notified by 18 June 2024. Should you not have heard from us by that date, please consider your application unsuccessful.