Introduction to PyCPS
PyCPS is a package for working with the Current Population Survey (CPS).
I wrote PyCPS as part of a project using CPS data. I wanted my results to be reproducible, and a big part of that involves getting the data. The CPS doesn’t have an API, so this package is the result.
The CPS
This is a very brief overview of the CPS; more detailed explanations are available on the Census Bureau’s and BLS’s websites. A selected household is interviewed for four consecutive months, exits the survey for the next eight months, and then returns to be surveyed for a final four months. In total, a household is interviewed for eight months, spread over a sixteen-month period.
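To make the rotation pattern concrete, here is a minimal sketch that lists the months in which a household would be interviewed, given the month of its first interview. The helper names `add_months` and `interview_months` are made up for this illustration and are not part of PyCPS.

```python
from datetime import date
from typing import List


def add_months(d: date, n: int) -> date:
    """Return the first day of the month that falls n months after d."""
    total = d.year * 12 + (d.month - 1) + n
    return date(total // 12, total % 12 + 1, 1)


def interview_months(first: date) -> List[date]:
    """Interview months for a household whose first interview is in `first`.

    Four months in the survey, eight months out, four months back in.
    """
    offsets = list(range(0, 4)) + list(range(12, 16))
    return [add_months(first, k) for k in offsets]


# Hypothetical household first interviewed in January 2010.
months = interview_months(date(2010, 1, 1))
```

Under this sketch, a household entering in January 2010 is interviewed in January through April 2010 and again in January through April 2011, so the eight interviews span sixteen calendar months.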
The basic goal of this package is to construct a somewhat consistent time series from the monthly CPS files. This goal is complicated by the fact that the CPS wasn’t really designed to be a longitudinal dataset.
PyCPS provides a few related functions:
- Download data dictionaries and monthly data files
- Standardize features across months as much as possible
- Merge households across months to create a time series
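The last two steps can be sketched in plain Python. This is only an illustration of the idea, not PyCPS's actual API: the functions `standardize` and `build_panel`, the field names `household_id`, `hh`, and `employed`, and the toy records are all assumptions made up for the example. Real CPS files use their own identifiers and layouts, which the data dictionaries describe.

```python
from collections import defaultdict


def standardize(record, renames):
    """Return a copy of record with month-specific field names mapped to common ones."""
    return {renames.get(name, name): value for name, value in record.items()}


def build_panel(monthly_files, renames_by_month, id_key="household_id"):
    """Group records by household across months, in chronological order.

    monthly_files maps a 'YYYY-MM' string to that month's list of records.
    renames_by_month maps a 'YYYY-MM' string to a {old_name: new_name} dict.
    """
    panel = defaultdict(list)
    for month in sorted(monthly_files):  # 'YYYY-MM' strings sort chronologically
        renames = renames_by_month.get(month, {})
        for record in monthly_files[month]:
            clean = standardize(record, renames)
            panel[clean[id_key]].append((month, clean))
    return dict(panel)


# Toy data: the February file uses a different (hypothetical) name for the id field.
monthly_files = {
    "2010-02": [{"hh": "A", "employed": 1}],
    "2010-01": [
        {"household_id": "A", "employed": 0},
        {"household_id": "B", "employed": 1},
    ],
}
renames_by_month = {"2010-02": {"hh": "household_id"}}

panel = build_panel(monthly_files, renames_by_month)
```

Here `panel["A"]` holds household A's records for both months under a common field name, while `panel["B"]` has a single month. Keeping the rename table separate from the merge keeps the month-specific quirks in one place.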