About the P.L. 94-171 "Legacy" Data Project
Public Law (P.L.) 94-171 was enacted by Congress in 1975. The law requires the Census Bureau to release the results of the decennial census for small geographic areas (census blocks) within one year of Census Day so they can be used for legislative redistricting. For the 2020 redistricting cycle, this deadline would have been March 31, 2021. Due to the COVID-19 pandemic and the resulting problems with data collection, the Census Bureau received an extension of this deadline to July 30, 2021.
On February 12, 2021, the Census Bureau published a revised timeline that moved delivery of the P.L. 94-171 redistricting data files to September 30, 2021. This announcement drew considerable pushback and legal action from states whose redistricting and election deadlines could not be met on that schedule. The Census Bureau then announced that its continued evaluation of plans and processes had determined that an interim format of the P.L. 94-171 dataset could be released earlier, by August 16, 2021. This interim data product is referred to as the “legacy data”. The Census Bureau released these data on August 12, 2021.
The "legacy" data files contain the same data that the Census Bureau will release by September 30, 2021; the later release will present them in an easier-to-use, tabulated format. The "legacy" files are the final P.L. 94-171 data product, but they are not user-friendly: making them usable for redistricting requires more advanced database, analysis and data manipulation skills than the later release does. The "legacy" data consist of a group of pipe-delimited text files, with the geography file and the data segments released as separate files.
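As a rough illustration of what combining separate files involves, the sketch below reads a geography file and one data segment as pipe-delimited text and joins them into a single table. The column names, the shared record identifier, the absence of a header row, and the sample values are all assumptions made for this example; they are not the Census Bureau's actual record layout.

```python
import csv
import io

# Hypothetical miniature stand-ins for one geography file and one data
# segment. Real files have many more columns; this layout is assumed
# for illustration only.
GEO_TEXT = "0000001|060010001001000\n0000002|060010001001001\n"
SEGMENT_TEXT = "0000001|120|80\n0000002|45|30\n"

GEO_COLUMNS = ["record_id", "block_id"]                    # assumed names
SEGMENT_COLUMNS = ["record_id", "total_pop", "adult_pop"]  # assumed names


def read_pipe_delimited(handle, columns):
    """Read a header-less pipe-delimited file into a list of dicts."""
    reader = csv.reader(handle, delimiter="|")
    return [dict(zip(columns, row)) for row in reader]


def join_on_record_id(geo_rows, segment_rows):
    """Attach each segment row's counts to the matching geography row."""
    segments = {row["record_id"]: row for row in segment_rows}
    joined = []
    for geo in geo_rows:
        seg = segments.get(geo["record_id"])
        if seg is None:
            continue  # no data segment row for this geography record
        merged = dict(geo)
        # Convert the count fields to integers; keep identifiers as text.
        merged.update({k: int(v) for k, v in seg.items() if k != "record_id"})
        joined.append(merged)
    return joined


geo_rows = read_pipe_delimited(io.StringIO(GEO_TEXT), GEO_COLUMNS)
segment_rows = read_pipe_delimited(io.StringIO(SEGMENT_TEXT), SEGMENT_COLUMNS)
table = join_on_record_id(geo_rows, segment_rows)
```

Identifiers are kept as text so that leading zeros are preserved. With real files, `io.StringIO(...)` would be replaced by an opened file handle.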
Under state law (Government Code Section 8253), “The Legislature shall take all steps necessary to ensure that a complete and accurate computerized database is available for redistricting,” and the Statewide Database (SWDB) is tasked with meeting this mandate.
In order to facilitate an earlier start to the California redistricting process and relieve tension in the local and state timelines, the SWDB was tasked with taking the steps necessary to process the “legacy data” into the tabulated, easier-to-use format.
SWDB collaborated on this project with California’s State Demographer, Dr. Walter Schwarm, who serves as the Chief of the Demographic Research Unit (DRU) of the State’s Department of Finance. The Demographic Research Unit represents the State in national programs relating to population statistics, including the Federal-State Cooperative Program for Population Projections and Estimates and the Census Data Center Programs. Collectively, Dr. Schwarm, DRU and SWDB determined that the best way to ensure that the dataset would be accurately reformatted was to process the data in parallel in our respective departments. This allowed the data to be processed on different systems, independently of each other, and then compared along the same parameters.
On August 12, 2021, the Census Bureau released the “legacy data” files on its website. SWDB and DRU immediately downloaded the files, processed them using SAS, R and Python scripts, and then compared the results by census block. Upon verification that all files agreed, our departments began to work with the newly formatted file to conduct a preliminary evaluation of the actual census data. Please note that the reformatted legacy data are now in one file, not in individual pieces.
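A block-by-block comparison of two independently produced outputs can be sketched as follows. This is a minimal illustration, not the scripts SWDB or DRU actually used; the function name, field names, and sample values are hypothetical.

```python
def compare_by_block(table_a, table_b):
    """Return (block_id, field, value_a, value_b) for every disagreement.

    Each table maps a census block identifier to a dict of field values.
    """
    mismatches = []
    for block_id in sorted(set(table_a) | set(table_b)):
        row_a = table_a.get(block_id)
        row_b = table_b.get(block_id)
        if row_a is None or row_b is None:
            # The block is present in only one of the two outputs.
            mismatches.append((block_id, "<missing block>", row_a, row_b))
            continue
        for field in sorted(set(row_a) | set(row_b)):
            if row_a.get(field) != row_b.get(field):
                mismatches.append(
                    (block_id, field, row_a.get(field), row_b.get(field))
                )
    return mismatches


# Hypothetical outputs from two independent processing pipelines.
output_one = {
    "060010001001000": {"total_pop": 120, "adult_pop": 80},
    "060010001001001": {"total_pop": 45, "adult_pop": 30},
}
output_two = {
    "060010001001000": {"total_pop": 120, "adult_pop": 80},
    "060010001001001": {"total_pop": 45, "adult_pop": 31},
}

agreeing = compare_by_block(output_one, output_one)
differing = compare_by_block(output_one, output_two)
```

An empty result means the two outputs agree on every block and field; any entry in the list identifies the block and field that need a closer look.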
Under Government Code Section 13073.5, the Demographic Research Unit is charged with validating all official census data and population statistics for the State of California. The Demographic Research Unit has validated the reformatted legacy data on the SWDB site as the official census data for California.
Please note that validation does not mean that it was determined that the Census Bureau’s data are free of errors. During our preliminary evaluations of the data, we have begun to find errors, some of which may be due to data processing, and others that may be due to the Census Bureau’s new privacy methodology (Differential Privacy).
Unfortunately, the Census Bureau has not provided us with enough information to fully assess the data and to differentiate between errors and anomalies introduced by the new privacy algorithms. We will continue to investigate these data and use the programs available to us, including the Count Question Resolution (CQR) Program, to flag inconsistencies and obvious problems to the Bureau.
We would appreciate the collaboration of California data users in letting us know about errors or anomalies that they find when working with the census data at the local level. Over the next few weeks we will set up a mechanism to collect user feedback and develop a plan to report all verified inconsistencies and errors to the Census Bureau.
The reformatted legacy dataset that we are providing on this site on August 18, 2021, is not the ‘official redistricting database’ that the State’s Citizens Redistricting Commission and most local jurisdictions are required by law to use for their redistricting processes. State law provides that data about inmates who were enumerated by the Census Bureau in facilities under the control of the California Department of Corrections and Rehabilitation (CDCR) be reallocated to their last known residential address. The reformatted legacy data file does not include these reallocations. The Statewide Database will be working on this reallocation process, in addition to other geography and data related processes, for another 30 days after release of the reformatted legacy dataset. The official redistricting database provided for under Elections Code Section 21003 and Government Code Section 8253 will be released on September 20, 2021, on this website.
Links to