General
Where is the Statewide Database?
What type of data do you have?
Where do you get your data?
How can I get local election data?
How much do you charge?
What are FIPS county codes?
FIPS stands for Federal Information Processing Standards. FIPS county codes are unique three-digit codes that identify counties in California. In census files, county FIPS codes are five digits: the first two digits are the state FIPS code (06 for California) and the last three digits identify the county. The table below lists county numbers and county names in alphabetical order, each with its FIPS code. FIPS codes are calculated by taking the county number, multiplying it by 2, and subtracting 1; the result is written as three digits with leading zeros (for example, county number 19, Los Angeles, gives 19 × 2 - 1 = 37, written 037).
County Number | County Name | County FIPS |
---|---|---|
1 | Alameda | 001 |
2 | Alpine | 003 |
3 | Amador | 005 |
4 | Butte | 007 |
5 | Calaveras | 009 |
6 | Colusa | 011 |
7 | Contra Costa | 013 |
8 | Del Norte | 015 |
9 | El Dorado | 017 |
10 | Fresno | 019 |
11 | Glenn | 021 |
12 | Humboldt | 023 |
13 | Imperial | 025 |
14 | Inyo | 027 |
15 | Kern | 029 |
16 | Kings | 031 |
17 | Lake | 033 |
18 | Lassen | 035 |
19 | Los Angeles | 037 |
20 | Madera | 039 |
21 | Marin | 041 |
22 | Mariposa | 043 |
23 | Mendocino | 045 |
24 | Merced | 047 |
25 | Modoc | 049 |
26 | Mono | 051 |
27 | Monterey | 053 |
28 | Napa | 055 |
29 | Nevada | 057 |
30 | Orange | 059 |
31 | Placer | 061 |
32 | Plumas | 063 |
33 | Riverside | 065 |
34 | Sacramento | 067 |
35 | San Benito | 069 |
36 | San Bernardino | 071 |
37 | San Diego | 073 |
38 | San Francisco | 075 |
39 | San Joaquin | 077 |
40 | San Luis Obispo | 079 |
41 | San Mateo | 081 |
42 | Santa Barbara | 083 |
43 | Santa Clara | 085 |
44 | Santa Cruz | 087 |
45 | Shasta | 089 |
46 | Sierra | 091 |
47 | Siskiyou | 093 |
48 | Solano | 095 |
49 | Sonoma | 097 |
50 | Stanislaus | 099 |
51 | Sutter | 101 |
52 | Tehama | 103 |
53 | Trinity | 105 |
54 | Tulare | 107 |
55 | Tuolumne | 109 |
56 | Ventura | 111 |
57 | Yolo | 113 |
58 | Yuba | 115 |
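The rule for deriving the codes in the table can be sketched in a few lines of Python. This is only an illustration: the function name `county_fips` and its `state_fips` parameter are made up for this example and are not part of any Statewide Database tool.

```python
def county_fips(county_number, state_fips="06"):
    """Return (three-digit county FIPS, five-digit state + county FIPS).

    Assumes the California rule described above:
    county FIPS = county number * 2 - 1, zero-padded to three digits.
    """
    if not 1 <= county_number <= 58:
        raise ValueError("California county numbers run from 1 to 58")
    county_code = f"{county_number * 2 - 1:03d}"
    return county_code, state_fips + county_code


print(county_fips(1))   # Alameda
print(county_fips(19))  # Los Angeles
print(county_fips(58))  # Yuba
```

The second element of the result is the five-digit form used in census files.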
Do I need a special program to read the data?
Do you make maps?
Do you draw district lines?
While we are the State of California's "Redistricting Database" we do NOT draw lines. Our purpose is to readily provide data to ALL who wish to use them. While some of our users may intend to use our data to draw their own plans, we do NOT provide instruction on drawing lines, nor will we draw them for anyone.
Where was the new 2001 California congressional seat created?
Do you also have Census Data?
Census Data
What Census data do you provide?
Can I merge 2000 census tract data to the 2001 Assembly, Congressional, Senate, and Board of Equalization Districts?
How do you determine which census blocks/ tracts are in a given city?
Are the 1992G to 2000G census block data files by 1990 or 2000 census block units?
How/by what method was the 1992 to 2000 voting data merged to the 2000 census block from the precinct?
How is the 1992G to 2000G block-level data produced? Given that there are several blocks to a precinct, how were the votes assigned to the block level? Does this create any issues when aggregating blocks to the city level?
Does SWDB provide voter or registration data by zipcode units? What about by ZCTAs?
The Statewide Database is California's redistricting database, and redistricting is done with Census TIGER/Line files. Zip codes are United States Postal Service delivery areas, and as such they are maintained and changed at will by the USPS. USPS zip code areas do not necessarily align with Census geography, such as census block boundaries.
Though it is not our practice to maintain data or geography for zip code units here at the Statewide Database, there are data reports for zip codes on our website that we have either collected or that were created as part of special research projects. These old data reports can be downloaded from our Zip Code Reports archive. Please note that we do not have any current or future plans to publish our precinct data sets by USPS zip code.
ZCTAs, or Zip Code Tabulation Areas, are Census statistical tabulation areas and, unlike USPS zip codes, they do align with census geography. We don't intend to create ZCTA data reports; however, you can create your own using our Precinct to Block Conversion files, which can be found in the “Geographic Data” links under our Election Data page. Using the Statewide Database's rg precinct to block conversion files together with a ZCTA to census block cross-walk file, it is possible to aggregate block-level registration to the ZCTA.
A ZCTA to census block cross-walk file can be found on the Census Bureau's web site.
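As a rough sketch of that aggregation, the Python below sums the BLKREG values from conversion-file rows into ZCTA totals. It rests on assumptions: the rows have already been read into dictionaries, the block identifier sits in a field called `BLOCK` (a hypothetical name; check the actual file layout), and the cross-walk has been loaded as a block-to-ZCTA lookup. The block and ZCTA values shown are made-up sample data.

```python
from collections import defaultdict


def registration_by_zcta(conversion_rows, block_to_zcta, block_key="BLOCK"):
    """Sum block-piece registration (BLKREG) into ZCTA totals.

    conversion_rows: iterable of dicts, one per precinct-block piece.
    block_to_zcta:   dict mapping a census block ID to its ZCTA.
    Blocks missing from the cross-walk are skipped.
    """
    totals = defaultdict(int)
    for row in conversion_rows:
        zcta = block_to_zcta.get(row[block_key])
        if zcta is None:
            continue  # block not assigned to any ZCTA in the cross-walk
        totals[zcta] += row["BLKREG"]
    return dict(totals)


# Made-up sample data for illustration only.
rows = [
    {"RGPREC": "P1", "BLOCK": "B1", "BLKREG": 40},
    {"RGPREC": "P1", "BLOCK": "B2", "BLKREG": 10},
    {"RGPREC": "P2", "BLOCK": "B2", "BLKREG": 25},
    {"RGPREC": "P2", "BLOCK": "B3", "BLKREG": 5},
]
crosswalk = {"B1": "Z1", "B2": "Z2"}  # B3 has no ZCTA in this sample

print(registration_by_zcta(rows, crosswalk))
```

Because the conversion files carry total registrants only, this yields total registration by ZCTA, not a breakdown by party.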
Can we obtain the exact count of the registration breakdown by block for elections after 2000?
You would need to geocode the voter registration file to the census block using each registered voter's address. This would allow you to retain all of the data associated with a registrant, including party affiliation. The Statewide Database's precinct to block conversion files are based only on a geocode of total registrants in a precinct and not their party affiliation.
*Next spring the Statewide Database will be releasing all of our precinct data from the 2002G to the 2010G on the 2011 census blocks. This census block data set of registration and voting data will be constructed using a more precise method than the precinct to block conversion files.
How can I estimate the registration breakdown (i.e. number of Republicans and Democrats) for a particular block using the precinct data files?
Technical GIS and Importing Data
What projection and coordinate system do SWDB's GIS spatial files use?
How do you import DBF files into R?
- If your version of R does not have the "foreign" import library (for example, library(foreign) fails to load), R cannot directly read or import the DBF file. Instead, convert the DBF to a CSV by opening it in Excel and saving it as a CSV, then read that CSV in from the R command line:
- Open Excel. Select File → Open.
- In the "Look In" drop-down box, select the directory in which you saved the DBF file, and in the "Files of Type" drop-down box, select "All Files (*.*)". Double-click the file name and the .dbf file will open in Excel.
- With the DBF open, select File → Save As, and in the "Save as Type" drop-down box, select "CSV (Comma delimited)". Specify the filename and save.
- In R, use the following syntax to open your newly created CSV: read.csv(filename, header = TRUE, sep = ",", quote = "\"", dec = ".", fill = TRUE)
- If library(foreign) loads successfully, you are ready to load the DBF directly into R, for example with that package's read.dbf function. Please refer to the syntax in this page.
Citation: Introduction to SAS. UCLA: Academic Technology Services, Statistical Consulting Group. "How to input data into R".(October 31, 2012).
Conversion Files
What are the DATA CONVERSION files i.e. the SRPREC to BLK, RGPREC to BLK, BLK to MPREC, SRPREC to RGPREC and SRPREC to CITY files and what are they used for?
What is the significance of the "UNASSIGN" values in the spatial and geographic data conversion files?
How is a census geography associated with a precinct in the precinct to blk conversion files i.e. SRPREC to BLK, RGPREC to BLK and BLK to MPREC files and what is the "conversion" based on?
First, the conversion files between blocks and precincts are derived by overlaying the digitally recorded precincts on the census geography. Then every registered voter is "geocoded," that is, placed into his or her census geography by means of address matching.
This process allows the number of registered voters in each census block - precinct piece to be determined. That number is reported in the BLKREG field in the block to precinct conversion files.
When BLKREG is divided by the total number of registered voters in the precinct (SRTOTREG or RGTOTREG), one can derive the proportion of the precinct that is made up of a given block-precinct piece. Vice versa, when BLKREG is divided by BLKTOTREG, the proportion of a block that belongs to a given precinct can be derived. This second measure is reported in the PCTBLK field of the block to precinct conversion files.
How can I merge registration and voting precinct data to census block units using the precinct to block conversion files on the Geographic Data page i.e. the SRPREC to BLK and RGPREC to BLK files? What about merging to 2000 census block groups and/ or 2000 census tracts?
The precinct to block conversion files have a field called "PCTRGPREC" in the RGPREC to BLK files and "PCTSRPREC" in the SRPREC to BLK files. These fields contain the proportion of a given precinct's total registered voters (RGTOTREG or SRTOTREG) that are contained within the portion of the precinct that is encompassed by the census block. The number of registered voters in the census block piece is "BLKREG."
Hence the values in the "PCTRGPREC" and "PCTSRPREC" variable fields are derived thus: BLKREG / RGTOTREG = PCTRGPREC and BLKREG / SRTOTREG = PCTSRPREC.
Often, multiple census blocks intersect a single precinct, and some census blocks are split between several precincts. Overview of the procedure:
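A minimal Python sketch of one way such a precinct to block merge might be carried out follows. It is an illustration under assumptions, not the Statewide Database's own procedure: the conversion rows are assumed to be already loaded as dictionaries, the field names `SRPREC` and `BLOCK` are hypothetical key names, and PCTSRPREC is assumed to be stored as a proportion between 0 and 1 (divide by 100 first if the file stores percentages).

```python
from collections import defaultdict


def precinct_data_to_blocks(conversion_rows, precinct_values,
                            pct_field="PCTSRPREC",
                            precinct_key="SRPREC", block_key="BLOCK"):
    """Distribute a precinct-level count across census blocks.

    Each conversion row is one precinct-block piece. The precinct's value
    is multiplied by the piece's share of the precinct's registrants,
    then the pieces are summed by block.
    """
    block_totals = defaultdict(float)
    for row in conversion_rows:
        value = precinct_values.get(row[precinct_key])
        if value is None:
            continue  # no data reported for this precinct
        block_totals[row[block_key]] += value * row[pct_field]
    return dict(block_totals)


# Made-up sample: precinct P1 is split across blocks B1 and B2,
# and precinct P2 lies entirely within block B2.
pieces = [
    {"SRPREC": "P1", "BLOCK": "B1", "PCTSRPREC": 0.25},
    {"SRPREC": "P1", "BLOCK": "B2", "PCTSRPREC": 0.75},
    {"SRPREC": "P2", "BLOCK": "B2", "PCTSRPREC": 1.0},
]
votes = {"P1": 100, "P2": 40}

print(precinct_data_to_blocks(pieces, votes))
```

The block-level results are estimates: they assume that the votes in a precinct are spread across its block pieces in proportion to registration.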
Is it also possible to use these same files to do the reverse i.e. merge/ convert census block data to the sr, rg and map precincts?
Yes, the RGPREC to BLK and SRPREC to BLK files, as well as the BLK to MPREC file, can be used to merge census block data to the RG, SR and Map precinct types.
In the case of merging census data to precinct units, you will be distributing the value in the "PCTBLK" field, rather than the "PCTRGPREC" and "PCTSRPREC," across the census block data you want to merge to precinct units.
The precinct to block conversion files and the BLK to MPREC conversion file have a field called "PCTBLK." This field contains the proportion of a given 2000 census block's total registered voters (BLKTOTREG) that are encompassed by a single precinct. The number of registered voters in the census block piece is "BLKREG" and the total registered voters in the census block is the "BLKTOTREG."
Hence the values in the "PCTBLK" variable field are derived thus: BLKREG / BLKTOTREG = PCTBLK
Overview of the procedure:
1. Merge the 2000 census block to precinct conversion file to the census block data that you want by precinct units.
2. Distribute the PCTBLK value across the census block data. This renders the census block data into its 2000 census block - precinct pieces.
3. Sum the records by precinct so that you have one record for each precinct.
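The three steps can be sketched in Python as below. This is a hedged illustration rather than an official implementation: the key names `SRPREC` and `BLOCK` are hypothetical, the rows are assumed to be loaded as dictionaries, and the share of each block is recomputed here as BLKREG / BLKTOTREG so the example does not depend on how PCTBLK is stored in the file.

```python
from collections import defaultdict


def block_data_to_precincts(conversion_rows, block_values,
                            precinct_key="SRPREC", block_key="BLOCK"):
    """Allocate a block-level count (e.g. population) to precincts."""
    precinct_totals = defaultdict(float)
    for row in conversion_rows:
        # Step 1: merge the conversion row with the block data.
        value = block_values.get(row[block_key])
        if value is None or row["BLKTOTREG"] == 0:
            continue  # no data, or no registrants to base a share on
        # Step 2: distribute the block value to this block-precinct piece.
        pct_blk = row["BLKREG"] / row["BLKTOTREG"]
        # Step 3: sum the pieces by precinct.
        precinct_totals[row[precinct_key]] += value * pct_blk
    return dict(precinct_totals)


# Made-up sample: block B1 is split between P1 and P2; B2 is all in P2.
pieces = [
    {"SRPREC": "P1", "BLOCK": "B1", "BLKREG": 20, "BLKTOTREG": 80},
    {"SRPREC": "P2", "BLOCK": "B1", "BLKREG": 60, "BLKTOTREG": 80},
    {"SRPREC": "P2", "BLOCK": "B2", "BLKREG": 10, "BLKTOTREG": 10},
]
population = {"B1": 200, "B2": 30}

print(block_data_to_precincts(pieces, population))
```

Note one consequence of this design: a block with no registered voters has no share to allocate by, so its census data is dropped in this sketch and would need separate handling.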
What about merging block group and tract data to the precincts? Can this also be done with the block to precinct conversion files?
Yes, the RGPREC to BLK and SRPREC to BLK files, as well as the BLK to MPREC file, can be used to do this, but it is a more advanced analysis.
To merge 2000 tract or block group data to one of the precinct types (i.e. rg, sr or map), it is first necessary to re-tabulate the records in the conversion file to determine the proportion of a block group's or tract's total registrants that fall into a given precinct.
Overview of the procedure:
1. Merge the re-tabulated precinct to 2000 census block group/tract conversion file to the block group/tract data that you would like to merge to precinct units.
2. Distribute the percent block group/tract value, depending on which census unit you are working with, across the block group/tract data. This renders the block group/tract data into its 2000 census block group/tract - precinct pieces.
3. Sum the records by precinct so that you have one record for each precinct.
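The re-tabulation step might look like the Python sketch below. It assumes, for illustration only, that each conversion row carries a full 15-digit census block identifier in a field called `BLOCK` (a hypothetical name), so that the first 11 characters (state, county, tract) identify the tract and the first 12 identify the block group. The sample identifiers and counts are made up.

```python
from collections import defaultdict


def retabulate(conversion_rows, unit_length=11,
               precinct_key="SRPREC", block_key="BLOCK"):
    """Return {(census_unit, precinct): share of the unit's registrants}.

    unit_length=11 gives tracts; unit_length=12 gives block groups.
    """
    piece_reg = defaultdict(int)   # registrants per unit-precinct piece
    unit_reg = defaultdict(int)    # registrants per census unit
    for row in conversion_rows:
        unit = row[block_key][:unit_length]
        piece_reg[(unit, row[precinct_key])] += row["BLKREG"]
        unit_reg[unit] += row["BLKREG"]
    return {
        key: reg / unit_reg[key[0]]
        for key, reg in piece_reg.items()
        if unit_reg[key[0]] > 0
    }


# Made-up sample: two blocks in one tract, split across two precincts.
rows = [
    {"SRPREC": "P1", "BLOCK": "060014001001000", "BLKREG": 30},
    {"SRPREC": "P1", "BLOCK": "060014001001001", "BLKREG": 10},
    {"SRPREC": "P2", "BLOCK": "060014001001001", "BLKREG": 60},
]

for (tract, precinct), share in sorted(retabulate(rows).items()):
    print(tract, precinct, share)
```

The resulting shares play the same role for tracts or block groups that PCTBLK plays for blocks: multiply the tract-level value by each share, then sum by precinct.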
Precinct Data
I am comparing precinct level data from the Statewide Database to the precinct data from the Registrar of Voters and I am finding discrepancies with registration and the number of Votes cast. Why?
Election Data are kept on different files for different purposes. When you pull election returns from the Registrar of Voters file, you are looking at the Statement of Vote (SOV). You may also see Voter Registration on that file, which corresponds (in some but not necessarily all counties) to the registration 29 days before the day of the election. The SWDB uses the 15-day close file to process registration data: this means the file is frozen 15 days prior to Election Day and changes leading up to the election will not be captured. The SWDB also uses SOV data at the precinct level and those generally agree with the data published on the respective county’s website, although with some exceptions including that turnout may not match in cases of multi-page ballots. Rarely, there are processing errors and those are addressed in our errata files. The SWDB uses an additional file for processing voter data: the Voter History file. Via that file, we can track the voter irrespective of where the person resides, which is important for the building of a longitudinal database. That file also tells us the method by which the voter participated, e.g. did the voter use a mail ballot or did they vote at the polling place.
The use of these different files has implications for data reporting by precinct:
One, as mentioned above, the registration numbers may differ due to our use of the 15-day close and the Registrars' use of the 29-day close of registration. Two, the SOV is stable for each election but the Voter History file is continuously updated. Three, the different categories of vote method don't always add up to the total registration and votes cast in the precinct. Finally, provisional votes are added to the precinct totals later in the process (after Election Day). You are seeing differences from the SOV due to all of the above: the categories don't add up, and some voters are no longer in the precinct (they have moved, etc.) or are new to the precinct by the time we receive the file.
Why is there a discrepancy in the number of precinct records in the precinct data files versus the number of precinct records in the precinct geographic files?
The discrepancy in the number of records arises because not all registration records can be associated with a geographic location, yet they must still be reported in the Statement of Registration; nor can all ballots that are cast be associated with the registered voter's precinct.
I want to aggregate the precinct voting (SOV) data to the city level. Is there any practical implication for using one type of precinct instead of the other? In other words, would it be better to use the sv precinct files instead of the sr precincts or vice versa?
You should use the sr precincts, since this is the precinct unit for which we have precinct to city conversion/equivalency files. These sr precinct to city files describe which precincts are in which cities.
What are the SOV, REG, ABS, POLLV and VOTE files?
Statement of Vote (SOV) data files are available by the sv and sr precinct types. The SOV files are in the first column of the data pages. These files contain the precinct-level voting results.
Statement of Registration (SOR) data files are processed into four file types: REG = registration data for all registered voters; ABS = registration data for registered voters that voted by mail ballot; POLLV = registration data for registered voters that voted at the polling place; VOTE = registration data for all voters that voted. The VOTE files are the sum of the ABS and POLLV files.
Each of the SOR files is available by the rg, rr, and sr precinct types. The same registration data variables are reported for each election. Please refer to the Statement of Registration codebook in the precinct data page for a complete listing of the variables in the SOR files.
Why doesn't the Statewide Database have any registration and SOV data files for map precincts?
The Map Precinct (mprec) is a geographic precinct type that is created by the Statewide Database to reflect the geography of the county's registration precincts as consistently as possible. The RR precincts are the non-geographic version of the MPREC and are aggregations of RG precincts (tabular data) into MPRECs (geographic data). Generally speaking, Map Precincts and RR Precincts follow the same boundaries.
Because the resulting RR Precincts may include RG Precincts that are consolidated into different SV Precincts, we create a geographic consolidation known as the SR Precinct to contain whole RR and SV Precincts.
How can I estimate the registration breakdown (i.e. number of Republicans and Democrats) for a particular census block using the precinct data files?
Are precinct data from one election comparable with precinct data from another election?
For example, Registrars of Voters will often "consolidate" (combine 2 or more into 1) precincts for low turnout elections such as Primary Elections and will "split" (create 2 or more from 1) precincts for high turnout elections, such as the General Presidential Elections.
Why isn't the 2000 to 2008 precinct data available by 2000 census block like the 1992 to 2000 electoral data is?
The 1992 to 2000 census block data sets are part of the merged data set that was produced for the 2001 redistricting cycle. In between redistricting cycles, we do not create merged data sets. We do produce precinct/block conversion files/equivalency files which allow experienced GIS users to perform their own merges through a geographical proportionality method.
How/by what method was the 1992 to 2000 voting data merged to the 2000 census block from the precinct?
What projection and coordinate system do SWDB's GIS spatial files use?
How is the 1992G to 2000G block-level data produced? Given that there are several blocks to a precinct, how were the votes assigned to the block level? Does this create any issues when aggregating blocks to the city level?
I am interested in aggregating the data to the county level. Does it matter if I use by rgprec, by rrprec, by srprec, or by block files?
What is a registration cycle as it applies to the cycles registered variables i.e. Dem registered 1 cycle (DREG1G), Dem registered 2 cycles (DREG2G) in the Statewide Database registration precinct data (REG, ABS, POLLV & VOTE) files?
There is a registration date on the registered voter file that is classified according to how many elections ago the voter registered. That is, if you registered in September, 2006, and we are processing for the November, 2010 election, you were registered for three general elections (2006, 2008, 2010).
The program then compares the registered date from the registrant's record with the date in the cohorts to determine where it falls and then increases the value of that cohort by one.
Here are the cohorts:
insert into reg_cohort (dt) values (#election_date);
update reg_cohort set cohort_1 = date_sub( dt, interval 2 year);
update reg_cohort set cohort_2 = date_sub( dt, interval 4 year);
update reg_cohort set cohort_3 = date_sub( dt, interval 6 year);
update reg_cohort set cohort_4 = date_sub( dt, interval 8 year);
update reg_cohort set cohort_5 = date_sub( dt, interval 10 year);
update reg_cohort set cohort_6 = date_sub( dt, interval 12 year);
update reg_cohort set cohort_7 = date_sub( dt, interval 14 year);
update reg_cohort set cohort_8 = date_sub( dt, interval 16 year);
So going backwards from the election date (say the g10 election) puts a registration date of December 2008 into cohort_1.
Since there is a 15-day close of registration in California, if you registered at the last moment for the 2008 election, you had to be registered in October, so you would end up in cohort_2. (cohort_1 covers registrations within the last two years, with no other general election in between. cohort_9 covers registrations over 16 years ago or records with no registration date, but that is rare now.) No adjustments are made for purges of the registration rolls.
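The cohort logic described above can be approximated in Python as follows. This is a sketch under assumptions, not the Statewide Database's actual program: the function names are invented for illustration, a registration date is assumed to fall in a cohort when it is strictly later than that cohort's boundary date, and November 2, 2010 is used as the g10 election date.

```python
from datetime import date


def cohort_boundaries(election_date, n_cohorts=8, cycle_years=2):
    """Boundary dates 2, 4, ..., 16 years before the election.

    Assumes the election is not on February 29, which holds for
    November general elections.
    """
    return [
        election_date.replace(year=election_date.year - cycle_years * k)
        for k in range(1, n_cohorts + 1)
    ]


def registration_cohort(reg_date, election_date):
    """Return the cohort number (1-9) for one registrant."""
    if reg_date is None:
        return 9  # no registration date on file
    for k, boundary in enumerate(cohort_boundaries(election_date), start=1):
        if reg_date > boundary:
            return k
    return 9  # registered more than 16 years before the election


g10 = date(2010, 11, 2)
print(registration_cohort(date(2008, 12, 15), g10))  # December 2008
print(registration_cohort(date(2008, 10, 20), g10))  # October 2008
print(registration_cohort(date(2006, 9, 15), g10))   # September 2006
```

Under these assumptions the three sample dates land in cohorts 1, 2 and 3 respectively, matching the examples in the text.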
Voting Data
How does SWDB obtain demographic information about voters?
Do you have data on voting patterns by ethnicity?
Please note that Black and white registered voters cannot be matched to an ethnic group based on their last name, so this data will not be able to provide numbers for Black and white registration totals.