In 2022, Justin de Benedictis-Kessner (Harvard Kennedy School), Diana Da In Lee (Columbia University), Yamil Velez (Columbia University), and Christopher Warshaw (George Washington University) released the Local Elections Database, a new, comprehensive database of about 78,000 candidates in 57,500 local electoral contests between 1989 and 2021. The dataset contains information on candidates, election results, and key demographics, including race and gender.

Citing the Original Dataset

+ Who created this dataset?

Justin de Benedictis-Kessner, Diana Da In Lee, Yamil Velez, and Christopher Warshaw contributed equally to creating the American Local Government Elections Database. To access the original datasets, you can visit the database page on the OSF.

+ How should I cite the original authors' work?

You can cite the database as:

de Benedictis-Kessner, J., Lee, D., Velez, Y. R., & Warshaw, C. (2023, May 16). American Local Government Elections Database. Retrieved from osf.io/mv5e6

You can cite the authors' working paper as:

de Benedictis-Kessner, Justin, Diana Da In Lee, Yamil R. Velez, and Christopher Warshaw. "American Local Government Elections Database." HKS Faculty Research Working Paper Series RWP22-013, September 2022 (rev. June 2023).

Access the working paper.

About this Website

+ How should I cite this website?

You can cite the website as:

Ahmed, K., Grucza, S., Martinez, B., & Nakamura, K. The Local Election Database Project. (2024, Jan. 15). de Benedictis-Kessner, J., Lee, D., Velez, Y. R., & Warshaw, C. (2023, May 16). https://code4policy.com/2024-b1-Local-Election-Database/index.html.

+ How has the team modified the original dataset?

The maps that display representation on a 10-point scale (e.g. "Heading of Scoring Map on gender-web-page") are intended for activists and organization leaders who want a ready-made heuristic for identifying governmental entities in which social groups such as women hold a smaller share of seats than their share of the population.

We downloaded the file ledb_candidatelevel.csv from the original dataset and ran a program to create a dataframe of elections aggregated by county, with the number of men and women who ran for office, the number who won, and the total number of seats available.
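The aggregation step can be sketched roughly as follows. This is a minimal illustration and not the team's actual program: the field names (fips, gender, won) and the sample rows are hypothetical, and it assumes one winner per seat.

```python
from collections import defaultdict

# Hypothetical candidate-level rows; the field names are assumptions,
# not the actual headers of ledb_candidatelevel.csv.
candidates = [
    {"fips": "01001", "gender": "female", "won": True},
    {"fips": "01001", "gender": "male", "won": False},
    {"fips": "01001", "gender": "male", "won": True},
    {"fips": "06037", "gender": "female", "won": False},
    {"fips": "06037", "gender": "male", "won": True},
]


def aggregate_by_county(rows):
    """Count candidates, winners, and seats per county, split by gender."""
    totals = defaultdict(lambda: {
        "men_ran": 0, "women_ran": 0,
        "men_won": 0, "women_won": 0, "seats": 0,
    })
    for row in rows:
        county = totals[row["fips"]]
        prefix = "women" if row["gender"] == "female" else "men"
        county[f"{prefix}_ran"] += 1
        if row["won"]:
            county[f"{prefix}_won"] += 1
            county["seats"] += 1  # assumption: one winner per seat
    return dict(totals)


by_county = aggregate_by_county(candidates)
```

In practice the rows would be read from the CSV file rather than written inline.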

Additionally, we merged the data with counties_constituency_data1.csv, a file with data on the proportion of women in each county in the American Local Government Elections Database. The file is also available for download from our project repository.
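A merge of this kind might look like the sketch below. The join key (a county FIPS code), the field names, and the sample values are all assumptions made for illustration; the real files may be structured differently.

```python
# Hypothetical aggregated election counts keyed by county FIPS code.
elections = {
    "01001": {"women_won": 1, "seats": 2},
    "06037": {"women_won": 0, "seats": 1},
}

# Hypothetical constituency data: percent of women per county.
constituency = {"01001": 51.3, "06037": 50.7, "48201": 50.2}


def merge_on_fips(election_counts, pct_women_by_fips):
    """Inner join: keep only counties that appear in both inputs."""
    merged = {}
    for fips, counts in election_counts.items():
        if fips in pct_women_by_fips:
            merged[fips] = {**counts, "pct_women": pct_women_by_fips[fips]}
    return merged


merged = merge_on_fips(elections, constituency)
```

An inner join is only one option; counties missing from either file are dropped here, which may or may not match what the original program does.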

Finally, the program also creates a representation variable that is the number of female winners over the percent of women in the population. We transformed this value into a representation score by multiplying each value (which fell in the range 0 to 2) by 5, giving a score from 0 to 10.
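One way to read this calculation is sketched below. It rests on two assumptions that the description does not confirm: that the numerator is the share of seats won by women (rather than a raw count), and that ratios above 2 are capped at 2 so the score stays on the 10-point scale. The function name and parameters are hypothetical.

```python
def representation_score(women_won, seats, pct_women_population):
    """Return a 0-10 score; 5 means women hold seats in proportion
    to their share of the population.

    Assumptions (not confirmed by the source): the numerator is the
    share of seats won by women, and ratios above 2 are capped at 2.
    """
    if seats == 0 or pct_women_population <= 0:
        return None  # score is undefined without seats or a population share
    share_won = women_won / seats
    ratio = share_won / (pct_women_population / 100)
    ratio = min(ratio, 2.0)  # keep the ratio inside the stated 0-2 range
    return ratio * 5
```

Under these assumptions, a county where women are half the population and hold half the seats scores 5, and a county with no female winners scores 0.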

We also displayed representation as a statistical value using the Chi-square test. The Chi-square test helps researchers identify outlier counties with statistically anomalous levels of representation.

We then processed ledb_candidatelevel.csv again to create a dataset with FIPS codes, year of election, and the estimated gender value, male or female, of the elected representatives.

Then, we ran the program gender-count-chisquare-with-female-representation-score.py to apply the Chi-square Goodness of Fit Test to each row of data. The results populate a file called chi_square_results_with_female_representation_score_by_fips.csv. This file is also in the "Gender Web Page" folder in the repository.
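For readers unfamiliar with the test, the sketch below shows a Chi-square goodness-of-fit calculation for a single row with two categories (women and men). It is an illustration under assumptions, not the contents of the program named above: the function name and inputs are hypothetical, and the expected counts are assumed to come from the percent of women in the population.

```python
import math


def chi_square_gof(women_won, men_won, pct_women_population):
    """Chi-square goodness-of-fit test with two categories (1 degree of freedom).

    Returns (statistic, p_value), or None when an expected count is zero.
    """
    total = women_won + men_won
    p_women = pct_women_population / 100
    expected_women = total * p_women
    expected_men = total * (1 - p_women)
    if expected_women <= 0 or expected_men <= 0:
        return None
    statistic = ((women_won - expected_women) ** 2 / expected_women
                 + (men_won - expected_men) ** 2 / expected_men)
    # For 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(statistic / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value
```

A small p-value suggests that the observed split of winners differs from what the population share would predict. The approximation is generally considered unreliable when expected counts are small, so rows with few seats deserve caution.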

