PI: Joseph Price
Overview
The purpose of this project was to use linked census records to evaluate the long-run effects of access to clean water. By linking individuals across census years we can determine where they lived during their childhood and also have information about their educational attainment and earnings later in life (using the 1940 census). The project was designed to assess the economic returns to clean water by studying a period when the US was still a developing country and was beginning to use chlorination to clean its water.
Evaluation of Academic Objectives
In terms of the specific academic article that we hoped would result from this project, we were not successful. The exact paper that we had planned to write was published this year in the Journal of Economic History by a group of researchers headed by Joe Ferrie at Northwestern University. We became aware of their working paper as we were finishing our data collection. They had gathered very similar data on chlorination dates and had used the same census datasets from Ancestry.com that we were using. We looked into doing a slightly different project using data specifically from Utah but couldn’t get the data we needed on when cities first chlorinated their water. Ultimately, the specific academic outcome was a working paper titled “Economics Returns to Clean Water”.
However, while the immediate goals of our proposal were not successful, this project opened up a much broader set of research opportunities that will ultimately result in several papers and has already resulted in an additional $50,000 in funding from outside of BYU to cover the wages of BYU undergraduate research assistants. This project directly resulted in the creation of the BYU Record Linking Lab (rll.byu.edu). Through this lab we have contracts with scholars at UCLA and Michigan to create linkages across records by adding individuals to the Family Tree at FamilySearch, then using the search tools on FamilySearch to attach records to individuals and link families together over time.
In addition to the direct research experiences the students in the lab have had this last year, we have also added about 60,000 new individuals to the Family Tree. We will be able to continue to monitor these new additions to see how quickly other people make edits to these individuals and expand that part of the Family Tree. We are now working with FamilySearch to use these insights to better understand the nature of the Family Tree, how quickly it is changing over time, and when the US part of the Family Tree is likely to be complete.
Evaluation of the Mentoring Environment
The project had three main components that provided distinct research experiences for students. The most skill-intensive part of the project was linking together individuals across the 1900-1940 census years. These data were provided to us by Ancestry.com through a partnership with the National Bureau of Economic Research. These are the complete count versions of these censuses and are enormous datasets. The project involved students developing methods to work with these large datasets and identifying the best approaches to match individuals across records. Transcription errors are common in these records, so we had to work with various fuzzy matching approaches to link people across records.
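To illustrate the general idea of fuzzy matching across census records, here is a minimal sketch in Python. It is not the method the lab used; the record fields (id, name, birth_year, birthplace), the two-year birth-year tolerance, and the 0.85 similarity threshold are all assumptions chosen for illustration. It narrows candidates by birthplace and approximate birth year, scores name similarity with the standard library's difflib, and keeps a link only when exactly one candidate clears the threshold.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Return a 0-1 similarity score between two names, ignoring case and outer spaces."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def link_records(early, late, year_tolerance=2, threshold=0.85):
    """Link each early-census record to at most one late-census record.

    Records are dicts with hypothetical keys: 'id', 'name', 'birth_year', 'birthplace'.
    Returns a dict mapping early ids to late ids.
    """
    links = {}
    for person in early:
        # Blocking step: only compare records with the same birthplace and a
        # reported birth year within the tolerance (ages are often misreported).
        candidates = [
            rec for rec in late
            if rec["birthplace"] == person["birthplace"]
            and abs(rec["birth_year"] - person["birth_year"]) <= year_tolerance
        ]
        # Fuzzy step: keep candidates whose name is similar enough to absorb
        # transcription errors such as "Olsen" vs "Olson".
        strong = [
            rec["id"] for rec in candidates
            if name_similarity(person["name"], rec["name"]) >= threshold
        ]
        # Accept the link only if it is unambiguous.
        if len(strong) == 1:
            links[person["id"]] = strong[0]
    return links


# Made-up sample records, for illustration only.
early = [
    {"id": "a1", "name": "Johnathan Smith", "birth_year": 1895, "birthplace": "Utah"},
    {"id": "a2", "name": "Mary Olsen", "birth_year": 1898, "birthplace": "Utah"},
]
late = [
    {"id": "b1", "name": "Jonathan Smith", "birth_year": 1896, "birthplace": "Utah"},
    {"id": "b2", "name": "Mary Olson", "birth_year": 1898, "birthplace": "Utah"},
    {"id": "b3", "name": "Mark Olsen", "birth_year": 1898, "birthplace": "Ohio"},
]
links = link_records(early, late)
```

Requiring a unique match trades coverage for accuracy: a record with two plausible candidates is left unlinked rather than linked to the wrong person. With complete count data, comparing every pair is infeasible, which is why the sketch narrows candidates before scoring names.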
The second part of the project involved identifying the date that each city in our sample first began to chlorinate its water. Our research assistants gathered this data by searching newspapers and websites and by contacting the public works departments in various cities. Several students helped with this part of the project.
The final part of the project was conducting the analysis and writing up the draft of the paper. All of the data collecting, analysis, and writing tasks were completed by the research assistants working on the project. One of the students working on the project (Michael Gmeiner) did the bulk of the analysis and writing and is a coauthor on the working paper that we wrote. He is starting his PhD program at Northwestern this fall and will likely be working with the author (Joe Ferrie) who ended up publishing the paper that we hoped to publish.
Students Involved in the Research
All of the students listed below helped with various aspects of gathering and preparing the data, conducting the analysis, creating the figures and tables for the paper, and writing the final text. This list does not include the 20+ students who worked in the Record Linking Lab this summer as part of the externally funded projects that directly resulted from this initial project. Three of the students who worked on this project are currently pursuing PhDs; their programs are noted in parentheses below.
(1) Michael Gmeiner (Northwestern)
(2) Josh Witter (Texas A&M)
(3) Adam Shumway (Cornell)
(4) Dallin Pope
(5) Alex Doss
(6) Michael Proudfoot
(7) Kevin Bessey
Description of How the Budget was Spent
All of the funding from this MEG grant was spent on wages for undergraduate research assistants.