Document Collection and Data Entry For a Refined Huguenot Immigrant Database With an Emphasis on New York and Rhode Island

Kirk Skidmore and Jacob Ekins with Professor Byron R. Merrill, Ancient Scripture

During the 16th and 17th centuries, the Huguenots (French Protestants) suffered bitter religious persecution at the hands of their fellow countrymen. Some recanted their faith, but many continued to live, some publicly and others privately, according to the dictates of their consciences. Their difficulties did not abate. Persecution, robbery, and bloodshed caused thousands upon thousands to seek refuge in countries near and far. Huguenots, many of whom were artisans and merchants, chose to bring their industry and religious values to the British Colonies and begin life anew. They were an active minority and soon held positions of rank and status within the colonies, and their immediate descendants played significant roles in the Revolutionary War and in the founding of the United States of America. Henry Cabot Lodge spoke of the Huguenots as follows: “I believe that, in proportion to their numbers, the Huguenots produced and gave to the American Republic more men of ability than any other race.”1

With this unique legacy in mind, a collaborative effort has begun to draw together the records of these scattered refugees. The manner of their exile and the passage of time have left the genealogy of the Huguenot people quite disconnected. The Huguenot Immigrants Database, a project originated by Professor Byron Merrill of Brigham Young University, attempts to draw together a patchwork of the information still available concerning this people. Creating and managing a database that will serve as the entry mechanism for more than 100,000 records requires that a number of issues be addressed. To further this work, research goals this year focused on two efforts: performing data entry and reevaluating the database platform. Concerning the latter goal, three fundamental topics were reevaluated: the database platform itself, the overall extraction strategy, and many of the extraction conventions.

Courtesy of the Computer Science Department, an initial prototype platform was implemented at the outset of this project. This allowed data entry to begin. As record entry progressed, this original platform became inadequate. The database was slow, changing the format was laborious, and the information, once entered, was impossible to analyze. In October of 2000, Microsoft Access 2000 was evaluated as an alternative and soon proved to be an attractive option. The speed with which the database operated improved five- to tenfold. Editing the layout of the database no longer required the volunteer time of computer science graduate students; it could be performed at will and with great ease. Analyzing records, which is essential to evaluating the progress and effectiveness of the database, was a strength of this Microsoft application. Its prospects for future use were very good.

Because the ultimate goal is to publish the collected information on the Internet, the logistics of a database search engine must be considered in addition to those of data entry. Accordingly, a data entry strategy was carefully explored once Access 2000 had been formally adopted.

It was determined that a genealogical database must be able to bring together information from different records based on its similarity to a particular search and, simultaneously, keep related individuals from the same record together. Now that it has been identified, this strategy serves to ensure the integrity of the genealogy being extracted from the records. Relational and biographical entry fields were used to meet these criteria. A relational field indicates the absolute relationship between one individual and another. A biographical field permits information to be searched by similarity, even when the information is not exactly the same. The use of these particular fields then became of great interest to the project.
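The distinction between the two kinds of field can be illustrated with a minimal sketch. It is not the project's Access 2000 layout, which is not described here. The class, function names, similarity threshold, and sample entries are all invented for illustration, and Python's standard difflib stands in for whatever similarity matching a published search engine might use.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher


@dataclass
class Person:
    """One extracted individual (hypothetical layout, not the real schema)."""
    person_id: int
    record_id: int   # the source document this entry was extracted from
    surname: str     # biographical field: searched by similarity
    given_name: str  # biographical field
    # Relational field: maps another person_id in the same record to the
    # relationship this person holds toward that person.
    relations: dict = field(default_factory=dict)


def similarity(a, b):
    """Return a 0..1 score for how alike two strings are, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def search_surname(people, query, threshold=0.75):
    """Biographical search: collect near matches across all records."""
    return [p for p in people if similarity(p.surname, query) >= threshold]


def record_group(people, match):
    """Keep everyone extracted from the same record as `match` together."""
    return [p for p in people if p.record_id == match.record_id]


# Invented sample entries; none of these come from the actual database.
people = [
    Person(1, 100, "Du Bois", "Jean", {2: "husband"}),
    Person(2, 100, "Martin", "Marie", {1: "wife"}),
    Person(3, 200, "Dubois", "Jean"),
    Person(4, 300, "Laurent", "Pierre"),
]

if __name__ == "__main__":
    for hit in search_surname(people, "Dubois"):
        others = [p.given_name for p in record_group(people, hit) if p is not hit]
        print(hit.record_id, hit.given_name, hit.surname, "| same record:", others)
```

In this sketch a search for one spelling of a surname also finds a variant spelling in a different record, while the relational field keeps the individuals named together in a single record linked to one another.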

The entry fields of the database, still in its infancy, needed to be refined. To test the capabilities of the fields chosen, two dozen documents were selected and entered as a trial. These materials represented both common genealogical documents and a substantial selection of less common records. As dilemmas presented themselves, each one was methodically addressed by two or more individuals. This problem-solving synergy produced an exceptional handling of the issues and led to an improved data entry interface that was remarkably simple and effective. Troubleshooting was surprisingly time-consuming but vital to the future effectiveness and integrity of this genealogical tool. The time was well spent, considering the number of records to follow, which should now need a minimum of backtracking for corrections or changes. As each issue was addressed, the database was modified to accommodate the idiosyncrasies of the trial documents. These problems included establishing a temporal and demographic scope for the project, selecting which entry fields to include in the layout, identifying where and how certain types of information would be entered, learning the limits of extractors’ interpretation within documents, making protocols for the translation of foreign languages, and working to ensure the highest-fidelity transfer of records from the document to the computer. A record of the evaluation and resolution of each problem encountered was diligently kept. This record will serve as a protocol manual for those who continue with this project.

Once a strategy had crystallized, entry fields had been adjusted, and data entry ambiguities had been answered with conventions and protocols, data entry progressed more aggressively. More than 1800 names have been entered since August of 2000. Of those, 1100 were entered within the last four months. A significant portion of these names came from the Northern Colonies. All of the available records from the Huguenot colony of Narragansett, Rhode Island, have been entered. Numerous names from New York City, New Rochelle, and New Paltz, former bastions of French Protestantism in America, have also been entered. The records from New York City have been an excellent extraction source because of their clarity and simplicity. The New Rochelle and New Paltz records have posed some difficulty for a few reasons. First, these records follow a format unlike that of many of the other records used. Second, the validity of the materials is questionable because they are second-hand and we do not have access to more original sources. It was hoped that all the existing records would be finished by this time; however, troubleshooting issues continue to arise, and this has slowed the data entry.

The Huguenot Immigrants Database project continues to be a challenge, but the outlook is brightening. Special thanks go to Professor Byron Merrill for his invaluable assistance and counsel in each aspect of the project. Another key source of wisdom and perspective has been Professor Ray Wright, with his team of extraction workers and his computer specialists. His group is working on a similar project concerning German immigrants; collaborative efforts have allowed the two groups to make critical evaluations of each other’s projects. In the coming year, with the help of these skills and these individuals, the need to hone the database will diminish while data collection and extraction become the major focus of the Huguenot project. The database is moving forward with vigor and the pace is exciting.

This coming year may even include an introductory move towards Internet publishing. The nature of the work has fostered the development of team problem-solving skills, library savvy, a greater respect for genealogy, and patience.

References

1. Fosdick, Lucian J., French Blood in America. Baltimore: Genealogical Publishing Co., Inc., 1973: 19.