Crandall Specimen Collection Database

Mark Valentine and Dr. Keith Crandall, Biology

The goal of this project was to design and implement a database and web-based interface to store information on the specimens collected for study in the Crandall Lab. A user-friendly, centralized database that warehouses the information on all of the specimens studied is anticipated to be of great use to the lab. Previously, the information was stored in various locations and formats and on different media, which made it difficult to use. The database resolves these problems in several ways. First, it provides an easy and fast interface for adding existing information from the many different sources where it is currently stored. It also makes it easy to add new specimens as they are collected and to update them as they are processed. Further, the database is fully searchable, allowing quick and efficient access to the information that is needed. Finally, the database links to other sites for easy transfer of information between databases (e.g., downloading specimen information into the M. L. Bean Life Science Museum’s Specify software or retrieving GenBank accession numbers for genetic data associated with specimens). The centralization, standardization, and ease of use provided by the database streamline and simplify many of the tasks performed in the Crandall Lab.

The database is implemented in MySQL. The database and interface scripts are located on rey.cs.byu.edu, a server in Dr. Clements lab. I wrote the web interface in PHP and employed Ajax as implemented by xajax. I did my programming using Zend Studio, a PHP development environment; PuTTY, a Windows SSH client; and WinSCP, a Windows SFTP client.

The web interface provides a way to access, edit, and search the data contained in the database and to add new information. Specifically, there are pages for the following tasks:

-Managing collection site information
-Managing organismal information (taxonomy, etc.)
-Managing users
-Managing genes, primers and sequences
-Adding individual specimens
-Uploading files detailing multiple specimens, sites or organisms
-Viewing, searching and editing multiple or individual specimens
-Downloading sequence and other information for one or several specimens
-Uploading photos representative of a certain species
-Viewing photos and other information for a certain species
-Navigating through the site
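
To illustrate two of the tasks above, uploading a file that details multiple specimens and then searching them, here is a minimal sketch. It is written in Python with the standard-library sqlite3 module rather than the PHP and MySQL the project used, and the table name, column names, and function names are all hypothetical, since the actual schema is not reproduced in this report.

```python
import csv
import io
import sqlite3

# Hypothetical column names; the real schema is not given in this report.
SPECIMEN_COLUMNS = ("specimen_code", "species", "site", "collected_on")
SEARCHABLE_COLUMNS = {"specimen_code", "species", "site"}


def create_specimen_table(conn):
    """Create a single illustrative specimen table."""
    conn.execute(
        "CREATE TABLE specimen ("
        " specimen_code TEXT PRIMARY KEY,"
        " species TEXT NOT NULL,"
        " site TEXT NOT NULL,"
        " collected_on TEXT)"
    )


def upload_csv(conn, csv_text):
    """Insert every row of a CSV export and return the number of rows added."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [tuple(row[col] for col in SPECIMEN_COLUMNS) for row in reader]
    with conn:  # commits on success, rolls back the whole file on any error
        conn.executemany("INSERT INTO specimen VALUES (?, ?, ?, ?)", rows)
    return len(rows)


def search(conn, column, term):
    """Substring search on one whitelisted column; returns specimen codes."""
    if column not in SEARCHABLE_COLUMNS:
        raise ValueError(f"cannot search on {column!r}")
    # The column name comes from the whitelist above; the search term is
    # always passed as a bound parameter, never spliced into the SQL text.
    cursor = conn.execute(
        f"SELECT specimen_code FROM specimen WHERE {column} LIKE ?"
        " ORDER BY specimen_code",
        (f"%{term}%",),
    )
    return [code for (code,) in cursor]
```

Wrapping the whole upload in one transaction means a spreadsheet with a bad row is rejected as a unit instead of leaving a partial import behind, which seems a reasonable default when merging records from many older sources.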

There were several changes and additions to my original plan for the database design. To support more functionality, I added two new tables and several fields to existing tables. These changes added the ability to upload and view photos for each species, as well as information on the primers used to generate each sequence and the chromatograms from each sequence read. The result is a more complete database with information for specimens ranging from their initial collection to the final processing of their sequences.
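
One way such additions could fit together is sketched below. This is only an illustration under stated assumptions: it uses Python's standard-library sqlite3 module instead of MySQL, and every table name, column name, and function name is hypothetical, because the report does not list the actual tables or fields.

```python
import sqlite3

# All table and column names below are hypothetical illustrations.
SCHEMA = """
CREATE TABLE species (
    species_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL UNIQUE
);
CREATE TABLE specimen (
    specimen_id INTEGER PRIMARY KEY,
    species_id  INTEGER NOT NULL REFERENCES species(species_id),
    code        TEXT NOT NULL UNIQUE
);
-- Photos attach to a species, not to an individual specimen.
CREATE TABLE photo (
    photo_id   INTEGER PRIMARY KEY,
    species_id INTEGER NOT NULL REFERENCES species(species_id),
    file_path  TEXT NOT NULL
);
CREATE TABLE primer (
    primer_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE
);
-- Each sequence read records its specimen, primer, and chromatogram file.
CREATE TABLE sequence_read (
    read_id           INTEGER PRIMARY KEY,
    specimen_id       INTEGER NOT NULL REFERENCES specimen(specimen_id),
    primer_id         INTEGER NOT NULL REFERENCES primer(primer_id),
    gene              TEXT NOT NULL,
    chromatogram_path TEXT
);
"""


def open_database(path=":memory:"):
    """Open a database and create the illustrative tables."""
    conn = sqlite3.connect(path)
    # SQLite leaves foreign-key enforcement off unless it is requested.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def specimen_history(conn, code):
    """Return (species, gene, primer, chromatogram) rows for one specimen."""
    return conn.execute(
        "SELECT sp.name, sr.gene, pr.name, sr.chromatogram_path"
        " FROM specimen s"
        " JOIN species sp ON sp.species_id = s.species_id"
        " JOIN sequence_read sr ON sr.specimen_id = s.specimen_id"
        " JOIN primer pr ON pr.primer_id = sr.primer_id"
        " WHERE s.code = ? ORDER BY sr.read_id",
        (code,),
    ).fetchall()
```

Keeping primers in their own table and referencing them from each sequence read lets one primer be reused across many reads, and a single join then traces a specimen from its species through to its processed sequences.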

In my original proposal, I included a timetable for the completion of the various stages of the project. The same timetable appears below, updated with the current state of the project. Although not everything was done on schedule, I did complete the entire project.

Project Timetable

Task                                                                      Status
Creation of database infrastructure (tables, organization and relations)  Done
Administrative interface (adding and modifying accounts, etc.)            Done
Interface for adding new specimens                                        Done
Interface for uploading spreadsheets (for adding existing specimens)      Done
Interface for editing specimens                                           Done
Links to NCBI website                                                     Done
Interface to view multiple and individual specimens                       Done
Search functionality                                                      Done
Interface for downloading sequences and other information                 Done
Interface for adding sequence information for existing specimens          Done
Interface to upload/view photos of representative specimens               Done
Layout and style                                                          Done

As with all software, there are surely some bugs that my testing has not uncovered, but I will remain available for repairs and troubleshooting as lab personnel use and more thoroughly test the database. There is also always the potential to add functionality and improve existing features, and to the extent possible I will do so as lab members request. Eventually, I hope that another student with the necessary qualifications and experience will be found to take over maintenance and development of the database. When a replacement is found, I will train him or her and pass the project along.

I feel that this project was a success. I created a database that facilitates research in the Crandall Lab, accomplished all of the goals I set at the outset, and even expanded the original scope of the project. I believe that the database can and should be used for years to come and will solve many of the organization and storage problems that the lab previously faced.

Brigham Young University

Journal of Undergraduate Research
