Joshua Sailsbery and Dr. David McClellan, Integrative Biology
The GeneWorkshop project had one goal: to enable biologists to make better use of bioinformatic tools. Such tools generally have a steep learning curve and are not very robust with respect to the data they accept. To address these problems, GeneWorkshop was divided into two main objectives. The first objective was obtaining and maintaining genetic data. The second was interpreting and implementing various bioinformatic software packages.
To accomplish these objectives, we proposed building an online distributed system on a JBoss server. This system would not only achieve our goal but also provide a development environment for bioinformaticians. However, as work progressed, it became evident that GeneWorkshop would best be built as several smaller pieces, each of which would be a relatively large project in its own right. Progress was made on each of these smaller projects and is summarized below.
As we evaluated exactly what kinds of genetic data a biologist would be studying, it became evident that the first objective would take on a much larger role than the proposal suggested. Given the multiple sources and the variety of formats in which genetic data may be found, a software package was needed to interpret each format. DataConvert was created to meet this need. DataConvert can read DNA, RNA, and amino acid information for several individuals (taxa) from many formats and translate it into another format compatible with the bioinformatic software the user intends to use. The first objective was met, but work remains to provide a database core that meets the data maintenance needs of biologists.
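The kind of translation DataConvert performs can be illustrated with a minimal sketch. This is not DataConvert's actual code: the function names `parse_fasta` and `to_phylip` are hypothetical, Python is used only for illustration, and FASTA and sequential PHYLIP are assumed here as example input and output formats because both are widely used. The sketch reads one sequence per taxon and writes the aligned sequences in the second format.

```python
def parse_fasta(text):
    """Return a list of (taxon, sequence) pairs from FASTA-formatted text."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            parts = line[1:].split()
            # Keep only the first word of the header as the taxon name.
            name, chunks = (parts[0] if parts else "unnamed"), []
        elif name is not None:
            # Sequence data may be wrapped over several lines.
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def to_phylip(records):
    """Format aligned records as sequential PHYLIP text (relaxed-length names)."""
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError("PHYLIP requires at least one record and equal sequence lengths")
    # Pad names so that all sequences start in the same column.
    width = max(len(name) for name, _ in records) + 2
    lines = [f"{len(records)} {lengths.pop()}"]
    for name, seq in records:
        lines.append(f"{name.ljust(width)}{seq}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = ">taxA sample one\nACGT\nAC\n>taxB\nACGTAC\n"
    print(to_phylip(parse_fasta(example)), end="")
```

Separating the reader from the writer means that each additional format needs only one new parser or one new formatter, rather than a converter for every pair of formats.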
For the second objective, different web-layout designs were researched to determine how biologists would best interpret genetic data. These designs have been very effective and have been used in graphical user interfaces for other projects. However, they have not yet been deployed on the web, which is required for full use in a distributed system. Lastly, one of the most challenging projects was interacting with various bioinformatic software packages. Because of time constraints, using JBoss to execute remote commands was not feasible. Therefore, Perl scripts were designed to interface with these packages.
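The project's wrappers were written in Perl and are not reproduced here. As a rough illustration of the general approach, the following Python sketch runs an external command-line program, passes it input, and collects its output. The function name `run_tool` and its parameters are hypothetical, and the example command simply invokes the Python interpreter as a stand-in for a real bioinformatic package.

```python
import subprocess
import sys


def run_tool(command, input_text="", timeout=60):
    """Run an external program and return its standard output as text.

    `command` is a list such as ["program", "--option", "value"].
    """
    completed = subprocess.run(
        command,
        input=input_text,      # sent to the program's standard input
        capture_output=True,   # collect stdout and stderr
        text=True,             # work with str rather than bytes
        timeout=timeout,       # stop waiting if the tool hangs
        check=False,           # inspect the return code ourselves
    )
    if completed.returncode != 0:
        raise RuntimeError(
            f"{command[0]} exited with code {completed.returncode}: "
            f"{completed.stderr.strip()}"
        )
    return completed.stdout


if __name__ == "__main__":
    # Stand-in command: upper-case whatever arrives on standard input.
    stand_in = [sys.executable, "-c",
                "import sys; print(sys.stdin.read().upper())"]
    print(run_tool(stand_in, "acgt"), end="")
```

Passing the command as a list rather than a single shell string avoids quoting problems when file names or sequence labels contain spaces.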
During development, the implementation presented in the proposal was adapted to meet the project's goals. Despite this change in strategy, the groundwork has been laid for further development and refinement.