Kristen Crofts and John S. K. Kauwe
Brigham Young University, Biology
December 31, 2016
pVAAST Analysis of Alzheimer’s Disease Sequencing Project Pedigree Data
Introduction
Alzheimer’s disease is a progressive brain disorder affecting more than 10 percent of Americans
over the age of 65 (1). This disease destroys memory and thinking skills and is a leading cause of
dementia. Although recent research has revealed much about the risk factors for
Alzheimer’s disease, a great deal remains unknown. pVAAST is a software tool that
can analyze the genome sequence data of Alzheimer’s patients together with the
patients’ pedigree information. Combining information from family members makes
it more feasible to identify genetic mutations related to Alzheimer’s disease. The
objective of this analysis was to gain information that will aid in identifying risk
factors for Alzheimer’s disease and in creating effective interventions.
Methods
I had access to the recently released genome sequence data from the Alzheimer’s
Disease Sequencing Project (ADSP), which contains whole genome sequence data for
over 500 individuals in almost 100 families. I also had access to the sequenced
dataset from the 1000 Genomes Project, which contains information for over 1000
individuals worldwide. I downloaded the pVAAST software tool, which uses
“variant frequency data in cases and controls” and “linkage information in pedigrees for disease
finding.” Using various tools, including R, Python, Perl, and Bash, I converted the
data into background files, target files, and pedigree family files formatted specifically for use in
pVAAST. I created subsets of the data to run the analysis at various confidence levels, and I
planned to expand those subsets and analyze the new results. However, I stopped the
research before obtaining any results.
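As an illustration of the pedigree conversion step described above, the following is a minimal Python sketch, not the script used in this project. It assumes the common six-column PED layout (family ID, individual ID, father ID, mother ID, sex, phenotype); the exact pedigree format pVAAST expects should be checked against its documentation. The names Member, find_missing_parents, and write_ped, along with the example family, are hypothetical.

```python
import csv
import io
from typing import Iterable, List, NamedTuple, TextIO


class Member(NamedTuple):
    """One row of a six-column PED file (hypothetical container name)."""
    family_id: str
    individual_id: str
    father_id: str  # "0" marks a founder or unknown parent
    mother_id: str  # "0" marks a founder or unknown parent
    sex: int        # 1 = male, 2 = female, 0 = unknown
    phenotype: int  # 1 = unaffected, 2 = affected, 0 = unknown


def find_missing_parents(members: Iterable[Member]) -> List[str]:
    """Return a message for every parent ID that has no row of its own."""
    members = list(members)
    known = {(m.family_id, m.individual_id) for m in members}
    problems = []
    for m in members:
        for parent in (m.father_id, m.mother_id):
            if parent != "0" and (m.family_id, parent) not in known:
                problems.append(
                    f"{m.family_id}/{m.individual_id}: parent {parent} not in pedigree"
                )
    return problems


def write_ped(members: Iterable[Member], handle: TextIO) -> None:
    """Write members as tab-separated PED rows, one individual per line."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for m in members:
        writer.writerow(m)


# Made-up example family: two founders and one affected child.
family = [
    Member("FAM1", "P1", "0", "0", 1, 1),
    Member("FAM1", "P2", "0", "0", 2, 2),
    Member("FAM1", "C1", "P1", "P2", 2, 2),
]

buffer = io.StringIO()
write_ped(family, buffer)
print(buffer.getvalue(), end="")
```

Checking for parent IDs that have no row of their own before writing the file is one way to catch pedigree errors early, since a missing parent would otherwise only surface when the downstream tool reads the file.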
Results
Unfortunately, another research group completed this analysis before I could finish my
own. I was required to end the research before obtaining results and to focus on a more
timely issue.
Discussion
Although I was unable to complete the project in a timely manner, I gained valuable skills in
bioinformatics research. I became familiar with tools such as R, Perl, and Bash, and
was able to refine my Python skills. I also learned how to combine various datasets and
integrate knowledge I had previously gained from research and coursework with new
knowledge. I learned how to plan, execute, and evaluate.
I faced several setbacks as I tried to complete this analysis. First, because I was an
inexperienced programmer, the code I attempted to run failed time and time again. I had to learn
how to set aside frustration, look at the problem, and find a solution. Additionally, while
deleting extra files in my workspace, I accidentally deleted an important data file.
The data took weeks to retrieve and set my analysis back considerably. Again, I learned
that science and research are filled with mistakes, and I just had to continue onward. Finally, I
was frustrated to learn that the project I had spent so many hours on had been
completed by somebody else. My mentor explained to me that this is the nature of a scientific
field that is growing so quickly, and that I should expect many groups to be researching the
same question at once. While that is exciting, it also adds pressure to the research experience. I
learned that if I have an idea, I should act on it immediately and do all I can to find results as
soon as possible. However, if somebody else beats me to the results, I should be excited for
them as well, because we are all trying to further the field of bioinformatics.
Conclusion
Despite the setbacks, this project helped me to develop an interest in Alzheimer’s disease
research. Although I had to end this project early, I have continued my research in Alzheimer’s
disease by studying SNP and gene interactions. This research has required me to use the same
skills I learned while completing the pVAAST analysis project. I am grateful I was given the
opportunity to complete a research project with the help of my mentor, Dr. Kauwe, and BYU’s
ORCA program.