NCGR :: Nat'l Center for Genome Resources

Our Work - Collaborative Projects and Scientific Software

The success of NCGR depends on collaborative research at the intersection of bioscience, computing and mathematics. Today at NCGR, our scientists and partners study the influence of genetic variability of both host and pathogen on infectious disease progression.  NCGR's software engineers develop scientific software solutions to support and enable those studies. A range of federal and state funded programs support our programs and projects in Human Health, Infectious Disease, Legume Crop Improvement, and Food Security.

Alpheus® Software System
 
Alpheus Software System

In 2006, NCGR commenced development of the Alpheus™ (N. Miller, Project Lead) web-based software system for analysis of data in massively parallel resequencing projects. Specifically, Alpheus™ was designed for resequencing-based case-control association studies to identify the genetic basis of complex diseases and traits. Alpheus™ provides massively parallel sequence pipelining, visualization, analysis, and project management capabilities.

Alpheus™ provides dynamic queries and visualization of read data, variant data and results via an intuitive user interface. Alpheus™ reports sSNPs, nsSNPs, indels, premature stop codons, and splice isoforms. Read coverage statistics are reported by gene or transcript together with a visualization module based upon an individual transcript or genomic segment.
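The distinction between sSNPs, nsSNPs, and premature stop codons can be illustrated with a minimal Python sketch. This is not Alpheus™ code: the function name `classify_coding_snp` and its labels are hypothetical, and the sketch assumes the standard genetic code and a single-base change within one in-frame codon.

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_coding_snp(ref_codon, alt_codon):
    """Label a single-base codon change (hypothetical helper, for illustration).

    Returns 'sSNP' when the amino acid is unchanged, 'premature stop' when the
    variant codon is a stop codon, and 'nsSNP' for any other change.
    """
    ref, alt = ref_codon.upper(), alt_codon.upper()
    if ref not in CODON_TABLE or alt not in CODON_TABLE:
        raise ValueError("codons must be three DNA bases (A, C, G, T)")
    if sum(r != a for r, a in zip(ref, alt)) != 1:
        raise ValueError("codons must differ at exactly one position")
    ref_aa, alt_aa = CODON_TABLE[ref], CODON_TABLE[alt]
    if ref_aa == alt_aa:
        return "sSNP"            # synonymous: same amino acid
    if alt_aa == "*":
        return "premature stop"  # variant introduces a stop codon
    return "nsSNP"               # non-synonymous: amino acid changes


print(classify_coding_snp("GAA", "GAG"))  # Glu -> Glu
print(classify_coding_snp("GAG", "GTG"))  # Glu -> Val
print(classify_coding_snp("TGG", "TGA"))  # Trp -> stop
```

A production pipeline would also need to handle strand, reading frame, and transcript models, which this sketch leaves out.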

Alpheus™ supports all current DNA sequencing platforms, including:

  • Sanger
  • Roche-454
  • Illumina-Solexa
  • ABI SOLiD

It handles data sets of hundreds of gigabases and supports nucleotide variant and splice isoform identification.

Alpheus™ provides data management services, an analysis pipeline, and internet-accessible software for variant discovery and analysis for ultra-high throughput next-generation sequence data with minimal human manipulation. Alpheus™ is available on a software-as-a-service basis to academic and industry clients. Upon provision of sequence data and reference database coordinates, NCGR provides clients with a secure, custom web interface in which to analyze aligned reads and discover differences between samples. Alpheus™ is also available for local installation.
Contact Faye Schilkey (fds@ncgr.org) for details or pricing information.
 
Human Health/Infectious Disease
 
Schizophrenia Genome Project

The Schizophrenia Genome Project (SGP; S. Kingsmore, PI) was established in 2007 to identify the genetic basis of schizophrenia. The SGP is a collaboration with Dr. Nora Bizzozero at the University of New Mexico (UNM) and Dr. Gary Schroth at Illumina Inc.

To date, the SGP has sequenced the transcriptome of 20 case and control samples, generating more than 15 billion nucleotides of sequence. Most of the samples analyzed to date have been from cerebellar cortex, an affected tissue in schizophrenia. Investigators are performing case-control comparisons to identify non-synonymous nucleotide variants that are associated with schizophrenia. The National Institute of Mental Health has generously provided thousands of archived samples for validation studies.

Preventing Rare Genetic Diseases

There are hundreds of rare, so-called "orphan" diseases like Batten disease, a fatal childhood neurodegenerative disorder. Beyond Batten Disease Foundation is working with NCGR (PI: Dr. Callum Bell) to develop one easy and inexpensive blood test to detect the gene mutations for hundreds of rare autosomal recessive diseases. The science is possible today, and with your support, the test will become a standard of care for all young women.

Understanding the Genetic Changes that Cause Cancer

Cancer is the prototypic genetic disease. For most sporadic cancers, mutations that arise in cells during life (somatic mutations) are responsible for the change from normal responses to growth signals to uncontrolled growth and metastasis. NCGR is working with the International Mesothelioma Program (PI: David J. Sugarbaker, MD) to understand the causative factors in the development of the asbestos-associated lung cancer, mesothelioma, and to translate these findings into improved therapy.

GEYSIR/deCODE Genetics

This NIAID-funded Population Genetics project (J. Gulcher, PI, deCODE Genetics) is aimed at discovering host genes involved in immune response and adverse effects to vaccination. Specifically, the teams at deCODE, NCGR and the University of New Mexico Health Sciences Center (UNM-HSC) will collaborate to study four different populations having

  1. adverse effect to smallpox vaccination,
  2. clinical tuberculosis infection versus seroconversion,
  3. serious influenza infections, and
  4. one or more severe infections associated with encapsulated bacteria such as S. pneumoniae, H. influenzae, and N. meningitidis.

Using the Icelandic genealogy database, deCODE is identifying extended families affected in each category and carrying out genome-wide linkage and case-control association studies to map host genes. Following identification of host genes that confer substantial risk for infection or vaccine response, the UNM-HSC team will functionally validate them by testing protein and mRNA expression differences in monocyte and dendritic cells of patients with infection susceptibility versus controls, with or without in vitro pathogen exposure. NCGR (B. Beavis, PI; S. Baxter, PM) is using its expertise in creating informatics systems and analyzing large datasets to create and update a discovery platform of linkage analysis and validation results called GEYSIR.

Diagnostics for severe sepsis and community acquired pneumonia (CAPSOD)

This NIAID-funded program, titled "CAPSOD", is a public-private, multidisciplinary collaboration involving investigators at ten organizations: NCGR; Duke University Medical Center, Durham, NC; Henry Ford Hospital, Detroit, MI; Durham Veterans Administration Medical Center, Durham, NC; Eli Lilly and Co., Indianapolis, IN; Monarch Life Sciences, Indianapolis, IN; Pfizer, Inc., Groton, CT; Metabolon, Inc., Durham, NC; Roche Diagnostics Corp., Indianapolis, IN; and ProSanos Corp., La Jolla, CA. CAPSOD is a five-year program that will prospectively enroll patients with sepsis and community acquired pneumonia (CAP) at Duke University Medical Center and Henry Ford Hospital. The study will use advanced bioinformatic and proteomic technologies to identify specific protein changes, or biomarkers, in patient blood samples that predict outcome in sepsis and CAP. Development of biomarker-based tests will permit patient selection for appropriate disposition, such as the intensive care unit, and use of intensive medical therapies, thereby reducing mortality and increasing effectiveness of resource allocation. See the full CAPSOD description at ClinicalTrials.gov.

New Mexico IDeA Networks of Biomedical Research Excellence (NM-INBRE)

The NIH/NCRR-funded NM-INBRE program is a collaboration among New Mexico institutions, including New Mexico State University (NMSU), the University of New Mexico (UNM), Eastern New Mexico University (ENMU), New Mexico Institute of Mining and Technology (NMT), New Mexico Highlands University (NMHU), and NCGR. INBRE aims to strengthen biomedical research in New Mexico's institutions of higher education and to prepare faculty and students for participation in the research programs of the National Institutes of Health. NCGR provides bioinformatics training and research, develops customized bioinformatic tools, hosts and maintains the NM-INBRE website to support collaboration, and hosts an annual Bioinformatics Symposium. NCGR's work also includes an outreach program for students at other four-year undergraduate institutions and at tribal and community colleges in the state, with the aim of increasing matriculation in graduate biomedical research programs.

Plant Biology/Nutrition
 
Legume Information System (LIS)

The Legume Information System (LIS) is the result of a cooperative research agreement between NCGR (G. May, PI) and the USDA Agricultural Research Service (ARS) as part of the Model Plant Initiative (MPI). The LIS project provides a publicly accessible legume resource that integrates genetic and molecular data from multiple legume species and enables genomic, transcript and map cross-species comparisons.

Phytophthora Studies

Oomycetes, or water molds, are among the most important eukaryotic plant pathogens. Annually, they cause hundreds of billions of dollars of damage to agricultural and ecological systems worldwide, impacting the productivity and sustainability of food crops, ornamentals, forest products, and seafood. Several oomycete species represent sufficient threats to the safety of the nation's food supply to merit inclusion on the Animal and Plant Health Inspection Service agricultural bioterrorism list or the USDA regulated plant pest list. There is an urgent need to identify the genetic determinants of virulence and host range in order to develop improved control methods.

Phytophthora capsici Genome Project

P. capsici is a pathogen that is not indigenous to the US. It was first reported in the US in 1922 on chili peppers in New Mexico and spread to vegetable production areas in Colorado and Florida in the 1930s and 1940s, affecting tomatoes, eggplants, squash, and melons.

NCGR (S. Kingsmore, PI), along with biologists at University of Tennessee and Ohio State University, is funded by USDA/NSF to sequence the P. capsici genome.  The rationale for these studies is:

  1. P. capsici is a devastating pathogen of vegetable crops of national economic importance;
  2. P. capsici is an excellent genetic model. This project will create broadly applicable resources for gene models and population genetic studies of oomycete biology and hemibiotroph-induced disease;
  3. 454 sequencing technology will be evaluated and benchmarked for de novo sequencing and resequencing in the largest genome studied with it to date (65 Mb).

In collaboration with, and with support from, DOE's JGI sequencing group, the aims are to use novel 454 Life Sciences sequencing technology to generate:

  1. 20X draft genome sequence of the vegetable pathogen Phytophthora capsici,
  2. 2X coverage resequencing in 4 outbred isolates, and
  3. a catalog of single nucleotide variation.
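As a rough sense of scale, the coverage targets above can be turned into total sequence yield. The Python sketch below is a back-of-the-envelope calculation only; it assumes the roughly 65 megabase genome size quoted in the project rationale, and the helper name `bases_required` is hypothetical.

```python
def bases_required(genome_size_bp, depth, samples=1):
    """Total bases needed to cover `samples` genomes at the given average depth."""
    return genome_size_bp * depth * samples


GENOME_SIZE_BP = 65_000_000  # approximate P. capsici genome size quoted in the text

draft = bases_required(GENOME_SIZE_BP, depth=20)                 # 20X draft genome
resequencing = bases_required(GENOME_SIZE_BP, depth=2, samples=4)  # 2X in 4 isolates

print(f"20X draft:       {draft / 1e9:.2f} Gb")
print(f"2X x 4 isolates: {resequencing / 1e9:.2f} Gb")
```

Under these assumptions the draft genome accounts for about 1.3 gigabases of sequence and the four resequenced isolates for about 0.52 gigabases. Actual yields depend on read quality and filtering, which the calculation ignores.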

These resources will be disseminated at the Phytophthora Functional Genomic Database (PFGD).

Phytophthora Functional Genomic Database (PFGD)

PFGD is a web-based, clade-oriented information resource that builds upon data formerly available from the Phytophthora Genome Consortium (PGC) and at the Oomycete Genomics Database, as well as all publicly available P. infestans transcript data. PFGD is funded by NSF (S. Kamoun, Ohio State University, PI). Oomycete sequence data is analyzed and automatically annotated using NCGR's XGI system. PFGD includes functional assays and gene expression data, combined with transcript and genomic analysis and annotation. PFGD also integrates the P. sojae and P. ramorum genomes and their annotations for comparative analysis. In addition, host species data (available at solgd.org) is integrated at PFGD. Going forward, P. capsici sequence data and variant analysis will also be available at PFGD.

Software Tools and Active Software Development Projects

NCGR's programs have produced a number of software tools that are freely available to the scientific community, either for use over the internet or as software downloads. Some software packages are released under open source licenses; others are freely available to non-profit organizations under a licensing agreement. For additional information on NCGR's software tools contact us at info@ncgr.org.

Virtual Plant Information Network (VPIN)

The technical foundation of the VPIN (M. Montoya, PM) is based on evolving semantic web and web services technologies, first developed as part of the previous NSF-funded MOBY projects. The on-going development of the VPIN platform is run as an open source project. The code is currently hosted at Open Bioinformatics Foundation.

The X Genome Initiative (XGI)

The X Genome Initiative (K. Gajendran, PM) is NCGR's high-throughput computational, species-independent sequence analysis pipeline and database software system. XGI uses a variety of algorithms for sequence pattern recognition, comparison and annotation of genomic, EST or ORF sequence data types. XGI is the annotation and database engine behind both PFGD and the Legume Information System. The system is being modified to incorporate assembly algorithms for 454 Life Sciences and Sanger reads, in addition to handling sequence variant detection. XGI jobs are queued across NCGR's Linux cluster and results are stored in a relational database. XGI operates on batch files and can be configured to perform any series of sequence similarity or motif searching operations based on user preference or sequence type. Pipeline analyses can include BLAST analyses, InterProScan algorithms and a variety of tools for gene prediction and assembly.  Automated post-analysis annotation links best match annotation to Gene Ontology entries and annotations.
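The idea of a batch pipeline whose steps are chosen by sequence type can be sketched in a few lines of Python. This is an illustration only and not XGI's actual design or API: the step names, the `PIPELINES` mapping, and the functions `run_pipeline` and `run_batch` are all hypothetical, and each step is a placeholder that records its name rather than running a real tool such as BLAST or InterProScan.

```python
def make_step(name):
    """Build a placeholder step that only records that it ran (no real analysis)."""
    def step(record):
        return {**record, "steps_run": record["steps_run"] + [name]}
    return step


# Hypothetical configuration: which steps run, in order, for each sequence type.
PIPELINES = {
    "EST": ["assembly", "similarity_search", "motif_search"],
    "genomic": ["gene_prediction", "similarity_search", "motif_search"],
    "ORF": ["similarity_search", "motif_search"],
}

# One placeholder implementation per step name used in the configuration.
STEPS = {name: make_step(name) for steps in PIPELINES.values() for name in steps}


def run_pipeline(sequence_id, sequence_type):
    """Run the configured series of steps for one sequence."""
    if sequence_type not in PIPELINES:
        raise ValueError(f"no pipeline configured for type {sequence_type!r}")
    record = {"id": sequence_id, "type": sequence_type, "steps_run": []}
    for name in PIPELINES[sequence_type]:
        record = STEPS[name](record)
    return record


def run_batch(batch):
    """Process a batch of (sequence_id, sequence_type) pairs in order."""
    return [run_pipeline(seq_id, seq_type) for seq_id, seq_type in batch]


for result in run_batch([("seq1", "EST"), ("seq2", "ORF")]):
    print(result["id"], result["steps_run"])
```

Keeping the step order in a data structure rather than in code is one way to let the same engine serve different sequence types; a real system would additionally queue jobs on a cluster and store results in a database, as the description above notes.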

Comparative Map and Trait Viewer (CMTV)

The Comparative Map and Trait Viewer (A. Farmer, PM) is a graphical client for integrating various types of genomic data from different sources, including annotated sequences, genetic maps and QTL data. The tool allows comparison of maps using a variety of different algorithms. The results of these comparisons are used to integrate data from multiple maps into a common framework. As a component of the ISYS integration platform, it provides a structural/comparative perspective on data that may be simultaneously viewed in relation to functional classification systems such as GO or biochemical interaction networks. Our collaborators at CIMMYT have used the tool to construct drought tolerance consensus maps for Zea mays based on the results of multiple trait mapping experiments under different genetic backgrounds and environmental conditions. The Legume Information Network has used the tool via Java WebStart to provide a client that aggregates genomic data provided via semantic web services and explores synteny relationships among legume species.

The earliest versions of CMTV were developed in collaboration with four CGIAR centers (CIAT, CIMMYT, CIP and IRRI). The current funding for the project comes from a USAID Linkage Grant with CIMMYT. CMTV source code is available from SourceForge.

Genomic Explorer y Survey of Immune Response (GEYSIR)

GEYSIR (Faye Schilkey, PM) is an interactive, web-based genomic visualization tool developed as part of the NIAID/deCODE population genetics project. Web-based tools for exploring genomic data typically are statically rendered HTML pages, which lack live interactivity.  With the exceptional amount of data and scales of size involved in working with genomic data, this lack of dynamic interaction usually becomes not only cumbersome for the user but inadequate for scientific discovery.  To address this issue in the context of population genetics studies, GEYSIR was developed to enable exploration of a wide scale of genomic data, from single nucleotide polymorphisms to gene neighborhoods to marker sets and association data spanning all chromosomes.  GEYSIR is designed to be a highly interactive, dynamic, and responsive web application.  Additionally, it was designed up front for extensibility and reusability so that the code base and architecture can be reused for a wide variety of genomic data, organisms, research, and data models.

Integrated Software Systems (ISYS)

ISYS (A. Farmer, PM) is a dynamic, flexible platform for the integration of bioinformatics software tools and databases. ISYS offers a component-based architecture that enables scientists to "plug and play" among tools of interest. These tools may be separately developed and independently evolving. In addition, ISYS allows web-based resources to be integrated with programs running on the scientist's desktop.

ISYS's DynamicDiscovery™ technology creates an exploratory environment in which scientists can navigate freely among registered components. DynamicDiscovery helps to guide the user by suggesting appropriate registered components to process selected data objects. In addition, ISYS supports visual synchronization among components, which helps each one to complement the others. ISYS is written in Java for platform independence and is supported on Windows and Solaris. It is also available without a Java Virtual Machine for Linux and other types of UNIX. The ISYS Platform code has been released under an Open Source license and is available for download from SourceForge.
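The general idea of suggesting registered components for a selected data object can be illustrated with a small registry. The sketch below is in Python for brevity and is not ISYS code (ISYS itself is written in Java): the class `ComponentRegistry`, its methods, and the component and data type names are all hypothetical.

```python
class ComponentRegistry:
    """Toy registry: components declare which data types they can process."""

    def __init__(self):
        self._accepted = {}  # component name -> set of accepted data types

    def register(self, component, accepted_types):
        """Register a component together with the data types it accepts."""
        self._accepted[component] = frozenset(accepted_types)

    def suggest(self, data_type):
        """Return the registered components able to process the given data type."""
        return sorted(
            name for name, types in self._accepted.items() if data_type in types
        )


registry = ComponentRegistry()
registry.register("map_viewer", {"genetic_map", "qtl"})
registry.register("sequence_viewer", {"dna_sequence"})
registry.register("similarity_search", {"dna_sequence", "protein_sequence"})

print(registry.suggest("dna_sequence"))
print(registry.suggest("genetic_map"))
```

Because components only declare what they accept, new tools can be added to such a registry without changing the tools already present, which is the "plug and play" property described above.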

Outreach
NCGR Outreach Program

Part of NCGR's founding mission was to enrich New Mexico by providing educational science and research opportunities. NCGR has established a multi-faceted outreach strategy that is focused on encouraging students and faculty in New Mexico to study science and math. We seek to establish relationships with the New Mexico science community through working partnerships. Together with regional programs and universities, we work to directly involve students in the research process through mentored training opportunities.

Core Infrastructure for New Mexico:

An example of our ongoing outreach efforts to regional students and faculty is the NIH-funded New Mexico IDeA Networks of Biomedical Research Excellence (NM-INBRE) program, of which NCGR is a participating institution. As part of this program, NCGR hosts an annual New Mexico Bioinformatics Symposium (NMBIS). Now in its third year, NMBIS brings together about 120 students and faculty from New Mexico, eastern Arizona and west Texas for research presentations, student poster presentations, and hands-on bioinformatic workshops covering such topics as Microarray Experiment Design and Microarray Data Handling, Bioinformatic Tools for Gene Discovery, and Comparative Genomics. For many students, this represents their only opportunity to meet and listen to nationally recognized scientists. One highlight of the meeting is the catered student and faculty poster sessions, at which students and faculty have an opportunity to chat informally with one another and with the invited scientists.

Summer Internships:

NCGR offers summer research internships (eight to ten weeks in duration, funded by NSF and NIH) to give New Mexico's undergraduates and faculty the opportunity to couple their classroom knowledge to a research experience. Our interns work on various research projects (for example: studying the relationship between peanut protein structure and allergenicity, using bioinformatic tools and resources to study the evolution of antibiotic resistance, and performing advanced computer simulations of protein dynamics). Interns are given the opportunity to see 'science at work', to see science as a viable career path, and to think about their continued education at the Master's or Doctoral level.

Educational Outreach:

NCGR scientists and staff travel to institutions within New Mexico to give hands-on workshops on topics such as genomics and protein structure visualization and manipulation. Furthermore, we participate in group-targeted outreach activities such as the Sandia National Laboratories Dream Catcher Science Program, a program intended for American Indian students interested in science, math and engineering.