Soon scientists and clinicians will use new DNA technologies to detect mutations driving cancer and other diseases, identify new strains of pathogens, track subtle changes in our immune repertoire, predict drug response, and make innumerable other contributions to our health. The scale and complexity of the data will vastly exceed anything the medical community has faced before. David Haussler's group at UC Santa Cruz is tackling this challenge by applying advanced engineering and new computer algorithms to revolutionize medicine through deeper, ubiquitous use of DNA information.
Haussler is an organizing member of the Global Alliance for Genomics and Health, a coalition of research, health care, and disease advocacy organizations that have taken the first steps to standardize and enable secure sharing of genomic and clinical data. He also co-leads the alliance's data working group.
UCSC Genome Browser
Haussler’s group assembled and posted the first working draft of the human genome on the Internet, resulting in what is now an essential tool for biomedical research, the UCSC Genome Browser. A decade after the human genome went on the web, his group produced the UCSC Cancer Genomics Browser, a new way to visualize and analyze data from studies aimed at improving cancer treatment by unraveling the complex genetic roots of the disease.
The UCSC Genome Browser serves as the platform for several large-scale genomics projects, including NHGRI’s ENCODE project, which uses omics methods to explore the function of every base in the human genome (and for which UCSC serves as the Data Coordination Center); NIH’s Mammalian Gene Collection; NHGRI’s 1000 Genomes Project, which explores human genetic variation; and the Genome 10K project, co-founded by Haussler to assemble a genomic zoo: a collection of DNA sequences representing the genomes of 10,000 vertebrate species. With Prof. Josh Stuart, Haussler's group is building the genomics infrastructure for the $3B California Institute for Regenerative Medicine.
The Haussler lab's informatics work on cancer genomics, including the UCSC Cancer Genomics Browser, provides a complete analysis pipeline from raw DNA reads through the detection and interpretation of mutations and altered gene expression in tumor samples. Haussler's group collaborates with researchers at medical centers nationally, including members of the Stand Up To Cancer “Dream Teams,” The Cancer Genome Atlas (TCGA), and the International Cancer Genome Consortium (ICGC), to discover molecular causes of cancer and pioneer a new personalized, genomics-based approach to cancer treatment.
The Haussler lab built the UCSC Cancer Genomics Hub (CGHub) to handle all the cancer genomics data generated by the large-scale genomics projects of the National Cancer Institute. It is a secure repository for storing, cataloging, and accessing cancer genome sequences, alignments, and mutation information from three sources: TCGA, a pioneering project involving more than 20 cancer types; the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) project, which focuses on the five most severe childhood cancers; and other related projects. The current planned capacity of this data center is five petabytes. We anticipate that CGHub will serve as a platform to aggregate other large-scale cancer genomics information, growing to provide the statistical power to attack the complexity of cancer.
Genome 10K project
The Genome 10K project was co-founded by Haussler, Steve O'Brien, and Oliver Ryder to assemble a genomic zoo, a collection of DNA sequences capturing the genomic diversity of 10,000 vertebrate species, as a resource for the life sciences and for worldwide conservation efforts.
Molecular and stem cell biology
Research by the Haussler bioinformatics group generates an increasing number of very specific hypotheses about the evolution and function of human genes. Through wet-lab experiments, we explore and validate predictions generated from computational genomic research. For instance, we use embryonic and induced pluripotent stem cells to investigate neurodevelopment from a functional and evolutionary perspective. Research project areas include genome evolution, comparative genomics, alternative splicing, and functional genomics. Current focus areas:
- Understanding very early neurodevelopment from an evolutionary perspective, including the role of NOTCH signaling and of non-protein-coding regions of the human genome in neurodevelopment and in other aspects of vertebrate development. The lab uses in vitro differentiation of human and primate embryonic stem cells, together with mouse models, to study these processes.
- Developing computational analysis and cancer cell-line models to mimic specific tumor types, especially brain tumors, using extensive genome-wide analysis of transcription, genomic alterations, and epigenetic modifications of patient tumor samples.
Early research interests
Haussler's current research stems from his early work in machine learning, statistical decision theory, pattern recognition, neural networks, algorithms, and complexity.