Spotlight on UCSC Stem Cell Scholar Martina Koeva

UCSC stem cell training program predoctoral scholar
Photo by Branwyn Wagman
Monday, April 9, 2007
Branwyn Wagman

The UCSC Training Program in Systems Biology of Stem Cells is sponsored by the California Institute for Regenerative Medicine (CIRM), established in early 2005 following the November 2004 passage of Proposition 71, the California Stem Cell Research and Cures Initiative. The first set of scholars chosen for the UCSC program started in July 2006. This story is one of a series of spotlights on the research conducted by these scholars.

When the opportunity to apply for the UCSC CIRM stem cell training program came around, graduate student Martina Koeva had already begun working with stem cell data. With a data-oriented background in applied mathematics and a place in the systems biology laboratory of biomolecular engineer Josh Stuart, Koeva had begun to think of ways to relate stem cell data across experiments, systems, and organisms.

Koeva wants to take advantage of the wealth of large-scale data that has become available in the past decade. “With the boom of large-scale experiments such as DNA microarray and protein–protein interactions, we shouldn’t have to start over again with each new experiment.”

To analyze a high-throughput experiment, systems biologists identify functional themes among a large number of genes. Since gene pathway information is scarce for most organisms, the task of looking for significant biological themes in large sets of genes can be daunting.

“My project will provide an automatic way of finding other data that correlates to any new experimental data,” Koeva said.

“So many microarray experiments have been done in the past, and these could help researchers understand the results of the new experiments. I am interested in developing a computational framework that would help use already-known experiments for annotation of a new experiment.”

Stuart elaborated on how this tool could link to experiments that lend greater insight into the genetic processes at work: “For example, her program could link a set of genes up-regulated in human cancer to a set of orthologous genes in mouse found to be specifically expressed in mouse bone tissue. Such a link might suggest that a bone morphogenesis module may be erroneously activated in the cancer cells, providing specific clues about how to target the cancer with therapeutics.”

A collaboration between Stuart’s UCSC lab and Irv Weissman’s lab at Stanford University provided the impetus for Koeva’s project.

In this collaborative study, the Weissman group uses DNA microarray analysis, among other tools, to study the development of blood cells. A DNA microarray is a collection of thousands of microscopic DNA segments, each corresponding to a single known gene. Microarray experiments show changes in the expression of one or many genes between two conditions.

The Weissman lab focuses on three types of hematopoietic stem cells—precursor cells that give rise to the different types of blood cells, such as red and white blood cells, T-cells, and platelets. These blood-forming stem cells reside in the bone marrow and the thymus. The Weissman laboratory has identified and isolated the blood-forming stem cells in mice and has defined the stages of development from stem cells to their mature progeny.

Hematopoietic stem cells are adult stem cells and have a limited ability to regenerate—they can only become blood cells. This is in contrast to embryonic stem cells, which can regenerate indefinitely and can become any type of cell found in the organism. Normal hematopoietic stem cells in bone marrow remain quiescent, not proliferating or differentiating into other cells until spurred on by some trigger.

The Stuart lab develops computational methods to analyze the results of DNA microarray experiments, often across multiple organisms, to identify mechanisms of gene activity. This includes large-scale data resulting from the Weissman lab’s microarray analyses of the genes involved in blood cell formation. The computational analysis shows which genes are either turned on or turned off during the different developmental stages. In so doing, they find patterns that show how genes regulate the process of stem cell differentiation and maintenance. For example, what keeps long-term stem cells from differentiating during their quiescent stage, what causes them to finally differentiate, and what triggers the transformation into cancer cells?

Once Koeva develops a tool for relating data sets to each other, she will apply it to analyze blood-forming stem cells in a number of developmental stages. Koeva said, “Blood cells are a good system to start with for this purpose, because blood is already well studied, especially in mice, but the gene regulation mechanisms for blood cell differentiation are still under study.”

She will compare gene activity in quiescent stem cells with that in cells in the process of differentiating, and she will examine gene activity in stem cells that have transformed into cancer cells, namely chronic myeloid leukemia cells.

More than one gene may combine into a genetic network to produce a particular response. Studying multiple organisms together can yield clues to how genetic networks evolve. Koeva’s tool will help to predict how genes function and to discover the regulation and modulation involved in producing orchestrated responses on the cellular level.

So far Koeva has developed a computational inference method that uses gene expression data to determine whether certain genes turn on or off together when a cell moves from one state to another. Using positive and negative controls, she can estimate the probability that a gene is on or off in a given experiment.
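The article does not describe how the on/off probability is computed. As a minimal sketch of one way controls could be used, the Python below fits a normal distribution to the positive controls and another to the negative controls, then applies Bayes' rule. The Gaussian model, the function names, the prior, and the expression values are all assumptions made for illustration, not Koeva's method.

```python
import math
from statistics import mean, stdev


def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std sigma at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def prob_on(value, on_controls, off_controls, prior_on=0.5):
    """Posterior probability that a gene is 'on' given its expression value.

    on_controls / off_controls hold expression values of genes assumed to be
    on (positive controls) or off (negative controls) in this experiment.
    """
    like_on = gaussian_pdf(value, mean(on_controls), stdev(on_controls))
    like_off = gaussian_pdf(value, mean(off_controls), stdev(off_controls))
    numerator = like_on * prior_on
    return numerator / (numerator + like_off * (1 - prior_on))


# Hypothetical log-scale expression values, invented for this example.
positive = [9.1, 8.7, 9.4, 8.9, 9.0]
negative = [3.2, 2.8, 3.5, 3.0, 3.1]

for v in (9.0, 6.0, 3.0):
    print(f"expression {v:.1f}: P(on) = {prob_on(v, positive, negative):.3f}")
```

A value close to the positive controls yields a probability near 1, and a value close to the negative controls yields a probability near 0. Values extremely far from both control groups would need log-space arithmetic to avoid numerical underflow.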

As the project progresses, she will set up her computational tool to access data from large, publicly available databases, such as the Gene Expression Omnibus from the National Center for Biotechnology Information, ArrayExpress from the European Bioinformatics Institute, and the Stanford Microarray Database.

She will organize this data into clusters, and then from the clusters, infer gene modules—groups of genes that are significantly co-regulated in a particular disease or cellular state.

Koeva can use algorithms that already exist for inferring gene modules. For example, the CODENSE algorithm developed at USC connects graphs of data from various sources into a combined graph and draws lines of relationship between genes that respond to a stimulus in concert—genes that are co-expressed. If the same relationship occurs in more than one such graph and across different experiments, then the relationship constitutes a gene module.
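The idea of keeping only relationships that recur across experiments can be illustrated with a much-simplified sketch. This is not the CODENSE algorithm itself: it merely builds one co-expression graph per data set by thresholding Pearson correlation, keeps edges seen in at least `min_support` graphs, and reports connected groups of genes as candidate modules. The gene names, thresholds, and expression values are hypothetical.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def coexpression_edges(expr, threshold=0.9):
    """Gene pairs whose profiles correlate at or above the threshold."""
    return {(a, b) for a, b in combinations(sorted(expr), 2)
            if pearson(expr[a], expr[b]) >= threshold}


def recurrent_modules(datasets, min_support=2, threshold=0.9):
    """Group genes linked by edges that recur in >= min_support data sets."""
    support = Counter()
    for expr in datasets:
        support.update(coexpression_edges(expr, threshold))
    neighbors = defaultdict(set)
    for (a, b), count in support.items():
        if count >= min_support:
            neighbors[a].add(b)
            neighbors[b].add(a)
    modules, seen = [], set()
    for start in sorted(neighbors):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first walk over the recurrent-edge graph
            gene = stack.pop()
            if gene not in component:
                component.add(gene)
                stack.extend(neighbors[gene] - component)
        seen |= component
        modules.append(sorted(component))
    return modules


# Two hypothetical experiments measuring the same four genes.
datasets = [
    {"geneA": [1, 2, 3, 4], "geneB": [2, 4, 6, 8.5],
     "geneC": [4, 3, 2, 1], "geneD": [8, 6, 4, 2]},
    {"geneA": [5, 6, 8, 9], "geneB": [1, 2, 4, 5],
     "geneC": [9, 7, 6, 2], "geneD": [3, 1, 4, 2]},
]
print(recurrent_modules(datasets))
```

In this toy input, geneC and geneD are co-expressed in only one experiment, so their link is dropped when two supporting data sets are required. Published methods additionally look for densely connected subgraphs rather than simple connected components.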

To facilitate data access, Koeva will attempt to fit the annotation that is available from current data sets and in publications into a controlled vocabulary—the Medical Subject Headings (MeSH) used by PubMed. This will afford a consistent way to retrieve information that uses different terminology for the same concepts.
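A minimal sketch of this kind of normalization is a lookup from free-text labels to a single preferred heading. The synonym table below is hand-written for illustration and is an assumption, not part of Koeva's tool; a real system would draw its terms and entry terms from the official MeSH release, and the headings shown should be checked against the current version.

```python
# Hypothetical synonym table mapping free-text labels to preferred headings.
SYNONYMS = {
    "hsc": "Hematopoietic Stem Cells",
    "hematopoietic stem cell": "Hematopoietic Stem Cells",
    "blood-forming stem cells": "Hematopoietic Stem Cells",
    "cml": "Leukemia, Myelogenous, Chronic, BCR-ABL Positive",
    "chronic myeloid leukemia": "Leukemia, Myelogenous, Chronic, BCR-ABL Positive",
    "bone marrow": "Bone Marrow",
}


def normalize(label):
    """Lower-case a label and collapse runs of whitespace."""
    return " ".join(label.lower().split())


def to_controlled_terms(annotations):
    """Map annotations to controlled headings.

    Returns (sorted list of matched headings, list of unmatched labels).
    """
    mapped, unmapped = set(), []
    for label in annotations:
        heading = SYNONYMS.get(normalize(label))
        if heading is None:
            unmapped.append(label)
        else:
            mapped.add(heading)
    return sorted(mapped), unmapped


mapped, unmapped = to_controlled_terms(
    ["HSC", "  Chronic  myeloid leukemia", "blood-forming stem cells", "spleen"]
)
print(mapped)
print(unmapped)
```

Two differently worded labels for blood-forming stem cells collapse to one heading, which is what makes retrieval consistent. Labels with no match are returned separately so they can be reviewed by hand.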

Another step in Koeva’s project is to build a map of module interactions that will link modules from different experiments if their member genes exhibit co-regulation under the same experimental conditions. Koeva hopes to build sets of gene modules across different organisms and all conditions and to correlate these modules with the annotation that is already available. How to correlate modules on this scale remains a puzzle to be solved.

Ultimately, when a researcher enters a gene module from a new experiment into the system Koeva constructs, the system will generate a list of gene modules from other experiments that appear to share a similar regulation or expression pattern. If those experiments studied, for example, chronic myeloid leukemia, the match would suggest a similar role for the module in question.
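The article does not say how similarity between modules would be scored. One common stand-in, used here purely as an assumption, is to rank stored modules by how surprising their gene overlap with the query is under a hypergeometric model. The module names, gene identifiers, and universe size below are hypothetical.

```python
from math import comb


def overlap_pvalue(query, module, universe_size):
    """P(overlap >= observed) if the query genes were drawn at random.

    query and module are sets of gene identifiers taken from a shared
    universe of universe_size genes (hypergeometric upper tail).
    """
    k = len(query & module)
    n, K, N = len(query), len(module), universe_size
    tail = sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(n, K) + 1))
    return tail / comb(N, n)


def rank_modules(query, library, universe_size):
    """Return (p-value, module name) pairs, most significant first."""
    return sorted((overlap_pvalue(query, genes, universe_size), name)
                  for name, genes in library.items())


# Hypothetical library of previously annotated modules.
library = {
    "cml_module": {"g1", "g2", "g3", "g4", "g9"},
    "bone_module": {"g5", "g20", "g21"},
    "unrelated_module": {"g30", "g31"},
}
query = {"g1", "g2", "g3", "g4", "g5"}

ranked = rank_modules(query, library, universe_size=1000)
for pvalue, name in ranked:
    print(f"{name}: p = {pvalue:.3g}")
```

The module sharing four of five query genes ranks first, and a module with no shared genes receives a p-value of 1. Comparing modules across organisms would first require mapping genes to their orthologs, a step this sketch leaves out.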

The method can also be used to predict whether a particular gene or set of genes is likely to be on or off at a point in time. “You can speculate that a particular segment of DNA will shut off after a cell has differentiated to a particular cell type. With that information, you have more information about the gene’s role.”

Stuart added that connecting the pattern of gene activity across organisms can shed important light on the evolutionary history of a gene module. “If a set of genes is found to be co-regulated in humans as well as rodents, we may be looking at a mammalian-specific subprocess; on the other hand, if a set of genes is co-regulated in humans as well as single-cellular organisms, we know we are looking at a very ancient subprocess. The ‘age’ of the module may provide insights into a module’s function inside the cell.”

While Koeva’s first work on the tool will involve stem cell data, such a tool can be used for a variety of experiments. “This project is particularly relevant to stem cell research, though,” she said, “because the genetic networks involved in stem cell regulation are not as well understood as in other parts of biology. Since stem cell research is so new, investigators don’t always know what they are seeing; this tool will give access to more information about experimental results.”

Participation in the stem cell training program (classes, journal club, and the Bay Area stem cell club) has given Koeva a broader glimpse into the possibilities for stem cell research: “I took the stem cell biology class last fall, and it was a great opportunity, because we had a lot of outside speakers. It was interesting to get exposed to different types of stem cell research that are out there.”



