Dissection of transcriptional regulation networks and prediction of gene functions in Saccharomyces cerevisiae
| Authors | |
|---|---|
| Supervisors | |
| Cosupervisors | |
| Award date | 18-01-2008 |
| ISBN | |
| Number of pages | 136 |
| Organisations | |
| Abstract |
Molecular biology aims to unravel the functions of cells by studying cellular processes at the molecular level. A model organism that is well established in molecular biology is baker's yeast (Saccharomyces cerevisiae). Baker's yeast cells are remarkably similar to human cells, but they are much easier to grow and their DNA is straightforward to manipulate. In 1996, the complete DNA sequence of the yeast genome was deciphered, revealing that the whole genome contains 12 million base pairs and that the estimated number of genes is around 6000. In comparison, the recently sequenced human genome contains 3 billion base pairs, and its number of genes is estimated to be between 20,000 and 25,000.
To translate genes into functional proteins, the DNA of a gene is first copied to messenger RNA (mRNA) in a process called transcription, and the mRNA is subsequently translated into protein. One of the important questions in molecular biology is how transcription is organized. It is already known that a very important role in this process is played by transcription factors, which are proteins that bind small, specific stretches of DNA (called motifs) to enable transcription of genes. The scope of this thesis is the regulation of transcription: which genes are transcribed to mRNA, when, and which transcription factors are involved? An important new technique that has revolutionized the study of transcription regulation is microarray analysis. With this technique it is possible to measure the transcription of all genes of a certain cell type in a single experiment. Microarray analysis generates large amounts of data that are processed using informatics and analyzed with statistical methods. As a result, a new area of biology has emerged, named bioinformatics. This thesis describes the development of several bioinformatic methods that help analyze and interpret microarray data. Although the technique has improved dramatically, there are still some problems associated with microarray data that make them difficult to analyze. First, the data are noisy, and since the technique is expensive, it is not possible to repeat experiments many times to reduce the noise. Secondly, microarray experiments are difficult to reproduce; results from identical experiments performed on different array platforms are often not the same. Finally, methods that allow a biological interpretation of microarray data are lacking. At the beginning of this study, the method of choice was cluster analysis, for which data from multiple experiments were needed. In chapter two of this thesis we present T-profiler, a microarray analysis method that we developed to address these problems. 
The idea of T-profiler is not to focus on the transcription of individual genes but instead to look at groups of genes with a common feature. These might be genes that are bound by the same transcription factor, or groups of genes with a similar biological function. A major advantage of measuring the transcription of groups of genes is that the influence of noise is reduced. In addition, the common feature of a gene group provides information about the effect of the experimental condition, for example, which transcription factor or which functional group is active. T-profiler is available through a web application (www.tprofiler.org). In chapter three we use the microarray technique to measure the transcriptional response of yeast to compounds that cause cell wall stress. Analysis of the data revealed that, besides a general stress response, a specific response is triggered. This specific response is regulated by the transcription factor Rlm1, which is known to regulate mainly cell wall-related genes. In addition, we used our analysis method to compare these data with publicly available microarray data from two mutants that constitutively activate the cell wall stress response. Not unexpectedly, the analysis profiles of these datasets were highly comparable. Surprisingly, we found activation of the transcription factor Sko1p, which is known to be involved in the response to osmotic stress. In chapter four we take the comparison of publicly available microarray data a step further by comparing about 1000 different microarray experiments. First we used T-profiler to calculate the activity of the different transcription factors in these studies, and then we used this information for correlation analysis. The final results provide several new insights into the basic process of transcription in baker's yeast. For example, we show a strong negative correlation between the so-called PAC and rRPE motifs and transcription factors of the general stress response. 
So far, no transcription factors have been assigned to the PAC and rRPE motifs, and we hypothesize that they are part of a special class of motifs, the so-called core-promoter elements. Furthermore, we used our correlation matrix to build a network of transcription factor activities. We used this network to predict new functions for some transcription factors. The focus in chapter four is on the activity of transcription factors. We also performed T-profiler analysis on the same microarray dataset using gene groups based on similar biological functions. This information is used in chapter five to make predictions about the biological function of uncharacterized genes. The method has been validated by testing the reliability of predictions on well-characterized genes. A special website has been developed (www.science.uva.nl/~boorsma/funkey) that can be used to generate functional predictions. The examples from chapters four and five demonstrate the power of new bioinformatic techniques such as T-profiler; it is now possible to compare different datasets and to make functional predictions that were not possible by studying only individual genes and proteins. The techniques that were developed and described in this thesis are now also being used for mouse, rat and human microarray data. |
| Document type | PhD thesis |
| Note | Research conducted at: Universiteit van Amsterdam |
| Language | English |
| Downloads | |
| Permalink to this page | |