Modelling of interactions in microRNA and gene expression and validation of predicted targets


Project assigned to: SMRITI SHRIDHAR


Understanding the regulation of gene activity is key to learning about or controlling biological processes. MicroRNAs (miRNAs) are a recently discovered class of powerful regulatory agents. In this PhD project, the student will study gene expression patterns and examine the role of miRNAs in the regulation of gene activity. We will compare established analysis methods with novel approaches that exploit the latest knowledge about miRNA structure. Because structure is key to miRNA function, this will improve the identification of potential targets and allow better predictions of the effects of miRNAs on gene expression. Experiments studying the performance of different CHO cell lines as producer strains will provide a test case. With our collaborators, we can verify predicted interactions by controlled perturbations in the laboratory.

The control of gene expression is complex and takes place at many levels, from the chromatin structure of DNA to protein modifications of regulators. At the transcript level, oligonucleotide microarrays already allow genome-wide quantitative snapshots of mRNA expression. The availability of microarrays with LNA-modified oligonucleotide probes now permits an extension of this established profiling technology to miRNAs, a recently discovered group of prolific regulator molecules. Deep sequencing can systematically identify the expressed miRNA sequences, which are required for microarray probe design. Together, accurate genome-wide expression levels of mRNAs and miRNAs can thus be obtained efficiently. The next step is to further develop computational approaches for their joint quantitative analysis. Perturbation experiments allow predicted regulatory interactions to be tested, and moreover support the further improvement of miRNA target prediction algorithms.

Aims and methods.

Initial analyses of miRNA / mRNA interactions were typically based on ad hoc approaches such as correlation thresholding. These do not, however, exploit state-of-the-art algorithms. Subtle patterns in gene expression data can be detected with a variety of modern techniques, including probabilistic factor analysis (e.g., Kreil & MacKay, 2003). Powerful tools for the prediction of miRNA targets exploit thermodynamic RNA hybridization models (Hofacker, 2007). Rather than identify miRNA targets in a separate preprocessing step for subsequent gene-set analysis, we note that recent developments have highlighted the value of a joint analysis of gene expression data and nucleotide sequences (‘Allegro’, Halperin et al., 2009). Instead of relying on de novo motif discovery like Allegro, however, we will in the first phase of this project develop an algorithm for the joint analysis of gene expression data and nucleotide sequences that exploits the predictive power of thermodynamic models for miRNA target identification. This is feasible because both differential gene expression analyses and thermodynamic predictions yield results in the form of probabilities that can naturally be combined in a Bayesian framework (cf. Sykacek et al., 2007). This algorithm will then be compared to established deterministic and probabilistic approaches (such as bi-clustering or Allegro).
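As an illustration of how two probabilistic lines of evidence might be combined, the following minimal sketch multiplies odds for a single miRNA / mRNA pair. It is not the algorithm to be developed in this project. It assumes that the sequence-based target probability (`p_seq`) and the expression-based probability (`p_expr`) are already available, that `p_expr` was computed under a flat prior of 0.5, and that the two sources are conditionally independent given the true target status. The function name and both parameter names are hypothetical.

```python
import math


def combine_evidence(p_seq: float, p_expr: float) -> float:
    """Posterior probability that a miRNA targets an mRNA.

    Assumptions (illustrative only, not from the project description):
      - p_seq is a prior target probability derived from a thermodynamic
        hybridization model;
      - p_expr is an expression-based probability computed under a flat
        prior of 0.5, so its odds act as a Bayes factor;
      - both sources are conditionally independent given target status.
    """
    eps = 1e-12  # keep probabilities away from 0 and 1 to avoid log(0)
    p_seq = min(max(p_seq, eps), 1.0 - eps)
    p_expr = min(max(p_expr, eps), 1.0 - eps)

    # Posterior log-odds = prior log-odds + log Bayes factor.
    log_odds = math.log(p_seq / (1.0 - p_seq)) + math.log(p_expr / (1.0 - p_expr))

    # Convert log-odds back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    # Agreeing evidence reinforces; conflicting evidence cancels out.
    for p_seq, p_expr in [(0.8, 0.8), (0.5, 0.8), (0.2, 0.8)]:
        print(p_seq, p_expr, round(combine_evidence(p_seq, p_expr), 3))
```

Under these assumptions, an uninformative sequence prior of 0.5 leaves the expression-based probability unchanged, whereas a hard correlation threshold would discard the graded sequence evidence altogether.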

The joint analysis of miRNA and gene expression data under alternative physiological and stress conditions will be applied to the study of CHO strain production performance. With the CHO genome sequence available, model-based microarray design allows highly specific mRNA profiling (Leparc et al., 2009). The planned experiments are well suited for validating the improved algorithms: besides profiling mammalian cells in highly controlled environments, they allow interactions predicted by our analysis to be tested by targeted perturbations. The validated algorithms will, moreover, be valuable beyond this project in improving our understanding of the regulation of gene expression in general, and of miRNA / mRNA interactions in particular.

Halperin, Y., Linhart, C., Ulitsky, I., Shamir, R. (2009) Allegro: analyzing expression and sequence in concert to discover regulatory programs. Nucleic Acids Res., 37, 1566–1579.
Hofacker, I. L. (2007) How microRNAs choose their targets. Nature Genetics, 39, 1191–1192.
Hsu, P. W., Huang, H. D., Hsu, S. D., Lin, L. Z., Tsou, A. P., Tseng, C. P., Stadler, P. F., Washietl, S., Hofacker, I. L. (2006) miRNAMap: genomic maps of microRNA genes and their target genes in mammalian genomes. Nucleic Acids Res., 34, D135–139.
Kreil, D. P., MacKay, D. J. C. (2003) Reproducibility assessment of independent component analysis of expression ratios from DNA microarrays. Comparative and Functional Genomics, 4, 300–317.
Leparc, G. G., Tüchler, T., Striedner, G., Bayer, K., Sykacek, P., Hofacker, I., Kreil, D. P. (2009) Model based probe set optimization for high-performance microarrays. Nucleic Acids Res., 37, e18.
Sykacek, P., Clarkson, R., Print, C., Furlong, R., Micklem, G. (2007) Bayesian modelling of shared gene function. Bioinformatics, 23, 1936–1944.