Our view of organismal evolution is intimately connected to our understanding of how genomes and the information they encode change over time, and how this translates into the phenotypic and functional characteristics of contemporary species. The sequencing of entire genomes and transcriptomes from species covering all major groups in the tree of life has lifted the data basis for evolutionary research with a functional perspective to an unprecedented level. In combination, these data give access to the full repertoire of information stored in a species’ genome and allow unraveling the individual cellular programs that translate genetic information into a diverse set of functions. However, the effort required for the experimental functional characterization of even comparatively few proteins in the lab is still enormous. For this reason, exhaustive functional studies are limited to a few well-established model organisms, many of which are of economic or medical relevance. More often, only individual pathways are studied in niche model organisms featuring a particular trait of interest. For the vast majority of species, however, only a draft genome assembly or transcript data are available, without further experimental support. In these instances, the in silico prediction of genes, followed by a tentative transfer of functional annotation from corresponding sequences in experimentally characterized model organisms, provides the only source of functional information. Integrating all available information into a comprehensive picture of organismal and functional evolution is the common denominator of the individual projects in our group.
More specifically, we concentrate on the following main topics:
1) Deep phylogenies and phylogenetic profiling
We use phylogenomics approaches considering hundreds of genes across a similar number of species to reconstruct comprehensive phylogenies up to the kingdom level. We attempt to assess the credibility of the reconstructed trees by using – whenever possible – multiple and non-overlapping data sets to support individual splits. The resulting trees provide the scaffold for subsequently mapping information about the presence and absence of genes in large numbers of species considering both sequence homology and functional domain architecture. With the help of these phylogenetic profiles we can start tracing entire protein interaction networks together with the associated function across species and provide insights into their evolutionary history.
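To make the notion of a phylogenetic profile concrete, the following minimal sketch encodes the presence and absence of genes across a species set as binary vectors and compares two profiles. The species and gene names are hypothetical placeholders, and the Hamming distance is only one of several possible comparison measures; it is not meant to describe the group's actual pipeline.

```python
from typing import Dict, List, Set


def build_profiles(orthologs: Dict[str, Set[str]], species: List[str]) -> Dict[str, List[int]]:
    """Return one binary presence/absence vector per gene, ordered like `species`."""
    return {
        gene: [1 if sp in present else 0 for sp in species]
        for gene, present in orthologs.items()
    }


def hamming_distance(a: List[int], b: List[int]) -> int:
    """Count the species in which two profiles disagree."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical toy input: for each gene, the species with a detected ortholog.
species = ["species_1", "species_2", "species_3", "species_4"]
orthologs = {
    "gene_A": {"species_1", "species_2", "species_4"},
    "gene_B": {"species_1", "species_2"},
}

profiles = build_profiles(orthologs, species)
distance = hamming_distance(profiles["gene_A"], profiles["gene_B"])
```

Genes whose profiles differ in only a few species are candidates for belonging to the same functional module, which is the intuition behind tracing interaction networks with such profiles.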
2) Functional annotation transfer
The transfer of functional annotations between biological sequences is a multi-layered procedure. Its most basic step is typically the identification, in non-model organisms, of orthologs to functionally annotated proteins from model organisms. Unfortunately, evolutionary relationships between proteins alone are only a poor proxy for functional equivalence. To ameliorate this problem, we aim to include additional evidence and thus achieve a more reliable annotation transfer, thereby minimizing the need for human curation. We are currently integrating an automated scoring of functional domain architecture similarities with the search for homologs. Moreover, we take into account the phylogenetic profiles of the respective proteins together with those of proteins interacting in the same functional pathway.
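As an illustration of what scoring domain architecture similarity can look like, the sketch below compares two proteins by the multiset of domains they carry. This is a deliberately simple stand-in (a multiset Jaccard index) and not the scoring scheme used in our tools; the domain labels are placeholders.

```python
from collections import Counter
from typing import List


def architecture_similarity(arch_a: List[str], arch_b: List[str]) -> float:
    """Multiset Jaccard index between two lists of domain labels.

    Repeated domains count separately; domain order is ignored.
    Two proteins without any annotated domain score 1.0 by convention.
    """
    count_a, count_b = Counter(arch_a), Counter(arch_b)
    shared = sum((count_a & count_b).values())  # per-domain minimum counts
    total = sum((count_a | count_b).values())   # per-domain maximum counts
    return shared / total if total else 1.0


# Placeholder domain labels for a query protein and a candidate ortholog.
query = ["domain_X", "domain_Y"]
candidate = ["domain_X"]
score = architecture_similarity(query, candidate)
```

A low score for a pair of predicted orthologs would flag the pair for closer inspection before any annotation is transferred.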
3) Phylostratigraphy and evolution of gene interaction networks
Phylogenetic profiles of proteins sharing the same function allow us to reconstruct when in evolutionary history individual gene interaction networks emerged, and help assess their fate in individual phylogenetic lineages. This provides valuable insights into the direction of organismal evolution. Partial or complete losses of evolutionarily old pathways indicate reductive evolution, often associated with the change of an ancestral phenotype. For example, Microsporidia, obligate intracellular parasites closely related to fungi, lack more than half of the otherwise highly conserved eukaryotic ribosome biogenesis factors. Whether this reduction coincides with a mode of ribosome biogenesis that is unique among eukaryotes and facilitated by their endoparasitic lifestyle, or whether they recruit host proteins to rescue the conventional pathway, remains unclear. Evolutionarily young functional modules confined to a few closely related organisms living under similar environmental conditions represent the other extreme. They can pinpoint recent innovations facilitating the adaptation of species to their particular ecological niches. Part of this work, with a particular focus on the evolution of calcium and stress signaling in plants, is funded by the FP7-PEOPLE-2013-ITN CALIPSO (http://itn-calipso.univie.ac.at/).
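The dating step behind phylostratigraphy can be sketched as follows, under the simplifying assumption that a gene arose once and is at least as old as the most distant species, relative to a focal species, in which a homolog is detected. The strata, species assignments, and gene names below are illustrative only.

```python
from typing import Dict, List, Set

# Strata ordered from youngest to oldest, seen from a focal plant species.
STRATA: List[str] = ["Brassicaceae", "land plants", "eukaryotes"]

# Illustrative assignment: the stratum each species shares with the focal species.
SPECIES_STRATUM: Dict[str, int] = {
    "A. lyrata": 0,       # shares the Brassicaceae ancestor
    "P. patens": 1,       # shares the land plant ancestor
    "S. cerevisiae": 2,   # shares only the eukaryotic ancestor
}


def assign_stratum(detected_in: Set[str]) -> str:
    """Return the oldest stratum in which a homolog of the gene is detected."""
    oldest = max(SPECIES_STRATUM[sp] for sp in detected_in)
    return STRATA[oldest]


def module_age(gene_hits: Dict[str, Set[str]]) -> Dict[str, str]:
    """Assign a stratum to every gene of a functional module."""
    return {gene: assign_stratum(hits) for gene, hits in gene_hits.items()}


# Hypothetical module with one old and one young component.
ages = module_age({
    "gene_old": {"A. lyrata", "P. patens", "S. cerevisiae"},
    "gene_young": {"A. lyrata"},
})
```

A module whose genes all map to the youngest stratum would be a candidate for a recent, lineage-specific innovation, whereas a mixed result points to the stepwise assembly of the network.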
4) Source of genetic and functional innovation
The evolution of genomes and their functions cannot be exhaustively assessed at the level of individual species. Rather, the interplay between members of multi-species communities has to be taken into account. Of particular interest to us is the question of how individual species accomplish the genetic innovation that facilitates adaptation to sometimes extreme environments. In a DFG-funded project to trace the evolution of pathogenicity in Acinetobacter baumannii (2251/1), we are investigating the relevance of recruiting pre-existing functional modules from other species via lateral acquisition of the corresponding genes. More specifically, we are interested in the relevance of natural competence for bacterial evolution, that is, the capability of bacteria to directly take up and utilize environmental DNA.
In a second project we want to shed light on the consequences of (obligate) symbiosis for the molecular evolution of the partner organisms involved. Currently, we are sequencing and assembling the metagenome of the lichen Lasallia pustulata, which can colonize harsh environments such as bare rock and withstands repeated periods of hyper- and dehydration. However, when isolated, the photobiont and the mycobiont grow either poorly or not at all, even when cultivated under optimal conditions. The aim of this project is to identify the genetic changes underlying this mutual functional complementation and dependency. The lichen project is carried out in cooperation with the group of Imke Schmidt at the BIK-F.
5) Development of software and workflows for biological sequence analysis
Complementary to our evolutionary research activities, we are developing, improving and benchmarking software and workflows for biological sequence analysis in a functional and evolutionary context. The main ongoing projects include (i) the targeted search for orthologs in large species sets (HaMStR; sourceforge.net/projects/hamstr), (ii) the use of feature architectures for similarity-based searches independent of amino acid sequence similarity (FACT http://www.biomedcentral.com/1471-2105/11/417), and (iii) the integration of the two concepts to develop a tool for function-aware phylogenetic profiling of individual proteins. In addition, we are currently investigating how to interpret phylogenetic profiles in an imperfect world where non-detection of a protein cannot be equated with its absence in a given species. To this end, we are building a simulation framework around the tool REvolver (www.ncbi.nlm.nih.gov/pubmed/22383532) to delineate the evolutionary distances beyond which homologous sequences are likely to have diverged to an extent that they no longer display significant sequence similarity. If, for two species and the corresponding proteins, this distance is not exceeded, non-detection can indeed be equated with absence.
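The decision rule described above can be sketched in a few lines. The example assumes that simulations have already produced, for one protein, a set of detection outcomes at several evolutionary distances; it does not call REvolver, and the data layout, the 90% detection cutoff, and all numbers are hypothetical.

```python
from typing import Dict, List


def max_detectable_distance(sim_results: Dict[float, List[bool]], min_rate: float = 0.9) -> float:
    """Largest simulated distance up to which homologs are still reliably detected.

    `sim_results` maps an evolutionary distance to the detection outcomes
    (True = homolog found) of the simulated sequences at that distance.
    """
    limit = 0.0
    for dist in sorted(sim_results):
        outcomes = sim_results[dist]
        if sum(outcomes) / len(outcomes) < min_rate:
            break  # detection becomes unreliable from here on
        limit = dist
    return limit


def interpret_non_detection(species_distance: float, limit: float) -> str:
    """Classify a missing hit as genuine absence or as uninformative."""
    return "absent" if species_distance <= limit else "undetermined"


# Hypothetical simulation outcomes for a single protein.
simulated = {
    0.5: [True] * 20,
    1.0: [True] * 19 + [False],
    2.0: [True] * 10 + [False] * 10,
}
limit = max_detectable_distance(simulated)
close_species = interpret_non_detection(0.8, limit)
distant_species = interpret_non_detection(1.5, limit)
```

In a phylogenetic profile, only the entries classified as absent would then be treated as evidence for gene loss, while the undetermined ones would be left open.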
Department for Applied Bioinformatics
Institute for Cell Biology and Neuroscience
Prof. Dr. Ingo Ebersberger
Max-von-Laue Str. 13
Phone +49 69 798-42112
Biologicum; Room 3.205
Phone +49 69 798-42110