The rapid accumulation of genomic data, combined with the development of bioinformatics techniques, makes it possible to study gene and protein evolution at an unprecedented scale. During evolution, genes and proteins undergo changes in sequence and copy number. Patterns of genetic variability contain information about several aspects of the evolution of genes and organisms, such as the relationships between organisms or the selective pressures that have acted on a given gene.

Genes in a genome differ widely in their patterns and modes of evolution. Some proteins are highly conserved during evolution, whereas others are much freer to evolve; some are more likely than others to undergo adaptive evolution; some genes tend to duplicate, while others remain single-copy over long evolutionary periods. One of the research interests of the lab is understanding which factors explain this variability in genes’ patterns of evolution.


Slow-evolving protein (A) vs. fast-evolving protein (B).


Genes and proteins rarely act in isolation. Instead, they tend to function as parts of complex pathways and networks of interacting molecules, such as metabolic, signal transduction, gene regulatory and protein-protein interaction networks. As a result, genes do not evolve in isolation either: their evolution is affected by their interactions with other genes, and by their position within the network. For example, central genes within a network (those involved in many interactions) tend to be more selectively constrained than peripheral ones, and interacting genes tend to have relatively similar patterns of evolution.
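
One simple way to quantify a gene's position in a network is its degree, i.e. the number of distinct interaction partners it has. The following Python sketch illustrates the idea; the function name, the gene names and the toy edge list are hypothetical and are not taken from any real data set or from the lab's own analyses.

```python
from collections import defaultdict

def degree_centrality(edges):
    """Return {gene: number of distinct interaction partners}
    for an undirected network given as (gene, gene) pairs."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {gene: len(partners) for gene, partners in neighbors.items()}

# Hypothetical toy network: geneA is central, the others are more peripheral.
toy_edges = [
    ("geneA", "geneB"),
    ("geneA", "geneC"),
    ("geneA", "geneD"),
    ("geneD", "geneE"),
]
degrees = degree_centrality(toy_edges)
for gene in sorted(degrees, key=degrees.get, reverse=True):
    print(gene, degrees[gene])
```

In a real analysis, a centrality measure of this kind could then be compared with a measure of selective constraint for each gene. Degree is only one of several possible centrality measures.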

We are interested in how gene and protein networks evolve as systems. Network evolution is conditioned by network function, and vice versa. Therefore, understanding network evolution has implications for fields ranging from biotechnology to biomedicine.


Human protein-protein interaction network.


Organisms live at different temperatures, and have different body temperatures. We are interested in the impact of body temperature on genome evolution.



Organisms can be classified into three domains of life (Archaebacteria, Eubacteria and Eukaryotes). Little is known, however, about how these three domains are related to each other. Endosymbiotic theory suggests that Eukaryotes arose ~2 billion years ago from a fusion of an archaebacterium and a eubacterium. When studying organisms as distantly related as the three domains of life, one problem with standard phylogenetic analyses is that phylogenetic signals may have been partially eroded by the passage of time. Instead of phylogenetic trees, we used gene similarity networks (with nodes representing genes and edges representing BLAST hits) to study the origin of Eukaryotes. Such networks allowed us to use information that was not amenable to phylogenetic analyses.
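
As an illustration of the general idea, a gene similarity network can be sketched in a few lines of Python. This is a minimal sketch and not the pipeline used in the study: it assumes BLAST results in the standard 12-column tabular format (query ID in column 1, subject ID in column 2, E-value in column 11), and the E-value cutoff, the function names and the example hits are hypothetical.

```python
from collections import defaultdict

def build_similarity_network(blast_lines, max_evalue=1e-5):
    """Build an undirected network: nodes are genes, edges are BLAST hits.
    Assumes tab-separated lines in the 12-column tabular BLAST format.
    The E-value cutoff is an arbitrary, illustrative choice."""
    network = defaultdict(set)
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if query == subject or evalue > max_evalue:
            continue  # skip self-hits and weak hits
        network[query].add(subject)
        network[subject].add(query)
    return dict(network)

def connected_components(network):
    """Group genes into connected components (candidate gene families)."""
    seen, components = set(), []
    for start in network:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(network[node] - seen)
        components.append(component)
    return components

# Hypothetical hits: the third one is too weak and is filtered out.
toy_hits = [
    "arch1\teuk1\t45.0\t200\t110\t0\t1\t200\t1\t200\t1e-30\t120",
    "euk1\tbact1\t38.0\t180\t111\t1\t1\t180\t5\t184\t1e-12\t70",
    "euk2\tbact2\t30.0\t90\t63\t0\t1\t90\t1\t90\t0.5\t25",
]
network = build_similarity_network(toy_hits)
components = connected_components(network)
print(components)
```

In this toy example, the eukaryotic gene euk1 is linked to both an archaebacterial and a eubacterial gene, so all three fall into a single component even though arch1 and bact1 share no direct hit.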

Network analyses also show that current genomes and proteomes still exhibit signatures of the origin of Eukaryotes. Eukaryotic genes inherited from the archaebacterial ancestor tend to differ from those inherited from the eubacterial ancestor in terms of gene expression and position in the protein-protein interaction network.


BLAST hits among members of a gene family.