The first atlas of virus-human protein interactions

Viruses are obligate intracellular pathogens that rely on the molecular machinery of the host to complete their life cycle. To do so, viruses interact with host proteins, rewiring the molecular network of the host for their own benefit. Understanding how viruses interact with the host protein network is therefore key to uncovering the molecular principles behind viral infection.

Despite the relevance of these interactions, our knowledge of them is exceedingly sparse, and very few viruses have been studied extensively by experiment, partly because such interactions are difficult to study in an experimental setting.

With this in mind, I have tuned PrePPI, a well-known algorithm for predicting protein-protein interactions (PPIs) developed at Barry Honig’s lab. The resulting method, PHIPSTER (Pathogen Host Interactome Prediction using STructurE similaRity), evaluates potential direct PPIs between a viral and a host protein by considering domain-domain contacts and peptide-domain contacts.

We applied our predictive algorithm to a viral dataset comprising 1,001 human-infecting viruses (12,237 viral proteins) against the human proteome (~20,000 human proteins), generating ~282,000 high-confidence viral-host PPI predictions.
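
To put these numbers in perspective, the short calculation below estimates the size of the search space. It assumes that every viral protein is scored against every human protein (the text does not say whether any pairs were filtered out beforehand) and takes the approximate figures quoted above at face value.

```python
# Figures quoted in the text; the "~" values are approximate.
VIRUSES = 1_001
VIRAL_PROTEINS = 12_237
HUMAN_PROTEINS = 20_000          # "~20,000" in the text
HIGH_CONFIDENCE_PPIS = 282_000   # "~282,000" in the text

# Assumption: every viral protein is paired with every human protein.
candidate_pairs = VIRAL_PROTEINS * HUMAN_PROTEINS

# Share of candidate pairs that end up as high-confidence predictions.
fraction_high_confidence = HIGH_CONFIDENCE_PPIS / candidate_pairs

# Average number of high-confidence predictions per viral protein and per virus.
mean_ppis_per_viral_protein = HIGH_CONFIDENCE_PPIS / VIRAL_PROTEINS
mean_ppis_per_virus = HIGH_CONFIDENCE_PPIS / VIRUSES

print(f"candidate pairs:            {candidate_pairs:,}")
print(f"high-confidence fraction:   {fraction_high_confidence:.2%}")
print(f"mean PPIs per viral protein: {mean_ppis_per_viral_protein:.1f}")
print(f"mean PPIs per virus:         {mean_ppis_per_virus:.1f}")
```

Under these assumptions, roughly 245 million candidate pairs are considered and only about 0.1% of them are kept as high-confidence predictions, which works out to about 23 predicted human partners per viral protein on average.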

Overall, our high-confidence predictions perform well and overlap with the results of high-throughput experimental methods. Our approach can have multiple applications, such as understanding the role of each viral protein, revealing the host biological pathways underlying human infection, and discovering functional relationships between viruses.

Our predicted Zika virus (ZIKV) protein interactome involves human proteins whose biological roles are closely related to the phenotype of the Zika virus infection observed in the latest outbreak in the Americas. Moreover, we identified a particular druggable human protein that acts as a rheostat of viral infection and might lead to a new strategy to treat Zika virus infection.

We also identified particular viral-host PPIs that can classify human papillomaviruses (HPVs) by their known oncogenic potential. These viruses are the leading cause of cervical cancer in women, but not all HPVs have the same ability to induce cancer. Our predictions discriminate between high-risk and low-risk HPVs, enabling the classification of other HPVs whose oncogenic potential is not yet known and, hopefully, shedding some additional light (HPVs have already been widely studied) on the molecular processes underpinning cervical cancer.
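
The idea of using predicted interactions as markers of oncogenic potential can be illustrated with a toy sketch. It is not the procedure used in this work: the rule (keep host proteins predicted for every high-risk type and for no low-risk type), the function names and all type and protein labels are hypothetical placeholders chosen only for illustration.

```python
from typing import Dict, Set


def discriminating_partners(
    high_risk: Dict[str, Set[str]], low_risk: Dict[str, Set[str]]
) -> Set[str]:
    """Host proteins predicted for every high-risk type and for no low-risk type.

    Both dictionaries map a virus type to its set of predicted host partners
    and must be non-empty.
    """
    shared_by_high = set.intersection(*high_risk.values())
    seen_in_low = set.union(*low_risk.values())
    return shared_by_high - seen_in_low


def classify(partners: Set[str], markers: Set[str]) -> str:
    """Label a type 'high risk' only if it carries all marker interactions."""
    return "high risk" if markers and markers <= partners else "low risk"


# Hypothetical placeholder data, not real predictions.
high_risk = {
    "type_H1": {"HOST_A", "HOST_B", "HOST_C"},
    "type_H2": {"HOST_A", "HOST_B", "HOST_D"},
}
low_risk = {
    "type_L1": {"HOST_B", "HOST_E"},
    "type_L2": {"HOST_B", "HOST_D"},
}

markers = discriminating_partners(high_risk, low_risk)
print(markers)                                  # {'HOST_A'}
print(classify({"HOST_A", "HOST_E"}, markers))  # high risk
print(classify({"HOST_B"}, markers))            # low risk
```

A strict all-or-none rule like this one is brittle with noisy predictions; in practice a statistical classifier trained on the interaction profiles would be a more robust choice.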

Now that every virus is described by its (predicted) set of viral-host PPIs and the host biological functions important during infection, there is some cool stuff we can do, such as identifying functional relationships between viruses independently of their evolutionary origin. Throughout evolution, viruses diversify, bypassing the host’s antiviral mechanisms and exploring new infection routes that can give them an adaptive advantage over the host (the so-called arms race). It is therefore possible that unrelated viruses have converged on certain routes, which could be interesting targets for drug therapies that treat multiple infectious diseases at once.
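
One simple way to look for such functional relationships is to compare the sets of host proteins targeted by each virus. The sketch below uses the Jaccard index for this; that choice of measure, the function names and the toy data are my own assumptions for illustration and are not taken from the published analysis.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_virus_pairs(partners: Dict[str, Set[str]]) -> List[Tuple[float, str, str]]:
    """Score every pair of viruses by the overlap of their host partners,
    most similar pair first."""
    scores = [
        (jaccard(partners[u], partners[v]), u, v)
        for u, v in combinations(sorted(partners), 2)
    ]
    return sorted(scores, reverse=True)


# Hypothetical placeholder data, not real predictions.
partners = {
    "virus_X": {"HOST_A", "HOST_B", "HOST_C"},
    "virus_Y": {"HOST_B", "HOST_C", "HOST_D"},
    "virus_Z": {"HOST_E"},
}

ranked = rank_virus_pairs(partners)
for score, u, v in ranked:
    print(f"{u} vs {v}: {score:.2f}")
```

Here virus_X and virus_Y share two of four host partners (a score of 0.50) and would be flagged as functionally related whatever their evolutionary distance, while virus_Z shares nothing with either. The same comparison could be run on enriched host pathways instead of individual proteins.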

It is important to stress that this work reports predictions, and further experimental work should follow to explore the hypotheses generated here. Theoretical and experimental work fit beautifully together when theory generates hypotheses that are tested experimentally and the results are used to further refine the theoretical model in a feedback loop.

Considering that “All models are wrong, but some are useful” (George Box), I would like to believe that some of our predictions are correct and contribute to understanding viral pathogenesis.

The discovery of Bombali virus, a new ebolavirus

To date, five species of ebolavirus are known: Zaire virus, Bundibugyo virus, Sudan virus, Tai Forest virus and Reston virus. Four of these species (all but Reston virus) are known to cause severe disease in humans.

In this new article we describe the discovery of a new ebolavirus, named Bombali virus (BOMV), found in free-tailed bats in Sierra Leone. Modeling the interaction between the viral GP1 protein and its receptor in humans (the NPC1 protein) suggested that this new virus can indeed bind the human protein and thus mediate viral entry into human cells. Subsequent experimental analysis confirmed this finding. This, however, does not imply that BOMV is pathogenic in humans, since we still don’t know whether the virus can interact with the human molecular machinery required to complete its life cycle.

This research provides strong evidence that bats serve as hosts for ebolaviruses and highlights the importance of wildlife surveillance to evaluate the zoonotic risk of emergent viruses.

Most importantly, this study does not intend to create alarm or incite the retaliatory culling of bats. Bats play an important ecological role as insectivores, pollinators and seed dispersers, and killing or disturbing bats does not reduce the risk of transmission. On the contrary, it might increase disease transmission by exposing uninfected bats.

Pyruvate Carboxylase, towards a consensus view on the conformational landscape during its catalytic cycle

One of the projects I was leading in my previous lab (at CICbioGUNE) was recently published in Structure. Pyruvate Carboxylase (PC) is a multifunctional tetrameric enzyme that is involved in several biosynthetic pathways. This enzyme has two catalytic domains, and a separate domain, known as the BCCP domain, is known to travel between the two catalytic sites, thus linking the corresponding chemical reactions. The malfunctioning of this important enzyme is related to diseases that predominantly manifest with lactic acidemia and neurological dysfunction.

When I started working on this protein back in 2008 there were two crystal structures of this enzyme, both with a similar quaternary structure yet with a fundamental difference that was not well understood: the different arrangement of a domain that is key for the function of the protein, known as the PC tetramerization (PT) domain or allosteric domain. This domain showed a symmetrical arrangement in S. aureus and an asymmetrical arrangement in R. etli. The lack of consensus on this functionally important domain led to the assumption that the difference was due to speciation. However, I found this difficult to believe considering i) that PC carries out exactly the same molecular function in both species; ii) how important the PT domain is for the function of this enzyme; and iii) how similar the overall quaternary arrangement of the tetramer is in S. aureus and R. etli. On top of this, a recent study on a truncated PC from R. etli unexpectedly showed a symmetrical arrangement of the PT domains, opening the door to new answers to this unresolved question.

In this published work, we tackled precisely this question and showed that the conformational difference in fact corresponds to different conformational states of the enzyme rather than to speciation. For this, we performed cryo-EM on working PC enzymes from S. aureus. Given that each copy of PC would be at a different stage of its catalytic cycle, the challenge here was to sort the single-particle images into different conformational states. Luckily for us, existing computational methods are capable of separating conformations within conformationally heterogeneous datasets. Our results showed that PC in S. aureus mapped not only to the known symmetrical conformation but also to the asymmetrical conformation. Furthermore, we were able to assign each conformational state to a particular enzymatic reaction (PC carries out two enzymatic reactions, each taking place in a different catalytic site) and to suggest the conformational transitions happening throughout the catalytic cycle.

Our work has brought together different structures and sheds light on how this protein, and similar ones, carry out their function. I also think that this work is a good example of how important it can be to combine different techniques to achieve one goal. Here, we combined X-ray crystallography, cryo-EM and molecular dynamics into one integrative modeling pipeline in order to get a bigger picture of our biological question.