Emily Cowlishaw

Genomic Epidemiology in Ebola Outbreaks

Updated: Jan 27, 2022


Introduction

Genomic epidemiology has dramatically improved the ability to track disease outbreaks and respond to them. From the Black Death in the mid-1300s to the 1918 flu pandemic, global and national epidemics can be devastating to a population. Discovering the origins of an outbreak through genomic epidemiological studies allows transmission routes to be reconstructed and potential reservoirs to be identified. For pathogens with high mutation rates and short replication periods, such as Ebola virus, genomic sequencing and single-nucleotide variation studies are invaluable as a means to trace a disease through time and location. Phylodynamic analysis built on whole genome sequencing allows pathogenic isolates to be identified and their evolution to be traced. In the case of the largest Ebola outbreak in history, whole genome sequencing allowed viral evolution to be followed across West Africa.


Genomic Epidemiology in Ebola Outbreaks

Beginning in December 2013 and continuing into 2016, an Ebola outbreak infected more than 28,000 people and killed upwards of 11,000. It was attributed to a novel variant of Ebola virus, termed Ebola Makona (EBOV-Makona). Ebola virus, a member of the family Filoviridae, falls into Baltimore classification Group V as a negative-sense, single-stranded RNA virus with a genome size of 19 kb. Like other RNA viruses, it accumulates mutations rapidly because its replication is error-prone and lacks proofreading. Over the course of this outbreak, different studies together sequenced over 1,500 Ebola Makona genomes, accounting for about 5% of all cases.

The introduction of EBOV-Makona into the human population that began this epidemic is thought to have originated from a two-year-old boy who was bitten by an infected bat in Guinea, Africa. After this boy was identified as the index case, the genetic evolution of the virus was traced using 179 samples of patient serum collected from March 2014 to January 2015 by the European Mobile Laboratory, a diagnostic unit set up in the epicenter of the outbreak. The initial lineage, named Lineage A, was consistent with cases from March 2014 in Guinea. It was followed by Lineage B, which emerged in May 2014.

Sierra Leone was among the countries most heavily impacted by the Ebola outbreak, accounting for almost 50% of all reported cases. Of the 14,124 cases reported in Sierra Leone in this time period, 232 genomes were sequenced. Parallel genome sequencing was performed on the 673 samples taken from patients, which were subsequently divided into two cohorts based on when and where the cases were confirmed. Once the genomes were assembled (each at least 18.9 kb in length, with an average of 374x coverage), they were all compared to the 86 EBOV-Makona genomes from Guinea that had been published earlier. When compared only to the earliest strain from Guinea, 464 single-nucleotide polymorphisms (SNPs) were documented. Of these, 125 were nonsynonymous, 176 were synonymous, and 63 were non-coding; there were also 7 insertions (five of a single nucleotide and two of two bases). These mutations were specific to each strain, never being found in more than one sample at the same location in the Sierra Leone collection.
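To make the synonymous and nonsynonymous categories concrete, here is a minimal Python sketch of how SNPs might be called and classified by comparing a sample to a reference. It is not the pipeline used in the studies above. The two short sequences are invented for illustration, and the function name classify_snps is hypothetical. The sketch assumes the sequences are already aligned, in frame, and free of insertions or deletions.

```python
# Toy sketch: call SNPs between two aligned coding sequences and classify them.
# The sequences are invented for illustration; they are not EBOV-Makona data.

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_snps(reference, sample):
    """Return (position, ref_base, alt_base, effect) for every mismatch.

    Positions are 1-based. Assumes equal-length, in-frame coding sequences
    with no indels. If two SNPs share a codon, each is judged against the
    full sample codon.
    """
    if len(reference) != len(sample) or len(reference) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, and equal in length")

    snps = []
    for index, (ref_base, alt_base) in enumerate(zip(reference, sample)):
        if ref_base == alt_base:
            continue
        codon_start = index - index % 3
        ref_aa = CODON_TABLE[reference[codon_start:codon_start + 3]]
        alt_aa = CODON_TABLE[sample[codon_start:codon_start + 3]]
        if ref_aa == alt_aa:
            effect = "synonymous"      # same amino acid
        elif alt_aa == "*":
            effect = "nonsense"        # premature stop codon
        else:
            effect = "nonsynonymous"   # amino acid change
        snps.append((index + 1, ref_base, alt_base, effect))
    return snps


reference = "ATGGCTAAATGGTTC"  # invented reference fragment
sample = "ATGGCCAGATGATTC"     # invented sample fragment

for snp in classify_snps(reference, sample):
    print(snp)
```

Real analyses also have to handle non-coding regions, insertions, and sequencing errors, which is why dedicated annotation tools are used in practice.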

Three lineages from Sierra Leone have been categorized as SL1 (also referred to as Lineage A), SL2, and SL3 (see Figure 1). Ebola transmission appears to occur in clusters, both temporally and geographically. Through common transmission events such as funerals or hospital and healthcare visits, Ebola spreads from one person to many very rapidly.

The first introduction of EBOV-Makona into Sierra Leone occurred in twelve people who attended the funeral of a confirmed EBOV-Makona patient from Guinea; this became the SL1 lineage. SL1 differed from the Guinea lineage by two to five mutations, which is evidence of transmission from Guinea. The addition of four new SNPs to SL1 resulted in the formation of SL2. Lastly, in June 2014, a single SNP at position 10,218 of SL2 gave rise to SL3, which ultimately became the most prevalent strain in Sierra Leone, accounting for 97% of all genomes that were sequenced. This led to the conclusion that, following the introduction of SL1 into Sierra Leone from Guinea, the rest of the transmission of the virus could be attributed to human-to-human contact within Sierra Leone, and it eliminated the possibility of a zoonotic reservoir in this outbreak.
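The nested relationship between these lineages (SL2 is SL1 plus four SNPs, and SL3 is SL2 plus one more) lends itself to a simple rule-based assignment. The Python sketch below is only an illustration of that idea. The function name assign_lineage and the toy samples are hypothetical, and the four SL2 coordinates are placeholders, since only position 10,218 is given in the text.

```python
from collections import Counter

# Lineage-defining positions (1-based genome coordinates).
# Only 10218 comes from the text; the four SL2 coordinates are placeholders.
SL2_POSITIONS = frozenset({101, 202, 303, 404})
SL3_POSITION = 10218


def assign_lineage(derived_positions):
    """Assign SL1, SL2, or SL3 from the positions where a sample differs from SL1.

    A sample is SL2 only if it carries all four SL2-defining SNPs, and SL3
    only if it carries those four plus the SNP at position 10,218.
    Anything else is treated as SL1 in this simplified sketch.
    """
    derived = set(derived_positions)
    if not SL2_POSITIONS <= derived:
        return "SL1"
    if SL3_POSITION in derived:
        return "SL3"
    return "SL2"


# Invented samples: each is the set of positions that differ from SL1.
toy_samples = {
    "sample_a": set(),
    "sample_b": {101, 202, 303, 404},
    "sample_c": {101, 202, 303, 404, 10218},
    "sample_d": {101, 202, 303, 404, 10218, 5555},  # extra private SNP
}

lineages = {name: assign_lineage(snps) for name, snps in toy_samples.items()}
print(lineages)
print(Counter(lineages.values()))
```

Tallying the assignments in this way is how a figure such as the share of genomes belonging to SL3 could be computed from sequenced samples.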

Figure 1 | A temporally arranged phylogenetic tree of genomes from the 2013-2016 Ebola Makona outbreak. Dots indicate case location (green is Guinea, red is Liberia, and blue is Sierra Leone).

In Liberia, which was also affected by this Ebola epidemic, the outbreak was analyzed by the Liberian Institute for Biomedical Research (LIBR) with help from the Center for Genomic Sciences at the United States Army Medical Research Institute of Infectious Diseases (USAMRIID) in Maryland. Of the 1,700 confirmed cases in Liberia, samples were selected from 25 patients of different age groups who had recently been treated at 7 different facilities. Viral RNA from the samples was amplified and then sequenced on Illumina instruments, and the genomes were assembled with reference to a previously published EBOV-Makona genome. Sequencher was used to form consensus sequences that aligned with the published genome, and SnpEff was used to identify all SNPs present in the genomes relative to a 2014 Makona strain. These 25 genomes had 97 new SNPs, of which about half were synonymous, one quarter were nonsynonymous, and the remaining quarter were noncoding. One nonsense mutation was found. The common ancestor of all of these isolates was estimated to have originated between May 2 and July 9, 2014; this timeframe corresponds to an outbreak that occurred in Monrovia, Liberia, and is consistent with a single transmission event rather than continuous reinfection (see Figure 2).

Another study of Ebola cases in Liberia in 2014 sequenced 140 genomes and compared them to 734 genomes published from the West Africa outbreak. In this comparison, there were 1,474 SNPs, 960 of which occurred in the coding regions of the virus; 555 of these were synonymous mutations, 403 were nonsynonymous, and the remaining two resulted in missense mutations. This suggests a consistent Ebola virus mutation rate in Liberia.
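The counts reported for this study can be checked with a little arithmetic. The Python sketch below uses only the numbers quoted above; the function name snp_breakdown is hypothetical, and treating every SNP outside the coding regions as noncoding is an assumption made here rather than a figure reported by the study.

```python
def snp_breakdown(total, coding, coding_categories):
    """Check that the coding categories add up and derive simple shares.

    Returns (noncoding_count, coding_percent_of_total, percent_of_coding_by_category).
    The noncoding count is derived as total minus coding, which is an
    assumption of this sketch rather than a number reported in the text.
    """
    if sum(coding_categories.values()) != coding:
        raise ValueError("coding categories do not sum to the coding total")
    noncoding = total - coding
    coding_percent = round(100 * coding / total, 1)
    shares = {
        name: round(100 * count / coding, 1)
        for name, count in coding_categories.items()
    }
    return noncoding, coding_percent, shares


# Counts quoted in the text for the 140-genome Liberia study.
noncoding, coding_percent, shares = snp_breakdown(
    total=1474,
    coding=960,
    coding_categories={"synonymous": 555, "nonsynonymous": 403, "other": 2},
)

print("SNPs outside coding regions:", noncoding)
print("Percent of SNPs in coding regions:", coding_percent)
print("Percent of coding SNPs by category:", shares)
```

The three coding categories do sum to 960, so the reported breakdown is internally consistent.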

Figure 2 | A graphical analysis of the introduction and transmission of Ebola virus in Liberia. The initial infection originated from Guinea, mutated in Liberia, and then returned to Guinea.

As of 2015, a minimum of 33 viral mutations of potential concern had occurred since the beginning of the West Africa epidemic. Of these, 26 were nonsynonymous mutations in the epitopes recognized by immunotherapy treatments; 5 occurred in the binding regions of therapeutic drugs; and the remaining 2 appeared in the primer binding region used for the PCR analysis that serves as a diagnostic test during outbreaks. Fortunately, the efficacy of the diagnostic tests used in Liberia did not change, and no genome sequence carried more than one change affecting therapeutic drug binding.
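Flagging mutations like these comes down to asking whether a mutated position falls inside a region that a diagnostic test or treatment depends on. The Python sketch below shows one way that lookup might be done. All coordinates and region labels are hypothetical placeholders rather than real EBOV-Makona annotations, and the function name annotate_position is an illustrative choice.

```python
from bisect import bisect_right

# Hypothetical regions of interest as (start, end, label), 1-based inclusive.
# They must be sorted by start and must not overlap. None of these
# coordinates are real EBOV-Makona annotations.
REGIONS = [
    (1000, 1020, "PCR primer binding site"),
    (6500, 6560, "immunotherapy epitope"),
    (7000, 7030, "therapeutic drug binding site"),
]
REGION_STARTS = [start for start, _end, _label in REGIONS]


def annotate_position(position):
    """Return the label of the region containing position, or None."""
    # Find the last region whose start is <= position, then check its end.
    index = bisect_right(REGION_STARTS, position) - 1
    if index >= 0 and position <= REGIONS[index][1]:
        return REGIONS[index][2]
    return None


# Invented mutation positions to screen.
for position in (500, 1010, 6560, 6561, 7015):
    print(position, annotate_position(position))
```

A mutation that lands in a primer binding site would then prompt a check of whether the diagnostic test still performs as expected.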


Moving Forward

In epidemiological studies of diseases with large transmission bottlenecks, which make it possible for many genetically different isolates to be transferred during an infectious encounter, the combination of whole genome sequencing and SNP analysis seems to have a promising future. A definitive diagnosis of Ebola can be made with any of these methods: real-time quantitative PCR, antigen or antibody reaction, or viral genome sequencing. While not every outbreak zone has a conveniently located laboratory, the invention of the small, portable, and highly accurate genome sequencer, MinION, makes genome sequencing and epidemiology easily attainable all around the world. This will likely be the future of real-time genomic epidemiology. Because of these new technologies, Ebola has been traced all through Western Africa, originating in Guinea and spreading to Liberia and Sierra Leone.


Conclusions

Epidemics seem to be inevitable, and while perhaps nothing can be done to prevent them, whole genome sequencing combined with SNP analysis can serve as genomic surveillance to track and trace outbreaks as they occur, allowing information about transmission to be recorded and used to shape a public health response. The use of genomic epidemiology in the Ebola epidemic has been informative because of the high mutation rate of Ebola virus. It has allowed transmission events to be identified so that they can be avoided in the future, and it has allowed mutations affecting detection and drug treatment to be followed, since these may impact future diagnosis and treatment.



This article was prepared by the author in their personal capacity. The opinions expressed in this article are the author's own and do not reflect the views of their place of employment.



