Enhanced bioinformatic profiling of VIDISCA libraries for virus detection and discovery

Cormac M. Kinsella; Martin Deijs; Lia van der Hoek

doi:10.1016/j.virusres.2018.12.010

Research output: Contribution to journal › Article › Academic › peer-review

15 Citations (Scopus)

Abstract

VIDISCA is a next-generation sequencing (NGS) library preparation method designed to enrich viral nucleic acids from samples before highly-multiplexed low depth sequencing. Reliable detection of known viruses and discovery of novel divergent viruses from NGS data require dedicated analysis tools that are both sensitive and accurate. Existing software was utilised to design a new bioinformatic workflow for high-throughput detection and discovery of viruses from VIDISCA data. The workflow leverages the VIDISCA library preparation molecular biology, specifically the use of Mse1 restriction enzyme which produces biological replicate library inserts from identical genomes. The workflow performs total metagenomic analysis for classification of non-viral sequence including parasites and host, and separately carries out virus specific analyses. Ribosomal RNA sequence is removed to increase downstream analysis speed and remaining reads are clustered at 100% identity. Known and novel viruses are sensitively detected via alignment to a virus-only protein database, and false positives are removed. A new cluster-profiling analysis takes advantage of the viral biological replicates produced by Mse1 digestion, using read clustering to flag the presence of short genomes at very high copy number. Importantly, this analysis ensures that highly repeated sequences are identified even if no homology is detected, as is shown here with the detection of a novel gokushovirus genome from human faecal matter. The workflow was validated using read data derived from serum and faeces samples taken from HIV-1 positive adults, and serum samples from pigs that were infected with atypical porcine pestivirus.
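The cluster-profiling idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' workflow: it treats "100% identity" as exact string equality between reads, and the function names, the toy reads, and the `min_size` threshold are all hypothetical choices made for illustration only.

```python
from collections import Counter


def cluster_identical(reads):
    """Group reads by exact sequence match (case-insensitive).

    Assumption: exact equality stands in for clustering at 100% identity.
    """
    return Counter(seq.upper() for seq in reads)


def flag_repeated(clusters, min_size=3):
    """Return (sequence, count) pairs for clusters with at least
    min_size members, largest cluster first.

    min_size is a hypothetical cut-off, not a value from the paper.
    """
    flagged = [(seq, n) for seq, n in clusters.items() if n >= min_size]
    return sorted(flagged, key=lambda item: (-item[1], item[0]))


# Toy reads: one insert appears four times, the others once each.
reads = [
    "TAACGGATCCGT",
    "TAACGGATCCGT",
    "taacggatccgt",
    "TAAGGCATTCCA",
    "TAACGGATCCGT",
    "TAATTGCACGGA",
]

clusters = cluster_identical(reads)
for seq, count in flag_repeated(clusters, min_size=3):
    print(seq, count)
```

Because the flagging step relies only on how often an insert recurs, it needs no reference database, which is consistent with the abstract's point that highly repeated sequences can be identified even when no homology is detected.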

Original language	English
Pages (from-to)	21-26
Journal	Virus Research
Volume	263
DOIs	https://doi.org/10.1016/j.virusres.2018.12.010
Publication status	Published - 2019

Cite this

@article{b218c9d44f954a7cbe909ee7c9681811,
title = "Enhanced bioinformatic profiling of VIDISCA libraries for virus detection and discovery",
abstract = "VIDISCA is a next-generation sequencing (NGS) library preparation method designed to enrich viral nucleic acids from samples before highly-multiplexed low depth sequencing. Reliable detection of known viruses and discovery of novel divergent viruses from NGS data require dedicated analysis tools that are both sensitive and accurate. Existing software was utilised to design a new bioinformatic workflow for high-throughput detection and discovery of viruses from VIDISCA data. The workflow leverages the VIDISCA library preparation molecular biology, specifically the use of Mse1 restriction enzyme which produces biological replicate library inserts from identical genomes. The workflow performs total metagenomic analysis for classification of non-viral sequence including parasites and host, and separately carries out virus specific analyses. Ribosomal RNA sequence is removed to increase downstream analysis speed and remaining reads are clustered at 100% identity. Known and novel viruses are sensitively detected via alignment to a virus-only protein database, and false positives are removed. A new cluster-profiling analysis takes advantage of the viral biological replicates produced by Mse1 digestion, using read clustering to flag the presence of short genomes at very high copy number. Importantly, this analysis ensures that highly repeated sequences are identified even if no homology is detected, as is shown here with the detection of a novel gokushovirus genome from human faecal matter. The workflow was validated using read data derived from serum and faeces samples taken from HIV-1 positive adults, and serum samples from pigs that were infected with atypical porcine pestivirus.",

author = "Kinsella, {Cormac M.} and Deijs, Martin and {van der Hoek}, Lia",
year = "2019",
doi = "10.1016/j.virusres.2018.12.010",
language = "English",
volume = "263",
pages = "21--26",
journal = "Virus Research",
issn = "0168-1702",
publisher = "Elsevier",
}

TY - JOUR
T1 - Enhanced bioinformatic profiling of VIDISCA libraries for virus detection and discovery
AU - Kinsella, Cormac M.
AU - Deijs, Martin
AU - van der Hoek, Lia
PY - 2019
Y1 - 2019
N2 - VIDISCA is a next-generation sequencing (NGS) library preparation method designed to enrich viral nucleic acids from samples before highly-multiplexed low depth sequencing. Reliable detection of known viruses and discovery of novel divergent viruses from NGS data require dedicated analysis tools that are both sensitive and accurate. Existing software was utilised to design a new bioinformatic workflow for high-throughput detection and discovery of viruses from VIDISCA data. The workflow leverages the VIDISCA library preparation molecular biology, specifically the use of Mse1 restriction enzyme which produces biological replicate library inserts from identical genomes. The workflow performs total metagenomic analysis for classification of non-viral sequence including parasites and host, and separately carries out virus specific analyses. Ribosomal RNA sequence is removed to increase downstream analysis speed and remaining reads are clustered at 100% identity. Known and novel viruses are sensitively detected via alignment to a virus-only protein database, and false positives are removed. A new cluster-profiling analysis takes advantage of the viral biological replicates produced by Mse1 digestion, using read clustering to flag the presence of short genomes at very high copy number. Importantly, this analysis ensures that highly repeated sequences are identified even if no homology is detected, as is shown here with the detection of a novel gokushovirus genome from human faecal matter. The workflow was validated using read data derived from serum and faeces samples taken from HIV-1 positive adults, and serum samples from pigs that were infected with atypical porcine pestivirus.

UR - https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85059540210&origin=inward
UR - https://www.ncbi.nlm.nih.gov/pubmed/30578804
U2 - https://doi.org/10.1016/j.virusres.2018.12.010
DO - 10.1016/j.virusres.2018.12.010
M3 - Article
C2 - 30578804
SN - 0168-1702
VL - 263
SP - 21
EP - 26
JO - Virus Research
JF - Virus Research
ER -
