Repun: an accurate small variant representation unification method for multiple sequencing platforms.

Zhenxian Zheng, Yingxuan Ren, Lei Chen, Angel On Ki Wong, Shumin Li, Xian Yu, Tak-Wah Lam, Ruibang Luo

doi:10.1093/bib/bbae613

Abstract

Ensuring a unified variant representation that is consistent with the sequencing alignments is critical for downstream analysis, because variant representation may differ across platforms and sequencing conditions. Current approaches typically treat variant unification as a post-processing step after variant calling and cannot determine the correct variant representation from the outset. Matching variant representations to the alignments before variant calling has benefits, such as providing reliable training labels for training deep learning-based variant callers and enabling direct assessment of alignment quality. However, it also poses challenges because of the large number of candidates to handle. Here, we present Repun, a haplotype-aware variant-alignment unification algorithm that harmonizes the variant representation between provided variants and alignments across different sequencing platforms. Repun leverages phasing to facilitate equivalent haplotype matches between variants and alignments. Our approach reduces the number of comparisons between variant haplotypes and candidate haplotypes by using haplotypes with read evidence, which speeds up the unification process. Repun achieved >99.99% precision and >99.5% recall in extensive evaluations of various Genome in a Bottle Consortium samples across three sequencing platforms: Oxford Nanopore Technologies, Pacific Biosciences, and Illumina. Repun is open-source and available at https://github.com/zhengzhenxian/Repun.
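
To illustrate what "equivalent haplotype matches" means, the sketch below shows how two differently written variant sets can describe the same haplotype sequence. It is a minimal illustration and not Repun's implementation: the `Variant` tuple, the functions `apply_variants` and `same_haplotype`, and the toy reference are all hypothetical names and data chosen for this example, and real unification would also have to handle phasing, genotypes, and read evidence.

```python
from typing import List, Tuple

# Hypothetical simplified record: (0-based position, REF allele, ALT allele).
Variant = Tuple[int, str, str]


def apply_variants(reference: str, variants: List[Variant]) -> str:
    """Apply non-overlapping variants to a reference and return the haplotype."""
    pieces = []
    cursor = 0  # first reference position not yet copied or replaced
    for pos, ref, alt in sorted(variants):
        if pos < cursor:
            raise ValueError("overlapping variants are not supported")
        if reference[pos:pos + len(ref)] != ref:
            raise ValueError(f"REF allele does not match reference at {pos}")
        pieces.append(reference[cursor:pos])  # unchanged reference stretch
        pieces.append(alt)                    # substituted allele
        cursor = pos + len(ref)
    pieces.append(reference[cursor:])
    return "".join(pieces)


def same_haplotype(reference: str, a: List[Variant], b: List[Variant]) -> bool:
    """Two variant sets are equivalent if they produce the same haplotype."""
    return apply_variants(reference, a) == apply_variants(reference, b)


reference = "ACACACT"  # toy reference containing a CA repeat

# The same 2-bp deletion, written left-aligned and shifted right in the repeat.
left_aligned = [(0, "ACA", "A")]
right_shifted = [(2, "ACA", "A")]

# One multi-nucleotide substitution versus two adjacent single-base changes.
mnp = [(1, "CA", "GT")]
two_snps = [(1, "C", "G"), (2, "A", "T")]

print(same_haplotype(reference, left_aligned, right_shifted))  # True
print(same_haplotype(reference, mnp, two_snps))                # True
print(same_haplotype(reference, mnp, [(1, "C", "G")]))         # False
```

Comparing reconstructed sequences rather than the records themselves is what lets different placements of an indel in a repeat, or a substitution split into several records, be recognised as the same variant. The number of candidate combinations grows quickly in a naive comparison like this one, which is the cost the abstract says Repun reduces by restricting candidates to haplotypes with read evidence.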

Journal: Briefings in Bioinformatics
Publication Date: Nov 22, 2024
License type: CC BY 4.0
