WHAM: A High-Throughput Sequence Alignment Method

Yinan Li; Jignesh M. Patel; Allison Terrell

ACM Trans. Database Syst. | December 2012, Vol 37(4): pp. 28

Over the last decade, the cost of producing genomic sequences has dropped dramatically thanks to so-called next-generation sequencing methods. However, these methods depend critically on fast and sophisticated data processing for aligning a set of query sequences to a reference genome using rich string matching models. The focus of this work is the design, development, and evaluation of a data processing system for this crucial “short read alignment” problem. Our system, called WHAM, employs hash-based indexing methods and bitwise operations for sequence alignment. It allows rich match models and is significantly faster than existing state-of-the-art methods. In addition, its relative speedup over existing methods is poised to grow as read sequence lengths increase.
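
To give a rough sense of how hash-based indexing and bitwise operations can work together for short read alignment, the following is a minimal Python sketch. It is not WHAM's actual algorithm or data layout. It assumes a substitution-only (Hamming distance) match model, a plain dictionary as the hash index, and a 2-bit packing of bases. All function names and parameters (`encode`, `mismatches`, `build_index`, `align`, `k`, `max_mismatches`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Assumed 2-bit encoding of the four DNA bases (illustrative choice).
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode(seq):
    """Pack a DNA string into a single integer, 2 bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | CODE[base]
    return value


def mismatches(a, b, length):
    """Count differing bases between two packed sequences of equal length."""
    low_bits = int("01" * length, 2)  # one marker bit per 2-bit base
    diff = a ^ b  # a base differs if either of its two bits is set
    per_base = (diff | (diff >> 1)) & low_bits  # fold each pair onto its low bit
    return bin(per_base).count("1")


def build_index(reference, k):
    """Hash index: every k-mer of the reference maps to its start positions."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def align(read, reference, index, k, max_mismatches):
    """Return sorted (position, mismatch_count) pairs for the read.

    Non-overlapping k-mers of the read are used as exact-match seeds.
    If the read has more than max_mismatches such seeds, at least one
    seed must match exactly (pigeonhole), so no valid hit is missed.
    """
    n = len(read)
    read_bits = encode(read)
    hits = {}
    for offset in range(0, n - k + 1, k):
        for pos in index.get(read[offset:offset + k], ()):
            start = pos - offset
            if start < 0 or start + n > len(reference) or start in hits:
                continue
            window = encode(reference[start:start + n])
            distance = mismatches(read_bits, window, n)
            if distance <= max_mismatches:
                hits[start] = distance
    return sorted(hits.items())


reference = "ACGTACGTTTGACCA"
index = build_index(reference, k=3)
result = align("GTTAGA", reference, index, k=3, max_mismatches=1)
```

In this sketch the hash lookup narrows the search to a few candidate positions, and the XOR-based comparison verifies each candidate without a per-character loop. A production system would pack fixed-width machine words rather than use Python's arbitrary-precision integers, and would support richer match models than substitutions alone.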