iGAP: A scalable solution for genomic variants analysis

Traditionally, Next Generation Sequencing (NGS) data analysis is performed in a file-based manner. This approach, however, is error-prone and becomes difficult to track and manage as the number of files increases. Furthermore, it complicates analyses that span whole datasets.

In this publication we introduce the interactive Genomic Analysis Platform (iGAP). iGAP is a system that supports the secondary NGS analysis pipeline (variant calling, decomposition, normalization, annotation) and aims to streamline the tertiary data analysis that leads to a clinical report.
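
To make the decomposition and normalization steps concrete, the following is a minimal Python sketch and not the iGAP implementation. The names `Variant`, `decompose`, and `trim` are hypothetical and introduced only for illustration. It assumes that decomposition means splitting a multi-allelic record into one record per alternate allele, and it approximates normalization by trimming bases shared between the reference and alternate alleles. Full normalization also left-aligns indels against the reference genome, which is omitted here.

```python
from typing import List, NamedTuple


class Variant(NamedTuple):
    """A single bi-allelic variant (hypothetical record type)."""
    chrom: str
    pos: int  # 1-based position, as in VCF
    ref: str
    alt: str


def decompose(chrom: str, pos: int, ref: str, alts: List[str]) -> List[Variant]:
    """Split one multi-allelic record into one record per ALT allele."""
    return [Variant(chrom, pos, ref, alt) for alt in alts]


def trim(v: Variant) -> Variant:
    """Drop bases shared by REF and ALT, keeping at least one base in each.

    This is only the trimming part of normalization; left-alignment
    against a reference sequence is not performed.
    """
    pos, ref, alt = v.pos, v.ref, v.alt
    # Trim the common suffix first; the position does not change.
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Then trim the common prefix, shifting the position to the right.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return Variant(v.chrom, pos, ref, alt)


if __name__ == "__main__":
    # A made-up multi-allelic site with three alternate alleles.
    for variant in decompose("chr1", 100, "CTT", ["C", "CT", "ATT"]):
        print(trim(variant))
```

After these two steps each variant has a single compact representation, so the tuple (chromosome, position, reference allele, alternate allele) can act as a key when records from many samples are stored and queried together.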

Due to the high volume of sequencing data, the solution was built upon a scalable Spark and Hadoop environment, and a new, efficient data model has been proposed. The described application could be used in clinical institutions and laboratories where massive amounts of NGS data are currently stored and analyzed in a file-centric manner.

Author: Agnieszka Szmurło
Conference: Title