Background: Despite the wide use of next-generation sequencing, detecting structural variants remains difficult. A split read is one clue to a structural variant and is represented as a soft-clipped read in the raw sequencing data. Because most breakpoints of structural variants reside in non-coding regions, split read information has not been routinely used in exome sequencing or targeted panel sequencing. Recently, SCRAMble, a software tool capable of detecting mobile element insertions (MEIs) and deletions based on soft-clipped read clusters (SCRCs), was shown to provide an additional diagnostic yield of 0.03-0.25%. SCRAMble is the only software that can be used for exome sequencing or targeted panel sequencing to detect structural variants based on SCRC information. The aim of the present study was to establish a working procedure for utilizing SCRC information with SCRAMble in clinical exome sequencing and to assess its diagnostic yield.
Methods: Raw sequencing data from clinical exome sequencing were retrospectively analyzed using SCRAMble to search for MEIs and deletions. SCRAMble was installed according to the manufacturer’s instructions and run with default parameters, except for mei-score, which was adjusted for sensitivity. RefSeq gene annotation was performed for both MEI and deletion calls using ANNOVAR. Blacklist-based filtering was used to reduce the number of candidate MEI/deletion calls. Clinical relevance of the remaining variant calls was evaluated manually.
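The blacklist-based filtering step can be illustrated with a minimal sketch. The abstract does not state how the blacklist was constructed, so the recurrence rule used here (a call seen in at least `min_cases` cases is blacklisted), the call key `(chromosome, position, type)`, and the function names `build_blacklist` and `filter_calls` are all assumptions made for illustration, not the authors' procedure.

```python
from collections import Counter

def build_blacklist(calls_by_case, min_cases=2):
    """Return call keys seen in at least `min_cases` distinct cases.

    Assumption: recurrence across cases is the blacklist criterion.
    """
    counts = Counter()
    for calls in calls_by_case.values():
        counts.update(set(calls))  # count each key at most once per case
    return {key for key, n in counts.items() if n >= min_cases}

def filter_calls(calls, blacklist):
    """Keep only the calls whose key is not blacklisted."""
    return [call for call in calls if call not in blacklist]

# Toy data (hypothetical): each call is (chromosome, position, call type).
calls_by_case = {
    "case_A": [("chr1", 1000, "MEI"), ("chr2", 5000, "DEL")],
    "case_B": [("chr1", 1000, "MEI"), ("chr3", 7000, "MEI")],
    "case_C": [("chr1", 1000, "MEI"), ("chr2", 5000, "DEL"), ("chr4", 900, "DEL")],
}

blacklist = build_blacklist(calls_by_case, min_cases=2)
remaining = {
    case: filter_calls(calls, blacklist)
    for case, calls in calls_by_case.items()
}
```

The threshold is a tunable assumption. A purely recurrence-based rule set too low could also discard genuinely recurrent pathogenic calls, such as a founder variant shared by several cases, so a real pipeline would likely need an allowlist or a higher threshold.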
Results: One diagnostic MEI, a founder variant in East Asia, was detected in two cases (2/266, 0.75%). In addition, two diagnostic deletions, which had previously been detected by depth-of-coverage (DOC)-based copy number variant (CNV) callers, were detected (2/266, 0.75%). SCRAMble identified base-level breakpoints for these two deletions, which the DOC-based callers could not derive. Most SCRCs recurred across cases, and blacklist-based filtering reduced candidate MEI/deletion calls by 49.5-94.5%, but still left a considerable number of variant calls to be manually validated.
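The percentages reported above follow from simple ratios, sketched here for clarity. The yield figure uses the counts given in the text (2 of 266 cases). The call counts passed to `percent_reduction` are hypothetical values chosen only to show how a reduction percentage is computed; the abstract does not report the underlying counts.

```python
def percent(numerator, denominator):
    """Express numerator/denominator as a percentage."""
    return 100.0 * numerator / denominator

def percent_reduction(before, after):
    """Percentage of candidate calls removed by a filtering step."""
    return 100.0 * (before - after) / before

# Diagnostic yield from the counts in the text: 2 of 266 cases.
yield_pct = percent(2, 266)
print(f"Diagnostic yield: {yield_pct:.2f}%")

# Hypothetical call counts before and after filtering (illustration only).
low = percent_reduction(before=200, after=101)
high = percent_reduction(before=1000, after=55)
print(f"Reduction range: {low:.1f}% to {high:.1f}%")
```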
Conclusions: SCRC screening in exome or targeted panel sequencing may provide additional diagnostic yield, either through detection of pathogenic MEIs or through confirmation of deletions identified by DOC-based CNV callers. Development of an efficient filtering algorithm is warranted.