SUPPLEMENTARY RESULTS

The study analyzed coding exons of ~4,000 genes (5 Mb) from NCI-H209 using PCR and capillary sequencing, identifying 29 known single-base substitutions. The substitution algorithm detected 22 of these, a sensitivity of 76% (22/29). The missed substitutions were due to lack of reads, reference bias, insufficient coverage, or contamination.

The algorithm's specificity for identifying somatic point mutations was 98% in coding regions and 94% in non-coding regions, with 6 false positives due to germline SNPs and 9 due to neighboring indels and other variants.

Small insertions (up to 3 bp) and deletions (up to 11 bp) were detected using corona-lite. A stringent algorithm was used to minimize false positives, requiring at least three supporting tumor reads, at least one read on each strand, no loss of heterozygosity in the normal genome, a maximum coverage of 100X, and a minimum normal coverage of 30X. The algorithm failed to detect 2 coding indels identified previously, confirming low sensitivity for somatic indels.

A set of 262 putative somatic indels was tested by capillary sequencing, with a true positive rate of 25%. False positives included wild-type sequences and germline indels, highlighting how reference bias and germline polymorphisms make genuine somatic indels difficult to identify.

- **Supplementary Figure 1**: Distance from transcription start site.
- **Supplementary Figure 2**: IARC database: SCLC cases with 245 published substitutions in *TP53*.
- **Supplementary Figure 3**: Mutations by transcribed vs non-transcribed strands.
- **Supplementary Figure 4**: Not specified in the text.
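The stringent indel criteria listed in the summary can be read as a simple per-candidate filter. The sketch below is illustrative only and is not the study's implementation: the `IndelCandidate` record, its field names, and the `passes_filter` function are hypothetical, and it assumes that the 100X ceiling applies to tumor coverage, which the text does not state.

```python
from dataclasses import dataclass

# Thresholds taken from the summary text.
MIN_SUPPORTING_TUMOR_READS = 3
MAX_COVERAGE = 100          # assumption: applied to tumor coverage
MIN_NORMAL_COVERAGE = 30


@dataclass
class IndelCandidate:
    """Hypothetical record for one putative somatic indel."""
    tumor_support_fwd: int   # supporting tumor reads on the forward strand
    tumor_support_rev: int   # supporting tumor reads on the reverse strand
    tumor_coverage: int      # total tumor read depth at the site
    normal_coverage: int     # total normal read depth at the site
    normal_loh: bool         # loss of heterozygosity seen in the normal genome


def passes_filter(c: IndelCandidate) -> bool:
    """Return True if the candidate meets every criterion in the summary."""
    supporting = c.tumor_support_fwd + c.tumor_support_rev
    return (
        supporting >= MIN_SUPPORTING_TUMOR_READS
        and c.tumor_support_fwd >= 1      # at least one read on each strand
        and c.tumor_support_rev >= 1
        and not c.normal_loh
        and c.tumor_coverage <= MAX_COVERAGE
        and c.normal_coverage >= MIN_NORMAL_COVERAGE
    )


if __name__ == "__main__":
    good = IndelCandidate(2, 1, 40, 35, False)
    one_strand = IndelCandidate(3, 0, 40, 35, False)
    print(passes_filter(good), passes_filter(one_strand))
```

Requiring every condition at once favors specificity over sensitivity, which is consistent with the summary's observation that the filter missed 2 previously identified coding indels.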
[slides] A small cell lung cancer genome reports complex tobacco exposure signatures | StudySpace