This function randomly subsamples mutations, retaining all simple clonal CNAs; subclonal CNAs are dropped. If the data contain driver mutation annotations, the drivers can be forced to remain.

subsample(x, N = 15000, keep_drivers = TRUE)

Arguments

x

A CNAqc object, as created by init. The function returns a new CNAqc object with the subset data.

N

The maximum number of mutations to retain.

keep_drivers

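If TRUE, driver mutations annotated in the data are always retained. In the examples below they appear on top of the sampled mutations, so the result can hold slightly more than N mutations (N = 100 yields 103 with 3 drivers).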
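
The sampling behaviour described by N and keep_drivers can be sketched in base R on a toy mutation table. This is only an illustration under stated assumptions, not CNAqc's implementation: the helper subsample_rows and the toy data are hypothetical, and only the is_driver column name is taken from the example output.

```r
# Toy mutation table; values are made up for illustration.
set.seed(42)  # fix the RNG so the random draw is reproducible
mutations <- data.frame(
  chr       = rep("chr1", 20),
  from      = seq(1000, by = 100, length.out = 20),
  is_driver = c(TRUE, rep(FALSE, 18), TRUE)
)

# Hypothetical helper (not part of CNAqc): draw at most N rows at random
# and, if requested, always append the annotated drivers.
subsample_rows <- function(muts, N = 15000, keep_drivers = TRUE) {
  drivers <- muts[muts$is_driver, , drop = FALSE]
  # Assumption: with keep_drivers = TRUE, the N rows are drawn from non-drivers only.
  pool    <- if (keep_drivers) muts[!muts$is_driver, , drop = FALSE] else muts
  idx     <- sample.int(nrow(pool), min(N, nrow(pool)))
  picked  <- pool[idx, , drop = FALSE]
  out     <- if (keep_drivers) rbind(picked, drivers) else picked
  out[order(out$from), , drop = FALSE]  # restore genomic order
}

sub <- subsample_rows(mutations, N = 5)
nrow(sub)  # 7: 5 sampled rows plus the 2 drivers
```

With keep_drivers = FALSE the same helper returns exactly min(N, number of rows) mutations, and drivers survive only if they happen to be drawn.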

Examples

data('example_dataset_CNAqc', package = 'CNAqc')
x = init(mutations = example_dataset_CNAqc$mutations, cna = example_dataset_CNAqc$cna, purity = example_dataset_CNAqc$purity)
#> 
#> ── CNAqc - CNA Quality Check ───────────────────────────────────────────────────
#> 
#>  Using reference genome coordinates for: GRCh38.
#>  Found annotated driver mutations: TTN, CTCF, and TP53.
#>  Fortified calls for 12963 somatic mutations: 12963 SNVs (100%) and 0 indels.
#> ! CNAs have no CCF, assuming clonal CNAs (CCF = 1).
#>  Fortified CNAs for 267 segments: 267 clonal and 0 subclonal.
#>  12963 mutations mapped to clonal CNAs.

# Example runs
subsample(x, N = 100)
#> 
#> ── CNAqc - CNA Quality Check ───────────────────────────────────────────────────
#> 
#>  Using reference genome coordinates for: GRCh38.
#>  Found annotated driver mutations: TTN, CTCF, and TP53.
#>  Fortified calls for 103 somatic mutations: 103 SNVs (100%) and 0 indels.
#> ! CNAs have no CCF, assuming clonal CNAs (CCF = 1).
#>  Fortified CNAs for 267 segments: 267 clonal and 0 subclonal.
#> Warning: [CNAqc] a karyotype column is present in CNA calls, and will be overwritten
#>  103 mutations mapped to clonal CNAs.
#> ── [ CNAqc ] MySample 103 mutations in 267 segments (267 clonal, 0 subclonal). G
#> 
#> ── Clonal CNAs 
#> 
#>  2:2  [n = 58, L = 1483 Mb] ■■■■■■■■■■■■■■■■■■■■■■■■■■■  { CTCF }
#>  3:2  [n = 16, L = 357 Mb] ■■■■■■■
#>  4:2  [n = 14, L = 331 Mb] ■■■■■■
#>  2:1  [n =  9, L = 420 Mb] ■■■■  { TTN }
#>  3:0  [n =  5, L = 137 Mb] ■■
#>  2:0  [n =  1, L = 39 Mb]   { TP53 }
#> 
#>  Sample Purity: 89% ~ Ploidy: 4.
#> 
#>  There are 3 annotated driver(s) mapped to clonal CNAs.
#>          chr      from        to ref alt  DP NV       VAF driver_label is_driver
#>         chr2 179431633 179431634   C   T 117 77 0.6581197          TTN      TRUE
#>        chr16  67646006  67646007   C   T 120 54 0.4500000         CTCF      TRUE
#>        chr17   7577106   7577107   G   C  84 78 0.9285714         TP53      TRUE
subsample(x, N = 1000)
#> 
#> ── CNAqc - CNA Quality Check ───────────────────────────────────────────────────
#> 
#>  Using reference genome coordinates for: GRCh38.
#>  Found annotated driver mutations: TTN, CTCF, and TP53.
#>  Fortified calls for 1003 somatic mutations: 1003 SNVs (100%) and 0 indels.
#> ! CNAs have no CCF, assuming clonal CNAs (CCF = 1).
#>  Fortified CNAs for 267 segments: 267 clonal and 0 subclonal.
#> Warning: [CNAqc] a karyotype column is present in CNA calls, and will be overwritten
#>  1003 mutations mapped to clonal CNAs.
#> ── [ CNAqc ] MySample 1003 mutations in 267 segments (267 clonal, 0 subclonal). 
#> 
#> ── Clonal CNAs 
#> 
#>   2:2  [n = 580, L = 1483 Mb] ■■■■■■■■■■■■■■■■■■■■■■■■■■■  { CTCF }
#>   4:2  [n = 148, L = 331 Mb] ■■■■■■■
#>   2:1  [n = 123, L = 420 Mb] ■■■■■■  { TTN }
#>   3:2  [n = 123, L = 357 Mb] ■■■■■■
#>   3:0  [n =  23, L = 137 Mb] ■
#>   2:0  [n =   4, L =  39 Mb]   { TP53 }
#>  25:2  [n =   1, L =   1 Mb] 
#>  26:2  [n =   1, L =   0 Mb] 
#> 
#>  Sample Purity: 89% ~ Ploidy: 4.
#> 
#>  There are 3 annotated driver(s) mapped to clonal CNAs.
#>          chr      from        to ref alt  DP NV       VAF driver_label is_driver
#>         chr2 179431633 179431634   C   T 117 77 0.6581197          TTN      TRUE
#>        chr16  67646006  67646007   C   T 120 54 0.4500000         CTCF      TRUE
#>        chr17   7577106   7577107   G   C  84 78 0.9285714         TP53      TRUE