The function loads input mutations and optional copy number data for TINC. Input formats are reported at the package website.
After loading the data, mutations with a tumour VAF outside a given range (default [0; 0.7]) are removed from the analysis. Similarly, mutations are randomly down-sampled if their number exceeds a threshold (default `N=20000`).
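The two steps described above can be sketched in base R. This is a minimal illustration and not the TINC implementation: the helper `filter_and_downsample()` is hypothetical, the column names `t_ref_count` and `t_alt_count` follow the example output on this page, and treating the VAF bounds as inclusive is an assumption.

```r
# Hypothetical helper, not part of TINC: illustrates the two documented steps.
filter_and_downsample <- function(x, VAF_range_tumour = c(0, 0.7), N = 20000) {
  # Tumour VAF = alternative reads / total reads (column names are assumed)
  vaf <- x$t_alt_count / (x$t_alt_count + x$t_ref_count)

  # Keep mutations whose VAF lies inside the range; inclusive bounds are an
  # assumption, the documentation does not state how the limits are handled
  in_range <- !is.na(vaf) &
    vaf >= VAF_range_tumour[1] &
    vaf <= VAF_range_tumour[2]
  x <- x[in_range, , drop = FALSE]

  # Down-sample at random if more than N mutations remain
  if (nrow(x) > N) {
    x <- x[sample(nrow(x), N), , drop = FALSE]
  }
  x
}

set.seed(1)
toy <- data.frame(
  t_ref_count = c(60, 10, 50, 70, 40),
  t_alt_count = c(20, 90, 50, 30, 0)
)

# Four of the five toy mutations are in range; N = 2 forces down-sampling
filtered <- filter_and_downsample(toy, N = 2)
```

With the toy data, the mutation with VAF 0.9 is dropped by the range filter, and the remaining four are reduced to a random pair because `N = 2`.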
load_TINC_input(x, cna, VAF_range_tumour = c(0, 0.7), N = 20000)
A data frame or tibble with the input mutation data, reporting `chr`, `from`, `to`, `ref` and `alt`, plus the normal read counts `n_ref_count` and `n_alt_count` and the tumour read counts `t_ref_count` and `t_alt_count`.
Copy number data in the format of the CNAqc package, providing `chr`, `from`, `to`, `Major` and `minor`.
VAF range used to filter mutations from the tumour sample.
If there are more than `N` mutations in VAF range `VAF_range_tumour`, a random subset of size `N` is retained.
A tibble with the loaded data.
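As an illustration of the input layout, the sketch below builds toy tables with the columns listed above, using base R only. All values are made up, the tumour alternative-read column is named `t_alt_count` as in the example output on this page, and the final mapping check is only a sanity test written for this sketch, not something TINC or CNAqc exposes.

```r
# Toy mutation table with the documented columns (values are made up)
mutations <- data.frame(
  chr = c("chr1", "chr1", "chr2"),
  from = c(1000L, 5000L, 2000L),
  to = c(1001L, 5001L, 2001L),
  ref = c("A", "C", "G"),
  alt = c("T", "G", "A"),
  n_ref_count = c(30, 28, 31),
  n_alt_count = c(0, 1, 0),
  t_ref_count = c(60, 55, 70),
  t_alt_count = c(40, 45, 30),
  stringsAsFactors = FALSE
)

# Toy copy number segments with the CNAqc-style columns
cna <- data.frame(
  chr = c("chr1", "chr2"),
  from = c(1L, 1L),
  to = c(10000L, 10000L),
  Major = c(1, 1),
  minor = c(1, 1),
  stringsAsFactors = FALSE
)

# Sanity check: does each mutation fall inside a segment on its chromosome?
mapped <- vapply(seq_len(nrow(mutations)), function(i) {
  any(cna$chr == mutations$chr[i] &
        cna$from <= mutations$from[i] &
        cna$to >= mutations$to[i])
}, logical(1))
```

Tables shaped like these would then be passed as `x` and `cna`; the exact validation performed by the package is described on the package website.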
# Generating a random TIN input
rt = random_TIN()
#> ✔ Generated TINC dataset (n = 988 mutations), TIN (0.05) and TIT (1), normal and tumour coverage 30x and 120x.
#> Warning: Removed 2 rows containing missing values or values outside the scale range
#> (`geom_bar()`).
#> Warning: Removed 2 rows containing missing values or values outside the scale range
#> (`geom_bar()`).
load_TINC_input(x = rt$data, cna = rt$cna)
#>
#> ── Loading TINC input data ─────────────────────────────────────────────────────
#> ✔ Input data contains n = 988 mutations, selecting operation mode.
#> ! Found CNA data, retaining only mutations that map to segments with predominant karyotype ...
#>
#>
#> ── CNAqc - CNA Quality Check ───────────────────────────────────────────────────
#>
#> ℹ Using reference genome coordinates for: GRCh38.
#> ✔ Fortified calls for 988 somatic mutations: 988 SNVs (100%) and 0 indels.
#> ! CNAs have no CCF, assuming clonal CNAs (CCF = 1).
#> ! Added segments length (in basepairs) to CNA segments.
#> ✔ Fortified CNAs for 988 segments: 988 clonal and 0 subclonal.
#> Warning: [CNAqc] a karyotype column is present in CNA calls, and will be overwritten
#> ✔ 988 mutations mapped to clonal CNAs.
#>
#>
#> ── Genome coverage by karyotype, in basepairs. ──
#>
#> # A tibble: 1 × 4
#> minor Major n karyotype
#> <dbl> <dbl> <dbl> <chr>
#> 1 1 1 2964 1:1
#> ✔ n = 988 mutations mapped to CNA segments with karyotype 1:1 (largest available in basepairs).
#> ✔ Mutation with VAF within 0 and 0.7 ~ n = 985.
#> $mutations
#> # A tibble: 988 × 12
#> chr from to ref alt n_ref_count n_alt_count t_ref_count
#> <chr> <int> <dbl> <chr> <chr> <dbl> <dbl> <dbl>
#> 1 chr10 58621084 58621085 A T 25 0 62
#> 2 chr22 43901849 43901850 T A 34 0 64
#> 3 chr8 94780725 94780726 T C 28 0 56
#> 4 chr8 33652187 33652188 T C 30 0 53
#> 5 chr15 29033868 29033869 A G 25 0 67
#> 6 chr3 24149402 24149403 C A 28 0 59
#> 7 chr10 79650576 79650577 A G 38 0 58
#> 8 chr17 5547490 5547491 T A 30 0 62
#> 9 chr3 179929084 179929085 A A 27 0 66
#> 10 chr6 33663681 33663682 C T 28 0 57
#> # ℹ 978 more rows
#> # ℹ 4 more variables: t_alt_count <dbl>, karyotype <chr>, id <chr>,
#> # OK_tumour <lgl>
#>
#> $cna_map
#> ── [ CNAqc ] MySample 988 mutations in 988 segments (988 clonal, 0 subclonal). G
#>
#> ── Clonal CNAs
#>
#> 1:1 [n = 988, L = 0 Mb] ■■■■■■■■■■■■■■■■■■■■■■■■■■■
#>
#> ℹ Sample Purity: 80% ~ Ploidy: 2.
#>
#> $what_we_used
#> [1] "1:1"
#>
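The log above reports that mutations are retained only if they map to the karyotype covering the most basepairs. A minimal base R sketch of that selection is given below. It is not the TINC or CNAqc code: the toy segments are made up, and the `Major:minor` labelling and the way segment length is computed are assumptions.

```r
# Toy segments (made up); karyotype is labelled as Major:minor
segments <- data.frame(
  chr = c("chr1", "chr1", "chr2", "chr3"),
  from = c(1, 5000001, 1, 1),
  to = c(5000000, 9000000, 3000000, 2000000),
  Major = c(1, 2, 1, 2),
  minor = c(1, 1, 1, 0)
)
segments$karyotype <- paste(segments$Major, segments$minor, sep = ":")

# Segment length in basepairs
segments$length <- segments$to - segments$from

# Total genome coverage per karyotype, then pick the largest
coverage <- tapply(segments$length, segments$karyotype, sum)
predominant <- names(coverage)[which.max(coverage)]

# Segments with the predominant karyotype; mutations would be kept only
# if they map to one of these
kept_segments <- segments[segments$karyotype == predominant, ]
```

In this toy example the `1:1` segments cover the most basepairs, so `predominant` is `"1:1"`, mirroring the `$what_we_used` entry in the output above.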