Load TINC input data. — load_TINC_input

The function loads input mutations and, optionally, copy number data for TINC. Input formats are described on the package website.

After loading the data, mutations with a VAF outside a given range (default `[0, 0.7]`) are removed from the analysis. Similarly, mutations are down-sampled to a random subset if they exceed a given number (default `N = 20000`).
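
The two filtering steps described above can be sketched in base R. This is a minimal illustration and not the package's code: `filter_and_downsample` is a hypothetical helper, the tumour VAF is assumed to be the alternate read count divided by the total read count, and the range bounds are assumed to be inclusive.

```r
# Hypothetical helper, not part of TINC: illustrates the two filtering steps.
filter_and_downsample <- function(x, VAF_range_tumour = c(0, 0.7), N = 20000) {
  # Assumption: tumour VAF = alt reads / (ref reads + alt reads)
  depth <- x$t_ref_count + x$t_alt_count
  vaf <- x$t_alt_count / depth

  # Keep mutations whose VAF lies inside the range (bounds assumed inclusive);
  # rows with zero depth give NaN and are dropped
  keep <- !is.na(vaf) & vaf >= VAF_range_tumour[1] & vaf <= VAF_range_tumour[2]
  x <- x[keep, , drop = FALSE]

  # Down-sample only when more than N mutations remain
  if (nrow(x) > N) {
    x <- x[sample(nrow(x), N), , drop = FALSE]
  }
  x
}

# Toy read counts, made up for illustration
set.seed(1)
toy <- data.frame(
  t_ref_count = c(60, 10, 50, 40),
  t_alt_count = c(20, 90, 50, 0)
)
filtered <- filter_and_downsample(toy, N = 2)
```

In this toy table the second mutation has a VAF of 0.9 and is dropped by the range filter. The remaining three exceed `N = 2`, so two of them are retained at random.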

load_TINC_input(x, cna, VAF_range_tumour = c(0, 0.7), N = 20000)

Arguments

x: A data frame or tibble with the input mutation data, reporting `chr`, `from`, `to`, `ref` and `alt`, plus the normal sample read counts `n_ref_count` and `n_alt_count`, and the tumour sample read counts `t_ref_count` and `t_alt_count`.
cna: Copy number data in the format used by the CNAqc package, providing `chr`, `from`, `to`, `Major` and `minor`.
VAF_range_tumour: VAF range used to filter mutations from the tumour sample.
N: If there are more than `N` mutations in VAF range `VAF_range_tumour`, a random subset of size `N` is retained.
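
As a rough guide to the expected column layout, the two inputs could be assembled as plain data frames like this. All values are made up for illustration. The tumour columns follow the example output on this page, which shows `t_ref_count` and `t_alt_count`. Whether a table this small passes the package's own checks has not been verified here.

```r
# Toy mutation table; values are made up for illustration only.
x <- data.frame(
  chr = c("chr1", "chr2"),
  from = c(1000L, 2000L),
  to = c(1001L, 2001L),
  ref = c("A", "C"),
  alt = c("T", "G"),
  n_ref_count = c(30, 28),   # normal sample, reference reads
  n_alt_count = c(0, 1),     # normal sample, alternate reads
  t_ref_count = c(60, 55),   # tumour sample, reference reads
  t_alt_count = c(40, 50),   # tumour sample, alternate reads
  stringsAsFactors = FALSE
)

# Toy copy number segments in the column layout described for `cna`.
cna <- data.frame(
  chr = c("chr1", "chr2"),
  from = c(1L, 1L),
  to = c(5000L, 5000L),
  Major = c(1, 1),
  minor = c(1, 1),
  stringsAsFactors = FALSE
)

# With TINC installed, these would be passed as:
# load_TINC_input(x = x, cna = cna)
```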

Value

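A list with the loaded data. In the example below, where CNA data are supplied, it contains the mutation tibble (`mutations`), the CNAqc object with the mapped calls (`cna_map`) and the karyotype that was used (`what_we_used`).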

Examples

# Generating a random TIN input
rt = random_TIN()
#> ✔ Generated TINC dataset (n = 988 mutations), TIN (0.05) and TIT (1), normal and tumour coverage 30x and 120x.
#> Warning: Removed 2 rows containing missing values or values outside the scale range
#> (`geom_bar()`).
#> Warning: Removed 2 rows containing missing values or values outside the scale range
#> (`geom_bar()`).
load_TINC_input(x = rt$data, cna = rt$cna)
#> 
#> ── Loading TINC input data ─────────────────────────────────────────────────────
#> ✔ Input data contains n = 988 mutations, selecting operation mode.
#> ! Found CNA data, retaining only mutations that map to segments with predominant karyotype ...
#> 
#> 
#> ── CNAqc - CNA Quality Check ───────────────────────────────────────────────────
#> 
#> ℹ Using reference genome coordinates for: GRCh38.
#> ✔ Fortified calls for 988 somatic mutations: 988 SNVs (100%) and 0 indels.
#> ! CNAs have no CCF, assuming clonal CNAs (CCF = 1).
#> ! Added segments length (in basepairs) to CNA segments.
#> ✔ Fortified CNAs for 988 segments: 988 clonal and 0 subclonal.
#> Warning: [CNAqc] a karyotype column is present in CNA calls, and will be overwritten
#> ✔ 988 mutations mapped to clonal CNAs.
#> 
#> 
#> ── Genome coverage by karyotype, in basepairs. ──
#> 
#> # A tibble: 1 × 4
#>   minor Major     n karyotype
#>   <dbl> <dbl> <dbl> <chr>    
#> 1     1     1  2964 1:1      
#> ✔ n = 988 mutations mapped to CNA segments with karyotype 1:1 (largest available in basepairs).
#> ✔ Mutation with VAF within 0 and 0.7 ~ n = 985.
#> $mutations
#> # A tibble: 988 × 12
#>    chr        from        to ref   alt   n_ref_count n_alt_count t_ref_count
#>    <chr>     <int>     <dbl> <chr> <chr>       <dbl>       <dbl>       <dbl>
#>  1 chr10  58621084  58621085 A     T              25           0          62
#>  2 chr22  43901849  43901850 T     A              34           0          64
#>  3 chr8   94780725  94780726 T     C              28           0          56
#>  4 chr8   33652187  33652188 T     C              30           0          53
#>  5 chr15  29033868  29033869 A     G              25           0          67
#>  6 chr3   24149402  24149403 C     A              28           0          59
#>  7 chr10  79650576  79650577 A     G              38           0          58
#>  8 chr17   5547490   5547491 T     A              30           0          62
#>  9 chr3  179929084 179929085 A     A              27           0          66
#> 10 chr6   33663681  33663682 C     T              28           0          57
#> # ℹ 978 more rows
#> # ℹ 4 more variables: t_alt_count <dbl>, karyotype <chr>, id <chr>,
#> #   OK_tumour <lgl>
#> 
#> $cna_map
#> ── [ CNAqc ] MySample 988 mutations in 988 segments (988 clonal, 0 subclonal). G
#> 
#> ── Clonal CNAs 
#> 
#>  1:1  [n = 988, L =   0 Mb] ■■■■■■■■■■■■■■■■■■■■■■■■■■■
#> 
#> ℹ Sample Purity: 80% ~ Ploidy: 2.
#> 
#> $what_we_used
#> [1] "1:1"
#>