This function simulates a random dataset for TINC analysis,
with mutations and copy number data. Segments are not real, they are assumed
to be constant heterozygous diploid (Major = minor = 1
) and span just
mutations (for mappability).
This samples has some noise so that the obtained TIT score might be slightly lower than the required input.
random_TIN(
N = 1000,
TIN = 0.05,
TIT = 1,
normal_coverage = 30,
tumour_coverage = 120
)
Number of input simulations
TIN - Tumour in normal contamination level.
TIT - Tumour in tumour contamination level (aka tumour purity).
Normal coverage (mean).
Tumour coverage (mean).
Tibbles with the data, and a plot.
set.seed(1234)
# Default dataset
random_TIN()
#> ✔ Generated TINC dataset (n = 991 mutations), TIN (0.05) and TIT (1), normal and tumour coverage 30x and 120x.
#> Warning: Removed 2 rows containing missing values or values outside the scale range
#> (`geom_bar()`).
#> Warning: Removed 2 rows containing missing values or values outside the scale range
#> (`geom_bar()`).
#> $data
#> # A tibble: 991 × 14
#> chr from to ref alt simulated_cluster n_tot_count t_tot_count
#> <chr> <int> <dbl> <chr> <chr> <chr> <int> <int>
#> 1 chr21 9350303 9.35e6 T C C1 35 109
#> 2 chr22 34134600 3.41e7 C A C1 40 127
#> 3 chr10 28158356 2.82e7 T C C1 37 120
#> 4 chr13 76030019 7.60e7 G A C1 43 113
#> 5 chr7 80111695 8.01e7 T C C1 27 107
#> 6 chr16 79722541 7.97e7 T T C1 42 122
#> 7 chr10 106104245 1.06e8 A T C1 28 122
#> 8 chr15 95490382 9.55e7 G G C1 24 124
#> 9 chr6 17255265 1.73e7 G A C1 32 120
#> 10 chr7 143422347 1.43e8 C C C1 43 135
#> # ℹ 981 more rows
#> # ℹ 6 more variables: n_alt_count <dbl>, t_alt_count <dbl>, n_ref_count <dbl>,
#> # t_ref_count <dbl>, sim_t_vaf <dbl>, sim_n_vaf <dbl>
#>
#> $cna
#> # A tibble: 991 × 6
#> chr from to ref Major minor
#> <chr> <dbl> <dbl> <chr> <dbl> <dbl>
#> 1 chr21 9350302 9350305 T 1 1
#> 2 chr22 34134599 34134602 C 1 1
#> 3 chr10 28158355 28158358 T 1 1
#> 4 chr13 76030018 76030021 G 1 1
#> 5 chr7 80111694 80111697 T 1 1
#> 6 chr16 79722540 79722543 T 1 1
#> 7 chr10 106104244 106104247 A 1 1
#> 8 chr15 95490381 95490384 G 1 1
#> 9 chr6 17255264 17255267 G 1 1
#> 10 chr7 143422346 143422349 C 1 1
#> # ℹ 981 more rows
#>
#> $plot
#>