
3. Analysis of metastatic propensity and tropism of the MSK-MetTropism Cohort
Source:vignettes/a3_metastasis_analysis.Rmd
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(INCOMMON)
#> Warning: replacing previous import 'cli::num_ansi_colors' by
#> 'crayon::num_ansi_colors' when loading 'INCOMMON'
In this vignette we analyse metastatic propensity and organotropism based on the INCOMMON classification of samples from breast cancer (BRCA) patients in the MSK-MetTropism cohort.
First we prepare the input using function init:
data(MSK_genomic_data)
data(MSK_clinical_data)
data(cancer_gene_census)
x = init(
genomic_data = MSK_genomic_data,
clinical_data = MSK_clinical_data %>% filter(tumor_type == 'BRCA'),
gene_roles = cancer_gene_census
)
#> ── INCOMMON - Inference of copy number and mutation multiplicity in oncology ───
#>
#> ── Genomic data ──
#>
#> ✔ Found 25659 samples, with 224939 mutations in 491 genes
#> ! No read counts found for 1393 mutations in 1393 samples
#> ! Gene name not provided for 1393 mutations
#> ! 201 genes could not be assigned a role (TSG or oncogene)
#>
#> ── Clinical data ──
#>
#> ℹ Provided clinical features:
#> ✔ sample (required for classification)
#> ✔ purity (required for classification)
#> ✔ tumor_type
#> ✔ OS_MONTHS
#> ✔ OS_STATUS
#> ✔ SAMPLE_TYPE
#> ✔ MET_COUNT
#> ✔ METASTATIC_SITE
#> ✔ MET_SITE_COUNT
#> ✔ PRIMARY_SITE
#> ✔ SUBTYPE_ABBREVIATION
#> ✔ GENE_PANEL
#> ✔ TMB_NONSYNONYMOUS
#> ✔ FGA
#> ✔ AGE_AT_DEATH
#> ✔ Found 2484 matching samples
#> ✖ Found 23175 unmatched samples
print(x)
#> ── [ INCOMMON ] 9916 PASS mutations across 2462 samples, with 286 mutant genes
#> ℹ Average sample purity: 0.42
#> ℹ Average sequencing depth: 681
#> # A tibble: 9,916 × 25
#> sample tumor_type purity chr from to ref alt DP NV VAF
#> <chr> <chr> <dbl> <chr> <dbl> <dbl> <chr> <chr> <int> <int> <dbl>
#> 1 P-0015535 BRCA 0.3 chr3 1.79e8 1.79e8 G A 868 167 0.192
#> 2 P-0015535 BRCA 0.3 chr17 3.79e7 3.79e7 T C 1172 205 0.175
#> 3 P-0015535 BRCA 0.3 chr7 1.41e8 1.41e8 G A 765 120 0.157
#> 4 P-0015535 BRCA 0.3 chr21 3.62e7 3.62e7 G A 1006 162 0.161
#> 5 P-0015535 BRCA 0.3 chr16 6.88e7 6.88e7 C - 774 210 0.271
#> 6 P-0015535 BRCA 0.3 chr17 1.60e7 1.60e7 C T 764 155 0.203
#> 7 P-0015535 BRCA 0.3 chr19 1.46e7 1.46e7 G C 544 70 0.129
#> 8 P-0007009 BRCA 0.5 chr19 4.28e7 4.28e7 G A 852 648 0.761
#> 9 P-0007009 BRCA 0.5 chr14 1.05e8 1.05e8 - GGCA… 1530 453 0.296
#> 10 P-0013299 BRCA 0.4 chr19 4.59e7 4.59e7 C T 1542 180 0.117
#> # ℹ 9,906 more rows
#> # ℹ 14 more variables: gene <chr>, gene_role <chr>, OS_MONTHS <dbl>,
#> # OS_STATUS <dbl>, SAMPLE_TYPE <chr>, MET_COUNT <dbl>, METASTATIC_SITE <chr>,
#> # MET_SITE_COUNT <dbl>, PRIMARY_SITE <chr>, SUBTYPE_ABBREVIATION <chr>,
#> # GENE_PANEL <chr>, TMB_NONSYNONYMOUS <dbl>, FGA <dbl>, AGE_AT_DEATH <dbl>
There are 9916 PASS mutations with an average sequencing depth of 681 across 2462 samples with an average purity of 0.42.
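As a quick sanity check on the columns printed above, the VAF column can be recomputed as the fraction of variant reads (NV) over total depth (DP). The sketch below copies the read counts of rows 1, 2 and 8 of the printed table into a small data frame; the data frame name `reads` is only illustrative.

```r
# Read counts copied from rows 1, 2 and 8 of the printed table.
reads <- data.frame(
  DP = c(868, 1172, 852),  # total reads covering the site
  NV = c(167, 205, 648)    # reads supporting the variant
)

# Variant allele frequency as the fraction of variant reads.
reads$VAF <- reads$NV / reads$DP
print(round(reads$VAF, 3))
#> [1] 0.192 0.175 0.761
```

The recomputed values agree with the VAF column shown for those rows.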
Classification of 2462 MSK-MetTropism BRCA samples
We then classify the mutations using PCAWG priors and the default entropy cutoff and overdispersion parameter:
x = classify(
x = x,
priors = pcawg_priors,
entropy_cutoff = 0.2,
rho = 0.01
)
print(x)
#> ── [ INCOMMON ] 9916 PASS mutations across 2462 samples, with 286 mutant genes
#> ℹ Average sample purity: 0.42
#> ℹ Average sequencing depth: 681
#> ── [ INCOMMON ] Classified mutations with overdispersion parameter 0.01 and ent
#> # A tibble: 9,916 × 18
#> sample tumor_type purity chr from to ref alt DP NV VAF
#> <chr> <chr> <dbl> <chr> <dbl> <dbl> <chr> <chr> <int> <int> <dbl>
#> 1 P-0015535 BRCA 0.3 chr3 1.79e8 1.79e8 G A 868 167 0.192
#> 2 P-0015535 BRCA 0.3 chr17 3.79e7 3.79e7 T C 1172 205 0.175
#> 3 P-0015535 BRCA 0.3 chr7 1.41e8 1.41e8 G A 765 120 0.157
#> 4 P-0015535 BRCA 0.3 chr21 3.62e7 3.62e7 G A 1006 162 0.161
#> 5 P-0015535 BRCA 0.3 chr16 6.88e7 6.88e7 C - 774 210 0.271
#> 6 P-0015535 BRCA 0.3 chr17 1.60e7 1.60e7 C T 764 155 0.203
#> 7 P-0015535 BRCA 0.3 chr19 1.46e7 1.46e7 G C 544 70 0.129
#> 8 P-0007009 BRCA 0.5 chr19 4.28e7 4.28e7 G A 852 648 0.761
#> 9 P-0007009 BRCA 0.5 chr14 1.05e8 1.05e8 - GGCA… 1530 453 0.296
#> 10 P-0013299 BRCA 0.4 chr19 4.59e7 4.59e7 C T 1542 180 0.117
#> # ℹ 9,906 more rows
#> # ℹ 7 more variables: gene <chr>, gene_role <chr>, id <chr>, label <chr>,
#> # state <chr>, posterior <dbl>, entropy <dbl>
There are 4147 heterozygous diploid mutations (HMD), 578 mutations with loss of heterozygosity (LOH), 2018 mutations with copy-neutral LOH (CNLOH), and 663 mutations with amplification. In addition, 2510 mutations were classified as Tier-2, either because their entropy exceeded the cutoff or because of a low number of mutant alleles relative to the wild-type.
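To build some intuition for how these classes relate to read counts, the sketch below computes the VAF expected for a clonal mutation from sample purity, total copy number and mutation multiplicity, using the standard tumour/normal mixture formula. The helper `expected_vaf` and the (copy number, multiplicity) pairs assigned to each class are assumptions made for illustration; they are not taken from the INCOMMON source, which also accounts for priors and overdispersion.

```r
# Expected VAF of a clonal mutation (illustrative helper, not an INCOMMON function).
# purity: fraction of tumour cells in the sample
# k: total copy number in tumour cells; m: copies carrying the mutation
# Normal cells are assumed to be diploid and wild-type.
expected_vaf <- function(purity, k, m) {
  (m * purity) / (k * purity + 2 * (1 - purity))
}

# Assumed (k, m) configurations for three of the classes.
configs <- data.frame(
  class = c("HMD", "LOH", "CNLOH"),
  k = c(2, 1, 2),
  m = c(1, 1, 2)
)

# Expected VAF of each configuration at purity 0.3.
configs$vaf <- expected_vaf(0.3, configs$k, configs$m)
print(configs)
```

Under these assumptions, at purity 0.3 the expected VAFs are 0.15 (HMD), about 0.18 (LOH) and 0.30 (CNLOH). Closely spaced expectations such as the first two are one plausible reason for a mutation to receive a high-entropy call.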
Metastatic propensity of Mutant TP53 with LOH patients
We can analyse the metastatic propensity of primary breast tumor
genomes containing TP53 mutations by using function
met_propensity. This function implements a logistic
regression to fit the Binomial probability of developing metastasis
based on the interpreted mutant genome, with the mutant gene without CNA
(here, Mutant TP53 without LOH) as reference.
x = met_propensity(x, tumor_type = 'BRCA', gene = 'TP53')
#> ℹ There are 2112 different genotypes
#> ℹ The most abundant genotypes are:
#> • Mutant TP53 with LOH (54 Samples, Frequency 0.02)
#> • Mutant PIK3CA without AMP (33 Samples, Frequency 0.01)
#> • Mutant TP53 without LOH (33 Samples, Frequency 0.01)
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> # A tibble: 1 × 6
#> gene class OR low up p.value
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 TP53 Mutant TP53 with LOH 1.64 1.12 2.41 0.0105
The odds ratio (OR) of metastasising for Mutant TP53 with LOH breast cancer is 1.64 (confidence interval 1.12 to 2.41, p.value = 0.01) with respect to mutant samples without LOH.
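The kind of model described above can be sketched on simulated data: a logistic regression of a binary metastasis indicator on the genotype class, with the class without LOH as the reference level, and the odds ratio obtained by exponentiating the fitted coefficient. The data, the column names `class` and `metastatic`, and the simulated effect size are all assumptions; this is not the internal code of met_propensity.

```r
set.seed(1)
n <- 400

# Hypothetical genotype labels, with "without_LOH" as the reference level.
class <- factor(
  sample(c("without_LOH", "with_LOH"), n, replace = TRUE),
  levels = c("without_LOH", "with_LOH")
)

# Simulate metastasis with higher odds in the "with_LOH" group (assumed effect).
p <- ifelse(class == "with_LOH", 0.60, 0.45)
toy <- data.frame(class = class, metastatic = rbinom(n, 1, p))

# Logistic regression: Binomial probability of metastasis given the class.
fit <- glm(metastatic ~ class, data = toy, family = binomial())

# Odds ratio, profile-likelihood confidence interval and p-value.
or <- exp(coef(fit)[["classwith_LOH"]])
ci <- exp(confint(fit, "classwith_LOH"))
p_value <- summary(fit)$coefficients["classwith_LOH", "Pr(>|z|)"]
print(c(OR = or, low = ci[[1]], up = ci[[2]], p.value = p_value))
```

Calling `confint()` on a fitted `glm` profiles the likelihood, which is the usual source of the "Waiting for profiling to be done..." message seen in the output above.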
Metastatic tropism of Mutant TP53 with LOH patients
We can analyse the metastatic organotropism of metastatic breast
tumor genomes containing TP53 mutations by using function
met_tropism. Similarly to the metastatic propensity
analysis, this function implements a logistic regression to fit the
Binomial probability of developing metastasis towards a specific
metastatic site (here the Liver, as an example), based on the interpreted
mutant genome, with the mutant gene without CNA (here, Mutant TP53
without LOH) as reference.
x = met_tropism(x, tumor_type = 'BRCA', gene = 'TP53', metastic_site = 'Liver')
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> # A tibble: 1 × 6
#> gene class OR low up p.value
#> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 TP53 Mutant TP53 with LOH 1.90 1.07 3.54 0.0343
The odds of metastasising to the Liver for Mutant TP53 with LOH breast cancer are almost two-fold higher (OR = 1.9, p.value = 0.03) than for mutant samples without LOH.
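With a single binary predictor, the odds ratio from a logistic regression of this kind coincides with the cross-product ratio of the corresponding 2×2 table. The sketch below checks this on made-up counts of metastatic samples with and without liver involvement; the counts and variable names are hypothetical and unrelated to the cohort results.

```r
# Hypothetical counts among metastatic samples (not cohort data).
counts <- data.frame(
  class = factor(c("without_LOH", "with_LOH"),
                 levels = c("without_LOH", "with_LOH")),
  liver = c(20, 38),      # samples with a liver metastasis
  not_liver = c(80, 62)   # samples with metastases at other sites only
)

# Cross-product odds ratio: odds in "with_LOH" over odds in "without_LOH".
odds <- counts$liver / counts$not_liver
or_table <- odds[2] / odds[1]

# Same quantity from a binomial GLM fitted to the aggregated counts.
fit <- glm(cbind(liver, not_liver) ~ class, data = counts, family = binomial())
or_glm <- exp(coef(fit)[["classwith_LOH"]])

print(c(table = or_table, glm = or_glm))
```

Both estimates are approximately 2.45 for these counts, which is why the OR column can be read directly as a ratio of odds between the two genotype classes.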