
Zenodo
Each SPN has a dedicated Zenodo record with three zip archives. Files are downloaded once and cached locally; repeat calls skip the download.
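The download-once behaviour described above can be pictured with a small sketch. This is not the package's implementation: `cached_fetch()` and its `fetch` argument are hypothetical names introduced here to illustrate the pattern of skipping a download when the target file already exists.

```r
# Hypothetical helper (not part of SCOUT): fetch `url` into `dest` only
# when `dest` does not exist yet. `fetch` is a parameter so the caching
# logic can be exercised without network access.
cached_fetch <- function(url, dest,
                         fetch = function(url, dest) {
                           utils::download.file(url, dest, mode = "wb")
                         }) {
  if (file.exists(dest)) {
    return(invisible(dest))  # cache hit: skip the download
  }
  dir.create(dirname(dest), recursive = TRUE, showWarnings = FALSE)
  fetch(url, dest)
  invisible(dest)
}
```

Calling the helper twice with the same `dest` triggers `fetch` only on the first call.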
Cache location
By default data are stored at
~/.cache/SCOUT/<spn>/. Override with:
Sys.setenv(SCOUT_CACHE_DIR = "/scratch/shared/SCOUT")
Downloading archives
get_ground_truth()
Downloads ground_truth.zip, reads every RDS file inside,
and returns a named list — one element per file.
gt <- get_ground_truth("SPN01")
names(gt)
gt$clones
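To make the "one element per file" return shape concrete, here is a minimal sketch of reading every RDS file under a directory into a named list. `read_rds_dir()` is a hypothetical helper, and naming each element after the file's base name without its extension is an assumption; the package may name elements differently.

```r
# Hypothetical helper (not part of SCOUT): read all .rds files found
# under `dir` and return them as a list named after each file.
read_rds_dir <- function(dir) {
  files <- list.files(dir, pattern = "\\.rds$", ignore.case = TRUE,
                      recursive = TRUE, full.names = TRUE)
  out <- lapply(files, readRDS)
  # Assumption: element names are the file names without extension.
  names(out) <- tools::file_path_sans_ext(basename(files))
  out
}
```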
get_sarek_results()
Downloads sarek.zip and returns the
path to the extracted directory.
sarek_dir <- get_sarek_results("SPN01")
list.files(sarek_dir, recursive = TRUE)
get_tumourevo_results()
Downloads tumourevo.zip and returns the
path to the extracted directory.
te_dir <- get_tumourevo_results("SPN01")
list.files(te_dir, recursive = TRUE)
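Since both functions return a directory path, a common next step is narrowing the recursive listing to one file type. The sketch below uses only base R; `find_by_suffix()` is a hypothetical helper, and which suffixes actually occur in the extracted archives is not stated here.

```r
# Hypothetical helper (not part of SCOUT): list files under `dir`
# (recursively) whose path ends with `suffix`, e.g. ".vcf.gz".
find_by_suffix <- function(dir, suffix) {
  files <- list.files(dir, recursive = TRUE, full.names = TRUE)
  files[endsWith(files, suffix)]
}
```

For example, `find_by_suffix(sarek_dir, ".vcf.gz")` would return compressed VCFs but not their `.tbi` indexes.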
list_zenodo_files()
Inspect what is available in a record before downloading anything.
list_zenodo_files("1234567")
#> # A tibble: 3 × 3
#>   filename         size  download_url
#>   <chr>            <int> <chr>
#> 1 ground_truth.zip ...   https://zenodo.org/...
#> 2 sarek.zip        ...   https://zenodo.org/...
#> 3 tumourevo.zip    ...   https://zenodo.org/...
Ground truth getters
Once get_ground_truth() has been called, these functions
let you access specific files without navigating the directory structure
manually.
get_mutations()
Returns the path to the mutations RDS file for a given sample type, coverage and purity.
# Tumour sample
path <- get_mutations("SPN01", type = "tumour", coverage = 100, purity = 0.9)
readRDS(path)
# Matched normal (fixed at 30x, purity 1)
path <- get_mutations("SPN01", type = "normal")
Sarek getters
Once get_sarek_results() has been called, these
functions return named lists of file paths for a given sample, coverage,
purity and caller.
get_sarek_vcf()
Returns VCF and index file paths. Supported callers:
"mutect2", "strelka",
"freebayes", "haplotypecaller".
vcf <- get_sarek_vcf("SPN01", "SPN01_1", 100, 0.9, "mutect2", "tumour")
vcf$vcf
vcf$tbi
# strelka returns separate SNV and indel files
vcf <- get_sarek_vcf("SPN01", "SPN01_1", 100, 0.9, "strelka", "tumour")
vcf$snvs_vcf
vcf$indels_vcf
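Because the element names differ between callers (`vcf` for mutect2, `snvs_vcf` and `indels_vcf` for strelka), downstream code may want a caller-agnostic way to collect the VCF paths. The sketch below is one possible approach; `vcf_paths()` is a hypothetical helper and relies only on the element names shown in the examples above.

```r
# Hypothetical helper (not part of SCOUT): pull the VCF paths out of a
# list returned by get_sarek_vcf(), whichever caller produced it.
# Assumes the element names shown above: vcf, snvs_vcf, indels_vcf.
vcf_paths <- function(x) {
  keep <- names(x) %in% c("vcf", "snvs_vcf", "indels_vcf")
  unlist(x[keep])
}
```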
get_sarek_cna()
Returns CNA file paths. Supported callers: "ascat",
"sequenza", "cnvkit".
# ASCAT
cna <- get_sarek_cna("SPN01", "SPN01_1", 100, 0.9, "ascat")
cna$segments
cna$purityploidy
# Sequenza
cna <- get_sarek_cna("SPN01", "SPN01_1", 100, 0.9, "sequenza")
cna$segments
cna$confints_CP
tumourevo getters
Once get_tumourevo_results() has been called, these
functions return named lists of file paths. All require
spn, coverage, purity,
vcf_caller ("mutect2" or
"strelka"), and cna_caller
("ascat" or "sequenza").
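When wrapping these getters in your own scripts, it can help to validate the caller names up front. The sketch below uses base R's `match.arg()`; `check_callers()` is a hypothetical helper, and the allowed values are taken from the list above.

```r
# Hypothetical helper (not part of SCOUT): validate caller names against
# the values accepted by the tumourevo getters. match.arg() raises an
# error for anything outside the allowed set.
check_callers <- function(vcf_caller = c("mutect2", "strelka"),
                          cna_caller = c("ascat", "sequenza")) {
  list(vcf_caller = match.arg(vcf_caller),
       cna_caller = match.arg(cna_caller))
}
```

Note that `match.arg()` falls back to the first allowed value when an argument is omitted, so pass both callers explicitly if silent defaults are undesirable.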
get_tumourevo_driver()
Driver annotation results for a specific sample.
get_tumourevo_driver("SPN01", 100, 0.9, "mutect2", "ascat", sample = "SPN01_1")
get_tumourevo_subclonal()
Subclonal deconvolution results. Supported tools:
"mobster", "pyclonevi", "ctree",
"viber".
get_tumourevo_subclonal("SPN01", 100, 0.9, "mutect2", "ascat", "mobster", "SPN01_1")
get_tumourevo_subclonal("SPN01", 100, 0.9, "mutect2", "ascat", "pyclonevi", "SPN01_1")
get_tumourevo_qc()
QC results. Supported tools: "cnaqc",
"join_cnaqc", "tinc".
get_tumourevo_qc("SPN01", 100, 0.9, "mutect2", "ascat", "cnaqc", "SPN01_1")
get_tumourevo_signatures()
Signature deconvolution results. Supported tools:
"sigprofiler", "sparsesignatures",
"BASCULE". sigprofiler also requires a
context argument (e.g. "SBS96",
"ID83").
# BASCULE
sigs <- get_tumourevo_signatures("SPN01", 100, 0.9, "mutect2", "ascat", "BASCULE")
sigs$refined_fit
sigs$base_fit
# SigProfiler
sigs <- get_tumourevo_signatures("SPN01", 100, 0.9, "mutect2", "ascat",
                                 "sigprofiler", context = "SBS96")
sigs$COSMIC_exposure
sigs$denovo_signatures
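Since the getters return named lists of file paths, a quick existence check before reading can catch a missing or partially extracted archive. The sketch below is a minimal illustration; `missing_paths()` is a hypothetical helper and assumes each list element is a character path.

```r
# Hypothetical helper (not part of SCOUT): return the entries of a named
# list of paths that do not exist on disk, keeping their names.
missing_paths <- function(paths) {
  p <- unlist(paths)
  if (is.null(p)) {
    return(character(0))  # empty input: nothing can be missing
  }
  p[!file.exists(p)]
}
```

An empty result from `missing_paths(sigs)` would indicate that every listed file is present.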