These data are required:
The formats of the inputs are discussed here.
The (R)CONGAS+ workflow consists of:
The package ships with example input data that illustrate the required input formats, and with ready-to-use objects for the case studies discussed in the main papers; see References for details.
# Example data available in the package
data('multiome_congas_object')

RNA and ATAC have the same input format, reporting:
- chr, from and to: genomic location of a gene (RNA) or a peak (ATAC)
- cell: a cell identifier
- value: the observed counts (discrete)

This tibble is created with the function

Segments have this format:
- chr, from and to: genomic range of the segment
- copies: ploidy of each segment

Please see the RCONGAS object creation vignette to understand how to create an RCONGAS+ object. Once the object has been generated successfully, we can print a summary of its contents:
multiome_congas_object %>% print
#> ✖ Warning ATAC 0-counts cells. 2 cells have no data in any of 22 segments, top 2 with missing data are:
#> ✖ Cell GTCCTAGAGCCAAATC-1-ATAC with 1 0-segments (5%)
#> ✖ Cell TCTCGCCCAAACATAG-1-ATAC with 1 0-segments (5%)
#> ── [ (R)CONGAS+ ] Bimodal lymphoma ─────────────────────────────────────────────
#> 
#> ── CNA segments (reference: GRCh38)
#> → Input 22 CNA segments, mean ploidy 2.
#> 
#>    | | | |  |  |  | |  |  |  |  |  
#> 
#>   Ploidy:    0     1     2     3     4     5     *
#> 
#> ── Modalities
#> → RNA: 500 cells with 9887 mapped genes, 491284 non-zero values. Likelihood: Negative Binomial.
#> → ATAC: 500 cells with 56942 mapped peaks, 1529825 non-zero values. Likelihood: Negative Binomial.
#> ! Clusters: not available.
#> 
#> ──  LOG  ──
#> 
#> - 2024-04-03 17:06:34.686264 Created input object.
#> - 2024-04-03 17:06:38.989073 Filtered segments: [0|50|50]
#> [1] 0
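
To make the input formats described above concrete, the following minimal sketch builds toy tables with the same columns and runs a few sanity checks on them. Everything in it is an assumption for illustration: the coordinates, cell names and counts are made up, the helpers `check_counts()` and `check_segments()` are hypothetical and not part of the package, and plain base R data frames are used where the package works with tibbles.

```r
# Toy RNA counts in long format: one row per (gene, cell) pair.
# All values below are invented for illustration only.
rna_counts <- data.frame(
  chr   = c("chr1", "chr1", "chr2"),
  from  = c(1000L, 1000L, 5000L),
  to    = c(2000L, 2000L, 7000L),
  cell  = c("cell_1", "cell_2", "cell_1"),
  value = c(3L, 0L, 12L),
  stringsAsFactors = FALSE
)

# Toy segments table: one row per segment, with its ploidy in `copies`.
segments <- data.frame(
  chr    = c("chr1", "chr2"),
  from   = c(1L, 1L),
  to     = c(100000L, 100000L),
  copies = c(2L, 3L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: checks the columns and basic constraints of a
# counts table (RNA or ATAC), as described in the text.
check_counts <- function(x) {
  required <- c("chr", "from", "to", "cell", "value")
  stopifnot(all(required %in% colnames(x)))
  stopifnot(all(x$from <= x$to))
  # Counts are discrete and non-negative.
  stopifnot(all(x$value >= 0), all(x$value == round(x$value)))
  invisible(TRUE)
}

# Hypothetical helper: checks the columns and basic constraints of a
# segments table.
check_segments <- function(x) {
  required <- c("chr", "from", "to", "copies")
  stopifnot(all(required %in% colnames(x)))
  stopifnot(all(x$from <= x$to), all(x$copies >= 0))
  invisible(TRUE)
}

check_counts(rna_counts)
check_segments(segments)
```

Both checks stop with an error if a required column is missing or a value is out of range, which can be a convenient guard before building an object from real data.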