These data are required:
The format of the inputs is discussed here.
The (R)CONGAS+ workflow consists of:
The package ships with example input data that illustrate the required input formats, and with ready-to-use objects for the case studies discussed in the main papers; see References for details.
# Example data available in the package
data('multiome_congas_object')
RNA and ATAC share the same input format, a tibble reporting:
- chr, from and to: genomic location of a gene (RNA) or a peak (ATAC)
- cell: a cell identifier
- value: the observed counts (discrete)
This tibble is created with a dedicated function.
Segments have this format:
- chr, from and to: genomic range of the segment
- copies: ploidy of each segment
Please see the RCONGAS object creation vignette to learn how to create an RCONGAS+ object. Once the object has been generated successfully, we can print a summary of its contents:
multiome_congas_object %>% print
#> ✖ Warning ATAC 0-counts cells. 2 cells have no data in any of 22 segments, top 2 with missing data are:
#> ✖ Cell GTCCTAGAGCCAAATC-1-ATAC with 1 0-segments (5%)
#> ✖ Cell TCTCGCCCAAACATAG-1-ATAC with 1 0-segments (5%)
#> ── [ (R)CONGAS+ ] Bimodal lymphoma ─────────────────────────────────────────────
#>
#> ── CNA segments (reference: GRCh38)
#> → Input 22 CNA segments, mean ploidy 2.
#>
#> | | | | | | | | | | | |
#>
#> Ploidy: 0 1 2 3 4 5 *
#>
#> ── Modalities
#> → RNA: 500 cells with 9887 mapped genes, 491284 non-zero values. Likelihood: Negative Binomial.
#> → ATAC: 500 cells with 56942 mapped peaks, 1529825 non-zero values. Likelihood: Negative Binomial.
#> ! Clusters: not available.
#>
#> ── LOG ──
#>
#> - 2024-04-03 17:06:34.686264 Created input object.
#> - 2024-04-03 17:06:38.989073 Filtered segments: [0|50|50]
#> [1] 0
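To make the two input formats described above more concrete, here is a minimal sketch of a count table and a segment table with the listed columns. All values are made up for illustration, the `chr1`-style chromosome naming is an assumption, and `has_columns()` is a hypothetical helper that is not part of the package. Base `data.frame` objects are used to keep the example dependency-free; they can be converted to tibbles with `tibble::as_tibble()` if needed.

```r
# Toy count table in long format: one row per (genomic feature, cell) pair.
# The same layout applies to RNA (genes) and ATAC (peaks).
counts <- data.frame(
  chr   = c("chr1", "chr1", "chr2"),
  from  = c(100000L, 100000L, 500000L),
  to    = c(105000L, 105000L, 512000L),
  cell  = c("cell_1", "cell_2", "cell_1"),
  value = c(3L, 0L, 7L),            # observed discrete counts
  stringsAsFactors = FALSE
)

# Toy segment table: one row per copy-number segment.
segments <- data.frame(
  chr    = c("chr1", "chr2"),
  from   = c(1L, 1L),
  to     = c(248000000L, 242000000L),
  copies = c(2L, 3L),               # ploidy of each segment
  stringsAsFactors = FALSE
)

# Hypothetical helper (not a package function): TRUE if every required
# column name is present in the data frame.
has_columns <- function(df, required) all(required %in% colnames(df))

counts_ok   <- has_columns(counts, c("chr", "from", "to", "cell", "value"))
segments_ok <- has_columns(segments, c("chr", "from", "to", "copies"))
```

A quick column check like this before building the object can catch misnamed columns early; it does not validate coordinates or cell identifiers.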