Analysis workflow

These data are required:

  • single-cell RNA or ATAC sequencing data from independent cells, in the form of counts per cell;
  • genome-wide segmentation with associated per-segment ploidy.

The input formats are described in the Input formats section below.

The (R)CONGAS+ workflow consists of:

  • (optional) applying filters on the input data;
  • creating an (R)CONGAS+ object;
  • (optional) applying filters on the mapped data;
  • fitting a model.
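As an illustration of the first (optional) step, here is a minimal base R sketch of a filter applied to input counts before an object is created. It is not one of the package's own filters: the toy data, the threshold `min_total`, and the rule itself (drop cells with too few total counts) are assumptions made for this example.

```r
# Toy input counts in long format (made-up values, one row per feature and cell).
counts <- data.frame(
  chr   = c("chr1", "chr2", "chr1", "chr2", "chr1"),
  from  = c(1000L, 5000L, 1000L, 5000L, 1000L),
  to    = c(2000L, 6500L, 2000L, 6500L, 2000L),
  cell  = c("cell_A", "cell_A", "cell_B", "cell_B", "cell_C"),
  value = c(30L, 12L, 1L, 0L, 25L),
  stringsAsFactors = FALSE
)

# Total counts observed in each cell.
total_per_cell <- tapply(counts$value, counts$cell, sum)

# Hypothetical threshold: keep cells with at least this many counts overall.
min_total <- 10L
keep_cells <- names(total_per_cell)[total_per_cell >= min_total]

# Filtered input, still in the same long format.
filtered <- counts[counts$cell %in% keep_cells, , drop = FALSE]
```

The filtered table keeps the same columns as the input, so it can be passed on to the next step unchanged.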

The package comes with example input data that illustrate the required input formats, and with ready-to-use objects for the case studies discussed in the main papers; see References for details.

Input formats

# Example data available in the package
data('multiome_congas_object')

RNA and ATAC have the same input format reporting:

  • chr, from and to: genomic locations of a gene (RNA) or a peak (ATAC)
  • cell: a cell identifier
  • value: the observed counts (discrete)

This tibble is created with the function

Segments have this format:

  • chr, from and to: genomic range of the segment
  • copies: ploidy of each segment
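The two formats above can be sketched with toy tables. The example uses base R data frames so that it runs without extra packages (the package itself works with tibbles). All values are made up, and `has_columns` is a hypothetical helper that is not part of (R)CONGAS+.

```r
# Toy counts in the RNA/ATAC format: one row per (gene or peak, cell) pair.
counts <- data.frame(
  chr   = c("chr1", "chr1", "chr2", "chr2"),
  from  = c(1000L, 1000L, 5000L, 5000L),
  to    = c(2000L, 2000L, 6500L, 6500L),
  cell  = c("cell_A", "cell_B", "cell_A", "cell_B"),
  value = c(3L, 0L, 7L, 2L),          # discrete observed counts
  stringsAsFactors = FALSE
)

# Toy segmentation: one row per segment with its ploidy.
segments <- data.frame(
  chr    = c("chr1", "chr2"),
  from   = c(1L, 1L),
  to     = c(100000L, 200000L),
  copies = c(2L, 3L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE if every required column is present.
has_columns <- function(x, required) all(required %in% colnames(x))

counts_ok   <- has_columns(counts, c("chr", "from", "to", "cell", "value"))
segments_ok <- has_columns(segments, c("chr", "from", "to", "copies"))
```

A check of this kind is a cheap way to catch a misnamed column before building an object.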

Creation of a new dataset

Multimodal assay

Please see the (R)CONGAS+ object creation vignette to learn how to create an (R)CONGAS+ object. Once the object has been generated, we can print a summary of its contents:

multiome_congas_object %>% print
#>  Warning ATAC 0-counts cells. 2 cells have no data in any of 22 segments, top 2 with missing data are:
#>  Cell GTCCTAGAGCCAAATC-1-ATAC with 1 0-segments (5%)
#>  Cell TCTCGCCCAAACATAG-1-ATAC with 1 0-segments (5%)
#> ── [ (R)CONGAS+ ] Bimodal lymphoma ─────────────────────────────────────────────
#> 
#> ── CNA segments (reference: GRCh38)
#> → Input 22 CNA segments, mean ploidy 2.
#> 
#>    | | | |  |  |  | |  |  |  |  |  
#> 
#>   Ploidy:    0     1     2     3     4     5     *
#> 
#> ── Modalities
#> → RNA: 500 cells with 9887 mapped genes, 491284 non-zero values. Likelihood: Negative Binomial.
#> → ATAC: 500 cells with 56942 mapped peaks, 1529825 non-zero values. Likelihood: Negative Binomial.
#> ! Clusters: not available.
#> 
#> ──  LOG  ──
#> 
#> - 2024-04-03 17:06:34.686264 Created input object.
#> - 2024-04-03 17:06:38.989073 Filtered segments: [0|50|50]
#> [1] 0
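The warning at the top of the printed output reports, for each cell, the number of segments with no counts (for example, 1 of 22 segments, which rounds to the 5% shown). The following base R sketch shows one way to compute a per-cell statistic of this kind. It is not the package's implementation: the toy data, the helper `segment_of`, and the rule that a feature belongs to the segment fully containing it are assumptions.

```r
# Toy segmentation with three segments (made-up values).
segments <- data.frame(
  chr    = c("chr1", "chr1", "chr2"),
  from   = c(1L, 50001L, 1L),
  to     = c(50000L, 100000L, 80000L),
  copies = c(2L, 3L, 2L),
  stringsAsFactors = FALSE
)

# Toy counts: cell_B has no feature in the second chr1 segment.
counts <- data.frame(
  chr   = c("chr1", "chr1", "chr2", "chr1", "chr2"),
  from  = c(1000L, 60000L, 5000L, 1000L, 5000L),
  to    = c(2000L, 61000L, 6500L, 2000L, 6500L),
  cell  = c("cell_A", "cell_A", "cell_A", "cell_B", "cell_B"),
  value = c(3L, 1L, 7L, 4L, 2L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: index of the segment that fully contains a feature.
segment_of <- function(chr, from, to, segments) {
  hit <- which(segments$chr == chr & segments$from <= from & segments$to >= to)
  if (length(hit) == 0) NA_integer_ else hit[1]
}
counts$segment <- mapply(segment_of, counts$chr, counts$from, counts$to,
                         MoreArgs = list(segments = segments),
                         USE.NAMES = FALSE)

# Fixing the factor levels keeps segments without any feature in the table.
counts$segment <- factor(counts$segment, levels = seq_len(nrow(segments)))

# Total counts per cell and segment, then zero-count segments per cell.
totals <- xtabs(value ~ cell + segment, data = counts)
zero_segments <- rowSums(totals == 0)
zero_fraction <- zero_segments / nrow(segments)
```

In this toy example cell_B has no counts in one of the three segments, while cell_A is covered in all of them.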