Filter per-segment outliers by quantiles.

After mapping counts data to segments, this function can be used to determine quantiles of mapped data, and identify outliers in each segment and modality.

An outlier can then be removed or capped to the median cell value. Removal introduces 0-counts in the data; we suggest checking them with the stat function and, if needed, removing the affected cells with the filter_missing_data function. This matters because an excess of 0-count cells (missing data) will drive the fit to use 0-mean components.

Capping does not introduce any 0-count cells, and is the suggested choice. The capped value is either a count or a z-score, depending on the likelihood of the modality.

In both cases the normalisation factors computed before filtering are no longer adequate, and have to be recomputed. If the modality adopts a Gaussian likelihood this is not a problem, since the factors are set to 1 when the object is created, and remain 1 afterwards. For count-based likelihoods such as the Negative Binomial, the factors are recomputed for all input cells by using the auto_normalisation_factor function.

Therefore, if custom factors have been computed, this function might affect the general signal in the data, and the factors should be handled explicitly by the user.

The function requires and returns an (R)CONGAS+ object.
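
The difference between the two actions can be illustrated with a minimal base R sketch on a plain numeric vector. This is not the package implementation: handle_outliers is a hypothetical helper, the vector stands for the values of one segment across cells, and the median is assumed to be taken over that same vector.

```r
# Hypothetical helper, NOT part of (R)CONGAS+: flags values outside the
# [lower_quantile, upper_quantile] range and either zeroes or caps them.
handle_outliers <- function(values,
                            lower_quantile = 0.01,
                            upper_quantile = 0.99,
                            action = c("cap", "remove")) {
  action <- match.arg(action)

  # Quantile bounds of the observed values (R default, type 7).
  bounds <- quantile(values,
                     probs = c(lower_quantile, upper_quantile),
                     names = FALSE)
  is_outlier <- values < bounds[1] | values > bounds[2]

  if (action == "remove") {
    # Removal: outliers become 0-counts (missing data).
    values[is_outlier] <- 0
  } else {
    # Capping: outliers are replaced by the median (assumption: median of
    # the same vector, computed before any replacement).
    values[is_outlier] <- median(values)
  }
  values
}

# Toy values for one segment across ten cells; 9 and 500 fall outside
# the 10%-90% quantile range used here for illustration.
counts <- c(10, 12, 11, 9, 13, 10, 12, 11, 10, 500)

capped  <- handle_outliers(counts, 0.1, 0.9, action = "cap")
removed <- handle_outliers(counts, 0.1, 0.9, action = "remove")
```

In this toy case removal leaves two 0-count entries, while capping keeps every entry non-zero, which is why capping avoids the follow-up missing-data filter.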

After mapping counts data to segments, this function can be used to determine cells with missing data, and remove them. The function requires and returns an (R)CONGAS+ object.

This filter applies a cutoff to the proportion of missing data in each cell, as reported by the stat function.

If these cells are not removed, during inference missing values are imputed to be 0. This can create an excess of mixture components fitting 0-counts data.
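
The idea of a proportion-based filter can be sketched in base R on a toy cells-by-segments matrix. This is an illustration under stated assumptions, not the package implementation: drop_missing_cells is a hypothetical helper, a 0 entry is treated as missing, and a cell is assumed to be dropped when its fraction of missing segments exceeds the cutoff.

```r
# Hypothetical helper, NOT part of (R)CONGAS+: drops cells (rows) whose
# fraction of 0-count segments (columns) exceeds the given proportion.
drop_missing_cells <- function(counts, proportion = 0.05) {
  # Fraction of segments with 0 counts, computed per cell.
  missing_fraction <- rowMeans(counts == 0)
  counts[missing_fraction <= proportion, , drop = FALSE]
}

# Toy matrix: 3 cells by 4 segments; cell_2 has half of its segments missing.
counts <- matrix(
  c(5, 3, 4, 6,
    0, 0, 2, 1,
    7, 2, 3, 5),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("cell_1", "cell_2", "cell_3"),
                  paste0("segment_", 1:4))
)

kept <- drop_missing_cells(counts, proportion = 0.05)
rownames(kept)
```

Here cell_2 is the only cell removed, since the other two have no 0-count segments.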

filter_missing_data(x, proportion_RNA = 0.05, proportion_ATAC = 0.05)

Arguments

x: An rcongasplus object.
proportion_RNA: The RNA proportion cutoff for a cell to be removed, default 5%.
proportion_ATAC: The ATAC proportion cutoff for a cell to be removed, default 5%.
lower_quantile: The lower quantile, default 1%.
upper_quantile: The upper quantile, default 99%.
action: If "remove", outliers will be set to 0. If "cap", outliers will be capped at the median per-cell counts.

Value

The object x where outliers have been identified and removed or capped according to the parameters.

The object x where 0-counts cells have been removed.

Examples