This function is a general interface for fitting a congas Python model in R. In brief, the model is a joint mixture model over two modalities, currently scATAC-seq and scRNA-seq. For more information about the theoretical foundations of the approach, refer to the vignette. This function performs model selection over the specified numbers of clusters, using a chosen information criterion (IC). The ICs and results for all runs are nevertheless reported in the returned object.
The function expects a list of model hyperparameters. Because the model formulation is quite complex and these hyperparameters are extremely difficult to
set by hand, we suggest generating them with Rcongas::auto_config_run().
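The "generate a template, then adjust it" workflow can be sketched in base R as follows. This is only an illustration: the list below is a stand-in for whatever Rcongas::auto_config_run() returns, and the element names (prior_a, prior_b, n_init) are hypothetical placeholders, not the package's real hyperparameter names.

```r
# Stand-in for a template produced by Rcongas::auto_config_run().
# The element names are hypothetical and used only for illustration.
template <- list(prior_a = 1.0, prior_b = 0.5, n_init = 10)

# Override only the entries that need changing; all others are kept as-is.
custom <- utils::modifyList(template, list(prior_b = 0.25))

str(custom)
```

Editing a generated template this way keeps the list structure intact, which reduces the risk of hard-to-debug errors from a missing or misnamed entry.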
learning_rate = 0.01,
latent_variables = "G",
CUDA = FALSE,
steps = 500,
samples = 1,
parallel = FALSE,
model_selection = "ICL",
temperature = 10,
equal_variance = TRUE,
threshold = learning_rate * 0.1,
patience = 5,
same_mixing = FALSE
rcongasplus object with the input dataset, constructed with
a vector of integers with the number of clusters we want to test
Float (Optional). Default 0.5. Value of the hyperparameter that controls the weight given to RNA and ATAC modalities during the inference. Values closer to 0 give more weight to the ATAC likelihood, while values closer to 1 result in higher weight given to the RNA likelihood.
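As a rough illustration of how such a weight behaves, the sketch below assumes the two modality log-likelihoods are combined as a convex combination. That form is an assumption made for this example, and the function name weighted_loglik and the numeric values are made up; the package's actual objective may differ.

```r
# Toy weighting of two modality log-likelihoods (assumed convex combination).
# `lambda` near 0 favors ATAC; `lambda` near 1 favors RNA.
weighted_loglik <- function(ll_rna, ll_atac, lambda) {
  stopifnot(lambda >= 0, lambda <= 1)
  lambda * ll_rna + (1 - lambda) * ll_atac
}

ll_rna  <- -1200  # made-up RNA log-likelihood
ll_atac <- -800   # made-up ATAC log-likelihood

# Evaluate the combined objective at the two extremes and at the default.
weights  <- c(0, 0.5, 1)
combined <- sapply(weights, function(l) weighted_loglik(ll_rna, ll_atac, l))
print(data.frame(lambda = weights, combined = combined))
```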
a list of model hyperparameters. Because errors caused by incorrectly initialized hyperparameters
are quite hard to troubleshoot, it is highly recommended to use
Rcongas::auto_config_run() to generate
a template and then modify it if needed.
a learning rate for the Adam optimizer
specifies the nature of the latent variable modelling the copy number profile. Currently only "G" is available.
use the GPU for training, if available
number of steps of optimization
Number of times the model is fit for each tested number of clusters
information criterion used for model selection (one of ICL, NLL, BIC, AIC)
Integer. Number of steps to wait before stopping the inference. See
threshold for more details.
boolean that indicates whether to use the same mixing proportions for both RNA and ATAC or use different vectors for the two modalities. Default is FALSE.
Float, default is
learning_rate * 0.1. The threshold that triggers early stopping of the training procedure. When the difference between the parameters at step t and step t+1 stays
lower than this threshold for a number of steps equal to the parameter
patience, the inference is stopped.
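The interaction between threshold and patience can be sketched with a small self-contained function. This is a toy version, not the package's internal code: it tracks a single scalar parameter, assumes the below-threshold steps must be consecutive, and the names should_stop_at and param_trace are hypothetical.

```r
# Toy early-stopping rule: stop once the step-to-step change has been below
# `threshold` for `patience` consecutive steps (consecutiveness is assumed).
should_stop_at <- function(param_trace, threshold, patience) {
  below <- 0L  # current run of steps with a change smaller than `threshold`
  n_steps <- max(length(param_trace) - 1L, 0L)
  for (t in seq_len(n_steps)) {
    delta <- abs(param_trace[t + 1L] - param_trace[t])
    below <- if (delta < threshold) below + 1L else 0L
    if (below >= patience) {
      return(t + 1L)  # index of the step at which inference would stop
    }
  }
  NA_integer_  # the criterion was never met within the trace
}

learning_rate <- 0.01
threshold <- learning_rate * 0.1  # the documented default
param_trace <- 1 / (1:50)^2       # made-up parameter values that settle down
stop_step <- should_stop_at(param_trace, threshold, patience = 5)
print(stop_step)
```

A larger patience makes the rule more conservative, since a single small update is not enough to end training.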
An object of class
rcongasplus with a slot
best_fit with the learned parameters for the selected model in tibble format. A slot
with all the runs performed, ordered by the selected IC, and a slot
model_selection with all the information needed to perform model selection.
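The ordering of runs by the selected IC can be illustrated with a toy table. The numbers are made up, the table layout is hypothetical rather than the package's actual slot structure, and the sketch assumes that a lower IC value indicates a better model.

```r
# Hypothetical IC table: three cluster numbers, two fits each (samples = 2).
runs <- data.frame(
  K   = c(1, 1, 2, 2, 3, 3),
  run = c(1, 2, 1, 2, 1, 2),
  ICL = c(1520.4, 1518.9, 1403.2, 1410.7, 1421.5, 1419.0)
)

model_selection <- "ICL"  # criterion used to rank the runs

# Sort all runs by the chosen criterion (assumption: lower is better).
ordered_runs <- runs[order(runs[[model_selection]]), ]

# The first row plays the role of the selected model.
best <- ordered_runs[1, ]
print(best)
```

Keeping every run in the ordered table, rather than only the winner, makes it possible to check how close the competing cluster numbers were.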