Fits a generalized linear model to gene expression counts from a single-cell RNA-seq (scRNA-seq) count matrix. The returned object stores the inferred parameters, which can later be tested to identify differentially expressed genes.

fit_linear_model(
  input_matrix,
  model_matrix,
  size_factors = TRUE,
  group_matrix = NULL,
  gene_specific_model_tensor = NULL,
  kernel_input = NULL,
  gene_names = NULL,
  cell_names = NULL,
  variance = "VI_Estimate",
  inference_method = "SVI",
  method_specific_args = list()
)

Arguments

input_matrix

Matrix of counts representing gene expression data for individual cells. Each row corresponds to a gene, and each column represents a single cell.

model_matrix

Design matrix describing the relationship between the response variable and the predictor variables in the model. Each row represents a cell and each column represents a predictor variable (e.g. experimental conditions, biological factors, treatment groups, batch effects).
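As a rough illustration of the expected shapes, the sketch below simulates a small count matrix and builds a matching design matrix with base R's model.matrix(). The gene and cell names, the `condition` factor, and the Poisson-simulated counts are made up for the example and are not part of the package.

```r
set.seed(1)

n_genes <- 5
n_cells <- 8

# Simulated counts: genes in rows, cells in columns.
input_matrix <- matrix(
  rpois(n_genes * n_cells, lambda = 4),
  nrow = n_genes,
  ncol = n_cells,
  dimnames = list(paste0("gene", 1:n_genes), paste0("cell", 1:n_cells))
)

# Hypothetical per-cell metadata: a single two-level condition.
cell_info <- data.frame(
  condition = factor(rep(c("control", "treated"), each = n_cells / 2)),
  row.names = colnames(input_matrix)
)

# Design matrix: one row per cell, one column per predictor
# (an intercept plus an indicator for the "treated" level).
model_matrix <- model.matrix(~ condition, data = cell_info)

# The number of design-matrix rows must match the number of cells.
stopifnot(nrow(model_matrix) == ncol(input_matrix))
```

Note the orientation: cells are columns of input_matrix but rows of model_matrix.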

size_factors

Boolean. Whether a scaling factor should be computed for the expression of each cell. Defaults to TRUE.
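To make the idea of a per-cell scaling factor concrete, the sketch below uses one common convention, library-size normalization, on a toy matrix. This is only an assumption for illustration: the page does not state how fit_linear_model computes its size factors, and the actual method may differ.

```r
# Toy counts: 3 genes (rows) by 4 cells (columns), filled column by column.
counts <- matrix(
  c(10, 0, 5,
    20, 2, 8,
     5, 1, 4,
    40, 3, 17),
  nrow = 3
)

# Total counts per cell (library size).
library_size <- colSums(counts)

# Rescale so that the size factors average to 1 across cells.
size_factors <- library_size / mean(library_size)
```

Cells sequenced more deeply get a factor above 1 and shallower cells a factor below 1, so expression can be compared across cells on a common scale.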

group_matrix

.

gene_specific_model_tensor

.

kernel_input

.

gene_names

Vector containing the names of the genes

cell_names

Vector containing the names of the cells

variance

String. Either "VI_Estimate" or "Hessian".

inference_method

String. Inference algorithm, either "SVI" (stochastic variational inference) or "HMC" (Hamiltonian Monte Carlo).

method_specific_args

List containing additional arguments. The available arguments differ between the inference algorithms.

SVI only:

  • optimizer_name: optimizer to use, one of "ClippedAdam", "Adam", and "SGD";

  • steps: number of iterations of the optimization algorithm;

  • lr: learning rate for the optimizer;

  • gamma_lr: parameter controlling the decay of the learning rate when using "ClippedAdam";

  • batch_size: number of observations sampled from the input matrix in each iteration of the optimization algorithm;

  • threshold: threshold used to stop inference early once convergence is reached. The default is 0, i.e. all steps are run;

HMC only:

  • num_samples: number of iterations after the warmup phase, i.e. the number of posterior samples each chain produces;

  • num_chains: number of chains to run;

  • warmup_steps: number of iterations of the warmup phase;

Shared:

  • cuda: Boolean, indicates whether CUDA should be used if available;

  • jit_compile ;

  • full_cov ;

  • theta_bounds ;

  • init_loc ;
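The lists below sketch how method_specific_args might be assembled for each inference method, using only the argument names documented above. All numeric values are illustrative placeholders, not the package defaults or recommended settings.

```r
# Hypothetical settings for inference_method = "SVI".
svi_args <- list(
  optimizer_name = "ClippedAdam",
  steps = 500,
  lr = 0.01,
  gamma_lr = 0.1,
  batch_size = 1000,
  threshold = 0,  # 0 disables early stopping, so all steps are run
  cuda = FALSE
)

# Hypothetical settings for inference_method = "HMC".
hmc_args <- list(
  num_samples = 1000,
  num_chains = 2,
  warmup_steps = 200,
  cuda = FALSE
)
```

Either list would then be passed as method_specific_args together with the matching inference_method value.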

Value

An object of class rdevil.
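A minimal end-to-end sketch of a call follows. It assumes the function is exported from an installed package whose namespace is named rdevil (inferred from the class name above, not confirmed by this page); the simulated data and the steps and lr values are placeholders. The call is guarded so the snippet still runs when the package is absent.

```r
set.seed(42)

# Simulated input: 20 genes (rows) by 30 cells (columns).
counts <- matrix(rpois(20 * 30, lambda = 3), nrow = 20, ncol = 30)

# Two hypothetical groups of 15 cells each.
group <- factor(rep(c("A", "B"), each = 15))
design <- model.matrix(~ group)

fit <- NULL
# Assumption: the package namespace is called "rdevil".
if (requireNamespace("rdevil", quietly = TRUE)) {
  fit <- rdevil::fit_linear_model(
    input_matrix = counts,
    model_matrix = design,
    size_factors = TRUE,
    gene_names = paste0("gene", seq_len(nrow(counts))),
    cell_names = paste0("cell", seq_len(ncol(counts))),
    variance = "VI_Estimate",
    inference_method = "SVI",
    method_specific_args = list(steps = 200, lr = 0.01)
  )
  # According to the Value section, this should include "rdevil".
  print(class(fit))
}
```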