fit_devil fits the model by estimating beta coefficients, dispersion parameters, and beta sigma. Predictor variables are supplied in design_matrix and the response variable in input_matrix. Overdispersion estimation and size factor computation are optional. The function supports parallel processing and exposes settings such as the maximum number of iterations and the convergence tolerance.

fit_devil(
  input_matrix,
  design_matrix,
  overdispersion = TRUE,
  offset = 0,
  size_factors = TRUE,
  verbose = FALSE,
  max_iter = 200,
  tolerance = 0.001,
  eps = 1e-06,
  CUDA = FALSE,
  batch_size = 1024L,
  parallel.cores = NULL
)

Arguments

input_matrix

A numeric matrix representing the response variable, with rows corresponding to genes and columns to samples.

design_matrix

A numeric matrix representing the predictor variables, with rows corresponding to samples and columns to predictors.

overdispersion

Logical value indicating whether to estimate the overdispersion parameter. (default is TRUE)

offset

A numeric vector to be included as an offset in the model. (default is 0)

size_factors

Logical value indicating whether to compute size factors for normalization. (default is TRUE)

verbose

Logical value indicating whether to display progress messages during execution. (default is FALSE)

max_iter

Integer specifying the maximum number of iterations allowed for the optimization process. (default is 200)

tolerance

Numeric value indicating the tolerance level for the convergence criterion. (default is 1e-3)

eps

A small numeric value added to input_matrix to avoid issues with non-invertible matrices. (default is 1e-6)

CUDA

Logical value indicating whether to use the GPU version of the code. (default is FALSE)

batch_size

Integer specifying the number of genes that will be fit in each batch if CUDA = TRUE. (default is 1024)

parallel.cores

Integer specifying the number of CPU cores to use for parallelization. If NULL, the maximum number of available cores is used. (default is NULL)
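
The orientation of the two matrices is easy to get wrong: input_matrix is genes by samples, while design_matrix is samples by predictors. The sketch below builds both from simulated data using base R only and checks that they line up. The sample annotation (condition), the matrix sizes, and the simulated counts are all made up for illustration.

```r
set.seed(1)
n_genes <- 50
n_samples <- 12

# Hypothetical sample annotation with one two-level factor
condition <- factor(rep(c("control", "treated"), each = n_samples / 2))

# Samples x predictors: an intercept plus an indicator for "treated"
design_matrix <- model.matrix(~ condition)

# Genes x samples: simulated negative binomial counts (illustrative only)
input_matrix <- matrix(
  rnbinom(n_genes * n_samples, mu = 10, size = 2),
  nrow = n_genes,
  ncol = n_samples
)

# Columns of input_matrix (samples) must match rows of design_matrix (samples)
stopifnot(ncol(input_matrix) == nrow(design_matrix))
```

Building the design with model.matrix keeps the row order tied to the sample annotation, so the columns of input_matrix should be arranged in that same sample order.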

Value

A list containing the following elements:

beta

A matrix of fitted beta coefficients for each gene.

overdispersion

A numeric vector of overdispersion parameters for each gene (if estimated).

iterations

A numeric vector indicating the number of iterations taken for each gene.

size_factors

A numeric vector of size factors used for normalization.

offset_matrix

A numeric matrix of offset values used in the model.

design_matrix

The design matrix provided as input.

input_matrix

The input matrix used after processing.

input_parameters

A list of input parameters used in the function, including max_iter, tolerance, and parallel.cores.
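
Because the result records both the per-gene iteration counts and the max_iter setting, the two can be compared to flag genes that used up the iteration budget. The sketch below does this on a mock list that only mimics the documented element names; the values are invented, and treating "reached max_iter" as a sign of possible non-convergence is an assumption, not documented behaviour.

```r
# Mock object with the documented element names (values are made up)
fit <- list(
  overdispersion = c(0.1, 0.5, 0.2),
  iterations = c(12, 200, 35),
  input_parameters = list(max_iter = 200, tolerance = 1e-3, parallel.cores = 1)
)

# Genes whose iteration count reached the configured limit
hit_limit <- fit$iterations >= fit$input_parameters$max_iter

# Indices of genes worth a closer look
which(hit_limit)
```

With a real fit, the same comparison could be used to decide whether to rerun with a larger max_iter.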

Details

This function fits model parameters, including beta coefficients, the dispersion parameter, and beta sigma, using the provided predictor variables (design_matrix) and response variable (input_matrix). It optionally estimates overdispersion based on the fitted model.
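
A minimal call might look like the sketch below. It assumes fit_devil is exported by a package named devil, which this page does not state, so the call is guarded with requireNamespace and is skipped when that package is not installed. The data are simulated, and the argument values simply repeat the defaults from the usage above, except parallel.cores = 1, which is chosen here to keep the run single-core.

```r
set.seed(42)

# Simulated inputs: 20 genes, 8 samples, two hypothetical groups
condition <- factor(rep(c("A", "B"), each = 4))
design_matrix <- model.matrix(~ condition)
input_matrix <- matrix(rnbinom(20 * 8, mu = 20, size = 5), nrow = 20)

fit <- NULL
# Assumption: the function lives in a package called "devil"
if (requireNamespace("devil", quietly = TRUE)) {
  fit <- devil::fit_devil(
    input_matrix,
    design_matrix,
    overdispersion = TRUE,
    size_factors = TRUE,
    max_iter = 200,
    tolerance = 1e-3,
    parallel.cores = 1
  )

  # Inspect the documented elements of the returned list
  str(fit$beta)
  summary(fit$iterations)
}
```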