A quick way to annotate input mutations and flag potential driver mutations. This function computes the genomic location of each mutation and, for substitutions mapped to coding regions, their consequences, using VariantAnnotation and other Bioconductor packages. Putative drivers are then annotated by matching against an input list that, by default, is compiled from the IntOGen database. Drivers are selected among coding substitutions with a known effect.

annotate_variants(
  x,
  drivers = CNAqc::intogen_drivers,
  make_0_span = FALSE,
  collapse = TRUE
)

Arguments

x

A CNAqc object.

drivers

A data frame in the same format as the `intogen_drivers` one released with CNAqc. In particular, it must contain a column named `gene` to identify gene names.

make_0_span

If `TRUE`, subtract 1 from the `to` column. Use this when SNPs are encoded with `from` and `to = from + 1`, so that each one becomes effectively a point mutation.

collapse

If `TRUE`, when the same mutation has more than one consequence or location across different transcripts, they are collapsed by concatenating them with `:`.
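
The coordinate adjustment and the collapsing step described for `make_0_span` and `collapse` can be sketched in base R on a toy table. This is only an illustration under stated assumptions, not the CNAqc implementation: the table layout, the helper `collapse_values`, and the choice to drop missing values and duplicates before concatenating are all hypothetical.

```r
# Toy annotation table: one row per (mutation, transcript) hit.
# Column names are illustrative and not taken from CNAqc internals.
hits <- data.frame(
  chr = c("chr1", "chr1", "chr2"),
  from = c(100, 100, 500),
  to = c(101, 101, 501),
  location = c("coding", "intron", "coding"),
  consequence = c("nonsynonymous", NA, "synonymous"),
  stringsAsFactors = FALSE
)

# make_0_span-style adjustment (assumed reading of the argument):
# turn `to = from + 1` into `to = from`.
hits$to <- hits$to - 1

# Hypothetical helper: join the distinct non-missing values with ":".
collapse_values <- function(v) {
  v <- unique(v[!is.na(v)])
  if (length(v) == 0) NA_character_ else paste(v, collapse = ":")
}

# Group rows by mutation and keep one collapsed row per group.
key <- paste(hits$chr, hits$from, hits$to, sep = "_")
collapsed <- do.call(rbind, lapply(split(hits, key), function(d) {
  data.frame(
    chr = d$chr[1], from = d$from[1], to = d$to[1],
    location = collapse_values(d$location),
    consequence = collapse_values(d$consequence),
    stringsAsFactors = FALSE
  )
}))
rownames(collapsed) <- NULL

print(collapsed)
```

In this toy input the first mutation hits two transcripts, so its `location` becomes `coding:intron` while the second mutation is left unchanged.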

Value

A CNAqc object with variants annotated. For each variant this object contains:

- `location` reporting the position of the variant in the genome (`coding`, `intron`, `threeUTR`, ...),
- `consequence` with the consequence of coding mutations (`synonymous`, `nonsynonymous`, ...),
- `is_driver` a boolean that indicates if the gene is a driver,
- `gene_symbol` for the annotated corresponding gene symbol (if the variant is in a gene),
- `driver_label` with the driver label written as `gene_refAA->varAA` (`NA` in case `is_driver = FALSE`).
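
To make the relationship between `is_driver`, `gene_symbol` and `driver_label` concrete, here is a minimal base R sketch on toy data. It assumes a driver table with a `gene` column, as required by the `drivers` argument. The `refAA` and `varAA` columns and the exact matching rule are hypothetical and may differ from what CNAqc does internally.

```r
# Toy annotated variants; refAA/varAA are hypothetical column names.
annotated <- data.frame(
  gene_symbol = c("TP53", "KRAS", "GENE_X", NA),
  location = c("coding", "coding", "coding", "intron"),
  consequence = c("nonsynonymous", "nonsynonymous", "nonsynonymous", NA),
  refAA = c("R", "G", "A", NA),
  varAA = c("H", "D", "V", NA),
  stringsAsFactors = FALSE
)

# Toy driver list with the required `gene` column.
drivers <- data.frame(gene = c("TP53", "KRAS"), stringsAsFactors = FALSE)

# Assumed rule: a driver is a coding variant with a known consequence
# whose gene symbol appears in the driver list.
annotated$is_driver <- !is.na(annotated$gene_symbol) &
  annotated$gene_symbol %in% drivers$gene &
  annotated$location == "coding" &
  !is.na(annotated$consequence)

# Label drivers as gene_refAA->varAA, and use NA for everything else.
annotated$driver_label <- ifelse(
  annotated$is_driver,
  paste0(annotated$gene_symbol, "_", annotated$refAA, "->", annotated$varAA),
  NA_character_
)

print(annotated[, c("gene_symbol", "is_driver", "driver_label")])
```

Only the variants in genes listed in `drivers$gene` receive a label; the remaining rows keep `driver_label = NA`.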

The annotation process is based on the package VariantAnnotation.

Examples


if (FALSE) { # \dontrun{
library(CNAqc)

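# Load the example dataset shipped with CNAqc
data('example_dataset_CNAqc', package = 'CNAqc')

# Extract the mutation calls to annotate
mutations <- example_dataset_CNAqc$mutations

# Annotate using the default driver list (CNAqc::intogen_drivers)
mutations_annotated <- annotate_variants(mutations)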
} # }