
This class represents the phylogenetic forest of the cells sampled during the computation.

Details

The leaves of this forest are the sampled cells. This class is analogous to the class SamplesForest, but each node is labelled with the mutations occurring for the first time in the cell represented by the node itself. Moreover, each leaf is also associated with the genome mutations occurring in the corresponding cell.

Fields

get_coalescent_cells

Retrieve most recent common ancestors

  • Parameter: cell_ids - The list of the identifiers of the cells whose most recent common ancestors are sought (optional).

  • Return: A dataframe reporting, for each of the identified cells, the cell identifier (column cell_id); the ancestor identifier (column ancestor), whenever the node is not a root; the name of the sample containing the node (column sample), whenever the node was sampled, i.e., it is one of the forest leaves; the mutant (column mutant); the epistate (column epistate); and the birth time (column birth_time).
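
To make the notion of most recent common ancestor concrete, here is a minimal Python sketch. It is not the package's implementation: the `parent` dictionary and both function names are hypothetical, and the sketch assumes the forest is stored as a map from each cell identifier to its ancestor identifier, with `None` for roots. A cell counts as its own ancestor here, and two cells in different trees of the forest have no common ancestor.

```python
def ancestors_path(parent, cell_id):
    """Return the path from cell_id up to its root, starting with cell_id."""
    path = [cell_id]
    while parent.get(path[-1]) is not None:
        path.append(parent[path[-1]])
    return path


def most_recent_common_ancestor(parent, cell_a, cell_b):
    """Return the deepest node lying on both root paths, or None when the
    two cells belong to different trees of the forest."""
    seen = set(ancestors_path(parent, cell_a))
    # Walk upwards from cell_b: the first node already seen is the deepest one.
    for node in ancestors_path(parent, cell_b):
        if node in seen:
            return node
    return None


# Hypothetical forest with two trees, rooted at cells 0 and 10.
parent = {0: None, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2, 10: None, 11: 10}

mrca_3_4 = most_recent_common_ancestor(parent, 3, 4)    # siblings
mrca_3_5 = most_recent_common_ancestor(parent, 3, 5)    # cousins
mrca_3_11 = most_recent_common_ancestor(parent, 3, 11)  # different trees
```

The ancestor column described above carries the same information as `parent`, so a table with that shape could be turned into such a dictionary before running the search.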

get_first_occurrences

Gets the identifier of the cell in which a mutation occurs for the first time

  • Parameter: mutation - A mutation being an SNV, an indel, or a CNA.

  • Return: The identifier of the cell in which a mutation occurs for the first time.
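
Because each node is labelled with the mutations that occur for the first time in its cell, this lookup amounts to inverting the node labels. The Python sketch below illustrates the idea only. The `node_labels` data, the tuple encoding of a mutation, and the function name are all hypothetical, and the sketch keeps a list of cells per mutation on the assumption that the same mutation may arise independently in more than one lineage.

```python
from collections import defaultdict

# Hypothetical node labels: cell identifier -> mutations first seen in that cell.
# A mutation is encoded here as (chromosome, position, ref, alt).
node_labels = {
    0: [("1", 1000, "A", "T")],
    1: [("2", 500, "G", "C")],
    2: [("2", 500, "G", "C"), ("X", 42, "C", "A")],
}


def build_first_occurrence_index(node_labels):
    """Invert the labels: mutation -> sorted list of cells where it first occurs."""
    index = defaultdict(list)
    for cell_id in sorted(node_labels):
        for mutation in node_labels[cell_id]:
            index[mutation].append(cell_id)
    return dict(index)


index = build_first_occurrence_index(node_labels)

# Cells in which the G>C mutation at chr2:500 appears for the first time.
first_cells = index.get(("2", 500, "G", "C"), [])
```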

get_germline_mutations

Gets the germline SNVs and indels

  • Return: A dataframe reporting chr (i.e., the chromosome), chr_pos (i.e., the position in the chromosome), allele (in which the mutation occurs), ref, alt, type (i.e., either "SNV" or "indel") and class (i.e., "germinal").

get_germline_subject

Gets the germline subject name

  • Return: The name of the subject whose germline is used.

get_nodes

Get the forest nodes

  • Return: A dataframe reporting, for each node in the forest, the node identifier (column id); the ancestor identifier (column ancestor), whenever the node is not a root; the name of the sample containing the node (column sample), whenever the node was sampled, i.e., it is one of the forest leaves; the mutant (column mutant); the epistate (column epistate); and the birth time (column birth_time).

get_sampled_cell_CNAs

Gets the CNAs of the sampled cells

  • Returns: A dataframe reporting cell_id, type ("A" for amplifications and "D" for deletions), chr, begin (i.e., the first CNA locus in the chromosome), end (i.e., the last CNA locus in the chromosome), allele, src allele (the allele origin for amplifications, NA for deletions), and class (i.e., "driver", "passenger", "germinal" or "preneoplastic").

get_sampled_cell_mutations

Gets the SNVs and the indels of the sampled cells

  • Returns: A dataframe reporting cell_id, chr (i.e., the mutation chromosome), begin (i.e., the position in the chromosome), allele (in which the mutation occurs), ref, alt, type (i.e., either "SNV" or "indel"), cause, and class (i.e., "driver", "passenger", "germinal" or "preneoplastic") for each mutation in the sampled cell genomes.

get_samples_info

Retrieve information about the samples

  • Returns: A dataframe containing, for each sample collected during the simulation, the columns "name", "time", "ymin", "xmin", "ymax", "xmax", "tumour_cells", and "tumour_cells_in_bbox". The columns "ymin", "xmin", "ymax", "xmax" report the boundaries of the sample bounding box, while "tumour_cells" and "tumour_cells_in_bbox" are the number of tumour cells in the sample and in the bounding box, respectively.

get_species_info

Gets the species data

  • Returns: A dataframe reporting mutant and epistate for each registered species.

get_sticks

Compute the forest sticks

  • Returns: The list of the forest sticks. Each stick is represented as the list of the cell identifiers labelling the nodes in the stick, ordered from the highest to the deepest in the forest.

get_subforest_for

Build a subforest using as leaves some of the original samples

  • Parameter: sample_names - The names of the samples whose cells will be used as leaves of the new forest.

  • Returns: A samples forest built on the samples mentioned in sample_names.

get_absolute_chromosome_positions

Get the absolute chromosome positions

  • Returns: A dataframe reporting the name (column "chr"), the length (column "length"), the initial absolute position (column "from"), and the final absolute position (column "to") of each chromosome.
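
One way to read the "from" and "to" columns is as offsets on a single axis obtained by laying the chromosomes end to end. The Python sketch below works under that assumption, with 1-based inclusive positions. The function names and the chromosome lengths are hypothetical and are not taken from the package.

```python
def absolute_chromosome_positions(chromosome_lengths):
    """Build one row per chromosome with its absolute start and end positions.

    chromosome_lengths is an ordered list of (name, length) pairs.
    Assumption: positions are 1-based and chromosomes are concatenated in order.
    """
    rows = []
    offset = 0
    for name, length in chromosome_lengths:
        rows.append({"chr": name, "length": length,
                     "from": offset + 1, "to": offset + length})
        offset += length
    return rows


def to_absolute(rows, chr_name, chr_pos):
    """Convert a 1-based position within a chromosome to an absolute position."""
    for row in rows:
        if row["chr"] == chr_name:
            return row["from"] + chr_pos - 1
    raise KeyError(chr_name)


# Toy lengths, chosen only to keep the arithmetic easy to follow.
rows = absolute_chromosome_positions([("1", 100), ("2", 50), ("X", 30)])
abs_pos = to_absolute(rows, "2", 1)
```

With such a table, per-chromosome coordinates can be mapped to a shared axis, which is convenient when plotting mutations or CNAs across the whole genome.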

save

Save a phylogenetic forest in a file

  • Parameter: filename - The path of the file in which the phylogenetic forest must be saved.