This class represents the phylogenetic forest of the cells sampled during the computation.
Details
The leaves of this forest are the sampled cells. This class is analogous to the class SamplesForest, but each node is labelled with the mutations occurring for the first time in the cell represented by the node itself. Moreover, each leaf is also associated with the genome mutations occurring in the corresponding cell.
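The relation between node labels and leaf genomes can be sketched in Python on a toy forest. This is not the package API: the dictionaries `ancestor` and `new_mutations` and the function `leaf_mutations` are hypothetical, and the sketch assumes that a mutation, once acquired, is never lost later (for instance through a deletion).

```python
# Toy forest: each cell id maps to its ancestor id (None for a root).
ancestor = {0: None, 1: 0, 2: 0, 3: 1, 4: 1}

# Hypothetical node labels: mutations occurring for the first time in each cell.
new_mutations = {0: {"m0"}, 1: {"m1"}, 2: {"m2"}, 3: set(), 4: {"m3", "m4"}}


def leaf_mutations(cell_id):
    """Collect the labels met while walking from a cell up to its root."""
    collected = set()
    while cell_id is not None:
        collected |= new_mutations[cell_id]
        cell_id = ancestor[cell_id]
    return collected


print(sorted(leaf_mutations(4)))  # ['m0', 'm1', 'm3', 'm4']
```

Under that assumption, the mutations of a leaf are the union of the labels on the path from the leaf to its root.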
Fields
get_coalescent_cells
Retrieve most recent common ancestors
Parameter:
cell_ids - The list of identifiers of the cells whose most recent common ancestors are sought (optional).
Return: A dataframe reporting, for each of the identified cells, the identifier (column cell_id); the ancestor identifier (column ancestor), whenever the node is not a root; the name of the sample containing the node (column sample), whenever the node was sampled, i.e., it is one of the forest leaves; the mutant (column mutant); the epistate (column epistate); and the birth time (column birth_time).
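As an illustration of what a most recent common ancestor is, the following Python sketch finds it on a toy table that maps each cell identifier to its ancestor. The names `path_to_root` and `most_recent_common_ancestor` are hypothetical and do not belong to the package. The sketch returns a single ancestor for a set of cells; how the method itself selects the cells it reports is not specified here.

```python
def path_to_root(ancestor, cell_id):
    """Return the list of cell ids from cell_id up to its root."""
    path = []
    while cell_id is not None:
        path.append(cell_id)
        cell_id = ancestor[cell_id]
    return path


def most_recent_common_ancestor(ancestor, cell_ids):
    """Return the deepest cell that is an ancestor of (or equal to) every
    cell in cell_ids, or None when the cells lie in different trees."""
    first, *others = cell_ids
    common = path_to_root(ancestor, first)
    for cell_id in others:
        on_path = set(path_to_root(ancestor, cell_id))
        # Keep only the candidates shared with this cell; the order is
        # preserved, so the first element stays the deepest shared one.
        common = [c for c in common if c in on_path]
    return common[0] if common else None


# Toy forest with two trees rooted at 0 and 5.
ancestor = {0: None, 1: 0, 2: 0, 3: 1, 4: 1, 5: None, 6: 5}
print(most_recent_common_ancestor(ancestor, [3, 4]))  # 1
print(most_recent_common_ancestor(ancestor, [3, 2]))  # 0
print(most_recent_common_ancestor(ancestor, [3, 6]))  # None
```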
get_first_occurrences
Gets the identifier of the cell in which a mutation occurs for the first time
Parameter:
mutation - A mutation: an SNV, an indel, or a CNA.
Return: The identifier of the cell in which the mutation occurs for the first time.
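Since each node is labelled with the mutations that first occur in its cell, a first-occurrence lookup amounts to inverting that labelling. The Python sketch below shows the idea on toy data. The tuple encoding of a mutation and the names `new_mutations` and `first_occurrence` are assumptions made for the example, and each mutation is assumed to arise only once in the forest.

```python
# Hypothetical node labels: cell id -> mutations first occurring in that cell.
# Mutations are written here as (chromosome, position, ref, alt) tuples.
new_mutations = {
    0: [("1", 1050, "A", "T")],
    1: [("2", 300, "G", "C"), ("2", 875, "T", "TA")],
    2: [],
}

# Invert the labelling: mutation -> id of the cell where it first occurs.
first_occurrence = {
    mutation: cell_id
    for cell_id, mutations in new_mutations.items()
    for mutation in mutations
}

print(first_occurrence[("2", 875, "T", "TA")])  # 1
```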
get_germline_mutations
Gets the germinal SNVs and indels
Return: A dataframe reporting chr (i.e., the chromosome), chr_pos (i.e., the position in the chromosome), allele (in which the mutation occurs), ref, alt, type (i.e., either "SNV" or "indel"), and class (i.e., "germinal").
get_germline_subject
Gets the germline subject name
Return: The name of the subject whose germline is used.
get_nodes
Get the forest nodes
Return: A dataframe reporting, for each node in the forest, the identifier (column id); the ancestor identifier (column ancestor), whenever the node is not a root; the name of the sample containing the node (column sample), whenever the node was sampled, i.e., it is one of the forest leaves; the mutant (column mutant); the epistate (column epistate); and the birth time (column birth_time).
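The node table is enough to recover the shape of the forest. The Python sketch below works on a hand-written table with the columns listed above. The row values are toy data, and `None` stands for a missing entry, which is an assumption about how absent ancestors and samples are encoded.

```python
# Rows mimic the columns described above; None stands for a missing value.
nodes = [
    {"id": 0, "ancestor": None, "sample": None, "birth_time": 0.0},
    {"id": 1, "ancestor": 0, "sample": None, "birth_time": 1.5},
    {"id": 2, "ancestor": 1, "sample": "S1", "birth_time": 2.1},
    {"id": 3, "ancestor": 1, "sample": "S1", "birth_time": 2.4},
]

# Roots have no ancestor; leaves are the nodes that belong to a sample.
roots = [n["id"] for n in nodes if n["ancestor"] is None]
leaves = [n["id"] for n in nodes if n["sample"] is not None]

# Group the node ids by ancestor to obtain the children of each node.
children = {}
for n in nodes:
    if n["ancestor"] is not None:
        children.setdefault(n["ancestor"], []).append(n["id"])

print(roots)     # [0]
print(leaves)    # [2, 3]
print(children)  # {0: [1], 1: [2, 3]}
```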
get_sampled_cell_CNAs
Gets the CNAs of the sampled cells
Returns: A dataframe reporting cell_id, type ("A" for amplifications and "D" for deletions), chr, begin (i.e., the first CNA locus in the chromosome), end (i.e., the last CNA locus in the chromosome), allele, src allele (the allele origin for amplifications, NA for deletions), and class (i.e., "driver", "passenger", "germinal", or "preneoplastic").
get_sampled_cell_mutations
Gets the SNVs and the indels of the sampled cells
Returns: A dataframe reporting cell_id, chr (i.e., the mutation chromosome), begin (i.e., the position in the chromosome), allele (in which the mutation occurs), ref, alt, type (i.e., either "SNV" or "indel"), cause, and class (i.e., "driver", "passenger", "germinal", or "preneoplastic") for each mutation in the sampled cell genomes.
get_samples_info
Retrieve information about the samples
Returns: A dataframe containing, for each sample collected during the simulation, the columns "name", "time", "ymin", "xmin", "ymax", "xmax", "tumour_cells", and "tumour_cells_in_bbox". The columns "ymin", "xmin", "ymax", and "xmax" report the boundaries of the sample bounding box, while "tumour_cells" and "tumour_cells_in_bbox" are the numbers of tumour cells in the sample and in the bounding box, respectively.
get_species_info
Gets the species data
Returns: A dataframe reporting mutant and epistate for each registered species.
get_sticks
Compute the forest sticks
Returns: The list of the forest sticks. Each stick is represented as the list of cell identifiers labelling the nodes in the stick, ordered from the node closest to the root to the deepest one.
get_subforest_for
Build a subforest using as leaves some of the original samples
Parameter:
sample_names - The names of the samples whose cells will be used as leaves of the new forest.
Returns: A samples forest built on the samples mentioned in sample_names.
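One way to picture a subforest is to keep the leaves of the chosen samples together with all their ancestors. The Python sketch below does this on a toy node table. The function `subforest` and the row layout are hypothetical, and whether the method applies any further simplification to the result is not stated here.

```python
def subforest(nodes, sample_names):
    """Keep the leaves sampled in sample_names and all their ancestors."""
    ancestor = {n["id"]: n["ancestor"] for n in nodes}
    keep = set()
    for n in nodes:
        if n["sample"] in sample_names:
            cell_id = n["id"]
            # Climb towards the root, stopping early on already kept cells.
            while cell_id is not None and cell_id not in keep:
                keep.add(cell_id)
                cell_id = ancestor[cell_id]
    return [n for n in nodes if n["id"] in keep]


# Toy forest: one root, two internal nodes, three sampled leaves.
nodes = [
    {"id": 0, "ancestor": None, "sample": None},
    {"id": 1, "ancestor": 0, "sample": None},
    {"id": 2, "ancestor": 0, "sample": None},
    {"id": 3, "ancestor": 1, "sample": "S1"},
    {"id": 4, "ancestor": 1, "sample": "S2"},
    {"id": 5, "ancestor": 2, "sample": "S2"},
]

print([n["id"] for n in subforest(nodes, {"S1"})])  # [0, 1, 3]
print([n["id"] for n in subforest(nodes, {"S2"})])  # [0, 1, 2, 4, 5]
```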
get_absolute_chromosome_positions
Get the absolute chromosome positions
Returns: A dataframe reporting the name (column "chr"), the length (column "length"), the initial absolute position (column "from"), and the final absolute position (column "to") of each chromosome.
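A plausible reading of the absolute positions is that the chromosomes are laid end to end on a single axis. The Python sketch below computes the "from" and "to" columns from the lengths under that reading. The 1-based, inclusive coordinates and the function name `absolute_positions` are assumptions, not something stated by the package.

```python
def absolute_positions(chromosomes):
    """chromosomes: list of (name, length) pairs, in genome order.
    Returns one row per chromosome with chr, length, from, and to."""
    rows = []
    start = 1  # assumed 1-based coordinate of the first base
    for name, length in chromosomes:
        rows.append({
            "chr": name,
            "length": length,
            "from": start,
            "to": start + length - 1,  # inclusive end, by assumption
        })
        start += length
    return rows


rows = absolute_positions([("1", 1000), ("2", 800), ("X", 500)])
for row in rows:
    print(row["chr"], row["from"], row["to"])
# 1 1 1000
# 2 1001 1800
# X 1801 2300
```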
save
Save a phylogenetic forest in a file
Parameter:
filename - The path of the file in which the phylogenetic forest must be saved.