Filter observed counts by quantile.

filter_counts_by_quantile(x, upper_quantile = 0.98)

Arguments

x

An input RNA/ATAC dataset whose entries are indexable by genomic coordinate: "chr", "from" and "to".

upper_quantile

The upper quantile used as the cut-off. Entries whose value is above this quantile are removed.

Value

The input data with the entries above the quantile removed.
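
To illustrate the filtering rule described above, here is a minimal base R sketch. It is not the package's implementation. The helper name sketch_filter_by_quantile and the toy data are hypothetical. The choice to compute one cut-off per genomic segment ("chr", "from", "to") is an assumption, suggested only by the q_max column varying from row to row in the printed examples.

```r
# Hypothetical sketch of quantile-based filtering; not the package's own code.
sketch_filter_by_quantile <- function(x, upper_quantile = 0.98) {
  # Assumption: one cut-off is computed per genomic segment (chr, from, to).
  segment <- interaction(x$chr, x$from, x$to, drop = TRUE)

  # For every row, ave() returns the quantile of the values in its own segment.
  q_max <- ave(x$value, segment,
               FUN = function(v) quantile(v, probs = upper_quantile))

  # Keep rows at or below the cut-off; rows above it are dropped.
  x[x$value <= q_max, , drop = FALSE]
}

# Toy input with two segments and five cells each (made-up values).
toy <- data.frame(
  chr   = "chr1",
  from  = rep(c(100L, 500L), each = 5),
  to    = rep(c(200L, 600L), each = 5),
  cell  = rep(paste0("cell_", 1:5), times = 2),
  value = c(1, 1, 2, 1, 50, 3, 3, 4, 3, 3)
)

filtered <- sketch_filter_by_quantile(toy, upper_quantile = 0.8)
```

In this toy input the largest value of each segment (50 and 4) lies above its segment's 0.8 quantile, so `filtered` keeps the remaining eight rows.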

Examples

data('example_input')
filter_counts_by_quantile(example_input$rna, upper_quantile = .98)
#> ── Upper quantile 0.98 
#>  n = 5961 entries to remove
#> 
#> # A tibble: 5,961 × 8
#>    gene    chr        from        to cell                      value q_max del  
#>    <chr>   <chr>     <int>     <int> <chr>                     <int> <dbl> <lgl>
#>  1 ACAP3   chr1    1292390   1309609 bcc.su008.pre.tumor_AAGG…     2  1.88 TRUE 
#>  2 ZBTB40  chr1   22428838  22531157 bcc.su008.pre.tumor_AAGG…     2  1.8  TRUE 
#>  3 SRRM1   chr1   24631716  24673281 bcc.su008.pre.tumor_AAGG…    12 11.8  TRUE 
#>  4 TXNDC12 chr1   52020131  52055191 bcc.su008.pre.tumor_AAGG…     6  5.42 TRUE 
#>  5 ALG6    chr1   63367575  63438553 bcc.su008.pre.tumor_AAGG…     3  2.66 TRUE 
#>  6 SSX2IP  chr1   84643706  84690803 bcc.su008.pre.tumor_AAGG…     3  2.78 TRUE 
#>  7 MTF2    chr1   93079235  93139079 bcc.su008.pre.tumor_AAGG…     4  3.34 TRUE 
#>  8 GPSM2   chr1  108875350 108934545 bcc.su008.pre.tumor_AAGG…     8  6.5  TRUE 
#>  9 CELSR2  chr1  109249539 109275751 bcc.su008.pre.tumor_AAGG…     4  3.32 TRUE 
#> 10 RBM15   chr1  110338506 110346681 bcc.su008.pre.tumor_AAGG…     5  4.16 TRUE 
#> # ℹ 5,951 more rows
#> # A tibble: 195,498 × 6
#>    gene     chr      from      to cell                                 value
#>    <chr>    <chr>   <int>   <int> <chr>                                <int>
#>  1 NOC2L    chr1   944203  959309 bcc.su008.pre.tumor_AAGGCAGTCACCGTAA     2
#>  2 AGRN     chr1  1020120 1056118 bcc.su008.pre.tumor_AAGGCAGTCACCGTAA     1
#>  3 SDF4     chr1  1216909 1232067 bcc.su008.pre.tumor_AAGGCAGTCACCGTAA     1
#>  4 CPTP     chr1  1324756 1328896 bcc.su008.pre.tumor_AAGGCAGTCACCGTAA     1
#>  5 AURKAIP1 chr1  1373730 1375495 bcc.su008.pre.tumor_AAGGCAGTCACCGTAA     1
#>  6 CCNL2    chr1  1385711 1399335 bcc.su008.pre.tumor_AAGGCAGTCACCGTAA     2
#>  7 MRPL20   chr1  1401909 1407293 bcc.su008.pre.tumor_AAGGCAGTCACCGTAA     1
#>  8 CDK11B   chr1  1635225 1659012 bcc.su008.pre.tumor_AAGGCAGTCACCGTAA     1
#>  9 CDK11A   chr1  1702379 1724357 bcc.su008.pre.tumor_AAGGCAGTCACCGTAA     1
#> 10 WRAP73   chr1  3630767 3652761 bcc.su008.pre.tumor_AAGGCAGTCACCGTAA     2
#> # ℹ 195,488 more rows
filter_counts_by_quantile(example_input$atac, upper_quantile = .98)
#> 
#> ── Upper quantile 0.98 
#>  n = 60036 entries to remove
#> 
#> # A tibble: 60,036 × 7
#>    cell               value chr      from      to q_max del  
#>    <chr>              <int> <chr>   <int>   <int> <dbl> <lgl>
#>  1 SU008_Tumor_Pre_45     2 chr1   871996  872496  1.96 TRUE 
#>  2 SU008_Tumor_Pre_45     4 chr1   937126  937626  3.88 TRUE 
#>  3 SU008_Tumor_Pre_45     2 chr1  1050579 1051079  1.96 TRUE 
#>  4 SU008_Tumor_Pre_45     4 chr1  1138211 1138711  3.92 TRUE 
#>  5 SU008_Tumor_Pre_45     4 chr1  1176170 1176670  3.96 TRUE 
#>  6 SU008_Tumor_Pre_45     5 chr1  1186185 1186685  4.94 TRUE 
#>  7 SU008_Tumor_Pre_45     6 chr1  1238592 1239092  5.86 TRUE 
#>  8 SU008_Tumor_Pre_45     6 chr1  1239980 1240480  5.44 TRUE 
#>  9 SU008_Tumor_Pre_45     4 chr1  1241917 1242417  3.96 TRUE 
#> 10 SU008_Tumor_Pre_45     3 chr1  1280686 1281186  2.98 TRUE 
#> # ℹ 60,026 more rows
#> # A tibble: 521,771 × 5
#>    cell               value chr     from     to
#>    <chr>              <int> <chr>  <int>  <int>
#>  1 SU008_Tumor_Pre_45     1 chr1  127538 128038
#>  2 SU008_Tumor_Pre_45     2 chr1  540701 541201
#>  3 SU008_Tumor_Pre_45     4 chr1  762643 763143
#>  4 SU008_Tumor_Pre_45     3 chr1  859974 860474
#>  5 SU008_Tumor_Pre_45     1 chr1  866643 867143
#>  6 SU008_Tumor_Pre_45     2 chr1  876975 877475
#>  7 SU008_Tumor_Pre_45     3 chr1  894505 895005
#>  8 SU008_Tumor_Pre_45     2 chr1  895705 896205
#>  9 SU008_Tumor_Pre_45     1 chr1  898583 899083
#> 10 SU008_Tumor_Pre_45     3 chr1  901529 902029
#> # ℹ 521,761 more rows