check_sig() is a helper function that reports how many genes of each signature you want to test are present in your gene expression matrix.
Arguments
- expr_data
A normalized gene expression matrix (or data frame) with gene symbols as row names and samples as columns.
- signatures
A list of signatures, or a character vector of keywords identifying a group of signatures. The default ("all") makes the function check every signature implemented in hacksig.
Value
A tibble with one row per input signature and five columns:
- signature_id
A unique identifier associated with a signature.
- n_genes
The number of genes composing a signature.
- n_present, frac_present
The number and fraction of genes in a signature that are present in expr_data, respectively.
- missing_genes
The gene symbols missing from expr_data for each signature.
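To make the meaning of these columns concrete, here is a minimal base R sketch that derives the same five quantities for a toy matrix. It is an illustration only, not the package's implementation: the matrix `expr`, the gene names, and the signatures `toy_sig_1` and `toy_sig_2` are all hypothetical, and the result is a plain data frame rather than a tibble.

```r
# Toy expression matrix: gene symbols as row names, samples as columns.
# All gene names and values are made up for illustration.
expr <- matrix(
  c(5.1, 4.8, 2.3, 2.9, 7.7, 7.1, 0.4, 0.6),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("GENE_A", "GENE_B", "GENE_C", "GENE_D"),
                  c("sample1", "sample2"))
)

# Hypothetical signatures, given as a named list of gene symbols.
sigs <- list(
  toy_sig_1 = c("GENE_A", "GENE_B", "GENE_X"),
  toy_sig_2 = c("GENE_C", "GENE_Y", "GENE_Z", "GENE_D")
)

# For each signature, flag which genes appear among the row names.
present <- lapply(sigs, function(g) g %in% rownames(expr))

coverage <- data.frame(
  signature_id = names(sigs),
  n_genes      = unname(lengths(sigs)),
  n_present    = unname(vapply(present, sum, integer(1)))
)
coverage$frac_present <- coverage$n_present / coverage$n_genes

# List column holding the gene symbols not found in the matrix.
coverage$missing_genes <- unname(Map(function(g, p) g[!p], sigs, present))

print(coverage)
```

Storing the missing symbols in a list column keeps one row per signature, which mirrors the `<list>` type shown for missing_genes in the examples.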
Examples
check_sig(test_expr)
#> # A tibble: 40 × 5
#> signature_id n_genes n_present frac_present missing_genes
#> <chr> <int> <int> <dbl> <list>
#> 1 wu2020_metabolic 30 20 0.667 <chr [10]>
#> 2 muro2016_ifng 6 4 0.667 <chr [2]>
#> 3 liu2020_immune 6 4 0.667 <chr [2]>
#> 4 liu2021_mgs 6 4 0.667 <chr [2]>
#> 5 lu2020_npc 3 2 0.667 <chr [1]>
#> 6 estimate_stromal 141 91 0.645 <chr [50]>
#> 7 she2020_irgs 27 17 0.630 <chr [10]>
#> 8 lohavanichbutr2013_hpvneg 13 8 0.615 <chr [5]>
#> 9 eschrich2009_rsi 10 6 0.6 <chr [4]>
#> 10 li2021_ferroptosis_a 10 6 0.6 <chr [4]>
#> # … with 30 more rows
check_sig(test_expr, "estimate")
#> # A tibble: 2 × 5
#> signature_id n_genes n_present frac_present missing_genes
#> <chr> <int> <int> <dbl> <list>
#> 1 estimate_stromal 141 91 0.645 <chr [50]>
#> 2 estimate_immune 141 74 0.525 <chr [67]>
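One way to act on this output is to keep only the signatures whose genes are sufficiently covered. The sketch below re-enters the two rows printed above by hand, so it runs without the package; the 0.6 cutoff and the names `res`, `min_frac`, and `well_covered` are assumptions made for illustration, not recommendations from hacksig.

```r
# Rows copied from the check_sig(test_expr, "estimate") output above.
res <- data.frame(
  signature_id = c("estimate_stromal", "estimate_immune"),
  n_genes      = c(141L, 141L),
  n_present    = c(91L, 74L)
)

# frac_present is the share of signature genes found in the matrix;
# rounded to three digits it matches the printed 0.645 and 0.525.
res$frac_present <- res$n_present / res$n_genes
print(round(res$frac_present, 3))

# Keep signatures with at least 60% of their genes present.
# The cutoff is an arbitrary choice for this sketch.
min_frac <- 0.6
well_covered <- res$signature_id[res$frac_present >= min_frac]
print(well_covered)
```

With real output from check_sig(), the same filter can be applied directly to the frac_present column of the returned tibble.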