Given a gene expression matrix and a 0/1 vector indicating the distant metastasis status of each sample, hack_cinsarc() classifies samples into one of two risk classes, C1 or C2, using the CINSARC signature described in Chibon et al. (2010).

Usage

hack_cinsarc(expr_data, dm_status)

Arguments

expr_data

A normalized gene expression matrix (or data frame) with gene symbols as row names and samples as columns.

dm_status

A numeric vector specifying, for each sample, whether it has (1) or has not (0) developed distant metastasis.

Value

A tibble with one row for each sample in expr_data and two columns: sample_id and cinsarc_class.

Details

CINSARC (Complexity INdex in SARComas) is a prognostic 67-gene signature related to mitosis and control of chromosome integrity. It was developed to improve metastatic outcome prediction in soft tissue sarcomas over the FNCLCC (Fédération Nationale des Centres de Lutte Contre le Cancer) grading system.

Algorithm

The CINSARC method implemented in hacksig uses leave-one-out cross-validation (LOOCV) to classify samples into the C1/C2 risk groups (see Lesluyes & Chibon, 2020). First, gene expression values are centered by their mean across samples. Then, for each LOOCV iteration, the mean of the normalized gene values is computed within each metastasis group, which gives the metastatic and non-metastatic centroids. Next, one minus Spearman's correlation between the centered samples and each centroid is computed. Finally, a sample that is more correlated with the non-metastatic centroid is assigned to the C1 class (low risk), whereas a sample that is more correlated with the metastatic centroid is assigned to the C2 class (high risk).
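
The steps above can be illustrated with a minimal base R sketch. This is not the hacksig source code: the helper name loocv_centroid_classes() and the toy objects (toy_dm, toy_expr) are hypothetical, the sketch does not restrict the matrix to the 67 CINSARC genes, it returns a plain data frame rather than a tibble, and it assumes that a tie between the two distances goes to C2.

```r
# Hypothetical helper illustrating the LOOCV nearest-centroid idea;
# it is NOT the implementation used by hack_cinsarc().
loocv_centroid_classes <- function(expr, dm_status) {
  expr <- as.matrix(expr)
  stopifnot(
    !is.null(colnames(expr)),
    length(dm_status) == ncol(expr),
    all(dm_status %in% c(0, 1)),
    sum(dm_status == 0) >= 2,  # each group must survive leaving one sample out
    sum(dm_status == 1) >= 2
  )

  # Step 1: center every gene (row) by its mean across samples.
  centered <- expr - rowMeans(expr)

  n <- ncol(centered)
  classes <- character(n)
  for (i in seq_len(n)) {
    # Step 2: centroids by metastasis group, computed without sample i.
    train <- centered[, -i, drop = FALSE]
    train_dm <- dm_status[-i]
    centroid_no_met <- rowMeans(train[, train_dm == 0, drop = FALSE])
    centroid_met <- rowMeans(train[, train_dm == 1, drop = FALSE])

    # Step 3: distance = 1 - Spearman correlation to each centroid.
    d_no_met <- 1 - cor(centered[, i], centroid_no_met, method = "spearman")
    d_met <- 1 - cor(centered[, i], centroid_met, method = "spearman")

    # Step 4: the closer centroid decides the class (ties go to C2 here,
    # which is an assumption of this sketch).
    classes[i] <- if (d_no_met < d_met) "C1" else "C2"
  }

  data.frame(sample_id = colnames(expr), cinsarc_class = classes,
             stringsAsFactors = FALSE)
}

# Toy data: 6 genes, 8 samples; metastatic samples follow the gene pattern,
# non-metastatic samples follow its mirror image, plus small deterministic noise.
toy_dm <- c(0, 1, 0, 1, 0, 1, 0, 1)
pattern <- c(3, 2, 1, -1, -2, -3)
noise <- matrix(sin(seq_len(48)), nrow = 6) * 0.1
toy_expr <- c(5, 6, 7, 8, 9, 10) + outer(pattern, ifelse(toy_dm == 1, 1, -1)) + noise
dimnames(toy_expr) <- list(paste0("gene", 1:6), paste0("sample", 1:8))

result <- loocv_centroid_classes(toy_expr, toy_dm)
print(result)
```

Because the toy signal is much larger than the noise, each held-out sample is assigned to the class matching its own metastasis status. Real expression data give no such guarantee.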

References

Chibon, F., Lagarde, P., Salas, S., Pérot, G., Brouste, V., Tirode, F., Lucchesi, C., de Reynies, A., Kauffmann, A., Bui, B., Terrier, P., Bonvalot, S., Le Cesne, A., Vince-Ranchère, D., Blay, J. Y., Collin, F., Guillou, L., Leroux, A., Coindre, J. M., & Aurias, A. (2010). Validated prediction of clinical outcome in sarcomas and multiple types of cancer on the basis of a gene expression signature related to genome complexity. Nature Medicine, 16(7), 781–787. doi:10.1038/nm.2174.

Lesluyes, T., & Chibon, F. (2020). A Global and Integrated Analysis of CINSARC-Associated Genetic Defects. Cancer Research, 80(23), 5282–5290. doi:10.1158/0008-5472.CAN-20-0512.

Examples

# load the package, which provides hack_cinsarc() and the test_expr example data
library(hacksig)

# generate random distant metastasis outcome
set.seed(123)
test_dm_status <- sample(c(0, 1), size = ncol(test_expr), replace = TRUE)

hack_cinsarc(test_expr, test_dm_status)
#> # A tibble: 20 × 2
#>    sample_id cinsarc_class
#>    <chr>     <chr>        
#>  1 sample1   C2           
#>  2 sample2   C1           
#>  3 sample3   C2           
#>  4 sample4   C1           
#>  5 sample5   C2           
#>  6 sample6   C1           
#>  7 sample7   C1           
#>  8 sample8   C1           
#>  9 sample9   C1           
#> 10 sample10  C2           
#> 11 sample11  C1           
#> 12 sample12  C1           
#> 13 sample13  C1           
#> 14 sample14  C2           
#> 15 sample15  C1           
#> 16 sample16  C2           
#> 17 sample17  C1           
#> 18 sample18  C2           
#> 19 sample19  C1           
#> 20 sample20  C2