AUPRC: a metric for evaluating the performance of in-silico perturbation methods in identifying differentially expressed genes

Research output: Contribution to journal › Article › peer-review

Abstract

In silico perturbation models, computational methods that predict cellular responses to perturbations, offer an opportunity to reduce the need for costly and time-intensive in vitro experiments. Many recently proposed models predict high-dimensional cellular responses, such as gene or protein expression, to perturbations such as gene knockouts or drug treatments. However, evaluation of in silico models has largely relied on metrics such as R², which assess overall prediction accuracy but fail to capture biologically significant outcomes such as the identification of differentially expressed (DE) genes. In this study, we present a novel evaluation framework that uses the area under the precision-recall curve (AUPRC) to assess the precision and recall of DE gene predictions. By applying this framework to both single-cell and pseudo-bulked datasets, we systematically benchmark simple and advanced computational models. Our results highlight a significant discrepancy between R² and AUPRC: models achieve high R² values yet struggle to identify DE genes, as reflected in their low AUPRC values. This finding underscores the limitations of traditional evaluation metrics and the importance of biologically relevant assessments. Our framework provides a more comprehensive understanding of model capabilities, advancing the application of computational approaches in cellular perturbation research.
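The contrast between R² and AUPRC described in the abstract can be illustrated with a small sketch. This is not the authors' framework: the toy data, the hypothetical "no change" model, the helper names `r_squared` and `average_precision`, and the choice of DE score (absolute predicted change from control) are all assumptions made here for illustration. The abstract does not specify how DE genes are called or scored.

```python
import numpy as np


def r_squared(y_true, y_pred):
    """Coefficient of determination between observed and predicted expression."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def average_precision(is_de, scores):
    """Step-wise area under the precision-recall curve (average precision).

    Genes are ranked by descending score; precision is averaged over the
    ranks at which a true DE gene is recovered. Ties keep their input order.
    """
    is_de = np.asarray(is_de, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")
    hits = is_de[order]
    precision_at_k = np.cumsum(hits) / np.arange(1, hits.size + 1)
    return float(precision_at_k[hits].sum() / hits.sum())


# Toy data (assumed, not from the paper): 1000 genes, the first 50 truly DE.
rng = np.random.default_rng(0)
n_genes, n_de = 1000, 50
control = rng.uniform(0.0, 10.0, size=n_genes)  # baseline expression
observed = control.copy()
observed[:n_de] += 2.0                          # true perturbation effect
is_de = np.zeros(n_genes, dtype=bool)
is_de[:n_de] = True

# A hypothetical model that essentially predicts "no change" plus small noise.
predicted = control + rng.normal(0.0, 0.1, size=n_genes)

r2 = r_squared(observed, predicted)
# Assumed DE score: magnitude of the predicted change relative to control.
auprc = average_precision(is_de, np.abs(predicted - control))

print(f"R2 = {r2:.3f}, AUPRC = {auprc:.3f}, DE prevalence = {n_de / n_genes:.3f}")
```

In this toy setting, most genes do not change, so a model that ignores the perturbation still tracks observed expression closely and scores a high R². Its ranking of DE genes is close to random, so its AUPRC stays near the DE prevalence, which is the baseline for an uninformative ranking.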

Original language: English (US)
Article number: bbaf426
Journal: Briefings in Bioinformatics
Volume: 26
Issue number: 5
State: Published - Sep 1 2025

Keywords

  • cellular perturbation experiments
  • differentially expressed genes
  • evaluation metrics
  • in silico models

ASJC Scopus subject areas

  • Information Systems
  • Molecular Biology
