Discrimination slope and integrated discrimination improvement – properties, relationships and impact of calibration

Michael J. Pencina, Jason P. Fine, Ralph B. D'Agostino

Research output: Contribution to journal › Article

  • 7 Citations

Abstract

Discrimination slope, defined as the slope of a linear regression of predicted probabilities of event derived from a prognostic model on the binary event status, has recently gained popularity as a measure of model performance. It serves as a building block for the integrated discrimination improvement, which equals the difference in discrimination slopes between the two models being compared. Several authors have pointed out that it does not make sense to apply the integrated discrimination improvement and discrimination slope when working with mis-calibrated models, whereas others have raised concerns about the ability to improve the discrimination slope without adding new information. In this paper, we show that under certain assumptions the discrimination slope is asymptotically related to two other R-squared measures, one of which is a rescaled version of the Brier score, known to be proper. Furthermore, we illustrate how a simple recalibration makes the slope equal to the rescaled Brier R-squared metric. We also show that the discrimination slope can be interpreted as a measure of reduction in expected regret for the Gini-Brier regret function. Using theoretical and practical examples, we illustrate how all of these metrics are affected by different levels of model mis-calibration. In particular, we demonstrate that a simple recalibration ensuring calibration in-the-large and a calibration slope equal to 1 is not sufficient to correct for some forms of mis-calibration. We conclude that R-squared metrics, including the discrimination slope, offer an attractive choice for quantifying model performance as long as one accounts for their sensitivity to model calibration.
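The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names and the toy data are hypothetical, and the rescaled Brier score is assumed to follow the common form 1 - Brier / (prevalence * (1 - prevalence)), which may differ in detail from the authors' definition. The sketch relies on the fact that, for a binary event status, the least-squares slope of predicted risk on status equals the mean predicted risk among events minus the mean among non-events.

```python
import numpy as np


def discrimination_slope(y, p):
    """Mean predicted risk among events minus mean among non-events."""
    y = np.asarray(y, dtype=bool)
    p = np.asarray(p, dtype=float)
    return p[y].mean() - p[~y].mean()


def regression_slope(y, p):
    """Least-squares slope from regressing predicted risk p on binary status y."""
    y = np.asarray(y, dtype=float)
    p = np.asarray(p, dtype=float)
    # np.polyfit returns coefficients from highest degree down: [slope, intercept]
    return np.polyfit(y, p, 1)[0]


def rescaled_brier(y, p):
    """Assumed rescaling: 1 - Brier / (prevalence * (1 - prevalence))."""
    y = np.asarray(y, dtype=float)
    p = np.asarray(p, dtype=float)
    brier = np.mean((y - p) ** 2)
    prevalence = y.mean()
    return 1.0 - brier / (prevalence * (1.0 - prevalence))


def idi(y, p_old, p_new):
    """Integrated discrimination improvement: difference in discrimination slopes."""
    return discrimination_slope(y, p_new) - discrimination_slope(y, p_old)


# Hypothetical toy data: event status and predicted risks from two models.
y = np.array([0, 0, 0, 1, 0, 1, 1, 0])
p_old = np.array([0.2, 0.3, 0.4, 0.5, 0.3, 0.4, 0.6, 0.5])
p_new = np.array([0.1, 0.2, 0.3, 0.7, 0.2, 0.6, 0.8, 0.4])

if __name__ == "__main__":
    print("slope (old model):", discrimination_slope(y, p_old))
    print("slope via regression:", regression_slope(y, p_old))
    print("IDI (new vs old):", idi(y, p_old, p_new))
    print("rescaled Brier (new):", rescaled_brier(y, p_new))
```

On real data the two slope functions should agree up to floating-point error, so the simpler difference-in-means form is usually the one computed in practice.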

Language: English (US)
Pages: 4482-4490
Number of pages: 9
Journal: Statistics in Medicine
Volume: 36
Issue number: 28
DOI: 10.1002/sim.7139
State: Published - Dec 10 2017

Fingerprint

Calibration
Discrimination
Slope
Emotions
Aptitude
Regret
Performance Model
Metric
Linear Models
Relationships
Model Calibration
Linear regression
Model
Building Blocks
Binary
Sufficient

Keywords

  • IDI
  • model
  • proper
  • R-squared
  • risk

ASJC Scopus subject areas

  • Epidemiology
  • Statistics and Probability

Cite this

Discrimination slope and integrated discrimination improvement – properties, relationships and impact of calibration. / Pencina, Michael J.; Fine, Jason P.; D'Agostino, Ralph B.

In: Statistics in Medicine, Vol. 36, No. 28, 10.12.2017, p. 4482-4490.

@article{546f1eeb296842a09d2026860527485b,
title = "Discrimination slope and integrated discrimination improvement – properties, relationships and impact of calibration",
abstract = "Discrimination slope, defined as the slope of a linear regression of predicted probabilities of event derived from a prognostic model on the binary event status, has recently gained popularity as a measure of model performance. It serves as a building block for the integrated discrimination improvement, which equals the difference in discrimination slopes between the two models being compared. Several authors have pointed out that it does not make sense to apply the integrated discrimination improvement and discrimination slope when working with mis-calibrated models, whereas others have raised concerns about the ability to improve the discrimination slope without adding new information. In this paper, we show that under certain assumptions the discrimination slope is asymptotically related to two other R-squared measures, one of which is a rescaled version of the Brier score, known to be proper. Furthermore, we illustrate how a simple recalibration makes the slope equal to the rescaled Brier R-squared metric. We also show that the discrimination slope can be interpreted as a measure of reduction in expected regret for the Gini-Brier regret function. Using theoretical and practical examples, we illustrate how all of these metrics are affected by different levels of model mis-calibration. In particular, we demonstrate that a simple recalibration ensuring calibration in-the-large and a calibration slope equal to 1 is not sufficient to correct for some forms of mis-calibration. We conclude that R-squared metrics, including the discrimination slope, offer an attractive choice for quantifying model performance as long as one accounts for their sensitivity to model calibration.",
keywords = "IDI, model, proper, R-squared, risk",
author = "Pencina, {Michael J.} and Fine, {Jason P.} and D'Agostino, {Ralph B.}",
year = "2017",
month = "12",
day = "10",
doi = "10.1002/sim.7139",
language = "English (US)",
volume = "36",
pages = "4482--4490",
journal = "Statistics in Medicine",
issn = "0277-6715",
publisher = "John Wiley and Sons Ltd",
number = "28",
}

TY - JOUR

T1 - Discrimination slope and integrated discrimination improvement – properties, relationships and impact of calibration

AU - Pencina, Michael J.

AU - Fine, Jason P.

AU - D'Agostino, Ralph B.

PY - 2017/12/10

Y1 - 2017/12/10

AB - Discrimination slope, defined as the slope of a linear regression of predicted probabilities of event derived from a prognostic model on the binary event status, has recently gained popularity as a measure of model performance. It serves as a building block for the integrated discrimination improvement, which equals the difference in discrimination slopes between the two models being compared. Several authors have pointed out that it does not make sense to apply the integrated discrimination improvement and discrimination slope when working with mis-calibrated models, whereas others have raised concerns about the ability to improve the discrimination slope without adding new information. In this paper, we show that under certain assumptions the discrimination slope is asymptotically related to two other R-squared measures, one of which is a rescaled version of the Brier score, known to be proper. Furthermore, we illustrate how a simple recalibration makes the slope equal to the rescaled Brier R-squared metric. We also show that the discrimination slope can be interpreted as a measure of reduction in expected regret for the Gini-Brier regret function. Using theoretical and practical examples, we illustrate how all of these metrics are affected by different levels of model mis-calibration. In particular, we demonstrate that a simple recalibration ensuring calibration in-the-large and a calibration slope equal to 1 is not sufficient to correct for some forms of mis-calibration. We conclude that R-squared metrics, including the discrimination slope, offer an attractive choice for quantifying model performance as long as one accounts for their sensitivity to model calibration.

KW - IDI

KW - model

KW - proper

KW - R-squared

KW - risk

UR - http://www.scopus.com/inward/record.url?scp=84994890073&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84994890073&partnerID=8YFLogxK

U2 - 10.1002/sim.7139

DO - 10.1002/sim.7139

M3 - Article

VL - 36

SP - 4482

EP - 4490

JO - Statistics in Medicine

T2 - Statistics in Medicine

JF - Statistics in Medicine

SN - 0277-6715

IS - 28

ER -