Academic backing

What stands behind the engine.

The model's figures are not invented. They come from a Latin American cardiovascular cohort of 382,589 patients followed between 1990 and 2025, external validation on an insurer never seen during training, and a methodological protocol that meets TRIPOD+AI (BMJ 2024). The paper documenting all of this is under review at IJCCRP 2026.

Under review · IJCCRP 2026

The paper documenting derivation and validation.

A Next-Generation Prediction Risk Model for Acute Myocardial Infarction

Amorocho-Morales JD, Parra-Guevara S, Quintero-Muñoz E, Dimas G, Correa-Morales JE. Int J Cardiol Cardiovasc Risk Prev (IJCCRP). 2026.

IJCCRP-D-26-00086

Abstract

We present a predictive model for acute myocardial infarction (AMI) trained and validated on a Latin American cardiovascular cohort of 382,589 unique patients followed between 1990 and 2025, with 3,940,059 clinical encounters and 15,511 observed AMIs. The model operates on routine clinical data (no advanced imaging, no special biomarkers) and reports near-perfect calibration (O/E 0.998), robust discrimination (AUC 0.869; C-index 0.836 at 6 months and 0.846 at 12 months), and statistically significant external validation on an insurer never seen during training (n=5,602, F=147.6, p<.001). The separation between the high, medium, and low operating bands (observed rate 54.6% / 38.1% / 21.0%) shows that calibration survives the population shift. The reporting follows the TRIPOD+AI standard (BMJ 2024).

Routine clinical data

An integrated cardiovascular cohort from Colombia, 1990–2025.

The model was trained on real clinical practice, not on a trial. That is what lets the probabilities reflect the population they are applied to.

Unique patients 382,589

Deduplicated identifiers across the 35-year window.

Clinical encounters 3,940,059

Visits, hospitalizations, and events recorded in the system.

Observed infarctions 15,511

Confirmed AMI events (coding + clinical verification).

External validation n = 5,602

Insurer never seen during training · F = 147.6 · p < .001.

Observation period 1990–2025
Discrimination, calibration, and validation

The figures that back every prediction.

The same metrics that /products/caritas uses for operational work, here in academic form with citations. Computed on the full cohort and reproduced in the external validation.

What is O/E?

When the model says 22%, it is 22% in reality.

O/E (observed over expected) compares how many events actually occurred against how many the model predicted. A value near 1.0 means the predicted probabilities match the observed frequencies: among patients assigned a 22% risk at 12 months, about 22 in 100 do have an AMI in the actual cohort, not 5, not 60. That is what lets you use the probability as a clinical threshold, share it with the patient, and define operating bands. Discrimination (AUC, C-index) tells you the order is right; calibration tells you the numbers are real.
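The ratio described above can be sketched in a few lines. This is a minimal illustration, not the paper's pipeline: the function name `observed_expected_ratio` and the toy patients are assumptions invented for the example, and "expected" is taken to be the sum of the predicted probabilities.

```python
def observed_expected_ratio(outcomes, predicted_probs):
    """Return O/E: observed event count over the sum of predicted probabilities.

    outcomes: iterable of 0/1 flags (1 = event occurred within the horizon).
    predicted_probs: iterable of model probabilities for the same patients.
    """
    outcomes = list(outcomes)
    predicted_probs = list(predicted_probs)
    if len(outcomes) != len(predicted_probs):
        raise ValueError("outcomes and predicted_probs must have the same length")
    observed = sum(outcomes)
    expected = sum(predicted_probs)
    if expected == 0:
        raise ValueError("expected event count is zero; O/E is undefined")
    return observed / expected


# Toy data, invented for illustration: 100 patients, 22 of whom have an event.
toy_outcomes = [1] * 22 + [0] * 78

# A model that assigns everyone 22% is calibrated on this group: O/E is 1.
calibrated_oe = observed_expected_ratio(toy_outcomes, [0.22] * 100)

# A model that assigns everyone 30% overpredicts: O/E falls below 1.
overpredicting_oe = observed_expected_ratio(toy_outcomes, [0.30] * 100)

print(round(calibrated_oe, 3), round(overpredicting_oe, 3))
```

An O/E computed over a whole cohort summarizes calibration on average; checking it within each risk band is a common complement.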

O/E 0.998

Near-perfect calibration. The numerical probability the model reports matches the observed rate in the cohort.

AUC 0.869

Discrimination: in 86.9% of random pairs made of one patient who has an AMI and one who does not, the model assigns the higher risk to the patient who has the AMI.

C-index 6m 0.836

Time-aware discrimination at 6 months: the priority order matches the actual order of event occurrence.

C-index 12m 0.846

Time-aware discrimination at 12 months. Concordance holds when the prediction horizon is extended.
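The pairwise reading of AUC and the time-aware reading of the C-index can be made concrete with a small sketch. The source does not state which concordance estimator produced the 6- and 12-month figures; the code below assumes Harrell's C restricted to events occurring by the horizon, and every name and data point in it is hypothetical.

```python
from itertools import combinations


def pairwise_auc(outcomes, scores):
    """Share of (event, non-event) pairs where the event has the higher score.

    Tied scores count as half a win.
    """
    event_scores = [s for y, s in zip(outcomes, scores) if y == 1]
    other_scores = [s for y, s in zip(outcomes, scores) if y == 0]
    if not event_scores or not other_scores:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for e in event_scores:
        for o in other_scores:
            if e > o:
                wins += 1.0
            elif e == o:
                wins += 0.5
    return wins / (len(event_scores) * len(other_scores))


def harrell_c_index(times, events, scores, horizon=None):
    """Harrell's C: among comparable pairs, how often the patient with the
    earlier observed event also has the higher risk score.

    A pair is comparable when the shorter follow-up time ends in an event.
    If horizon is given, only pairs whose earlier event occurs by the horizon
    are counted (an assumed way of producing a "C-index at 6 months").
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if times[first] == times[second]:
            continue  # tied times are skipped in this simplified sketch
        if not events[first]:
            continue  # earlier time is censored: order of events unknown
        if horizon is not None and times[first] > horizon:
            continue
        comparable += 1
        if scores[first] > scores[second]:
            concordant += 1.0
        elif scores[first] == scores[second]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data, invented for illustration.
toy_scores = [0.9, 0.4, 0.5, 0.3, 0.1]
toy_outcomes = [1, 1, 0, 0, 0]
toy_auc = pairwise_auc(toy_outcomes, toy_scores)

toy_times = [2, 5, 8, 12, 12]   # months of follow-up
toy_events = [1, 1, 0, 1, 0]    # 1 = AMI observed, 0 = censored
toy_c = harrell_c_index(toy_times, toy_events, toy_scores)
toy_c_3m = harrell_c_index(toy_times, toy_events, toy_scores, horizon=3)

print(round(toy_auc, 3), round(toy_c, 3), round(toy_c_3m, 3))
```

Both functions compare every pair, so they scale quadratically with the number of patients; they are meant to show the definition, not to run on a cohort of several hundred thousand patients.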

External validation on an insurer never seen during training. The separation between bands holds, and calibration survives the population shift.

Patients 5,602

F statistic 147.6

p value < .001

Observed rate per band
High 54.6%
Medium 38.1%
Low 21.0%
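For readers unfamiliar with an F statistic in this setting, here is one way band separation can be tested. The source does not say which test produced F = 147.6; this sketch assumes a one-way ANOVA on the binary outcome across the three bands, and the band data are invented, far smaller than the 5,602-patient validation set.

```python
def band_rate(outcomes):
    """Observed event rate in one band (share of patients with an event)."""
    return sum(outcomes) / len(outcomes)


def one_way_f(groups):
    """One-way ANOVA F statistic: between-group mean square divided by
    within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    return (ss_between / df_between) / (ss_within / df_within)


# Toy bands, invented for illustration: 1 = event, 0 = no event.
high_band = [1] * 6 + [0] * 4
medium_band = [1] * 4 + [0] * 6
low_band = [1] * 2 + [0] * 8

toy_rates = [band_rate(b) for b in (high_band, medium_band, low_band)]
toy_f = one_way_f([high_band, medium_band, low_band])

print(toy_rates, round(toy_f, 4))
```

With only ten patients per band the statistic is small even though the rates differ; for the same differences in rates, F grows with sample size, which is why it should be read together with the observed rates per band.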

Classic models (Framingham, SCORE2, PROCAM, PREVENT) measure 10-year risk on European or American cohorts. Applied as-is in LATAM, they overpredict: expected events exceed observed events, so O/E < 1. The system ends up prioritizing people who won't have the event.

Systematic LATAM pattern: Framingham and SCORE2 overpredict (O/E < 1). Sex-adjusted PROCAM partially closes the calibration gap in Colombia. Caritas calibrates almost perfectly on the real cohort.
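The comparison reports most models as an O/E ratio but SCORE2 as a percentage of overprediction. A rough conversion between the two is sketched below, assuming overprediction is defined relative to observed events, that is (E - O) / O. That definition is an assumption; the cited study may use a different one, so the converted values are indicative only.

```python
def oe_from_overprediction(overprediction_pct):
    """Convert percent overprediction to O/E, assuming
    overprediction = (expected - observed) / observed."""
    return 1.0 / (1.0 + overprediction_pct / 100.0)


def overprediction_from_oe(oe_ratio):
    """Inverse conversion under the same assumption: percent by which
    expected events exceed observed events."""
    if oe_ratio <= 0:
        raise ValueError("O/E must be positive")
    return (1.0 / oe_ratio - 1.0) * 100.0


# The 22% and 42% bounds quoted for SCORE2, expressed as O/E.
score2_oe_upper = oe_from_overprediction(22)
score2_oe_lower = oe_from_overprediction(42)

# The O/E of 0.76 quoted for Framingham, expressed as percent overprediction.
framingham_pct = overprediction_from_oe(0.76)

print(round(score2_oe_lower, 2), round(score2_oe_upper, 2), round(framingham_pct, 1))
```

Under this assumption, 22–42% overprediction corresponds to an O/E of roughly 0.70 to 0.82, which puts the SCORE2 row on the same scale as the other rows.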

| Model | AUC | O/E | Scope |
|---|---|---|---|
| Corpus AI | 0.869 | 0.998 | Colombia · 1990–2025 · AMI 6–12 m |
| Framingham (Source: Muñoz 2014 · n=1,013) | 0.658 | 0.76 | Colombia · primary prevention · 10 years |
| PROCAM, sex-adjusted (Source: Muñoz 2014 · n=1,013) | 0.744 | 0.94 | Colombia · best localized model · 10 years |
| SCORE2 (Source: López-López 2025 · n=2,022) | 0.68–0.72 | 22–42% overprediction | Colombia · PURE cohort · 12.3 years |
| Framingham (Source: Camargos 2024 · n=12,155) | 0.77 | 0.38 | Brazil · ELSA · 4.2 years |
| SCORE2 (Source: Camargos 2024 · n=12,155) | 0.76 | 0.63 | Brazil · ELSA · recalibrated low-risk |
| PREVENT (AHA) (Source: Mancini 2024 · Scheuermann 2024) | n/a | n/a | No external LATAM validation with incident outcomes as of May 2026 |

Reported under TRIPOD+AI · BMJ 2024
Verifiable references

The ten references behind the work.

  1. Muñoz OM, Rodríguez NI, Ruiz Á, Rondón M. Validación de los modelos de predicción de Framingham y PROCAM en una población colombiana. Rev Colomb Cardiol. 2014;21(4):202–212. [paper]
  2. Hageman SHJ, McKay AJ, et al. SMART2 risk prediction algorithm. Eur Heart J. 2022;43(18):1715–1727. [paper]
  3. Mancini GBJ, Ryomoto A. Adoption of the PREVENT Risk Algorithm: Potential International Implications. JACC Adv. 2024;3(8):101122. [paper]
  4. Scheuermann B, Brown A, et al. External Validation of the AHA PREVENT Cardiovascular Disease Risk Equations. JAMA Netw Open. 2024;7(10):e2438311. [paper]
  5. WHO CVD Risk Chart Working Group. WHO CVD risk charts: revised models for 21 global regions. Lancet Glob Health. 2019;7(10):e1332–e1345. [guideline]
  6. Collins GS, Moons KGM, et al. TRIPOD+AI statement. BMJ. 2024;385:e078378. [standard]
  7. Liu T, Krentz A, Lu L, Curcin V. ML-based prediction models for CVD risk using EHR data: systematic review and meta-analysis. Eur Heart J Digit Health. 2024;6(1):7–22. [paper]
  8. Damen JA, Pajouheshnia R, et al. Performance of the Framingham risk models and pooled cohort equations. BMC Med. 2019;17(1):109. [paper]
  9. Amorocho-Morales JD, Parra-Guevara S, Quintero-Muñoz E, Dimas G, Correa-Morales JE. A Next-Generation Prediction Risk Model for Acute Myocardial Infarction. Int J Cardiol Cardiovasc Risk Prev. 2026 (under review, IJCCRP-D-26-00086). [paper]
  10. Krittanawong C, Virk HUH, et al. Machine learning prediction in cardiovascular diseases: meta-analysis. Sci Rep. 2020;10(1):16057. [paper]

Talk with the team

Want to dig deeper into the methodology?

30 minutes with the clinical-technical team. We share the preprint, discuss the calibration methodology, and explore what the model would look like on your own cohort. We respond within 2 business days.