Philosophy and reliability

How to build a model you can actually trust.

This page explains what a predictive model is, why calibration matters more than ranking, which standards we follow, and how value in health is generated. Written for clinical teams, not for data scientists.

What a predictive model is

A model organizes the clinical record and turns it into a decision.

Before any prediction, the hard part is reading the data. The clinical record lives scattered across labs, vitals, notes, and medications. A model is not magic: it is the mechanism for ordering that chaos.

The probability is not an opinion, it is a count. The model's quality is measured by how close that count lands to reality.

Structure the noise

The first job is translating scattered records (codes, abbreviations, values in different units) into a shared clinical schema. Without that step, the model learns noise instead of patterns.
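As a rough illustration of that translation step, the sketch below maps a few raw glucose records onto one schema. The alias table, the `Observation` type, and the `normalize` function are hypothetical names chosen for this example, not part of any real pipeline; the only fixed fact is the glucose conversion (1 mmol/L is about 18.016 mg/dL).

```python
from dataclasses import dataclass

# Conversion factors into the canonical unit (mg/dL for glucose).
# Illustrative only: a real schema would cover many analytes and units.
TO_CANONICAL = {
    ("glucose", "mg/dL"): 1.0,
    ("glucose", "mmol/L"): 18.016,  # 1 mmol/L of glucose is about 18.016 mg/dL
}

# Hypothetical mapping from local names and abbreviations to one schema name.
ALIASES = {"glu": "glucose", "glucose": "glucose", "glucosa": "glucose"}


@dataclass(frozen=True)
class Observation:
    analyte: str  # canonical analyte name
    value: float  # value expressed in the canonical unit
    unit: str     # canonical unit


def normalize(raw_name: str, raw_value: float, raw_unit: str) -> Observation:
    """Map one raw lab record onto the shared schema, or raise if unknown."""
    analyte = ALIASES.get(raw_name.strip().lower())
    if analyte is None:
        raise ValueError(f"unknown analyte: {raw_name!r}")
    factor = TO_CANONICAL.get((analyte, raw_unit.strip()))
    if factor is None:
        raise ValueError(f"unknown unit {raw_unit!r} for {analyte}")
    return Observation(analyte, round(raw_value * factor, 1), "mg/dL")
```

Two records written differently at the source, such as `normalize("GLU", 5.5, "mmol/L")` and `normalize("Glucosa", 99.1, "mg/dL")`, end up as the same observation. Rejecting unknown names and units, rather than guessing, keeps silent unit errors out of the training data.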

Learn patterns, don't memorize

A well-built model does not copy past records: it learns general relationships that hold up on new patients. The difference is called generalization, and it's measured with external validation.

Return a usable probability

The output is not 'will have a heart attack' or 'will not'. It is a calibrated probability: a number between 0 and 1 that can be compared across patients, summed over a cohort to estimate expected events, communicated to the patient, and turned into clinical thresholds.
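A minimal sketch of what "usable" means in practice, assuming the probabilities are already calibrated. The patient ids, the risk values, and the 0.20 threshold are invented for illustration; a real threshold would be set by the clinical team.

```python
def expected_events(probabilities: list[float]) -> float:
    """Sum of calibrated risks: the expected number of events in the group."""
    return sum(probabilities)


def flag_for_review(risks: dict[str, float], threshold: float) -> list[str]:
    """Return patient ids at or above the threshold, highest risk first."""
    flagged = [pid for pid, p in risks.items() if p >= threshold]
    return sorted(flagged, key=lambda pid: risks[pid], reverse=True)


# Hypothetical risks for four patients (not real data).
risks = {"A": 0.04, "B": 0.22, "C": 0.31, "D": 0.08}

total = expected_events(list(risks.values()))   # about 0.65 events expected
review_list = flag_for_review(risks, 0.20)      # patients C and B, in that order
```

Neither operation is meaningful on a yes/no label or on an uncalibrated ranking score, which is why the output has to be a probability.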

Preventive, predictive, and precision medicine

Prevention is not a promise. It is prioritizing well.

Traditional prevention distributes resources equally. It works, but it doesn't scale. As the cohort grows, prioritizing well requires seeing the individual within the population.

Preventive over reactive

Preventing an event usually costs less than treating it. The difference is not only economic: it is years of life gained. The operational question is not 'prevention yes or no?', it is 'who do we intervene on first?'.

Predictive over estimative

A predictive score does not replace clinical judgment. It organizes it: it tells the team which patients to look at first and which modifiable factors each one has. The clinician decides; the model orders the attention.

Precision over average

Precision medicine rejects the idea that an average patient should determine care for everyone. Each patient brings a unique combination of factors; a good model respects that difference and treats each case as its own.

Continuous over episodic

Risk changes. A prediction from six months ago is no longer useful today if medication, weight, or adherence changed. A serious model recalculates with every new data point: not a snapshot, but a film.

Precision medicine over one-size-fits-all

Standardized medicine applies the same protocol to everyone: same dose, same frequency, same targets. It works on average and fails at the extremes. Precision medicine uses each patient's information to tune intensity, timing, and type of intervention.

Value in health over fee-for-service

Fee-for-service pays for volume: more visits, more revenue. Value-based care inverts the incentive: the system gets paid for outcomes, such as life years gained and events avoided. A predictive model only fits in the second paradigm.

Why trust what the machine returns

Five properties that separate a serious model from an experiment.

Trust in a model is not marketing. It is a set of verifiable properties: the five that separate a usable score from one that stays in a paper.

  1. Calibration

    When the model says 22% risk, about 22 of every 100 patients in the real cohort who receive that score go on to have the event. Not 5, not 60. Models that discriminate well (correct ranking) but calibrate poorly give you an order, not a usable probability. Without calibration there is no way to define a clinical threshold or communicate risk to the patient.

  2. Idempotency

    The same patient with the same data must receive the same score, whether it's the first visit or the hundredth. If it changes without the inputs changing, there is a bug, or randomness, in the system.

  3. Determinism

    The same set of variables must produce the same probability on any execution. Without determinism there is no audit: nobody can review a score if next time it comes out different.

  4. Auditability

    Every prediction must be traceable to its inputs: which variables entered, which version of the model processed them, when. That is what lets a clinical committee review a case months later.

  5. External validation

    A model trained on one population can degrade when applied to another. The only way to know is to test it on a cohort the model has never seen. If calibration survives the shift, the model is transferable.
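Two of these properties can be checked with very little machinery. The sketch below, using only the Python standard library, shows one common way to inspect calibration (compare mean predicted risk with the observed event rate per bin) and one possible shape for an audit record. The function names, the record fields, and the choice of five equal-width bins are assumptions made for this example, not a description of any specific product.

```python
import hashlib
import json
from datetime import datetime, timezone


def calibration_table(predicted, observed, n_bins=5):
    """Group predictions into equal-width bins and return, per non-empty bin,
    (count, mean predicted risk, observed event rate)."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(predicted, observed):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[index].append((p, y))
    rows = []
    for members in bins:
        if not members:
            continue  # nothing to compare in an empty bin
        mean_pred = sum(p for p, _ in members) / len(members)
        event_rate = sum(y for _, y in members) / len(members)
        rows.append((len(members), round(mean_pred, 3), round(event_rate, 3)))
    return rows


def audit_record(inputs: dict, score: float, model_version: str) -> dict:
    """Tie a score to a fingerprint of its inputs, the model version, and a time."""
    # Sorted keys make the hash independent of the order the inputs arrived in.
    canonical = json.dumps(inputs, sort_keys=True, separators=(",", ":"))
    return {
        "input_hash": hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
        "score": score,
        "model_version": model_version,
        "scored_at": datetime.now(timezone.utc).isoformat(),
    }
```

In a well-calibrated model the second and third values of each row stay close; running the same table on an external cohort is a simple way to see whether calibration survives the shift. The input hash also gives a cheap determinism check: two records with the same `input_hash` and `model_version` should carry the same score.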

Rules we follow

Three standards that govern any model we publish.

A predictive model in healthcare cannot operate in a vacuum. Three frameworks (one scientific, two regulatory) define what to report and how to handle data. We follow them by choice, not only by legal obligation.

TRIPOD+AI · BMJ 2024

The scientific standard

TRIPOD+AI is the reporting guideline that defines what a clinical-prediction-model paper must report to be taken seriously. It is not law; it is the bar the medical community expects to see.

  • Complete description of training and validation cohorts.
  • Explicit reporting of calibration, not only discrimination.
  • External validation with label blinding.
  • Model and weights available for independent review.

HIPAA · USA

Clinical privacy · North America

HIPAA sets out how identifiable clinical information is stored, transmitted, and shared in the US. Although we operate from Latin America, we follow it because it defines the standard any integration with international systems will demand.

  • Pseudonymization on any data leaving the client system.
  • Encryption at rest and in transit (AES-256 / TLS 1.3).
  • Access logs to protected health information (PHI).
  • Business Associate Agreements (BAA) where applicable.

GDPR · EU

Clinical privacy · Europe

GDPR is the European privacy framework. It treats clinical data as a special category: explicit consent, right to erasure, algorithmic transparency. Latin American regulation such as Colombia's Law 1581 and Brazil's LGPD follows the same European data-protection model.

  • Explicit legal basis for processing clinical data.
  • Patient right to understand how their information is used.
  • Minimization: we ask only for the data the model needs.
  • Portability and right to erasure when the patient requests it.

How value in health is generated

A prevented event costs less than a treated one.

Value-based care measures the system by outcomes, not volume. For it to work operationally, the system needs to identify the patients who will generate the most events before they happen.

The math is direct. A payer with one hundred thousand members sees roughly one thousand acute cardiovascular events a year, each at a cost that is a multiple of the cost of preventing it. If prioritization captures 55% of those events within the 15% of members at highest risk, the system can invest in targeted interventions (visits, titration, care-gap closure) on a manageable subset of about fifteen thousand people. The difference is not only financial: every prevented event is years of productive life for the patient.
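The arithmetic behind that paragraph can be worked through explicitly. The sketch below uses only the figures quoted above (100,000 members, about 1,000 events a year, 55% of events captured in the top 15% of risk); the function name and the returned field names are illustrative.

```python
def prioritization_summary(members, events, top_share, capture):
    """Compare event rates inside and outside the highest-risk group."""
    flagged = round(members * top_share)          # members in the high-risk group
    events_in_flagged = round(events * capture)   # events that fall in that group
    rate_flagged = events_in_flagged / flagged
    rate_rest = (events - events_in_flagged) / (members - flagged)
    rate_overall = events / members
    return {
        "flagged": flagged,
        "events_in_flagged": events_in_flagged,
        "rate_flagged": rate_flagged,
        "rate_rest": rate_rest,
        "lift": rate_flagged / rate_overall,  # how concentrated the risk is
    }


summary = prioritization_summary(
    members=100_000, events=1_000, top_share=0.15, capture=0.55
)
```

With these inputs the flagged group is 15,000 members carrying 550 of the 1,000 events: an annual event rate of roughly 3.7% against 1% for the whole population and about 0.5% for everyone else. That concentration, a lift of about 3.7, is what makes a targeted program on the subset workable.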

Prioritization does not replace universal prevention.

The universal system keeps running underneath; the predictive layer is the accelerator that picks where to put the additional energy.

See the model in action

Want to see what prioritization looks like?

Caritas is the Corpus AI platform for payers, providers, and insurers. It shows how all of this applies to a real cohort: individual patient view, population view, intervention levers, clinical validation with figures.