Philosophy and reliability

How to build a model you can actually trust.

This page explains what a predictive model is, why calibration matters more than ranking, which standards we follow, and how value in health is generated. Written for clinical teams, not for data scientists.

01 / data structuring What a predictive model is

A model organizes the clinical record and turns it into a decision.

Before any prediction, the hard part is reading the data. The clinical record lives scattered across labs, vitals, notes, and medications. A model is not magic: it is the mechanism for ordering that chaos.

The probability is not an opinion, it is a count. The model's quality is measured by how close that count lands to reality.

Structure the noise

The first job is translating scattered records (codes, abbreviations, values with different units) into a shared clinical schema. Without that step, the model learns noise instead of patterns.
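A minimal sketch of that translation step, for teams who want to see what it looks like in practice. The alias table, the field names, and the `normalize_lab` function are hypothetical and exist only for illustration; the conversion factors are the standard ones (1 mmol/L of glucose is about 18.016 mg/dL, 1 mmol/L of cholesterol is about 38.67 mg/dL).

```python
# Hypothetical alias table: local abbreviations -> canonical analyte name.
ALIASES = {
    "glu": "glucose",
    "glucose": "glucose",
    "ldl": "ldl_cholesterol",
    "ldl-c": "ldl_cholesterol",
}

# mg/dL per mmol/L for each analyte (standard molar conversion factors).
MGDL_PER_MMOLL = {"glucose": 18.016, "ldl_cholesterol": 38.67}


def normalize_lab(name: str, value: float, unit: str) -> dict:
    """Return the measurement in one shared schema, always expressed in mg/dL."""
    analyte = ALIASES.get(name.strip().lower())
    if analyte is None:
        raise ValueError(f"unknown analyte: {name!r}")
    unit_key = unit.strip().lower()
    if unit_key == "mg/dl":
        mgdl = value
    elif unit_key == "mmol/l":
        mgdl = value * MGDL_PER_MMOLL[analyte]
    else:
        raise ValueError(f"unsupported unit: {unit!r}")
    return {"analyte": analyte, "value": round(mgdl, 1), "unit": "mg/dL"}
```

With this sketch, `normalize_lab("LDL", 4.6, "mmol/L")` and `normalize_lab("ldl-c", 177.9, "mg/dL")` end up as the same analyte in the same unit, which is the point: two labs that write the same fact differently stop looking like two different facts.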

Learn patterns, don't memorize

A well-built model does not copy past records: it learns general relationships that hold up on new patients. The difference is called generalization, and it's measured with external validation.

Return a usable probability

The output is not 'will have a heart attack' or 'will not'. It is a calibrated probability, a number between 0 and 1 that can be compared, added, communicated to the patient, and turned into clinical thresholds.

02 / philosophy Preventive, predictive, and precision medicine

Prevention is not a promise. It is prioritizing well.

Traditional prevention distributes resources equally. It works, but it doesn't scale. As the cohort grows, prioritizing well requires seeing the individual within the population.

Preventive over reactive

Preventing an event always costs less than treating it. The difference is not only economic: it is years of life gained. The operational question is not 'prevention yes or no?', it is 'who do we intervene on first?'.

Predictive over estimative

A predictive score does not replace clinical judgment. It organizes it: it tells the team which patients to look at first and which modifiable factors each one has. The clinician decides; the model orders the attention.

Precision over average

Precision medicine rejects the idea that an average patient should determine care for everyone. Each patient brings a unique combination of factors; a good model respects that difference and treats each case as its own.

Continuous over episodic

Risk changes. A prediction from six months ago is no longer useful today if medication, weight, or adherence changed. A serious model recalculates with every new data point: not a snapshot, but a film.

Precision medicine over one-size-fits-all

Standardized medicine applies the same protocol to everyone: same dose, same frequency, same targets. It works on average and fails at the extremes. Precision medicine uses each patient's information to tune intensity, timing, and type of intervention.

Value in health over fee-for-service

Fee-for-service pays for volume: more visits, more revenue. Value-based care inverts the incentive: the system gets paid for outcomes, such as life years gained and events avoided. A predictive model only fits in the second paradigm.

03 / reliability Why trust what the machine returns

Five properties that separate a serious model from an experiment.

Trust in a model is not marketing. It is a set of verifiable properties: the five that separate a usable score from one that stays in a paper.

01

Calibration

When the model says 22% risk, about 22 of every 100 patients in the real cohort who receive that score go on to have the event. Not 5, not 60. Models that only discriminate well (correct ranking) but calibrate poorly give you an order, not a usable probability. Without calibration there is no way to define a clinical threshold or communicate risk to the patient.
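To make the difference between ranking and calibration concrete, here is a small sketch on a synthetic cohort. The data, the function names, and the bin count are assumptions made for illustration. It distorts the scores with a monotone transformation (squaring) and shows that the ranking metric does not move while the predicted probabilities stop matching the observed rates.

```python
def auc(probs, labels):
    """Chance that a random patient with the event is ranked above one without it."""
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def calibration_table(probs, labels, n_bins=5):
    """Per non-empty bin: (mean predicted probability, observed event rate, patients)."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))
    rows = []
    for group in bins:
        if group:
            mean_pred = sum(p for p, _ in group) / len(group)
            observed = sum(y for _, y in group) / len(group)
            rows.append((round(mean_pred, 3), round(observed, 3), len(group)))
    return rows


# Synthetic cohort: 10 patients scored 0.2 (2 events), 10 scored 0.6 (6 events).
probs = [0.2] * 10 + [0.6] * 10
labels = [1, 1] + [0] * 8 + [1] * 6 + [0] * 4

# A monotone distortion keeps the order of patients but changes every probability.
distorted = [p ** 2 for p in probs]

print(auc(probs, labels) == auc(distorted, labels))  # ranking is unchanged
print(calibration_table(probs, labels))       # predicted matches observed
print(calibration_table(distorted, labels))   # predicted 0.04 and 0.36 vs observed 0.2 and 0.6
```

Both score sets would look identical on a ranking metric, yet only the first can be read as "this patient has a 20% chance".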
02

Idempotency

The same patient with the same data must receive the same score, whether it's the first visit or the hundredth. If it changes without the inputs changing, there is a bug, or randomness, in the system.
03

Determinism

The same set of variables must produce the same probability on any execution. Without determinism there is no audit: nobody can review a score if next time it comes out different.
04

Auditability

Every prediction must be traceable to its inputs: which variables entered, which version of the model processed them, when. That is what lets a clinical committee review a case months later.
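A sketch of what idempotency, determinism, and auditability can look like together, under stated assumptions: the weights, the field names, the `MODEL_VERSION` label, and the scoring function are placeholders, not the actual model. The idea shown is general: fingerprint the inputs in a canonical form, score without randomness, and store both with the model version and a timestamp.

```python
import hashlib
import json
from datetime import datetime, timezone

MODEL_VERSION = "example-0.1"  # hypothetical version label


def score(inputs: dict) -> float:
    """Placeholder score: fixed weights, no randomness, no hidden state."""
    weights = {"systolic_bp": 0.002, "ldl": 0.001, "smoker": 0.10}  # illustrative only
    raw = sum(weights[k] * float(inputs[k]) for k in sorted(weights))
    return round(min(max(raw, 0.0), 1.0), 4)


def input_fingerprint(inputs: dict) -> str:
    """SHA-256 of the canonical JSON form, so key order cannot change the hash."""
    canonical = json.dumps(inputs, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def audit_record(inputs: dict) -> dict:
    """What a reviewer needs to reproduce the score months later."""
    return {
        "model_version": MODEL_VERSION,
        "input_sha256": input_fingerprint(inputs),
        "inputs": dict(sorted(inputs.items())),
        "score": score(inputs),
        "scored_at": datetime.now(timezone.utc).isoformat(),
    }


a = {"systolic_bp": 164, "ldl": 178, "smoker": 1}
b = {"smoker": 1, "ldl": 178, "systolic_bp": 164}  # same data, different key order

print(score(a) == score(b))                          # same inputs, same score
print(input_fingerprint(a) == input_fingerprint(b))  # same inputs, same fingerprint
```

If a committee later re-runs the stored inputs through the stored model version and gets a different score, the fingerprint tells them whether the data changed or the system did.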
05

External validation

A model trained on one population can degrade when applied to another. The only way to know is to test it on a cohort it has never seen. If calibration survives the shift, the model is transferable.
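One common first check on a new cohort is the observed-to-expected ratio (often called calibration-in-the-large): events that actually happened divided by the events the model expected. The sketch below uses synthetic cohorts and a hypothetical function name; it is one summary among several, not a full validation protocol.

```python
def observed_expected_ratio(probs, labels):
    """Observed events divided by expected events (the sum of predicted probabilities)."""
    expected = sum(probs)
    if expected == 0:
        raise ValueError("model expected zero events")
    return sum(labels) / expected


# Synthetic cohorts for illustration only: same predicted risks, different outcomes.
probs = [0.1] * 50 + [0.3] * 50          # the model expects about 20 events in 100 patients
development_labels = [1] * 20 + [0] * 80  # 20 events observed
external_labels = [1] * 32 + [0] * 68     # 32 events observed

print(round(observed_expected_ratio(probs, development_labels), 2))  # 1.0
print(round(observed_expected_ratio(probs, external_labels), 2))     # 1.6
```

A ratio near 1 means the predicted risk matches reality on that cohort. A ratio of 1.6 means the new population has 60% more events than the model expected, so its probabilities would need recalibration before anyone sets thresholds on them.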

04 / standards Rules we follow

Three standards that govern any model we publish.

A predictive model in healthcare cannot operate in a vacuum. Three frameworks (one scientific, two regulatory) define what to report and how to handle data. We follow them by choice, not only by legal obligation.

TRIPOD+AI · BMJ 2024

The scientific standard

TRIPOD+AI is the reporting guideline that defines what a clinical-prediction-model paper must report to be taken seriously. It is not law; it is the bar the medical community expects to see.

Complete description of training and validation cohorts.
Explicit reporting of calibration, not only discrimination.
External validation with label blinding.
Model and weights available for independent review.

HIPAA · USA

Clinical privacy · North America

HIPAA sets out how identifiable clinical information is stored, transmitted, and shared in the US. Although we operate from LatAm, we follow it because it defines the standard any integration with international systems will demand.

Pseudonymization on any data leaving the client system.
Encryption at rest and in transit (AES-256 / TLS 1.3).
Access logs to protected health information (PHI).
Business Associate Agreements (BAA) where applicable.
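As an illustration of the pseudonymization item, here is a minimal sketch using a keyed hash (HMAC-SHA256) from Python's standard library. The field names and the list of direct identifiers are hypothetical, and the sketch assumes the secret key never leaves the client system. It shows the mechanism only; it is not by itself a claim of regulatory de-identification.

```python
import hashlib
import hmac


def pseudonymize(patient_id: str, secret_key: bytes) -> str:
    """Keyed hash: stable for the same id and key, not reversible without the key."""
    return hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()


def strip_identifiers(record: dict, secret_key: bytes) -> dict:
    """Drop direct identifiers and attach a pseudonym in their place."""
    direct_identifiers = {"patient_id", "name", "phone"}  # hypothetical field names
    cleaned = {k: v for k, v in record.items() if k not in direct_identifiers}
    cleaned["pseudo_id"] = pseudonymize(record["patient_id"], secret_key)
    return cleaned


key = b"replace-with-a-secret-kept-by-the-client"  # placeholder, never hard-code a real key
record = {"patient_id": "12345", "name": "Jane Doe", "phone": "555-0100", "ldl": 178}
print(strip_identifiers(record, key))
```

Because the same id and key always give the same pseudonym, records from different sources can still be linked for the same patient without the identifier itself leaving the client system.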

GDPR · EU

Clinical privacy · Europe

GDPR is the European privacy framework. It treats clinical data as a special category: explicit consent, right to erasure, algorithmic transparency. Latin American regulation (Law 1581, LGPD) took it as reference.

Explicit legal basis for processing clinical data.
Patient right to understand how their information is used.
Minimization: we ask only for the data the model needs.
Portability and right to erasure when the patient requests it.

05 / the engine Prioritization Engine

The engine that turns data into prioritized decisions.

A unified layer that reads any clinical data, computes risk with visible reasoning, and emits cohorts that are ready for action.

clinical chaos · fragmented data: HL7, FHIR, CSV, EHR, Labs, Claims, Notes

Corpus AI · Prioritization Engine

prioritized cohorts:
Immediate action: escalate · 24h
High-risk intervention: contact · 7d
Active chronic management: care plan
Stable monitoring: passive follow-up
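A sketch of how a calibrated probability could be mapped onto the four cohorts named above. The cut-off values are hypothetical and chosen only to make the example run; real thresholds would come from clinical review of the calibrated model.

```python
# Hypothetical cut-offs, highest first; real thresholds would come from clinical review.
TIERS = [
    (0.40, "Immediate action"),
    (0.20, "High-risk intervention"),
    (0.08, "Active chronic management"),
    (0.00, "Stable monitoring"),
]


def assign_tier(probability: float) -> str:
    """Return the first tier whose cut-off the probability reaches."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    for cutoff, tier in TIERS:
        if probability >= cutoff:
            return tier
    return TIERS[-1][1]  # unreachable for valid input; kept as a safe default


def prioritize(cohort: dict) -> list:
    """Sort patients by descending risk and attach each one's tier."""
    ranked = sorted(cohort.items(), key=lambda item: item[1], reverse=True)
    return [(pid, p, assign_tier(p)) for pid, p in ranked]


print(prioritize({"p1": 0.05, "p2": 0.52, "p3": 0.22}))
```

This only works if the probabilities are calibrated: a cut-off of 0.20 means something clinically only when a score of 0.20 corresponds to a real event rate of about 20%.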

06 / relevance Signal over noise

What matters now, not what happened before.

Every patient accumulates years of signals. Corpus reads them together and continuously weighs which factor changes the clinical decision today.

Temporal context

An event's urgency depends on when it happened and how it relates to the rest.

Continuous re-weighting

The network recomputes with every new data point, without anyone retraining it.

Visible reasoning

Every active node appears as a modifiable factor in the final score.

Abnormal ECG 3 days ago
BP 164/102 9 days ago
chest pain (reported) 22 days ago
BP 158/96 85 days ago
LDL 178 140 days ago
BP 152/94 310 days ago
LDL 165 540 days ago
HbA1c 6.1 820 days ago
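To give a feel for temporal weighting, the sketch below applies a simple exponential decay to the signals in the timeline above. This is an assumption made for illustration, not a description of how Corpus weighs signals: the 90-day half-life and the function names are hypothetical, and a real system would also account for clinical severity and for how events relate to each other.

```python
# (signal, days ago) copied from the timeline above.
SIGNALS = [
    ("Abnormal ECG", 3),
    ("BP 164/102", 9),
    ("chest pain (reported)", 22),
    ("BP 158/96", 85),
    ("LDL 178", 140),
    ("BP 152/94", 310),
    ("LDL 165", 540),
    ("HbA1c 6.1", 820),
]


def recency_weight(days_ago: float, half_life_days: float = 90.0) -> float:
    """Exponential decay: the weight halves every `half_life_days`."""
    if days_ago < 0:
        raise ValueError("days_ago must not be negative")
    return 0.5 ** (days_ago / half_life_days)


def rank_by_recency(signals, half_life_days: float = 90.0):
    """Return (signal, weight) pairs, most recent (highest weight) first."""
    weighted = [(name, round(recency_weight(d, half_life_days), 3)) for name, d in signals]
    return sorted(weighted, key=lambda item: item[1], reverse=True)


for name, weight in rank_by_recency(SIGNALS):
    print(f"{weight:5.3f}  {name}")
```

Under this toy rule, a blood pressure reading from nine days ago carries far more weight than one from ten months ago, which matches the intuition of the section even though the actual weighting is more than a function of age.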

07 / architecture The model x-ray

Not a black box. An architecture you can inspect.

Three layers. Data flows up, from integration to compute to clinical action. Built for engineering teams, medical directors, and CTOs who need to understand, not just trust.

↑ data flow

L3 · OUTPUT

Delivery & Clinical Action

Stratified lists, clinical reports, and dynamic dashboards that trigger clinical action: treatment adjustment, prevention programs, targeted intervention.

L2 · COMPUTE

Core Predictive Engine

Data reception and validation. The model processes the stream and feeds itself continuously: predictions update on their own when new data arrives.

L1 · INTEGRATION

Data Ingestion & Sync

Customer-system mapping (HIS, LIS, ERPs) via API, HL7, cloud buckets, SFTP, or manual CSV extraction as an immediate-impact pathway.

08 / model performance Model performance

Calibrated and validated on a real cohort.

The engine is not a promise: its performance is measured and published. These are the headline figures; the full methodology lives in the evidence.

sensitivity 93.8%

precision 95.8%

AUC 0.869
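For readers less familiar with these metrics: sensitivity is the share of patients who had the event that the model flagged, and precision is the share of flagged patients who actually had the event. The sketch below shows both definitions with illustrative counts; the counts are invented for the example and are not the cohort behind the figures above.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Of the patients who had the event, the share the model flagged."""
    return tp / (tp + fn)


def precision(tp: int, fp: int) -> float:
    """Of the patients the model flagged, the share who had the event."""
    return tp / (tp + fp)


# Illustrative counts only: 90 true positives, 10 false positives, 30 false negatives.
tp, fp, fn = 90, 10, 30

print(sensitivity(tp, fn))  # 0.75
print(precision(tp, fp))    # 0.9
```

Both numbers depend on the threshold used to flag a patient, while AUC summarizes ranking across all thresholds, so the three figures are best read together.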

09 / the loop The loop

Five components. One single loop.

01 · ingestion: We connect any clinical data source (HL7 · FHIR · CSV · EHR · PDF)
02 · standardization: We turn data into clinical intelligence (normalize · dedupe · map)
03 · risk + explainability: We compute risk and explain why (feature weights · modifiable factors)
04 · plan: We generate the optimal intervention (ranked by clinical cost-benefit)
05 · follow-up: We follow the patient over time (re-evaluated as risk changes)

10 / value How value in health is generated

A prevented event costs less than a treated one.

Value-based care measures the system by outcomes, not volume. For it to work operationally, the system needs to identify the patients who will generate the most events before they happen.

The math is direct. A payer with one hundred thousand members sees roughly one thousand acute cardiovascular events a year, each costing a multiple of what it would have cost to prevent. If prioritization captures 55% of those events within the 15% of members at highest risk, the system can concentrate targeted interventions (visits, titration, care-gap closure) on a manageable subset of about fifteen thousand people. The difference is not only financial: every prevented event is years of productive life for the patient.
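The arithmetic behind that paragraph, written out. Every input comes from the text above; the variable names are the only additions, and no cost figures are assumed.

```python
members = 100_000
events_per_year = 1_000
top_share = 0.15      # fraction of members flagged as highest risk
capture_share = 0.55  # fraction of yearly events that fall inside that group

flagged = round(members * top_share)               # 15,000 members
captured = round(events_per_year * capture_share)  # 550 events

rate_flagged = captured / flagged                               # about 3.67% per year
rate_rest = (events_per_year - captured) / (members - flagged)  # about 0.53% per year
baseline = events_per_year / members                            # 1.00% per year
lift = rate_flagged / baseline                                  # about 3.67x the average

print(f"flagged members: {flagged}, events captured: {captured}")
print(f"event rate in flagged group: {rate_flagged:.2%}")
print(f"event rate in everyone else: {rate_rest:.2%}")
print(f"lift over baseline: {lift:.2f}x")
```

In other words, a member in the prioritized group is roughly seven times as likely to have an event as a member outside it (3.67% versus 0.53%), which is why concentrating the additional effort there pays off.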

Prioritization does not replace universal prevention.

The universal system keeps running underneath; the predictive layer is the accelerator that picks where to put the additional energy.

See the model in action

Want to see what prioritization looks like?

The Corpus AI platform applies all of this to a real cohort: individual prioritization, population view, intervention levers, and clinical validation with figures, for companies, payers, providers, and individuals.
