FDA CDS Compliance

Criterion 4 is not about explainable AI.

It requires that a clinician be able to validate the recommendation without trusting your model. SHAP plots don't satisfy this. Neither does AUROC.

Miss Criterion 4 → your software loses the CDS exemption and is regulated as a medical device. Submission pathway required: 510(k), De Novo, or PMA — each with significant time and cost implications.

What the regulation actually says

“The healthcare professional must be able to independently review the basis for the recommendation, so that it is not the intent that the professional rely primarily on the recommendation to make a clinical diagnosis or treatment decision.”

FDA CDS Guidance (2026)

The litmus test

Could a physician review the basis for this recommendation — inputs, data, validation — and decide whether to trust it?

If no → the software doesn't qualify for the CDS exemption. The manufacturer must intend for the software to provide that basis. The clinician need not reconstruct the logic unaided.

Four questions your evidence package must answer

With evidence, not assertions.

Q1 · Data Provenance
Proprietary Data QA · Criterion 4(c)(ii)

Required: a structured audit of training sources, patient demographics, and a gap analysis against the deployment cohort.

Insufficient: 'Data was sourced from a representative multi-site cohort' — asserted, not demonstrated.
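A cohort gap analysis can be sketched in a few lines of stdlib Python. The `demographic_gap` helper, the 5-point threshold, and the age bands below are illustrative assumptions, not part of any FDA-specified method.

```python
from collections import Counter

def demographic_gap(train_values, deploy_values, threshold=0.05):
    """Flag strata whose population share differs between the training
    and deployment cohorts by more than `threshold` (absolute).
    Positive gap: over-represented at deployment relative to training."""
    train_share = {k: v / len(train_values)
                   for k, v in Counter(train_values).items()}
    deploy_share = {k: v / len(deploy_values)
                    for k, v in Counter(deploy_values).items()}
    gaps = {}
    for stratum in set(train_share) | set(deploy_share):
        diff = deploy_share.get(stratum, 0.0) - train_share.get(stratum, 0.0)
        if abs(diff) > threshold:
            gaps[stratum] = round(diff, 3)
    return gaps

# Hypothetical age bands: the deployment cohort includes an 80+ band
# that is entirely absent from the training data.
train = ["18-40"] * 50 + ["40-65"] * 40 + ["65-80"] * 10
deploy = ["18-40"] * 30 + ["40-65"] * 40 + ["65-80"] * 20 + ["80+"] * 10
print(demographic_gap(train, deploy))
```

The point is that the gap is demonstrated from the data itself, not asserted in prose: any stratum the model will see but never trained on shows up as a flagged entry.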

Q2 · Representativeness
Generalizability · Criterion 4(c)(iii)

Required: stratified performance metrics by age, ethnicity, acuity — including subpopulations not present in training data.

Insufficient: AUROC on a held-out test set drawn from the same distribution as the training data.
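A minimal stratified-AUROC sketch in pure Python, computing AUROC via the Mann-Whitney U statistic. The record fields (`age_band`, `label`, `score`) and the toy scores are assumptions for illustration.

```python
def auroc(labels, scores):
    """AUROC via the Mann-Whitney U statistic: the probability that a
    random positive case is scored above a random negative case."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # stratum lacks both classes: report the gap, don't hide it
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_auroc(records, key):
    """AUROC per subgroup, e.g. key='age_band' or key='ethnicity'."""
    strata = {}
    for r in records:
        strata.setdefault(r[key], []).append(r)
    return {s: auroc([r["label"] for r in v], [r["score"] for r in v])
            for s, v in sorted(strata.items())}

# A model can look strong overall while failing one subgroup outright.
records = [
    {"age_band": "18-65", "label": 1, "score": 0.91},
    {"age_band": "18-65", "label": 1, "score": 0.84},
    {"age_band": "18-65", "label": 0, "score": 0.22},
    {"age_band": "18-65", "label": 0, "score": 0.15},
    {"age_band": "65+",   "label": 1, "score": 0.35},
    {"age_band": "65+",   "label": 0, "score": 0.71},
]
print(stratified_auroc(records, "age_band"))
```

A single pooled AUROC would average these subgroups together; the stratified table is what lets a reviewer see exactly where the model holds and where it doesn't.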

Q3 · Recency
Stability · Criterion 4 lifecycle

Required: temporal drift analysis across simulated time windows, with documented thresholds for when retraining is required.

Insufficient: 'Validated on recent data' — with no ongoing monitoring protocol.
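One common way to make the retraining threshold concrete is the Population Stability Index (PSI) between the validation-time score distribution and a later window. The 0.25 trigger below is an industry rule of thumb, not an FDA-specified number; the whole sketch is illustrative.

```python
import math

def psi(baseline, window, bins=10):
    """Population Stability Index between a baseline score sample
    and a later time window's scores over shared equal-width bins."""
    lo = min(min(baseline), min(window))
    hi = max(max(baseline), max(window))
    def shares(sample):
        counts = [0] * bins
        for x in sample:
            i = min(int((x - lo) / (hi - lo) * bins), bins - 1)
            counts[i] += 1
        # floor at a tiny share so empty bins don't blow up the log
        return [max(c / len(sample), 1e-6) for c in counts]
    b, w = shares(baseline), shares(window)
    return sum((wi - bi) * math.log(wi / bi) for bi, wi in zip(b, w))

baseline = [i / 100 for i in range(100)]    # uniform scores at validation time
drifted  = [0.5 + x / 2 for x in baseline]  # later window: shifted upward
assert psi(baseline, baseline) == 0.0       # identical windows: no drift
RETRAIN_THRESHOLD = 0.25                    # rule of thumb; set per model
print(psi(baseline, drifted) > RETRAIN_THRESHOLD)
```

Documenting the statistic *and* the threshold is what turns "validated on recent data" into an ongoing monitoring protocol a reviewer can check.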

Q4 · Signal Attribution
Sanity · Criterion 4(c)(i)

Required: feature attribution confirming clinical signals drive predictions, not documentation artifacts or completeness proxies.

Insufficient: A SHAP summary plot showing top features by importance score.
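In contrast to a bare importance ranking, a behavioral check shuffles one feature and measures the performance drop: a near-zero drop proves the model does not actually use that signal. The feature names (`lactate`, `note_count`) and the toy model are hypothetical.

```python
import random

def permutation_importance(predict, X, y, col, metric, repeats=10, seed=0):
    """Average drop in `metric` when column `col` is shuffled across rows.
    A large drop means the model genuinely relies on that feature."""
    rng = random.Random(seed)
    base = metric(y, [predict(row) for row in X])
    drops = []
    for _ in range(repeats):
        shuffled = [row[col] for row in X]
        rng.shuffle(shuffled)
        Xp = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
        drops.append(base - metric(y, [predict(row) for row in Xp]))
    return sum(drops) / repeats

# Hypothetical features: col 0 = lactate (clinical signal),
#                        col 1 = note_count (documentation proxy).
rng = random.Random(1)
X = [[rng.random(), rng.random()] for _ in range(200)]
y = [int(row[0] > 0.5) for row in X]       # outcome driven by lactate
predict = lambda row: int(row[0] > 0.5)    # model that uses lactate only
accuracy = lambda yt, yp: sum(a == b for a, b in zip(yt, yp)) / len(yt)
print(permutation_importance(predict, X, y, 0, accuracy))  # large drop
print(permutation_importance(predict, X, y, 1, accuracy))  # 0.0: proxy unused
```

Unlike a SHAP plot, this is an intervention on the model's behavior: if shuffling `note_count` changed nothing, a reviewer has direct evidence the prediction is not riding on a documentation artifact.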

* Not FDA's official categories — how we operationalize the independent-reviewability requirement.

** For non-time-critical CDS, Criterion 4 is typically the decisive gate. Urgent-care models (sepsis, AKI) may face Criterion 3 time-criticality scrutiny before Criterion 4 applies.

Why the common approaches fail

Most teams ship explainability and call it compliance. The difference is testable.

| Dimension | Static Validation | Paper Compliance | Krv |
|---|---|---|---|
| When | Post-deployment | Pre-deployment (on paper) | Pre-deployment (in simulation) |
| Method | 80/20 static split | Written explainability report | 10,000+ synthetic scenarios |
| Criterion 4 Evidence | None | Documentation of lab behavior | Evidence produced by production stress-test |
| Model Outcome | Unknown until deployment | No change | Model improved against found failures |
| Defensibility | None | Documented, not tested | Tested, then documented |

Evidence package in days, not months

We produce the evidence.

nightingale — our free, open-source tool that runs in your environment — is coming soon. Or talk to us about a full evidence package for your model.

nightingale (coming soon) · Get Your Evidence Package

Further reading: FDA CDS Guidance page ↗ · CDS Guidance PDF (2026) ↗