FDA CDS Compliance

Criterion 4 is not about explainable AI.

It requires that a clinician be able to validate the recommendation without trusting your model. SHAP plots don't satisfy this. Neither does AUROC.

Miss Criterion 4 → your software loses the CDS exemption and is regulated as a medical device. Submission pathway required: 510(k), De Novo, or PMA — each with significant time and cost implications.

What the regulation actually says

“The healthcare professional must be able to independently review the basis for the recommendation, so that it is not the intent that the professional rely primarily on the recommendation to make a clinical diagnosis or treatment decision.”

FDA CDS Guidance (2026)

The litmus test

Could a physician review the basis for this recommendation — inputs, data, validation — and decide whether to trust it?

If no → the software doesn't qualify for the CDS exemption. The manufacturer must intend for the software to provide that basis. The clinician need not reconstruct the logic unaided.

Four questions your evidence package must answer

With evidence, not assertions.

Q1 · Data Provenance
Proprietary Data QA · Criterion 4(c)(ii)

Required: a structured audit of training sources, patient demographics, and a gap analysis against the deployment cohort.

Insufficient: 'Data was sourced from a representative multi-site cohort' — asserted, not demonstrated.
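A cohort gap analysis can be sketched in a few lines of stdlib Python. The `demographic_gap` helper, the 5-point threshold, and the age bands below are illustrative assumptions, not part of any FDA-specified method.

```python
from collections import Counter

def demographic_gap(train_values, deploy_values, threshold=0.05):
    """Flag strata whose population share differs between the training
    and deployment cohorts by more than `threshold` (absolute).
    Positive gap: over-represented at deployment relative to training."""
    train_share = {k: v / len(train_values)
                   for k, v in Counter(train_values).items()}
    deploy_share = {k: v / len(deploy_values)
                    for k, v in Counter(deploy_values).items()}
    gaps = {}
    for stratum in set(train_share) | set(deploy_share):
        diff = deploy_share.get(stratum, 0.0) - train_share.get(stratum, 0.0)
        if abs(diff) > threshold:
            gaps[stratum] = round(diff, 3)
    return gaps

# Hypothetical age bands: the deployment cohort includes an 80+ band
# that is entirely absent from the training data.
train = ["18-40"] * 50 + ["40-65"] * 40 + ["65-80"] * 10
deploy = ["18-40"] * 30 + ["40-65"] * 40 + ["65-80"] * 20 + ["80+"] * 10
print(demographic_gap(train, deploy))
```

The point is that the gap is demonstrated from the data itself, not asserted in prose: any stratum the model will see but never trained on shows up as a flagged entry.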

Q2 · Representativeness
Generalizability · Criterion 4(c)(iii)

Required: stratified performance metrics by age, ethnicity, acuity — including subpopulations not present in training data.

Insufficient: AUROC on a held-out test set drawn from the same distribution as the training data.
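A minimal stratified-AUROC sketch in pure Python, computing AUROC via the Mann-Whitney U statistic. The record fields (`age_band`, `label`, `score`) and the toy scores are assumptions for illustration.

```python
def auroc(labels, scores):
    """AUROC via the Mann-Whitney U statistic: the probability that a
    random positive case is scored above a random negative case."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # stratum lacks both classes: report the gap, don't hide it
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_auroc(records, key):
    """AUROC per subgroup, e.g. key='age_band' or key='ethnicity'."""
    strata = {}
    for r in records:
        strata.setdefault(r[key], []).append(r)
    return {s: auroc([r["label"] for r in v], [r["score"] for r in v])
            for s, v in sorted(strata.items())}

# A model can look strong overall while failing one subgroup outright.
records = [
    {"age_band": "18-65", "label": 1, "score": 0.91},
    {"age_band": "18-65", "label": 1, "score": 0.84},
    {"age_band": "18-65", "label": 0, "score": 0.22},
    {"age_band": "18-65", "label": 0, "score": 0.15},
    {"age_band": "65+",   "label": 1, "score": 0.35},
    {"age_band": "65+",   "label": 0, "score": 0.71},
]
print(stratified_auroc(records, "age_band"))
```

A single pooled AUROC would average these subgroups together; the stratified table is what lets a reviewer see exactly where the model holds and where it doesn't.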

Q3 · Recency
Stability · Criterion 4 lifecycle

Required: temporal drift analysis across simulated time windows, with documented thresholds for when retraining is required.

Insufficient: 'Validated on recent data' — with no ongoing monitoring protocol.
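One common way to make the retraining threshold concrete is the Population Stability Index (PSI) between the validation-time score distribution and a later window. The 0.25 trigger below is an industry rule of thumb, not an FDA-specified number; the whole sketch is illustrative.

```python
import math

def psi(baseline, window, bins=10):
    """Population Stability Index between a baseline score sample
    and a later time window's scores over shared equal-width bins."""
    lo = min(min(baseline), min(window))
    hi = max(max(baseline), max(window))
    def shares(sample):
        counts = [0] * bins
        for x in sample:
            i = min(int((x - lo) / (hi - lo) * bins), bins - 1)
            counts[i] += 1
        # floor at a tiny share so empty bins don't blow up the log
        return [max(c / len(sample), 1e-6) for c in counts]
    b, w = shares(baseline), shares(window)
    return sum((wi - bi) * math.log(wi / bi) for bi, wi in zip(b, w))

baseline = [i / 100 for i in range(100)]    # uniform scores at validation time
drifted  = [0.5 + x / 2 for x in baseline]  # later window: shifted upward
assert psi(baseline, baseline) == 0.0       # identical windows: no drift
RETRAIN_THRESHOLD = 0.25                    # rule of thumb; set per model
print(psi(baseline, drifted) > RETRAIN_THRESHOLD)
```

Documenting the statistic *and* the threshold is what turns "validated on recent data" into an ongoing monitoring protocol a reviewer can check.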

Q4 · Signal Attribution
Sanity · Criterion 4(c)(i)

Required: feature attribution confirming clinical signals drive predictions, not documentation artifacts or completeness proxies.

Insufficient: A SHAP summary plot showing top features by importance score.
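In contrast to a bare importance ranking, a behavioral check shuffles one feature and measures the performance drop: a near-zero drop proves the model does not actually use that signal. The feature names (`lactate`, `note_count`) and the toy model are hypothetical.

```python
import random

def permutation_importance(predict, X, y, col, metric, repeats=10, seed=0):
    """Average drop in `metric` when column `col` is shuffled across rows.
    A large drop means the model genuinely relies on that feature."""
    rng = random.Random(seed)
    base = metric(y, [predict(row) for row in X])
    drops = []
    for _ in range(repeats):
        shuffled = [row[col] for row in X]
        rng.shuffle(shuffled)
        Xp = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
        drops.append(base - metric(y, [predict(row) for row in Xp]))
    return sum(drops) / repeats

# Hypothetical features: col 0 = lactate (clinical signal),
#                        col 1 = note_count (documentation proxy).
rng = random.Random(1)
X = [[rng.random(), rng.random()] for _ in range(200)]
y = [int(row[0] > 0.5) for row in X]       # outcome driven by lactate
predict = lambda row: int(row[0] > 0.5)    # model that uses lactate only
accuracy = lambda yt, yp: sum(a == b for a, b in zip(yt, yp)) / len(yt)
print(permutation_importance(predict, X, y, 0, accuracy))  # large drop
print(permutation_importance(predict, X, y, 1, accuracy))  # 0.0: proxy unused
```

Unlike a SHAP plot, this is an intervention on the model's behavior: if shuffling `note_count` changed nothing, a reviewer has direct evidence the prediction is not riding on a documentation artifact.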

* Not FDA's official categories — how we operationalize the independent-reviewability requirement.

** For non-time-critical CDS, Criterion 4 is typically the decisive gate. Urgent-care models (sepsis, AKI) may face Criterion 3 time-criticality scrutiny before Criterion 4 applies.

Why the common approaches fail

Most teams ship explainability and call it compliance. The difference is testable.

| Dimension | Static Validation | Paper Compliance | Krv |
|---|---|---|---|
| When | Post-deployment | Pre-deployment (on paper) | Pre-deployment (in simulation) |
| Method | 80/20 static split | Written explainability report | 10,000+ synthetic scenarios |
| Criterion 4 Evidence | None | Documentation of lab behavior | Evidence produced by production stress-test |
| Model Outcome | Unknown until deployment | No change | Model improved against found failures |
| Defensibility | None | Documented, not tested | Tested, then documented |

Evidence package in days, not months

We produce the evidence.

nightingale — our free, open-source tool that runs in your environment — is coming soon. Or talk to us about a full evidence package for your model.

nightingale (coming soon) · Get Your Evidence Package

Further reading: FDA CDS Guidance page ↗ · CDS Guidance PDF (2026) ↗