Criterion 4 is not about explainable AI.
Miss Criterion 4 → your software loses the CDS exemption and is regulated as a medical device. Submission pathway required: 510(k), De Novo, or PMA — each with significant time and cost implications.
What the regulation actually says

Section 520(o)(1)(E)(iii) of the FD&C Act exempts a CDS function only when it is intended to enable the healthcare professional to independently review the basis for its recommendations, so that the clinician does not rely primarily on the software when making a diagnosis or treatment decision for an individual patient.
The litmus test

Can the intended clinician independently review the basis for each recommendation and reach their own judgment? If no → the software doesn't qualify for the CDS exemption. The manufacturer must intend for the software to provide that basis; the clinician need not reconstruct the logic unaided.
Four questions your evidence package must answer
With evidence, not assertions.
1. Where did the training data come from?
✓ Structured audit of training sources, patient demographics, and a gap analysis against the deployment cohort.
✗ 'Data was sourced from a representative multi-site cohort' — asserted, not demonstrated.
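A data-provenance gap analysis of this kind can be sketched in a few lines. Everything below is illustrative: the subgroup names, the cohort proportions, and the 5-point threshold are assumptions, not figures from any real audit.

```python
# Illustrative sketch: compare demographic proportions between the
# documented training sources and the intended deployment cohort.
# Subgroups, proportions, and the threshold are all made up.

def demographic_gaps(training, deployment, threshold=0.05):
    """Return subgroups whose population share differs by more than `threshold`."""
    gaps = {}
    for group in deployment:
        diff = deployment[group] - training.get(group, 0.0)
        if abs(diff) > threshold:
            gaps[group] = round(diff, 3)
    return gaps

training   = {"age_65_plus": 0.18, "black": 0.07, "hispanic": 0.05}
deployment = {"age_65_plus": 0.31, "black": 0.22, "hispanic": 0.06}

print(demographic_gaps(training, deployment))
# age_65_plus and black exceed the threshold; hispanic does not
```

The point is the artifact: a named list of subgroups where the deployment cohort outruns the training data, rather than a blanket representativeness claim.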
2. Does performance hold across subpopulations?
✓ Stratified performance metrics by age, ethnicity, and acuity — including subpopulations not present in the training data.
✗ AUROC on a held-out test set drawn from the same distribution as the training data.
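Stratified metrics are cheap to compute once the evaluation data carries subgroup labels. A minimal sketch, with made-up subgroup names, labels, and scores:

```python
# Minimal sketch: AUROC per subgroup, so gaps between populations are
# visible rather than averaged away. Data below is illustrative.

def auroc(labels, scores):
    """Mann-Whitney formulation: P(random positive outranks random negative)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_auroc(rows):
    """rows: (subgroup, label, model_score) tuples."""
    groups = {}
    for g, y, s in rows:
        groups.setdefault(g, []).append((y, s))
    return {g: auroc([y for y, _ in pairs], [s for _, s in pairs])
            for g, pairs in groups.items()}

rows = [("under_65", 1, 0.9), ("under_65", 0, 0.2),
        ("under_65", 1, 0.8), ("under_65", 0, 0.4),
        ("over_65", 1, 0.3), ("over_65", 0, 0.7),
        ("over_65", 1, 0.9), ("over_65", 0, 0.1)]
print(stratified_auroc(rows))  # performance gap: 1.0 vs 0.75
```

A pooled AUROC over these eight rows would hide exactly the gap the stratified view exposes.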
3. Does performance hold over time?
✓ Temporal drift analysis across simulated time windows, with documented thresholds for when retraining is required.
✗ 'Validated on recent data' — with no ongoing monitoring protocol.
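A drift check with an explicit retraining threshold can be this small; the baseline, the monthly figures, and the 0.05 threshold here are all assumed for illustration:

```python
# Illustrative sketch: track a performance metric across monthly windows
# and flag windows where the drop from baseline exceeds a documented
# retraining threshold. All numbers below are made up.

BASELINE = 0.86       # validation AUROC at deployment (assumed)
RETRAIN_DROP = 0.05   # documented threshold that triggers retraining

monthly_auroc = {
    "2025-01": 0.85, "2025-02": 0.84, "2025-03": 0.82,
    "2025-04": 0.80, "2025-05": 0.79,
}

def retrain_flags(windows, baseline, max_drop):
    """Windows whose metric has degraded past the documented threshold."""
    return [month for month, score in windows.items()
            if baseline - score > max_drop]

print(retrain_flags(monthly_auroc, BASELINE, RETRAIN_DROP))
# the last two windows breach the threshold
```

What matters for the evidence package is that the threshold is written down before deployment, so the flag is a documented trigger rather than a judgment call made after the fact.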
4. Is the model right for the right reasons?
✓ Feature attribution confirming that clinical signals drive predictions, not documentation artifacts or completeness proxies.
✗ A SHAP summary plot showing top features by importance score, with no clinical reading of whether those features should matter.
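One way to turn attribution into a testable claim is permutation importance: shuffle one feature and measure how much the score moves. The toy model, feature names, and coefficients below are hypothetical; the pattern to look for is a documentation proxy whose shuffled importance is near zero:

```python
# Illustrative sketch: permutation importance as a cross-check that
# clinical signals, not documentation proxies, drive the score.
import random

random.seed(0)

def model_score(row):
    # Toy risk score: lactate and creatinine stand in for clinical
    # signals; note_count stands in for a completeness proxy.
    return 0.6 * row["lactate"] + 0.3 * row["creatinine"] + 0.01 * row["note_count"]

rows = [{"lactate": random.random(), "creatinine": random.random(),
         "note_count": random.random()} for _ in range(200)]

def permutation_importance(feature):
    """Mean absolute score change when one feature's values are shuffled."""
    shuffled = [r[feature] for r in rows]
    random.shuffle(shuffled)
    return sum(abs(model_score({**r, feature: v}) - model_score(r))
               for r, v in zip(rows, shuffled)) / len(rows)

for f in ("lactate", "creatinine", "note_count"):
    print(f, round(permutation_importance(f), 4))
# lactate dominates; shuffling note_count barely moves the score
```

A SHAP plot alone ranks features; this kind of check makes an assertable claim about which features the model is allowed to depend on.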
* Not FDA's official categories — how we operationalize the independent-reviewability requirement.
** For non-time-critical CDS, Criterion 4 is typically the decisive gate. Urgent-care models (sepsis, AKI) may face Criterion 3 time-criticality scrutiny before Criterion 4 applies.
Why the common approaches fail
Most teams ship explainability and call it compliance. The difference is testable.
| Dimension | Static Validation | Paper Compliance | Krv |
|---|---|---|---|
| When | Post-deployment | Pre-deployment (on paper) | Pre-deployment (in simulation) |
| Method | 80/20 static split | Written explainability report | 10,000+ synthetic scenarios |
| Criterion 4 Evidence | None | Documentation of lab behavior | Evidence produced by production stress-test |
| Model Outcome | Unknown until deployment | No change | Model improved against found failures |
| Defensibility | None | Documented, not tested | Tested, then documented |
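The scenario-based column can be made concrete with a property test: sample synthetic inputs and assert a clinical invariant over all of them. The toy model, vital-sign ranges, and the monotonicity property below are illustrative assumptions, not Krv's actual harness:

```python
# Illustrative sketch of scenario-based stress testing: sample synthetic
# vitals, then check a clinical invariant (risk must not decrease as
# lactate rises, all else held equal). Model and ranges are made up.
import random

random.seed(7)

def risk(lactate, heart_rate):
    # Toy stand-in for the model under test.
    return min(1.0, 0.15 * lactate + 0.004 * heart_rate)

def stress_test(n=10_000, eps=0.5):
    """Return every sampled scenario that violates the invariant."""
    failures = []
    for _ in range(n):
        lac = random.uniform(0.5, 8.0)
        hr = random.uniform(50, 160)
        if risk(lac + eps, hr) < risk(lac, hr):  # monotonicity violated
            failures.append((lac, hr))
    return failures

print(len(stress_test()))  # a correctly monotone model yields 0 failures
```

Each failure is a concrete counterexample the model can then be retrained against, which is what separates produced evidence from documented intent.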
Evidence package in days, not months
We produce the evidence.
nightingale — our free, open-source tool that runs in your environment — is coming soon. Or talk to us about a full evidence package for your model.
Further reading: FDA CDS Guidance page · CDS Guidance PDF (2026)