Release
From peer-reviewed research in Nature Energy to an open-source tool — Thema's robust topological analysis, rebuilt in Rust for real-world datasets.
By Sidney Gathrid
Pulsar brings the THEMA algorithm to researchers, data scientists, and ML practitioners in a form that is fast enough for large, complex datasets and accessible without specialized topological tooling.
Get started with the Pulsar docs or browse the source on GitHub.
In early 2025, we published research in Nature Energy introducing the Thema algorithm — a new approach to extracting robust structure from complex, high-dimensional datasets. The paper applied Thema to accelerating US coal plant retirements, but the underlying method had far broader potential.
Pulsar is the productized form of that research: a high-performance Rust implementation that makes robust topological data analysis accessible to anyone — not just researchers with deep programming expertise.
When analyzing complex datasets, researchers typically rely on dimensionality reduction techniques like t-SNE, UMAP, or PCA. These tools produce visualizations and embeddings that can reveal structure — but they come with a fundamental limitation.
These embeddings are fragile. Adjust a hyperparameter slightly, and carefully identified clusters may dissolve entirely. This sensitivity raises an uncomfortable question: are the patterns you see genuine structure, or artifacts of a fortunate parameter choice?
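To make the fragility concrete, here is a minimal sketch using a toy distance-threshold clustering rather than t-SNE, UMAP, or PCA. The function `count_clusters`, the data, and the `eps` values are all hypothetical and chosen only for illustration; the point is simply that one tunable parameter can change the answer from six clusters to two to one.

```python
from itertools import combinations

def count_clusters(points, eps):
    """Count connected components when points closer than eps are linked.

    Illustrative stand-in for a parameter-sensitive method; this is not
    how t-SNE, UMAP, PCA, or Thema work internally.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if abs(points[i] - points[j]) < eps:
            parent[find(i)] = find(j)
    return len({find(i) for i in range(len(points))})

# Toy 1-D data: two groups with spacing 4 inside each group
# and a gap of 10 between the groups.
points = [0, 4, 8, 18, 22, 26]

for eps in (3, 5, 11):
    print(f"eps={eps}: {count_clusters(points, eps)} cluster(s)")
```

Only the middle setting recovers the two groups. Nothing in a single run tells you which of the three answers to trust, which is the problem the rest of this post addresses.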
Thema takes a fundamentally different approach. Rather than fighting parameter sensitivity, it exploits it as signal.
The result is a robust, reproducible representation — one that captures genuine data topology rather than parameter artifacts.
THEMA does not bet everything on one chart or one parameter setting. Instead, it looks at the same dataset many different ways, builds a whole multiverse of graphs, and asks which relationships keep showing up no matter how you tune the model. The cosmic graph then folds all of that evidence into one weighted map, so the strongest connections are the ones that stay stable across many possible views of the data.
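The idea of keeping only the relationships that persist across many views can be sketched in a few lines. This is not the Pulsar API or the published THEMA implementation: the k-nearest-neighbour graphs, the `y_scale` distortion, the function names, and the 0.5 cutoff are all assumptions standing in for whatever maps and parameters a real sweep uses. It shows only the aggregation step, where each edge is weighted by the fraction of parameter settings in which it appears.

```python
from collections import Counter
from itertools import product
import math

def knn_edges(points, k, y_scale):
    """Undirected k-nearest-neighbour edges under one parameter setting.

    y_scale stretches the second coordinate; it is a hypothetical stand-in
    for a hyperparameter that changes how the data is viewed.
    """
    edges = set()
    for i, (xi, yi) in enumerate(points):
        dists = sorted(
            (math.hypot(xi - xj, y_scale * (yi - yj)), j)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
        for _, j in dists[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges

def stability_graph(points, ks, y_scales):
    """Weight each edge by the fraction of settings in which it appears."""
    settings = list(product(ks, y_scales))
    counts = Counter()
    for k, s in settings:
        counts.update(knn_edges(points, k, s))
    return {edge: n / len(settings) for edge, n in counts.items()}

# Two small, well-separated groups of three points each.
points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]

weights = stability_graph(points, ks=(1, 2, 3), y_scales=(0.5, 1.0, 2.0))

# Keep edges present in at least half of the nine settings (assumed cutoff).
stable = sorted(edge for edge, w in weights.items() if w >= 0.5)
print(stable)
```

On this toy data the edges inside each group appear in most of the nine settings, while edges that bridge the two groups show up only when k is 3 and fall below the cutoff. The weighted dictionary plays the role of the single combined map described above, at a vastly smaller scale.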
Pulsar is not a different algorithm from THEMA. It is the same core approach rebuilt for speed and scale, so researchers can run large multiverse sweeps and cosmic-graph analysis on datasets that would otherwise be too slow or too cumbersome to explore in practice.
In practical terms, that means less time guessing at hyperparameters and more time identifying stable clusters, meaningful relationships, and structures that are worth acting on.
A Rust core with parallel processing makes THEMA practical at much larger scales. A full 4,000-map sweep completes in approximately 50 seconds.
Run THEMA-powered analysis through Claude using natural language, without needing to manage the workflow by hand.
Analyze complex datasets like MMLU that were previously impractical for running THEMA end to end.
The Model Context Protocol (MCP) integration fundamentally changes how you interact with topological analysis. You can run Pulsar directly through Claude using natural language.
Claude handles parameter tuning, executes the topological sweep, and generates a statistical dossier — all without writing a single line of code.
If you want to understand the method in practice, start with our technical deep-dive on the MMLU benchmark. If you want to run Pulsar yourself, the docs and GitHub links above will get you started quickly.
Complete user guide with installation, configuration, and API reference.
Source code, issues, and releases. Blazing fast Rust implementation of Thema.
Run Pulsar through Claude AI with natural language. No coding required.
Original publication introducing robust topological data analysis via parameter sweeps.
Our original post about scaling Thema to production datasets.
See Thema applied to US coal plant retirements in our Nature Energy paper.