We're solving the hard problems.

Sustainable AI needs more than clever engineering. It needs a flagship model worth deploying everywhere, compression that keeps improving, and privacy guarantees that are architectural — not contractual. Our research addresses all three.

Fern

7B-quality intelligence in a 750 MB footprint. Fern runs fully offline on a smartphone, an Apple Watch, or an ESP32 embedded chip: no cloud, no network latency, full privacy.

  • Up to 60× compression with no measurable quality loss
  • Runs on-device: phone, watch, embedded chip
  • 100 domain specialists on a single GPU
  • 2 ms specialist switch time
  • Validated at circuit level with full physics simulation
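
To put the footprint figures in context, the arithmetic can be sketched as below. This is a rough illustration, not a description of how Fern is measured. The parameter count, the baseline precisions, and the 80 GB GPU size are assumptions. Only the 750 MB footprint and the specialist count come from this page. The baseline behind the "up to 60×" figure is not stated here, so the printed ratios are illustrative only.

```python
# Illustrative footprint arithmetic. Only the 750 MB footprint and the
# "100 specialists" figure come from the text; everything else is assumed.

FOOTPRINT_BYTES = 750e6    # from the text (decimal megabytes assumed)
ASSUMED_PARAMS = 7e9       # assumption: parameter count of a 7B reference model
ASSUMED_GPU_BYTES = 80e9   # assumption: a GPU with 80 GB of memory


def baseline_bytes(params: float, bits_per_param: int) -> float:
    """Size of an uncompressed checkpoint stored at the given precision."""
    return params * bits_per_param / 8


def compression_ratio(params: float, bits_per_param: int, compressed_bytes: float) -> float:
    """Uncompressed size divided by compressed size."""
    return baseline_bytes(params, bits_per_param) / compressed_bytes


def specialists_that_fit(gpu_bytes: float, specialist_bytes: float) -> int:
    """How many equally sized specialists fit in memory, ignoring activations."""
    return int(gpu_bytes // specialist_bytes)


if __name__ == "__main__":
    for bits in (32, 16):
        ratio = compression_ratio(ASSUMED_PARAMS, bits, FOOTPRINT_BYTES)
        print(f"{bits}-bit baseline: {ratio:.1f}x smaller")
    print("specialists per GPU:", specialists_that_fit(ASSUMED_GPU_BYTES, FOOTPRINT_BYTES))
```

If each specialist were the same size as the 750 MB on-device model (an assumption), 100 of them would occupy 75 GB, which fits within the assumed 80 GB of GPU memory.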

By the numbers

  • 750 MB: on-device footprint
  • Up to 60×: compression ratio
  • 2 ms: domain switch time
  • 7 mm²: entire model on chip
  • 100+: specialists per GPU
  • 5: validated modalities

Active research directions

Higher ratios on reasoning architectures

Pushing compression further on next-generation model families without quality regression.

Lossless compression for safety-critical use

Formal quality guarantees for regulated domains such as healthcare and defence.

Vision-language and multimodal coverage

Extending our compression pipeline to protein, diffusion, and vision-language models.

Compression-native training pipelines

Building compressibility into the model from the start rather than applying it post-hoc.

Compression Refinements

Our compression technology already achieves up to 60× reduction with no measurable quality loss. But we're not done. Our research is advancing the frontier — higher ratios, broader model coverage, and compression techniques that are provably safe for regulated environments.

Active directions include architecture-aware pruning for next-generation model families, lossless compression for safety-critical applications, and compression-native training pipelines that build compressibility into the model from the start.

These improvements feed directly into our products: better compression means smaller on-device footprints, lower enterprise serving costs, and higher-quality compressed models in our open-source releases.

Privacy-Preserving Inference

Running AI on-device is one of the strongest privacy guarantees available: data never leaves the user's hardware, no inference logs accumulate on a server, and there is no network path through which data can leak. But privacy goes deeper than deployment topology.

We are researching differential privacy techniques for fine-tuning, secure aggregation for federated model updates, and confidential inference on shared infrastructure. The goal: AI that is private by construction, not just by policy.
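
Two of these ideas can be illustrated with toy code. The sketch below shows the textbook form of each technique, not our implementation: a DP-SGD-style aggregation step (clip each example's gradient, then add Gaussian noise) and pairwise-masked secure aggregation (masks cancel in the sum, so the server sees only the total). All function names and parameters are hypothetical. A real system would also need privacy accounting, cryptographic key agreement, and handling for clients that drop out.

```python
import math
import random


def clip_to_norm(vec, max_norm):
    """Scale vec down so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in vec]


def dp_mean_gradient(per_example_grads, max_norm, noise_multiplier, rng):
    """Clip each per-example gradient, sum, add Gaussian noise, then average."""
    clipped = [clip_to_norm(g, max_norm) for g in per_example_grads]
    dim = len(clipped[0])
    sigma = noise_multiplier * max_norm  # noise scale tied to the clipping bound
    noisy_sum = [sum(g[i] for g in clipped) + rng.gauss(0.0, sigma) for i in range(dim)]
    return [s / len(clipped) for s in noisy_sum]


def mask_updates(updates, modulus, rng):
    """Toy secure aggregation: every client pair (i, j) shares a random mask.

    Client i adds the mask and client j subtracts it, so each masked update
    looks random on its own while the masks cancel in the sum.
    Updates are integer vectors (for example, quantized model deltas).
    """
    n, dim = len(updates), len(updates[0])
    masked = [list(u) for u in updates]
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(dim):
                m = rng.randrange(modulus)  # a real protocol derives this from a shared key
                masked[i][k] = (masked[i][k] + m) % modulus
                masked[j][k] = (masked[j][k] - m) % modulus
    return masked


def aggregate(masked, modulus):
    """Server-side sum of masked updates; equals the sum of the true updates."""
    dim = len(masked[0])
    return [sum(u[k] for u in masked) % modulus for k in range(dim)]


if __name__ == "__main__":
    rng = random.Random(0)  # not cryptographically secure; for illustration only
    print(dp_mean_gradient([[3.0, 4.0], [0.0, 0.5]], 1.0, 1.0, rng))
    masked = mask_updates([[1, 2], [3, 4], [5, 6]], 2**16, rng)
    print(aggregate(masked, 2**16))
```

In both cases the design choice is the same: protection comes from the structure of the computation rather than from a promise about how the data will be handled.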

Combined with Fern's on-device capabilities, this makes our models uniquely suited to healthcare, defence, legal, and any other domain where data sovereignty is non-negotiable.

Why on-device privacy matters

No logs, no leakage

Sensitive data stays on the device — it is never transmitted, stored, or processed by a third party.

Regulatory compliance by design

Designed to meet HIPAA, GDPR, and air-gapped deployment requirements without contractual workarounds, because data never reaches a third party.

No cloud dependency

Inference runs entirely offline — no API keys, no connectivity requirement, no third-party exposure.

Interested in working with us on these problems?

We're looking for hardware partners, academic collaborators, and engineers who want to work on the foundations of sustainable AI.