Introducing Javai
AI systems are increasingly subject to regulatory scrutiny, yet the testing tools available to most teams were designed for a deterministic world. Javai is here to change that.

Bringing statistical rigour to software testing — so engineering teams can satisfy the regulatory demands of AI performance measurement and regression testing.
AI systems, probabilistic models, and non-deterministic processes are increasingly subject to regulatory scrutiny. Organisations must demonstrate measurable, reproducible evidence that these systems perform within acceptable bounds — not just once, but continuously.
Traditional unit testing assumes deterministic outcomes, an assumption that never fully held in practice. With AI there is no choice left: uncertainty has to be managed professionally, and that means statistically.
Define statistical expectations for your system's behaviour. Assert against distributions, not exact values. punit gives you the vocabulary to express what "correct" means in a non-deterministic context.
Detect when your system drifts beyond acceptable bounds. Run repeatable hypothesis tests in CI/CD and catch performance degradation before it reaches production.
Produce auditable, structured evidence that your AI systems perform as expected. Give regulators, auditors, and risk committees the confidence they need.
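As a rough illustration of asserting against a distribution rather than an exact value, the sketch below runs a trial repeatedly and applies a one-sided exact binomial test to the observed pass count. It is plain Java written for this page, not punit's API: the class name `PassRateCheck`, its methods, and the parameters `minPassRate` and `alpha` are all hypothetical.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch, not punit's API: a one-sided exact binomial test on a pass rate.
final class PassRateCheck {

    /** P(X <= k) for X ~ Binomial(n, p), summed term by term in log space. */
    static double binomialCdf(int k, int n, double p) {
        if (k < 0) return 0.0;
        if (k >= n || p <= 0.0) return 1.0;
        if (p >= 1.0) return 0.0;
        double logP = Math.log(p);
        double logQ = Math.log1p(-p);
        double logChoose = 0.0; // log C(n, 0)
        double cdf = 0.0;
        for (int i = 0; i <= k; i++) {
            cdf += Math.exp(logChoose + i * logP + (n - i) * logQ);
            // C(n, i + 1) = C(n, i) * (n - i) / (i + 1); safe because i <= k < n here
            logChoose += Math.log(n - i) - Math.log(i + 1);
        }
        return Math.min(1.0, cdf);
    }

    /**
     * p-value for the null hypothesis "true pass rate >= minPassRate":
     * the probability of seeing this few passes, or fewer, at exactly minPassRate.
     */
    static double pValue(int passes, int n, double minPassRate) {
        return binomialCdf(passes, n, minPassRate);
    }

    /** Runs the trial n times; returns false only if the pass count is significantly low. */
    static boolean meetsPassRate(BooleanSupplier trial, int n, double minPassRate, double alpha) {
        int passes = 0;
        for (int i = 0; i < n; i++) {
            if (trial.getAsBoolean()) passes++;
        }
        return pValue(passes, n, minPassRate) >= alpha;
    }
}
```

A CI job could call something like `meetsPassRate(trial, 100, 0.9, 0.05)` and fail the build when it returns false. The significance level `alpha` bounds how often a system that really does meet the target would be flagged by chance.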
Probabilistic unit testing for Java
punit is a JUnit 5 extension that runs tests multiple times and applies statistical inference to determine whether a non-deterministic system is behaving acceptably. Explore configurations, measure empirical baselines, and run regression tests in CI/CD — with configurable confidence levels, latency percentile assertions, and auditable verdicts.
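To make the idea of a latency percentile assertion concrete, here is a minimal sketch in plain Java that uses the nearest-rank definition of a percentile. It does not use or reproduce punit's API: `LatencyPercentile`, `percentile`, and `withinSla` are hypothetical names, and punit may compute percentiles differently.

```java
import java.util.Arrays;

// Hypothetical sketch, not punit's API: nearest-rank percentile checked against a threshold.
final class LatencyPercentile {

    /** Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it. */
    static long percentile(long[] samplesMillis, double pct) {
        if (samplesMillis.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (pct <= 0.0 || pct > 100.0) {
            throw new IllegalArgumentException("pct must be in (0, 100]");
        }
        long[] sorted = samplesMillis.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(pct / 100.0 * sorted.length);
        return sorted[rank - 1];
    }

    /** True if the given percentile of the observed latencies is within the threshold. */
    static boolean withinSla(long[] samplesMillis, double pct, long thresholdMillis) {
        return percentile(samplesMillis, pct) <= thresholdMillis;
    }
}
```

With few samples, a high percentile such as p95 or p99 is decided by one or two observations, so the number of repetitions should be chosen with the target percentile in mind.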
A complete example application demonstrating punit's capabilities — including an LLM-powered shopping basket tested with explore, measure, and optimize experiments, and a payment gateway verified against SLA thresholds.
A Java framework that bridges deterministic application code with fallible, non-deterministic operations. Replaces try/catch with type-safe Outcome values, structured failure classification, policy-driven retries, and built-in observability.
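The general pattern of returning an outcome value instead of throwing, and letting failure classification drive retries, can be sketched as follows. This is an illustration only, not the framework's API: `OutcomeSketch`, `Outcome`, `FailureKind`, and `retry` are hypothetical names, and the retry policy here is deliberately simplistic (no back-off, no observability hooks).

```java
import java.util.function.Supplier;

// Hypothetical sketch, not the framework's API: outcome values with classified failures.
final class OutcomeSketch {

    /** Coarse failure classification that a retry policy can act on. */
    enum FailureKind { TRANSIENT, PERMANENT }

    /** Either a success carrying a value, or a failure carrying a kind and a message. */
    static final class Outcome<T> {
        private final T value;
        private final FailureKind failure;
        private final String message;

        private Outcome(T value, FailureKind failure, String message) {
            this.value = value;
            this.failure = failure;
            this.message = message;
        }

        static <T> Outcome<T> ok(T value) {
            return new Outcome<>(value, null, null);
        }

        static <T> Outcome<T> fail(FailureKind kind, String message) {
            return new Outcome<>(null, kind, message);
        }

        boolean isOk() { return failure == null; }
        T value() { return value; }
        FailureKind failure() { return failure; }
        String message() { return message; }
    }

    /** Re-runs the operation while it fails transiently, up to maxAttempts calls in total. */
    static <T> Outcome<T> retry(Supplier<Outcome<T>> operation, int maxAttempts) {
        Outcome<T> last = operation.get();
        for (int attempt = 1;
             attempt < maxAttempts && !last.isOk() && last.failure() == FailureKind.TRANSIENT;
             attempt++) {
            last = operation.get();
        }
        return last;
    }
}
```

Because the failure kind is part of the returned value, the caller decides what to do with a `PERMANENT` failure explicitly rather than through a catch block, and the retry loop never repeats an operation that cannot succeed.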