Introducing Primus

A research system designed to advance the pace of scientific discovery

Blankline Research · 10 min read

Scientific discovery has always followed a particular rhythm. A researcher forms an intuition, translates it into formal language, tests it against data, revises, and iterates — sometimes for months, sometimes for years. The bottleneck is rarely insight alone. It is the slow, painstaking work of refinement: checking whether a derivation holds under edge cases, designing the right statistical test, finding the flaw in an argument before a reviewer does. This is the work that fills the space between a good idea and a published result.

Primus is an AI research system designed to occupy that space.

What Primus is

Primus is a research system built at Blankline for theoretical derivation refinement, statistical analysis design, and formal reasoning across scientific domains. It operates as a collaborative reasoning engine — working alongside a researcher to pressure-test hypotheses, explore mathematical structures, and identify failure modes in theoretical arguments before they reach peer review.

It is not a foundation model. It is not a chatbot repurposed for science. Primus is a system — purpose-built for the specific cognitive work that sits between a researcher's initial insight and a rigorous, publishable result.

The distinction matters. Foundation models are trained to be general. Primus is designed to be precise. Where a general model might generate plausible-sounding mathematics, Primus is built to find the step in a derivation that breaks, the statistical assumption that doesn't hold, the edge case that invalidates an otherwise elegant argument.

Why we built it

The current landscape of AI for science follows a pattern: large organisations build domain-specific systems backed by enormous compute budgets. AlphaFold predicts protein structures. AlphaTensor discovers matrix multiplication algorithms. GNoME finds new materials. Each system is purpose-built for a single domain, trained on domain-specific data, and requires infrastructure that only a handful of institutions can provide.

This approach works. It has produced genuine breakthroughs. But it also encodes an assumption — that AI-assisted scientific discovery is fundamentally a compute problem. That progress requires more parameters, more training data, more GPUs.

We believe the assumption is incomplete.

The hardest part of scientific research is not computation. It is reasoning — the capacity to hold a formal structure in mind, probe it for weaknesses, and determine whether it actually proves what it claims to prove. This is the work that separates a conjecture from a theorem, a correlation from a mechanism, a pattern from a discovery.

Primus is built on a different premise: that the binding constraint on AI-assisted scientific discovery is methodology, not compute.

How Primus works

[Figure: Primus architecture]

Primus processes research through five stages, each designed to mirror the rigour that a careful human collaborator would apply — but at a speed and consistency that human attention cannot sustain over long derivations.

Stage 1 — Input encoding. Primus ingests the raw materials of a research problem: formal conjectures, empirical datasets, derivation chains, prior literature, and the domain-specific formalisms that govern how arguments are constructed in a given field. This is not keyword extraction. It is structural parsing — understanding what a derivation is trying to prove, what assumptions it depends on, and where the logical load-bearing steps are.
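The idea of structural parsing can be made concrete with a small sketch. Everything below is illustrative only: the types and fields are hypothetical, not Primus's internal representation.

```python
from dataclasses import dataclass, field

@dataclass
class Step:
    """One step in a derivation chain."""
    claim: str                                       # what this step asserts
    depends_on: list = field(default_factory=list)   # indices of earlier steps it uses
    assumptions: list = field(default_factory=list)  # named assumptions it relies on

@dataclass
class Derivation:
    """A structurally parsed derivation: what it proves, under what assumptions."""
    goal: str
    assumptions: list
    steps: list

    def load_bearing(self):
        """Steps that later steps depend on -- where an error would propagate."""
        used = {i for s in self.steps for i in s.depends_on}
        return [self.steps[i] for i in sorted(used)]
```

A parse along these lines is what makes the later stages possible: once dependencies are explicit, a verifier can ask which steps actually carry the argument and which assumptions each conclusion rests on.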

Stage 2 — Core reasoning engine. This is where the central work happens. Primus decomposes logical arguments into their constituent steps, surfaces implicit assumptions the researcher may not have stated, identifies algebraic and geometric structures within the problem, and — critically — searches for cross-domain connections that might strengthen or simplify the argument. A technique from algebraic geometry applied to a question in astrophysics. An information-theoretic bound repurposed for computational complexity. Primus does not respect disciplinary walls because formal structures do not respect them either.

The reasoning engine also constructs and evaluates statistical frameworks. It treats statistics not as a mechanical procedure but as an argument that must be defended. It checks whether the assumptions behind a chosen test are actually satisfied by the data, quantifies sensitivity to methodological choices, and identifies confounds before a reviewer does.
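As a toy example of the kind of pre-flight check described above (a deliberately crude sketch, not Primus's actual method), one can flag data whose serial correlation makes an independence assumption hard to defend:

```python
def lag1_autocorrelation(xs):
    """Sample lag-1 autocorrelation of a sequence."""
    n = len(xs)
    mean = sum(xs) / n
    num = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1))
    den = sum((x - mean) ** 2 for x in xs)
    return num / den

def independence_plausible(xs, threshold=0.3):
    """Crude check: strong serial correlation undermines the i.i.d.
    assumption behind many standard tests (threshold is illustrative)."""
    return abs(lag1_autocorrelation(xs)) < threshold
```

Trending data such as `list(range(20))` fails this check — exactly the situation in which a test that assumes independent samples would overstate its own significance.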

Stage 3 — Verification layer. Every output from the reasoning engine passes through systematic verification: internal consistency checks across all steps in a derivation, boundary analysis to test whether conclusions hold at the edges of their parameter space, validation against known results, and robustness testing through perturbation and assumption relaxation. Each element receives a confidence assessment. Results that fail verification are not discarded — they are routed back to Stage 2 through a refinement feedback loop for iterative strengthening.
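The verify-then-refine pattern can be sketched in a few lines. The names here are hypothetical, and the real verification checks are domain-specific and far richer than this:

```python
def verify(result, checks):
    """Run named checks; return the failures and a crude confidence score
    (the fraction of checks passed)."""
    failures = [name for name, check in checks if not check(result)]
    return failures, 1 - len(failures) / len(checks)

def refine_until_verified(result, checks, refine, max_rounds=5):
    """Refinement feedback loop: a failing result is routed back for
    strengthening rather than discarded."""
    for _ in range(max_rounds):
        failures, confidence = verify(result, checks)
        if not failures:
            return result, confidence
        result = refine(result, failures)   # back to the reasoning stage
    return result, confidence
```

The point of the sketch is the control flow: verification failures carry information (which check failed), and that information is what drives the next refinement round.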

Stage 4 — Synthesis. Verified components are consolidated into coherent results: refined derivations with complete logical chains, statistical evidence packages with full sensitivity analyses, and — where the verification layer has flagged unresolved assumptions or open boundary cases — explicit identification of what remains unknown and where future work should focus.

Stage 5 — Verified output. The final stage produces publication-ready material: derivations that have survived systematic stress-testing, statistical frameworks designed to be reproducible, and discovery candidates — novel results flagged for independent human verification before any scientific claim is made.

An iterative discovery loop connects the output back to the input. Results from one cycle become the prior literature and formal structure for the next, enabling Primus to build on its own verified outputs across successive research iterations.
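Under the same caveat (a structural sketch, not the production system), the loop that feeds verified outputs back in as context might look like:

```python
def run_discovery_loop(problem, pipeline, cycles=3):
    """Iterative discovery loop: each cycle's verified output joins the
    context of prior results that seeds the next cycle."""
    context = []
    for _ in range(cycles):
        output = pipeline(problem, context)   # stages 1-5 on this problem
        context.append(output)                # output becomes next cycle's prior
    return context
```

In this sketch only outputs that survive the pipeline enter the context, so each cycle builds on verified results rather than on raw conjectures.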

What this means for working scientists

Most AI systems built for science attempt to do the science themselves — predicting protein structures, generating candidate molecules, discovering algorithms. Primus does something different. It checks the math.

A theoretical physicist working on a 40-page derivation does not need an AI to generate the proof. She needs something that will find the sign error on page 23, verify that the boundary conditions in Section 4 are consistent with the approximations in Section 2, and flag that the statistical test chosen in the analysis assumes independence that the data does not actually satisfy. This is what Primus does.

A mathematician exploring a conjecture does not need an AI to guess at the answer. He needs something that will systematically probe whether a proposed argument actually proves what it claims to prove — testing edge cases, searching for counterexamples, identifying the precise step where a logical gap exists. This is what Primus does.

A machine learning researcher designing a new architecture does not need an AI to write the code. She needs something that will verify whether the theoretical claims in the paper — convergence guarantees, complexity bounds, formal properties — actually follow from the stated assumptions. This is what Primus does.

The distinction is fundamental. Primus does not try to replace the scientist. It tries to be the most rigorous collaborator the scientist has ever worked with — one that does not get tired on page 30 of a derivation, does not develop blind spots from working on the same problem for months, and does not defer to authority when a logical step does not follow from its premises.

This is the tool that theoretical physicists, mathematicians, and statisticians have needed but never had — not an AI that competes with human insight, but one that holds human reasoning to the highest possible standard. And because it reasons across disciplinary boundaries, it can catch the kind of errors that emerge only when techniques from one field are applied to problems in another — precisely the context where domain-specific expertise is most likely to have blind spots.

What Primus is not

Transparency about limitations is as important as clarity about capabilities.

Primus is not autonomous in the way that word is sometimes used in AI marketing. It does not independently select research problems, design experiments, or make scientific claims. Every result produced with Primus is independently verified by the researcher. Every claim in a Primus-assisted paper represents the author's own scientific judgment. Primus is a tool — a powerful one, but a tool.

Primus is not a replacement for domain expertise. It cannot substitute for the physical intuition that leads an astrophysicist to examine a particular dataset, or the mathematical taste that tells a number theorist which lemma is worth proving. It augments the technical execution of research, not the creative direction.

Primus does not require dedicated training infrastructure. It was not built by training a new model. It was built by designing a methodology — a way of composing, verifying, and iterating on formal reasoning — that leverages existing AI capabilities in a pipeline purpose-built for scientific rigour. If AI-assisted scientific discovery requires a hundred-million-dollar compute budget, then it is available to perhaps a dozen organisations worldwide. If it does not, the implications are different.

The Blankline approach

The prevailing model for AI research labs is straightforward: raise capital, acquire compute, train large models, publish results. This model has produced extraordinary work. It has also concentrated the capacity for AI-assisted discovery in a small number of well-funded institutions.

Blankline operates differently. We are a research organisation with zero external funding. This is not a temporary condition or a phase before a funding round — it is a structural choice that shapes how we build systems. When compute is unconstrained, the natural approach is to scale. When compute is constrained, the natural approach is to think more carefully about methodology.

Primus is a product of the second approach. Every design decision in the system reflects a question we ask repeatedly: what is the minimum computational structure required to do this reasoning correctly? The answer, in many cases, is far less than the field currently assumes.

Current status

Primus is currently in early development. The system we are introducing today — Primus v0.1 — represents the first stable iteration of our research methodology, and it is the version that has been used in our recent scientific work. It remains an internal system at Blankline, not available for external access or deployment.

We want to be direct about where things stand: Primus is not a finished system. It is an active research programme. Core capabilities are functional and have been used to produce real scientific results, but the system is under continuous development. There are failure modes we are still characterising, capabilities we are still building, and fundamental design questions we are still answering. We are sharing Primus now not because it is complete, but because the results it has contributed to are beginning to enter public review — and we believe transparency about the tools used in scientific work is non-negotiable.

Primus is a research system that integrates and orchestrates existing AI capabilities — including foundation models — through a methodology designed specifically for formal scientific reasoning. Its value is not in any single underlying component, but in the verification pipeline, the cross-domain reasoning framework, and the iterative refinement architecture that connects them. We did not train a new model. We did not need to. The insight behind Primus is that the bottleneck in AI-assisted scientific discovery was never the model — it was the methodology.

The specific composition of Primus's pipeline — how components are orchestrated, what verification checks are applied at each stage, and how cross-domain transfer is implemented — will remain internal for the foreseeable future. We believe the responsible path is to demonstrate what the system can do through published, peer-reviewed scientific results, and to let those results speak for themselves.

What comes next

In the coming weeks and months, we will be publishing the scientific work produced during Primus's first research cycle — a series of contributions across astrophysics, mathematics, and machine learning. Each will be released as its own research publication through the standard peer-review process. We believe these results speak to the viability of methodology-driven AI research more clearly than any system description could.

We are sharing Primus today because the scientific community should know that this approach exists. Not as a competitor to compute-intensive methods, but as evidence that the design space for AI-assisted scientific discovery is larger than the current paradigm suggests.

The question of how AI will reshape scientific research is one of the most consequential questions of this decade. We think the answer will involve more than scale.

Primus v0.1 is an internal research system developed at Blankline. It is not available for external access. For research inquiries, contact research@blankline.org.
