Intelligence compounds
We built a learning loop that lets an agent improve from its own experience. We then watched it rediscover rules it was never told, first in code, then in mathematics. This is a report on what we found, what we think it means for discovery, and what we are not claiming.
Almost every AI system in use today is frozen. The model that answers you tonight is the same one that will answer you next year, and nothing that happens in between is kept. We have been studying the alternative: an agent that holds onto its own experience and uses it to get better at work it has never seen.
In a held-out study, an agent with this loop independently recovered three rules it was never told, learned only from its own failures, and applied all three to unseen tasks on the first try. An identical agent without the loop recovered none. We then pointed the same loop at a different domain, hidden mathematical laws, and it rediscovered laws from data alone, including ones with interaction terms, while a comparison agent with no learning recovered nothing.
We do not claim a smarter model, recursive self improvement, or autonomous science. The reasoning core never changes. What changes is what the system has lived through. That is a smaller claim than it sounds, and, we think, a more important one. Given experience, intelligence compounds.
The loop that was missing
A capable model is necessary but not sufficient. The gap between an impressive demo and a tool that gets better the longer you use it is memory: the ability to keep what worked, recall it at the right moment, and carry it into the next problem.
Our system keeps three kinds. Experience, the record of what it did. Lessons, the strategies it distilled from successes and failures. Knowledge, the things it has studied and been told. The model stays fixed. All of the learning lives in these stores, recalled into context exactly when a new task resembles an old one.
This is the same shape that human expertise takes. You do not retrain your brain to learn a codebase. You build up scar tissue, conventions, and recalled mistakes, and you apply them. We built the machine version of that, and then we tested whether it actually transfers.
What we found, part one: it learned the rules from its own failures
We ran two identical agents over a stream of coding tasks. The tasks quietly enforced three unstated rules, the kind of house conventions a new engineer has to learn by getting them wrong. We never told the agent the rules. We only let it see whether each attempt passed.
The agent with memory failed the first time it hit each rule, distilled a lesson from that failure, and stopped repeating the mistake. By the end it had recovered all three rules on its own. The decisive test came next: on a held-out set of tasks it had never seen, it applied all three and solved them on the first attempt. The identical agent without memory failed every held-out task. It never learned anything because it could not keep anything.
We then ran the control that matters most. We gave the agent a second set of tasks with no shared structure, so there was nothing transferable to learn. There, memory gave no advantage and the agent recorded nothing. We weight that null result heavily. It is evidence the gains are real where structure exists, and that the system does not manufacture improvement where it does not. It learns where there is something to learn, and stays silent where there is not.
What we found, part two: the same loop reaches into mathematics
A coding convention and a law of physics are not as far apart as they sound. Both are hidden structure you recover by proposing an explanation, testing it against reality, noticing when you are stuck, and trying again with what you learned.
We pointed the same loop at hidden mathematical laws and asked it to recover them from data alone. It proposed candidate explanations, had them scored against held-out data, diagnosed its own plateaus, and carried the search strategies it learned on one law into the next. It rediscovered laws this way, including ones with interaction terms that a comparison agent with no learning loop never found.
We are deliberately careful here. It does not recover every law, the laws in this study are synthetic, and rediscovering a known relationship is not the same as discovering an unknown one. But the behavior is the one that science is actually made of, and watching a frozen model perform it, improving its own method as it goes, is the result that made us write this down.
Two products, one brain
The same learning lives behind two very different surfaces, and the difference is the point.
The Dropstone CLI is a coding agent. It works inside your repository. It learns the procedural lessons of your codebase, the conventions, the traps, the fix that worked last time, and recalls them before it acts. It gets better at your code, specifically, in a way no general model can be shipped pre-trained to do.
chat.dropstone.io is a general assistant. It remembers your conversations, studies the documents and sources you give it, and answers from the knowledge it has accumulated about your work and your world. It gets better at you.
The part that compounds is that they share one per-user memory. A lesson the coding agent learns in your terminal is available to the assistant in your browser, and what you teach the assistant is there when you code. It is not two tools with two memories. It is one accumulating understanding of your work, reachable from wherever you are.
Where we think this points
Most useful discovery is recombination: taking known pieces and arranging them so they explain something they did not before. Kepler did it with ellipses and Tycho's data. Much of working science is exactly this, and it is the kind of search our loop already performs.
The harder kind invents the missing piece itself. We see two routes a system like this could take toward it, and we are honest that both are open problems. One is data-driven and needs a fast, cheap verifier, which in the real world means an instrumented lab; serious work on self-driving labs is early but real. The other is consistency-driven: notice that an existing theory contradicts a principle it must obey, and find the one new term that resolves the contradiction. That route is striking because its verifier is symbolic and instant, no lab required, and it is the move behind some of the largest steps in the history of physics.
This may sound grandiose, and there are good reasons to be skeptical of it. We are not claiming to have done it. We are saying we have a loop that demonstrably recovers hidden structure, that transfers across domains, and that improves its own method, and that this is the right shape for the assisted-discovery systems we want to build next.
What we do not claim
The model is frozen. The learning raises the floor, not the ceiling: the system gets better at reusing what its existing capabilities can already reach, not at reasoning it could never do before. This is not recursive self improvement, and the model does not improve itself.
The effects are narrow by design. They appear where tasks share structure and vanish where they do not. The mathematical results are on synthetic laws, recover some and not all, and select among known building blocks rather than inventing new ones. A person still decides which problems are worth solving and judges whether an answer is correct. We are reporting a learning curve and a set of controlled rediscoveries, not a benchmark score, not autonomy, and not a new model.
The claim we will defend
Strip away everything we are not yet entitled to say, and a small, sturdy claim remains. We built a system whose intelligence is fixed but whose experience is not, and that difference is enough for it to learn rules it was never given, transfer them to new problems, and carry its method from code into mathematics.
That is the version we will stand behind in public, and it is still the one worth being excited about. The model you use does not change. What it has learned with you does, and that part only ever grows. Given experience, intelligence compounds.