Despite years of industrial and academic research, most approaches to Recursive Self-Improvement are still too slow and expensive to be practical: each step of self-improvement requires extensive training of LLMs (or even retraining them from scratch!). This does not scale. Poetiq is building intelligent systems that improve themselves directly. This works now. And it is fast. It’s fast enough that in a few months, we will be offering it for free. It will enable us, and you, to create reliable reasoning systems that solve the real-world, practical problems and workflows businesses face every day.
We’ve used RL for years. But RL post-training has severe limitations: it’s slow, expensive, and requires millions of data points. These limitations make effective RL training impractical for all but a handful of lucky companies. Already, we see reasoning models struggling with problems outside of coding and math. Why do those two domains work well? Because in both, it’s possible to generate large amounts of synthetic data cheaply, which is crucial for RL. That isn’t true for most domains.
Our approach allows us to find effective task-specific reasoning strategies using far less data (hundreds of data points rather than millions), while remaining compatible with the LLMs you’re already using.
LLMs are amazing, and in particular they are amazing databases: they contain much of humanity’s digitized knowledge. But if you use them naively, you will not reliably get access to all of that knowledge. And it’s not just a matter of prompt optimization. You have to know what questions to ask, not just how to ask them, and you have to synthesize the pieces of information they reveal. The information is in there, but it’s in fragments that must be put together to reveal the complete answer, whether that’s a fact, an algorithm, or an insight. We are building the intelligence on top of LLMs to extract their hidden information.
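To make the "fragments" idea concrete, here is a minimal, hypothetical sketch of asking several targeted sub-questions and combining the answers, rather than asking one broad question. Everything here is illustrative: `ask_llm` is a stub standing in for a real model call, and the canned answers exist only so the example runs.

```python
def ask_llm(question: str) -> str:
    """Stub for an LLM call; a real system would query a model API."""
    # Canned fragments, purely for illustration.
    canned = {
        "Who proposed the algorithm?": "Dijkstra proposed it in 1956.",
        "What does it compute?": "Shortest paths from a single source.",
        "What is its complexity?": "O((V + E) log V) with a binary heap.",
    }
    return canned.get(question, "I don't know.")

def extract_and_synthesize(sub_questions: list[str]) -> str:
    """Ask each targeted sub-question, then combine the fragments."""
    fragments = [ask_llm(q) for q in sub_questions]
    return " ".join(fragments)

answer = extract_and_synthesize([
    "Who proposed the algorithm?",
    "What does it compute?",
    "What is its complexity?",
])
print(answer)
```

A real synthesis step would itself involve reasoning over the fragments (checking consistency, resolving conflicts); the join here is only a placeholder for that step.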
Self-improvement is how we do it quickly. The more problems we tackle, the better our system is at tackling the next one. Our systems are continually improving: they adapt to the underlying models’ information-storage mechanisms (their quirks) and figure out how best to extract and synthesize the information each one contains. We do not build our intelligence into the LLMs; rather, we build a complete intelligent ecosystem around them that harnesses what the LLMs contain for better reasoning and problem solving. And, soon, we will be doing it automatically.
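One way to picture improvement that happens outside the model is a loop that scores candidate reasoning strategies on a small evaluation set and keeps the best, with no weight updates to any LLM. This is a toy sketch under our own assumptions, not Poetiq's actual system: the task, the strategy names, and the strategies themselves are invented stand-ins for prompting/reasoning pipelines wrapped around a model.

```python
# Toy task: given a list of numbers, answer with their sum.
eval_set = [([1, 2, 3], 6), ([10, -4], 6), ([5], 5)]

# Hypothetical candidate "strategies": stand-ins for different
# reasoning pipelines built around an underlying LLM.
strategies = {
    "naive": lambda xs: xs[0],       # answers with the first number seen
    "stepwise": lambda xs: sum(xs),  # accumulates the answer step by step
}

def score(strategy, data) -> float:
    """Fraction of evaluation problems the strategy solves."""
    return sum(strategy(x) == y for x, y in data) / len(data)

# Keep whichever strategy performs best on the small evaluation set.
best_name = max(strategies, key=lambda n: score(strategies[n], eval_set))
print(best_name, score(strategies[best_name], eval_set))  # stepwise 1.0
```

The key property the sketch illustrates is that improvement lives in the strategy pool, which can grow and adapt with only hundreds of data points, while the models underneath stay frozen.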