Markets are one of the clearest real-world tests for agents.
Markets combine noisy signals, delayed feedback, partial observability, adversarial behavior, and shifting regimes, which is exactly where brittle systems tend to fail.
What we are building
We turn financial market data into RL evaluation and post-training environments for agents. These environments combine realistic market simulations with reliable agent behavior metrics to prevent reward hacking.
Why this matters
Many current evaluations are too narrow to provide a reliable estimate of how an agent will behave under real-world conditions. We think stronger systems need environments that are harder to exploit and closer to real deployment conditions.
Who this is for
Frontier labs, evaluation teams, and researchers working on RL and post-training who want more robust agents and realistic long-horizon decision-making.