Founding RL Researcher | San Francisco | $300K + Equity
The best AI researchers aren't waiting for a job ad. They're waiting for the right moment.
This might be it.
Anthropic leadership has discussed spending over $1 billion on RL environments in the next year alone. The market is moving - fast and at scale. One pre-seed company just closed a round at a valuation most Series A companies would envy, with frontier AI labs already as paying clients and acquisition interest already on the table.
The reason? They cracked something the rest of the market hasn't.
Most teams build RL environments from synthetic data - easy to demo, easy to commoditise, brittle. This team mines real human behavioural data - how domain experts actually reason, decide, and solve complex tasks over long-horizon workflows. 10-100+ step environments. Closed-loop systems where environments, data, training, and evaluation are tightly integrated. Not proxies. Not shortcuts.
25-30% uplift in model task success rates. 50-65% more training signals. Evals that reflect how humans actually work.
The funding just landed. Second-time founders. The founding team is being built right now.
The founding research seat is one of the roles being filled. The person who takes it defines the research agenda from day one.
- $300K base + 0.5-1% founding equity at a $20-30M pre-seed valuation
- Frontier AI labs as paying clients from day one
- Greenfield research direction - environment design treated as a first-class problem
- San Francisco; remote is an option, but founding-team energy is a must
- Get in now - before the seats are gone and you're watching from the outside
What you'll be doing
- Defining the research agenda for long-horizon RL on real expert behavioural data
- Designing novel reward models, verifiers, and evaluation frameworks where the textbook doesn't exist yet
- Running training experiments that prove the gyms drive real capability gains - not vanity metrics
- Publishing and shaping how the field thinks about environment design
- Working shoulder to shoulder with the founding engineering team to turn research into production-grade infrastructure
You'll love this if...
- You've done serious RL research - frontier lab, research-heavy startup, or strong PhD with published work
- You think environment design is the most underrated problem in AI right now
- Long-horizon RL excites you more than short-form RLHF
- You want to define the science, not run experiments on someone else's stack
- You move fast, think in first principles, and want your fingerprints on the foundational decisions
There are very few seats. Are you going to be in one of them?
Built Different. Because great teams are built different.