Founder-led hiring for startups. High-signal roles across Tech, Product, Data and GTM.
Built Different: built by founders, for founders.
Tip: If a role looks close but not perfect, apply anyway - we can calibrate quickly.
No spam. No noise. Just strong teams.
Founding RL Researcher
San Francisco, CA

Founding RL Researcher | San Francisco | $300K + Equity


The best AI researchers aren't waiting for a job ad. They're waiting for the right moment.


This might be it.


Anthropic leadership has discussed spending over $1 billion on RL environments in the next year alone. The market is moving - fast and at scale. One pre-seed company just closed a round at a valuation most Series A companies would envy, with frontier AI labs already as paying clients and acquisition interest already on the table.


The reason? They cracked something the rest of the market hasn't.


Most teams build RL environments from synthetic data - easy to demo, easy to commoditise, brittle. This team mines real human behavioural data - how domain experts actually reason, decide, and solve complex tasks over long-horizon workflows. 10-100+ step environments. Closed-loop systems where environments, data, training, and evaluation are tightly integrated. Not proxies. Not shortcuts.


25-30% uplift in model task success rates. 50-65% more training signals. Evals that reflect how humans actually work.


The funding just landed. Second-time founders. The founding team is being built right now.


The founding research seat is one of the open roles. The person who takes it defines the research agenda from day one.


🚀 $300K base + 0.5-1% founding equity - $20-30M pre-seed valuation

🧠 Frontier AI labs as paying clients from day one

🔬 Greenfield research direction - environment design treated as a first-class problem

🌉 San Francisco, remote an option but founding team energy a must

📈 Get in now - before the seats are gone and you're watching from the outside


What you'll be doing


🧪 Defining the research agenda for long-horizon RL on real expert behavioural data

📐 Designing novel reward models, verifiers, and evaluation frameworks where the textbook doesn't exist yet

🔁 Running training experiments that prove the gyms drive real capability gains - not vanity metrics

📝 Publishing and shaping how the field thinks about environment design

🏗️ Working shoulder to shoulder with the founding engineering team to turn research into production-grade infrastructure


You'll love this if...


✅ You've done serious RL research - frontier lab, research-heavy startup, or strong PhD with published work

✅ You think environment design is the most underrated problem in AI right now

✅ Long-horizon RL excites you more than short-form RLHF

✅ You want to define the science, not run experiments on someone else's stack

✅ You move fast, think in first principles, and want your fingerprints on the foundational decisions


There are very few seats. Are you going to be in one of them?


Built Different. Because great teams are built different.


Startup hiring: Tech, GTM, Product and Data. NL, EU & US.
Prefer a quick intro call before applying? Share your CV and a short note on what you want next.
© Built Different