Founding RL Researcher (San Francisco, CA)

Founder-led hiring for startups. High-signal roles across Tech, Product, Data and GTM.

Built Different. Built by founders, for founders.

Tip: If a role looks close but not perfect, apply anyway - we can calibrate quickly.

No spam. No noise. Just strong teams.

Founding RL Researcher | San Francisco | $300K + Equity

The best AI researchers aren't waiting for a job ad. They're waiting for the right moment.

This might be it.

Anthropic leadership has discussed spending over $1 billion on RL environments in the next year alone. The market is moving - fast and at scale. One pre-seed company just closed a round at a valuation most Series A companies would envy, with frontier AI labs already as paying clients and acquisition interest already on the table.

The reason? They cracked something the rest of the market hasn't.

Most teams build RL environments from synthetic data - easy to demo, easy to commoditise, brittle. This team mines real human behavioural data - how domain experts actually reason, decide, and solve complex tasks over long-horizon workflows. 10-100+ step environments. Closed-loop systems where environments, data, training, and evaluation are tightly integrated. Not proxies. Not shortcuts.

25-30% uplift in model task success rates. 50-65% more training signals. Evals that reflect how humans actually work.

The funding just landed. Second-time founders. The founding team is being built right now.

The founding research seat is one of the open roles. The person who takes it defines the research agenda from day one.

🚀 $300K base + 0.5-1% founding equity - $20-30M pre-seed valuation

🧠 Frontier AI labs as paying clients from day one

🔬 Greenfield research direction - environment design treated as a first-class problem

🌉 San Francisco - remote is an option, but founding-team energy is a must

📈 Get in now - before the seats are gone and you're watching from the outside

What you'll be doing

🧪 Defining the research agenda for long-horizon RL on real expert behavioural data

📐 Designing novel reward models, verifiers, and evaluation frameworks where the textbook doesn't exist yet

🔁 Running training experiments that prove the gyms drive real capability gains - not vanity metrics

📝 Publishing and shaping how the field thinks about environment design

🏗️ Working shoulder to shoulder with the founding engineering team to turn research into production-grade infrastructure

You'll love this if...

✅ You've done serious RL research - frontier lab, research-heavy startup, or strong PhD with published work

✅ You think environment design is the most underrated problem in AI right now

✅ Long-horizon RL excites you more than short-form RLHF

✅ You want to define the science, not run experiments on someone else's stack

✅ You move fast, think from first principles, and want your fingerprints on the foundational decisions

There are very few seats. Are you going to be in one of them?

Built Different. Because great teams are built different.

Startup hiring: Tech, GTM, Product and Data. NL, EU & US.

Prefer a quick intro call before applying? Share your CV and a short note on what you want next.