Research Engineer, Post Training RL Job at TensorStax, San Mateo, CA

  • TensorStax
  • San Mateo, CA

Job Description

Research Engineer – Post Training Reinforcement Learning

Location: San Francisco (Hybrid)

About TensorStax

TensorStax is building fully autonomous AI systems to manage and maintain mission-critical data infrastructure and pipelines. We leverage reinforcement learning to enhance language models' ability to reason over large-scale data lakes and warehouses, detect pipeline failures, construct new pipelines with high precision, and act as agents that proactively identify and resolve issues on their own.

What You’ll Do

As a Research Engineer specializing in Reinforcement Learning, you will:

  • Develop and refine reward functions to optimize agent behavior for complex data engineering tasks.
  • Create RL gym environments for language model agents.
  • Fine-tune language models using reinforcement learning techniques such as PPO, DPO, and KTO.
  • Stay at the forefront of research on RL for language models, incorporating advancements like GRPO, SWE-Gym, and SWE-RL into practical applications.
  • Curate and build high-quality datasets for supervised fine-tuning (SFT) and RLHF.
  • Design experiments to evaluate and improve the agentic capabilities of language models in data environments.

What We’re Looking For

  • Deep understanding of reinforcement learning, reward shaping, and optimization strategies.
  • Strong familiarity with LLM fine-tuning techniques (PPO, DPO, KTO) and their applications in reinforcement learning.
  • Knowledge of recent advancements in RL for language models (GRPO, SWE-Gym, SWE-RL).
  • Experience curating and constructing high-quality datasets for fine-tuning.
  • Strong problem-solving skills and a history of working on complex ML projects.
  • High agency—ability to work independently, experiment proactively, and drive research initiatives forward.

Bonus Points

  • Experience with distributed training in PyTorch (DDP, FSDP).
  • Hands-on experience designing RL environments for traditional RL problems.
  • Contributions to open-source projects in RL, LLMs, or ML infrastructure.
  • Familiarity with data lakes and warehouses (Snowflake, BigQuery, Redshift).

Benefits

  • 100% employer-covered health, dental, and vision insurance.
  • 401(k) with company match.
  • Access to Bay Club or Equinox in San Francisco.

Job Tags

Temporary work
