Anthropic · AI Research & Engineering · San Francisco, CA | New York City, NY
Software Engineer, Research Data Platform
Classified Tasks (13)
Automate 0% · Augment 85% · Human-Only 15%
Augment (11)
AI assists, human decides
1. Build tools to manage, query, and analyze training and evaluation data for frontier models
technical
2. Power internal applications that monitor reinforcement learning (RL) training runs
operational
3. Enable exploration of finetuning datasets through internal applications and interfaces
technical
4. Build and operate data pipelines that extract data from research training runs and load it into queryable storage systems
operational
5. Design and build APIs, libraries, and web interfaces to support researcher data management, exploration, and analysis
technical
6. Develop dataset management tooling, including data cataloging and provenance systems for day-to-day research use
technical
8. Identify high-leverage tooling opportunities within research workflows
analytical
9. Ship tooling solutions quickly to meet research team needs
operational
11. Build ML-specific tooling alongside research teams
technical
12. Leverage existing Data Infrastructure components when developing new tools and pipelines
operational
13. Power internal services that help researchers understand experiment internals and metrics
operational
Human-Only (2)
Requires human judgment
7. Embed with research teams to understand workflows and gather tooling requirements
communication
10. Collaborate with adjacent teams to integrate with and extend existing systems rather than rebuilding them
communication
Job description
About Anthropic
Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.
About the role
The Research Data Platform team builds the tools that Anthropic's researchers use every day to manage, query, and analyze the data that goes into training and evaluating frontier models. We power the internal applications researchers rely on to monitor RL runs, explore finetuning datasets, and understand what's happening inside their experiments.
We're looking for engineers who love working directly with users and who excel at building data products: the pipelines that move data out of training runs into queryable storage, and the APIs, libraries, and services researchers use to manage and explore it. This role sits closer to the research workflow than a typical data infrastructure position: you'll often embed with research teams, build ML-specific tooling alongside them, and leverage what our Data Infrastructure team has already built rather than reinventing it.
We do not require prior ML or AI training experience. If you enjoy working closely with technical users, learning new domains quickly, and building tools people actually want to use, you'll pick up the research context fast.
Responsibilities
Build and operate data pipelines that extract data from research training runs and land it in storage systems that are easy and fast to query
Work closely with researchers to design and build APIs, libraries, and web interfaces that support data management, exploration, and analysis
Develop dataset management, data cataloging, and provenance tooling that researchers use in their day-to-day work
Embed with research teams to understand their workflows, identify high-leverage tooling opportunities, and ship solutions quickly
Collaborate with adjacent teams to build on existing systems rather than reinventing them
You may be a good fit if you
Have significant software engineering experience, particularly building data-intensive applications or internal tooling
Enjoy working directly with users, gathering requirements iteratively, and shipping things that get adopted
Are results-oriented, with a bias towards flexibility and impact
Pick up slack, even if it goes outside your job description
Want to learn more about machine learning research
Care about the societal impacts of your work