Research Scientist, Frontier Red Team (Emerging Risks) at Anthropic — task breakdown

Classified Tasks (21)

Automate 0% · Augment 67% · Human-Only 33%

Augment (14)

AI assists, human decides

Scope societal risks caused by advanced AI models

analytical

Evaluate societal risks posed by advanced AI models

analytical

Red team advanced AI models to discover vulnerabilities and misuse pathways

technical

Design defenses and safeguards against societal risks from AI models

technical

Build evaluation suites (evals) to measure model behaviors and capabilities

technical

Design research experiments to probe emerging risks from models

analytical

Run experiments to collect data on model capabilities and failure modes

technical

Search for and analyze real-world signals that indicate emerging AI-driven risks

analytical

Translate experimental findings into actionable insights to guide technology development and deployment

communication

Produce internal and external artifacts (research papers, products, demos, dashboards, tools) that communicate model capabilities

communication

Identify and track the growth of real autonomous businesses in the wild using Clio and other tools

analytical

Build benchmarks assessing models’ national security capabilities

technical

Red team unsafeguarded models’ abilities to be used for control and domination

technical

Identify indicators that models are being used to scale movements which rely on social control

analytical

Human-Only (7)

Requires human judgment

Build a research program to study Emerging Risks from AI integration

leadership

Shape product decisions based on research findings

leadership

Shape safeguards decisions based on research findings

leadership

Shape training decisions based on research findings

leadership

Collaborate with Societal Impacts and Safeguards teams on cross-functional research and mitigation efforts

operational

Collaborate with the Autonomy workstream to study societal-scale risks from agent-world interfaces

operational

Build, run, and study an autonomous AI-powered business as an experimental system

creative

Job description

About Anthropic

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

About the Team

The Frontier Red Team (FRT) is a technical research team within Anthropic’s Policy organization. Our goal is to make the entire world safer in this era of advanced AI by understanding what these systems can do and building the defenses that matter. In 2026, we're focused on researching and ensuring safety with self-improving, highly autonomous AI systems, especially ones with cyberphysical capabilities. See our previous related work on cyberdefense, robotics, and Project Vend. We'll also be collaborating closely with the Autonomy workstream to understand novel, societal-scale risks that arise when agents interface with the external world. This is early-stage, high-conviction research with the potential for outsized impact.

Note: We are exclusively hiring in SF. We support relocation, but all hires must relocate before starting.

About the Role

This Research Scientist will focus on scoping, evaluating, red teaming, and defending against societal risks caused by advanced models that emerge over the next few years. Powerful AI models may have major implications for national security, running a business, power and privacy, infrastructure, social relationships, and more. They may come as a result of the increasing integration of powerful models in our economy and social sphere. As an independent Research Scientist, you’ll build a research program to understand these Emerging Risks. You’ll build evals, run experiments, and look for real world signals to understand how these may come about. You’ll turn this into insights we can use to steer the development and use of the technology more positively. Compared to the team's other focuses, you will focus less on acute catastrophic risks and more on risks that emerge from increasing integration into our world.

What You’ll Do:

Design and run research experiments to understand the emerging risks models may create

Produce internal & external artifacts (research, products, demos, dashboards, tools) that communicate the state of model capabilities

Shape product, safeguards, and training decisions based on what you find

Work closely with Societal Impacts (SI) and Safeguards teams

Sample Projects:

Build, run, and study an autonomous AI-powered business (e.g. Project Vend), then identify the growth of real autonomous businesses in the wild using Clio and other tools

Build a benchmark for a model’s national security capabilities

Red team unsafeguarded models’ abilities to be used for control

Identify indicators of models being used to scale movements that rely on social control

You May Be a Good Fit If You:

Source: Anthropic careers · scraped 2026-05-22
