AI Alignment and Safety Engineering
10 professional roles
AI Alignment Researcher
Explore AI alignment theory, value learning, and corrigibility frameworks. Ideal for researchers designing safe, goal-aligned AI systems.
AI Governance & Risk Advisor
Navigate AI risk frameworks, responsible scaling policies, and governance structures to align organizational AI practices with safety standards.
AI Interpretability Engineer
Apply mechanistic interpretability and feature visualization techniques to understand what neural networks learn and how they make decisions.
AI Red Team Safety Analyst
Simulate adversarial attacks on AI systems to uncover safety failures, jailbreaks, and misuse vectors before deployment.
AI Safety Evaluations Designer
Build rigorous safety benchmarks and evaluation suites to measure AI model behavior across harm categories, capability thresholds, and alignment properties.
AI Safety Policy Writer
Draft AI safety policies, acceptable use frameworks, incident response protocols, and internal governance documents for AI-deploying organizations.
Corrigibility & Control Researcher
Study AI corrigibility, shutdown problems, and human control mechanisms to ensure AI systems remain safely interruptible and correctable.
Mesa-Optimization & Inner Alignment Researcher
Investigate mesa-optimization, deceptive alignment, and inner alignment failures in learned models to build safer training pipelines.
Reward Modeling Specialist
Design and evaluate reward models for RLHF pipelines, addressing reward hacking, proxy misalignment, and human preference learning.
Scalable Oversight Researcher
Research protocols and architectures for maintaining meaningful human oversight of AI systems as they surpass human-level task performance.
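Several of the roles above touch on RLHF preference learning and reward hacking. As a minimal, hypothetical sketch (the function name and numbers are illustrative, not from any particular pipeline), the Bradley-Terry pairwise loss commonly used to train reward models scores a human-preferred response against a rejected one:

```python
import math

def preference_loss(r_chosen: float, r_rejected: float) -> float:
    # -log sigmoid(r_chosen - r_rejected): the loss is low when the
    # reward model scores the human-preferred response above the
    # rejected one, and grows as the ranking is reversed.
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# A correctly ordered pair incurs lower loss than a mis-ranked one.
good = preference_loss(2.0, 0.5)  # chosen response scored higher
bad = preference_loss(0.5, 2.0)   # mis-ranked pair (e.g. reward hacking)
```

Minimizing this loss over many labeled preference pairs is what "human preference learning" amounts to in practice; proxy misalignment arises when the learned scorer diverges from the preferences it was fit to.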