AI safety primer

This plan is for candidates interested in AI safety, alignment, or governance work. The category covers several distinct tracks — technical alignment research, applied safety engineering, governance and policy, and the working organizational layer (red-teaming, model evaluation, deployment governance). The plan starts shared across tracks and branches in the second half.

Six months is a working window. Most candidates in this space end up extending the plan into a multi-year trajectory; the plan is the entry layer.

Plan at a glance

Phase	Window	Goal	Workload
1. Shared foundations	Months 1-2	Working AI literacy + safety vocabulary	~8 hrs/week
2. Technical or governance track	Months 3-4	Depth in one of two tracks	~10 hrs/week
3. Applied work	Months 5-6	One artifact: research-grade or policy-grade	~10 hrs/week

Phase 1 — Shared foundations (months 1-2)

Goal: working AI literacy plus the foundational vocabulary of AI safety. Both tracks share this phase.

Recommended programs:

Anthropic AI Fluency (free) — and pay particular attention to the safety-relevant sections.
Harvard CS50’s AI track — the AI fundamentals layer.
AI safety reading list. Standard public lists from independent safety researchers (we deliberately do not centralize the list here; multiple credible curators publish it). Read the foundational papers — the alignment-problem reasoning, the early scaling-laws work, the major recent interpretability and red-teaming results.

Milestones:

Working vocabulary: “alignment,” “specification gaming,” “outer / inner alignment,” “interpretability,” “red-teaming,” “evals,” “RLHF,” “constitutional AI,” “scalable oversight.”
Comfort reading a safety paper at the level of identifying its claims and its weak points.

Phase 2A — Technical track (months 3-4)

If you are pursuing technical alignment or applied safety engineering, follow this branch.

Recommended programs:

Stanford Online — XCS deep learning track — for the implementation depth.
Mechanistic interpretability tutorials. The major safety-research organizations publish public tutorials and reproduction guides; work through one or two.
Hugging Face AI Agents course — for the agent-design layer where most applied safety work currently lands.

Milestones:

Reproduced at least one interpretability result from a published paper.
Built at least one small evaluation harness for a specific failure mode.
Comfort reading the interpretability and alignment papers from the major safety organizations.

Phase 2B — Governance track (months 3-4)

If you are pursuing governance, policy, or applied compliance work, follow this branch.

Recommended programs:

A Kennedy School-style AI policy program — Harvard Kennedy School Executive Education, or equivalent at another institution. The Kennedy School-adjacent programs are mentioned in our Harvard reference page as the institute-track AI offerings most relevant for governance candidates.
AI governance reading list. The standing literature on AI governance is now substantial; the working canon includes the EU AI Act analyses, the major US-government AI guidance documents, the body of work from the AI governance research organizations.

Milestones:

Working understanding of the current global AI regulatory landscape — EU AI Act provisions, US executive orders and agency guidance, UK and Singapore frameworks, the major industry standards.
Comfort writing a one-page policy analysis on a specific deployment-governance question.
A relevant credential (one Kennedy School certificate or equivalent).

Phase 3 — Applied work (months 5-6)

Goal: one applied artifact appropriate to your track.

Technical track artifacts (pick one):

A workshop-quality interpretability or alignment paper.
A working evaluation harness, released openly, that adds non-trivial coverage in a specific failure-mode category.
A substantive reproduction of a recent safety paper, with notes on what was hard and what extension you would propose.

Governance track artifacts (pick one):

A written policy memo on a specific deployment-governance question, at the quality required by an industry working group or government agency.
A written analysis of a specific real deployment of AI in a regulated domain, focusing on the governance practices used.
An applied compliance toolkit for a specific category of AI deployment.

The artifact is the credential. The credentialing programs in phases 1-2 make the artifact readable; the artifact is what closes the loop in safety-track hiring.

A note on the field

AI safety hiring is unusual relative to the broader AI category. The credentialing layer matters less than the body of work; the body of work matters less than the demonstrated thinking. Hiring is heavily relationship-driven — candidates often enter via internships, residencies, fellowships, or extended part-time engagements with safety-focused research organizations.

We strongly recommend, alongside this plan, identifying 2-4 working safety researchers whose work you read closely, engaging with their public outputs substantively, and looking for the specific entry-point programs the major safety organizations run (e.g., MATS, ARENA, the various lab residencies). The credentialing in this plan supports those entry points; it does not substitute for them.

What this plan is not for

General applied AI engineers. AI safety is a specific track, not a general applied AI specialization. Use the AI engineer in 6 months plan if your goal is broader applied work.
Founders. Safety-aware founders should follow the Founder-track AI literacy plan with light supplementation from the safety reading list in Phase 1. The full safety primer is over-investment for the founder use case.

Update log

2026-05-12: Initial publication.

Study plans on Edge Curriculum are working recommendations, not prescriptions. The AI safety category is moving particularly quickly; we expect to update this plan as the field’s working organizations and credentialing programs evolve. For corrections or suggested improvements, corrections@edgecurriculum.com.