This hiring round has closed. The details below are preserved for reference.
So far, we have focused on chain-of-thought monitoring. See our Research section for details on our work, including our paper "How does information access affect LLM monitors' ability to detect sabotage?" and our post "Hidden Reasoning in LLMs: A Taxonomy."
We have not yet committed to a specific research agenda for the upcoming year. Topics we're exploring include shaping the generalization of LLM personas, interpretable continual learning, and pretraining data filtering. Our intention is always to work on whatever seems most impactful to us.
We prefer that candidates join us for a short-term collaboration (1-3 months, part-time) to establish mutual fit before transitioning to a long-term position. However, if you have AI safety experience equivalent to having completed the MATS extension, we are happy to interview you for a long-term position directly. The interview process involves at least two interviews: a coding interview and a conceptual interview in which we'll discuss your research interests. The expected start date for long-term researchers is February to May; we're happy to start short-term collaborations ASAP.
If you are only interested in short-term collaborations, you can fill out this form instead.