Hi everyone!
The OpenAI Governance team is looking for engineers and researchers to build Dangerous Capability Evaluations (DC Evals): an eval suite for assessing dangerous capabilities that AI models could have and that increase the chance of catastrophe (e.g. persuasion, deception, contributing to weapons proliferation, self-replication or self-exfiltration). Ideally, we would contract with people who are available through the end of the year and can commit a meaningful number of hours during that period.
We are looking to hire contractors who can work on the full range of tasks required:
Brainstorm eval ideas to assess dangerous capabilities
Carefully think through and construct the experimental design of individual evals
Code up evals, acquire or create datasets, run them to produce results, and iterate rapidly (see the sketch after this list for a rough flavor of this step)
Communicate clearly in written reports (eval results, how others can run the eval, surprising findings, etc.)
Document evals for broader usage, including making them available to other AI labs and, in some cases, fully open-sourcing them
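
To give a rough flavor of what the "code up evals" step can look like, here is a minimal, hypothetical sketch of an eval loop: it reads a JSONL dataset of prompt/target pairs, queries a model, and grades by exact match. Everything in it (the file format, the `score_response` helper, the model callable) is an illustrative assumption, not our actual tooling.

```python
import json
from typing import Callable


def score_response(response: str, target: str) -> bool:
    """Toy grading rule: exact match against a reference answer."""
    return response.strip().lower() == target.strip().lower()


def run_eval(model: Callable[[str], str], dataset_path: str) -> float:
    """Run each sample through the model and return accuracy."""
    # Hypothetical data format: one JSON object per line,
    # each with "prompt" and "target" fields.
    with open(dataset_path) as f:
        samples = [json.loads(line) for line in f]

    correct = 0
    for sample in samples:
        response = model(sample["prompt"])
        correct += score_response(response, sample["target"])
    return correct / len(samples)
```

In practice, real evals usually need much more careful grading (e.g. rubric- or model-based scoring rather than exact match), but the basic loop structure of "load samples, query model, grade, aggregate" is roughly this simple, and most of the work is in the experimental design and dataset construction.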