  • University of Oxford
  • Oxford, UK

cjgwang/README.md

Hi!

I'm Cath, an undergrad at Oxford studying Mathematics and Statistics. I do AI safety research (technical, governance), and I like making cute websites :)

💻 Tech Stack

  • Languages: Python, JavaScript, C, R, SQL
  • ML Frameworks: PyTorch, scikit-learn
  • LLM Ecosystem: Hugging Face Transformers, TRL
  • Web: HTML, CSS, JavaScript, Flask
  • Tools & Platforms: Git, Docker, AWS

🔎 Research

My primary research interests lie in AI control and agent foundations, particularly understanding and mitigating emergent misalignment risks in autonomous AI systems. I focus on empirical questions around goal misgeneralisation, alignment faking, and attack selection in agentic evaluations, aiming to clarify failure modes in frontier models. I am also interested in how these technical insights inform AI governance and policy, especially mechanisms for managing strategic risk and constraining the deployment of dangerous capabilities.

Last updated: Mar 2026

Pinned

  1. tylercrosse/mars-attacks (Public)

    Jupyter Notebook

  2. reward-multiplicity (Public)

    Winning project of Research Impact Oxford MT2026; reward multiplicity and using STARC metrics to train meaningfully diverse reward ensembles in a gridworld environment

    Python

  3. J0YY/patchsteg (Public)

    mech interp paper for oxford v. cambridge varsity hackathon (grand prize) - defeating SOTA steganographic attacks in latent space

    Python 1

  4. ARENA_3.0 (Public)

    Forked from callummcdougall/ARENA_3.0

    Jupyter Notebook

  5. tea-brewer (Public)

    tea brewery website (flask)

    HTML