autoregressive-diffusion

2025

computer vision
diffusion
data

Worked on autoregressive diffusion for video generation @ Decart.

open-source-rlhf

2024

rlhf
llm
nlp

Automated RLHF pipelines to align LLMs w/ live human feedback @ MIT CSAIL. Advised by Prof. Jacob Andreas and Prof. Leshem Choshen. In collaboration with Cohere and Hugging Face.

reward-hacking

2025

agents
reward hacking
CoT

Studied how CoT monitoring awareness affects reward hacking behavior in coding agents.

adversarial-feedback

2025

alignment
eval
rlhf

Researched how adversarial preference signals shift model alignment + behavior; stress-tested LLMs under adversarial feedback.

jenbenarye.com

2025

nextjs
ui
react

You're here.