Joey Hejna

(Donald Joseph Hejna III)

I'm a third-year PhD student in the computer science department at Stanford University, advised by Dorsa Sadigh. Currently, I'm a student researcher in the Google DeepMind Robotics group. My research is supported by an NDSEG Fellowship. I completed my undergrad at UC Berkeley, where I worked with Professors Pieter Abbeel and Lerrel Pinto.

jhejna @ cs.stanford.edu  /  Resume  /  GitHub  /  Scholar

Research

I'm broadly interested in learning for decision making and robotics. Papers (and preprints) are ordered by recency.

Show, Don't Tell: Aligning Language Models with Demonstrated Feedback
Omar Shaikh*, Michelle Lam*, Joey Hejna*, Yijia Shao, Michael Bernstein, Diyi Yang
ArXiv Pre-print
paper

We introduce DITTO, an algorithm for few-shot adaptation or alignment of large language models using only two or three human demonstrations. We show that DITTO outperforms prior methods by a significant margin on automated evals and in a real-world user study.

Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms
Rafael Rafailov*, Yashwanth Chittepu*, Ryan Park*, Harshit Sikchi*, Joey Hejna, W. Bradley Knox, Chelsea Finn, Scott Niekum
ArXiv Pre-print
paper

Through extensive experiments across multiple model scales, we characterize the over-optimization problem for direct alignment algorithms in large language models.

DROID: A Large-Scale In-the-Wild Robot Manipulation Dataset
Alexander Khazatsky*, Karl Pertsch*, ..., Joey Hejna, ..., Sergey Levine, Chelsea Finn
RSS 2024
website / paper / code

A large, diverse dataset collected across multiple institutions, totalling over 75K robot trajectories.

From r to Q*: Your Language Model is Secretly a Q-function
Rafael Rafailov*, Joey Hejna, Ryan Park, Chelsea Finn
ArXiv Pre-print
paper

Understanding direct alignment algorithms for LLMs through the perspective of a per-token Markov Decision Process.

Octo: An Open-Source Generalist Robot Policy
Dibya Ghosh*, Homer Walke, Karl Pertsch*, Kevin Black*, Sudeep Dasari, Joey Hejna, Charles Xu, Jianlan Luo, Tobias Kreiman, You Liang Tan, Dorsa Sadigh, Chelsea Finn, Sergey Levine
RSS 2024
website / paper / code

Laying the groundwork for a foundation model for robotics.

Contrastive Preference Learning: Learning from Human Feedback without RL
Joey Hejna, Rafael Rafailov*, Harshit Sikchi*, Chelsea Finn, Scott Niekum, W. Bradley Knox, Dorsa Sadigh
ICLR 2024
paper / code

Reduces Reinforcement Learning from Human Feedback to contrastive learning under the regret model of human preferences, which has recently been shown to be more accurate than the widely accepted reward model. Unlike many RLHF methods, CPL is fully off-policy and works on arbitrary MDPs.

Inverse Preference Learning: Preference-based RL without a Reward Function
Joey Hejna, Dorsa Sadigh
NeurIPS 2023
paper / code

Approaches to preference-based RL typically work in two phases: first a reward function is learned, then it is maximized using a vanilla RL algorithm. We introduce the Inverse Preference Learning framework, where we directly learn a Q-function that models the user's preferences without explicitly learning a reward function.

Distance Weighted Supervised Learning for Offline Interaction Data
Joey Hejna, Jensen Gao, Dorsa Sadigh
ICML 2023
paper / website / code

We introduce DWSL, an algorithm for offline goal-conditioned reinforcement learning that uses only supervised objectives while still learning a constrained optimal policy. DWSL performs particularly well on high-dimensional image domains and seems robust to hyperparameters.

Extreme Q-Learning: MaxEnt RL without Entropy
Divyansh Garg*, Joey Hejna*, Matthieu Geist, Stefano Ermon
*Equal Contribution
ICLR 2023 (Notable, Top 5% of Submissions)
paper / website / code

We introduce a novel framework for Q-learning that models the maximal soft-values without needing to sample from a policy and improves performance in online and offline RL settings.

Few-Shot Preference Learning for Human-in-the-Loop RL
Joey Hejna, Dorsa Sadigh
CoRL 2022
paper / website / code

Pretraining preference models greatly reduces query complexity, enabling humans to teach robots with a reasonable amount of feedback.

Improving Long-Horizon Imitation through Instruction Prediction
Joey Hejna, Pieter Abbeel, Lerrel Pinto
AAAI 2023
paper / code

We show that predicting instructions along with actions drastically improves performance in combinatorially complex long-horizon imitation settings.

Task-Agnostic Morphology Evolution
Donald J. Hejna III, Pieter Abbeel, Lerrel Pinto
Accepted to ICLR 2021
paper / website / code

Better robot structures hold the promise of better performance. We propose a new algorithm, TAME, that is able to evolve morphologies without any task specification. This is accomplished using an information-theoretic objective that efficiently ranks morphologies based on their ability to explore and control their environment.

Hierarchically Decoupled Imitation for Morphological Transfer
Donald J. Hejna III, Pieter Abbeel, Lerrel Pinto
Accepted to ICML 2020
paper / website / code / talk

We propose transferring RL policies across agents using a hierarchical framework. Then, to remedy poor zero-shot transfer performance, we introduce two additional imitation objectives.

Projects
Research Lightning
Donald J. Hejna III
Open source project
code

A lightweight framework for general deep-learning research in PyTorch.

OpenX Repository
Donald J. Hejna III
Open source project
code

A lightweight framework for general deep-learning research in PyTorch.

Improving Latent Representations via Explicit Disentanglement
Donald J. Hejna III*, Ashwin Vangipuram*, Kara Liu*
Course Project, CS 294-158 Unsupervised Learning, Spring 2020
paper

We examine and compare three methods for explicitly disentangling learned latent representations in VAE models.

Awards
  • National Defense Science and Engineering Graduate (NDSEG) Fellowship 2021, roughly 5% selection rate.
  • Honorable mention for the 2021 CRA Outstanding Undergraduate Researcher Award
  • Highest Degree Honors in Engineering at UC Berkeley Spring 2021, top 3% of the graduating class.
  • UC Berkeley Regents' and Chancellor's Scholarship
  • Rambus Innovator of the Future 2017
Industry
Student Researcher, Google DeepMind Robotics
Summer 2024

Working on the machine learning and robotics team.

Intern, Citadel Global Quantitative Strategies
Summer 2019

Developed C++ systems for trading APIs and monitoring systems. Worked on optimizing memory usage of large model training.

Intern, Intel Artificial Intelligence Group
Summer 2018
blog post

Worked on demo systems for Intel's OpenVINO model optimization system in the AWS DeepLens. Explored systems for gradient-based explanations of deep networks.

Teaching and Service
Reviewer

NeurIPS 2023 Outstanding Reviewer, ICML 2024, CoRL 2024, IEEE RA-L, RL-Brew at RLC 2024
UC Berkeley EECS Department

Teaching Assistant, EECS 127: Optimization Models, Fall 2020

Teaching Assistant, EECS 189: Machine Learning, Spring 2020

Teaching Assistant, CS 70: Discrete Math and Probability Theory, Fall 2019
Public Resources

Introductory ML Notes

Deep Learning Workshop

Reinforcement Learning Workshop
