Jihwan Jeong

Ph.D. Candidate at the University of Toronto

University of Toronto

Vector Institute

Biography

Hi 😄! I'm a Ph.D. candidate in the D3M (Data-Driven Decision-making) lab at the University of Toronto, working with Professor Scott Sanner. I began my research career in AI/ML because I was fascinated by the prospect of building tools that can aid and optimize decision-making. Over the past three years I have worked on a variety of topics, unified by the theme of robust decision-making with (learned) models. My main research interest these days is offline model-based reinforcement learning, and my most recent paper on this topic was accepted at ICLR-23. In particular, I am interested in using Bayesian models for robust planning at test time, or for learning an effective policy, by taking the epistemic uncertainty of the model into account. Before coming to Toronto, I completed my master's at KAIST in South Korea under the supervision of Professor Hayong Shin.

Download my CV.

Interests
  • Offline reinforcement learning
  • Model-based reinforcement learning
  • Decision-aware model learning
  • Meta-learning
  • Bayesian deep learning
Education
  • Ph.D. Candidate in Information Engineering (Present)

    University of Toronto

  • M.S. in Industrial and Systems Engineering, 2019

    Korea Advanced Institute of Science and Technology (KAIST)

  • B.S. in Chemistry, 2015

    Korea Advanced Institute of Science and Technology (KAIST)

Experience

Vector Institute
Research Intern
Jun 2022 – Sep 2022, Toronto
Worked with Professor Pascal Poupart on a model-based offline reinforcement learning project (work to be submitted to ICML-23).
LG AI Research
Research Intern
Jun 2021 – Oct 2021, Seoul
Worked on a model-based offline reinforcement learning project (submitted to ICLR-23).
University of Toronto
Ph.D. Candidate
Sep 2019 – Present, Toronto

Research Projects

Conservative Bayesian Model-Based Value Expansion for Offline Policy Optimization (to appear at ICLR-23)
A model-based offline RL algorithm that trades off the uncertainty of the learned dynamics model against that of the value function through Bayesian posterior estimation, achieving state-of-the-art performance on a variety of D4RL benchmark tasks.
An Exact Symbolic Reduction of Linear Smart Predict+Optimize to Mixed Integer Linear Programming (ICML-22)
The Smart Predict+Optimize (SPO) framework tries to solve a decision-making problem expressed as mathematical optimization in which some coefficients have to be estimated by a predictive model. The challenge is that this problem is non-convex and non-differentiable, even for linear programs with linear predictive models. Despite that, we provide the first exact optimal solution to the SPO problem by formulating it as a bi-level bi-linear program and reducing it to a mixed-integer linear program (MILP) using a novel symbolic method.
A Distributional Framework for Risk-Sensitive End-to-End Planning in Continuous MDPs (AAAI-22)
An end-to-end framework for risk-sensitive planning in stochastic environments that backpropagates through a model of the environment. The core idea is to reparameterize the state distribution, leading to a unique distributional perspective on end-to-end planning in which the return distribution is used both for sampling and for optimizing risk-aware objectives by backpropagation in a unified framework.
Online Continual Learning in Image Classification: An Empirical Survey
Online continual learning for image classification studies the problem of learning to classify images from an online stream of data and tasks, where tasks may include new classes (class incremental) or data nonstationarity (domain incremental).
Symbolic Dynamic Programming for Continuous State MDPs with Linear Program Transitions (IJCAI-21)
Recent advances in symbolic dynamic programming (SDP) have significantly broadened the class of MDPs for which exact closed-form value functions can be derived. However, no existing solution methods can solve complex discrete and continuous state MDPs where a linear program determines state transitions — transitions that are often required in problems with underlying constrained flow dynamics arising in problems ranging from traffic signal control to telecommunications bandwidth planning. In this paper, we present a novel SDP solution method for MDPs with LP transitions and continuous piecewise linear dynamics by introducing a novel, fully symbolic argmax operator.
Online Class-Incremental Continual Learning with Adversarial Shapley Value (AAAI-21)
In this paper, we specifically focus on the online class-incremental setting where a model needs to learn new classes continually from an online data stream. To this end, we contribute a novel Adversarial Shapley value scoring method that scores memory data samples according to their ability to preserve latent decision boundaries for previously observed classes (to maintain learning stability and avoid forgetting) while interfering with latent decision boundaries of current classes being learned (to encourage plasticity and optimal learning of new class boundaries).

Publications

(2023). Conservative Bayesian Model-Based Value Expansion for Offline Policy Optimization. In ICLR-23.


(2022). An Exact Symbolic Reduction of Linear Smart Predict+Optimize to Mixed Integer Linear Programming. In ICML-22.


(2022). A Distributional Framework for Risk-Sensitive End-to-End Planning in Continuous MDPs. In AAAI-22.


(2022). Online Continual Learning in Image Classification: An Empirical Survey. Neurocomputing, 469: 28-51, 2022.


(2021). Bayesian Optimization for a Multiple-Component System with Target Values. Computers & Industrial Engineering, 157.


(2021). Online Class-Incremental Continual Learning with Adversarial Shapley Value. In AAAI-21.


(2020). Batch-level Experience Replay with Review for Continual Learning. In CVPR Workshop on Continual Learning in Computer Vision.


Teaching Experience

Decision Support Systems (MIE451)
Sep 2022 – Dec 2022, University of Toronto
Introduction to Artificial Intelligence (MIE369)
Jan 2022 – Apr 2022, University of Toronto
Introduction to Artificial Intelligence (MIE369)
Jan 2021 – Apr 2021, University of Toronto
Introduction to Artificial Intelligence (MIE369)
May 2021 – Aug 2021, University of Toronto
Optimization in Machine Learning (MIE424)
Jan 2020 – Apr 2020, University of Toronto
Foundations of Data Analytics and Machine Learning (APS1070)
Sep 2019 – Dec 2019, University of Toronto

Contact