Seungyong Moon
I am a final-year PhD student in Computer Science at Seoul National University, advised by Hyun Oh Song. I previously graduated from the same university with a BS in Mathematics and a BA in Economics.
I will be graduating in February 2027 and am actively seeking postdoctoral or research scientist positions. Please feel free to get in touch!
Email / CV / Google Scholar / GitHub
|
Research
My research aims to develop autonomous agents with strong robustness, generalization, and reasoning capabilities. Currently, I am working on enhancing the reasoning capabilities of large language models by leveraging synthetic data generation and designing novel reinforcement learning algorithms tailored for long-horizon reasoning tasks.
|
Publications
|
Learning to Better Search with Language Models via Guided Reinforced Self-Training
Seungyong Moon,
Bumsoo Park,
Hyun Oh Song
Neural Information Processing Systems (NeurIPS), 2025
code /
bibtex
Learning to search and backtrack improves reasoning in language models but often suffers from inefficient, noisy search. We propose a novel fine-tuning method that progressively incorporates optimal-solution guidance for efficient and effective reasoning.
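To make the idea concrete, here is a minimal Python sketch of one guided self-training round. The interface (model, verify, guidance_frac) and the prefix-revealing scheme are illustrative assumptions, not the paper's released code:

def guided_round(model, problems, optimal, verify, guidance_frac):
    """One round of self-training with optimal-solution guidance (illustrative)."""
    data = []
    for p in problems:
        k = int(guidance_frac * len(optimal[p]))         # reveal a prefix of the known optimum
        sol = optimal[p][:k] + model(p, optimal[p][:k])  # stand-in for LM sampling
        if verify(p, sol):                               # keep only verified solutions
            data.append((p, sol))
    return data  # fine-tune on data, then shrink guidance_frac for the next round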
|
Discovering Hierarchical Achievements in Reinforcement Learning via Contrastive Learning
Seungyong Moon,
Junyoung Yeom,
Bumsoo Park,
Hyun Oh Song
Neural Information Processing Systems (NeurIPS), 2023
code /
bibtex
Discovering subgoal hierarchies in visually complex, procedurally generated environments poses a significant challenge. We develop a new contrastive learning method that, combined with PPO, successfully unlocks hierarchical achievements on the challenging Crafter benchmark.
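A rough PyTorch sketch of the kind of contrastive objective involved; the pairing scheme and shapes are assumptions for illustration, not the paper's exact loss:

import torch
import torch.nn.functional as F

def contrastive_loss(anchor_emb, positive_emb, temperature=0.1):
    # anchor_emb, positive_emb: (B, D) encoder outputs for paired states;
    # the other entries in the batch serve as negatives (InfoNCE).
    a = F.normalize(anchor_emb, dim=1)
    p = F.normalize(positive_emb, dim=1)
    logits = a @ p.t() / temperature                     # (B, B) similarities
    labels = torch.arange(a.size(0), device=a.device)    # positives on the diagonal
    return F.cross_entropy(logits, labels)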
|
Rethinking Value Function Learning for Generalization in Reinforcement Learning
Seungyong Moon,
JunYeong Lee,
Hyun Oh Song
Neural Information Processing Systems (NeurIPS), 2022
code /
bibtex
A value network trained on multiple environments is prone to memorizing the training data and thus requires sufficient regularization. We develop a novel policy gradient algorithm that improves generalization by reducing the update frequency of the value network.
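The scheduling idea fits in a few lines; the trainer interface and the period below are hypothetical stand-ins, and the paper's algorithm includes further details:

def train(collect_rollouts, update_policy, update_value, num_updates, period=4):
    # Illustrative schedule: the critic is updated once every `period`
    # policy updates, so it fits the training levels less tightly.
    for step in range(num_updates):
        batch = collect_rollouts()
        update_policy(batch)        # policy-gradient step every iteration
        if step % period == 0:
            update_value(batch)     # value step only every period-th iteration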
|
Query-Efficient and Scalable Black-Box Adversarial Attacks on Discrete Sequential Data via Bayesian Optimization
Deokjae Lee,
Seungyong Moon,
Junhyeok Lee,
Hyun Oh Song
International Conference on Machine Learning (ICML), 2022
code /
bibtex
Crafting adversarial examples against language models is challenging due to their discrete nature and dynamic input sizes. We develop a query-efficient black-box adversarial attack targeting various language models, from RNNs to Transformers, via Bayesian optimization.
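A toy sketch of surrogate-guided search with a generic scikit-learn GP and a lower-confidence-bound acquisition; the feature encoding, warm-start size, and acquisition rule are assumptions (the paper uses machinery tailored to discrete sequences):

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def bo_attack(victim_score, candidates, encode, n_queries=20):
    # victim_score: black-box objective, lower is better for the attacker;
    # candidates: perturbed sequences; encode: sequence -> feature vector.
    X, y, queried = [], [], []
    def query(c):
        X.append(encode(c)); y.append(victim_score(c)); queried.append(c)
    rng = np.random.default_rng(0)
    for i in rng.choice(len(candidates), size=5, replace=False):
        query(candidates[i])                              # random warm-start
    gp = GaussianProcessRegressor()
    for _ in range(n_queries - 5):
        gp.fit(np.array(X), np.array(y))
        mu, sd = gp.predict(np.array([encode(c) for c in candidates]), return_std=True)
        query(candidates[int(np.argmin(mu - sd))])        # most promising swap
    return queried[int(np.argmin(y))]                     # best perturbation found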
|
Preemptive Image Robustification for Protecting Users against Man-in-the-Middle Adversarial Attacks
Seungyong Moon*,
Gaon An*,
Hyun Oh Song
AAAI Conference on Artificial Intelligence (AAAI), 2022
code /
bibtex
By harnessing an intriguing property of deep neural networks, namely that robust points exist in the vicinity of in-distribution data, we propose a new defense framework that preemptively alters images before potential adversarial attacks, making it applicable to realistic scenarios.
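A condensed PyTorch sketch of the min-max update on the input itself; the step sizes, single-step inner maximization, and budget handling are simplifying assumptions:

import torch
import torch.nn.functional as F

def robustify(model, x, y, eps=8/255, steps=10, lr=1/255):
    # Shift image x (label y) toward a nearby point that stays correctly
    # classified even after a worst-case perturbation.
    x_r = x.clone().detach()
    for _ in range(steps):
        x_r.requires_grad_(True)
        # Inner maximization: one FGSM-style adversarial step around x_r.
        grad = torch.autograd.grad(F.cross_entropy(model(x_r), y), x_r)[0]
        adv = x_r + eps * grad.sign()
        # Outer minimization: move x_r so even the adversarial point is safe.
        g = torch.autograd.grad(F.cross_entropy(model(adv), y), x_r)[0]
        x_r = (x_r - lr * g.sign()).clamp(x - eps, x + eps).detach()
    return x_r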
|
Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble
Gaon An*,
Seungyong Moon*,
Jang-Hyun Kim,
Hyun Oh Song
Neural Information Processing Systems (NeurIPS), 2021
code /
bibtex
The Q-function ensemble technique, originally designed to mitigate overestimation bias in online RL, also proves effective in offline RL when combined with gradient diversification. We develop a simple offline RL algorithm that requires neither behavior cloning nor explicit Q-value penalization.
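The pessimistic target at the heart of the approach fits in a few lines of PyTorch; shapes and the critic interface are illustrative, and the full method additionally adds the gradient-diversification term:

import torch

def ensemble_target(q_nets, next_obs, next_act, reward, done, gamma=0.99):
    # The min over N independent critics penalizes out-of-distribution actions
    # through their high epistemic uncertainty, without behavior cloning.
    qs = torch.stack([q(next_obs, next_act) for q in q_nets])  # (N, batch)
    return reward + gamma * (1.0 - done) * qs.min(dim=0).values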
|
Parsimonious Black-Box Adversarial Attacks via Efficient Combinatorial Optimization
Seungyong Moon*,
Gaon An*,
Hyun Oh Song
International Conference on Machine Learning (ICML), 2019   (Long talk, 159/3424=4.6%)
code /
bibtex
We develop a query-efficient black-box adversarial attack against deep neural networks based on a local search algorithm for non-monotone submodular function maximization, which requires no gradient estimation and is free of hyperparameters to tune.
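A toy NumPy sketch of the block-wise local search; the single grayscale channel, fixed block size, and plain greedy sweeps are simplifications of the paper's accelerated, coarse-to-fine procedure:

import numpy as np

def local_search(loss, x, eps, block=8, sweeps=2):
    # loss: black-box attack objective (lower is better); x: (H, W) image
    # with H, W divisible by block. Each block is set to +eps or -eps.
    sign = np.ones((x.shape[0] // block, x.shape[1] // block))
    perturb = lambda s: x + eps * np.kron(s, np.ones((block, block)))
    best = loss(perturb(sign))
    for _ in range(sweeps):
        for i in range(sign.shape[0]):
            for j in range(sign.shape[1]):
                sign[i, j] *= -1                 # try flipping one block
                trial = loss(perturb(sign))
                if trial < best:
                    best = trial                 # keep improving flips
                else:
                    sign[i, j] *= -1             # revert otherwise
    return perturb(sign)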
|
Teaching Experience
- Teaching Assistant, Machine Learning (Fall 2020, Fall 2022)
- Teaching Assistant, Introduction to Deep Learning (Spring 2019)
- Undergraduate Student Instructor, Basic Calculus 2 (Fall 2017)
- Undergraduate Student Instructor, Basic Calculus 1 (Spring 2017)
|
Work Experience
- Research Intern, Qualcomm AI Research Amsterdam (Sept 2024–Jan 2025)
- Research Intern, KRAFTON (June 2023–Sept 2023)
- Research Intern, DeepMetrics (June 2022–Sept 2022)
- Research Intern, Naver Corporation (July 2018–Aug 2018)
|
Honors and Awards
- NeurIPS Top Reviewer (2022, 2025)
- NeurIPS Scholar Award (2023)
- NAVER Ph.D. Fellowship Award (2022)
- Qualcomm Innovation Fellowship Finalist (2020, 2022)
- Yulchon AI Star Scholarship (2022)
- KFAS Computer Science Graduate Student Scholarship (2019–2024)
|
Academic Services
- Conference Reviewer: NeurIPS (2021–2025), ICML (2022–2025), AAAI (2022–2026), ICLR (2024–2026), AISTATS (2025–2026)
- Journal Reviewer: Neurocomputing (2021), Machine Learning (2023), Transactions on Intelligent Vehicles (2023)
|
This website is modified from here.