| 
          
            | 
                Seungyong Moon
               
                I am a final-year PhD student in Computer Science at Seoul National University, advised by Hyun Oh Song. I graduated from Seoul National University in 2019 with a BS in Mathematics, a BA in Economics, and a minor in Computer Science.
               
                Email  / 
                 CV  / 
                Google Scholar  / 
                GitHub
               |   |  
          
            | Research 
                My research focuses on enhancing the generalization and reasoning capabilities of reinforcement learning agents. Specifically, I am interested in:
                 
                  Reasoning and planning with large language models
                  Generalization in reinforcement learning
                  Adversarial attacks and defenses  |  
          
            | Publications |  
            |  | Learning to Better Search with Language Models via Guided Reinforced Self-Training
              Seungyong Moon, Bumsoo Park, Hyun Oh Song
 Neural Information Processing Systems (NeurIPS), 2025
 code /
              bibtex
 |  
            |  | Discovering Hierarchical Achievements in Reinforcement Learning via Contrastive Learning
              Seungyong Moon, Junyoung Yeom, Bumsoo Park, Hyun Oh Song
 Neural Information Processing Systems (NeurIPS), 2023
 code /
              bibtex
 Discovering subgoal hierarchies in visually complex, procedurally generated environments poses a significant challenge. We develop a new contrastive learning method along with PPO that successfully unlocks hierarchical achievements in the challenging Crafter benchmark. |  
            |  | Rethinking Value Function Learning for Generalization in Reinforcement Learning
              Seungyong Moon, JunYeong Lee, Hyun Oh Song
 Neural Information Processing Systems (NeurIPS), 2022
 code /
              bibtex
 A value network trained on multiple environments is more prone to memorizing the training data and therefore requires sufficient regularization. We develop a novel policy gradient algorithm that improves generalization by reducing the update frequency of the value network. |  
            |  | Query-Efficient and Scalable Black-Box Adversarial Attacks on Discrete Sequential Data via Bayesian Optimization
              Deokjae Lee, Seungyong Moon, Junhyeok Lee, Hyun Oh Song
 International Conference on Machine Learning (ICML), 2022
 code /
              bibtex
 Crafting adversarial examples against language models is challenging due to their discrete nature and dynamic input sizes. We develop a query-efficient black-box adversarial attack, based on Bayesian optimization, that targets a range of language models from RNNs to Transformers. |  
            |  | Preemptive Image Robustification for Protecting Users against Man-in-the-Middle Adversarial Attacks
              Seungyong Moon*, Gaon An*, Hyun Oh Song
 AAAI Conference on Artificial Intelligence (AAAI), 2022
 code /
              bibtex
 Deep neural networks have robust points in the vicinity of in-distribution data. Harnessing this property, we propose a new defense framework that preemptively alters images before potential adversarial attacks, making it applicable to realistic scenarios. |  
            |  | Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble
              Gaon An*, Seungyong Moon*, Jang-Hyun Kim, Hyun Oh Song
 Neural Information Processing Systems (NeurIPS), 2021
 code /
              bibtex
 The Q-function ensemble technique, originally designed to mitigate overestimation bias in online RL, also proves effective in offline RL when combined with gradient diversification. We develop a new offline RL algorithm that does not require behavior cloning or explicit Q-value penalization. |  
            |  | Parsimonious Black-Box Adversarial Attacks via Efficient Combinatorial Optimization
              Seungyong Moon*, Gaon An*, Hyun Oh Song
 International Conference on Machine Learning (ICML), 2019   (Long talk, 159/3424=4.6%)
 code /
              bibtex
 We develop a query-efficient black-box adversarial attack against deep neural networks based on a local search algorithm for non-monotone submodular function maximization. The attack does not require gradient estimation and has no hyperparameters to tune. |  
          
            | Teaching Experience 
                Teaching Assistant, Machine Learning (Fall 2020, Fall 2022)
                Teaching Assistant, Introduction to Deep Learning (Spring 2019)
                Undergraduate Student Instructor, Basic Calculus 2 (Fall 2017)
                Undergraduate Student Instructor, Basic Calculus 1 (Spring 2017) |  
          
            | Work Experience 
                Research Intern, Qualcomm AI Research Amsterdam (Sept 2024-Jan 2025)
                Research Intern, KRAFTON (June 2023-Sept 2023)
                Research Intern, DeepMetrics (June 2022-Sept 2022)
                Research Intern, Naver Search & Clova (Jul 2018-Aug 2018) |  
          
            | Honors and Awards 
                NeurIPS Scholar Award (2023)
                NAVER Ph.D. Fellowship Award (2022)
                NeurIPS Top Reviewers (2022, 2025)
                Qualcomm Innovation Fellowship Finalists (2020, 2022)
                Yulchon AI Star Scholarship (2022)
                KFAS Computer Science Graduate Student Scholarship (2019-2024)  |  
          
            | Academic Services 
                Conference Reviewer: NeurIPS (2021-2025), ICML (2022-2025), AAAI (2022-2026), ICLR (2024-2026), RLC (2024), AISTATS (2025-2026)
                Journal Reviewer: Neurocomputing (2021), Machine Learning (2023), Transactions on Intelligent Vehicles (2023)
               |  
          
            | 
 
                
               |  |