Publications

2026

  1. WeatherArchive-Bench: Benchmarking Retrieval-Augmented Reasoning for Historical Weather Archives
    Yongan Yu, Xianda Du, Qingchen Hu, and 7 more authors
    Proceedings of the 49th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2026), 2026
  2. THINK: Can Large Language Models Think-aloud?
    Yongan Yu, Mengqian Wu, Yiran Lin, and 1 more author
    International Conference on Artificial Intelligence in Education (AIED 2026), 2026
  3. CodeFlowBench: A Multi-turn, Iterative Benchmark for Complex Code Generation
    Sizhe Wang, Zhengren Wang, Dongsheng Ma, and 5 more authors
    ACL, 2026

2025

  1. MaintainCoder: Maintainable Code Generation under Dynamic Requirements
    Zhengren Wang, Rui Ling, Chufan Wang, and 5 more authors
    Advances in Neural Information Processing Systems (NeurIPS 2025), 2025
  2. MILO: An LLM Multi-Stage Conversational Agent for Fostering Teenagers’ Mental Resilience
    Han Bao, Yongan Yu, Bohan Wang, and 2 more authors
    In Adjunct Proceedings of the 38th Annual ACM Symposium on User Interface Software and Technology (UIST 2025), 2025
  3. WXImpactBench: A Disruptive Weather Impact Understanding Benchmark for Evaluating Large Language Models
    Yongan Yu, Qingchen Hu, Xianda Du, and 3 more authors
    In Findings of the Association for Computational Linguistics: ACL 2025, 2025
  4. From Recall to Reasoning: Automated Question Generation for Deeper Math Learning through Large Language Models
    Yongan Yu, Alexandre Krantz, and Nikki G Lobczowski
    In International Conference on Artificial Intelligence in Education (AIED 2025), 2025