CV

Education

Ph.D. in Version Control Theory, GitHub University, 2018 (expected)
M.S. in Jekyll, GitHub University, 2014
B.S. in GitHub, GitHub University, 2012

Work experience

Spring 2024: Academic Pages Collaborator
- GitHub University
- Duties included: Updates and improvements to the template
- Supervisor: The Users
Fall 2015: Research Assistant
- GitHub University
- Duties included: Merging pull requests
- Supervisor: Professor Hub
Summer 2015: Research Assistant
- GitHub University
- Duties included: Tagging issues
- Supervisor: Professor Git

Skills

Skill 1
Skill 2
- Sub-skill 2.1
- Sub-skill 2.2
- Sub-skill 2.3
Skill 3

Publications

Hardware Co-Design Scaling Laws via Roofline Modelling for On-Device LLMs

Sun, L., Jiang, J., Ding, Y., Li, F., Song, Y., Zhang, H., Ying, J., Ren, L., Zhan, K., Chen, W., Xie, Y., & Deng, C. (2026). Hardware Co-Design Scaling Laws via Roofline Modelling for On-Device LLMs. arXiv preprint arXiv:2602.10377.

Memory-Driven Self-Improvement for Decision Making with Large Language Models

Yan, X., Ou, Z., Yang, M., Song, Y., Zhang, H., Li, Y., & Wang, J. (2025). Memory-Driven Self-Improvement for Decision Making with Large Language Models. arXiv preprint arXiv:2509.26340.

ReMA: Learning to Meta-Think for LLMs with Multi-Agent Reinforcement Learning

Wan, Z., Li, Y., Wen, X., Song, Y., Wang, H., Yang, L., Schmidt, M., Wang, J., Zhang, W., Hu, S., & Wen, Y. (2025). ReMA: Learning to Meta-Think for LLMs with Multi-Agent Reinforcement Learning. Advances in Neural Information Processing Systems (NeurIPS).

ThinkBench: Dynamic Out-of-Distribution Evaluation for Robust LLM Reasoning

Huang, S., Yang, L., Song, Y., Chen, S., Cui, L., Wan, Z., Zeng, Q., Wen, Y., Shao, K., Zhang, W., Wang, J., & Zhang, Y. (2025). ThinkBench: Dynamic Out-of-Distribution Evaluation for Robust LLM Reasoning. Advances in Neural Information Processing Systems (NeurIPS) Datasets & Benchmarks Track.

Ask more, know better: Reinforce-Learned Prompt Questions for Decision Making with Large Language Models

Yan, X., Song, Y., Cui, X., Christianos, F., Zhang, H., Mguni, D. H., & Wang, J. (2025). Ask more, know better: Reinforce-Learned Prompt Questions for Decision Making with Large Language Models. European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD).

Efficient Reinforcement Learning with Large Language Model Priors

Yan, X., Song, Y., Feng, X., Yang, M., Zhang, H., Bou Ammar, H., & Wang, J. (2025). Efficient Reinforcement Learning with Large Language Model Priors. International Conference on Learning Representations (ICLR).

Natural Language Reinforcement Learning

Feng, X., Liu, B., Song, Y., Fu, H., Wan, Z., Koushik, G. A., Hu, Z., Yang, M., Wen, Y., & Wang, J. (2024). Natural Language Reinforcement Learning. arXiv preprint arXiv:2411.14251.

OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models

Wang, J., Fang, M., Wan, Z., Wen, M., Zhu, J., Liu, A., Gong, Z., Song, Y., Chen, L., Ni, L. M., Yang, L., Wen, Y., & Zhang, W. (2024). OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models. arXiv preprint arXiv:2410.09671.

AI-Olympics: Exploring the Generalization of Agents through Open Competitions

Wang, C., Song, Y., Wu, S., Wu, S., Zhang, R., Lin, S., & Zhang, H. (2024). AI-Olympics: Exploring the Generalization of Agents through Open Competitions. Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI) Demo Track.

Boosting Studies of Multi-Agent Reinforcement Learning on Google Research Football Environment: the Past, Present, and Future

Song, Y., Jiang, H., Zhang, H., Tian, Z., Zhang, W., & Wang, J. (2024). Boosting Studies of Multi-Agent Reinforcement Learning on Google Research Football Environment: the Past, Present, and Future. Proceedings of the International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS).

TaxAI: A Dynamic Economic Simulator and Benchmark for Multi-Agent Reinforcement Learning

Mi, Q., Xia, S., Song, Y., Zhang, H., Zhu, S., & Wang, J. (2024). TaxAI: A Dynamic Economic Simulator and Benchmark for Multi-Agent Reinforcement Learning. Proceedings of the International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS).

An empirical study on Google Research Football multi-agent scenarios

Song, Y., Jiang, H., Tian, Z., Zhang, H., Zhang, Y., Zhu, J., Dai, Z., Zhang, W., & Wang, J. (2024). An empirical study on Google Research Football multi-agent scenarios. Machine Intelligence Research, 21, 549–570.

Teaching

Service and leadership

Currently signed in to 43 different Slack teams

Yan Song
