Larry, Yinxi Li

yinxi.li[at]uwaterloo.ca | SWAG Lab DC 2555


Amor fati, carpe diem

🦄 About Me | Currently Master’s Student in CS

Hey there, I’m Larry! Thanks for visiting my humble page. I’m currently a first-year Computer Science M.Math. student at the University of Waterloo, advised by Pengyu Nie. Before that, I received my B.Sc. in Computer Science (ELITE Stream) from The Chinese University of Hong Kong, where my final year project was advised by Eric Lo. I also spent a semester as an exchange student at ETH Zürich during my undergraduate studies, where I explored advanced topics in Machine Learning and Software Engineering.

My research is supported by grants from Prof. Nie’s research group, funded by the Natural Sciences and Engineering Research Council of Canada (NSERC) and the University of Waterloo.

💻 Research Interests

  • Understanding and Improving the Internal Mechanisms of LLMs and DLMs: tokenization, reasoning, representation learning
  • LLM Applications in Software Engineering, Math, and Scientific Discovery

I am open to academic and research collaborations and welcome any questions or discussions.

🎉 News

Oct 17, 2025 📌 New preprint: TokDrift: When LLM Speaks in Subwords but Code Speaks in Grammar is now available on arXiv! 1️⃣ LLMs’ subword tokenizers don’t align well with programming language grammar: tiny whitespace or renaming tweaks -> different tokenization -> flipped outputs (see the small sketch below the news list). 2️⃣ Our framework TokDrift systematically tests 9 code LLMs on 3 tasks, showing their sensitivity to tokenization changes: up to 60% of outputs change under a single semantic-preserving rewrite. 3️⃣ If your win margin is ~1 pp, beware: spacing & naming can swing results.
Feb 15, 2025 Homepage Acknowledgements
Feb 15, 2025 Hello, my personal website! Let’s call today its birthday 😍. Glad to see you here, though the site is still under construction. Hopefully it will be finished soon.
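
To make the Oct 17 item concrete, here is a minimal sketch (not the TokDrift framework itself). It assumes the Hugging Face transformers package and uses the GPT-2 BPE tokenizer as a stand-in for a code LLM’s tokenizer, showing how a whitespace-only, semantic-preserving rewrite yields a different token sequence.

```python
# Minimal illustration (not the TokDrift framework itself): a
# semantic-preserving whitespace tweak changes the subword tokenization.
# Assumes the Hugging Face `transformers` package; GPT-2's BPE tokenizer
# stands in for a code LLM's tokenizer.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Two snippets that any parser treats as the same program,
# differing only in spacing.
dense = "x=1+2"
spaced = "x = 1 + 2"

print(tokenizer.tokenize(dense))   # e.g. ['x', '=', '1', '+', '2']
print(tokenizer.tokenize(spaced))  # e.g. ['x', 'Ġ=', 'Ġ1', 'Ġ+', 'Ġ2']
```

The two snippets are identical to a grammar-aware parser, but the model conditions on different token sequences, which is exactly the mismatch TokDrift measures.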

💡 Selected Publications / Preprints

  1. TokDrift: When LLM Speaks in Subwords but Code Speaks in Grammar
    Yinxi Li, Yuntian Deng, and Pengyu Nie
    arXiv preprint arXiv:2510.14972, 2025