[93rd TrustML Young Scientist Seminar] Talk by Diganta Misra (Max Planck Institute) "LLMs vs. Torch 1.5: Why Your Code Assistant Can't Keep Up"

Start: 2025-05-09T10:00+09:00
End: 2025-05-09T11:00+09:00

Participants: 35

Organizer: RIKEN AIP Public

Date and Time: May 9, 2025, 10:00-11:00 (JST)
Venue: Online and Open Space at the RIKEN AIP Nihonbashi office

*Open Space is available to AIP researchers only

Title: LLMs vs. Torch 1.5: Why Your Code Assistant Can't Keep Up

Speaker: Diganta Misra (Max Planck Institute)

Abstract: In the fast-evolving world of software libraries, code generation models are struggling to keep pace. Most existing benchmarks focus on static, version-agnostic code predictions, failing to capture the true complexity of adapting to frequent updates and maintaining compatibility with multiple library versions. To address this gap, we introduce GitChameleon, a novel dataset featuring 116 Python code completion tasks, each tied to specific library versions and accompanied by executable unit tests. This dataset is designed to rigorously evaluate the ability of large language models (LLMs) to generate version-specific code that is both syntactically correct and functionally accurate. Our findings are revealing: even state-of-the-art models like GPT-4o achieve a pass@10 of just 39.9% (43.7% with error feedback), highlighting significant limitations in their ability to adapt to versioned code. In this talk, I’ll explore why today’s LLMs, while impressive, still fall short in the dynamic landscape of evolving software libraries. By examining these challenges, we hope to spark a conversation about how to build more adaptable, reliable code generation tools for the future.
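
For readers unfamiliar with the pass@10 metric quoted in the abstract, here is a minimal sketch of how such a score is commonly computed. The abstract does not say how GitChameleon computes it, so this assumes the widely used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), where n is the number of samples generated for a task and c is the number that pass its unit tests. The function names and the sample counts in the usage lines are hypothetical and are not taken from the talk.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one task: n samples generated, c of them pass the tests.

    Assumption: the standard unbiased estimator 1 - C(n-c, k) / C(n, k).
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: every size-k subset contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over tasks; each entry is (n_samples, n_passing)."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical per-task counts, for illustration only.
example_results = [(20, 1), (20, 0), (20, 5)]
score = mean_pass_at_k(example_results, k=10)
```

Under this reading, a benchmark-level pass@10 is the mean of the per-task estimates, so a task counts toward the score in proportion to how likely at least one of 10 sampled completions is to pass its version-specific tests.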

Bio: Diganta Misra is an Amazon x IMPRS-IS x ELLIS PhD Fellow at the Max Planck Institute for Intelligent Systems in Tübingen, advised by Antonio Orvieto and Volkan Cevher (EPFL). His work spans hypernetworks, code-generation LLMs, evaluation frameworks, and Mixture of Experts. Previously, Diganta was a UNIQUE scholar and MSc student at Mila Montréal, advised by Irina Rish, and has worked in research and engineering roles at firms and institutions such as Morgan Stanley, CMU, and Weights & Biases. Outside of research, his interests include piano, tribal art, mountaineering, travelling, and soccer.
