[93rd TrustML Young Scientist Seminar] Talk by Diganta Misra (Max Planck Institute) "LLMs vs. Torch 1.5: Why Your Code Assistant Can't Keep Up"

2025/05/09 (Fri)
01:00-02:00
Participants

35 /

Organizer: RIKEN AIP Public

Date and Time: May 9, 2025, 10:00 - 11:00 (JST)
Venue: Online and Open Space at the RIKEN AIP Nihonbashi office

*Open Space is available to AIP researchers only

Title: LLMs vs. Torch 1.5: Why Your Code Assistant Can't Keep Up

Speaker: Diganta Misra (Max Planck Institute)

Abstract: In the fast-evolving world of software libraries, code generation models are struggling to keep pace. Most existing benchmarks focus on static, version-agnostic code predictions, failing to capture the true complexity of adapting to frequent updates and maintaining compatibility with multiple library versions. To address this gap, we introduce GitChameleon, a novel dataset featuring 116 Python code completion tasks, each tied to specific library versions and accompanied by executable unit tests. This dataset is designed to rigorously evaluate the ability of large language models (LLMs) to generate version-specific code that is both syntactically correct and functionally accurate. Our findings are revealing: even state-of-the-art models like GPT-4o achieve a pass@10 of just 39.9% (43.7% with error feedback), highlighting significant limitations in their ability to adapt to versioned code. In this talk, I’ll explore why today’s LLMs, while impressive, still fall short in the dynamic landscape of evolving software libraries. By examining these challenges, we hope to spark a conversation about how to build more adaptable, reliable code generation tools for the future.
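For readers unfamiliar with the pass@10 figure quoted in the abstract, here is a minimal sketch of how a pass@k score is commonly estimated. The abstract does not say how the reported numbers were computed, so this assumes the widely used unbiased estimator 1 - C(n-c, k) / C(n, k), where n samples are generated per task and c of them pass the unit tests. The function names `pass_at_k` and `mean_pass_at_k` and the sample counts are hypothetical, chosen only for illustration.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one task (assumed estimator, not from the talk).

    n: number of samples generated for the task
    c: number of those samples that pass the task's unit tests
    k: number of attempts the metric allows
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k samples fail,
    # in which case any k drawn samples must contain a passing one.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average the per-task estimate over (n, c) pairs, one pair per task."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Hypothetical counts for three tasks with 20 samples each.
    toy_results = [(20, 0), (20, 1), (20, 5)]
    print(f"pass@10 = {mean_pass_at_k(toy_results, k=10):.3f}")
```

Under this assumption, a task counts toward pass@10 whenever at least one of ten attempts passes its tests, so a score near 40% means that most tasks stayed unsolved even with ten tries.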

Bio: Diganta Misra is an Amazon x IMPRS-IS x ELLIS PhD Fellow at the Max Planck Institute for Intelligent Systems in Tübingen, advised by Antonio Orvieto and Volkan Cevher (EPFL). His work spans hypernetworks, code-generation LLMs, evaluation frameworks, and Mixture of Experts. Previously, Diganta was a UNIQUE scholar MSc student at Mila Montréal, advised by Irina Rish, and has worked in research and engineering roles at firms and institutions such as Morgan Stanley, CMU, and Weights & Biases. Outside of research, his interests include piano, tribal art, mountaineering, travelling, and soccer.

Workshop