EffiBench: Benchmarking the Efficiency of Automatically Generated Code

Huang, Dong; Zhang, Jie M.; Qing, Yuhao; Cui, Heming

Computer Science > Software Engineering

arXiv:2402.02037v2 (cs)

[Submitted on 3 Feb 2024 (v1), revised 15 Feb 2024 (this version, v2), latest version 4 Jul 2024 (v4)]

Abstract: Code generation models have become increasingly integral to software development, assisting with tasks such as code completion, debugging, and code translation. Although current research has thoroughly examined the correctness of code produced by these models, a vital aspect, the efficiency of the generated code, has often been neglected. This paper presents EffiBench, a benchmark of 1,000 efficiency-critical coding problems for assessing the efficiency of code generated by code generation models. EffiBench contains a diverse set of LeetCode coding problems, each paired with an executable human-written canonical solution. With EffiBench, we empirically examine the capability of 21 Large Language Models (13 open-source and 8 closed-source) to generate efficient code. The results demonstrate that GPT-4-turbo generates the most efficient code, significantly outperforming Palm-2-chat-bison, Claude-instant-1, Gemini-pro, GPT-4, and GPT-3.5. Nevertheless, its code is still less efficient than the human-written canonical solutions. In particular, the average and worst execution times of GPT-4-turbo generated code are 1.69 and 45.49 times those of the canonical solutions, respectively.
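
The abstract reports execution time as a multiple of the canonical solution's time. As a rough illustration of how such "times that of the canonical solution" figures could be computed, here is a minimal Python sketch. It assumes a per-problem ratio that is then averaged and maximized; the abstract does not state the exact aggregation, so this is one plausible reading and not the paper's definition. The function names and the timing values are hypothetical.

```python
from statistics import mean


def execution_time_ratios(generated, canonical):
    """Return per-problem ratios of generated-code time to canonical time.

    Assumption: both lists hold one timing per problem, in the same order
    and the same unit. This is an illustrative reading of the abstract,
    not the benchmark's actual implementation.
    """
    if len(generated) != len(canonical):
        raise ValueError("timing lists must have the same length")
    if any(c <= 0 for c in canonical):
        raise ValueError("canonical timings must be positive")
    return [g / c for g, c in zip(generated, canonical)]


def summarize(ratios):
    """Average and worst-case (largest) slowdown across problems."""
    return {"average": mean(ratios), "worst": max(ratios)}


# Hypothetical timings in milliseconds for three problems.
generated_times = [300, 500, 4000]
canonical_times = [200, 500, 400]

ratios = execution_time_ratios(generated_times, canonical_times)
summary = summarize(ratios)
print(summary)
```

A ratio above 1 means the generated code was slower than the canonical solution on that problem. Averaging the per-problem ratios can give a different number than dividing total generated time by total canonical time, so the choice of aggregation matters when comparing against published figures.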

Comments:	26 pages, 13 figures, 18 tables
Subjects:	Software Engineering (cs.SE); Computation and Language (cs.CL)
Cite as:	arXiv:2402.02037 [cs.SE]
	(or arXiv:2402.02037v2 [cs.SE] for this version)
	https://doi.org/10.48550/arXiv.2402.02037

Submission history

From: Huang Dong
[v1] Sat, 3 Feb 2024 05:24:39 UTC (963 KB)
[v2] Thu, 15 Feb 2024 15:57:06 UTC (963 KB)
[v3] Fri, 7 Jun 2024 09:21:21 UTC (945 KB)
[v4] Thu, 4 Jul 2024 02:55:05 UTC (946 KB)
