GitHub Copilot: Difference between revisions

From Wikipedia, the free encyclopedia
Eudamonic (talk | contribs)
added Differentiable computing navbox
Revision as of 11:30, 2 September 2022

GitHub Copilot
Developer(s): GitHub, OpenAI
Stable release: 1.7.4421
Operating system: Microsoft Windows, Linux, macOS, Web
Website: copilot.github.com

GitHub Copilot is an artificial intelligence tool developed by GitHub and OpenAI to assist users of Visual Studio Code, Visual Studio, Neovim, and JetBrains integrated development environments (IDEs) by autocompleting code.[1] Currently available by subscription to individual developers, the tool was first announced by GitHub on 29 June 2021, and works best for users coding in Python, JavaScript, TypeScript, Ruby, and Go.[2][3]

History

On 29 June 2021, GitHub announced GitHub Copilot for technical preview in the Visual Studio Code development environment.[1][4]

On 26 October 2021, GitHub Copilot was released as a plugin on the JetBrains marketplace.[5]

On 27 October 2021, GitHub released the GitHub Copilot Neovim plugin as a public repository.[6]

On 29 March 2022, GitHub officially announced Copilot's availability for the Visual Studio 2022 IDE.[7]

On 21 June 2022, GitHub officially announced that Copilot had left "technical preview" and was available as a subscription-based service for individual developers.[8] Additionally, GitHub mentioned that Copilot would remain "free for verified students and maintainers of popular open source projects" and "will be offered to companies later this year (2022)".

Features

GitHub Copilot is powered by OpenAI Codex, an artificial intelligence model created by OpenAI, an artificial intelligence research laboratory.[9] OpenAI Codex is a modified, production version of the Generative Pre-trained Transformer 3 (GPT-3), a language model that uses deep learning to produce human-like text.[10] For example, when provided with a programming problem in natural language, Codex is capable of generating solution code.[11] It is also able to describe input code in English and translate code between programming languages.[11] GPT-3, on which Codex is based, is licensed exclusively to Microsoft, GitHub’s parent company.[12]

Copilot’s OpenAI Codex is trained on a selection of English-language text, public GitHub repositories, and other publicly available source code.[3] This includes a filtered dataset of 159 gigabytes of Python code sourced from 54 million public GitHub repositories.[13]

According to its website, GitHub Copilot includes assistive features for programmers, such as converting code comments to runnable code and autocompleting chunks of code, repetitive sections of code, and entire methods or functions.[3][14] GitHub reports that Copilot’s autocomplete feature is accurate roughly half of the time; given some Python function headers, for example, Copilot correctly completed the rest of the function body 43% of the time on the first try and 57% of the time when allowed ten attempts.[3]
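The comment-to-code workflow can be pictured with a small hand-written sketch. The function name `days_between`, its docstring, and its body are hypothetical illustrations, not actual Copilot output: the programmer is assumed to supply the header and the docstring, and an assistant would be expected to propose a body along these lines.

```python
from datetime import date


def days_between(start: str, end: str) -> int:
    """Return the number of days between two ISO dates (YYYY-MM-DD)."""
    # Everything below stands in for a suggested completion.
    # It is a hand-written illustration, not real Copilot output.
    first = date.fromisoformat(start)
    second = date.fromisoformat(end)
    # abs() makes the result independent of argument order.
    return abs((second - first).days)
```

For instance, `days_between("2021-06-29", "2022-06-21")` returns 357, the number of days between the technical preview announcement and general availability.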

GitHub states that Copilot’s features allow programmers to navigate unfamiliar coding frameworks and languages by reducing the amount of time users spend reading documentation.[3]

Reception

Since Copilot's release, there have been concerns about its security and educational impact, as well as a licensing controversy surrounding the code it produces.[2][11][15]

Licensing controversy

Although most code output by Copilot can be classified as a transformative work, GitHub admits that a small proportion is copied verbatim, which has led to fears that the output code is insufficiently transformative to be classified as fair use and may infringe on the copyright of the original owner.[2] This leaves Copilot on untested legal ground, although GitHub states that "training machine learning models on publicly available data is considered fair use across the machine learning community".[2] The company has also stated that, as of June 2022, only a small number of code snippets are reproduced completely or partially unchanged. This figure is expected to drop as the software continues to learn.[16]

FSF white papers

On 28 July 2021, the Free Software Foundation (FSF) published a funded call for white papers on philosophical and legal questions around Copilot.[17] Donald Robertson, the Licensing and Compliance Manager of the FSF, stated that "Copilot raises many [...] questions which require deeper examination."[17] On 24 February 2022, the FSF announced that it had received 22 papers on the subject and, using an anonymous review process, had chosen five papers to highlight.[18]

Security concerns

A paper accepted for publication in the IEEE Symposium on Security and Privacy in 2022 assessed the security of code generated by Copilot with respect to MITRE’s top 25 code weakness enumerations (e.g., cross-site scripting, path traversal) across 89 different scenarios and 1,689 programs.[15] This was done along the axes of diversity of weaknesses (its ability to respond to scenarios that may lead to various code weaknesses), diversity of prompts (its ability to respond to the same code weakness with subtle variation), and diversity of domains (its ability to generate register transfer level hardware specifications in Verilog).[15] The study found that across these axes in multiple languages, 39.33% of top suggestions and 40.73% of total suggestions led to code vulnerabilities. Additionally, the authors found that small, non-semantic changes made to code (i.e., changes to comments) could impact code safety.[15]
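To make one of the named weakness classes concrete, the following is a minimal sketch of path traversal and one possible mitigation. It is not taken from the study: the directory `/srv/app/uploads` and the function names `unsafe_path` and `safe_path` are assumptions chosen for illustration.

```python
from pathlib import Path

# Hypothetical directory that the application is allowed to serve files from.
BASE_DIR = Path("/srv/app/uploads")


def unsafe_path(filename: str) -> Path:
    # Vulnerable pattern: user input is joined to the base directory
    # unchecked, so "../" segments can point outside of it.
    return BASE_DIR / filename


def safe_path(filename: str) -> Path:
    # Normalize the joined path, then refuse anything outside BASE_DIR.
    base = BASE_DIR.resolve()
    candidate = (base / filename).resolve()
    if candidate != base and base not in candidate.parents:
        raise ValueError("path escapes the upload directory")
    return candidate
```

With these definitions, `safe_path("report.txt")` returns a path inside the upload directory, while `safe_path("../../etc/passwd")` raises `ValueError`. A code suggestion resembling `unsafe_path` is the kind of completion a security review would be expected to flag.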

Education concerns

A February 2022 paper published by the Association for Computing Machinery evaluates the impact that Codex, the technology used by GitHub Copilot, may have on the education of novice programmers.[11] The study uses assessment questions from an introductory programming class at the University of Auckland and compares Codex’s responses with student performance.[11] Researchers found that Codex, on average, performed better than most students; however, its performance decreased on questions that limited what features could be used in the solution (e.g., conditionals, collections, and loops).[11] Given this type of problem, “only two of [Codex’s] 10 solutions produced the correct output, but both [...] violated [the] constraint.” The paper concludes that Codex may be useful in providing a variety of solutions to learners, but may also lead to over-reliance and plagiarism.[11]
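The distinction between producing the correct output and respecting a constraint on language features can be sketched as follows. The exercise ("sum a list without using loops"), the submission text, and the helper `uses_loop` are hypothetical and are not taken from the paper; the check relies on Python's standard `ast` module.

```python
import ast

# Hypothetical submission: it computes the right total but uses a loop,
# which the (assumed) exercise forbids.
SUBMISSION = """
def total(values):
    result = 0
    for v in values:
        result += v
    return result
"""


def uses_loop(source: str) -> bool:
    """Report whether the source contains a for or while statement."""
    # Note: comprehensions are separate AST nodes and are not detected here.
    tree = ast.parse(source)
    return any(isinstance(node, (ast.For, ast.While)) for node in ast.walk(tree))


# Run the submission in an isolated namespace and check both properties.
namespace = {}
exec(SUBMISSION, namespace)
output_is_correct = namespace["total"]([1, 2, 3]) == 6
violates_constraint = uses_loop(SUBMISSION)
```

Here both `output_is_correct` and `violates_constraint` are true, so a grader that only compares output would accept a solution that a feature-restricted question is meant to reject.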

References

  1. Gershgorn, Dave (29 June 2021). "GitHub and OpenAI launch a new AI tool that generates its own code". The Verge. Retrieved 6 July 2021.
  3. "GitHub Copilot · Your AI pair programmer". GitHub Copilot. Retrieved 7 April 2022.
  4. "Introducing GitHub Copilot: your AI pair programmer". The GitHub Blog. 29 June 2021. Retrieved 7 April 2022.
  5. "GitHub Copilot - IntelliJ IDEs Plugin | Marketplace". JetBrains Marketplace. Retrieved 7 April 2022.
  6. "Copilot.vim". GitHub. 7 April 2022. Retrieved 7 April 2022.
  7. "GitHub Copilot now available for Visual Studio 2022". The GitHub Blog. 29 March 2022. Retrieved 7 April 2022.
  8. "GitHub Copilot is generally available to all developers". The GitHub Blog. 21 June 2022. Retrieved 21 June 2022.
  9. Krill, Paul (12 August 2021). "OpenAI offers API for GitHub Copilot AI model". InfoWorld. Retrieved 7 April 2022.
  10. "OpenAI Releases GPT-3, The Largest Model So Far". Analytics India Magazine. 3 June 2020. Retrieved 7 April 2022.
  11. Finnie-Ansley, James; Denny, Paul; Becker, Brett A.; Luxton-Reilly, Andrew; Prather, James (14 February 2022). "The Robots Are Coming: Exploring the Implications of OpenAI Codex on Introductory Programming". Australasian Computing Education Conference. ACE '22. New York, NY, USA: Association for Computing Machinery: 10–19. doi:10.1145/3511861.3511863. ISBN 978-1-4503-9643-1. S2CID 246681316.
  12. "OpenAI is giving Microsoft exclusive access to its GPT-3 language model". MIT Technology Review. Retrieved 7 April 2022.
  13. "OpenAI Announces 12 Billion Parameter Code-Generation AI Codex". InfoQ. Retrieved 7 April 2022.
  14. Sobania, Dominik; Schweim, Dirk; Rothlauf, Franz (2022). "A Comprehensive Survey on Program Synthesis with Evolutionary Algorithms". IEEE Transactions on Evolutionary Computation: 1. doi:10.1109/TEVC.2022.3162324. ISSN 1941-0026. S2CID 247721793.
  15. Pearce, Hammond; Ahmad, Baleegh; Tan, Benjamin; Dolan-Gavitt, Brendan; Karri, Ramesh (16 December 2021). "Asleep at the Keyboard? Assessing the Security of GitHub Copilot's Code Contributions". arXiv:2108.09293 [cs.CR].
  16. "GitHub Copilot: The programming assistant at a glance". IONOS Digitalguide. Retrieved 20 July 2022.
  17. "FSF-funded call for white papers on philosophical and legal questions around Copilot". Free Software Foundation. 28 July 2021. Retrieved 11 August 2021.
  18. "Publication of the FSF-funded white papers on questions around Copilot". Free Software Foundation. 24 February 2022.