evoML Yellow Paper: Evolutionary AI and Optimisation Studio
Authors:
Lingbo Li,
Leslie Kanthan,
Michail Basios,
Fan Wu,
Manal Adham,
Vitali Avagyan,
Alexis Butler,
Paul Brookes,
Rafail Giavrimis,
Buhong Liu,
Chrystalla Pavlou,
Matthew Truscott,
Vardan Voskanyan
Abstract:
Machine learning model development and optimisation can be a cumbersome and resource-intensive process. Custom models are often more difficult to build and deploy, and they require infrastructure and expertise that are costly to acquire and maintain. The machine learning product development lifecycle must take into account the difficulties of developing and deploying machine learning models. evoML is an AI-powered tool that automates machine learning model development, optimisation, and model code optimisation. Its core functionalities include data cleaning, exploratory analysis, feature analysis and generation, model optimisation, model evaluation, model code optimisation, and model deployment. Additionally, a key feature of evoML is that it embeds code and model optimisation into the model development process and includes multi-objective optimisation capabilities.
Submitted 20 December, 2022;
originally announced December 2022.
Do Names Echo Semantics? A Large-Scale Study of Identifiers Used in C++'s Named Casts
Authors:
Constantin Cezar Petrescu,
Sam Smith,
Rafail Giavrimis,
Santanu Kumar Dash
Abstract:
Developers relax restrictions on a type to reuse methods with other types. While type casts are prevalent, in weakly typed languages such as C++ they are also extremely permissive. Assignments in which a source expression is cast into a new type and assigned to a target variable of that type can lead to software bugs if performed without care. In this paper, we propose an information-theoretic approach to identify poor implementations of explicit cast operations. Our approach measures accord between the source expression and the target variable using conditional entropy. We collect casts from 34 components of the Chromium project, which collectively account for 27 MLOC, and sample this dataset uniformly at random to create a manually labelled dataset of 271 casts. Information-theoretic vetting of these 271 casts achieves a peak precision of 81% and a recall of 90%. We additionally present the findings of an in-depth investigation of notable explicit casts, two of which were fixed in recent releases of the Chromium project.
Submitted 3 April, 2023; v1 submitted 2 November, 2021;
originally announced November 2021.
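The conditional-entropy measure mentioned in the abstract above can be illustrated with a small sketch. The abstract does not say how identifiers are tokenised or how probabilities are estimated, so everything here is an assumption: the function name `conditional_entropy`, the use of plain co-occurrence counts, and the toy (source token, target token) pairs are all hypothetical and are not the authors' implementation. The sketch only shows the textbook quantity H(Y | X) = -Σ p(x, y) log2 p(y | x), where a low value would suggest the target name is predictable from the source name.

```python
import math
from collections import Counter


def conditional_entropy(pairs):
    """Estimate H(Y | X) in bits from observed (x, y) pairs.

    Probabilities are plain relative frequencies (an assumption made
    for this sketch; the abstract does not specify an estimator).
    """
    pairs = list(pairs)
    n = len(pairs)
    if n == 0:
        raise ValueError("need at least one (x, y) observation")

    joint = Counter(pairs)                      # counts of (x, y)
    marginal_x = Counter(x for x, _ in pairs)   # counts of x

    entropy = 0.0
    for (x, _y), count in joint.items():
        p_xy = count / n                     # p(x, y)
        p_y_given_x = count / marginal_x[x]  # p(y | x)
        entropy -= p_xy * math.log2(p_y_given_x)
    return entropy


# Hypothetical toy data: (token in source expression, token in target name).
consistent = [("width", "width"), ("width", "width"), ("count", "count")]
mixed = [("width", "width"), ("width", "offset")]

h_consistent = conditional_entropy(consistent)  # target fully determined by source
h_mixed = conditional_entropy(mixed)            # two equally likely targets
```

In this toy setting, a source token that always maps to the same target token contributes no uncertainty, while a source token paired with two equally frequent target tokens contributes one bit.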