-
MLR-Copilot: Autonomous Machine Learning Research based on Large Language Models Agents
Authors:
Ruochen Li,
Teerth Patel,
Qingyun Wang,
Xinya Du
Abstract:
Machine learning research, crucial for technological advancements and innovation, often faces significant challenges due to its inherent complexity, slow pace of experimentation, and the necessity for specialized expertise. Motivated by this, we present a new systematic framework, autonomous Machine Learning Research with large language models (MLR-Copilot), designed to enhance machine learning research productivity through the automatic generation and implementation of research ideas using Large Language Model (LLM) agents. The framework consists of three phases: research idea generation, experiment implementation, and implementation execution. First, existing research papers are used to generate hypotheses and experimental plans via IdeaAgent powered by LLMs. Next, the implementation generation phase translates these plans into executables with ExperimentAgent. This phase leverages retrieved prototype code and optionally retrieves candidate models and data. Finally, the execution phase, also managed by ExperimentAgent, involves running experiments with mechanisms for human feedback and iterative debugging to enhance the likelihood of achieving executable research outcomes. We evaluate our framework on five machine learning research tasks and the experimental results show the framework's potential to facilitate research progress and innovation.
Submitted 26 August, 2024;
originally announced August 2024.
-
As Biased as You Measure: Methodological Pitfalls of Bias Evaluations in Speaker Verification Research
Authors:
Wiebke Hutiri,
Tanvina Patel,
Aaron Yi Ding,
Odette Scharenborg
Abstract:
Detecting and mitigating bias in speaker verification systems is important, as datasets, processing choices and algorithms can lead to performance differences that systematically favour some groups of people while disadvantaging others. Prior studies have thus measured performance differences across groups to evaluate bias. However, when comparing results across studies, it becomes apparent that they draw contradictory conclusions, hindering progress in this area. In this paper we investigate how measurement impacts the outcomes of bias evaluations. We show empirically that bias evaluations are strongly influenced by base metrics that measure performance, by the choice of ratio or difference-based bias measure, and by the aggregation of bias measures into meta-measures. Based on our findings, we recommend the use of ratio-based bias measures, in particular when the values of base metrics are small, or when base metrics with different orders of magnitude need to be compared.
Submitted 24 August, 2024;
originally announced August 2024.
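The contrast between ratio-based and difference-based bias measures described in this abstract can be illustrated with a minimal sketch. The function names and the error rates below are hypothetical and are not taken from the paper; the sketch only shows why a difference can look negligible when base metric values are small while the ratio still exposes the gap.

```python
def difference_bias(group_metric: float, reference_metric: float) -> float:
    """Difference-based measure: gap between a group's base metric and a reference."""
    return group_metric - reference_metric


def ratio_bias(group_metric: float, reference_metric: float) -> float:
    """Ratio-based measure: a group's base metric relative to a reference."""
    return group_metric / reference_metric


# Hypothetical error rates (assumed values, for illustration only).
# In both cases the group's error rate is twice the reference error rate.
small_scale = {"group": 0.002, "reference": 0.001}
large_scale = {"group": 0.20, "reference": 0.10}

for name, case in (("small", small_scale), ("large", large_scale)):
    diff = difference_bias(case["group"], case["reference"])
    ratio = ratio_bias(case["group"], case["reference"])
    print(f"{name}: difference={diff:.4f}, ratio={ratio:.2f}")
```

In this toy setting the difference shrinks by two orders of magnitude between the two cases, whereas the ratio stays at about 2, which is consistent with the abstract's recommendation for small base metric values.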
-
ReCon: Reconfiguring Analog Rydberg Atom Quantum Computers for Quantum Generative Adversarial Networks
Authors:
Nicholas S. DiBrita,
Daniel Leeds,
Yuqian Huo,
Jason Ludmir,
Tirthak Patel
Abstract:
Quantum computing has shown theoretical promise of speedup in several machine learning tasks, including generative tasks using generative adversarial networks (GANs). While quantum computers have been implemented with different types of technologies, recently, analog Rydberg atom quantum computers have been demonstrated to have desirable properties such as reconfigurable qubit (quantum bit) positions and multi-qubit operations. To leverage the properties of this technology, we propose ReCon, the first work to implement quantum GANs on analog Rydberg atom quantum computers. Our evaluation using simulations and real-computer executions shows 33% better quality (measured using Frechet Inception Distance (FID)) in generated images than the state-of-the-art technique implemented on superconducting-qubit technology.
Submitted 23 August, 2024;
originally announced August 2024.
-
Improving child speech recognition with augmented child-like speech
Authors:
Yuanyuan Zhang,
Zhengjun Yue,
Tanvina Patel,
Odette Scharenborg
Abstract:
State-of-the-art ASRs show suboptimal performance for child speech. The scarcity of child speech limits the development of child speech recognition (CSR). Therefore, we studied child-to-child voice conversion (VC) from existing child speakers in the dataset and additional (new) child speakers via monolingual and cross-lingual (Dutch-to-German) VC, respectively. The results showed that cross-lingual child-to-child VC significantly improved child ASR performance. Experiments on the impact of the quantity of child-to-child cross-lingual VC-generated data on fine-tuning (FT) ASR models gave the best results with two-fold augmentation for our FT-Conformer and FT-Whisper models, which reduced WERs by ~3% absolute compared to the baseline, and with six-fold augmentation for the model trained from scratch, which improved by an absolute 3.6% WER. Moreover, using a small amount of "high-quality" VC-generated data achieved similar results to those of our best FT models.
Submitted 12 June, 2024;
originally announced June 2024.
-
KerasCV and KerasNLP: Vision and Language Power-Ups
Authors:
Matthew Watson,
Divyashree Shivakumar Sreepathihalli,
Francois Chollet,
Martin Gorner,
Kiranbir Sodhia,
Ramesh Sampath,
Tirth Patel,
Haifeng Jin,
Neel Kovelamudi,
Gabriel Rasskin,
Samaneh Saadat,
Luke Wood,
Chen Qian,
Jonathan Bischof,
Ian Stenbit,
Abheesht Sharma,
Anshuman Mishra
Abstract:
We present the Keras domain packages KerasCV and KerasNLP, extensions of the Keras API for Computer Vision and Natural Language Processing workflows, capable of running on either JAX, TensorFlow, or PyTorch. These domain packages are designed to enable fast experimentation, with a focus on ease-of-use and performance. We adopt a modular, layered design: at the library's lowest level of abstraction, we provide building blocks for creating models and data preprocessing pipelines, and at the library's highest level of abstraction, we provide pretrained "task" models for popular architectures such as Stable Diffusion, YOLOv8, GPT2, BERT, Mistral, CLIP, Gemma, T5, etc. Task models have built-in preprocessing, pretrained weights, and can be fine-tuned on raw inputs. To enable efficient training, we support XLA compilation for all models, and run all preprocessing via a compiled graph of TensorFlow operations using the tf.data API. The libraries are fully open-source (Apache 2.0 license) and available on GitHub.
Submitted 5 June, 2024; v1 submitted 30 May, 2024;
originally announced May 2024.
-
Benchmarking the CoW with the TopCoW Challenge: Topology-Aware Anatomical Segmentation of the Circle of Willis for CTA and MRA
Authors:
Kaiyuan Yang,
Fabio Musio,
Yihui Ma,
Norman Juchler,
Johannes C. Paetzold,
Rami Al-Maskari,
Luciano Höher,
Hongwei Bran Li,
Ibrahim Ethem Hamamci,
Anjany Sekuboyina,
Suprosanna Shit,
Houjing Huang,
Chinmay Prabhakar,
Ezequiel de la Rosa,
Diana Waldmannstetter,
Florian Kofler,
Fernando Navarro,
Martin Menten,
Ivan Ezhov,
Daniel Rueckert,
Iris Vos,
Ynte Ruigrok,
Birgitta Velthuis,
Hugo Kuijf,
Julien Hämmerli
, et al. (59 additional authors not shown)
Abstract:
The Circle of Willis (CoW) is an important network of arteries connecting major circulations of the brain. Its vascular architecture is believed to affect the risk, severity, and clinical outcome of serious neuro-vascular diseases. However, characterizing the highly variable CoW anatomy is still a manual and time-consuming expert task. The CoW is usually imaged by two angiographic imaging modalities, magnetic resonance angiography (MRA) and computed tomography angiography (CTA), but there exist limited public datasets with annotations on CoW anatomy, especially for CTA. Therefore we organized the TopCoW Challenge in 2023 with the release of an annotated CoW dataset. The TopCoW dataset was the first public dataset with voxel-level annotations for thirteen possible CoW vessel components, enabled by virtual-reality (VR) technology. It was also the first large dataset with paired MRA and CTA from the same patients. TopCoW challenge formalized the CoW characterization problem as a multiclass anatomical segmentation task with an emphasis on topological metrics. We invited submissions worldwide for the CoW segmentation task, which attracted over 140 registered participants from four continents. The top performing teams managed to segment many CoW components to Dice scores around 90%, but with lower scores for communicating arteries and rare variants. There were also topological mistakes for predictions with high Dice scores. Additional topological analysis revealed further areas for improvement in detecting certain CoW components and matching CoW variant topology accurately. TopCoW represented a first attempt at benchmarking the CoW anatomical segmentation task for MRA and CTA, both morphologically and topologically.
Submitted 29 April, 2024; v1 submitted 29 December, 2023;
originally announced December 2023.
-
Small Effect Sizes in Malware Detection? Make Harder Train/Test Splits!
Authors:
Tirth Patel,
Fred Lu,
Edward Raff,
Charles Nicholas,
Cynthia Matuszek,
James Holt
Abstract:
Industry practitioners care about small improvements in malware detection accuracy because their models are deployed to hundreds of millions of machines, meaning a 0.1% change can cause an overwhelming number of false positives. However, academic research is often restrained to public datasets on the order of ten thousand samples and is too small to detect improvements that may be relevant to industry. Working within these constraints, we devise an approach to generate a benchmark of configurable difficulty from a pool of available samples. This is done by leveraging malware family information from tools like AVClass to construct training/test splits that have different generalization rates, as measured by a secondary model. Our experiments will demonstrate that using a less accurate secondary model with disparate features is effective at producing benchmarks for a more sophisticated target model that is under evaluation. We also ablate against alternative designs to show the need for our approach.
Submitted 25 December, 2023;
originally announced December 2023.
-
Swarm-GPT: Combining Large Language Models with Safe Motion Planning for Robot Choreography Design
Authors:
Aoran Jiao,
Tanmay P. Patel,
Sanjmi Khurana,
Anna-Mariya Korol,
Lukas Brunke,
Vivek K. Adajania,
Utku Culha,
Siqi Zhou,
Angela P. Schoellig
Abstract:
This paper presents Swarm-GPT, a system that integrates large language models (LLMs) with safe swarm motion planning - offering an automated and novel approach to deployable drone swarm choreography. Swarm-GPT enables users to automatically generate synchronized drone performances through natural language instructions. With an emphasis on safety and creativity, Swarm-GPT addresses a critical gap in the field of drone choreography by integrating the creative power of generative models with the effectiveness and safety of model-based planning algorithms. This goal is achieved by prompting the LLM to generate a unique set of waypoints based on extracted audio data. A trajectory planner processes these waypoints to guarantee collision-free and feasible motion. Results can be viewed in simulation prior to execution and modified through dynamic re-prompting. Sim-to-real transfer experiments demonstrate Swarm-GPT's ability to accurately replicate simulated drone trajectories, with a mean sim-to-real root mean square error (RMSE) of 28.7 mm. To date, Swarm-GPT has been successfully showcased at three live events, exemplifying safe real-world deployment of pre-trained models.
Submitted 2 December, 2023;
originally announced December 2023.
-
Improving Whispered Speech Recognition Performance using Pseudo-whispered based Data Augmentation
Authors:
Zhaofeng Lin,
Tanvina Patel,
Odette Scharenborg
Abstract:
Whispering is a distinct form of speech known for its soft, breathy, and hushed characteristics, often used for private communication. The acoustic characteristics of whispered speech differ substantially from normally phonated speech and the scarcity of adequate training data leads to low automatic speech recognition (ASR) performance. To address the data scarcity issue, we use a signal processing-based technique that transforms the spectral characteristics of normal speech to those of pseudo-whispered speech. We augment an End-to-End ASR with pseudo-whispered speech and achieve an 18.2% relative reduction in word error rate for whispered speech compared to the baseline. Results for the individual speaker groups in the wTIMIT database show the best results for US English. Further investigation showed that the lack of glottal information in whispered speech has the largest impact on whispered speech ASR performance.
Submitted 9 November, 2023;
originally announced November 2023.
-
SLIQ: Quantum Image Similarity Networks on Noisy Quantum Computers
Authors:
Daniel Silver,
Tirthak Patel,
Aditya Ranjan,
Harshitta Gandhi,
William Cutler,
Devesh Tiwari
Abstract:
Exploration into quantum machine learning has grown tremendously in recent years due to the ability of quantum computers to speed up classical programs. However, these efforts have yet to solve unsupervised similarity detection tasks due to the challenge of porting them to run on quantum computers. To overcome this challenge, we propose SLIQ, the first open-sourced work for resource-efficient quantum similarity detection networks, built with practical and effective quantum learning and variance-reducing algorithms.
Submitted 26 September, 2023;
originally announced September 2023.
-
QUILT: Effective Multi-Class Classification on Quantum Computers Using an Ensemble of Diverse Quantum Classifiers
Authors:
Daniel Silver,
Tirthak Patel,
Devesh Tiwari
Abstract:
Quantum computers can theoretically achieve significant acceleration over classical computers, but the near-future era of quantum computing is limited by the small number of qubits, which are also error-prone. Quilt is a framework for performing multi-class classification tasks, designed to work effectively on current error-prone quantum computers. Quilt is evaluated with real quantum machines as well as with projected noise levels as quantum machines become more noise-free. Quilt demonstrates up to 85% multi-class classification accuracy with the MNIST dataset on a five-qubit system.
Submitted 26 September, 2023;
originally announced September 2023.
-
Can NLP Models 'Identify', 'Distinguish', and 'Justify' Questions that Don't have a Definitive Answer?
Authors:
Ayushi Agarwal,
Nisarg Patel,
Neeraj Varshney,
Mihir Parmar,
Pavan Mallina,
Aryan Bhavin Shah,
Srihari Raju Sangaraju,
Tirth Patel,
Nihar Thakkar,
Chitta Baral
Abstract:
Though state-of-the-art (SOTA) NLP systems have achieved remarkable performance on a variety of language understanding tasks, they primarily focus on questions that have a correct and a definitive answer. However, in real-world applications, users often ask questions that don't have a definitive answer. Incorrectly answering such questions certainly hampers a system's reliability and trustworthiness. Can SOTA models accurately identify such questions and provide a reasonable response?
To investigate the above question, we introduce QnotA, a dataset consisting of five different categories of questions that don't have definitive answers. Furthermore, for each QnotA instance, we also provide a corresponding QA instance, i.e., an alternate question that "can be" answered. With this data, we formulate three evaluation tasks that test a system's ability to 'identify', 'distinguish', and 'justify' QnotA questions. Through comprehensive experiments, we show that even SOTA models including GPT-3 and Flan T5 do not fare well on these tasks and lag considerably behind the human performance baseline. We conduct a thorough analysis which further leads to several interesting findings. Overall, we believe our work and findings will encourage and facilitate further research in this important area and help develop more robust models.
Submitted 8 September, 2023;
originally announced September 2023.
-
MosaiQ: Quantum Generative Adversarial Networks for Image Generation on NISQ Computers
Authors:
Daniel Silver,
Tirthak Patel,
William Cutler,
Aditya Ranjan,
Harshitta Gandhi,
Devesh Tiwari
Abstract:
Quantum machine learning and vision have come to the fore recently, with hardware advances enabling rapid advancement in the capabilities of quantum machines. Recently, quantum image generation has been explored with many potential advantages over non-quantum techniques; however, previous techniques have suffered from poor quality and robustness. To address these problems, we introduce MosaiQ, a high-quality quantum image generation GAN framework that can be executed on today's Near-term Intermediate Scale Quantum (NISQ) computers.
Submitted 21 August, 2023;
originally announced August 2023.
-
Semi-Supervised Anomaly Detection for the Determination of Vehicle Hijacking Tweets
Authors:
Taahir Aiyoob Patel,
Clement N. Nyirenda
Abstract:
In South Africa, there is an ever-growing issue of vehicle hijackings. This leads to travellers constantly being in fear of becoming a victim of such an incident. This work presents a new semi-supervised approach to using tweets to identify hijacking incidents by using unsupervised anomaly detection algorithms. Tweets containing the keyword "hijacking" are obtained, stored, and processed using the term frequency-inverse document frequency (TF-IDF) and further analyzed by using two anomaly detection algorithms: 1) K-Nearest Neighbour (KNN); 2) Cluster Based Outlier Factor (CBLOF). The comparative evaluation showed that the KNN method produced an accuracy of 89%, whereas the CBLOF produced an accuracy of 90%. The CBLOF method was also able to obtain an F1-score of 0.8, whereas the KNN produced 0.78. Therefore, there is a slight difference between the two approaches, in favour of CBLOF, which has been selected as the preferred unsupervised method for the determination of relevant hijacking tweets. In future, a comparison will be done between supervised learning methods and the unsupervised methods presented in this work on a larger dataset. Optimisation mechanisms will also be employed in order to increase the overall performance.
Submitted 19 August, 2023;
originally announced August 2023.
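The TF-IDF plus nearest-neighbour pipeline named in this abstract can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the example tweets are invented, the smoothed IDF formula and the choice of k are assumptions, and the abstract does not say which class of tweets the paper treats as the anomaly. CBLOF is not shown.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return L2-normalised TF-IDF vectors using a smoothed IDF (an assumed variant)."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    idf = {tok: math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1.0 for tok in vocab}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = [tf[tok] * idf[tok] for tok in vocab]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vectors


def knn_anomaly_scores(vectors, k=2):
    """Score each point by the Euclidean distance to its k-th nearest neighbour."""
    scores = []
    for i, vec in enumerate(vectors):
        dists = sorted(math.dist(vec, other) for j, other in enumerate(vectors) if j != i)
        scores.append(dists[k - 1])
    return scores


# Invented example tweets (hypothetical data, not from the paper).
tweets = [
    "hijacking reported on main road tonight",
    "hijacking reported near main road today",
    "vehicle hijacking reported on main road",
    "this movie plot is hijacking my whole evening",
]

scores = knn_anomaly_scores(tfidf_vectors(tweets), k=2)
most_anomalous = scores.index(max(scores))
print(most_anomalous, tweets[most_anomalous])
```

The tweet with the largest k-th neighbour distance is the one whose wording overlaps least with the rest, which is the intuition behind distance-based outlier scoring on TF-IDF features.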
-
Toward Privacy in Quantum Program Execution On Untrusted Quantum Cloud Computing Machines for Business-sensitive Quantum Needs
Authors:
Tirthak Patel,
Daniel Silver,
Aditya Ranjan,
Harshitta Gandhi,
William Cutler,
Devesh Tiwari
Abstract:
Quantum computing is an emerging paradigm that has shown great promise in accelerating large-scale scientific, optimization, and machine-learning workloads. With most quantum computing solutions being offered over the cloud, it has become imperative to protect confidential and proprietary quantum code from being accessed by untrusted and/or adversarial agents. In response to this challenge, we propose SPYCE, which is the first known solution to obfuscate quantum code and output to prevent the leaking of any confidential information over the cloud. SPYCE implements a lightweight, scalable, and effective solution based on the unique principles of quantum computing to achieve this task.
Submitted 31 July, 2023;
originally announced July 2023.
-
PRD: Peer Rank and Discussion Improve Large Language Model based Evaluations
Authors:
Ruosen Li,
Teerth Patel,
Xinya Du
Abstract:
Nowadays, the quality of responses generated by different modern large language models (LLMs) is hard to evaluate and compare automatically. Recent studies suggest and predominantly use LLMs for reference-free evaluation of open-ended question answering. More specifically, they use the recognized "strongest" LLM as the evaluator, which conducts pairwise comparisons of candidate models' answers and provides a ranking score. However, this intuitive method has multiple problems, such as bringing in self-enhancement (favoring its own answers) and positional bias. We draw insights and lessons from the educational domain (Cho & MacArthur, 2011; Walsh, 2014) to improve LLM-based evaluations. Specifically, we propose (1) the peer rank (PR) algorithm that takes into account each peer LLM's pairwise preferences of all answer pairs, and outputs a final ranking of models; and (2) peer discussion (PD), where we prompt two LLMs to discuss and try to reach a mutual agreement on the preferences of two answers. We conduct experiments on two benchmark datasets. We find that our approaches achieve higher accuracy and align better with human judgments. Interestingly, PR can induce a relatively accurate self-ranking of models under the anonymous setting, where each model's name is unrevealed. Our work provides space to explore evaluating models that are hard to compare for humans.
Submitted 3 July, 2024; v1 submitted 6 July, 2023;
originally announced July 2023.
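One way to read the peer rank (PR) idea in this abstract is as a weighted aggregation of pairwise preferences from several reviewer models. The sketch below is an illustration under assumptions, not the paper's algorithm: the function name `peer_rank`, the preference encoding, and the loop that re-weights reviewers by their own current scores are all hypothetical choices made here, and the toy preferences are invented.

```python
def peer_rank(models, prefs, iterations=20):
    """Rank models from reviewers' pairwise preferences.

    prefs[reviewer][(a, b)] is 1.0 if the reviewer prefers a's answer,
    0.0 if it prefers b's answer, and 0.5 for a tie.
    Reviewer weights start uniform and are then set proportional to each
    reviewer's own score (an assumption of this sketch).
    """
    weights = {m: 1.0 / len(models) for m in models}
    scores = dict(weights)
    for _ in range(iterations):
        wins = {m: 0.0 for m in models}
        games = {m: 0.0 for m in models}
        for reviewer, judgments in prefs.items():
            w = weights[reviewer]
            for (a, b), preference in judgments.items():
                wins[a] += w * preference
                wins[b] += w * (1.0 - preference)
                games[a] += w
                games[b] += w
        # Weighted win rate of each model over all reviewed pairs.
        scores = {m: (wins[m] / games[m] if games[m] else 0.0) for m in models}
        total = sum(scores.values())
        if total:
            weights = {m: s / total for m, s in scores.items()}
    ranking = sorted(models, key=lambda m: scores[m], reverse=True)
    return ranking, scores


# Invented preferences: reviewer C favours its own answers (self-enhancement).
models = ["A", "B", "C"]
prefs = {
    "A": {("A", "B"): 1.0, ("A", "C"): 1.0, ("B", "C"): 1.0},
    "B": {("A", "B"): 1.0, ("A", "C"): 1.0, ("B", "C"): 1.0},
    "C": {("A", "B"): 1.0, ("A", "C"): 0.0, ("B", "C"): 0.0},
}

ranking, scores = peer_rank(models, prefs)
print(ranking)
```

In this toy example the self-favouring reviewer loses influence over the iterations because its weight tracks its own low score, which is one way a peer-based scheme could dampen the self-enhancement bias the abstract mentions.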
-
Using Data Augmentations and VTLN to Reduce Bias in Dutch End-to-End Speech Recognition Systems
Authors:
Tanvina Patel,
Odette Scharenborg
Abstract:
Speech technology has improved greatly for norm speakers, i.e., adult native speakers of a language without speech impediments or strong accents. However, non-norm or diverse speaker groups show a distinct performance gap with norm speakers, which we refer to as bias. In this work, we aim to reduce bias against different age groups and non-native speakers of Dutch. For an end-to-end (E2E) ASR system, we use state-of-the-art speed perturbation and spectral augmentation as data augmentation techniques and explore Vocal Tract Length Normalization (VTLN) to normalise for spectral differences due to differences in anatomy. The combination of data augmentation and VTLN reduced the average WER and bias across various diverse speaker groups by 6.9% and 3.9%, respectively. The VTLN model trained on Dutch was also effective in improving performance on Mandarin Chinese child speech, thus showing generalisability across languages.
Submitted 4 July, 2023;
originally announced July 2023.
-
AVScan2Vec: Feature Learning on Antivirus Scan Data for Production-Scale Malware Corpora
Authors:
Robert J. Joyce,
Tirth Patel,
Charles Nicholas,
Edward Raff
Abstract:
When investigating a malicious file, searching for related files is a common task that malware analysts must perform. Given that production malware corpora may contain over a billion files and consume petabytes of storage, many feature extraction and similarity search approaches are computationally infeasible. Our work explores the potential of antivirus (AV) scan data as a scalable source of features for malware. This is possible because AV scan reports are widely available through services such as VirusTotal and are ~100x smaller than the average malware sample. An AV scan report is rich in information and can indicate a malicious file's family, behavior, target operating system, and many other characteristics. We introduce AVScan2Vec, a language model trained to comprehend the semantics of AV scan data. AVScan2Vec ingests AV scan data for a malicious file and outputs a meaningful vector representation. AVScan2Vec vectors are ~3 to 85x smaller than popular alternatives in use today, enabling faster vector comparisons and lower memory usage. By incorporating Dynamic Continuous Indexing, we show that nearest-neighbor queries on AVScan2Vec vectors can scale to even the largest malware production datasets. We also demonstrate that AVScan2Vec vectors are superior to other leading malware feature vector representations across nearly all classification, clustering, and nearest-neighbor lookup algorithms that we evaluated.
Submitted 9 June, 2023;
originally announced June 2023.
-
A systematic literature review on Security of Unmanned Aerial Vehicle Systems
Authors:
Tirth Patel,
Niyatiben Salot,
Vrusha Parikh
Abstract:
Unmanned aerial vehicles (UAVs) are becoming more common, and their operational range is expanding tremendously, making the security aspect of the inquiry essential. This study does a thorough assessment of the literature to determine the most common cyberattacks and the effects they have on UAV assaults on civilian targets. The STRIDE assault paradigm, the challenge they present, and the proper tools for the attack are used to categorize the cyber dangers discussed in this paper. Spoofing and denial of service assaults are the most prevalent types of UAV cyberattacks and have the best results. No attack style demands the employment of a hard-to-reach gadget, indicating that the security environment currently necessitates improvements to UAV use in civilian applications.
Submitted 9 December, 2022;
originally announced December 2022.
-
Comparative evaluation of different methods of "Homomorphic Encryption" and "Traditional Encryption" on a dataset with current problems and developments
Authors:
Tanvi S. Patel,
Srinivasakranthikiran Kolachina,
Daxesh P. Patel,
Pranav S. Shrivastav
Abstract:
A database is a prime target for cyber-attacks as it contains confidential, sensitive, or protected information. With the increasing sophistication of the internet and dependencies on internet data transmission, it has become vital to be aware of various encryption technologies and trends. It can assist in safeguarding private information and sensitive data, as well as improve the security of client-server communication. Database encryption is a procedure that employs an algorithm to convert data contained in a database into "cipher text," which is incomprehensible until decoded. Homomorphic encryption technology, which works with encrypted data, can be utilized in both symmetric and asymmetric systems. In this paper, we evaluated homomorphic encryption techniques based on recent highly cited articles, as well as compared all database encryption problems and developments since 2018. The benefits and drawbacks of homomorphic approaches were examined over classic encryption methods including Transparent Database Encryption, Column Level Encryption, Field Level Encryption, File System Level Encryption, and Encrypting File System Encryption in this review. Additionally, popular databases that provide encryption services to their customers to protect their data are also examined.
Submitted 17 November, 2022;
originally announced November 2022.
-
CHARTER: Identifying the Most-Critical Gate Operations in Quantum Circuits via Amplified Gate Reversibility
Authors:
Tirthak Patel,
Daniel Silver,
Devesh Tiwari
Abstract:
When quantum programs are executed on noisy intermediate-scale quantum (NISQ) computers, they experience hardware noise; consequently, the program outputs are often erroneous. To mitigate the adverse effects of hardware noise, it is necessary to understand the effect of hardware noise on the program output and more fundamentally, understand the impact of hardware noise on specific regions within a quantum program. Identifying and optimizing regions that are more noise-sensitive is the key to expanding the capabilities of NISQ computers.
Toward achieving that goal, we propose CHARTER, a novel technique to pinpoint specific gates and regions within a quantum program that are the most affected by the hardware noise and that have the highest impact on the program output. Using CHARTER's methodology, programmers can obtain a precise understanding of how different components of their code affect the output and optimize those components without the need for non-scalable quantum simulation on classical computers.
Submitted 17 November, 2022;
originally announced November 2022.
-
A Twitter-Driven Deep Learning Mechanism for the Determination of Vehicle Hijacking Spots in Cities
Authors:
Taahir Aiyoob Patel,
Clement N. Nyirenda
Abstract:
Vehicle hijacking is one of the leading crimes in many cities. For instance, in South Africa, drivers must constantly remain vigilant on the road in order to ensure that they do not become hijacking victims. This work is aimed at developing a map depicting hijacking spots in a city by using Twitter data. In this work, tweets that include the keyword "hijacking" are obtained for the designated city of Cape Town. In order to extract relevant tweets, these tweets are analyzed by using the following machine learning techniques: 1) a Multi-layer Feed-forward Neural Network (MLFNN); 2) a Convolutional Neural Network (CNN); and 3) Bidirectional Encoder Representations from Transformers (BERT). Through training and testing, CNN achieved an accuracy of 99.66%, while MLFNN and BERT achieved accuracies of 98.99% and 73.99%, respectively. In terms of Recall, Precision and F1-score, CNN also achieved the best results. Therefore, CNN was used for the identification of relevant tweets. The relevant reports that it generates are visually presented on a points map of the City of Cape Town. This work used a small dataset of 426 tweets. In future, the use of evolutionary computation will be explored for purposes of optimizing the deep learning models. A mobile application is under development to make this information usable by the general public.
Submitted 11 August, 2022;
originally announced August 2022.
-
RIBBON: Cost-Effective and QoS-Aware Deep Learning Model Inference using a Diverse Pool of Cloud Computing Instances
Authors:
Baolin Li,
Rohan Basu Roy,
Tirthak Patel,
Vijay Gadepally,
Karen Gettings,
Devesh Tiwari
Abstract:
Deep learning model inference is a key service in many businesses and scientific discovery processes. This paper introduces RIBBON, a novel deep learning inference serving system that meets two competing objectives: quality-of-service (QoS) target and cost-effectiveness. The key idea behind RIBBON is to intelligently employ a diverse set of cloud computing instances (heterogeneous instances) to meet the QoS target and maximize cost savings. RIBBON devises a Bayesian Optimization-driven strategy that helps users build the optimal set of heterogeneous instances for their model inference service needs on cloud computing platforms. RIBBON demonstrates its superiority over existing inference serving approaches that use homogeneous instance pools. RIBBON saves up to 16% of the inference service cost for different learning models, including emerging deep learning recommender system models and drug-discovery enabling models.
Submitted 28 July, 2022; v1 submitted 23 July, 2022;
originally announced July 2022.
-
MISO: Exploiting Multi-Instance GPU Capability on Multi-Tenant Systems for Machine Learning
Authors:
Baolin Li,
Tirthak Patel,
Siddarth Samsi,
Vijay Gadepally,
Devesh Tiwari
Abstract:
GPU technology has been improving at an expedited pace in terms of size and performance, empowering HPC and AI/ML researchers to advance the scientific discovery process. However, this also leads to inefficient resource usage, as most GPU workloads, including complicated AI/ML models, are not able to utilize the GPU resources to their fullest extent -- encouraging support for GPU multi-tenancy. We propose MISO, a technique to exploit the Multi-Instance GPU (MIG) capability on the latest NVIDIA datacenter GPUs (e.g., A100, H100) to dynamically partition GPU resources among co-located jobs. MISO's key insight is to use the lightweight, more flexible Multi-Process Service (MPS) capability to predict the best MIG partition allocation for different jobs, without incurring the overhead of implementing them during exploration. Due to its ability to utilize GPU resources more efficiently, MISO achieves 49% and 16% lower average job completion time than the unpartitioned and optimal static GPU partition schemes, respectively.
Submitted 6 October, 2022; v1 submitted 23 July, 2022;
originally announced July 2022.
-
Predicting within and across language phoneme recognition performance of self-supervised learning speech pre-trained models
Authors:
Hang Ji,
Tanvina Patel,
Odette Scharenborg
Abstract:
In this work, we analyzed and compared speech representations extracted from different frozen self-supervised learning (SSL) speech pre-trained models on their ability to capture articulatory features (AF) information and their subsequent prediction of phone recognition performance for within and across language scenarios. Specifically, we compared CPC, wav2vec 2.0, and HuBert. First, frame-level AF probing tasks were implemented. Subsequently, phone-level end-to-end ASR systems for phoneme recognition tasks were implemented, and the performance on the frame-level AF probing task and the phone accuracy were correlated. Compared to the conventional speech representation MFCC, all SSL pre-trained speech representations captured more AF information, and achieved better phoneme recognition performance within and across languages, with HuBert performing best. The frame-level AF probing task is a good predictor of phoneme recognition performance, showing the importance of capturing AF information in the speech representations. Compared with MFCC, in the within-language scenario, the performance of these SSL speech pre-trained models on AF probing tasks achieved a maximum relative increase of 34.4%, and it resulted in the lowest PER of 10.2%. In the cross-language scenario, the maximum relative increase of 26.7% also resulted in the lowest PER of 23.0%.
Submitted 24 June, 2022;
originally announced June 2022.
-
Robust and Resource-Efficient Quantum Circuit Approximation
Authors:
Tirthak Patel,
Ed Younis,
Costin Iancu,
Wibe de Jong,
Devesh Tiwari
Abstract:
We present QEst, a procedure to systematically generate approximations for quantum circuits to reduce their CNOT gate count. Our approach employs circuit partitioning for scalability with procedures to 1) reduce circuit length using approximate synthesis, 2) improve fidelity by running circuits that represent key samples in the approximation space, and 3) reason about the approximation upper bound. Our evaluation results indicate that our approach of "dissimilar" approximations provides close fidelity to the original circuit. Overall, the results indicate that QEst can reduce CNOT gate count by 30-80% on ideal systems and decrease the impact of noise on existing and near-future quantum systems.
Submitted 28 August, 2021;
originally announced August 2021.
-
DisQ: A Novel Quantum Output State Classification Method on IBM Quantum Computers using OpenPulse
Authors:
Tirthak Patel,
Devesh Tiwari
Abstract:
Superconducting quantum computing technology has ushered in a new era of computational possibilities. While a considerable research effort has been geared toward improving the quantum technology and building the software stack to efficiently execute quantum algorithms with reduced error rate, effort toward optimizing how quantum output states are defined and classified for the purpose of reducing the error rate is still limited. To this end, this paper proposes DisQ, a quantum output state classification approach which reduces error rates of quantum programs on NISQ devices.
Submitted 1 February, 2021;
originally announced February 2021.
-
A Machine Learning Challenge for Prognostic Modelling in Head and Neck Cancer Using Multi-modal Data
Authors:
Michal Kazmierski,
Mattea Welch,
Sejin Kim,
Chris McIntosh,
Princess Margaret Head and Neck Cancer Group,
Katrina Rey-McIntyre,
Shao Hui Huang,
Tirth Patel,
Tony Tadic,
Michael Milosevic,
Fei-Fei Liu,
Andrew Hope,
Scott Bratman,
Benjamin Haibe-Kains
Abstract:
Accurate prognosis for an individual patient is a key component of precision oncology. Recent advances in machine learning have enabled the development of models using a wider range of data, including imaging. Radiomics aims to extract quantitative predictive and prognostic biomarkers from routine medical imaging, but evidence for computed tomography radiomics for prognosis remains inconclusive. We have conducted an institutional machine learning challenge to develop an accurate model for overall survival prediction in head and neck cancer using clinical data extracted from electronic medical records and pre-treatment radiological images, as well as to evaluate the true added benefit of radiomics for head and neck cancer prognosis. Using a large, retrospective dataset of 2,552 patients and a rigorous evaluation framework, we compared 12 different submissions using imaging and clinical data, separately or in combination. The winning approach used non-linear, multitask learning on clinical data and tumour volume, achieving high prognostic accuracy for 2-year and lifetime survival prediction and outperforming models relying on clinical data only, engineered radiomics and deep learning. Combining all submissions in an ensemble model resulted in improved accuracy, with the highest gain from an image-based deep learning model. Our results show the potential of machine learning and simple, informative prognostic factors in combination with large datasets as a tool to guide personalized cancer care.
Submitted 28 January, 2021;
originally announced January 2021.
-
Comparative Stability of Cloned and Non-cloned Code: A Replication Study
Authors:
Oualid El Halimi,
Trith Patel,
Zohaib S. Kiyani,
Naresh Kumar,
Ankit Singh
Abstract:
Code cloning is an important software engineering aspect. It is a common software reuse principle that consists of duplicating source code within a program or across different systems owned or maintained by the same entity. There are several contradictory claims concerning the impact of cloning on software stability and maintenance effort. Some papers state that cloning is desired since it speeds up the development process and helps stakeholders meet tight schedules and deliver on time. Other papers argue that code cloning leads to code bloat and increases software maintenance costs due to copied defects and dead code. In this paper, we replicate a previous study on cloning done by the original author. We repeat his work using the same methods and metrics but with different subjects and experimenters. The paper we are addressing evaluates the impact of code cloning on code stability using three different stability-measuring methods. Our team will apply the same stability measurement techniques to a different software system developed in the C programming language to determine generalizability, ensure that the results are reliable, validate their outcomes, and inspire new research by combining previous findings from related studies.
Submitted 28 April, 2015;
originally announced April 2015.