Search | arXiv e-print repository

Can Large Language Models Code Like a Linguist?: A Case Study in Low Resource Sound Law Induction

Authors: Atharva Naik, Kexun Zhang, Nathaniel Robinson, Aravind Mysore, Clayton Marr, Hong Sng Rebecca Byrnes, Anna Cai, Kalvin Chang, David Mortensen

Abstract: Historical linguists have long written a kind of incompletely formalized ''program'' that converts reconstructed words in an ancestor language into words in one of its attested descendants that consist of a series of ordered string rewrite functions (called sound laws). They do this by observing pairs of words in the reconstructed language (protoforms) and the descendent language (reflexes) and co… ▽ More Historical linguists have long written a kind of incompletely formalized ''program'' that converts reconstructed words in an ancestor language into words in one of its attested descendants that consist of a series of ordered string rewrite functions (called sound laws). They do this by observing pairs of words in the reconstructed language (protoforms) and the descendent language (reflexes) and constructing a program that transforms protoforms into reflexes. However, writing these programs is error-prone and time-consuming. Prior work has successfully scaffolded this process computationally, but fewer researchers have tackled Sound Law Induction (SLI), which we approach in this paper by casting it as Programming by Examples. We propose a language-agnostic solution that utilizes the programming ability of Large Language Models (LLMs) by generating Python sound law programs from sound change examples. We evaluate the effectiveness of our approach for various LLMs, propose effective methods to generate additional language-agnostic synthetic data to fine-tune LLMs for SLI, and compare our method with existing automated SLI methods showing that while LLMs lag behind them they can complement some of their weaknesses. △ Less

Submitted 18 June, 2024; originally announced June 2024.

arXiv:2405.05376 [pdf, other]

Kreyòl-MT: Building MT for Latin American, Caribbean and Colonial African Creole Languages

Authors: Nathaniel R. Robinson, Raj Dabre, Ammon Shurtz, Rasul Dent, Onenamiyi Onesi, Claire Bizon Monroc, Loïc Grobol, Hasan Muhammad, Ashi Garg, Naome A. Etori, Vijay Murari Tiyyala, Olanrewaju Samuel, Matthew Dean Stutzman, Bismarck Bamfo Odoom, Sanjeev Khudanpur, Stephen D. Richardson, Kenton Murray

Abstract: A majority of language technologies are tailored for a small number of high-resource languages, while relatively many low-resource languages are neglected. One such group, Creole languages, have long been marginalized in academic study, though their speakers could benefit from machine translation (MT). These languages are predominantly used in much of Latin America, Africa and the Caribbean. We pr… ▽ More A majority of language technologies are tailored for a small number of high-resource languages, while relatively many low-resource languages are neglected. One such group, Creole languages, have long been marginalized in academic study, though their speakers could benefit from machine translation (MT). These languages are predominantly used in much of Latin America, Africa and the Caribbean. We present the largest cumulative dataset to date for Creole language MT, including 14.5M unique Creole sentences with parallel translations -- 11.6M of which we release publicly, and the largest bitexts gathered to date for 41 languages -- the first ever for 21. In addition, we provide MT models supporting all 41 Creole languages in 172 translation directions. Given our diverse dataset, we produce a model for Creole language MT exposed to more genre diversity than ever before, which outperforms a genre-specific Creole MT model on its own benchmark for 26 of 34 translation directions. △ Less

Submitted 13 May, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

Comments: NAACL 2024

arXiv:2403.13169 [pdf, other]

Wav2Gloss: Generating Interlinear Glossed Text from Speech

Authors: Taiqi He, Kwanghee Choi, Lindia Tjuatja, Nathaniel R. Robinson, Jiatong Shi, Shinji Watanabe, Graham Neubig, David R. Mortensen, Lori Levin

Abstract: Thousands of the world's languages are in danger of extinction--a tremendous threat to cultural identities and human language diversity. Interlinear Glossed Text (IGT) is a form of linguistic annotation that can support documentation and resource creation for these languages' communities. IGT typically consists of (1) transcriptions, (2) morphological segmentation, (3) glosses, and (4) free transl… ▽ More Thousands of the world's languages are in danger of extinction--a tremendous threat to cultural identities and human language diversity. Interlinear Glossed Text (IGT) is a form of linguistic annotation that can support documentation and resource creation for these languages' communities. IGT typically consists of (1) transcriptions, (2) morphological segmentation, (3) glosses, and (4) free translations to a majority language. We propose Wav2Gloss: a task in which these four annotation components are extracted automatically from speech, and introduce the first dataset to this end, Fieldwork: a corpus of speech with all these annotations, derived from the work of field linguists, covering 37 languages, with standard formatting, and train/dev/test splits. We provide various baselines to lay the groundwork for future research on IGT generation from speech, such as end-to-end versus cascaded, monolingual versus multilingual, and single-task versus multi-task approaches. △ Less

Submitted 5 June, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

Comments: ACL 2024 camera ready version

arXiv:2402.01582 [pdf]

Automating Sound Change Prediction for Phylogenetic Inference: A Tukanoan Case Study

Authors: Kalvin Chang, Nathaniel R. Robinson, Anna Cai, Ting Chen, Annie Zhang, David R. Mortensen

Abstract: We describe a set of new methods to partially automate linguistic phylogenetic inference given (1) cognate sets with their respective protoforms and sound laws, (2) a mapping from phones to their articulatory features and (3) a typological database of sound changes. We train a neural network on these sound change data to weight articulatory distances between phones and predict intermediate sound c… ▽ More We describe a set of new methods to partially automate linguistic phylogenetic inference given (1) cognate sets with their respective protoforms and sound laws, (2) a mapping from phones to their articulatory features and (3) a typological database of sound changes. We train a neural network on these sound change data to weight articulatory distances between phones and predict intermediate sound change steps between historical protoforms and their modern descendants, replacing a linguistic expert in part of a parsimony-based phylogenetic inference algorithm. In our best experiments on Tukanoan languages, this method produces trees with a Generalized Quartet Distance of 0.12 from a tree that used expert annotations, a significant improvement over other semi-automated baselines. We discuss potential benefits and drawbacks to our neural approach and parsimony-based tree prediction. We also experiment with a minimal generalization learner for automatic sound law induction, finding it comparably effective to sound laws from expert annotation. Our code is publicly available at https://github.com/cmu-llab/aiscp. △ Less

Submitted 2 February, 2024; originally announced February 2024.

Comments: Accepted to LChange 2023

arXiv:2312.00183 [pdf, other]

RNA-KG: An ontology-based knowledge graph for representing interactions involving RNA molecules

Authors: Emanuele Cavalleri, Alberto Cabri, Mauricio Soto-Gomez, Sara Bonfitto, Paolo Perlasca, Jessica Gliozzo, Tiffany J. Callahan, Justin Reese, Peter N Robinson, Elena Casiraghi, Giorgio Valentini, Marco Mesiti

Abstract: The "RNA world" represents a novel frontier for the study of fundamental biological processes and human diseases and is paving the way for the development of new drugs tailored to the patient's biomolecular characteristics. Although scientific data about coding and non-coding RNA molecules are continuously produced and available from public repositories, they are scattered across different databas… ▽ More The "RNA world" represents a novel frontier for the study of fundamental biological processes and human diseases and is paving the way for the development of new drugs tailored to the patient's biomolecular characteristics. Although scientific data about coding and non-coding RNA molecules are continuously produced and available from public repositories, they are scattered across different databases and a centralized, uniform, and semantically consistent representation of the "RNA world" is still lacking. We propose RNA-KG, a knowledge graph encompassing biological knowledge about RNAs gathered from more than 50 public databases, integrating functional relationships with genes, proteins, and chemicals and ontologically grounded biomedical concepts. To develop RNA-KG, we first identified, pre-processed, and characterized each data source; next, we built a meta-graph that provides an ontological description of the KG by representing all the bio-molecular entities and medical concepts of interest in this domain, as well as the types of interactions connecting them. Finally, we leveraged an instance-based semantically abstracted knowledge model to specify the ontological alignment according to which RNA-KG was generated. RNA-KG can be downloaded in different formats and also queried by a SPARQL endpoint. A thorough topological analysis of the resulting heterogeneous graph provides further insights into the characteristics of the "RNA world". RNA-KG can be both directly explored and visualized, and/or analyzed by applying computational methods to infer bio-medical knowledge from its heterogeneous nodes and edges. The resource can be easily updated with new experimental data, and specific views of the overall KG can be extracted according to the bio-medical problem to be studied. △ Less

Submitted 30 November, 2023; originally announced December 2023.

arXiv:2309.17169 [pdf]

An evaluation of GPT models for phenotype concept recognition

Authors: Tudor Groza, Harry Caufield, Dylan Gration, Gareth Baynam, Melissa A Haendel, Peter N Robinson, Christopher J Mungall, Justin T Reese

Abstract: Objective: Clinical deep phenotyping and phenotype annotation play a critical role in both the diagnosis of patients with rare disorders as well as in building computationally-tractable knowledge in the rare disorders field. These processes rely on using ontology concepts, often from the Human Phenotype Ontology, in conjunction with a phenotype concept recognition task (supported usually by machin… ▽ More Objective: Clinical deep phenotyping and phenotype annotation play a critical role in both the diagnosis of patients with rare disorders as well as in building computationally-tractable knowledge in the rare disorders field. These processes rely on using ontology concepts, often from the Human Phenotype Ontology, in conjunction with a phenotype concept recognition task (supported usually by machine learning methods) to curate patient profiles or existing scientific literature. With the significant shift in the use of large language models (LLMs) for most NLP tasks, we examine the performance of the latest Generative Pre-trained Transformer (GPT) models underpinning ChatGPT as a foundation for the tasks of clinical phenotyping and phenotype annotation. Materials and Methods: The experimental setup of the study included seven prompts of various levels of specificity, two GPT models (gpt-3.5-turbo and gpt-4.0) and two established gold standard corpora for phenotype recognition, one consisting of publication abstracts and the other clinical observations. Results: Our results show that, with an appropriate setup, these models can achieve state of the art performance. The best run, using few-shot learning, achieved 0.58 macro F1 score on publication abstracts and 0.75 macro F1 score on clinical observations, the former being comparable with the state of the art, while the latter surpassing the current best in class tool. Conclusion: While the results are promising, the non-deterministic nature of the outcomes, the high cost and the lack of concordance between different runs using the same prompt and input make the use of these LLMs challenging for this particular task. △ Less

Submitted 22 November, 2023; v1 submitted 29 September, 2023; originally announced September 2023.

arXiv:2309.07423 [pdf, other]

ChatGPT MT: Competitive for High- (but not Low-) Resource Languages

Authors: Nathaniel R. Robinson, Perez Ogayo, David R. Mortensen, Graham Neubig

Abstract: Large language models (LLMs) implicitly learn to perform a range of language tasks, including machine translation (MT). Previous studies explore aspects of LLMs' MT capabilities. However, there exist a wide variety of languages for which recent LLM MT performance has never before been evaluated. Without published experimental evidence on the matter, it is difficult for speakers of the world's dive… ▽ More Large language models (LLMs) implicitly learn to perform a range of language tasks, including machine translation (MT). Previous studies explore aspects of LLMs' MT capabilities. However, there exist a wide variety of languages for which recent LLM MT performance has never before been evaluated. Without published experimental evidence on the matter, it is difficult for speakers of the world's diverse languages to know how and whether they can use LLMs for their languages. We present the first experimental evidence for an expansive set of 204 languages, along with MT cost analysis, using the FLORES-200 benchmark. Trends reveal that GPT models approach or exceed traditional MT model performance for some high-resource languages (HRLs) but consistently lag for low-resource languages (LRLs), under-performing traditional MT for 84.1% of languages we covered. Our analysis reveals that a language's resource level is the most important feature in determining ChatGPT's relative ability to translate it, and suggests that ChatGPT is especially disadvantaged for LRLs and African languages. △ Less

Submitted 14 September, 2023; originally announced September 2023.

Comments: 27 pages, 9 figures, 14 tables

arXiv:2308.06435 [pdf]

A Brief Wellbeing Training Session Delivered by a Humanoid Social Robot: A Pilot Randomized Controlled Trial

Authors: Nicole Robinson, Jennifer Connolly, Gavin Suddrey, David J. Kavanagh

Abstract: Mental health and psychological distress are rising in adults, showing the importance of wellbeing promotion, support, and technique practice that is effective and accessible. Interactive social robots have been tested to deliver health programs but have not been explored to deliver wellbeing technique training in detail. A pilot randomised controlled trial was conducted to explore the feasibility… ▽ More Mental health and psychological distress are rising in adults, showing the importance of wellbeing promotion, support, and technique practice that is effective and accessible. Interactive social robots have been tested to deliver health programs but have not been explored to deliver wellbeing technique training in detail. A pilot randomised controlled trial was conducted to explore the feasibility of an autonomous humanoid social robot to deliver a brief mindful breathing technique to promote information around wellbeing. It contained two conditions: brief technique training (Technique) and control designed to represent a simple wait-list activity to represent a relationship-building discussion (Simple Rapport). This trial also explored willingness to discuss health-related topics with a robot. Recruitment uptake rate through convenience sampling was high (53%). A total of 230 participants took part (mean age = 29 years) with 71% being higher education students. There were moderate ratings of technique enjoyment, perceived usefulness, and likelihood to repeat the technique again. Interaction effects were found across measures with scores varying across gender and distress levels. Males with high distress and females with low distress who received the simple rapport activity reported greater comfort to discuss non-health topics than males with low distress and females with high distress. This trial marks a notable step towards the design and deployment of an autonomous wellbeing intervention to investigate the impact of a brief robot-delivered mindfulness training program for a sub-clinical population. △ Less

Submitted 11 August, 2023; originally announced August 2023.

arXiv:2307.15363 [pdf, other]

doi 10.1145/3570731

Robotic Vision for Human-Robot Interaction and Collaboration: A Survey and Systematic Review

Authors: Nicole Robinson, Brendan Tidd, Dylan Campbell, Dana Kulić, Peter Corke

Abstract: Robotic vision for human-robot interaction and collaboration is a critical process for robots to collect and interpret detailed information related to human actions, goals, and preferences, enabling robots to provide more useful services to people. This survey and systematic review presents a comprehensive analysis on robotic vision in human-robot interaction and collaboration over the last 10 yea… ▽ More Robotic vision for human-robot interaction and collaboration is a critical process for robots to collect and interpret detailed information related to human actions, goals, and preferences, enabling robots to provide more useful services to people. This survey and systematic review presents a comprehensive analysis on robotic vision in human-robot interaction and collaboration over the last 10 years. From a detailed search of 3850 articles, systematic extraction and evaluation was used to identify and explore 310 papers in depth. These papers described robots with some level of autonomy using robotic vision for locomotion, manipulation and/or visual communication to collaborate or interact with people. This paper provides an in-depth analysis of current trends, common domains, methods and procedures, technical processes, data sets and models, experimental testing, sample populations, performance metrics and future challenges. This manuscript found that robotic vision was often used in action and gesture recognition, robot movement in human spaces, object handover and collaborative actions, social communication and learning from demonstration. Few high-impact and novel techniques from the computer vision field had been translated into human-robot interaction and collaboration. Overall, notable advancements have been made on how to develop and deploy robots to assist people. △ Less

Submitted 28 July, 2023; originally announced July 2023.

Journal ref: ACM Transactions on Human-Robot Interaction (2023) Volume 12 Issue 1 Article No 12 pp 1-66

arXiv:2307.05727 [pdf]

An Open-Source Knowledge Graph Ecosystem for the Life Sciences

Authors: Tiffany J. Callahan, Ignacio J. Tripodi, Adrianne L. Stefanski, Luca Cappelletti, Sanya B. Taneja, Jordan M. Wyrwa, Elena Casiraghi, Nicolas A. Matentzoglu, Justin Reese, Jonathan C. Silverstein, Charles Tapley Hoyt, Richard D. Boyce, Scott A. Malec, Deepak R. Unni, Marcin P. Joachimiak, Peter N. Robinson, Christopher J. Mungall, Emanuele Cavalleri, Tommaso Fontana, Giorgio Valentini, Marco Mesiti, Lucas A. Gillenwater, Brook Santangelo, Nicole A. Vasilevsky, Robert Hoehndorf , et al. (7 additional authors not shown)

Abstract: Translational research requires data at multiple scales of biological organization. Advancements in sequencing and multi-omics technologies have increased the availability of these data, but researchers face significant integration challenges. Knowledge graphs (KGs) are used to model complex phenomena, and methods exist to construct them automatically. However, tackling complex biomedical integrat… ▽ More Translational research requires data at multiple scales of biological organization. Advancements in sequencing and multi-omics technologies have increased the availability of these data, but researchers face significant integration challenges. Knowledge graphs (KGs) are used to model complex phenomena, and methods exist to construct them automatically. However, tackling complex biomedical integration problems requires flexibility in the way knowledge is modeled. Moreover, existing KG construction methods provide robust tooling at the cost of fixed or limited choices among knowledge representation models. PheKnowLator (Phenotype Knowledge Translator) is a semantic ecosystem for automating the FAIR (Findable, Accessible, Interoperable, and Reusable) construction of ontologically grounded KGs with fully customizable knowledge representation. The ecosystem includes KG construction resources (e.g., data preparation APIs), analysis tools (e.g., SPARQL endpoints and abstraction algorithms), and benchmarks (e.g., prebuilt KGs and embeddings). We evaluated the ecosystem by systematically comparing it to existing open-source KG construction methods and by analyzing its computational performance when used to construct 12 large-scale KGs. With flexible knowledge representation, PheKnowLator enables fully customizable KGs without compromising performance or usability. △ Less

Submitted 30 January, 2024; v1 submitted 11 July, 2023; originally announced July 2023.

arXiv:2304.02711 [pdf, other]

Structured prompt interrogation and recursive extraction of semantics (SPIRES): A method for populating knowledge bases using zero-shot learning

Authors: J. Harry Caufield, Harshad Hegde, Vincent Emonet, Nomi L. Harris, Marcin P. Joachimiak, Nicolas Matentzoglu, HyeongSik Kim, Sierra A. T. Moxon, Justin T. Reese, Melissa A. Haendel, Peter N. Robinson, Christopher J. Mungall

Abstract: Creating knowledge bases and ontologies is a time consuming task that relies on a manual curation. AI/NLP approaches can assist expert curators in populating these knowledge bases, but current approaches rely on extensive training data, and are not able to populate arbitrary complex nested knowledge schemas. Here we present Structured Prompt Interrogation and Recursive Extraction of Semantics (S… ▽ More Creating knowledge bases and ontologies is a time consuming task that relies on a manual curation. AI/NLP approaches can assist expert curators in populating these knowledge bases, but current approaches rely on extensive training data, and are not able to populate arbitrary complex nested knowledge schemas. Here we present Structured Prompt Interrogation and Recursive Extraction of Semantics (SPIRES), a Knowledge Extraction approach that relies on the ability of Large Language Models (LLMs) to perform zero-shot learning (ZSL) and general-purpose query answering from flexible prompts and return information conforming to a specified schema. Given a detailed, user-defined knowledge schema and an input text, SPIRES recursively performs prompt interrogation against GPT-3+ to obtain a set of responses matching the provided schema. SPIRES uses existing ontologies and vocabularies to provide identifiers for all matched elements. We present examples of use of SPIRES in different domains, including extraction of food recipes, multi-species cellular signaling pathways, disease treatments, multi-step drug mechanisms, and chemical to disease causation graphs. Current SPIRES accuracy is comparable to the mid-range of existing Relation Extraction (RE) methods, but has the advantage of easy customization, flexibility, and, crucially, the ability to perform new tasks in the absence of any training data. This method supports a general strategy of leveraging the language interpreting capabilities of LLMs to assemble knowledge bases, assisting manual knowledge curation and acquisition while supporting validation with publicly-available databases and ontologies external to the LLM. SPIRES is available as part of the open source OntoGPT package: https://github.com/ monarch-initiative/ontogpt. △ Less

Submitted 22 December, 2023; v1 submitted 5 April, 2023; originally announced April 2023.

Comments: Updated 2023-12-22

arXiv:2304.02541 [pdf, other]

PWESuite: Phonetic Word Embeddings and Tasks They Facilitate

Authors: Vilém Zouhar, Kalvin Chang, Chenxuan Cui, Nathaniel Carlson, Nathaniel Robinson, Mrinmaya Sachan, David Mortensen

Abstract: Mapping words into a fixed-dimensional vector space is the backbone of modern NLP. While most word embedding methods successfully encode semantic information, they overlook phonetic information that is crucial for many tasks. We develop three methods that use articulatory features to build phonetically informed word embeddings. To address the inconsistent evaluation of existing phonetic word embed… ▽ More Mapping words into a fixed-dimensional vector space is the backbone of modern NLP. While most word embedding methods successfully encode semantic information, they overlook phonetic information that is crucial for many tasks. We develop three methods that use articulatory features to build phonetically informed word embeddings. To address the inconsistent evaluation of existing phonetic word embedding methods, we also contribute a task suite to fairly evaluate past, current, and future methods. We evaluate both (1) intrinsic aspects of phonetic word embeddings, such as word retrieval and correlation with sound similarity, and (2) extrinsic performance on tasks such as rhyme and cognate detection and sound analogies. We hope our task suite will promote reproducibility and inspire future phonetic embedding research. △ Less

Submitted 26 March, 2024; v1 submitted 5 April, 2023; originally announced April 2023.

Comments: LREC-COLING 2024

arXiv:2302.10800 [pdf]

KG-Hub -- Building and Exchanging Biological Knowledge Graphs

Authors: J Harry Caufield, Tim Putman, Kevin Schaper, Deepak R Unni, Harshad Hegde, Tiffany J Callahan, Luca Cappelletti, Sierra AT Moxon, Vida Ravanmehr, Seth Carbon, Lauren E Chan, Katherina Cortes, Kent A Shefchek, Glass Elsarboukh, James P Balhoff, Tommaso Fontana, Nicolas Matentzoglu, Richard M Bruskiewich, Anne E Thessen, Nomi L Harris, Monica C Munoz-Torres, Melissa A Haendel, Peter N Robinson, Marcin P Joachimiak, Christopher J Mungall , et al. (1 additional authors not shown)

Abstract: Knowledge graphs (KGs) are a powerful approach for integrating heterogeneous data and making inferences in biology and many other domains, but a coherent solution for constructing, exchanging, and facilitating the downstream use of knowledge graphs is lacking. Here we present KG-Hub, a platform that enables standardized construction, exchange, and reuse of knowledge graphs. Features include a simp… ▽ More Knowledge graphs (KGs) are a powerful approach for integrating heterogeneous data and making inferences in biology and many other domains, but a coherent solution for constructing, exchanging, and facilitating the downstream use of knowledge graphs is lacking. Here we present KG-Hub, a platform that enables standardized construction, exchange, and reuse of knowledge graphs. Features include a simple, modular extract-transform-load (ETL) pattern for producing graphs compliant with Biolink Model (a high-level data model for standardizing biological data), easy integration of any OBO (Open Biological and Biomedical Ontologies) ontology, cached downloads of upstream data sources, versioned and automatically updated builds with stable URLs, web-browsable storage of KG artifacts on cloud infrastructure, and easy reuse of transformed subgraphs across projects. Current KG-Hub projects span use cases including COVID-19 research, drug repurposing, microbial-environmental interactions, and rare disease research. KG-Hub is equipped with tooling to easily analyze and manipulate knowledge graphs. KG-Hub is also tightly integrated with graph machine learning (ML) tools which allow automated graph machine learning, including node embeddings and training of models for link prediction and node classification. △ Less

Submitted 31 January, 2023; originally announced February 2023.

arXiv:2212.05626 [pdf, other]

Human-Robot Team Performance Compared to Full Robot Autonomy in 16 Real-World Search and Rescue Missions: Adaptation of the DARPA Subterranean Challenge

Authors: Nicole Robinson, Jason Williams, David Howard, Brendan Tidd, Fletcher Talbot, Brett Wood, Alex Pitt, Navinda Kottege, Dana Kulić

Abstract: Human operators in human-robot teams are commonly perceived to be critical for mission success. To explore the direct and perceived impact of operator input on task success and team performance, 16 real-world missions (10 hrs) were conducted based on the DARPA Subterranean Challenge. These missions were to deploy a heterogeneous team of robots for a search task to locate and identify artifacts suc… ▽ More Human operators in human-robot teams are commonly perceived to be critical for mission success. To explore the direct and perceived impact of operator input on task success and team performance, 16 real-world missions (10 hrs) were conducted based on the DARPA Subterranean Challenge. These missions were to deploy a heterogeneous team of robots for a search task to locate and identify artifacts such as climbing rope, drills and mannequins representing human survivors. Two conditions were evaluated: human operators that could control the robot team with state-of-the-art autonomy (Human-Robot Team) compared to autonomous missions without human operator input (Robot-Autonomy). Human-Robot Teams were often in directed autonomy mode (70% of mission time), found more items, traversed more distance, covered more unique ground, and had a higher time between safety-related events. Human-Robot Teams were faster at finding the first artifact, but slower to respond to information from the robot team. In routine conditions, scores were comparable for artifacts, distance, and coverage. Reasons for intervention included creating waypoints to prioritise high-yield areas, and to navigate through error-prone spaces. After observing robot autonomy, operators reported increases in robot competency and trust, but that robot behaviour was not always transparent and understandable, even after high mission performance. △ Less

Submitted 11 December, 2022; originally announced December 2022.

Comments: Submitted to Transactions on Human-Robot Interaction

arXiv:2209.06295 [pdf, other]

Data-adaptive Transfer Learning for Translation: A Case Study in Haitian and Jamaican

Authors: Nathaniel R. Robinson, Cameron J. Hogan, Nancy Fulda, David R. Mortensen

Abstract: Multilingual transfer techniques often improve low-resource machine translation (MT). Many of these techniques are applied without considering data characteristics. We show in the context of Haitian-to-English translation that transfer effectiveness is correlated with amount of training data and relationships between knowledge-sharing languages. Our experiments suggest that for some languages beyo… ▽ More Multilingual transfer techniques often improve low-resource machine translation (MT). Many of these techniques are applied without considering data characteristics. We show in the context of Haitian-to-English translation that transfer effectiveness is correlated with amount of training data and relationships between knowledge-sharing languages. Our experiments suggest that for some languages beyond a threshold of authentic data, back-translation augmentation methods are counterproductive, while cross-lingual transfer from a sufficiently related language is preferred. We complement this finding by contributing a rule-based French-Haitian orthographic and syntactic engine and a novel method for phonological embedding. When used with multilingual techniques, orthographic transformation makes statistically significant improvements over conventional methods. And in very low-resource Jamaican MT, code-switching with a transfer language for orthographic resemblance yields a 6.63 BLEU point advantage. △ Less

Submitted 13 September, 2022; originally announced September 2022.

arXiv:2209.04732 [pdf]

Ontologizing Health Systems Data at Scale: Making Translational Discovery a Reality

Authors: Tiffany J. Callahan, Adrianne L. Stefanski, Jordan M. Wyrwa, Chenjie Zeng, Anna Ostropolets, Juan M. Banda, William A. Baumgartner Jr., Richard D. Boyce, Elena Casiraghi, Ben D. Coleman, Janine H. Collins, Sara J. Deakyne-Davies, James A. Feinstein, Melissa A. Haendel, Asiyah Y. Lin, Blake Martin, Nicolas A. Matentzoglu, Daniella Meeker, Justin Reese, Jessica Sinclair, Sanya B. Taneja, Katy E. Trinkley, Nicole A. Vasilevsky, Andrew Williams, Xingman A. Zhang , et al. (7 additional authors not shown)

Abstract: Background: Common data models solve many challenges of standardizing electronic health record (EHR) data, but are unable to semantically integrate all the resources needed for deep phenotyping. Open Biological and Biomedical Ontology (OBO) Foundry ontologies provide computable representations of biological knowledge and enable the integration of heterogeneous data. However, mapping EHR data to OB… ▽ More Background: Common data models solve many challenges of standardizing electronic health record (EHR) data, but are unable to semantically integrate all the resources needed for deep phenotyping. Open Biological and Biomedical Ontology (OBO) Foundry ontologies provide computable representations of biological knowledge and enable the integration of heterogeneous data. However, mapping EHR data to OBO ontologies requires significant manual curation and domain expertise. Objective: We introduce OMOP2OBO, an algorithm for mapping Observational Medical Outcomes Partnership (OMOP) vocabularies to OBO ontologies. Results: Using OMOP2OBO, we produced mappings for 92,367 conditions, 8611 drug ingredients, and 10,673 measurement results, which covered 68-99% of concepts used in clinical practice when examined across 24 hospitals. When used to phenotype rare disease patients, the mappings helped systematically identify undiagnosed patients who might benefit from genetic testing. Conclusions: By aligning OMOP vocabularies to OBO ontologies our algorithm presents new opportunities to advance EHR-based deep phenotyping. △ Less

Submitted 30 January, 2023; v1 submitted 10 September, 2022; originally announced September 2022.

Comments: Supplementary Material is included at the end of the manuscript

ACM Class: J.3

arXiv:2208.11856 [pdf, other]

Design and Implementation of a Human-Robot Joint Action Framework using Augmented Reality and Eye Gaze

Authors: Wesley P. Chan, Morgan Crouch, Khoa Hoang, Charlie Chen, Nicole Robinson, Elizabeth Croft

Abstract: When humans work together to complete a joint task, each person builds an internal model of the situation and how it will evolve. Efficient collaboration is dependent on how these individual models overlap to form a shared mental model among team members, which is important for collaborative processes in human-robot teams. The development and maintenance of an accurate shared mental model requires… ▽ More When humans work together to complete a joint task, each person builds an internal model of the situation and how it will evolve. Efficient collaboration is dependent on how these individual models overlap to form a shared mental model among team members, which is important for collaborative processes in human-robot teams. The development and maintenance of an accurate shared mental model requires bidirectional communication of individual intent and the ability to interpret the intent of other team members. To enable effective human-robot collaboration, this paper presents a design and implementation of a novel joint action framework in human-robot team collaboration, utilizing augmented reality (AR) technology and user eye gaze to enable bidirectional communication of intent. We tested our new framework through a user study with 37 participants, and found that our system improves task efficiency, trust, as well as task fluency. Therefore, using AR and eye gaze to enable bidirectional communication is a promising mean to improve core components that influence collaboration between humans and robots. △ Less

Submitted 24 August, 2022; originally announced August 2022.

arXiv:2207.09889 [pdf, other]

When Is TTS Augmentation Through a Pivot Language Useful?

Authors: Nathaniel Robinson, Perez Ogayo, Swetha Gangu, David R. Mortensen, Shinji Watanabe

Abstract: Developing Automatic Speech Recognition (ASR) for low-resource languages is a challenge due to the small amount of transcribed audio data. For many such languages, audio and text are available separately, but not audio with transcriptions. Using text, speech can be synthetically produced via text-to-speech (TTS) systems. However, many low-resource languages do not have quality TTS systems either.… ▽ More Developing Automatic Speech Recognition (ASR) for low-resource languages is a challenge due to the small amount of transcribed audio data. For many such languages, audio and text are available separately, but not audio with transcriptions. Using text, speech can be synthetically produced via text-to-speech (TTS) systems. However, many low-resource languages do not have quality TTS systems either. We propose an alternative: produce synthetic audio by running text from the target language through a trained TTS system for a higher-resource pivot language. We investigate when and how this technique is most effective in low-resource settings. In our experiments, using several thousand synthetic TTS text-speech pairs and duplicating authentic data to balance yields optimal results. Our findings suggest that searching over a set of candidate pivot languages can lead to marginal improvements and that, surprisingly, ASR performance can by harmed by increases in measured TTS quality. Application of these findings improves ASR by 64.5\% and 45.0\% character error reduction rate (CERR) respectively for two low-resource languages: Guaraní and Suba. △ Less

Submitted 20 July, 2022; originally announced July 2022.

arXiv:2206.06444 [pdf]

A method for comparing multiple imputation techniques: a case study on the U.S. National COVID Cohort Collaborative

Authors: Elena Casiraghi, Rachel Wong, Margaret Hall, Ben Coleman, Marco Notaro, Michael D. Evans, Jena S. Tronieri, Hannah Blau, Bryan Laraway, Tiffany J. Callahan, Lauren E. Chan, Carolyn T. Bramante, John B. Buse, Richard A. Moffitt, Til Sturmer, Steven G. Johnson, Yu Raymond Shao, Justin Reese, Peter N. Robinson, Alberto Paccanaro, Giorgio Valentini, Jared D. Huling, Kenneth Wilkins, :, Tell Bennet , et al. (12 additional authors not shown)

Abstract: Healthcare datasets obtained from Electronic Health Records have proven to be extremely useful to assess associations between patients' predictors and outcomes of interest. However, these datasets often suffer from missing values in a high proportion of cases and the simple removal of these cases may introduce severe bias. For these reasons, several multiple imputation algorithms have been propose… ▽ More Healthcare datasets obtained from Electronic Health Records have proven to be extremely useful to assess associations between patients' predictors and outcomes of interest. However, these datasets often suffer from missing values in a high proportion of cases and the simple removal of these cases may introduce severe bias. For these reasons, several multiple imputation algorithms have been proposed to attempt to recover the missing information. Each algorithm presents strengths and weaknesses, and there is currently no consensus on which multiple imputation algorithms works best in a given scenario. Furthermore, the selection of each algorithm parameters and data-related modelling choices are also both crucial and challenging. In this paper, we propose a novel framework to numerically evaluate strategies for handling missing data in the context of statistical analysis, with a particular focus on multiple imputation techniques. We demonstrate the feasibility of our approach on a large cohort of type-2 diabetes patients provided by the National COVID Cohort Collaborative (N3C) Enclave, where we explored the influence of various patient characteristics on outcomes related to COVID-19. Our analysis included classic multiple imputation techniques as well as simple complete-case Inverse Probability Weighted models. The experiments presented here show that our approach could effectively highlight the most valid and performant missing-data handling strategy for our case study. Moreover, our methodology allowed us to gain an understanding of the behavior of the different models and of how it changed as we modified their parameters. Our method is general and can be applied to different research fields and on datasets containing heterogeneous types. △ Less

Submitted 25 September, 2022; v1 submitted 13 June, 2022; originally announced June 2022.

arXiv:2203.15172 [pdf, other]

Assessing Evolutionary Terrain Generation Methods for Curriculum Reinforcement Learning

Authors: David Howard, Josh Kannemeyer, Davide Dolcetti, Humphrey Munn, Nicole Robinson

Abstract: Curriculum learning allows complex tasks to be mastered via incremental progression over `stepping stone' goals towards a final desired behaviour. Typical implementations learn locomotion policies for challenging environments through gradual complexification of a terrain mesh generated through a parameterised noise function. To date, researchers have predominantly generated terrains from a limited… ▽ More Curriculum learning allows complex tasks to be mastered via incremental progression over `stepping stone' goals towards a final desired behaviour. Typical implementations learn locomotion policies for challenging environments through gradual complexification of a terrain mesh generated through a parameterised noise function. To date, researchers have predominantly generated terrains from a limited range of noise functions, and the effect of the generator on the learning process is underrepresented in the literature. We compare popular noise-based terrain generators to two indirect encodings, CPPN and GAN. To allow direct comparison between both direct and indirect representations, we assess the impact of a range of representation-agnostic MAP-Elites feature descriptors that compute metrics directly from the generated terrain meshes. Next, performance and coverage are assessed when training a humanoid robot in a physics simulator using the PPO algorithm. Results describe key differences between the generators that inform their use in curriculum learning, and present a range of useful feature descriptors for uptake by the community. △ Less

Submitted 28 March, 2022; originally announced March 2022.

Comments: 8 pages, 8 figures, 2022 Genetic and Evolutionary Computing Conference (GECCO'22)

arXiv:2203.10403 [pdf]

An Exploratory Study into Vulnerability Chaining Blindness Terminology and Viability

Authors: Nikki Robinson

Abstract: To tie together the concepts of linkage blindness and the inability to link vulnerabilities together in a Vulnerability Management Program (VMP), the researcher postulated new terminology. The terminology of vulnerability chaining blindness is proposed to understand the underlying issues behind vulnerability management and vulnerabilities that can be used in combination. The general problem is tha… ▽ More To tie together the concepts of linkage blindness and the inability to link vulnerabilities together in a Vulnerability Management Program (VMP), the researcher postulated new terminology. The terminology of vulnerability chaining blindness is proposed to understand the underlying issues behind vulnerability management and vulnerabilities that can be used in combination. The general problem is that IT and cybersecurity professionals have a difficult time identifying chained vulnerabilities due to the complexity of vulnerability prioritization and remediation (Abomhara & Køien, 2015; Felmetsger et al., 2010). The specific problem is the inability to link and view multiple vulnerabilities in combination based on limited expertise and awareness of vulnerability chaining (Tang et al., 2017). The population of this study was limited to one focus group, within the IT and Security fields, within the United States. The sample size consisted of one focus group comprised of 8-10 IT and cybersecurity professionals. The research questions focused on if participants were aware of linkage blindness or vulnerability chaining, as well as if vulnerability chaining blindness would be applicable to describe the phenomenon. Several themes emerged through top-level, eclectic, and second-level coding data analysis. These themes included complexity in cybersecurity programs, new concepts in vulnerability management, as well as fear of the unknown and where security meets technology. Keywords: linkage blindness, vulnerability chaining, vulnerability chaining blindness, vulnerability management △ Less

Submitted 19 March, 2022; originally announced March 2022.

Comments: 24 pages

arXiv:2110.06196 [pdf, other]

GRAPE for Fast and Scalable Graph Processing and random walk-based Embedding

Authors: Luca Cappelletti, Tommaso Fontana, Elena Casiraghi, Vida Ravanmehr, Tiffany J. Callahan, Carlos Cano, Marcin P. Joachimiak, Christopher J. Mungall, Peter N. Robinson, Justin Reese, Giorgio Valentini

Abstract: Graph Representation Learning (GRL) methods opened new avenues for addressing complex, real-world problems represented by graphs. However, many graphs used in these applications comprise millions of nodes and billions of edges and are beyond the capabilities of current methods and software implementations. We present GRAPE, a software resource for graph processing and embedding that can scale with… ▽ More Graph Representation Learning (GRL) methods opened new avenues for addressing complex, real-world problems represented by graphs. However, many graphs used in these applications comprise millions of nodes and billions of edges and are beyond the capabilities of current methods and software implementations. We present GRAPE, a software resource for graph processing and embedding that can scale with big graphs by using specialized and smart data structures, algorithms, and a fast parallel implementation of random walk-based methods. Compared with state-of-the-art software resources, GRAPE shows an improvement of orders of magnitude in empirical space and time complexity, as well as a competitive edge and node label prediction performance. GRAPE comprises about 1.7 million well-documented lines of Python and Rust code and provides 69 node embedding methods, 25 inference models, a collection of efficient graph processing utilities and over 80,000 graphs from the literature and other sources. Standardized interfaces allow seamless integration of third-party libraries, while ready-to-use and modular pipelines permit an easy-to-use evaluation of GRL methods, therefore also positioning GRAPE as a software resource to perform a fair comparison between methods and libraries for graph processing and embedding. △ Less

Submitted 7 May, 2023; v1 submitted 12 October, 2021; originally announced October 2021.

ACM Class: D.m; E.2; I.2.6; I.5.5

arXiv:2109.09908 [pdf, other]

A Proposed Set of Communicative Gestures for Human Robot Interaction and an RGB Image-based Gesture Recognizer Implemented in ROS

Authors: Jia Chuan A. Tan, Wesley P. Chan, Nicole L. Robinson, Elizabeth A. Croft, Dana Kulic

Abstract: We propose a set of communicative gestures and develop a gesture recognition system with the aim of facilitating more intuitive Human-Robot Interaction (HRI) through gestures. First, we propose a set of commands commonly used for human-robot interaction. Next, an online user study with 190 participants was performed to investigate if there was an agreed set of gestures that people intuitively use… ▽ More We propose a set of communicative gestures and develop a gesture recognition system with the aim of facilitating more intuitive Human-Robot Interaction (HRI) through gestures. First, we propose a set of commands commonly used for human-robot interaction. Next, an online user study with 190 participants was performed to investigate if there was an agreed set of gestures that people intuitively use to communicate the given commands to robots when no guidance or training were given. As we found large variations among the gestures exist between participants, we then proposed a set of gestures for the proposed commands to be used as a common foundation for robot interaction. We collected ~7500 video demonstrations of the proposed gestures and trained a gesture recognition model, adapting 3D Convolutional Neural Networks (CNN) as the classifier, with a final accuracy of 84.1% (sigma=2.4). The resulting model was capable of training successfully with a relatively small amount of training data. We integrated the gesture recognition model into the ROS framework and report details for a demonstrated use case, where a person commands a robot to perform a pick and place task using the proposed set. This integrated ROS gesture recognition system is made available for use, and built with the intention to allow for new adaptations depending on robot model and use case scenarios, for novel user applications. △ Less

Submitted 20 September, 2021; originally announced September 2021.

Comments: 6 pages, 5 figures, 3 tables, ICRA 2022 Conference

arXiv:2105.02786 [pdf, other]

LGGNet: Learning from Local-Global-Graph Representations for Brain-Computer Interface

Authors: Yi Ding, Neethu Robinson, Chengxuan Tong, Qiuhao Zeng, Cuntai Guan

Abstract: Neuropsychological studies suggest that co-operative activities among different brain functional areas drive high-level cognitive processes. To learn the brain activities within and among different functional areas of the brain, we propose LGGNet, a novel neurologically inspired graph neural network, to learn local-global-graph representations of electroencephalography (EEG) for Brain-Computer Int… ▽ More Neuropsychological studies suggest that co-operative activities among different brain functional areas drive high-level cognitive processes. To learn the brain activities within and among different functional areas of the brain, we propose LGGNet, a novel neurologically inspired graph neural network, to learn local-global-graph representations of electroencephalography (EEG) for Brain-Computer Interface (BCI). The input layer of LGGNet comprises a series of temporal convolutions with multi-scale 1D convolutional kernels and kernel-level attentive fusion. It captures temporal dynamics of EEG which then serves as input to the proposed local and global graph-filtering layers. Using a defined neurophysiologically meaningful set of local and global graphs, LGGNet models the complex relations within and among functional areas of the brain. Under the robust nested cross-validation settings, the proposed method is evaluated on three publicly available datasets for four types of cognitive classification tasks, namely, the attention, fatigue, emotion, and preference classification tasks. LGGNet is compared with state-of-the-art methods, such as DeepConvNet, EEGNet, R2G-STNN, TSception, RGNN, AMCNN-DGCN, HRNN and GraphNet. The results show that LGGNet outperforms these methods, and the improvements are statistically significant (p<0.05) in most cases. The results show that bringing neuroscience prior knowledge into neural network design yields an improvement of classification performance. The source code can be found at https://github.com/yi-ding-cs/LGG △ Less

Submitted 5 December, 2022; v1 submitted 5 May, 2021; originally announced May 2021.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2104.02935 [pdf, other]

doi 10.1109/TAFFC.2022.3169001

TSception: Capturing Temporal Dynamics and Spatial Asymmetry from EEG for Emotion Recognition

Authors: Yi Ding, Neethu Robinson, Su Zhang, Qiuhao Zeng, Cuntai Guan

Abstract: The high temporal resolution and the asymmetric spatial activations are essential attributes of electroencephalogram (EEG) underlying emotional processes in the brain. To learn the temporal dynamics and spatial asymmetry of EEG towards accurate and generalized emotion recognition, we propose TSception, a multi-scale convolutional neural network that can classify emotions from EEG. TSception consis… ▽ More The high temporal resolution and the asymmetric spatial activations are essential attributes of electroencephalogram (EEG) underlying emotional processes in the brain. To learn the temporal dynamics and spatial asymmetry of EEG towards accurate and generalized emotion recognition, we propose TSception, a multi-scale convolutional neural network that can classify emotions from EEG. TSception consists of dynamic temporal, asymmetric spatial, and high-level fusion layers, which learn discriminative representations in the time and channel dimensions simultaneously. The dynamic temporal layer consists of multi-scale 1D convolutional kernels whose lengths are related to the sampling rate of EEG, which learns the dynamic temporal and frequency representations of EEG. The asymmetric spatial layer takes advantage of the asymmetric EEG patterns for emotion, learning the discriminative global and hemisphere representations. The learned spatial representations will be fused by a high-level fusion layer. Using more generalized cross-validation settings, the proposed method is evaluated on two publicly available datasets DEAP and MAHNOB-HCI. The performance of the proposed network is compared with prior reported methods such as SVM, KNN, FBFgMDM, FBTSC, Unsupervised learning, DeepConvNet, ShallowConvNet, and EEGNet. TSception achieves higher classification accuracies and F1 scores than other methods in most of the experiments. The codes are available at https://github.com/yi-ding-cs/TSception △ Less

Submitted 24 April, 2022; v1 submitted 7 April, 2021; originally announced April 2021.

Comments: Accepted as a regular paper in IEEE Transactions on Affective Computing. A version after proof-reading. Some typos in the Early Access version of IEEE Xplore are corrected

Journal ref: IEEE Transactions on Affective Computing 2022

arXiv:2104.01233 [pdf, other]

FBCNet: A Multi-view Convolutional Neural Network for Brain-Computer Interface

Authors: Ravikiran Mane, Effie Chew, Karen Chua, Kai Keng Ang, Neethu Robinson, A. P. Vinod, Seong-Whan Lee, Cuntai Guan

Abstract: Lack of adequate training samples and noisy high-dimensional features are key challenges faced by Motor Imagery (MI) decoding algorithms for electroencephalogram (EEG) based Brain-Computer Interface (BCI). To address these challenges, inspired from neuro-physiological signatures of MI, this paper proposes a novel Filter-Bank Convolutional Network (FBCNet) for MI classification. FBCNet employs a mu… ▽ More Lack of adequate training samples and noisy high-dimensional features are key challenges faced by Motor Imagery (MI) decoding algorithms for electroencephalogram (EEG) based Brain-Computer Interface (BCI). To address these challenges, inspired from neuro-physiological signatures of MI, this paper proposes a novel Filter-Bank Convolutional Network (FBCNet) for MI classification. FBCNet employs a multi-view data representation followed by spatial filtering to extract spectro-spatially discriminative features. This multistage approach enables efficient training of the network even when limited training data is available. More significantly, in FBCNet, we propose a novel Variance layer that effectively aggregates the EEG time-domain information. With this design, we compare FBCNet with state-of-the-art (SOTA) BCI algorithm on four MI datasets: The BCI competition IV dataset 2a (BCIC-IV-2a), the OpenBMI dataset, and two large datasets from chronic stroke patients. The results show that, by achieving 76.20% 4-class classification accuracy, FBCNet sets a new SOTA for BCIC-IV-2a dataset. On the other three datasets, FBCNet yields up to 8% higher binary classification accuracies. Additionally, using explainable AI techniques we present one of the first reports about the differences in discriminative EEG features between healthy subjects and stroke patients. Also, the FBCNet source code is available at https://github.com/ravikiran-mane/FBCNet. △ Less

Submitted 17 March, 2021; originally announced April 2021.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2104.00954 [pdf, other]

doi 10.1038/s41586-021-03854-z

Skillful Precipitation Nowcasting using Deep Generative Models of Radar

Authors: Suman Ravuri, Karel Lenc, Matthew Willson, Dmitry Kangin, Remi Lam, Piotr Mirowski, Megan Fitzsimons, Maria Athanassiadou, Sheleem Kashem, Sam Madge, Rachel Prudden, Amol Mandhane, Aidan Clark, Andrew Brock, Karen Simonyan, Raia Hadsell, Niall Robinson, Ellen Clancy, Alberto Arribas, Shakir Mohamed

Abstract: Precipitation nowcasting, the high-resolution forecasting of precipitation up to two hours ahead, supports the real-world socio-economic needs of many sectors reliant on weather-dependent decision-making. State-of-the-art operational nowcasting methods typically advect precipitation fields with radar-based wind estimates, and struggle to capture important non-linear events such as convective initi… ▽ More Precipitation nowcasting, the high-resolution forecasting of precipitation up to two hours ahead, supports the real-world socio-economic needs of many sectors reliant on weather-dependent decision-making. State-of-the-art operational nowcasting methods typically advect precipitation fields with radar-based wind estimates, and struggle to capture important non-linear events such as convective initiations. Recently introduced deep learning methods use radar to directly predict future rain rates, free of physical constraints. While they accurately predict low-intensity rainfall, their operational utility is limited because their lack of constraints produces blurry nowcasts at longer lead times, yielding poor performance on more rare medium-to-heavy rain events. To address these challenges, we present a Deep Generative Model for the probabilistic nowcasting of precipitation from radar. Our model produces realistic and spatio-temporally consistent predictions over regions up to 1536 km x 1280 km and with lead times from 5-90 min ahead. In a systematic evaluation by more than fifty expert forecasters from the Met Office, our generative model ranked first for its accuracy and usefulness in 88% of cases against two competitive methods, demonstrating its decision-making value and ability to provide physical insight to real-world experts. When verified quantitatively, these nowcasts are skillful without resorting to blurring. We show that generative nowcasting can provide probabilistic predictions that improve forecast value and support operational utility, and at resolutions and lead times where alternative methods struggle. △ Less

Submitted 2 April, 2021; originally announced April 2021.

Comments: 46 pages, 17 figures, 2 tables

arXiv:2101.03769 [pdf, ps, other]

doi 10.1109/THMS.2022.3149173

A Review of Evaluation Practices of Gesture Generation in Embodied Conversational Agents

Authors: Pieter Wolfert, Nicole Robinson, Tony Belpaeme

Abstract: Embodied conversational agents (ECA) are often designed to produce nonverbal behavior to complement or enhance their verbal communication. One such form of nonverbal behavior is co-speech gesturing, which involves movements that the agent makes with its arms and hands that are paired with verbal communication. Co-speech gestures for ECAs can be created using different generation methods, divided i… ▽ More Embodied conversational agents (ECA) are often designed to produce nonverbal behavior to complement or enhance their verbal communication. One such form of nonverbal behavior is co-speech gesturing, which involves movements that the agent makes with its arms and hands that are paired with verbal communication. Co-speech gestures for ECAs can be created using different generation methods, divided into rule-based and data-driven processes, with the latter gaining traction because of the increasing interest from the applied machine learning community. However, reports on gesture generation methods use a variety of evaluation measures, which hinders comparison. To address this, we present a systematic review on co-speech gesture generation methods for iconic, metaphoric, deictic, and beat gestures, including reported evaluation methods. We review 22 studies that have an ECA with a human-like upper body that uses co-speech gesturing in social human-agent interaction. This includes studies that use human participants to evaluate performance. We found most studies use a within-subject design and rely on a form of subjective evaluation, but without a systematic approach. We argue that the field requires more rigorous and uniform tools for co-speech gesture evaluation, and formulate recommendations for empirical evaluation, including standardized phrases and example scenarios to help systematically test generative models across studies. Furthermore, we also propose a checklist that can be used to report relevant information for the evaluation of generative models, as well as to evaluate co-speech gesture use. △ Less

Submitted 1 March, 2022; v1 submitted 11 January, 2021; originally announced January 2021.

Comments: 11 pages, accepted for publication in IEEE Transactions on Human-Machine Systems

arXiv:2012.05983 [pdf, other]

Towards Neural Programming Interfaces

Authors: Zachary C. Brown, Nathaniel Robinson, David Wingate, Nancy Fulda

Abstract: It is notoriously difficult to control the behavior of artificial neural networks such as generative neural language models. We recast the problem of controlling natural language generation as that of learning to interface with a pretrained language model, just as Application Programming Interfaces (APIs) control the behavior of programs by altering hyperparameters. In this new paradigm, a special… ▽ More It is notoriously difficult to control the behavior of artificial neural networks such as generative neural language models. We recast the problem of controlling natural language generation as that of learning to interface with a pretrained language model, just as Application Programming Interfaces (APIs) control the behavior of programs by altering hyperparameters. In this new paradigm, a specialized neural network (called a Neural Programming Interface or NPI) learns to interface with a pretrained language model by manipulating the hidden activations of the pretrained model to produce desired outputs. Importantly, no permanent changes are made to the weights of the original model, allowing us to re-purpose pretrained models for new tasks without overwriting any aspect of the language model. We also contribute a new data set construction algorithm and GAN-inspired loss function that allows us to train NPI models to control outputs of autoregressive transformers. In experiments against other state-of-the-art approaches, we demonstrate the efficacy of our methods using OpenAI's GPT-2 model, successfully controlling noun selection, topic aversion, offensive speech filtering, and other aspects of language while largely maintaining the controlled model's fluency under deterministic settings. △ Less

Submitted 17 February, 2021; v1 submitted 10 December, 2020; originally announced December 2020.

Comments: 24 pages total (13 for main paper and references, 11 for Appendix 1), accepted for publication in Advances in Neural Information Processing Systems 33 (NeurIPS 2020)

Journal ref: Neural Information Processing Systems 33 (2020) 17416-17428

arXiv:2009.08478 [pdf]

doi 10.1093/bioinformatics/btab019

PhenoTagger: A Hybrid Method for Phenotype Concept Recognition using Human Phenotype Ontology

Authors: Ling Luo, Shankai Yan, Po-Ting Lai, Daniel Veltri, Andrew Oler, Sandhya Xirasagar, Rajarshi Ghosh, Morgan Similuk, Peter N. Robinson, Zhiyong Lu

Abstract: Automatic phenotype concept recognition from unstructured text remains a challenging task in biomedical text mining research. Previous works that address the task typically use dictionary-based matching methods, which can achieve high precision but suffer from lower recall. Recently, machine learning-based methods have been proposed to identify biomedical concepts, which can recognize more unseen… ▽ More Automatic phenotype concept recognition from unstructured text remains a challenging task in biomedical text mining research. Previous works that address the task typically use dictionary-based matching methods, which can achieve high precision but suffer from lower recall. Recently, machine learning-based methods have been proposed to identify biomedical concepts, which can recognize more unseen concept synonyms by automatic feature learning. However, most methods require large corpora of manually annotated data for model training, which is difficult to obtain due to the high cost of human annotation. In this paper, we propose PhenoTagger, a hybrid method that combines both dictionary and machine learning-based methods to recognize Human Phenotype Ontology (HPO) concepts in unstructured biomedical text. We first use all concepts and synonyms in HPO to construct a dictionary, which is then used to automatically build a distantly supervised training dataset for machine learning. Next, a cutting-edge deep learning model is trained to classify each candidate phrase (n-gram from input sentence) into a corresponding concept label. Finally, the dictionary and machine learning-based prediction results are combined for improved performance. Our method is validated with two HPO corpora, and the results show that PhenoTagger compares favorably to previous methods. In addition, to demonstrate the generalizability of our method, we retrained PhenoTagger using the disease ontology MEDIC for disease concept recognition to investigate the effect of training on different ontologies. Experimental results on the NCBI disease corpus show that PhenoTagger without requiring manually annotated training data achieves competitive performance as compared with state-of-the-art supervised methods. △ Less

Submitted 25 January, 2021; v1 submitted 17 September, 2020; originally announced September 2020.

Comments: Accepted by Bioinformatics

arXiv:2005.04988 [pdf, ps, other]

A review of radar-based nowcasting of precipitation and applicable machine learning techniques

Authors: Rachel Prudden, Samantha Adams, Dmitry Kangin, Niall Robinson, Suman Ravuri, Shakir Mohamed, Alberto Arribas

Abstract: A 'nowcast' is a type of weather forecast which makes predictions in the very short term, typically less than two hours - a period in which traditional numerical weather prediction can be limited. This type of weather prediction has important applications for commercial aviation; public and outdoor events; and the construction industry, power utilities, and ground transportation services that cond… ▽ More A 'nowcast' is a type of weather forecast which makes predictions in the very short term, typically less than two hours - a period in which traditional numerical weather prediction can be limited. This type of weather prediction has important applications for commercial aviation; public and outdoor events; and the construction industry, power utilities, and ground transportation services that conduct much of their work outdoors. Importantly, one of the key needs for nowcasting systems is in the provision of accurate warnings of adverse weather events, such as heavy rain and flooding, for the protection of life and property in such situations. Typical nowcasting approaches are based on simple extrapolation models applied to observations, primarily rainfall radar. In this paper we review existing techniques to radar-based nowcasting from environmental sciences, as well as the statistical approaches that are applicable from the field of machine learning. Nowcasting continues to be an important component of operational systems and we believe new advances are possible with new partnerships between the environmental science and machine learning communities. △ Less

Submitted 11 May, 2020; originally announced May 2020.

Comments: 17 pages This work has been submitted to Monthly Weather Review. Copyright in this work may be transferred without further notice

arXiv:2004.02965 [pdf, other]

TSception: A Deep Learning Framework for Emotion Detection Using EEG

Authors: Yi Ding, Neethu Robinson, Qiuhao Zeng, Duo Chen, Aung Aung Phyo Wai, Tih-Shih Lee, Cuntai Guan

Abstract: In this paper, we propose a deep learning framework, TSception, for emotion detection from electroencephalogram (EEG). TSception consists of temporal and spatial convolutional layers, which learn discriminative representations in the time and channel domains simultaneously. The temporal learner consists of multi-scale 1D convolutional kernels whose lengths are related to the sampling rate of the E… ▽ More In this paper, we propose a deep learning framework, TSception, for emotion detection from electroencephalogram (EEG). TSception consists of temporal and spatial convolutional layers, which learn discriminative representations in the time and channel domains simultaneously. The temporal learner consists of multi-scale 1D convolutional kernels whose lengths are related to the sampling rate of the EEG signal, which learns multiple temporal and frequency representations. The spatial learner takes advantage of the asymmetry property of emotion responses at the frontal brain area to learn the discriminative representations from the left and right hemispheres of the brain. In our study, a system is designed to study the emotional arousal in an immersive virtual reality (VR) environment. EEG data were collected from 18 healthy subjects using this system to evaluate the performance of the proposed deep learning network for the classification of low and high emotional arousal states. The proposed method is compared with SVM, EEGNet, and LSTM. TSception achieves a high classification accuracy of 86.03%, which outperforms the prior methods significantly (p<0.05). The code is available at https://github.com/deepBrains/TSception △ Less

Submitted 7 April, 2020; v1 submitted 1 April, 2020; originally announced April 2020.

Comments: Authors information updated only. Accepted to be published in: 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, July 19--24, 2020, part of 2020 IEEE World Congress on Computational Intelligence (IEEE WCCI 2020)

arXiv:2003.00971 [pdf]

doi 10.1007/978-3-030-63089-8

Graphing Website Relationships for Risk Prediction: Identifying Derived Threats to Users Based on Known Indicators

Authors: Philip H. Kulp, Nikki E. Robinson

Abstract: The hypothesis for the study was that the relationship based on referrer links and the number of hops to a malicious site could indicate the risk to another website. We chose Receiver Operating Characteristics (ROC) analysis as the method of comparing true positive and false positive rates for captured web traffic to test the predictive capabilities of our model. Known threat indicators were used… ▽ More The hypothesis for the study was that the relationship based on referrer links and the number of hops to a malicious site could indicate the risk to another website. We chose Receiver Operating Characteristics (ROC) analysis as the method of comparing true positive and false positive rates for captured web traffic to test the predictive capabilities of our model. Known threat indicators were used as designators, and the Neo4j graph database was leveraged to map the relationships between other websites based on referring links. Using the referring traffic, we mapped user visits across websites with a known relationship to track the rate at which users progressed from a non-malicious website to a known threat. The results were grouped by the hop distance from the known threat to calculate the predictive rate. The results of the model produced true positive rates between 58.59% and 63.45% and false positive rates between 7.42% to 37.50%, respectively. The true and false positive rates suggest an improved performance based on the closer proximity from the known threat, while an increased referring distance from the threat resulted in higher rates of false positives. △ Less

Submitted 2 March, 2020; originally announced March 2020.

Comments: 10 pages, 3 figures, 3 tables

ACM Class: C.2.4; H.2.4; E.2; C.2.1

arXiv:2001.04930 [pdf]

Shades of Perception- User Factors in Identifying Password Strength

Authors: Jason M. Pittman, Nikki Robinson

Abstract: The purpose of this study was to measure whether participant education, profession, and technical skill level exhibited a relationship with identification of password strength. Participants reviewed 50 passwords and labeled each as weak or strong. A Chi-square test of independence was used to measure relationships between education, profession, technical skill level relative to the frequency of we… ▽ More The purpose of this study was to measure whether participant education, profession, and technical skill level exhibited a relationship with identification of password strength. Participants reviewed 50 passwords and labeled each as weak or strong. A Chi-square test of independence was used to measure relationships between education, profession, technical skill level relative to the frequency of weak and strong password identification. The results demonstrate significant relationships across all variable combinations except for technical skill and strong passwords which demonstrated no relationship. This research has three limitations. Data collection was dependent upon participant self-reporting and has limited externalized power. Further, the instrument was constructed under the assumption that all participants could read English and understood the concept of password strength. Finally, we did not control for external tool use (i.e., password strength meter). The results build upon existing literature insofar as the outcomes add to the collective understanding of user perception of passwords in specific and authentication in general. Whereas prior research has explored similar areas, such work has done so by having participants create passwords. This work measures perception of pre-generated passwords. The results demonstrate a need for further investigation into why users continue to rely on weak passwords. The originality of this work rests in soliciting a broad spectrum of participants and measuring potential correlations between participant education, profession, and technical skill level. △ Less

Submitted 14 January, 2020; originally announced January 2020.

Comments: 18 pages, 6 tables, 6 figures

arXiv:1908.03356 [pdf, other]

Seven Principles for Effective Scientific Big-DataSystems

Authors: Niall H. Robinson, Joe Hamman, Ryan Abernathey

Abstract: We should be in a golden age of scientific discovery, given that we have more data and more compute power available than ever before, plus a new generation of algorithms that can learn effectively from data. But paradoxically, in many data-driven fields, the eureka moments are becoming increasingly rare. Scientists are struggling to keep pace with the explosion in the volume and complexity of scie… ▽ More We should be in a golden age of scientific discovery, given that we have more data and more compute power available than ever before, plus a new generation of algorithms that can learn effectively from data. But paradoxically, in many data-driven fields, the eureka moments are becoming increasingly rare. Scientists are struggling to keep pace with the explosion in the volume and complexity of scientific data. We describe here a few simple architectural principles that we believe are essential in order to create effective, robust, and flexible platforms that make the best use of emerging technology to deal with the exponential growth of scientific data. △ Less

Submitted 25 June, 2020; v1 submitted 9 August, 2019; originally announced August 2019.

arXiv:1604.03688 [pdf, other]

A Practical Approach to Spatiotemporal Data Compression

Authors: Niall H. Robinson, Rachel Prudden, Alberto Arribas

Abstract: Datasets representing the world around us are becoming ever more unwieldy as data volumes grow. This is largely due to increased measurement and modelling resolution, but the problem is often exacerbated when data are stored at spuriously high precisions. In an effort to facilitate analysis of these datasets, computationally intensive calculations are increasingly being performed on specialised re… ▽ More Datasets representing the world around us are becoming ever more unwieldy as data volumes grow. This is largely due to increased measurement and modelling resolution, but the problem is often exacerbated when data are stored at spuriously high precisions. In an effort to facilitate analysis of these datasets, computationally intensive calculations are increasingly being performed on specialised remote servers before the reduced data are transferred to the consumer. Due to bandwidth limitations, this often means data are displayed as simple 2D data visualisations, such as scatter plots or images. We present here a novel way to efficiently encode and transmit 4D data fields on-demand so that they can be locally visualised and interrogated. This nascent "4D video" format allows us to more flexibly move the boundary between data server and consumer client. However, it has applications beyond purely scientific visualisation, in the transmission of data to virtual and augmented reality. △ Less

Submitted 27 April, 2016; v1 submitted 13 April, 2016; originally announced April 2016.

Showing 1–36 of 36 results for author: Robinson, N