Search | arXiv e-print repository

Img2CAD: Reverse Engineering 3D CAD Models from Images through VLM-Assisted Conditional Factorization

Authors: Yang You, Mikaela Angelina Uy, Jiaqi Han, Rahul Thomas, Haotong Zhang, Suya You, Leonidas Guibas

Abstract: Reverse engineering 3D computer-aided design (CAD) models from images is an important task for many downstream applications including interactive editing, manufacturing, architecture, robotics, etc. The difficulty of the task lies in vast representational disparities between the CAD output and the image input. CAD models are precise, programmatic constructs that involve sequential operations combining discrete command structure with continuous attributes, making it challenging to learn and optimize in an end-to-end fashion. Concurrently, input images introduce inherent challenges such as photometric variability and sensor noise, complicating the reverse engineering process. In this work, we introduce a novel approach that conditionally factorizes the task into two sub-problems. First, we leverage large foundation models, particularly GPT-4V, to predict the global discrete base structure with semantic information. Second, we propose TrAssembler, which, conditioned on the discrete structure with semantics, predicts the continuous attribute values. To support the training of our TrAssembler, we further constructed an annotated CAD dataset of common objects from ShapeNet. Putting it all together, our approach and data demonstrate significant first steps towards CAD-ifying images in the wild. Our project page: https://anonymous123342.github.io/

Submitted 19 July, 2024; originally announced August 2024.

arXiv:2407.13183 [pdf]

Methods to Measure the Broncho-Arterial Ratio and Wall Thickness in the Right Lower Lobe for Defining Radiographic Reversibility of Bronchiectasis

Authors: Abhijith R. Beeravolu, Ian Brent Masters, Mirjam Jonkman, Kheng Cher Yeo, Spyridon Prountzos, Rahul J Thomas, Eva Ignatious, Sami Azam, Gabrielle B McCallum, Efthymia Alexopoulou, Anne B Chang, Friso De Boer

Abstract: The diagnosis of bronchiectasis requires measuring abnormal bronchial dilation. It is confirmed using a chest CT scan, where the key feature is an increased broncho-arterial ratio (BAR) (>0.8 in children), often with bronchial wall thickening. Image processing methods facilitate quicker interpretation and detailed evaluations by lobes and segments. Challenges like inclined nature, oblique orientation, and partial volume effect make it difficult to obtain accurate measurements in the upper and middle lobes using the same algorithms. Therefore, accurate detection and measurement of airway and artery regions for BAR and wall thickness in each lobe require different image processing/machine learning methods. We propose methods for: 1. Separating the right lower lobe (RLL) region from full-length CT scans using the tracheal bifurcation (Carina) point as a central marker; 2. Locating the inner diameter of airways and outer diameter of arteries for BAR measurement; and 3. Measuring airway wall thickness (WT) by identifying the outer and inner diameters of airway boundaries. Analysis of 13 HRCT scans with varying thicknesses (0.67mm, 1mm, 2mm) shows the tracheal bifurcation frame can be detected accurately, with a deviation of +/- 2 frames in some cases. A Windows app was developed for measuring inner airway diameter, artery diameter, BAR, and wall thickness, allowing users to draw boundaries around visible BA pairs in the RLL region. Measurements of 10 BA pairs revealed accurate results comparable to those of a human reader, with deviations of +/- 0.10-0.15mm. Additional studies and validation are needed to consolidate inter- and intra-rater variability and enhance the methods.

Submitted 18 July, 2024; originally announced July 2024.

Comments: 14 pages

arXiv:2405.00970 [pdf, other]

How Can I Get It Right? Using GPT to Rephrase Incorrect Trainee Responses

Authors: Jionghao Lin, Zifei Han, Danielle R. Thomas, Ashish Gurung, Shivang Gupta, Vincent Aleven, Kenneth R. Koedinger

Abstract: One-on-one tutoring is widely acknowledged as an effective instructional method, conditioned on qualified tutors. However, the high demand for qualified tutors remains a challenge, often necessitating the training of novice tutors (i.e., trainees) to ensure effective tutoring. Research suggests that providing timely explanatory feedback can facilitate the training process for trainees. However, it presents challenges due to the time-consuming nature of assessing trainee performance by human experts. Inspired by the recent advancements of large language models (LLMs), our study employed the GPT-4 model to build an explanatory feedback system. This system identifies trainees' responses in binary form (i.e., correct/incorrect) and automatically provides template-based feedback with responses appropriately rephrased by the GPT-4 model. We conducted our study on 410 responses from trainees across three training lessons: Giving Effective Praise, Reacting to Errors, and Determining What Students Know. Our findings indicate that: 1) using a few-shot approach, the GPT-4 model effectively identifies correct/incorrect trainees' responses from three training lessons with an average F1 score of 0.84 and an AUC score of 0.85; and 2) using the few-shot approach, the GPT-4 model adeptly rephrases incorrect trainees' responses into desired responses, achieving performance comparable to that of human experts.

Submitted 1 May, 2024; originally announced May 2024.

Comments: International Journal of Artificial Intelligence in Education

arXiv:2405.00291 [pdf, other]

doi 10.1051/0004-6361/202349120

How Can I Improve? Using GPT to Highlight the Desired and Undesired Parts of Open-ended Responses

Authors: Jionghao Lin, Eason Chen, Zeifei Han, Ashish Gurung, Danielle R. Thomas, Wei Tan, Ngoc Dang Nguyen, Kenneth R. Koedinger

Abstract: Automated explanatory feedback systems play a crucial role in facilitating learning for a large cohort of learners by offering feedback that incorporates explanations, significantly enhancing the learning process. However, delivering such explanatory feedback in real-time poses challenges, particularly when high classification accuracy for domain-specific, nuanced responses is essential. Our study leverages the capabilities of large language models, specifically Generative Pre-Trained Transformers (GPT), to explore a sequence labeling approach focused on identifying components of desired and less desired praise for providing explanatory feedback within a tutor training dataset. Our aim is to equip tutors with actionable, explanatory feedback during online training lessons. To investigate the potential of GPT models for providing the explanatory feedback, we employed two commonly-used approaches: prompting and fine-tuning. To quantify the quality of highlighted praise components identified by GPT models, we introduced a Modified Intersection over Union (M-IoU) score. Our findings demonstrate that: (1) the M-IoU score effectively correlates with human judgment in evaluating sequence quality; (2) using two-shot prompting on GPT-3.5 resulted in decent performance in recognizing effort-based (M-IoU of 0.46) and outcome-based praise (M-IoU of 0.68); and (3) our optimally fine-tuned GPT-3.5 model achieved M-IoU scores of 0.64 for effort-based praise and 0.84 for outcome-based praise, aligning with the satisfaction levels evaluated by human coders. Our results show promise for using GPT models to provide feedback that focuses on specific elements in their open-ended responses that are desirable or could use improvement.

Submitted 30 April, 2024; originally announced May 2024.

Comments: 11 pages, full research paper, EDM 2024

Journal ref: A&A 687, A227 (2024)

arXiv:2402.14594 [pdf, other]

Improving Assessment of Tutoring Practices using Retrieval-Augmented Generation

Authors: Zifei FeiFei Han, Jionghao Lin, Ashish Gurung, Danielle R. Thomas, Eason Chen, Conrad Borchers, Shivang Gupta, Kenneth R. Koedinger

Abstract: One-on-one tutoring is an effective instructional method for enhancing learning, yet its efficacy hinges on tutor competencies. Novice math tutors often prioritize content-specific guidance, neglecting aspects such as social-emotional learning. Social-emotional learning promotes equity and inclusion and nurturing relationships with students, which is crucial for holistic student development. Assessing the competencies of tutors accurately and efficiently can drive the development of tailored tutor training programs. However, evaluating novice tutor ability during real-time tutoring remains challenging as it typically requires experts-in-the-loop. To address this challenge, this preliminary study aims to harness Generative Pre-trained Transformers (GPT), such as GPT-3.5 and GPT-4 models, to automatically assess tutors' ability of using social-emotional tutoring strategies. Moreover, this study also reports on the financial dimensions and considerations of employing these models in real-time and at scale for automated assessment. The current study examined four prompting strategies: two basic Zero-shot prompt strategies, Tree of Thought prompt, and Retrieval-Augmented Generator (RAG) based prompt. The results indicate that the RAG prompt demonstrated more accurate performance (assessed by the level of hallucination and correctness in the generated assessment texts) and lower financial costs than the other strategies evaluated. These findings inform the development of personalized tutor training interventions to enhance the educational effectiveness of tutored learning.

Submitted 4 February, 2024; originally announced February 2024.

Comments: 11 page Workshop paper, AAAI2024 Workshop on AI for Education - Bridging Innovation and Responsibility, Large Language Model, Personalized Tutor Training, Automatic Assessment

arXiv:2402.11758 [pdf, other]

Conformally rigid graphs

Authors: Stefan Steinerberger, Rekha R. Thomas

Abstract: Given a finite, simple, connected graph $G=(V,E)$ with $|V|=n$, we consider the associated graph Laplacian matrix $L = D - A$ with eigenvalues $0 = λ_1 < λ_2 \leq \dots \leq λ_n$. One can also consider the same graph equipped with positive edge weights $w:E \rightarrow \mathbb{R}_{> 0}$ normalized to $\sum_{e \in E} w_e = |E|$ and the associated weighted Laplacian matrix $L_w$. We say that $G$ is conformally rigid if constant edge-weights maximize the second eigenvalue $λ_2(w)$ of $L_w$ over all $w$, and minimize $λ_n(w')$ of $L_{w'}$ over all $w'$, i.e., for all $w,w'$, $$ λ_2(w) \leq λ_2(1) \leq λ_n(1) \leq λ_n(w').$$ Conformal rigidity requires an extraordinary amount of symmetry in $G$. Every edge-transitive graph is conformally rigid. We prove that every distance-regular graph, and hence every strongly-regular graph, is conformally rigid. Certain special graph embeddings can be used to characterize conformal rigidity. Cayley graphs can be conformally rigid but need not be; we prove a sufficient criterion. We also find a small set of conformally rigid graphs that do not belong to any of the above categories; these include the Hoffman graph, the crossing number graph 6B and others. Conformal rigidity can be certified via semidefinite programming; we provide explicit examples.

Submitted 18 February, 2024; originally announced February 2024.
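
The inequality in the abstract above can be checked numerically on a small case. The following is a minimal sketch, not the authors' code: it assumes NumPy, uses the 5-cycle (an edge-transitive graph, hence conformally rigid according to the abstract), and compares uniform weights against randomly drawn weights normalized so that they sum to $|E|$. The helper names `weighted_laplacian` and `lambda2_and_lambda_n` are illustrative only.

```python
import numpy as np

def weighted_laplacian(n, edges, weights):
    """Build the weighted graph Laplacian L_w for an undirected graph."""
    L = np.zeros((n, n))
    for (i, j), w in zip(edges, weights):
        L[i, i] += w
        L[j, j] += w
        L[i, j] -= w
        L[j, i] -= w
    return L

def lambda2_and_lambda_n(n, edges, weights):
    """Return the second-smallest and largest eigenvalues of L_w."""
    eig = np.linalg.eigvalsh(weighted_laplacian(n, edges, weights))  # ascending
    return eig[1], eig[-1]

# 5-cycle: vertices 0..4, edges (i, i+1 mod 5).
n = 5
edges = [(i, (i + 1) % n) for i in range(n)]
lam2_u, lamn_u = lambda2_and_lambda_n(n, edges, np.ones(len(edges)))

# Random positive weights, normalized so that they sum to |E|.
rng = np.random.default_rng(0)
ok = True
for _ in range(200):
    w = rng.uniform(0.1, 2.0, size=len(edges))
    w *= len(edges) / w.sum()
    lam2, lamn = lambda2_and_lambda_n(n, edges, w)
    # Conformal rigidity: lambda_2(w) <= lambda_2(1) and lambda_n(1) <= lambda_n(w).
    ok = ok and lam2 <= lam2_u + 1e-9 and lamn >= lamn_u - 1e-9

print("uniform weights:", lam2_u, lamn_u)
print("inequality held for all samples:", bool(ok))
```

Random sampling can only fail to find a counterexample; it does not certify rigidity, which the abstract says can be done via semidefinite programming.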

arXiv:2312.11274 [pdf, other]

doi 10.1145/3636555.3636896

Improving Student Learning with Hybrid Human-AI Tutoring: A Three-Study Quasi-Experimental Investigation

Authors: Danielle R. Thomas, Jionghao Lin, Erin Gatz, Ashish Gurung, Shivang Gupta, Kole Norberg, Stephen E. Fancsali, Vincent Aleven, Lee Branstetter, Emma Brunskill, Kenneth R. Koedinger

Abstract: Artificial intelligence (AI) applications to support human tutoring have potential to significantly improve learning outcomes, but engagement issues persist, especially among students from low-income backgrounds. We introduce an AI-assisted tutoring model that combines human and AI tutoring and hypothesize that this synergy will have positive impacts on learning processes. To investigate this hypothesis, we conduct a three-study quasi-experiment across three urban and low-income middle schools: 1) 125 students in a Pennsylvania school; 2) 385 students (50% Latinx) in a California school; and 3) 75 students (100% Black) in a Pennsylvania charter school, all implementing analogous tutoring models. We compare learning analytics of students engaged in human-AI tutoring with those of students using math software only. We find human-AI tutoring has positive effects, particularly in students' proficiency and usage, with evidence suggesting lower achieving students may benefit more compared to higher achieving students. We illustrate the use of quasi-experimental methods adapted to the particulars of different schools and data-availability contexts so as to achieve the rapid data-driven iteration needed to guide an inspired creation into effective innovation. Future work focuses on improving the tutor dashboard and optimizing tutor-student ratios, while maintaining annual costs per student of approximately $700.

Submitted 21 December, 2023; v1 submitted 18 December, 2023; originally announced December 2023.

Comments: 17 pages

arXiv:2312.02211 [pdf]

Cycle-consistent Generative Adversarial Network Synthetic CT for MR-only Adaptive Radiation Therapy on MR-Linac

Authors: Gabriel L. Asher, Bassem I. Zaki, Gregory A. Russo, Gobind S. Gill, Charles R. Thomas, Temiloluwa O. Prioleau, Rongxiao Zhang, Brady Hunt

Abstract: Purpose: This study assesses the effectiveness of Deep Learning (DL) for creating synthetic CT (sCT) images in MR-guided adaptive radiation therapy (MRgART). Methods: A Cycle-GAN model was trained with MRI and CT scan slices from MR-LINAC treatments, generating sCT volumes. The analysis involved retrospective treatment plan data from patients with various tumors. sCT images were compared with standard CT scans using mean absolute error in Hounsfield Units (HU) and image similarity metrics (SSIM, PSNR, NCC). sCT volumes were integrated into a clinical treatment system for dosimetric re-evaluation. Results: The model, trained on 8405 frames from 57 patients and tested on 357 sCT frames from 17 patients, showed sCTs comparable to dCTs in electron density and structural similarity with MRI scans. The MAE between sCT and dCT was 49.2 +/- 13.2 HU, with sCT NCC exceeding dCT by 0.06, and SSIM and PSNR at 0.97 +/- 0.01 and 19.9 +/- 1.6 respectively. Dosimetric evaluations indicated minimal differences between sCTs and dCTs, with sCTs showing better air-bubble reconstruction. Conclusions: DL-based sCT generation on MR-Linacs is accurate for dose calculation and optimization in MRgART. This could facilitate MR-only treatment planning, enhancing simulation and adaptive planning efficiency on MR-Linacs.

Submitted 2 December, 2023; originally announced December 2023.

arXiv:2310.20685 [pdf, other]

NeRF Revisited: Fixing Quadrature Instability in Volume Rendering

Authors: Mikaela Angelina Uy, Kiyohiro Nakayama, Guandao Yang, Rahul Krishna Thomas, Leonidas Guibas, Ke Li

Abstract: Neural radiance fields (NeRF) rely on volume rendering to synthesize novel views. Volume rendering requires evaluating an integral along each ray, which is numerically approximated with a finite sum that corresponds to the exact integral along the ray under piecewise constant volume density. As a consequence, the rendered result is unstable w.r.t. the choice of samples along the ray, a phenomenon that we dub quadrature instability. We propose a mathematically principled solution by reformulating the sample-based rendering equation so that it corresponds to the exact integral under piecewise linear volume density. This simultaneously resolves multiple issues: conflicts between samples along different rays, imprecise hierarchical sampling, and non-differentiability of quantiles of ray termination distances w.r.t. model parameters. We demonstrate several benefits over the classical sample-based rendering equation, such as sharper textures, better geometric reconstruction, and stronger depth supervision. Our proposed formulation can also be used as a drop-in replacement to the volume rendering equation of existing NeRF-based methods. Our project page can be found at pl-nerf.github.io.

Submitted 19 January, 2024; v1 submitted 31 October, 2023; originally announced October 2023.
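
For context, the "finite sum" that the abstract above refers to is the standard sample-based rendering equation from the original NeRF formulation, reproduced here as background rather than taken from this listing. With samples $t_1 < \dots < t_{N+1}$ along a ray, interval lengths $δ_i = t_{i+1} - t_i$, densities $σ_i$ and colors $c_i$ (symbols chosen here for illustration), it reads $$ \hat{C} = \sum_{i=1}^{N} T_i \left(1 - e^{-σ_i δ_i}\right) c_i, \qquad T_i = \exp\Big(-\sum_{j<i} σ_j δ_j\Big).$$ This is exact when the density is constant on each interval $[t_i, t_{i+1})$, which is the piecewise constant assumption the paper proposes to replace with a piecewise linear one.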

Comments: Neurips 2023

arXiv:2310.19201 [pdf, ps, other]

Open Problems in DAOs

Authors: Joshua Tan, Tara Merk, Sarah Hubbard, Eliza R. Oak, Helena Rong, Joni Pirovich, Ellie Rennie, Rolf Hoefer, Michael Zargham, Jason Potts, Chris Berg, Reuben Youngblom, Primavera De Filippi, Seth Frey, Jeff Strnad, Morshed Mannan, Kelsie Nabben, Silke Noa Elrifai, Jake Hartnell, Benjamin Mako Hill, Tobin South, Ryan L. Thomas, Jonathan Dotan, Ariana Spring, Alexia Maddox, et al. (4 additional authors not shown)

Abstract: Decentralized autonomous organizations (DAOs) are a new, rapidly-growing class of organizations governed by smart contracts. Here we describe how researchers can contribute to the emerging science of DAOs and other digitally-constituted organizations. From granular privacy primitives to mechanism designs to model laws, we identify high-impact problems in the DAO ecosystem where existing gaps might be tackled through a new data set or by applying tools and ideas from existing research fields such as political science, computer science, economics, law, and organizational science. Our recommendations encompass exciting research questions as well as promising business opportunities. We call on the wider research community to join the global effort to invent the next generation of organizations.

Submitted 12 June, 2024; v1 submitted 29 October, 2023; originally announced October 2023.

Comments: includes major coordination problems

arXiv:2310.17004 [pdf, other]

Improved Panning on Non-Equidistant Loudspeakers with Direct Sound Level Compensation

Authors: Jan-Hendrik Hanschke, Daniel Arteaga, Giulio Cengarle, Joshua Lando, Mark R. P. Thomas, Alan Seefeldt

Abstract: Loudspeaker rendering techniques that create phantom sound sources often assume an equidistant loudspeaker layout. Typical home setups might not fulfill this condition as loudspeakers deviate from canonical positions, thus requiring a corresponding calibration. The standard approach is to compensate for delays and to match the loudness of each loudspeaker at the listener's location. It was found that a shift of the phantom image occurs when this calibration procedure is applied and one of a pair of loudspeakers is significantly closer to the listener than the other. In this paper, a novel approach to panning on non-equidistant loudspeaker layouts is presented whereby the panning position is governed by the direct sound and the perceived loudness is governed by the full impulse response. Subjective listening tests are presented that validate the approach and quantify the perceived effect of the compensation. In a setup where the standard calibration leads to an average error of 10 degrees, the proposed direct sound compensation largely returns the phantom source to its intended position.

Submitted 27 October, 2023; v1 submitted 25 October, 2023; originally announced October 2023.

Comments: 10 pages. Accepted for presentation in AES Convention 155 (2023)

Journal ref: Proceedings of the Audio Engineering Society Convention 155, New York, paper 10669 (October 2023). https://www.aes.org/e-lib/inst/browse.cfm?elib=22250

arXiv:2310.16678 [pdf, other]

Robust and Actively Secure Serverless Collaborative Learning

Authors: Olive Franzese, Adam Dziedzic, Christopher A. Choquette-Choo, Mark R. Thomas, Muhammad Ahmad Kaleem, Stephan Rabanser, Congyu Fang, Somesh Jha, Nicolas Papernot, Xiao Wang

Abstract: Collaborative machine learning (ML) is widely used to enable institutions to learn better models from distributed data. While collaborative approaches to learning intuitively protect user data, they remain vulnerable to either the server, the clients, or both, deviating from the protocol. Indeed, because the protocol is asymmetric, a malicious server can abuse its power to reconstruct client data points. Conversely, malicious clients can corrupt learning with malicious updates. Thus, both clients and servers require a guarantee when the other cannot be trusted to fully cooperate. In this work, we propose a peer-to-peer (P2P) learning scheme that is secure against malicious servers and robust to malicious clients. Our core contribution is a generic framework that transforms any (compatible) algorithm for robust aggregation of model updates to the setting where servers and clients can act maliciously. Finally, we demonstrate the computational efficiency of our approach even with 1-million parameter models trained by 100s of peers on standard datasets.

Submitted 25 October, 2023; originally announced October 2023.

Comments: Accepted at NeurIPS 2023

arXiv:2307.02018 [pdf, ps, other]

Comparative Analysis of GPT-4 and Human Graders in Evaluating Praise Given to Students in Synthetic Dialogues

Authors: Dollaya Hirunyasiri, Danielle R. Thomas, Jionghao Lin, Kenneth R. Koedinger, Vincent Aleven

Abstract: Research suggests that providing specific and timely feedback to human tutors enhances their performance. However, it presents challenges due to the time-consuming nature of assessing tutor performance by human evaluators. Large language models, such as the AI-chatbot ChatGPT, hold potential for offering constructive feedback to tutors in practical settings. Nevertheless, the accuracy of AI-generated feedback remains uncertain, with scant research investigating the ability of models like ChatGPT to deliver effective feedback. In this work-in-progress, we evaluate 30 dialogues generated by GPT-4 in a tutor-student setting. We use two different prompting approaches, the zero-shot chain of thought and the few-shot chain of thought, to identify specific components of effective praise based on five criteria. These approaches are then compared to the results of human graders for accuracy. Our goal is to assess the extent to which GPT-4 can accurately identify each praise criterion. We found that both zero-shot and few-shot chain of thought approaches yield comparable results. GPT-4 performs moderately well in identifying instances when the tutor offers specific and immediate praise. However, GPT-4 underperforms in identifying the tutor's ability to deliver sincere praise, particularly in the zero-shot prompting scenario where examples of sincere tutor praise statements were not provided. Future work will focus on enhancing prompt engineering, developing a more general tutoring rubric, and evaluating our method using real-life tutoring dialogues.

Submitted 5 July, 2023; originally announced July 2023.

Comments: 12 pages Workshop paper, The 24th International Conference on Artificial Intelligence in Education, AIED 2023 Educational Dialogue Act Classification, Large Language Models, Tutor Training

arXiv:2306.15498 [pdf, other]

Using Large Language Models to Provide Explanatory Feedback to Human Tutors

Authors: Jionghao Lin, Danielle R. Thomas, Feifei Han, Shivang Gupta, Wei Tan, Ngoc Dang Nguyen, Kenneth R. Koedinger

Abstract: Research demonstrates that learners engaging in the process of producing explanations to support their reasoning can have a positive impact on learning. However, providing learners real-time explanatory feedback often presents challenges related to classification accuracy, particularly in domain-specific environments containing situationally complex and nuanced responses. We present two approaches for supplying tutors real-time feedback within an online lesson on how to give students effective praise. This work-in-progress demonstrates considerable accuracy in binary classification for corrective feedback of effective, or effort-based (F1 score = 0.811), and ineffective, or outcome-based (F1 score = 0.350), praise responses. More notably, we introduce progress towards an enhanced approach of providing explanatory feedback using large language model-facilitated named entity recognition, which can not only provide tutors feedback while they engage in lessons but can also potentially suggest real-time tutor moves. Future work involves leveraging large language models for data augmentation to improve accuracy, while also developing an explanatory feedback interface.

Submitted 27 June, 2023; originally announced June 2023.

Comments: 12 pages Workshop paper, The 24th International Conference on Artificial Intelligence in Education, AIED 2023 Educational Dialogue Act Classification, Large Language Models, Named Entity Recognition, Tutor Training, Explanatory Feedback, Natural Language Processing

arXiv:2306.10709 [pdf, other]

Machine learning of hidden variables in multiscale fluid simulation

Authors: Archis S. Joglekar, Alexander G. R. Thomas

Abstract: Solving fluid dynamics equations often requires the use of closure relations that account for missing microphysics. For example, when solving equations related to fluid dynamics for systems with a large Reynolds number, sub-grid effects become important and a turbulence closure is required, and in systems with a large Knudsen number, kinetic effects become important and a kinetic closure is required. By adding an equation governing the growth and transport of the quantity requiring the closure relation, it becomes possible to capture microphysics through the introduction of "hidden variables" that are non-local in space and time. The behavior of the "hidden variables" in response to the fluid conditions can be learned from a higher fidelity or ab-initio model that contains all the microphysics. In our study, a partial differential equation simulator that is end-to-end differentiable is used to train judiciously placed neural networks against ground-truth simulations. We show that this method enables an Euler equation based approach to reproduce non-linear, large Knudsen number plasma physics that can otherwise only be modeled using Boltzmann-like equation simulators such as Vlasov or Particle-In-Cell modeling.

Submitted 19 June, 2023; originally announced June 2023.

arXiv:2306.06204 [pdf, other]

Spectrahedral Geometry of Graph Sparsifiers

Authors: Catherine Babecki, Stefan Steinerberger, Rekha R. Thomas

Abstract: We propose an approach to graph sparsification based on the idea of preserving the smallest $k$ eigenvalues and eigenvectors of the Graph Laplacian. This is motivated by the fact that small eigenvalues and their associated eigenvectors tend to be more informative of the global structure and geometry of the graph than larger eigenvalues and their eigenvectors. The set of all weighted subgraphs of a graph $G$ that have the same first $k$ eigenvalues (and eigenvectors) as $G$ is the intersection of a polyhedron with a cone of positive semidefinite matrices. We discuss the geometry of these sets and deduce the natural scale of $k$. Various families of graphs illustrate our construction.

Submitted 9 June, 2023; originally announced June 2023.

Comments: 34 pages, 17 figures, 3 tables

arXiv:2303.00823 [pdf, other]

Automated control and optimisation of laser driven ion acceleration

Authors: B. Loughran, M. J. V. Streeter, H. Ahmed, S. Astbury, M. Balcazar, M. Borghesi, N. Bourgeois, C. B. Curry, S. J. D. Dann, S. DiIorio, N. P. Dover, T. Dzelzanis, O. C. Ettlinger, M. Gauthier, L. Giuffrida, G. D. Glenn, S. H. Glenzer, J. S. Green, R. J. Gray, G. S. Hicks, C. Hyland, V. Istokskaia, M. King, D. Margarone, O. McCusker, et al. (10 additional authors not shown)

Abstract: The interaction of relativistically intense lasers with opaque targets represents a highly non-linear, multi-dimensional parameter space. This limits the utility of sequential 1D scanning of experimental parameters for the optimisation of secondary radiation, although to date this has been the accepted methodology due to low data acquisition rates. High repetition-rate (HRR) lasers augmented by machine learning present a valuable opportunity for efficient source optimisation. Here, an automated, HRR-compatible system produced high fidelity parameter scans, revealing the influence of laser intensity on target pre-heating and proton generation. A closed-loop Bayesian optimisation of maximum proton energy, through control of the laser wavefront and target position, produced proton beams with equivalent maximum energy to manually-optimized laser pulses but using only 60% of the laser energy. This demonstration of automated optimisation of laser-driven proton beams is a crucial step towards deeper physical insight and the construction of future radiation sources.

Submitted 1 March, 2023; originally announced March 2023.

Comments: 11 pages

arXiv:2212.10618 [pdf, ps, other]

Ontologically Faithful Generation of Non-Player Character Dialogues

Authors: Nathaniel Weir, Ryan Thomas, Randolph D'Amore, Kellie Hill, Benjamin Van Durme, Harsh Jhamtani

Abstract: We introduce a language generation task grounded in a popular video game environment. KNUDGE (KNowledge Constrained User-NPC Dialogue GEneration) requires models to produce trees of dialogue between video game characters that accurately reflect quest and entity specifications stated in natural language. KNUDGE is constructed from side quest dialogues drawn directly from game data of Obsidian Entertainment's The Outer Worlds, leading to real-world complexities in generation: (1) dialogues are branching trees as opposed to linear chains of utterances; (2) utterances must remain faithful to the game lore -- character personas, backstories, and entity relationships; and (3) a dialogue must accurately reveal new quest details to the human player. We report results for a set of neural generation models using supervised and in-context learning techniques; we find competent performance but room for future work addressing the challenges of creating realistic, game-quality dialogues.

Submitted 13 May, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

arXiv:2212.08038 [pdf, ps, other]

Redefining Relationships in Music

Authors: Christian Detweiler, Beth Coleman, Fernando Diaz, Lieke Dom, Chris Donahue, Jesse Engel, Cheng-Zhi Anna Huang, Larry James, Ethan Manilow, Amanda McCroskery, Kyle Pedersen, Pamela Peter-Agbia, Negar Rostamzadeh, Robert Thomas, Marco Zamarato, Ben Zevenbergen

Abstract: AI tools increasingly shape how we discover, make and experience music. While these tools can have the potential to empower creativity, they may fundamentally redefine relationships between stakeholders, to the benefit of some and the detriment of others. In this position paper, we argue that these tools will fundamentally reshape our music culture, with profound effects (for better and for worse) on creators, consumers and the commercial enterprises that often connect them. By paying careful attention to emerging Music AI technologies and developments in other creative domains and understanding the implications, people working in this space could decrease the possible negative impacts on the practice, consumption and meaning of music. Given that many of these technologies are already available, there is some urgency in conducting analyses of these technologies now. It is important that people developing and working with these tools address these issues now to help guide their evolution to be equitable and empower creativity. We identify some potential risks and opportunities associated with existing and forthcoming AI tools for music, though more work is needed to identify concrete actions which leverage the opportunities while mitigating risks.

Submitted 16 December, 2022; v1 submitted 13 December, 2022; originally announced December 2022.

Comments: Presented at Cultures in AI/AI in Culture workshop at NeurIPS 2022

arXiv:2211.08927 [pdf, other]

Benchmarking Graph Neural Networks for FMRI analysis

Authors: Ahmed ElGazzar, Rajat Thomas, Guido van Wingen

Abstract: Graph Neural Networks (GNNs) have emerged as a powerful tool to learn from graph-structured data. A paramount example of such data is the brain, which operates as a network, from the micro-scale of neurons, to the macro-scale of regions. This organization makes GNNs a natural tool of choice to model brain activity, and they have consequently attracted a lot of attention in the neuroimaging community. Yet, the advantage of adopting these models over conventional methods has not yet been assessed in a systematic way to gauge if GNNs are capable of leveraging the underlying structure of the data to improve learning. In this work, we study and evaluate the performance of five popular GNN architectures in diagnosing major depression disorder and autism spectrum disorder in two multi-site clinical datasets, and sex classification on the UKBioBank, from functional brain scans under a general uniform framework. Our results show that GNNs fail to outperform kernel-based and structure-agnostic deep learning models, in which 1D CNNs outperform the other methods in all scenarios. We highlight that creating optimal graph structures for functional brain data is a major bottleneck hindering the performance of GNNs, where existing works use arbitrary measures to define the edges resulting in noisy graphs. We therefore propose to integrate graph diffusion into existing architectures and show that it can alleviate this problem and improve their performance. Our results call for increased moderation and rigorous validation when evaluating graph methods and advocate for more data-centric approaches in developing GNNs for functional neuroimaging applications.

Submitted 16 November, 2022; originally announced November 2022.

Comments: 14 pages, 3 figures

arXiv:2208.10629 [pdf, other]

Getting Bored of Cyberwar: Exploring the Role of Low-level Cybercrime Actors in the Russia-Ukraine Conflict

Authors: Anh V. Vu, Daniel R. Thomas, Ben Collier, Alice Hutchings, Richard Clayton, Ross Anderson

Abstract: There has been substantial commentary on the role of cyberattacks carried out by low-level cybercrime actors in the Russia-Ukraine conflict. We analyse 358k website defacement attacks, 1.7M UDP amplification DDoS attacks, 1764 posts made by 372 users on Hack Forums mentioning the two countries, and 441 Telegram announcements (with 58k replies) of a volunteer hacking group for two months before and four months after the invasion. We find the conflict briefly but notably caught the attention of low-level cybercrime actors, with significant increases in online discussion and both types of attacks targeting Russia and Ukraine. However, there was little evidence of high-profile actions; the role of these players in the ongoing hybrid warfare is minor, and they should be separated from persistent and motivated 'hacktivists' in state-sponsored operations. Their involvement in the conflict appears to have been short-lived and fleeting, with a clear loss of interest in discussing the situation and carrying out both website defacement and DDoS attacks against either Russia or Ukraine after just a few weeks.

Submitted 13 April, 2024; v1 submitted 22 August, 2022; originally announced August 2022.

arXiv:2208.04166 [pdf, other]

fMRI-S4: learning short- and long-range dynamic fMRI dependencies using 1D Convolutions and State Space Models

Authors: Ahmed El-Gazzar, Rajat Mani Thomas, Guido Van Wingen

Abstract: Single-subject mapping of resting-state brain functional activity to non-imaging phenotypes is a major goal of neuroimaging. The large majority of learning approaches applied today rely either on static representations or on short-term temporal correlations. This is at odds with the nature of brain activity, which is dynamic and exhibits both short- and long-range dependencies. Further, new sophisticated deep learning approaches have been developed and validated on single tasks/datasets. The application of these models to the study of different targets typically requires exhaustive hyperparameter search, model engineering and trial and error to obtain competitive results with simpler linear models. This in turn limits their adoption and hinders fair benchmarking in a rapidly developing area of research. To this end, we propose fMRI-S4, a versatile deep learning model for the classification of phenotypes and psychiatric disorders from the timecourses of resting-state functional magnetic resonance imaging scans. fMRI-S4 captures short- and long-range temporal dependencies in the signal using 1D convolutions and the recently introduced state-space models S4. The proposed architecture is lightweight, sample-efficient and robust across tasks/datasets. We validate fMRI-S4 on the tasks of diagnosing major depressive disorder (MDD), autism spectrum disorder (ASD) and sex classification on three multi-site rs-fMRI datasets. We show that fMRI-S4 can outperform existing methods on all three tasks and can be trained as a plug&play model without special hyperparameter tuning for each setting.

Submitted 8 August, 2022; originally announced August 2022.

Comments: 11 pages, 3 Figures, Accepted at MLCN 2022

arXiv:2207.07645 [pdf, other]

doi 10.3847/1538-4357/ac7c08

A Probabilistic Autoencoder for Type Ia Supernovae Spectral Time Series

Authors: George Stein, Uros Seljak, Vanessa Bohm, G. Aldering, P. Antilogus, C. Aragon, S. Bailey, C. Baltay, S. Bongard, K. Boone, C. Buton, Y. Copin, S. Dixon, D. Fouchez, E. Gangler, R. Gupta, B. Hayden, W. Hillebrandt, M. Karmen, A. G. Kim, M. Kowalski, D. Kusters, P. F. Leget, F. Mondon, J. Nordin, et al. (15 additional authors not shown)

Abstract: We construct a physically-parameterized probabilistic autoencoder (PAE) to learn the intrinsic diversity of type Ia supernovae (SNe Ia) from a sparse set of spectral time series. The PAE is a two-stage generative model, composed of an Auto-Encoder (AE) which is interpreted probabilistically after training using a Normalizing Flow (NF). We demonstrate that the PAE learns a low-dimensional latent space that captures the nonlinear range of features that exists within the population, and can accurately model the spectral evolution of SNe Ia across the full range of wavelength and observation times directly from the data. By introducing a correlation penalty term and multi-stage training setup alongside our physically-parameterized network we show that intrinsic and extrinsic modes of variability can be separated during training, removing the need for the additional models to perform magnitude standardization. We then use our PAE in a number of downstream tasks on SNe Ia for increasingly precise cosmological analyses, including automatic detection of SN outliers, the generation of samples consistent with the data distribution, and solving the inverse problem in the presence of noisy and incomplete data to constrain cosmological distance measurements. We find that the optimal number of intrinsic model parameters appears to be three, in line with previous studies, and show that we can standardize our test sample of SNe Ia with an RMS of $0.091 \pm 0.010$ mag, which corresponds to $0.074 \pm 0.010$ mag if peculiar velocity contributions are removed. Trained models and codes are released at https://github.com/georgestein/suPAErnova

Submitted 15 July, 2022; originally announced July 2022.

Comments: 23 pages, 8 Figures, 1 Table. Accepted to ApJ

arXiv:2206.13468 [pdf, ps, other]

doi 10.1007/s10208-022-09592-6

An Atlas for the Pinhole Camera

Authors: Sameer Agarwal, Timothy Duff, Max Lieblich, Rekha Thomas

Abstract: We introduce an atlas of algebro-geometric objects associated with image formation in pinhole cameras. The nodes of the atlas are algebraic varieties or their vanishing ideals related to each other by projection or elimination and restriction or specialization respectively. This atlas offers a unifying framework for the study of problems in 3D computer vision. We initiate the study of the atlas by completely characterizing a part of the atlas stemming from the triangulation problem. We conclude with several open problems and generalizations of the atlas.

Submitted 3 October, 2022; v1 submitted 27 June, 2022; originally announced June 2022.

Comments: 47 pages with references and appendices, final version

MSC Class: 14Q25; 94A08

Journal ref: JFoCM, 2022

arXiv:2206.11992 [pdf]

The LBNL Superfacility Project Report

Authors: Deborah Bard, Cory Snavely, Lisa Gerhardt, Jason Lee, Becci Totzke, Katie Antypas, William Arndt, Johannes Blaschke, Suren Byna, Ravi Cheema, Shreyas Cholia, Mark Day, Bjoern Enders, Aditi Gaur, Annette Greiner, Taylor Groves, Mariam Kiran, Quincey Koziol, Tom Lehman, Kelly Rowland, Chris Samuel, Ashwin Selvarajan, Alex Sim, David Skinner, Laurie Stephey, et al. (2 additional authors not shown)

Abstract: The Superfacility model is designed to leverage HPC for experimental science. It is more than simply a model of connected experiment, network, and HPC facilities; it encompasses the full ecosystem of infrastructure, software, tools, and expertise needed to make connected facilities easy to use. The three-year Lawrence Berkeley National Laboratory (LBNL) Superfacility project was initiated in 2019 to coordinate work being performed at LBNL to support this model, and to provide a coherent and comprehensive set of science requirements to drive existing and new work. A key component of the project was the in-depth engagements with eight science teams that represent challenging use cases across the DOE Office of Science. By the close of the project, we met our project goal by enabling our science application engagements to demonstrate automated pipelines that analyze data from remote facilities at large scale, without routine human intervention. In several cases, we have gone beyond demonstrations and now provide production-level services. To achieve this goal, the Superfacility team developed tools, infrastructure, and policies for near-real-time computing support, dynamic high-performance networking, data management and movement tools, API-driven automation, HPC-scale notebooks via Jupyter, authentication using Federated Identity and container-based edge services supported. The lessons we learned during this project provide a valuable model for future large, complex, cross-disciplinary collaborations. There is a pressing need for a coherent computing infrastructure across national facilities, and LBNL's Superfacility project is a unique model for success in tackling the challenges that will be faced in hardware, software, policies, and services across multiple science domains.

Submitted 27 June, 2022; v1 submitted 23 June, 2022; originally announced June 2022.

Comments: 85 pages, 23 figures

Report number: UCPMS ID: 3815358

arXiv:2206.05346 [pdf, other]

Random Walks, Equidistribution and Graphical Designs

Authors: Stefan Steinerberger, Rekha R. Thomas

Abstract: Let $G=(V,E)$ be a $d$-regular graph on $n$ vertices and let $μ_0$ be a probability measure on $V$. The act of moving to a randomly chosen neighbor leads to a sequence of probability measures supported on $V$ given by $μ_{k+1} = A D^{-1} μ_k$, where $A$ is the adjacency matrix and $D$ is the diagonal matrix of vertex degrees of $G$. Ordering the eigenvalues of $ A D^{-1}$ as $1 = λ_1 \geq |λ_2| \geq \dots \geq |λ_n| \geq 0$, it is well-known that the graphs for which $|λ_2|$ is small are those in which the random walk process converges quickly to the uniform distribution: for all initial probability measures $μ_0$ and all $k \geq 0$, $$ \sum_{v \in V} \left| μ_k(v) - \frac{1}{n} \right|^2 \leq λ_2^{2k}.$$ One could wonder whether this rate can be improved for specific initial probability measures $μ_0$. We show that if $G$ is regular, then for any $1 \leq \ell \leq n$, there exists a probability measure $μ_0$ supported on at most $\ell$ vertices so that $$ \sum_{v \in V} \left| μ_k(v) - \frac{1}{n} \right|^2 \leq λ_{\ell+1}^{2k}.$$ The result has applications in the graph sampling problem: we show that these measures have good sampling properties for reconstructing global averages.

Submitted 10 June, 2022; originally announced June 2022.
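
Illustrative note (not part of the arXiv listing): the classical bound quoted in the abstract above, $\sum_{v} |μ_k(v) - 1/n|^2 \leq λ_2^{2k}$, can be checked numerically on a small example. The sketch below is a minimal check under stated assumptions, not the authors' code: it takes the cycle graph $C_n$ (2-regular) as the test graph, uses the known eigenvalues $\cos(2πj/n)$ of $A D^{-1}$ on a cycle, and starts the walk from a point mass. All function names are hypothetical.

```python
import math

def walk_step(mu):
    """One random-walk step on the cycle C_n, i.e. mu_{k+1} = A D^{-1} mu_k."""
    n = len(mu)
    # Every vertex has degree 2, so it passes half of its mass to each neighbour.
    return [0.5 * (mu[(v - 1) % n] + mu[(v + 1) % n]) for v in range(n)]

def squared_deviation(mu):
    """Sum over vertices of |mu(v) - 1/n|^2."""
    n = len(mu)
    return sum((m - 1.0 / n) ** 2 for m in mu)

def second_eigenvalue_modulus(n):
    """|lambda_2| of A D^{-1} on C_n; its eigenvalues are cos(2*pi*j/n), j = 0..n-1."""
    return max(abs(math.cos(2 * math.pi * j / n)) for j in range(1, n))

def check_bound(n=5, steps=20):
    """Return (k, deviation, lambda_2^(2k)) for k = 0..steps, starting from a point mass."""
    lam2 = second_eigenvalue_modulus(n)
    mu = [1.0] + [0.0] * (n - 1)  # assumption: all initial mass on vertex 0
    rows = []
    for k in range(steps + 1):
        rows.append((k, squared_deviation(mu), lam2 ** (2 * k)))
        mu = walk_step(mu)
    return rows

if __name__ == "__main__":
    for k, dev, bound in check_bound():
        print(f"k={k:2d}  deviation={dev:.3e}  bound={bound:.3e}")
```

An odd cycle is used because an even cycle is bipartite, which gives $|λ_2| = 1$ and makes the bound trivial. The paper's improved bound with $λ_{\ell+1}$ requires a specially constructed $μ_0$ and is not reproduced here.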

arXiv:2206.03331 [pdf, other]

Improving the Diagnosis of Psychiatric Disorders with Self-Supervised Graph State Space Models

Authors: Ahmed El Gazzar, Rajat Mani Thomas, Guido Van Wingen

Abstract: Single subject prediction of brain disorders from neuroimaging data has gained increasing attention in recent years. Yet, for some heterogeneous disorders such as major depression disorder (MDD) and autism spectrum disorder (ASD), the performance of prediction models on large-scale multi-site datasets remains poor. We present a two-stage framework to improve the diagnosis of heterogeneous psychiatric disorders from resting-state functional magnetic resonance imaging (rs-fMRI). First, we propose a self-supervised mask prediction task on data from healthy individuals that can exploit differences between healthy controls and patients in clinical datasets. Next, we train a supervised classifier on the learned discriminative representations. To model rs-fMRI data, we develop Graph-S4; an extension to the recently proposed state-space model S4 to graph settings where the underlying graph structure is not known in advance. We show that combining the framework and Graph-S4 can significantly improve the diagnostic performance of neuroimaging-based single subject prediction models of MDD and ASD on three open-source multi-center rs-fMRI clinical datasets.

Submitted 7 June, 2022; originally announced June 2022.

Comments: 20 pages

arXiv:2206.01637 [pdf, other]

doi 10.1017/S0022377822000939

Unsupervised Discovery of Inertial-Fusion Plasma Physics using Differentiable Kinetic Simulations and a Maximum Entropy Loss Function

Authors: Archis S. Joglekar, Alexander G. R. Thomas

Abstract: Plasma supports collective modes and particle-wave interactions that lead to complex behavior in inertial fusion energy applications. While plasma can sometimes be modeled as a charged fluid, a kinetic description is useful towards the study of nonlinear effects in the higher dimensional momentum-position phase-space that describes the full complexity of plasma dynamics. We create a differentiable solver for the plasma kinetics 3D partial-differential-equation and introduce a domain-specific objective function. Using this framework, we perform gradient-based optimization of neural networks that provide forcing function parameters to the differentiable solver given a set of initial conditions. We apply this to an inertial-fusion relevant configuration and find that the optimization process exploits a novel physical effect that has previously remained undiscovered.

Submitted 27 July, 2022; v1 submitted 3 June, 2022; originally announced June 2022.

Comments: 2nd AI4Science Workshop at the 39th International Conference on Machine Learning (ICML), 2022

arXiv:2204.01873 [pdf, other]

Graphical Designs and Gale Duality

Authors: Catherine Babecki, Rekha R. Thomas

Abstract: A graphical design is a subset of graph vertices such that the weighted averages of certain graph eigenvectors over the design agree with their global averages. We use Gale duality to show that positively weighted graphical designs in regular graphs are in bijection with the faces of a generalized eigenpolytope of the graph. This connection can be used to organize, compute and optimize designs. We illustrate the power of this tool on three families of Cayley graphs -- cocktail party graphs, cycles, and graphs of hypercubes -- by computing or bounding the smallest designs that average all but the last eigenspace in frequency order.

Submitted 5 July, 2022; v1 submitted 4 April, 2022; originally announced April 2022.

Comments: 32 pages, 14 figures, 1 table

MSC Class: 05C50; 52B35; 90C57; 68R10

arXiv:2203.15423 [pdf]

Development of a Scale to Measure Technology Acceptance in Smart Agriculture

Authors: Rosemary J Thomas, Rebecca Whetton, Andy Doyle, David Coyle

Abstract: This paper describes the development of a scale to measure technology acceptance in smart agriculture. The scale is intended for use in diverse situations, ranging from the evaluation of existing technologies already in widespread use, to the evaluation of prototype systems. A systematic screening of prior literature was conducted to identify initial scale items regarding how technology acceptance is currently understood and measured. These items were iteratively reviewed and systematically categorised to develop the final scale proposed in this paper. In future work, this scale will be validated through user studies. The purpose of this paper is to make the initial scale available to the research community with a view to initial use and further evaluation.

Submitted 29 March, 2022; originally announced March 2022.

arXiv:2111.00801 [pdf, other]

Livestock Monitoring with Transformer

Authors: Bhavesh Tangirala, Ishan Bhandari, Daniel Laszlo, Deepak K. Gupta, Rajat M. Thomas, Devanshu Arya

Abstract: Tracking the behaviour of livestock enables early detection and thus prevention of contagious diseases in modern animal farms. Apart from economic gains, this would reduce the amount of antibiotics used in livestock farming, which otherwise enter the human diet, exacerbating the epidemic of antibiotic resistance - a leading cause of death. We could use standard video cameras, available in most modern farms, to monitor livestock. However, most computer vision algorithms perform poorly on this task, primarily because (i) animals bred in farms look identical, lacking any obvious spatial signature, (ii) none of the existing trackers are robust for long duration, and (iii) real-world conditions such as changing illumination, frequent occlusion, varying camera angles, and sizes of the animals make it hard for models to generalize. Given these challenges, we develop an end-to-end behaviour monitoring system for group-housed pigs to perform simultaneous instance level segmentation, tracking, action recognition and re-identification (STAR) tasks. We present starformer, the first end-to-end multiple-object livestock monitoring framework that learns instance-level embeddings for grouped pigs through the use of transformer architecture. For benchmarking, we present Pigtrace, a carefully curated dataset comprising video sequences with instance level bounding box, segmentation, tracking and activity classification of pigs in real indoor farming environment. Using simultaneous optimization on STAR tasks we show that starformer outperforms popular baseline models trained for individual tasks.

Submitted 2 November, 2021; v1 submitted 1 November, 2021; originally announced November 2021.

Comments: Accepted at BMVC 2021

arXiv:2109.12517 [pdf, other]

doi 10.1007/978-3-030-87586-2_13

Dynamic Adaptive Spatio-temporal Graph Convolution for fMRI Modelling

Authors: Ahmed El-Gazzar, Rajat Mani Thomas, Guido van Wingen

Abstract: The characterisation of the brain as a functional network in which the connections between brain regions are represented by correlation values across time series has been very popular in recent years. Although this representation has advanced our understanding of brain function, it represents a simplified model of brain connectivity that has a complex dynamic spatio-temporal nature. Oversimplification of the data may hinder the merits of applying advanced non-linear feature extraction algorithms. To this end, we propose a dynamic adaptive spatio-temporal graph convolution (DAST-GCN) model to overcome the shortcomings of pre-defined static correlation-based graph structures. The proposed approach allows end-to-end inference of dynamic connections between brain regions via a layer-wise graph structure learning module while mapping brain connectivity to a phenotype in a supervised learning framework. This leverages the computational power of the model, data and targets to represent brain connectivity, and could enable the identification of potential biomarkers for the supervised target in question. We evaluate our pipeline on the UKBiobank dataset for age and gender classification tasks from resting-state functional scans and show that it outperforms currently adopted linear and non-linear methods in neuroimaging. Further, we assess the generalizability of the inferred graph structure by transferring the pre-trained graph to an independent dataset for the same task. Our results demonstrate the task-robustness of the graph against different scanning parameters and demographics.

Submitted 26 September, 2021; originally announced September 2021.

Comments: Accepted at International Workshop on Machine Learning in Clinical Neuroimaging (MLCN2021)

Journal ref: Abdulkadir A. et al. (eds) Machine Learning in Clinical Neuroimaging. MLCN 2021. Lecture Notes in Computer Science, vol 13001. Springer, Cham

arXiv:2105.04764 [pdf, other]

Autonomous Situational Awareness for Robotic Swarms in High-Risk Environments

Authors: Vincent W. Hill, Ryan W. Thomas, Jordan D. Larson

Abstract: This paper describes a technique for the autonomous mission planning of robotic swarms in high risk environments where agent disablement is likely. Given a swarm operating in a known area, a central command system generates measurements from the swarm. If those measurements indicate changes to the mission situation such as target movement or agent loss, the swarm planning is updated to reflect the new situation and guidance updates are broadcast to the swarm. The primary algorithms featured in this work are A* pathfinding and the Generalized Labeled Multi-Bernoulli multi-object tracking method.

Submitted 10 May, 2021; originally announced May 2021.

Comments: arXiv admin note: substantial text overlap with arXiv:2104.08904
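Illustrative note: the abstract above names A* pathfinding as one of its primary algorithms. The following is a minimal sketch of textbook A* on a small occupancy grid, not the authors' planner. The grid encoding (0 = free, 1 = blocked), the 4-connected unit-cost moves, the Manhattan heuristic, and the names `a_star` and `grid` are all assumptions made for illustration.

```python
import heapq


def a_star(grid, start, goal):
    """Return a shortest 4-connected path from start to goal, or None.

    grid is a list of rows; 0 marks a free cell and 1 a blocked cell
    (a hypothetical encoding chosen for this sketch).
    """
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance never overestimates unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]  # (f = g + h, g, cell)
    best_cost = {start: 0}
    parent = {start: None}

    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            # Walk the parent links back to the start.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        if cost > best_cost[cell]:
            continue  # stale heap entry; a cheaper route was found later
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = new_cost
                    parent[nxt] = cell
                    heapq.heappush(
                        open_heap, (new_cost + heuristic(nxt), new_cost, nxt)
                    )
    return None  # goal unreachable


grid = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
path = a_star(grid, (0, 0), (3, 3))
```

In a replanning setting like the one the abstract describes, one plausible use is to call such a routine again whenever the tracked situation changes, for example after marking newly hazardous cells as blocked.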

arXiv:2105.00100 [pdf, other]

doi 10.1109/IJCNN52387.2021.9534128

Data-driven Full-waveform Inversion Surrogate using Conditional Generative Adversarial Networks

Authors: Saraiva Marcus, Forechi Avelino, de Oliveira Neto Jorcy, DelRey Antonio, Rauber Thomas

Abstract: In the Oil and Gas industry, estimating a subsurface velocity field is an essential step in seismic processing, reservoir characterization, and hydrocarbon volume calculation. Full-waveform inversion (FWI) velocity modeling is an iterative advanced technique that provides an accurate and detailed velocity field model, although at a very high computational cost due to the physics-based numerical simulations required at each FWI iteration. In this study, we propose a method of generating velocity field models, as detailed as those obtained through FWI, using a conditional generative adversarial network (cGAN) with multiple inputs. The primary motivation of this approach is to circumvent the extremely high cost of full-waveform inversion velocity modeling. Real-world data were used to train and test the proposed network architecture, and three evaluation metrics (percent error, structural similarity index measure, and visual analysis) were adopted as quality criteria. Based on these metrics, the results evaluated upon the test set suggest that the GAN was able to accurately match real FWI generated outputs, enabling it to extract from input data the main geological structures and lateral velocity variations. Experimental results indicate that the proposed method, when deployed, has the potential to increase the speed of geophysical reservoir characterization processes, saving on time and computational resources.

Submitted 30 April, 2021; originally announced May 2021.

arXiv:2104.08904 [pdf, other]

Autonomous Situational Awareness for UAS Swarms

Authors: Vincent W. Hill, Ryan W. Thomas, Jordan D. Larson

Abstract: This paper describes a technique for the autonomous mission planning of unmanned aerial system swarms. Given a swarm operating in a known area, a central command system generates measurements from the swarm. If those measurements indicate changes to the mission situation such as target movement, the swarm planning is updated to reflect the new situation and guidance updates are broadcast to the swarm. The primary algorithms featured in this work are A* pathfinding and the Generalized Labeled Multi-Bernoulli multi-target tracking method.

Submitted 18 April, 2021; originally announced April 2021.

Comments: IEEE Aerospace 2021

arXiv:2104.04017 [pdf, other]

Improving Solar Cell Metallization Designs using Convolutional Neural Networks

Authors: Sumit Bhattacharya, Devanshu Arya, Debjani Bhowmick, Rajat Mani Thomas, Deepak Kumar Gupta

Abstract: Optimizing the design of solar cell metallizations is one of the ways to improve the performance of solar cells. Recently, it has been shown that Topology Optimization (TO) can be used to design complex metallization patterns for solar cells that lead to improved performance. TO generates unconventional design patterns that cannot be obtained with the traditional shape optimization methods. In this paper, we show that this design process can be improved further using a deep learning inspired strategy. We present SolarNet, a CNN-based reparameterization scheme that can be used to obtain improved metallization designs. SolarNet modifies the optimization domain such that rather than optimizing the electrode material distribution directly, the weights of a CNN model are optimized. The design generated by CNN is then evaluated using the physics equations, which in turn generates gradients for backpropagation. SolarNet is trained end-to-end involving backpropagation through the solar cell model as well as the CNN pipeline. Through application on solar cells of different shapes as well as different busbar geometries, we demonstrate that SolarNet improves the performance of solar cells compared to the traditional TO approach.

Submitted 8 April, 2021; originally announced April 2021.

Comments: Published as a workshop paper at ICLR 2021 SimDL Workshop

arXiv:2102.13473 [pdf]

Sleep Apnea and Respiratory Anomaly Detection from a Wearable Band and Oxygen Saturation

Authors: Wolfgang Ganglberger, Abigail A. Bucklin, Ryan A. Tesh, Madalena Da Silva Cardoso, Haoqi Sun, Michael J. Leone, Luis Paixao, Ezhil Panneerselvam, Elissa M. Ye, B. Taylor Thompson, Oluwaseun Akeju, David Kuller, Robert J. Thomas, M. Brandon Westover

Abstract: Objective: Sleep related respiratory abnormalities are typically detected using polysomnography. There is a need in general medicine and critical care for a more convenient method to automatically detect sleep apnea from a simple, easy-to-wear device. The objective is to automatically detect abnormal respiration and estimate the Apnea-Hypopnea-Index (AHI) with a wearable respiratory device, compared to an SpO2 signal or polysomnography using a large (n = 412) dataset serving as ground truth. Methods: Simultaneously recorded polysomnographic (PSG) and wearable respiratory effort data were used to train and evaluate models in a cross-validation fashion. Time domain and complexity features were extracted, important features were identified, and a random forest model employed to detect events and predict AHI. Four models were trained: one each using the respiratory features only, a feature from the SpO2 (%)-signal only, and two additional models that use the respiratory features and the SpO2 (%)-feature, one allowing a time lag of 30 seconds between the two signals. Results: Event-based classification resulted in areas under the receiver operating characteristic curves of 0.94, 0.86, 0.82, and areas under the precision-recall curves of 0.48, 0.32, 0.51 for the models using respiration and SpO2, respiration-only, and SpO2-only respectively. Correlation between expert-labelled and predicted AHI was 0.96, 0.78, and 0.93, respectively. Conclusions: A wearable respiratory effort signal with or without SpO2 predicted AHI accurately. Given the large dataset and rigorous testing design, we expect our models are generalizable to evaluating respiration in a variety of environments, such as at home and in critical care.

Submitted 23 February, 2021; originally announced February 2021.

Comments: Co-First Authors: Wolfgang Ganglberger, Abigail A. Bucklin Co-Senior Authors: Robert J. Thomas, M. Brandon Westover

arXiv:2101.04635 [pdf, other]

Automated Respiratory Event Detection Using Deep Neural Networks

Authors: Thijs E Nassi, Wolfgang Ganglberger, Haoqi Sun, Abigail A Bucklin, Siddharth Biswal, Michel J A M van Putten, Robert J Thomas, M Brandon Westover

Abstract: The gold standard to assess respiration during sleep is polysomnography; a technique that is burdensome, expensive (both in analysis time and measurement costs), and difficult to repeat. Automation of respiratory analysis can improve test efficiency and enable accessible implementation opportunities worldwide. Using 9,656 polysomnography recordings from the Massachusetts General Hospital (MGH), we trained a neural network (WaveNet) based on a single respiratory effort belt to detect obstructive apnea, central apnea, hypopnea and respiratory-effort related arousals. Performance evaluation included event-based and recording-based metrics - using an apnea-hypopnea index analysis. The model was further evaluated on a public dataset, the Sleep-Heart-Health-Study-1, containing 8,455 polysomnographic recordings. For binary apnea event detection in the MGH dataset, the neural network obtained an accuracy of 95%, an apnea-hypopnea index $r^2$ of 0.89 and area under the curve for the receiver operating characteristics curve and precision-recall curve of 0.93 and 0.74, respectively. For the multiclass task, we obtained varying performances: 81% of all labeled central apneas were correctly classified, whereas this metric was 46% for obstructive apneas, 29% for respiratory effort related arousals and 16% for hypopneas. The majority of false predictions were misclassifications as another type of respiratory event. Our fully automated method can detect respiratory events and assess the apnea-hypopnea index with sufficient accuracy for clinical utilization. Differentiation of event types is more difficult and may reflect in part the complexity of human respiratory output and some degree of arbitrariness in the clinical thresholds and criteria used during manual annotation.

Submitted 12 January, 2021; originally announced January 2021.

Comments: 11 pages, 6 figures, 6 tables, ©2020 IEEE

arXiv:2012.08175 [pdf, other]

doi 10.1016/j.jqsrt.2020.107361

Machine Learning for automatic identification of new minor species

Authors: Frederic Schmidt, Guillaume Cruz Mermy, Justin Erwin, Severine Robert, Lori Neary, Ian R. Thomas, Frank Daerden, Bojan Ristic, Manish R. Patel, Giancarlo Bellucci, Jose-Juan Lopez-Moreno, Ann-Carine Vandaele

Abstract: One of the main difficulties to analyze modern spectroscopic datasets is due to the large amount of data. For example, in atmospheric transmittance spectroscopy, the solar occultation channel (SO) of the NOMAD instrument onboard the ESA ExoMars2016 satellite called Trace Gas Orbiter (TGO) had produced $\sim$10 million spectra in 20000 acquisition sequences since the beginning of the mission in April 2018 until 15 January 2020. Other datasets are even larger with $\sim$billions of spectra for OMEGA onboard Mars Express or CRISM onboard Mars Reconnaissance Orbiter. Usually, new lines are discovered after a long iterative process of model fitting and manual residual analysis. Here we propose a new method based on unsupervised machine learning, to automatically detect new minor species. Although precise quantification is out of scope, this tool can also be used to quickly summarize the dataset, by giving a few endmembers ("source") and their abundances. We approximate the dataset non-linearity by a linear mixture of abundance and source spectra (endmembers). We used unsupervised source separation in the form of non-negative matrix factorization to estimate those quantities. Several methods are tested on synthetic and simulation data. Our approach is dedicated to detecting minor species spectra rather than precisely quantifying them. On a synthetic example, this approach is able to detect chemical compounds present in the form of 100 hidden spectra out of $10^4$, at 1.5 times the noise level. Results on simulated spectra of NOMAD-SO targeting CH$_{4}$ show that detection limits go in the range of 100-500 ppt in favorable conditions. Results on real martian data from NOMAD-SO show that CO$_{2}$ and H$_{2}$O are present, as expected, but CH$_{4}$ is absent. Nevertheless, we confirm a set of new unexpected lines in the database, attributed by ACS instrument Team to the CO$_{2}$ magnetic dipole.

Submitted 15 December, 2020; originally announced December 2020.

Comments: 26 pages, 10 figures

Journal ref: Quantitative Spectroscopy and Radiative Transfer, 2021, 259, 107361
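Illustrative note: the abstract above models each spectrum as a non-negative linear mixture of source spectra (endmembers) weighted by abundances, estimated by non-negative matrix factorization. The sketch below uses the standard Lee-Seung multiplicative updates for the Frobenius-norm objective, in pure Python. It is an assumption that this variant resembles what the authors used; the abstract only says several methods were tested. The names `nmf`, `abundances`, and `sources`, and the toy data, are hypothetical.

```python
import random


def matmul(a, b):
    """Plain-list matrix product a @ b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frob_error(v, w, h):
    """Squared Frobenius norm of v - w @ h."""
    wh = matmul(w, h)
    return sum(
        (v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0]))
    )


def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix v (spectra as rows) as w @ h.

    w holds per-spectrum abundances and h holds the source spectra.
    Multiplicative updates keep every entry non-negative.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # h <- h * (w^T v) / (w^T w h), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [
            [h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
            for k in range(rank)
        ]
        # w <- w * (v h^T) / (w h h^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [
            [w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
            for i in range(n)
        ]
    return w, h


# Toy data: 4 observed spectra mixed from 2 hypothetical source spectra.
sources = [[1.0, 0.0, 2.0, 0.5, 0.0], [0.0, 1.5, 0.0, 0.5, 1.0]]
abundances = [[1.0, 0.0], [0.5, 0.5], [0.0, 2.0], [0.2, 0.8]]
v = matmul(abundances, sources)
w, h = nmf(v, rank=2)
```

The factorization is only determined up to scaling and permutation of the sources, so recovered rows of `h` should be compared to reference spectra by shape rather than by absolute amplitude.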

arXiv:2011.07197 [pdf, other]

Existence of Two View Chiral Reconstructions

Authors: Andrew Pryhuber, Rainer Sinn, Rekha R. Thomas

Abstract: A fundamental question in computer vision is whether a set of point pairs is the image of a scene that lies in front of two cameras. Such a scene and the cameras together are known as a chiral reconstruction of the point pairs. In this paper we provide a complete classification of k point pairs for which a chiral reconstruction exists. The existence of chiral reconstructions is equivalent to the non-emptiness of certain semialgebraic sets. For up to three point pairs, we prove that a chiral reconstruction always exists while the set of five or more point pairs that do not have a chiral reconstruction is Zariski-dense. We show that for five generic point pairs, the chiral region is bounded by line segments in a Schläfli double six on a cubic surface with 27 real lines. Four point pairs have a chiral reconstruction unless they belong to two non-generic combinatorial types, in which case they may or may not.

Submitted 3 December, 2021; v1 submitted 13 November, 2020; originally announced November 2020.

arXiv:2011.02998 [pdf, other]

Competence-Level Prediction and Resume & Job Description Matching Using Context-Aware Transformer Models

Authors: Changmao Li, Elaine Fisher, Rebecca Thomas, Steve Pittard, Vicki Hertzberg, Jinho D. Choi

Abstract: This paper presents a comprehensive study on resume classification to reduce the time and labor needed to screen an overwhelming number of applications significantly, while improving the selection of suitable candidates. A total of 6,492 resumes are extracted from 24,933 job applications for 252 positions designated into four levels of experience for Clinical Research Coordinators (CRC). Each resume is manually annotated to its most appropriate CRC position by experts through several rounds of triple annotation to establish guidelines. As a result, a high Kappa score of 61% is achieved for inter-annotator agreement. Given this dataset, novel transformer-based classification models are developed for two tasks: the first task takes a resume and classifies it to a CRC level (T1), and the second task takes both a resume and a job description to apply and predicts if the application is suited to the job (T2). Our best models using section encoding and multi-head attention decoding give results of 73.3% to T1 and 79.2% to T2. Our analysis shows that the prediction errors are mostly made among adjacent CRC levels, which are hard for even experts to distinguish, implying the practical value of our models in real HR platforms.

Submitted 5 November, 2020; originally announced November 2020.

Comments: Accepted by the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020)

ACM Class: I.2.7

arXiv:2010.12397 [pdf, ps, other]

Quickly excluding a non-planar graph

Authors: Ken-ichi Kawarabayashi, Robin Thomas, Paul Wollan

Abstract: A cornerstone theorem in the Graph Minors series of Robertson and Seymour is the result that every graph $G$ with no minor isomorphic to a fixed graph $H$ has a certain structure. The structure can then be exploited to deduce far-reaching consequences. The exact statement requires some explanation, but roughly it says that there exist integers $k,n$ depending on $H$ only such that $0<k<n$ and for every $n\times n$ grid minor $J$ of $G$ the graph $G$ has a $k$-near embedding in a surface $Σ$ that does not embed $H$ in such a way that a substantial part of $J$ is embedded in $Σ$. Here a $k$-near embedding means that after deleting at most $k$ vertices the graph can be drawn in $Σ$ without crossings, except for local areas of non-planarity, where crossings are permitted, but at most $k$ of these areas are attached to the rest of the graph by four or more vertices and inside those the graph is constrained in a different way, again depending on the parameter $k$. The original and only proof so far is quite long and uses many results developed in the Graph Minors series. We give a proof that uses only our earlier paper [A new proof of the flat wall theorem, {\it J.~Combin.\ Theory Ser.\ B \bf 129} (2018), 158--203] and results from graduate textbooks. Our proof is constructive and yields a polynomial time algorithm to construct such a structure. We also give explicit constants for the structure theorem, whereas the original proof only guarantees the existence of such constants.

Submitted 2 January, 2021; v1 submitted 23 October, 2020; originally announced October 2020.

Comments: 97 pages

arXiv:2009.09084 [pdf, other]

Intimate Partner Violence and Injury Prediction From Radiology Reports

Authors: Irene Y. Chen, Emily Alsentzer, Hyesun Park, Richard Thomas, Babina Gosangi, Rahul Gujrathi, Bharti Khurana

Abstract: Intimate partner violence (IPV) is an urgent, prevalent, and under-detected public health issue. We present machine learning models to assess patients for IPV and injury. We train the predictive algorithms on radiology reports with 1) IPV labels based on entry to a violence prevention program and 2) injury labels provided by emergency radiology fellowship-trained physicians. Our dataset includes 34,642 radiology reports and 1479 patients of IPV victims and control patients. Our best model predicts IPV a median of 3.08 years before violence prevention program entry with a sensitivity of 64% and a specificity of 95%. We conduct error analysis to determine for which patients our model has especially high or low performance and discuss next steps for a deployed clinical risk model.

Submitted 7 October, 2020; v1 submitted 28 August, 2020; originally announced September 2020.

arXiv:2008.03513 [pdf, ps, other]

doi 10.1109/ICASSP40776.2020.9054728

A Novel Method for Obtaining Diffuse Field Measurements for Microphone Calibration

Authors: Noman Akbar, Glenn Dickins, Mark R. P. Thomas, Prasanga Samarasinghe, Thushara Abhayapala

Abstract: We propose a straightforward and cost-effective method to perform diffuse soundfield measurements for calibrating the magnitude response of a microphone array. Typically, such calibration is performed in a diffuse soundfield created in reverberation chambers, an expensive and time-consuming process. A method is proposed for obtaining diffuse field measurements in untreated environments. First, a closed-form expression for the spatial correlation of a wideband signal in a diffuse field is derived. Next, we describe a practical procedure for obtaining the diffuse field response of a microphone array in the presence of a non-diffuse soundfield by the introduction of random perturbations in the microphone location. Experimental spatial correlation data obtained is compared with the theoretical model, confirming that it is possible to obtain diffuse field measurements in untreated environments with relatively few loudspeakers. A 30 second test signal played from 4-8 loudspeakers is shown to be sufficient in obtaining a diffuse field measurement using the proposed method. An Eigenmike is then successfully calibrated at two different geographical locations.

Submitted 8 August, 2020; originally announced August 2020.

Comments: Accepted to appear in IEEE ICASSP 2020

Journal ref: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020

arXiv:2003.09265 [pdf, ps, other]

The Chiral Domain of a Camera Arrangement

Authors: Sameer Agarwal, Andrew Pryhuber, Rainer Sinn, Rekha R. Thomas

Abstract: We introduce the chiral domain of an arrangement of cameras $\mathcal{A} = \{A_1,\dots, A_m\}$ which is the subset of $\mathbb{P}^3$ visible in $\mathcal{A}$. It generalizes the classical definition of chirality to include all of $\mathbb{P}^3$ and offers a unifying framework for studying multiview chirality. We give an algebraic description of the chiral domain which allows us to define and describe a chiral version of Triggs' joint image. We then use the chiral domain to re-derive and extend prior results on chirality due to Hartley.

Submitted 26 April, 2022; v1 submitted 19 March, 2020; originally announced March 2020.

arXiv:2002.08512 [pdf]

The Problem with Metrics is a Fundamental Problem for AI

Authors: Rachel Thomas, David Uminsky

Abstract: Optimizing a given metric is a central aspect of most current AI approaches, yet overemphasizing metrics leads to manipulation, gaming, a myopic focus on short-term goals, and other unexpected negative consequences. This poses a fundamental contradiction for AI development. Through a series of real-world case studies, we look at various aspects of where metrics go wrong in practice and aspects of how our online environment and current business practices are exacerbating these failures. Finally, we propose a framework towards mitigating the harms caused by overemphasis of metrics within AI by: (1) using a slate of metrics to get a fuller and more nuanced picture, (2) combining metrics with qualitative accounts, and (3) involving a range of stakeholders, including those who will be most impacted.

Submitted 19 February, 2020; originally announced February 2020.

Comments: Accepted to EDSC (Ethics of Data Science Conference) 2020

arXiv:2002.05981 [pdf, other]

A Hybrid 3DCNN and 3DC-LSTM based model for 4D Spatio-temporal fMRI data: An ABIDE Autism Classification study

Authors: Ahmed El-Gazzar, Mirjam Quaak, Leonardo Cerliani, Peter Bloem, Guido van Wingen, Rajat Mani Thomas

Abstract: Functional Magnetic Resonance Imaging (fMRI) captures the temporal dynamics of neural activity as a function of spatial location in the brain. Thus, fMRI scans are represented as 4-Dimensional (3-space + 1-time) tensors. It is widely believed that the spatio-temporal patterns in fMRI manifest as behaviour and clinical symptoms. Because of the high dimensionality ($\sim$ 1 Million) of fMRI, and the added constraints of limited cardinality of data sets, extracting such patterns is challenging. A standard approach to overcome these hurdles is to reduce the dimensionality of the data by either summarizing activation over time or space at the expense of possible loss of useful information. Here, we introduce an end-to-end algorithm capable of extracting spatiotemporal features from the full 4-D data using 3-D CNNs and 3-D Convolutional LSTMs. We evaluate our proposed model on the publicly available ABIDE dataset to demonstrate the capability of our model to classify Autism Spectrum Disorder (ASD) from resting-state fMRI data. Our results show that the proposed model achieves state of the art results on single sites with F1-scores of 0.78 and 0.7 on NYU and UM sites, respectively.

Submitted 14 February, 2020; originally announced February 2020.

Comments: 8 pages

Journal ref: Second International Workshop, OR 2.0 2019, and Second International Workshop, MLCN 2019, Held in Conjunction with MICCAI 2019, Shenzhen, China, October 13 and 17, 2019, Proceedings

arXiv:1911.02682 [pdf, other]

Physics-Guided Architecture (PGA) of Neural Networks for Quantifying Uncertainty in Lake Temperature Modeling

Authors: Arka Daw, R. Quinn Thomas, Cayelan C. Carey, Jordan S. Read, Alison P. Appling, Anuj Karpatne

Abstract: To simultaneously address the rising need of expressing uncertainties in deep learning models along with producing model outputs which are consistent with the known scientific knowledge, we propose a novel physics-guided architecture (PGA) of neural networks in the context of lake temperature modeling where the physical constraints are hard coded in the neural network architecture. This allows us to integrate such models with state of the art uncertainty estimation approaches such as Monte Carlo (MC) Dropout without sacrificing the physical consistency of our results. We demonstrate the effectiveness of our approach in ensuring better generalizability as well as physical consistency in MC estimates over data collected from Lake Mendota in Wisconsin and Falling Creek Reservoir in Virginia, even with limited training data. We further show that our MC estimates correctly match the distribution of ground-truth observations, thus making the PGA paradigm amenable to physically grounded uncertainty quantification.

Submitted 6 November, 2019; originally announced November 2019.

Comments: 11 pages, 15 figures, 2 tables

arXiv:1907.01288 [pdf, other]

Simple 1-D Convolutional Networks for Resting-State fMRI Based Classification in Autism

Authors: Ahmed El Gazzar, Leonardo Cerliani, Guido van Wingen, Rajat Mani Thomas

Abstract: Deep learning methods are increasingly being used with neuroimaging data like structural and functional magnetic resonance imaging (MRI) to predict the diagnosis of neuropsychiatric and neurological disorders. For psychiatric disorders in particular, it is believed that one of the most promising modalities is the resting-state functional MRI (rsfMRI), which captures the intrinsic connectivity between regions in the brain. Because rsfMRI data points are inherently high-dimensional (~1M), it is impossible to process the entire input in its raw form. In this paper, we propose a very simple transformation of the rsfMRI images that captures all of the temporal dynamics of the signal but sub-samples its spatial extent. As a result, we use a very simple 1-D convolutional network which is fast to train, requires minimal preprocessing and performs on par with the state-of-the-art on the classification of Autism spectrum disorders.

Submitted 2 July, 2019; originally announced July 2019.

Comments: accepted for publication in IJCNN 2019

arXiv:1907.00305 [pdf, ps, other]

doi 10.1016/j.jctb.2005.10.005

Minimal bricks

Authors: Serguei Norine, Robin Thomas

Abstract: A brick is a 3-connected graph such that the graph obtained from it by deleting any two distinct vertices has a perfect matching. A brick is minimal if for every edge e the deletion of e results in a graph that is not a brick. We prove a generation theorem for minimal bricks and two corollaries: (1) for n>4, every minimal brick on 2n vertices has at most 5n-7 edges, and (2) every minimal brick has at least three vertices of degree three.

Submitted 29 June, 2019; originally announced July 2019.

Comments: 10 pages, 2 figures. This version fixes an error kindly pointed to us by P. A. Fabres, N. Kothari and M. H. de Carvalho

Journal ref: J. Combin. Theory Ser. B 96 (2006), 505-513

Showing 1–50 of 96 results for author: Thomas, R