-
Concept Lens: Visually Analyzing the Consistency of Semantic Manipulation in GANs
Authors:
Sangwon Jeong,
Mingwei Li,
Matthew Berger,
Shusen Liu
Abstract:
As applications of generative AI become mainstream, it is important to understand what generative models are capable of producing, and the extent to which one can predictably control their outputs. In this paper, we propose a visualization design, named Concept Lens, for jointly navigating the data distribution of a generative model, and concept manipulations supported by the model. Our work is focused on modern vision-based generative adversarial networks (GAN), and their learned latent spaces, wherein concept discovery has gained significant interest as a means of image manipulation. Concept Lens is designed to support users in understanding the diversity of a provided set of concepts, the relationship between concepts, and the suitability of concepts to give semantic controls for image generation. Key to our approach is the hierarchical grouping of concepts, generated images, and the associated joint exploration. We show how Concept Lens can reveal consistent semantic manipulations for editing images, while also serving as a diagnostic tool for studying the limitations and trade-offs of concept discovery methods.
Submitted 28 June, 2024;
originally announced June 2024.
-
Text-based Transfer Function Design for Semantic Volume Rendering
Authors:
Sangwon Jeong,
Jixian Li,
Christopher Johnson,
Shusen Liu,
Matthew Berger
Abstract:
Transfer function design is crucial in volume rendering, as it directly influences the visual representation and interpretation of volumetric data. However, creating effective transfer functions that align with users' visual objectives is often challenging due to the complex parameter space and the semantic gap between transfer function values and features of interest within the volume. In this work, we propose a novel approach that leverages recent advancements in language-vision models to bridge this semantic gap. By employing a fully differentiable rendering pipeline and an image-based loss function guided by language descriptions, our method generates transfer functions that yield volume-rendered images closely matching the user's intent. We demonstrate the effectiveness of our approach in creating meaningful transfer functions from simple descriptions, empowering users to intuitively express their desired visual outcomes with minimal effort. This advancement streamlines the transfer function design process and makes volume rendering more accessible to a wider range of users.
Submitted 21 June, 2024;
originally announced June 2024.
-
Graphical Perception of Saliency-based Model Explanations
Authors:
Yayan Zhao,
Mingwei Li,
Matthew Berger
Abstract:
In recent years, considerable work has been devoted to explaining predictive, deep learning-based models, and in turn how to evaluate explanations. An important class of evaluation methods are ones that are human-centered, which typically require the communication of explanations through visualizations. And while visualization plays a critical role in perceiving and understanding model explanations, how visualization design impacts human perception of explanations remains poorly understood. In this work, we study the graphical perception of model explanations, specifically, saliency-based explanations for visual recognition models. We propose an experimental design to investigate how human perception is influenced by visualization design, wherein we study the task of alignment assessment, or whether a saliency map aligns with an object in an image. Our findings show that factors related to visualization design decisions, the type of alignment, and qualities of the saliency map all play important roles in how humans perceive saliency-based visual explanations.
Submitted 11 June, 2024;
originally announced June 2024.
-
CUPID: Contextual Understanding of Prompt-conditioned Image Distributions
Authors:
Yayan Zhao,
Mingwei Li,
Matthew Berger
Abstract:
We present CUPID: a visualization method for the contextual understanding of prompt-conditioned image distributions. CUPID targets the visual analysis of distributions produced by modern text-to-image generative models, wherein a user can specify a scene via natural language, and the model generates a set of images, each intended to satisfy the user's description. CUPID is designed to help understand the resulting distribution, using contextual cues to facilitate analysis: objects mentioned in the prompt, novel, synthesized objects not explicitly mentioned, and their potential relationships. Central to CUPID is a novel method for visualizing high-dimensional distributions, wherein contextualized embeddings of objects, those found within images, are mapped to a low-dimensional space via density-based embeddings. We show how such embeddings allow one to discover salient styles of objects within a distribution, as well as identify anomalous, or rare, object styles. Moreover, we introduce conditional density embeddings, whereby conditioning on a given object allows one to compare object dependencies within the distribution. We employ CUPID for analyzing image distributions produced by large-scale diffusion models, where our experimental results offer insights on language misunderstanding from such models and biases in object composition, while also providing an interface for discovery of typical, or rare, synthesized scenes.
Submitted 11 June, 2024;
originally announced June 2024.
-
Improving Memory Dependence Prediction with Static Analysis
Authors:
Luke Panayi,
Rohan Gandhi,
Jim Whittaker,
Vassilios Chouliaras,
Martin Berger,
Paul Kelly
Abstract:
This paper explores the potential of communicating information gained by static analysis from compilers to Out-of-Order (OoO) machines, focusing on the memory dependence predictor (MDP). The MDP enables loads to issue without all in-flight store addresses being known, with minimal memory order violations. We use LLVM to find loads with no dependencies and label them via their opcode. These labelled loads skip making lookups into the MDP, improving prediction accuracy by reducing false dependencies. We communicate this information in a minimally intrusive way, i.e. without introducing additional hardware costs or instruction bandwidth, providing these improvements without any additional overhead in the CPU. We find that in select cases in Spec2017, a significant number of load instructions can skip interacting with the MDP and lead to a performance gain. These results point to greater possibilities for static analysis as a source of near zero cost performance gains in future CPU designs.
Submitted 5 June, 2024; v1 submitted 12 March, 2024;
originally announced March 2024.
-
Quantum computing in civil engineering: Potentials and Limitations
Authors:
Joern Ploennigs,
Markus Berger,
Martin Mevissen,
Kay Smarsly
Abstract:
Quantum computing is a new computational paradigm with the potential to solve certain computationally challenging problems much faster than traditional approaches. Civil engineering encompasses many computationally challenging problems, which raises the question of how well suited quantum computing is to solving civil engineering problems and how much impact on the field can be expected when quantum computing is deployed to solve them. To address these questions, we will, in this paper, first introduce the fundamentals of quantum computing. Thereupon, we will analyze the problem classes to elucidate where quantum computing holds the potential to outperform traditional computers and, focusing on the limitations, where quantum computing is not considered the most suitable solution. Finally, we will review common complex computation use cases in civil engineering and evaluate the potential and the limitations of improving them with quantum computing.
Submitted 28 March, 2024; v1 submitted 22 February, 2024;
originally announced February 2024.
-
LTL learning on GPUs
Authors:
Mojtaba Valizadeh,
Nathanaël Fijalkow,
Martin Berger
Abstract:
Linear temporal logic (LTL) is widely used in industrial verification. LTL formulae can be learned from traces. Scaling LTL formula learning is an open problem. We implement the first GPU-based LTL learner using a novel form of enumerative program synthesis. The learner is sound and complete. Our benchmarks indicate that it handles traces at least 2048 times more numerous, and on average at least 46 times faster than existing state-of-the-art learners. This is achieved with, among others, novel branch-free LTL semantics that has $O(\log n)$ time complexity, where $n$ is trace length, while previous implementations are $O(n^2)$ or worse (assuming bitwise boolean operations and shifts by powers of 2 have unit costs -- a realistic assumption on modern processors).
Submitted 27 March, 2024; v1 submitted 19 February, 2024;
originally announced February 2024.
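The abstract of "LTL learning on GPUs" attributes part of the speed-up to branch-free semantics that need only $O(\log n)$ bitwise steps when shifts by powers of 2 have unit cost. The following is a minimal sketch of that general doubling idea for a single operator ("eventually"), not the authors' implementation. The trace encoding (bit `i` of an integer holds the truth value at position `i`) and the function names are assumptions made for illustration.

```python
def eventually(p: int, n: int) -> int:
    """Bit i of the result is 1 iff p holds at some position j with i <= j < n.

    Assumed encoding: bit i of `p` is the truth value of the proposition at
    trace position i. The loop runs about log2(n) times and has no
    data-dependent branching on the trace contents.
    """
    result = p & ((1 << n) - 1)   # keep only the n trace positions
    shift = 1
    while shift < n:
        # After this step, bit i covers positions i .. i + 2*shift - 1.
        result |= result >> shift
        shift *= 2
    return result


def eventually_naive(p: int, n: int) -> int:
    """Quadratic reference version, used only to check the sketch."""
    out = 0
    for i in range(n):
        if any((p >> j) & 1 for j in range(i, n)):
            out |= 1 << i
    return out


# The proposition holds at positions 2 and 5 of a trace of length 8.
print(bin(eventually(0b00100100, 8)))
```

The doubling loop replaces a scan over all later positions with a handful of shift-and-OR steps, which is the kind of operation the abstract's cost assumption refers to.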
-
Generative AI and the History of Architecture
Authors:
Joern Ploennigs,
Markus Berger
Abstract:
Recent generative AI platforms are able to create texts or impressive images from simple text prompts. This makes them powerful tools for summarizing knowledge about architectural history or deriving new creative work in early design tasks like ideation, sketching and modelling. But how well do generative AI models understand the history of architecture? Have they learned to properly distinguish styles, or are they hallucinating information? In this chapter, we investigate this question for generative AI platforms for text and image generation for different architectural styles, to understand the capabilities and boundaries of knowledge of those tools. We also analyze how they are already being used by analyzing a data set of 101 million Midjourney queries to see if and how practitioners are already querying for specific architectural concepts.
Submitted 22 December, 2023;
originally announced December 2023.
-
ArchiGuesser -- AI Art Architecture Educational Game
Authors:
Joern Ploennigs,
Markus Berger,
Eva Carnein
Abstract:
The use of generative AI in education is a controversial topic. Current technology offers the potential to create educational content from text, speech, to images based on simple input prompts. This can enhance productivity by summarizing knowledge and improving communication, quickly adjusting to different types of learners. Moreover, generative AI holds the promise of making the learning itself more fun, by responding to user inputs and dynamically generating high-quality creative material. In this paper we present the multisensory educational game ArchiGuesser that combines various AI technologies from large language models, image generation, to computer vision to serve a single purpose: Teaching students in a playful way the diversity of our architectural history and how generative AI works.
Submitted 14 December, 2023;
originally announced December 2023.
-
Proving the Potential of Skeleton Based Action Recognition to Automate the Analysis of Manual Processes
Authors:
Marlin Berger,
Frederik Cloppenburg,
Jens Eufinger,
Thomas Gries
Abstract:
In manufacturing sectors such as textiles and electronics, manual processes are a fundamental part of production. The analysis and monitoring of the processes is necessary for efficient production design. Traditional methods for analyzing manual processes are complex, expensive, and inflexible. Compared to established approaches such as Methods-Time-Measurement (MTM), machine learning (ML) methods promise higher flexibility, self-sufficient and permanent use, and lower costs. In this work, the current motion class in a manual assembly process is detected from a video stream. With information on the current motion, Key-Performance-Indicators (KPIs) can be derived easily. A skeleton-based action recognition approach is taken, as this field has recently shown major success in machine vision tasks. For skeleton-based action recognition in manual assembly, no sufficient prior work could be found. Therefore, an ML pipeline is developed to enable extensive research on different (pre-)processing methods and neural nets. Suitable, well-generalizing approaches are found, proving the potential of ML to enhance the analysis of manual processes. The models detect the current motion performed by an operator in manual assembly, but the results can be transferred to all kinds of manual processes.
Submitted 12 October, 2023;
originally announced October 2023.
-
Distribution-free risk assessment of regression-based machine learning algorithms
Authors:
Sukrita Singh,
Neeraj Sarna,
Yuanyuan Li,
Yang Li,
Agni Orfanoudaki,
Michael Berger
Abstract:
Machine learning algorithms have grown in sophistication over the years and are increasingly deployed for real-life applications. However, when using machine learning techniques in practical settings, particularly in high-risk applications such as medicine and engineering, obtaining the failure probability of the predictive model is critical. We refer to this problem as the risk-assessment task. We focus on regression algorithms and the risk-assessment task of computing the probability of the true label lying inside an interval defined around the model's prediction. We solve the risk-assessment problem using the conformal prediction approach, which provides prediction intervals that are guaranteed to contain the true label with a given probability. Using this coverage property, we prove that our approximated failure probability is conservative in the sense that it is not lower than the true failure probability of the ML algorithm. We conduct extensive experiments to empirically study the accuracy of the proposed method for problems with and without covariate shift. Our analysis focuses on different modeling regimes, dataset sizes, and conformal prediction methodologies.
Submitted 5 October, 2023;
originally announced October 2023.
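The risk-assessment abstract above relies on the coverage property of conformal prediction. As a rough illustration, the sketch below uses standard split conformal prediction with absolute-residual scores; it is one plausible reading of the setup, not the authors' algorithm, and the function names and the choice of score are assumptions. Under exchangeability of calibration and test points, the interval "prediction ± radius" misses the true label with probability at most the value returned by `failure_probability_bound`.

```python
import math


def conformal_radius(cal_preds, cal_labels, alpha):
    """Split-conformal half-width with target coverage at least 1 - alpha.

    Uses absolute residuals on a held-out calibration set as scores
    (an assumed, commonly used choice).
    """
    scores = sorted(abs(y - f) for f, y in zip(cal_preds, cal_labels))
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf          # too few calibration points for this alpha
    return scores[k - 1]         # k-th smallest calibration score


def failure_probability_bound(cal_preds, cal_labels, radius):
    """Conservative bound on P(|y - f(x)| > radius) for a fixed radius.

    If k calibration scores are <= radius, conformal coverage gives
    P(|y - f(x)| <= radius) >= k / (n + 1), so failure is at most 1 - k/(n+1).
    """
    n = len(cal_preds)
    k = sum(1 for f, y in zip(cal_preds, cal_labels) if abs(y - f) <= radius)
    return 1.0 - k / (n + 1)


# Toy calibration set: predictions are all 0, labels are 1..9.
preds = [0] * 9
labels = list(range(1, 10))
print(conformal_radius(preds, labels, alpha=0.5))
print(failure_probability_bound(preds, labels, radius=5))
```

The bound is conservative in the same sense as in the abstract: it never falls below the true failure probability when the exchangeability assumption holds, but it can be loose for small calibration sets.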
-
Extending Cross-Modal Retrieval with Interactive Learning to Improve Image Retrieval Performance in Forensics
Authors:
Nils Böhne,
Mark Berger,
Ronald van Velzen
Abstract:
Nowadays, one of the critical challenges in forensics is analyzing the enormous amounts of unstructured digital evidence, such as images. Often, unstructured digital evidence contains precious information for forensic investigations. Therefore, a retrieval system that can effectively identify forensically relevant images is paramount. In this work, we explored the effectiveness of interactive learning in improving image retrieval performance in the forensic domain by proposing Excalibur - a zero-shot cross-modal image retrieval system extended with interactive learning. Excalibur was evaluated using both simulations and a user study. The simulations reveal that interactive learning is highly effective in improving retrieval performance in the forensic domain. Furthermore, user study participants could effectively leverage the power of interactive learning. Finally, they considered Excalibur effective and straightforward to use and expressed interest in using it in their daily practice.
Submitted 28 August, 2023;
originally announced August 2023.
-
Correct and Optimal: the Regular Expression Inference Challenge
Authors:
Mojtaba Valizadeh,
Philip John Gorinski,
Ignacio Iacobacci,
Martin Berger
Abstract:
We propose regular expression inference (REI) as a challenge for code/language modelling, and the wider machine learning community. REI is a supervised machine learning (ML) and program optimisation task, and poses the problem of finding minimal regular expressions from examples: Given two finite sets of strings $P$ and $N$ and a cost function $cost(\cdot)$, the task is to generate an expression $r$ that accepts all strings in $P$ and rejects all strings in $N$, while no other such expression $r'$ exists with $cost(r')<cost(r)$. REI has advantages as a challenge problem: (i) regular expressions are well-known, widely used, and a natural idealisation of code; (ii) REI's asymptotic worst-case complexity is well understood; (iii) REI has a small number of easy to understand parameters (e.g. $P$ or $N$ cardinality, string lengths of examples, or the cost function); this lets us easily finetune REI-hardness; (iv) REI, with its emphasis on optimisation, is an unsolved problem for deep learning based ML. Recently, an REI solver was implemented on GPUs, using program synthesis techniques. This enabled, for the first time, fast generation of minimal regular expressions for complex REI instances. Building on this advance, we generate and publish the first large-scale datasets for REI, and devise and evaluate several initial heuristic and machine learning baselines. We invite the community to participate and explore ML methods that learn to solve REI problems. We believe that progress in REI directly translates to progress in code/language modelling.
Submitted 10 May, 2024; v1 submitted 15 August, 2023;
originally announced August 2023.
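The REI task stated in the abstract above (find a minimal-cost expression that accepts every string in $P$ and rejects every string in $N$) can be made concrete with a tiny brute-force search. This is only a sketch of the problem statement, not the GPU solver the abstract mentions. The cost function (pattern length), the symbol set, and the function names are assumptions chosen for illustration, and Python's `re` syntax stands in for the regular expressions of the paper.

```python
import itertools
import re


def is_precise(pattern, positives, negatives):
    """True iff `pattern` accepts all positives and rejects all negatives."""
    try:
        compiled = re.compile(pattern)
    except re.error:
        return False  # not a well-formed expression
    return (all(compiled.fullmatch(s) for s in positives)
            and not any(compiled.fullmatch(s) for s in negatives))


def infer_minimal(positives, negatives, symbols="01|*()", max_cost=4):
    """Enumerate patterns by increasing length; return the first precise one.

    Assumed cost function: cost(r) = len(r). Because candidates are tried in
    order of increasing length, the first hit has minimal cost under it.
    """
    for cost in range(1, max_cost + 1):
        for chars in itertools.product(symbols, repeat=cost):
            pattern = "".join(chars)
            if is_precise(pattern, positives, negatives):
                return pattern
    return None  # no precise expression within the cost budget


P = {"0", "00", "000"}
N = {"", "1", "01", "10"}
print(infer_minimal(P, N))
```

The search space grows exponentially with the cost bound, which gives a sense of why the abstract treats scalable, optimal REI as a hard benchmark.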
-
Automating Computational Design with Generative AI
Authors:
Joern Ploennigs,
Markus Berger
Abstract:
AI image generators based on diffusion models have recently garnered attention for their capability to create images from simple text prompts. However, for practical use in civil engineering they need to be able to create specific construction plans for given constraints. This paper investigates the potential of current AI generators in addressing such challenges, specifically for the creation of simple floor plans. We explain how the underlying diffusion models work and propose novel refinement approaches to improve semantic encoding and generation quality. In several experiments we show that we can improve the validity of generated floor plans from 6% to 90%. Based on these results we derive future research challenges considering building information modelling. With this we: (i) evaluate current generative AIs; (ii) propose improved refinement approaches; (iii) evaluate them on various examples; (iv) derive future directions for diffusion models in civil engineering.
Submitted 3 May, 2024; v1 submitted 5 July, 2023;
originally announced July 2023.
-
Search-Based Regular Expression Inference on a GPU
Authors:
Mojtaba Valizadeh,
Martin Berger
Abstract:
Regular expression inference (REI) is a supervised machine learning and program synthesis problem that takes a cost metric for regular expressions, and positive and negative examples of strings as input. It outputs a regular expression that is precise (i.e., accepts all positive and rejects all negative examples), and minimal w.r.t. the cost metric. We present a novel algorithm for REI over arbitrary alphabets that is enumerative and trades off time for space. Our main algorithmic idea is to implement the search space of regular expressions succinctly as a contiguous matrix of bitvectors. Collectively, the bitvectors represent, as characteristic sequences, all sub-languages of the infix-closure of the union of positive and negative examples. Mathematically, this is a semiring of (a variant of) formal power series. Infix-closure enables bottom-up compositional construction of larger from smaller regular expressions using the operations of our semiring. This minimises data movement and data-dependent branching, hence maximises data-parallelism. In addition, the infix-closure remains unchanged during the search, hence search can be staged: first pre-compute various expensive operations, and then run the compute intensive search process. We provide two C++ implementations, one for general purpose CPUs and one for Nvidia GPUs (using CUDA). We benchmark both on Google Colab Pro: the GPU implementation is on average over 1000x faster than the CPU implementation on the hardest benchmarks.
Submitted 29 May, 2023;
originally announced May 2023.
-
Segmentation of glioblastomas in early post-operative multi-modal MRI with deep neural networks
Authors:
Ragnhild Holden Helland,
Alexandros Ferles,
André Pedersen,
Ivar Kommers,
Hilko Ardon,
Frederik Barkhof,
Lorenzo Bello,
Mitchel S. Berger,
Tora Dunås,
Marco Conti Nibali,
Julia Furtner,
Shawn Hervey-Jumper,
Albert J. S. Idema,
Barbara Kiesel,
Rishi Nandoe Tewari,
Emmanuel Mandonnet,
Domenique M. J. Müller,
Pierre A. Robe,
Marco Rossi,
Lisa M. Sagberg,
Tommaso Sciortino,
Tom Aalders,
Michiel Wagemakers,
Georg Widhalm,
Marnix G. Witte
, et al. (8 additional authors not shown)
Abstract:
Extent of resection after surgery is one of the main prognostic factors for patients diagnosed with glioblastoma. To achieve this, accurate segmentation and classification of residual tumor from post-operative MR images is essential. The current standard method for estimating it is subject to high inter- and intra-rater variability, and an automated method for segmentation of residual tumor in early post-operative MRI could lead to a more accurate estimation of extent of resection. In this study, two state-of-the-art neural network architectures for pre-operative segmentation were trained for the task. The models were extensively validated on a multicenter dataset with nearly 1000 patients, from 12 hospitals in Europe and the United States. The best performance achieved was a 61% Dice score, and the best classification performance was about 80% balanced accuracy, with a demonstrated ability to generalize across hospitals. In addition, the segmentation performance of the best models was on par with human expert raters. The predicted segmentations can be used to accurately classify the patients into those with residual tumor, and those with gross total resection.
Submitted 18 April, 2023;
originally announced April 2023.
-
Artificial-intelligence-based molecular classification of diffuse gliomas using rapid, label-free optical imaging
Authors:
Todd C. Hollon,
Cheng Jiang,
Asadur Chowdury,
Mustafa Nasir-Moin,
Akhil Kondepudi,
Alexander Aabedi,
Arjun Adapa,
Wajd Al-Holou,
Jason Heth,
Oren Sagher,
Pedro Lowenstein,
Maria Castro,
Lisa Irina Wadiura,
Georg Widhalm,
Volker Neuschmelting,
David Reinecke,
Niklas von Spreckelsen,
Mitchel S. Berger,
Shawn L. Hervey-Jumper,
John G. Golfinos,
Matija Snuderl,
Sandra Camelo-Piragua,
Christian Freudiger,
Honglak Lee,
Daniel A. Orringer
Abstract:
Molecular classification has transformed the management of brain tumors by enabling more accurate prognostication and personalized treatment. However, timely molecular diagnostic testing for patients with brain tumors is limited, complicating surgical and adjuvant treatment and obstructing clinical trial enrollment. In this study, we developed DeepGlioma, a rapid ($< 90$ seconds), artificial-intelligence-based diagnostic screening system to streamline the molecular diagnosis of diffuse gliomas. DeepGlioma is trained using a multimodal dataset that includes stimulated Raman histology (SRH); a rapid, label-free, non-consumptive, optical imaging method; and large-scale, public genomic data. In a prospective, multicenter, international testing cohort of patients with diffuse glioma ($n=153$) who underwent real-time SRH imaging, we demonstrate that DeepGlioma can predict the molecular alterations used by the World Health Organization to define the adult-type diffuse glioma taxonomy (IDH mutation, 1p19q co-deletion and ATRX mutation), achieving a mean molecular classification accuracy of $93.3\pm 1.6\%$. Our results represent how artificial intelligence and optical histology can be used to provide a rapid and scalable adjunct to wet lab methods for the molecular screening of patients with diffuse glioma.
Submitted 23 March, 2023;
originally announced March 2023.
-
A modest proposal: explicit support for foundational pluralism
Authors:
Martin Berger,
Dominic P. Mulligan
Abstract:
Whilst mathematicians assume classical reasoning principles by default they often context switch when working, restricting themselves to various forms of subclassical reasoning. This pattern is especially common amongst logicians and set theorists, but workaday mathematicians also commonly do this too, witnessed by narrative notes accompanying a proof -- "the following proof is constructive", or "the following proof does not use choice", for example. Yet, current proof assistants provide poor support for capturing these narrative notes formally, an observation that is especially true of systems based on Gordon's HOL, a classical higher-order logic. Consequently, HOL and its many implementations seem ironically more committed to classical reasoning than mainstream mathematicians are themselves, limiting the mathematical content that one may easily formalise. To facilitate these context switches, we propose that mathematicians mentally employ a simple tainting system when temporarily working subclassically -- an idea not currently explored in proof assistants. We introduce a series of modest but far-reaching changes to HOL, extending the standard two-place Natural Deduction relation to incorporate a taint-label, taken from a particular lattice, and which describes or limits the "amount" of classical reasoning used within a proof. Taint can be seen either as a simple typing system on HOL proofs, or as a form of static analysis on proof trees, and partitions our logic into various fragments of differing expressivity, sitting side-by-side. Results may pass from a "less classical" fragment into a "more classical" fragment of the logic without modification, but not vice versa, with the flow of results between worlds controlled by an inference rule akin to a subtyping or subsumption rule.
Submitted 20 February, 2023;
originally announced February 2023.
-
AI Art in Architecture
Authors:
Joern Ploennigs,
Markus Berger
Abstract:
Recent diffusion-based AI art platforms are able to create impressive images from simple text descriptions. This makes them powerful tools for concept design in any discipline that requires creativity in visual design tasks. This is also true for early stages of architectural design with multiple stages of ideation, sketching and modelling. In this paper, we investigate how applicable diffusion-based models already are to these tasks. We research the applicability of the platforms Midjourney, DALL-E 2 and StableDiffusion to a series of common use cases in architectural design to determine which are already solvable or might soon be. We also examine how they are already being used by analyzing a data set of 40 million Midjourney queries with NLP methods to extract common usage patterns. With these insights, we derived a workflow for interior and exterior design that combines the strengths of the individual platforms.
Submitted 19 December, 2022;
originally announced December 2022.
-
ALARM: Active LeArning of Rowhammer Mitigations
Authors:
Amir Naseredini,
Martin Berger,
Matteo Sammartino,
Shale Xiong
Abstract:
Rowhammer is a serious security problem of contemporary dynamic random-access memory (DRAM) where reads or writes of bits can flip other bits. DRAM manufacturers add mitigations, but don't disclose details, making it difficult for customers to evaluate their efficacy. We present a tool, based on active learning, that automatically infers parameters of Rowhammer mitigations against synthetic models of modern DRAM.
Submitted 30 November, 2022;
originally announced November 2022.
-
Towards an objective characterization of an individual's facial movements using Self-Supervised Person-Specific-Models
Authors:
Yanis Tazi,
Michael Berger,
Winrich A. Freiwald
Abstract:
Disentangling facial movements from other facial characteristics, particularly from facial identity, remains a challenging task, as facial movements display great variation between individuals. In this paper, we aim to characterize individual-specific facial movements. We present a novel training approach to learn facial movements independently of other facial characteristics, focusing on each individual separately. We propose self-supervised Person-Specific Models (PSMs), in which one model per individual can learn to extract an embedding of the facial movements independently of the person's identity and other structural facial characteristics from unlabeled facial video. These models are trained using encoder-decoder-like architectures. We provide quantitative and qualitative evidence that a PSM learns a meaningful facial embedding that discovers fine-grained movements otherwise not characterized by a General Model (GM), which is trained across individuals and characterizes general patterns of facial movements. We present quantitative and qualitative evidence that this approach is easily scalable and generalizable for new individuals: facial movement knowledge learned on one person can quickly and effectively be transferred to a new person. Lastly, we propose a novel PSM using curriculum temporal learning to leverage the temporal contiguity between video frames. Our code, analysis details, and all pretrained models are available on GitHub and in the Supplementary Materials.
Submitted 15 November, 2022;
originally announced November 2022.
-
Integration-free Learning of Flow Maps
Authors:
Saroj Sahoo,
Matthew Berger
Abstract:
We present a method for learning neural representations of flow maps from time-varying vector field data. The flow map is pervasive within the area of flow visualization, as it is foundational to numerous visualization techniques, e.g. integral curve computation for pathlines or streaklines, as well as computing separation/attraction structures within the flow field. Yet bottlenecks in flow map computation, namely the numerical integration of vector fields, can easily inhibit their use within interactive visualization settings. In response, in our work we seek neural representations of flow maps that are efficient to evaluate, while remaining scalable to optimize, both in computation cost and data requirements. A key aspect of our approach is that we can frame the process of representation learning not in optimizing for samples of the flow map, but rather, a self-consistency criterion on flow map derivatives that eliminates the need for flow map samples, and thus numerical integration, altogether. Central to realizing this is a novel neural network design for flow maps, coupled with an optimization scheme, wherein our representation only requires the time-varying vector field for learning, encoded as instantaneous velocity. We show the benefits of our method over prior works in terms of accuracy and efficiency across a range of 2D and 3D time-varying vector fields, while showing how our neural representation of flow maps can benefit unsteady flow visualization techniques such as streaklines, and the finite-time Lyapunov exponent.
Submitted 25 March, 2023; v1 submitted 6 November, 2022;
originally announced November 2022.
-
Cluster-Based Autoencoders for Volumetric Point Clouds
Authors:
Stephan Antholzer,
Martin Berger,
Tobias Hell
Abstract:
Autoencoders make it possible to reconstruct a given input from a small set of parameters. However, the input size is often limited due to computational costs. We therefore propose a clustering and reassembling method for volumetric point clouds, in order to allow high resolution data as input. We furthermore present an autoencoder based on the well-known FoldingNet for volumetric point clouds and discuss how our approach can be utilized for blending between high resolution point clouds as well as for transferring a volumetric design/style onto a point cloud while maintaining its shape.
Submitted 2 November, 2022;
originally announced November 2022.
-
OA-SLAM: Leveraging Objects for Camera Relocalization in Visual SLAM
Authors:
Matthieu Zins,
Gilles Simon,
Marie-Odile Berger
Abstract:
In this work, we explore the use of objects in Simultaneous Localization and Mapping in unseen worlds and propose an object-aided system (OA-SLAM). More precisely, we show that, compared to low-level points, the major benefit of objects lies in their higher-level semantic and discriminating power. Points, on the contrary, have a better spatial localization accuracy than the generic coarse models used to represent objects (cuboid or ellipsoid). We show that combining points and objects is of great interest to address the problem of camera pose recovery. Our main contributions are: (1) we improve the relocalization ability of a SLAM system using high-level object landmarks; (2) we build an automatic system, capable of identifying, tracking and reconstructing objects with 3D ellipsoids; (3) we show that object-based localization can be used to reinitialize or resume camera tracking. Our fully automatic system allows on-the-fly object mapping and enhanced pose tracking recovery, which we think can significantly benefit the AR community. Our experiments show that the camera can be relocalized from viewpoints where classical methods fail. We demonstrate that this localization allows a SLAM system to continue working despite a tracking loss, which can happen frequently with an uninitiated user. Our code and test data are released at gitlab.inria.fr/tangram/oa-slam.
Submitted 17 September, 2022;
originally announced September 2022.
-
Perspective-1-Ellipsoid: Formulation, Analysis and Solutions of the Camera Pose Estimation Problem from One Ellipse-Ellipsoid Correspondence
Authors:
Vincent Gaudillière,
Gilles Simon,
Marie-Odile Berger
Abstract:
In computer vision, camera pose estimation from correspondences between 3D geometric entities and their projections into the image has been a widely investigated problem. Although most state-of-the-art methods exploit low-level primitives such as points or lines, the emergence of very effective CNN-based object detectors in recent years has paved the way to the use of higher-level features carrying semantically meaningful information. Pioneering works in that direction have shown that modelling 3D objects by ellipsoids and 2D detections by ellipses offers a convenient manner to link 2D and 3D data. However, the mathematical formalism most often used in the related literature does not make it easy to distinguish ellipsoids and ellipses from other quadrics and conics, leading to a loss of specificity potentially detrimental in some developments. Moreover, the linearization process of the projection equation creates an over-representation of the camera parameters, also possibly causing an efficiency loss. In this paper, we therefore introduce an ellipsoid-specific theoretical framework and demonstrate its beneficial properties in the context of pose estimation. More precisely, we first show that the proposed formalism makes it possible to reduce the pose estimation problem to a position or orientation-only estimation problem in which the remaining unknowns can be derived in closed-form. Then, we demonstrate that it can be further reduced to a 1 Degree-of-Freedom (1DoF) problem and provide the analytical derivations of the pose as a function of that unique scalar unknown. We illustrate our theoretical considerations by visual examples and include a discussion on the practical aspects. Finally, we release this paper along with the corresponding source code in order to contribute towards more efficient resolutions of ellipsoid-related pose estimation problems.
Submitted 14 June, 2023; v1 submitted 26 August, 2022;
originally announced August 2022.
-
Level Set-Based Camera Pose Estimation From Multiple 2D/3D Ellipse-Ellipsoid Correspondences
Authors:
Matthieu Zins,
Gilles Simon,
Marie-Odile Berger
Abstract:
In this paper, we propose an object-based camera pose estimation from a single RGB image and a pre-built map of objects, represented with ellipsoidal models. We show that contrary to point correspondences, the definition of a cost function characterizing the projection of a 3D object onto a 2D object detection is not straightforward. We develop an ellipse-ellipse cost based on level sets sampling, demonstrate its nice properties for handling partially visible objects and compare its performance with other common metrics. Finally, we show that the use of a predictive uncertainty on the detected ellipses allows a fair weighting of the contribution of the correspondences which improves the computed pose. The code is released at https://gitlab.inria.fr/tangram/level-set-based-camera-pose-estimation.
Submitted 19 August, 2022; v1 submitted 16 July, 2022;
originally announced July 2022.
-
How Do Drivers Self-Regulate their Secondary Task Engagements? The Effect of Driving Automation on Touchscreen Interactions and Glance Behavior
Authors:
Patrick Ebel,
Moritz Berger,
Christoph Lingenfelder,
Andreas Vogelsang
Abstract:
With ever-improving driver assistance systems and large touchscreens becoming the main in-vehicle interface, drivers are more tempted than ever to engage in distracting non-driving-related tasks. However, little research exists on how driving automation affects drivers' self-regulation when interacting with center stack touchscreens. To investigate this, we employ multilevel models on a real-world driving dataset consisting of 10,139 sequences. Our results show significant differences in drivers' interaction and glance behavior in response to varying levels of driving automation, vehicle speed, and road curvature. During partially automated driving, drivers are not only more likely to engage in secondary touchscreen tasks, but their mean glance duration toward the touchscreen also increases by 12% (Level 1) and 20% (Level 2) compared to manual driving. We further show that the effect of driving automation on drivers' self-regulation is larger than that of vehicle speed and road curvature. The derived knowledge can facilitate the safety evaluation of infotainment systems and the development of context-aware driver monitoring systems.
Submitted 12 July, 2022; v1 submitted 9 July, 2022;
originally announced July 2022.
-
Preoperative brain tumor imaging: models and software for segmentation and standardized reporting
Authors:
D. Bouget,
A. Pedersen,
A. S. Jakola,
V. Kavouridis,
K. E. Emblem,
R. S. Eijgelaar,
I. Kommers,
H. Ardon,
F. Barkhof,
L. Bello,
M. S. Berger,
M. C. Nibali,
J. Furtner,
S. Hervey-Jumper,
A. J. S. Idema,
B. Kiesel,
A. Kloet,
E. Mandonnet,
D. M. J. Müller,
P. A. Robe,
M. Rossi,
T. Sciortino,
W. Van den Brink,
M. Wagemakers,
G. Widhalm
, et al. (5 additional authors not shown)
Abstract:
For patients suffering from brain tumors, prognosis estimation and treatment decisions are made by a multidisciplinary team based on a set of preoperative MR scans. Currently, the lack of standardized and automatic methods for tumor detection and generation of clinical reports represents a major hurdle. In this study, we investigate glioblastomas, lower grade gliomas, meningiomas, and metastases, through four cohorts of up to 4000 patients. Tumor segmentation models were trained using the AGU-Net architecture with different preprocessing steps and protocols. Segmentation performances were assessed in-depth using a wide range of voxel and patient-wise metrics covering volume, distance, and probabilistic aspects. Finally, two software solutions have been developed, enabling an easy use of the trained models and standardized generation of clinical reports: Raidionics and Raidionics-Slicer. Segmentation performances were quite homogeneous across the four different brain tumor types, with an average true positive Dice ranging between 80% and 90%, patient-wise recall between 88% and 98%, and patient-wise precision around 95%. With our Raidionics software, running on a desktop computer with CPU support, tumor segmentation can be performed in 16 to 54 seconds depending on the dimensions of the MRI volume. For the generation of a standardized clinical report, including the tumor segmentation and features computation, 5 to 15 minutes are necessary. All trained models have been made open-access together with the source code for both software solutions and validation metrics computation. In the future, an automatic classification of the brain tumor type would be necessary to replace manual user input. Finally, the inclusion of post-operative segmentation in both software solutions will be key for generating complete post-operative standardized clinical reports.
Submitted 29 April, 2022;
originally announced April 2022.
-
Object-Based Visual Camera Pose Estimation From Ellipsoidal Model and 3D-Aware Ellipse Prediction
Authors:
Matthieu Zins,
Gilles Simon,
Marie-Odile Berger
Abstract:
In this paper, we propose a method for initial camera pose estimation from just a single image which is robust to viewing conditions and does not require a detailed model of the scene. This method meets the growing need for easy deployment of robotics or augmented reality applications in any environment, especially those for which neither an accurate 3D model nor a large amount of ground truth data is available. It exploits the ability of deep learning techniques to reliably detect objects regardless of viewing conditions. Previous works have also shown that abstracting the geometry of a scene of objects by an ellipsoid cloud allows the camera pose to be computed accurately enough for various application needs. Though promising, these approaches use the ellipses fitted to the detection bounding boxes as an approximation of the imaged objects. In this paper, we go one step further and propose a learning-based method which detects improved elliptic approximations of objects which are coherent with the 3D ellipsoids in terms of perspective projection. Experiments prove that the accuracy of the computed pose significantly increases thanks to our method. This is achieved with very little effort in terms of training data acquisition - a few hundred calibrated images of which only three need manual object annotation. Code and models are released at https://gitlab.inria.fr/tangram/3d-aware-ellipses-for-visual-localization
Submitted 9 March, 2022;
originally announced March 2022.
-
Systematic Analysis of Programming Languages and Their Execution Environments for Spectre Attacks
Authors:
Amir Naseredini,
Stefan Gast,
Martin Schwarzl,
Pedro Miguel Sousa Bernardo,
Amel Smajic,
Claudio Canella,
Martin Berger,
Daniel Gruss
Abstract:
In this paper, we analyze the security of programming languages and their execution environments (compilers and interpreters) with respect to Spectre attacks. The analysis shows that only 16 out of 42 execution environments have mitigations against at least one Spectre variant, i.e., 26 have no mitigations against any Spectre variant. Using our novel tool Speconnector, we develop Spectre proof-of-concept attacks in 8 programming languages and on code generated by 11 execution environments that were previously not known to be affected. Our results highlight some programming languages that are used to implement security-critical code, but remain entirely unprotected, even three years after the discovery of Spectre.
Submitted 24 November, 2021;
originally announced November 2021.
-
STFT-LDA: An Algorithm to Facilitate the Visual Analysis of Building Seismic Responses
Authors:
Zhenge Zhao,
Danilo Motta,
Matthew Berger,
Joshua A. Levine,
Ismail B. Kuzucu,
Robert B. Fleischman,
Afonso Paiva,
Carlos Scheidegger
Abstract:
Civil engineers use numerical simulations of a building's responses to seismic forces to understand the nature of building failures, the limitations of building codes, and how to determine the latter to prevent the former. Such simulations generate large ensembles of multivariate, multiattribute time series. Comprehensive understanding of this data requires techniques that support the multivariate nature of the time series and can compare behaviors that are both periodic and non-periodic across multiple time scales and multiple time series themselves. In this paper, we present a novel technique to extract such patterns from time series generated from simulations of seismic responses. The core of our approach is the use of topic modeling, where topics correspond to interpretable and discriminative features of the earthquakes. We transform the raw time series data into a time series of topics, and use this visual summary to compare temporal patterns in earthquakes, query earthquakes via the topics across arbitrary time scales, and enable details on demand by linking the topic visualization with the original earthquake data. We show, through a surrogate task and an expert study, that this technique allows analysts to more easily identify recurring patterns in such time series. By integrating this technique in a prototype system, we show how it enables novel forms of visual interaction.
Submitted 1 September, 2021;
originally announced September 2021.
-
Using Probabilistic Movement Primitives in Analyzing Human Motion Difference under Transcranial Current Stimulation
Authors:
Honghu Xue,
Rebecca Herzog,
Till M Berger,
Tobias Bäumer,
Anne Weissbach,
Elmar Rueckert
Abstract:
In medical tasks such as human motion analysis, computer-aided auxiliary systems have become the preferred choice for human experts owing to their high efficiency. However, conventional approaches are typically based on user-defined features such as movement onset times, peak velocities, motion vectors or frequency domain analyses. Such approaches entail careful data post-processing or specific domain knowledge to achieve a meaningful feature extraction. Besides, they are prone to noise and the manually defined features can hardly be re-used for other analyses. In this paper, we propose probabilistic movement primitives (ProMPs), a widely-used approach in robot skill learning, to model human motions. The benefit of ProMPs is that the features are directly learned from the data and ProMPs can capture important features describing the trajectory shape, which can easily be extended to other tasks. Distinct from previous research, where classification tasks are mostly investigated, we apply ProMPs together with a variant of Kullback-Leibler (KL) divergence to quantify the effect of different transcranial current stimulation methods on human motions. We present an initial result with 10 participants. The results validate ProMPs as a robust and effective feature extractor for human motions.
Submitted 5 July, 2021;
originally announced July 2021.
-
Defending Democracy: Using Deep Learning to Identify and Prevent Misinformation
Authors:
Anusua Trivedi,
Alyssa Suhm,
Prathamesh Mahankal,
Subhiksha Mukuntharaj,
Meghana D. Parab,
Malvika Mohan,
Meredith Berger,
Arathi Sethumadhavan,
Ashish Jaiman,
Rahul Dodhia
Abstract:
The rise in online misinformation in recent years threatens democracies by distorting authentic public discourse and causing confusion, fear, and even, in extreme cases, violence. There is a need to understand the spread of false content through online networks for developing interventions that disrupt misinformation before it achieves virality. Using a Deep Bidirectional Transformer for Language Understanding (BERT) and propagation graphs, this study classifies and visualizes the spread of misinformation on a social media network using publicly available Twitter data. The results confirm prior research around user clusters and the virality of false content while improving the precision of deep learning models for misinformation detection. The study further demonstrates the suitability of BERT for providing a scalable model for false information detection, which can contribute to the development of more timely and accurate interventions to slow the spread of misinformation in online environments.
Submitted 3 June, 2021;
originally announced June 2021.
-
3D-Aware Ellipse Prediction for Object-Based Camera Pose Estimation
Authors:
Matthieu Zins,
Gilles Simon,
Marie-Odile Berger
Abstract:
In this paper, we propose a method for coarse camera pose computation which is robust to viewing conditions and does not require a detailed model of the scene. This method meets the growing need for easy deployment of robotics or augmented reality applications in any environment, especially those for which neither an accurate 3D model nor a large amount of ground truth data is available. It exploits the ability of deep learning techniques to reliably detect objects regardless of viewing conditions. Previous works have also shown that abstracting the geometry of a scene of objects by an ellipsoid cloud allows the camera pose to be computed accurately enough for various application needs. Though promising, these approaches use the ellipses fitted to the detection bounding boxes as an approximation of the imaged objects. In this paper, we go one step further and propose a learning-based method which detects improved elliptic approximations of objects which are coherent with the 3D ellipsoid in terms of perspective projection. Experiments prove that the accuracy of the computed pose significantly increases thanks to our method and is more robust to the variability of the boundaries of the detection boxes. This is achieved with very little effort in terms of training data acquisition -- a few hundred calibrated images of which only three need manual object annotation. Code and models are released at https://github.com/zinsmatt/3D-Aware-Ellipses-for-Visual-Localization.
Submitted 24 May, 2021;
originally announced May 2021.
-
Compressive Neural Representations of Volumetric Scalar Fields
Authors:
Yuzhe Lu,
Kairong Jiang,
Joshua A. Levine,
Matthew Berger
Abstract:
We present an approach for compressing volumetric scalar fields using implicit neural representations. Our approach represents a scalar field as a learned function, wherein a neural network maps a point in the domain to an output scalar value. By setting the number of weights of the neural network to be smaller than the input size, we achieve compressed representations of scalar fields, thus framing compression as a type of function approximation. Combined with carefully quantizing network weights, we show that this approach yields highly compact representations that outperform state-of-the-art volume compression approaches. The conceptual simplicity of our approach enables a number of benefits, such as support for time-varying scalar fields, optimizing to preserve spatial gradients, and random-access field evaluation. We study the impact of network design choices on compression performance, highlighting how simple network architectures are effective for a broad range of volumes.
Submitted 11 April, 2021;
originally announced April 2021.
-
Remote Renewable Hubs For Carbon-Neutral Synthetic Fuel Production
Authors:
Mathias Berger,
David Radu,
Ghislain Detienne,
Thierry Deschuyteneer,
Aurore Richel,
Damien Ernst
Abstract:
This paper studies the economics of carbon-neutral synthetic fuel production from renewable electricity in remote areas where high-quality renewable resources are abundant. To this end, a graph-based optimisation modelling framework directly applicable to the strategic planning of remote renewable energy supply chains is proposed. More precisely, a hypergraph abstraction of planning problems is introduced, wherein nodes can be viewed as optimisation subproblems with their own parameters, variables, constraints and local objective. Nodes typically represent a subsystem such as a technology, a plant or a process. Hyperedges, on the other hand, express the connectivity between subsystems. The framework is leveraged to study the economics of carbon-neutral synthetic methane production from solar and wind energy in North Africa and its delivery to Northwestern European markets. The full supply chain is modelled in an integrated fashion, which makes it possible to accurately capture the interaction between various technologies on an hourly time scale. Results suggest that the cost of synthetic methane production and delivery would be slightly under 150 EUR/MWh (higher heating value) by 2030 for a system supplying 10 TWh annually and relying on a combination of solar photovoltaic and wind power plants, assuming a uniform weighted average cost of capital of 7%. A comprehensive sensitivity analysis is also carried out in order to assess the impact of various techno-economic parameters and assumptions on synthetic methane cost, including the availability of wind power plants, the investment costs of electrolysis, methanation and direct air capture plants, their operational flexibility, the energy consumption of direct air capture plants, and financing costs.
Submitted 10 June, 2021; v1 submitted 22 February, 2021;
originally announced February 2021.
-
A program logic for fresh name generation
Authors:
Harold Pancho Eliott,
Martin Berger
Abstract:
We present a program logic for Pitts and Stark's ν-calculus, an extension of the call-by-value simply-typed λ-calculus with a mechanism for the generation of fresh names. Names can be compared for (in)-equality, producing programs with subtle observable properties. Hidden names produced by interactions between generation and abstraction are captured logically with a second-order quantifier over type contexts. We illustrate usage of the logic through reasoning about well-known difficult cases from the literature.
Submitted 12 March, 2021; v1 submitted 26 January, 2021;
originally announced January 2021.
-
Effective Distributed Representations for Academic Expert Search
Authors:
Mark Berger,
Jakub Zavrel,
Paul Groth
Abstract:
Expert search aims to find and rank experts based on a user's query. In academia, retrieving experts is an efficient way to navigate through a large amount of academic knowledge. Here, we study how different distributed representations of academic papers (i.e. embeddings) impact academic expert retrieval. We use the Microsoft Academic Graph dataset and experiment with different configurations of a document-centric voting model for retrieval. In particular, we explore the impact of the use of contextualized embeddings on search performance. We also present results for paper embeddings that incorporate citation information through retrofitting. Additionally, experiments are conducted using different techniques for assigning author weights based on author order. We observe that using contextual embeddings produced by a transformer model trained for sentence similarity tasks produces the most effective paper representations for document-centric expert retrieval. However, retrofitting the paper embeddings and using elaborate author contribution weighting strategies did not improve retrieval performance.
Submitted 16 October, 2020;
originally announced October 2020.
-
Attention Flows: Analyzing and Comparing Attention Mechanisms in Language Models
Authors:
Joseph F DeRose,
Jiayao Wang,
Matthew Berger
Abstract:
Advances in language modeling have led to the development of deep attention-based models that are performant across a wide variety of natural language processing (NLP) problems. These language models are typified by a pre-training process on large unlabeled text corpora and subsequently fine-tuned for specific tasks. Although considerable work has been devoted to understanding the attention mechanisms of pre-trained models, it is less understood how a model's attention mechanisms change when trained for a target NLP task. In this paper, we propose a visual analytics approach to understanding fine-tuning in attention-based language models. Our visualization, Attention Flows, is designed to support users in querying, tracing, and comparing attention within layers, across layers, and amongst attention heads in Transformer-based language models. To help users gain insight on how a classification decision is made, our design is centered on depicting classification-based attention at the deepest layer and how attention from prior layers flows throughout words in the input. Attention Flows supports the analysis of a single model, as well as the visual comparison between pre-trained and fine-tuned models via their similarities and differences. We use Attention Flows to study attention mechanisms in various sentence understanding tasks and highlight how attention evolves to address the nuances of solving these tasks.
Submitted 3 September, 2020;
originally announced September 2020.
-
Visually Analyzing and Steering Zero Shot Learning
Authors:
Saroj Sahoo,
Matthew Berger
Abstract:
We propose a visual analytics system to help a user analyze and steer zero-shot learning models. Zero-shot learning has emerged as a viable scenario for categorizing data that consists of no labeled examples, and thus a promising approach to minimize data annotation from humans. However, it is challenging to understand where zero-shot learning fails, the cause of such failures, and how a user can modify the model to prevent such failures. Our visualization system is designed to help users diagnose and understand mispredictions in such models, so that they may gain insight on the behavior of a model when applied to data associated with categories not seen during training. Through usage scenarios, we highlight how our system can help a user improve performance in zero-shot learning.
Submitted 11 September, 2020;
originally announced September 2020.
-
Visually Analyzing Contextualized Embeddings
Authors:
Matthew Berger
Abstract:
In this paper we introduce a method for visually analyzing contextualized embeddings produced by deep neural network-based language models. Our approach is inspired by linguistic probes for natural language processing, where tasks are designed to probe language models for linguistic structure, such as parts-of-speech and named entities. These approaches are largely confirmatory, however, only enabling a user to test for information known a priori. In this work, we eschew supervised probing tasks, and advocate for unsupervised probes, coupled with visual exploration techniques, to assess what is learned by language models. Specifically, we cluster contextualized embeddings produced from a large text corpus, and introduce a visualization design based on this clustering and textual structure - cluster co-occurrences, cluster spans, and cluster-word membership - to help elicit the functionality of, and relationship between, individual clusters. User feedback highlights the benefits of our design in discovering different types of linguistic structures.
Submitted 5 September, 2020;
originally announced September 2020.
-
PRAGMA: Interactively Constructing Functional Brain Parcellations
Authors:
Roza G. Bayrak,
Nhung Hoang,
Colin B. Hansen,
Catie Chang,
Matthew Berger
Abstract:
A prominent goal of neuroimaging studies is mapping the human brain, in order to identify and delineate functionally-meaningful regions and elucidate their roles in cognitive behaviors. These brain regions are typically represented by atlases that capture general trends over large populations. Despite being indispensable to neuroimaging experts, population-level atlases do not capture individual differences in functional organization. In this work, we present an interactive visualization method, PRAGMA, that allows domain experts to derive scan-specific parcellations from established atlases. PRAGMA features a user-driven, hierarchical clustering scheme for defining temporally correlated parcels in varying granularity. The visualization design supports the user in making decisions on how to perform clustering, namely when to expand, collapse, or merge parcels. This is accomplished through a set of linked and coordinated views for understanding the user's current hierarchy, assessing intra-cluster variation, and relating parcellations to an established atlas. We assess the effectiveness of PRAGMA through a user study with four neuroimaging domain experts; our results indicate that PRAGMA has the potential to enable exploration of individualized and state-specific brain parcellations and to offer interesting insights into functional brain networks.
Submitted 3 September, 2020;
originally announced September 2020.
-
Jointly Learning Environments and Control Policies with Projected Stochastic Gradient Ascent
Authors:
Adrien Bolland,
Ioannis Boukas,
Mathias Berger,
Damien Ernst
Abstract:
We consider the joint design and control of discrete-time stochastic dynamical systems over a finite time horizon. We formulate the problem as a multi-step optimization problem under uncertainty seeking to identify a system design and a control policy that jointly maximize the expected sum of rewards collected over the time horizon considered. The transition function, the reward function and the policy are all parametrized, assumed known and differentiable with respect to their parameters. We then introduce a deep reinforcement learning algorithm combining policy gradient methods with model-based optimization techniques to solve this problem. In essence, our algorithm iteratively approximates the gradient of the expected return via Monte-Carlo sampling and automatic differentiation and takes projected gradient ascent steps in the space of environment and policy parameters. This algorithm is referred to as Direct Environment and Policy Search (DEPS). We assess the performance of our algorithm in three environments concerned with the design and control of a mass-spring-damper system, a small-scale off-grid power system and a drone, respectively. In addition, our algorithm is benchmarked against a state-of-the-art deep reinforcement learning algorithm used to tackle joint design and control problems. We show that DEPS performs at least as well or better in all three environments, consistently yielding solutions with higher returns in fewer iterations. Finally, solutions produced by our algorithm are also compared with solutions produced by an algorithm that does not jointly optimize environment and policy parameters, highlighting the fact that higher returns can be achieved when joint optimization is performed.
Submitted 6 January, 2022; v1 submitted 2 June, 2020;
originally announced June 2020.
-
Visualization of Unsteady Flow Using Heat Kernel Signatures
Authors:
Kairong Jiang,
Matthew Berger,
Joshua A. Levine
Abstract:
We introduce a new technique to visualize complex flowing phenomena by using concepts from shape analysis. Our approach uses techniques that examine the intrinsic geometry of manifolds through their heat kernel, to obtain representations of such manifolds that are isometry-invariant and multi-scale. These representations permit us to compute heat kernel signatures of each point on that manifold, and we can use these signatures as features for classification and segmentation that identify points that have similar structural properties. Our approach adapts heat kernel signatures to unsteady flows by formulating a notion of shape where pathlines are observations of a manifold living in a high-dimensional space. We use this space to compute and visualize heat kernel signatures associated with each pathline. Besides being able to capture the structural features of a pathline, heat kernel signatures allow the comparison of pathlines from different flow datasets through a shape matching pipeline. We demonstrate the analytic power of heat kernel signatures by comparing both (1) different timesteps from the same unsteady flow as well as (2) flow datasets taken from ensemble simulations with varying simulation parameters. Our analysis only requires the pathlines themselves, and thus it does not utilize the underlying vector field directly. We make minimal assumptions on the pathlines: while we assume they are sampled from a continuous, unsteady flow, our computations can tolerate pathlines that have varying density and potential unknown boundaries. We evaluate our approach through visualizations of a variety of two-dimensional unsteady flows.
Submitted 28 April, 2020;
originally announced April 2020.
-
Towards Discourse Parsing-inspired Semantic Storytelling
Authors:
Georg Rehm,
Karolina Zaczynska,
Julián Moreno-Schneider,
Malte Ostendorff,
Peter Bourgonje,
Maria Berger,
Jens Rauenbusch,
André Schmidt,
Mikka Wild
Abstract:
Previous work of ours on Semantic Storytelling uses text analytics procedures including Named Entity Recognition and Event Detection. In this paper, we outline our longer-term vision on Semantic Storytelling and describe the current conceptual and technical approach. In the project that drives our research we develop AI-based technologies that are verified by partners from industry. One long-term goal is the development of an approach for Semantic Storytelling that has broad coverage and that is, furthermore, robust. We provide first results on experiments that involve discourse parsing, applied to a concrete use case, "Explore the Neighbourhood!", which is based on a semi-automatically collected data set with documents about noteworthy people in one of Berlin's districts. Though automatically obtaining annotations for coherence relations from plain text is a non-trivial challenge, our preliminary results are promising. We envision our approach to be combined with additional features (NER, coreference resolution, knowledge graphs
Submitted 25 April, 2020;
originally announced April 2020.
-
European Language Grid: An Overview
Authors:
Georg Rehm,
Maria Berger,
Ela Elsholz,
Stefanie Hegele,
Florian Kintzel,
Katrin Marheinecke,
Stelios Piperidis,
Miltos Deligiannis,
Dimitris Galanis,
Katerina Gkirtzou,
Penny Labropoulou,
Kalina Bontcheva,
David Jones,
Ian Roberts,
Jan Hajic,
Jana Hamrlová,
Lukáš Kačena,
Khalid Choukri,
Victoria Arranz,
Andrejs Vasiļjevs,
Orians Anvari,
Andis Lagzdiņš,
Jūlija Meļņika,
Gerhard Backfried,
Erinç Dikici, et al. (11 additional authors not shown)
Abstract:
With 24 official EU and many additional languages, multilingualism in Europe and an inclusive Digital Single Market can only be enabled through Language Technologies (LTs). European LT business is dominated by hundreds of SMEs and a few large players. Many are world-class, with technologies that outperform the global players. However, European LT business is also fragmented, by nation states, languages, verticals and sectors, significantly holding back its impact. The European Language Grid (ELG) project addresses this fragmentation by establishing the ELG as the primary platform for LT in Europe. The ELG is a scalable cloud platform, providing, in an easy-to-integrate way, access to hundreds of commercial and non-commercial LTs for all European languages, including running tools and services as well as data sets and resources. Once fully operational, it will enable the commercial and non-commercial European LT community to deposit and upload their technologies and data sets into the ELG, to deploy them through the grid, and to connect with other resources. The ELG will boost the Multilingual Digital Single Market towards a thriving European LT community, creating new jobs and opportunities. Furthermore, the ELG project organises two open calls for up to 20 pilot projects. It also sets up 32 National Competence Centres (NCCs) and the European LT Council (LTC) for outreach and coordination purposes.
Submitted 30 March, 2020;
originally announced March 2020.
-
Making Metadata Fit for Next Generation Language Technology Platforms: The Metadata Schema of the European Language Grid
Authors:
Penny Labropoulou,
Katerina Gkirtzou,
Maria Gavriilidou,
Miltos Deligiannis,
Dimitrios Galanis,
Stelios Piperidis,
Georg Rehm,
Maria Berger,
Valérie Mapelli,
Mickaël Rigault,
Victoria Arranz,
Khalid Choukri,
Gerhard Backfried,
José Manuel Gómez Pérez,
Andres Garcia Silva
Abstract:
The current scientific and technological landscape is characterised by the increasing availability of data resources and processing tools and services. In this setting, metadata have emerged as a key factor facilitating management, sharing and usage of such digital assets. In this paper we present ELG-SHARE, a rich metadata schema catering for the description of Language Resources and Technologies (processing and generation services and tools, models, corpora, term lists, etc.), as well as related entities (e.g., organizations, projects, supporting documents, etc.). The schema powers the European Language Grid platform that aims to be the primary hub and marketplace for industry-relevant Language Technology in Europe. ELG-SHARE has been based on various metadata schemas, vocabularies, and ontologies, as well as related recommendations and guidelines.
Submitted 30 March, 2020;
originally announced March 2020.
-
Views on Quality Requirements in Academia and Practice: Commonalities, Differences, and Context-Dependent Grey Areas
Authors:
Andreas Vogelsang,
Jonas Eckhardt,
Daniel Mendez,
Moritz Berger
Abstract:
Context: Quality requirements (QRs) are a topic of constant discussion both in industry and academia. Debates revolve around the definition of quality requirements, how to handle them, and their importance for project success. While many academic endeavors contribute to the body of knowledge about QRs, practitioners may have different views. In fact, we still lack a consistent body of knowledge on QRs since much of the discussion around this topic is still dominated by observations that are strongly context-dependent. This holds for both academic and practitioners' views. Our assumption is that, in consequence, those views may differ. Objective: We report on a study to better understand the extent to which available research statements on quality requirements, as found in exemplary peer-reviewed and frequently cited publications, are reflected in the perception of practitioners. Our goal is to analyze differences, commonalities, and context-dependent grey areas in the views of academics and practitioners to allow a discussion on potential misconceptions (on either side) and opportunities for future research. Method: We conducted a survey with 109 practitioners to assess whether they agree with research statements about QRs reflected in the literature. Based on a statistical model, we evaluate the impact of a set of context factors on the perception of research statements. Results: Our results show that a majority of the statements, though not all of them, are well respected by practitioners. When examining the different groups and backgrounds of respondents, we noticed interesting deviations in perceptions within different groups that may lead to new research questions. Conclusions: Our results help identify prevalent context-dependent differences in how academics and practitioners view QRs and pinpoint statements where further research might be useful.
Submitted 7 February, 2020;
originally announced February 2020.
-
Elastic registration based on compliance analysis and biomechanical graph matching
Authors:
Jaime Garcia Guevara,
Igor Peterlik,
Marie-Odile Berger,
Stéphane Cotin
Abstract:
An automatic elastic registration method suited for vascularized organs is proposed. The vasculature in both the pre-operative and intra-operative images is represented as a graph. A typical application of this method is the fusion of pre-operative information onto the organ during surgery, to compensate for the limited details provided by the intra-operative imaging modality (e.g. CBCT) and to cope with changes in the shape of the organ. Due to differences between imaging modalities and organ deformation, each graph has a different topology and shape. The Adaptive Compliance Graph Matching (ACGM) method presented does not require any manual initialization, handles intra-operative nonrigid deformations of up to 65 mm and computes a complete displacement field over the organ from only the matched vasculature. ACGM is better than the previous Biomechanical Graph Matching (BGM) method because it uses an efficient biomechanical vascularized liver model to compute the organ's transformation and the compliance of the vessel bifurcations. This makes it possible to efficiently find the best graph matches with a novel compliance-based adaptive search. These contributions are evaluated on ten realistic synthetic and two real porcine automatically segmented datasets. ACGM obtains better target registration error (TRE) than BGM, with an average TRE in the real datasets of 4.2 mm compared to 6.5 mm, respectively. It is also up to one order of magnitude faster, less dependent on the parameters used and more robust to noise.
Submitted 13 December, 2019;
originally announced December 2019.
-
Enriching BERT with Knowledge Graph Embeddings for Document Classification
Authors:
Malte Ostendorff,
Peter Bourgonje,
Maria Berger,
Julian Moreno-Schneider,
Georg Rehm,
Bela Gipp
Abstract:
In this paper, we focus on the classification of books using short descriptive texts (cover blurbs) and additional metadata. Building upon BERT, a deep neural language model, we demonstrate how to combine text representations with metadata and knowledge graph embeddings, which encode author information. Compared to the standard BERT approach, we achieve considerably better results for the classification task. For a more coarse-grained classification using eight labels we achieve an F1-score of 87.20, while a detailed classification using 343 labels yields an F1-score of 64.70. We make the source code and trained models of our experiments publicly available.
Submitted 18 September, 2019;
originally announced September 2019.