-
TraM-NeRF: Tracing Mirror and Near-Perfect Specular Reflections through Neural Radiance Fields
Authors:
Leif Van Holland,
Ruben Bliersbach,
Jan U. Müller,
Patrick Stotko,
Reinhard Klein
Abstract:
Implicit representations like Neural Radiance Fields (NeRF) showed impressive results for photorealistic rendering of complex scenes with fine details. However, ideal or near-perfectly specular reflecting objects such as mirrors, which are often encountered in various indoor scenes, impose ambiguities and inconsistencies in the representation of the reconstructed scene leading to severe artifacts in the synthesized renderings. In this paper, we present a novel reflection tracing method tailored for the involved volume rendering within NeRF that takes these mirror-like objects into account while avoiding the cost of straightforward but expensive extensions through standard path tracing. By explicitly modeling the reflection behavior using physically plausible materials and estimating the reflected radiance with Monte-Carlo methods within the volume rendering formulation, we derive efficient strategies for importance sampling and the transmittance computation along rays from only a few samples. We show that our novel method enables the training of consistent representations of such challenging scenes and achieves superior results in comparison to previous state-of-the-art approaches.
Submitted 16 October, 2023;
originally announced October 2023.
-
Hidden Markov Modeling for Maximum Likelihood Neuron Reconstruction
Authors:
Thomas L. Athey,
Daniel J. Tward,
Ulrich Mueller,
Joshua T. Vogelstein,
Michael I. Miller
Abstract:
Recent advances in brain clearing and imaging have made it possible to image entire mammalian brains at sub-micron resolution. These images offer the potential to assemble brain-wide atlases of neuron morphology, but manual neuron reconstruction remains a bottleneck. Several automatic reconstruction algorithms exist, but most focus on single neuron images. In this paper, we present a probabilistic reconstruction method, ViterBrain, which combines a hidden Markov state process that encodes neuron geometry with a random field appearance model of neuron fluorescence. Our method utilizes dynamic programming to compute the global maximizers of what we call the "most probable" neuron path. Our most probable estimation method models the task of reconstructing neuronal processes in the presence of other neurons, and thus is applicable in images with several neurons. Our method operates on image segmentations in order to leverage cutting edge computer vision technology. We applied our algorithm to imperfect image segmentations where false negatives severed neuronal processes, and showed that it can follow axons in the presence of noise or nearby neurons. Additionally, it creates a framework where users can intervene to, for example, fit start and endpoints. The code used in this work is available in our open-source Python package brainlit.
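The dynamic-programming step mentioned in the abstract can be illustrated with a generic Viterbi recursion over a small hidden Markov model. This is a minimal sketch, not ViterBrain's implementation or the brainlit API: the function name `viterbi`, the argument layout, and the toy probabilities are all assumptions made for illustration.

```python
import math

def viterbi(log_init, log_trans, log_emit):
    """Return the most probable state sequence and its log-probability.

    log_init[s]     : log P(first state = s)
    log_trans[r][s] : log P(next state = s | current state = r)
    log_emit[t][s]  : log P(observation at step t | state = s)
    (Hypothetical argument layout, chosen only for this sketch.)
    """
    n_states = len(log_init)
    # Best log-score of any path ending in each state at step 0.
    score = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    back = []  # back[t-1][s] = best predecessor of state s at step t
    for t in range(1, len(log_emit)):
        new_score, pointers = [], []
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda r: score[r] + log_trans[r][s])
            new_score.append(score[best_prev] + log_trans[best_prev][s]
                             + log_emit[t][s])
            pointers.append(best_prev)
        score = new_score
        back.append(pointers)
    # Trace the globally best path backwards from the best final state.
    last = max(range(n_states), key=lambda s: score[s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, score[last]

# Toy two-state model (made-up numbers).
log = math.log
toy_init = [log(0.5), log(0.5)]
toy_trans = [[log(0.9), log(0.1)], [log(0.1), log(0.9)]]
toy_emit = [[log(0.9), log(0.1)], [log(0.4), log(0.6)], [log(0.9), log(0.1)]]
toy_path, toy_logp = viterbi(toy_init, toy_trans, toy_emit)
```

In this toy model the strong self-transitions keep the path in state 0 even though the middle observation slightly favors state 1, which is the kind of global trade-off a maximum-likelihood path makes.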
Submitted 27 January, 2022; v1 submitted 4 June, 2021;
originally announced June 2021.
-
Fitting Splines to Axonal Arbors Quantifies Relationship between Branch Order and Geometry
Authors:
Thomas L. Athey,
Jacopo Teneggi,
Joshua T. Vogelstein,
Daniel Tward,
Ulrich Mueller,
Michael I. Miller
Abstract:
Neuromorphology is crucial to identifying neuronal subtypes and understanding learning. It is also implicated in neurological disease. However, standard morphological analysis focuses on macroscopic features such as branching frequency and connectivity between regions, and often neglects the internal geometry of neurons. In this work, we treat neuron trace points as a sampling of differentiable curves and fit them with a set of branching B-splines. We designed our representation with the Frenet-Serret formulas from differential geometry in mind. The Frenet-Serret formulas completely characterize smooth curves, and involve two parameters, curvature and torsion. Our representation makes it possible to compute these parameters from neuron traces in closed form. These parameters are defined continuously along the curve, in contrast to other parameters like tortuosity which depend on start and end points. We applied our method to a dataset of cortical projection neurons traced in two mouse brains, and found that the parameters are distributed differently between primary, collateral, and terminal axon branches, thus quantifying geometric differences between different components of an axonal arbor. The results agreed in both brains, further validating our representation. The code used in this work can be readily applied to neuron traces in SWC format and is available in our open-source Python package brainlit: http://brainlit.neurodata.io/.
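For reference, curvature and torsion follow from the first three derivatives of a curve r(t) by the standard formulas κ = |r' × r''| / |r'|³ and τ = (r' × r'') · r''' / |r' × r''|². The sketch below evaluates them at a single point, assuming the derivatives are already available (for a B-spline they can be obtained analytically). It is not the brainlit API; the helper names and the helix test case are assumptions for illustration.

```python
import math

def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))

def curvature_torsion(d1, d2, d3):
    """Curvature and torsion from the first, second, and third derivatives.

    Assumes the curve is regular (d1 != 0) and not locally straight
    (d1 x d2 != 0); otherwise torsion is undefined.
    """
    c = cross(d1, d2)
    c_norm_sq = dot(c, c)
    speed = math.sqrt(dot(d1, d1))
    kappa = math.sqrt(c_norm_sq) / speed ** 3
    tau = dot(c, d3) / c_norm_sq
    return kappa, tau

# Check on a helix r(t) = (a cos t, a sin t, b t), for which
# kappa = a / (a^2 + b^2) and tau = b / (a^2 + b^2) at every t.
a, b, t = 1.0, 1.0, 0.7
d1 = (-a * math.sin(t), a * math.cos(t), b)
d2 = (-a * math.cos(t), -a * math.sin(t), 0.0)
d3 = (a * math.sin(t), -a * math.cos(t), 0.0)
kappa, tau = curvature_torsion(d1, d2, d3)
```

Both quantities are local: they depend only on derivatives at the evaluation point, which is why they can be reported continuously along a branch rather than once per segment.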
Submitted 5 June, 2021; v1 submitted 3 April, 2021;
originally announced April 2021.
-
Coreference Resolution in Research Papers from Multiple Domains
Authors:
Arthur Brack,
Daniel Uwe Müller,
Anett Hoppe,
Ralph Ewerth
Abstract:
Coreference resolution is essential for automatic text understanding to facilitate high-level information retrieval tasks such as text summarisation or question answering. Previous work indicates that the performance of state-of-the-art approaches (e.g. based on BERT) noticeably declines when applied to scientific papers. In this paper, we investigate the task of coreference resolution in research papers and subsequent knowledge graph population. We present the following contributions: (1) We annotate a corpus for coreference resolution that comprises 10 different scientific disciplines from Science, Technology, and Medicine (STM); (2) We propose transfer learning for automatic coreference resolution in research papers; (3) We analyse the impact of coreference resolution on knowledge graph (KG) population; (4) We release a research KG that is automatically populated from 55,485 papers in 10 STM domains. Comprehensive experiments show the usefulness of the proposed approach. Our transfer learning approach considerably outperforms state-of-the-art baselines on our corpus with an F1 score of 61.4 (+11.0), while the evaluation against a gold standard KG shows that coreference resolution improves the quality of the populated KG significantly with an F1 score of 63.5 (+21.8).
Submitted 4 January, 2021;
originally announced January 2021.
-
The NVIDIA PilotNet Experiments
Authors:
Mariusz Bojarski,
Chenyi Chen,
Joyjit Daw,
Alperen Değirmenci,
Joya Deri,
Bernhard Firner,
Beat Flepp,
Sachin Gogri,
Jesse Hong,
Lawrence Jackel,
Zhenhua Jia,
BJ Lee,
Bo Liu,
Fei Liu,
Urs Muller,
Samuel Payne,
Nischal Kota Nagendra Prasad,
Artem Provodin,
John Roach,
Timur Rvachov,
Neha Tadimeti,
Jesper van Engelen,
Haiguang Wen,
Eric Yang,
Zongyi Yang
Abstract:
Four years ago, an experimental system known as PilotNet became the first NVIDIA system to steer an autonomous car along a roadway. This system represents a departure from the classical approach for self-driving in which the process is manually decomposed into a series of modules, each performing a different task. In PilotNet, on the other hand, a single deep neural network (DNN) takes pixels as input and produces a desired vehicle trajectory as output; there are no distinct internal modules connected by human-designed interfaces. We believe that handcrafted interfaces ultimately limit performance by restricting information flow through the system and that a learned approach, in combination with other artificial intelligence systems that add redundancy, will lead to better overall performing systems. We continue to conduct research toward that goal.
This document describes the PilotNet lane-keeping effort, carried out over the past five years by our NVIDIA PilotNet group in Holmdel, New Jersey. Here we present a snapshot of system status in mid-2020 and highlight some of the work done by the PilotNet group.
Submitted 17 October, 2020;
originally announced October 2020.
-
Super-resolution of multispectral satellite images using convolutional neural networks
Authors:
M. U. Müller,
N. Ekhtiari,
R. M. Almeida,
C. Rieke
Abstract:
Super-resolution aims at increasing image resolution by algorithmic means and has progressed in recent years due to advances in the fields of computer vision and deep learning. Convolutional Neural Networks based on a variety of architectures have been applied to the problem, e.g. autoencoders and residual networks. While most research focuses on the processing of photographs consisting only of RGB color channels, little work can be found concentrating on multi-band, analytic satellite imagery. Satellite images often include a panchromatic band, which has higher spatial resolution but lower spectral resolution than the other bands. In the field of remote sensing, there is a long tradition of applying pan-sharpening to satellite images, i.e. bringing the multispectral bands to the higher spatial resolution by merging them with the panchromatic band. To our knowledge there are so far no approaches to super-resolution which take advantage of the panchromatic band. In this paper we propose a method to train state-of-the-art CNNs using pairs of lower-resolution multispectral and high-resolution pan-sharpened image tiles in order to create super-resolved analytic images. The derived quality metrics show that the method improves information content of the processed images. We compare the results created by four CNN architectures, with RedNet30 performing best.
Submitted 8 April, 2020; v1 submitted 3 February, 2020;
originally announced February 2020.
-
Explaining How a Deep Neural Network Trained with End-to-End Learning Steers a Car
Authors:
Mariusz Bojarski,
Philip Yeres,
Anna Choromanska,
Krzysztof Choromanski,
Bernhard Firner,
Lawrence Jackel,
Urs Muller
Abstract:
As part of a complete software stack for autonomous driving, NVIDIA has created a neural-network-based system, known as PilotNet, which outputs steering angles given images of the road ahead. PilotNet is trained using road images paired with the steering angles generated by a human driving a data-collection car. It derives the necessary domain knowledge by observing human drivers. This eliminates the need for human engineers to anticipate what is important in an image and foresee all the necessary rules for safe driving. Road tests demonstrated that PilotNet can successfully perform lane keeping in a wide variety of driving conditions, regardless of whether lane markings are present or not.
The goal of the work described here is to explain what PilotNet learns and how it makes its decisions. To this end we developed a method for determining which elements in the road image most influence PilotNet's steering decision. Results show that PilotNet indeed learns to recognize relevant objects on the road.
In addition to learning the obvious features such as lane markings, edges of roads, and other cars, PilotNet learns more subtle features that would be hard to anticipate and program by engineers, for example, bushes lining the edge of the road and atypical vehicle classes.
Submitted 25 April, 2017;
originally announced April 2017.
-
Software Grand Exposure: SGX Cache Attacks Are Practical
Authors:
Ferdinand Brasser,
Urs Müller,
Alexandra Dmitrienko,
Kari Kostiainen,
Srdjan Capkun,
Ahmad-Reza Sadeghi
Abstract:
Side-channel information leakage is a known limitation of SGX. Researchers have demonstrated that secret-dependent information can be extracted from enclave execution through page-fault access patterns. Consequently, various recent research efforts are actively seeking countermeasures to SGX side-channel attacks. It is widely assumed that SGX may be vulnerable to other side channels, such as cache access pattern monitoring, as well. However, prior to our work, the practicality and the extent of such information leakage had not been studied.
In this paper we demonstrate that cache-based attacks are indeed a serious threat to the confidentiality of SGX-protected programs. Our goal was to design an attack that is hard to mitigate using known defenses, and therefore we mount our attack without interrupting enclave execution. This approach has major technical challenges, since the existing cache monitoring techniques experience significant noise if the victim process is not interrupted. We designed and implemented novel attack techniques to reduce this noise by leveraging the capabilities of the privileged adversary. Our attacks are able to recover confidential information from SGX enclaves, which we illustrate in two example cases: extraction of an entire RSA-2048 key during RSA decryption, and detection of specific human genome sequences during genomic indexing. We show that our attacks are more effective than previous cache attacks and harder to mitigate than previous SGX side-channel attacks.
Submitted 24 February, 2017;
originally announced February 2017.
-
VisualBackProp: efficient visualization of CNNs
Authors:
Mariusz Bojarski,
Anna Choromanska,
Krzysztof Choromanski,
Bernhard Firner,
Larry Jackel,
Urs Muller,
Karol Zieba
Abstract:
This paper proposes a new method, which we call VisualBackProp, for visualizing which sets of pixels of the input image contribute most to the predictions made by the convolutional neural network (CNN). The method heavily hinges on exploring the intuition that the feature maps contain less and less information irrelevant to the prediction decision when moving deeper into the network. The technique we propose was developed as a debugging tool for CNN-based systems for steering self-driving cars and is therefore required to run in real-time, i.e. it was designed to require fewer computations than a forward propagation. This makes the presented visualization method a valuable debugging tool which can be easily used during both training and inference. We furthermore justify our approach with theoretical arguments and theoretically confirm that the proposed method identifies sets of input pixels, rather than individual pixels, that collaboratively contribute to the prediction. Our theoretical findings stand in agreement with the experimental results. The empirical evaluation shows the plausibility of the proposed approach on the road video data as well as in other applications and reveals that it compares favorably to the layer-wise relevance propagation approach, i.e. it obtains similar visualization results and simultaneously achieves order-of-magnitude speed-ups.
Submitted 19 May, 2017; v1 submitted 16 November, 2016;
originally announced November 2016.
-
Fast Incremental Learning for Off-Road Robot Navigation
Authors:
Artem Provodin,
Liila Torabi,
Beat Flepp,
Yann LeCun,
Michael Sergio,
L. D. Jackel,
Urs Muller,
Jure Zbontar
Abstract:
A promising approach to autonomous driving is machine learning. In such systems, training datasets are created that capture the sensory input to a vehicle as well as the desired response. A disadvantage of using a learned navigation system is that the learning process itself may require a huge number of training examples and a large amount of computing. To avoid the need to collect a large training set of driving examples, we describe a system that takes advantage of the huge number of training examples provided by ImageNet, but is able to adapt quickly using a small training set for the specific driving environment.
Submitted 26 June, 2016;
originally announced June 2016.
-
End to End Learning for Self-Driving Cars
Authors:
Mariusz Bojarski,
Davide Del Testa,
Daniel Dworakowski,
Bernhard Firner,
Beat Flepp,
Prasoon Goyal,
Lawrence D. Jackel,
Mathew Monfort,
Urs Muller,
Jiakai Zhang,
Xin Zhang,
Jake Zhao,
Karol Zieba
Abstract:
We trained a convolutional neural network (CNN) to map raw pixels from a single front-facing camera directly to steering commands. This end-to-end approach proved surprisingly powerful. With minimal training data from humans the system learns to drive in traffic on local roads with or without lane markings and on highways. It also operates in areas with unclear visual guidance such as in parking lots and on unpaved roads.
The system automatically learns internal representations of the necessary processing steps such as detecting useful road features with only the human steering angle as the training signal. We never explicitly trained it to detect, for example, the outline of roads.
Compared to explicit decomposition of the problem, such as lane marking detection, path planning, and control, our end-to-end system optimizes all processing steps simultaneously. We argue that this will eventually lead to better performance and smaller systems. Better performance will result because the internal components self-optimize to maximize overall system performance, instead of optimizing human-selected intermediate criteria, e.g., lane detection. Such criteria understandably are selected for ease of human interpretation which doesn't automatically guarantee maximum system performance. Smaller networks are possible because the system learns to solve the problem with the minimal number of processing steps.
We used an NVIDIA DevBox and Torch 7 for training and an NVIDIA DRIVE(TM) PX self-driving car computer also running Torch 7 for determining where to drive. The system operates at 30 frames per second (FPS).
Submitted 25 April, 2016;
originally announced April 2016.