arXiv:2409.02006

Robust Fitting on a Gate Quantum Computer

Authors: Frances Fengyi Yang, Michele Sasdelli, Tat-Jun Chin

Abstract: Gate quantum computers generate significant interest due to their potential to solve certain difficult problems such as prime factorization in polynomial time. Computer vision researchers have long been attracted to the power of quantum computers. Robust fitting, which is fundamentally important to many computer vision pipelines, has recently been shown to be amenable to gate quantum computing. The previously proposed solution was to compute Boolean influence as a measure of outlyingness using the Bernstein-Vazirani quantum circuit. However, the method assumed a quantum implementation of an $\ell_\infty$ feasibility test, which has not been demonstrated. In this paper, we take a big stride towards quantum robust fitting: we propose a quantum circuit to solve the $\ell_\infty$ feasibility test in the 1D case, which allows us to demonstrate for the first time quantum robust fitting on a real gate quantum computer, the IonQ Aria. We also show how 1D Boolean influences can be accumulated to compute Boolean influences for higher-dimensional non-linear models, which we experimentally validate on real benchmark datasets.

Submitted 3 September, 2024; originally announced September 2024.

Comments: Accepted by the European Conference on Computer Vision 2024 (ECCV2024) as Oral. The paper is written for a computer vision audience who generally has minimal quantum physics background

arXiv:2310.15128

Projected Stochastic Gradient Descent with Quantum Annealed Binary Gradients

Authors: Maximilian Krahn, Michele Sasdelli, Fengyi Yang, Vladislav Golyanik, Juho Kannala, Tat-Jun Chin, Tolga Birdal

Abstract: We present QP-SBGD, a novel layer-wise stochastic optimiser tailored towards training neural networks with binary weights, known as binary neural networks (BNNs), on quantum hardware. BNNs reduce the computational requirements and energy consumption of deep learning models with minimal loss in accuracy. However, training them in practice remains an open challenge. Most known BNN-optimisers either rely on projected updates or binarise weights post-training. Instead, QP-SBGD approximately maps the gradient onto binary variables, by solving a quadratic constrained binary optimisation. Under practically reasonable assumptions, we show that this update rule converges with a rate of $\mathcal{O}(1 / \sqrt{T})$. Moreover, we show how the $\mathcal{NP}$-hard projection can be effectively executed on an adiabatic quantum annealer, harnessing recent advancements in quantum computation. We also introduce a projected version of this update rule and prove that if a fixed point exists in the binary variable space, the modified updates will converge to it. Last but not least, our algorithm is implemented layer-wise, making it suitable to train larger networks on resource-limited quantum hardware. Through extensive evaluations, we show that QP-SBGD outperforms or is on par with competitive and well-established baselines such as BinaryConnect, signSGD and ProxQuant when optimising the Rosenbrock function, training BNNs as well as binary graph neural networks.

Submitted 3 September, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

Journal ref: BMVC 2024

arXiv:2303.12352

Training Multilayer Perceptrons by Sampling with Quantum Annealers

Authors: Frances Fengyi Yang, Michele Sasdelli, Tat-Jun Chin

Abstract: A successful application of quantum annealing to machine learning is training restricted Boltzmann machines (RBM). However, many neural networks for vision applications are feedforward structures, such as multilayer perceptrons (MLP). Backpropagation is currently the most effective technique to train MLPs for supervised learning. This paper aims to be forward-looking by exploring the training of MLPs using quantum annealers. We exploit an equivalence between MLPs and energy-based models (EBM), which are a variation of RBMs with a maximum conditional likelihood objective. This leads to a strategy to train MLPs with quantum annealers as a sampling engine. We prove our setup for MLPs with sigmoid activation functions and one hidden layer, and demonstrated training of binary image classifiers on small subsets of the MNIST and Fashion-MNIST datasets using the D-Wave quantum annealer. Although problem sizes that are feasible on current annealers are limited, we obtained comprehensive results on feasible instances that validate our ideas. Our work establishes the potential of quantum computing for training MLPs.

Submitted 22 March, 2023; originally announced March 2023.

Comments: 22 pages, 15 figures

ACM Class: I.2.6

arXiv:2201.10305

doi 10.1109/EMBC48229.2022.9871220

Mutual information neural estimation for unsupervised multi-modal registration of brain images

Authors: Gerard Snaauw, Michele Sasdelli, Gabriel Maicas, Stephan Lau, Johan Verjans, Mark Jenkinson, Gustavo Carneiro

Abstract: Many applications in image-guided surgery and therapy require fast and reliable non-linear, multi-modal image registration. Recently proposed unsupervised deep learning-based registration methods have demonstrated superior performance compared to iterative methods in just a fraction of the time. Most of the learning-based methods have focused on mono-modal image registration. The extension to multi-modal registration depends on the use of an appropriate similarity function, such as the mutual information (MI). We propose guiding the training of a deep learning-based registration method with MI estimation between an image-pair in an end-to-end trainable network. Our results show that a small, 2-layer network produces competitive results in both mono- and multi-modal registration, with sub-second run-times. Comparisons to both iterative and deep learning-based methods show that our MI-based method produces topologically and qualitatively superior results with an extremely low rate of non-diffeomorphic transformations. Real-time clinical application will benefit from a better visual matching of anatomical structures and fewer registration failures/outliers.

Submitted 6 October, 2022; v1 submitted 25 January, 2022; originally announced January 2022.

Comments: 4 pages, 4 figures, 2022 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), oral presentation

Journal ref: 2022 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), 2022, pp. 3510-3513

arXiv:2201.10110

A Hybrid Quantum-Classical Algorithm for Robust Fitting

Authors: Anh-Dzung Doan, Michele Sasdelli, David Suter, Tat-Jun Chin

Abstract: Fitting geometric models onto outlier contaminated data is provably intractable. Many computer vision systems rely on random sampling heuristics to solve robust fitting, which do not provide optimality guarantees and error bounds. It is therefore critical to develop novel approaches that can bridge the gap between exact solutions that are costly, and fast heuristics that offer no quality assurances. In this paper, we propose a hybrid quantum-classical algorithm for robust fitting. Our core contribution is a novel robust fitting formulation that solves a sequence of integer programs and terminates with a global solution or an error bound. The combinatorial subproblems are amenable to a quantum annealer, which helps to tighten the bound efficiently. While our usage of quantum computing does not surmount the fundamental intractability of robust fitting, by providing error bounds our algorithm is a practical improvement over randomised heuristics. Moreover, our work represents a concrete application of quantum computing in computer vision. We present results obtained using an actual quantum computer (D-Wave Advantage) and via simulation. Source code: https://github.com/dadung/HQC-robust-fitting

Submitted 27 June, 2022; v1 submitted 25 January, 2022; originally announced January 2022.

Comments: IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR) 2022

arXiv:2112.01723

Adversarial Attacks against a Satellite-borne Multispectral Cloud Detector

Authors: Andrew Du, Yee Wei Law, Michele Sasdelli, Bo Chen, Ken Clarke, Michael Brown, Tat-Jun Chin

Abstract: Data collected by Earth-observing (EO) satellites are often afflicted by cloud cover. Detecting the presence of clouds -- which is increasingly done using deep learning -- is crucial preprocessing in EO applications. In fact, advanced EO satellites perform deep learning-based cloud detection on board the satellites and downlink only clear-sky data to save precious bandwidth. In this paper, we highlight the vulnerability of deep learning-based cloud detection to adversarial attacks. By optimising an adversarial pattern and superimposing it onto a cloudless scene, we bias the neural network into detecting clouds in the scene. Since the input spectra of cloud detectors include the non-visible bands, we generated our attacks in the multispectral domain. This opens up the potential of multi-objective attacks, specifically, adversarial biasing in the cloud-sensitive bands and visual camouflage in the visible bands. We also investigated mitigation strategies against the adversarial attacks. We hope our work further builds awareness of the potential of adversarial attacks in the EO community.

Submitted 3 December, 2021; originally announced December 2021.

arXiv:2108.11765

Physical Adversarial Attacks on an Aerial Imagery Object Detector

Authors: Andrew Du, Bo Chen, Tat-Jun Chin, Yee Wei Law, Michele Sasdelli, Ramesh Rajasegaran, Dillon Campbell

Abstract: Deep neural networks (DNNs) have become essential for processing the vast amounts of aerial imagery collected using earth-observing satellite platforms. However, DNNs are vulnerable towards adversarial examples, and it is expected that this weakness also plagues DNNs for aerial imagery. In this work, we demonstrate one of the first efforts on physical adversarial attacks on aerial imagery, whereby adversarial patches were optimised, fabricated and installed on or near target objects (cars) to significantly reduce the efficacy of an object detector applied on overhead images. Physical adversarial attacks on aerial images, particularly those captured from satellite platforms, are challenged by atmospheric factors (lighting, weather, seasons) and the distance between the observer and target. To investigate the effects of these challenges, we devised novel experiments and metrics to evaluate the efficacy of physical adversarial attacks against object detectors in aerial scenes. Our results indicate the palpable threat posed by physical adversarial attacks towards DNNs for processing satellite imagery.

Submitted 20 October, 2021; v1 submitted 26 August, 2021; originally announced August 2021.

arXiv:2107.02751

Quantum Annealing Formulation for Binary Neural Networks

Authors: Michele Sasdelli, Tat-Jun Chin

Abstract: Quantum annealing is a promising paradigm for building practical quantum computers. Compared to other approaches, quantum annealing technology has been scaled up to a larger number of qubits. On the other hand, deep learning has been profoundly successful in pushing the boundaries of AI. It is thus natural to investigate potentially game changing technologies such as quantum annealers to augment the capabilities of deep learning. In this work, we explore binary neural networks, which are lightweight yet powerful models typically intended for resource constrained devices. Departing from current training regimes for binary networks that smooth/approximate the activation functions to make the network differentiable, we devise a quadratic unconstrained binary optimization formulation for the training problem. While the problem is intractable, i.e., the cost to estimate the binary weights scales exponentially with network size, we show how the problem can be optimized directly on a quantum annealer, thereby opening up to the potential gains of quantum computing. We experimentally validated our formulation via simulation and testing on an actual quantum annealer (D-Wave Advantage), the latter to the extent allowable by the capacity of current technology.

Submitted 4 July, 2021; originally announced July 2021.

Comments: 13 pages, 4 figures

arXiv:2106.09053

doi 10.1093/mnras/stab1736

ASASSN-14lp: two possible solutions for the observed UV suppression

Authors: Barnabás Barna, Talytha Pereira, Stefan Taubenberger, Mark Magee, Markus Kromer, Wolfgang Kerzendorf, Christian Vogl, Marc E. Williamson, Andreas Flörs, Ulrich M. Noebauer, Ryan J. Foley, Michele Sasdelli, Wolfgang Hillebrandt

Abstract: We test the adequacy of ultraviolet (UV) spectra for characterizing the outer structure of Type Ia supernova (SN) ejecta. For this purpose, we perform spectroscopic analysis for ASASSN-14lp, a normal SN Ia showing low continuum in the mid-UV regime. To explain the strong UV suppression, two possible origins have been investigated by mapping the chemical profiles over a significant part of their ejecta. We fit the spectral time series with mid-UV coverage obtained before and around maximum light by HST, supplemented with ground-based optical observations for the earliest epochs. The synthetic spectra are calculated with the one-dimensional MC radiative-transfer code TARDIS from self-consistent ejecta models. Among several physical parameters, we constrain the abundance profiles of nine chemical elements. We find that a distribution of $^{56}$Ni (and other iron-group elements) that extends toward the highest velocities reproduces the observed UV flux well. The presence of radioactive material in the outer layers of the ejecta, if confirmed, implies strong constraints on the possible explosion scenarios. We investigate the impact of the inferred $^{56}$Ni distribution on the early light curves with the radiative transfer code TURTLS, and confront the results with the observed light curves of ASASSN-14lp. The inferred abundances are not in conflict with the observed photometry. We also test whether the UV suppression can be reproduced if the radiation at the photosphere is significantly lower in the UV regime than the pure Planck function. In this case, solar metallicity might be sufficient at the highest velocities to reproduce the UV suppression.

Submitted 16 June, 2021; originally announced June 2021.

Comments: 18 pages, 16 figures, accepted for publication in MNRAS

arXiv:1908.04930

Generalised Zero-Shot Learning with Domain Classification in a Joint Semantic and Visual Space

Authors: Rafael Felix, Ben Harwood, Michele Sasdelli, Gustavo Carneiro

Abstract: Generalised zero-shot learning (GZSL) is a classification problem where the learning stage relies on a set of seen visual classes and the inference stage aims to identify both the seen visual classes and a new set of unseen visual classes. Critically, both the learning and inference stages can leverage a semantic representation that is available for the seen and unseen classes. Most state-of-the-art GZSL approaches rely on a mapping between latent visual and semantic spaces without considering if a particular sample belongs to the set of seen or unseen classes. In this paper, we propose a novel GZSL method that learns a joint latent representation that combines both visual and semantic information. This mitigates the need for learning a mapping between the two spaces. Our method also introduces a domain classification that estimates whether a sample belongs to a seen or an unseen class. Our classifier then combines a class discriminator with this domain classifier with the goal of reducing the natural bias that GZSL approaches have toward the seen classes. Experiments show that our method achieves state-of-the-art results in terms of harmonic mean, the area under the seen and unseen curve and unseen classification accuracy on public GZSL benchmark data sets. Our code will be available upon acceptance of this paper.

Submitted 13 August, 2019; originally announced August 2019.

arXiv:1908.02013

Generalised Zero-Shot Learning with a Classifier Ensemble over Multi-Modal Embedding Spaces

Authors: Rafael Felix, Ben Harwood, Michele Sasdelli, Gustavo Carneiro

Abstract: Generalised zero-shot learning (GZSL) methods aim to classify previously seen and unseen visual classes by leveraging the semantic information of those classes. In the context of GZSL, semantic information is non-visual data such as a text description of both seen and unseen classes. Previous GZSL methods have utilised transformations between visual and semantic embedding spaces, as well as the learning of joint spaces that include both visual and semantic information. In either case, classification is then performed on a single learned space. We argue that each embedding space contains complementary information for the GZSL problem. By using just a visual, semantic or joint space some of this information will invariably be lost. In this paper, we demonstrate the advantages of our new GZSL method that combines the classification of visual, semantic and joint spaces. Most importantly, this ensembling allows for more information from the source domains to be seen during classification. An additional contribution of our work is the application of a calibration procedure for each classifier in the ensemble. This calibration mitigates the problem of model selection when combining the classifiers. Lastly, our proposed method achieves state-of-the-art results on the CUB, AWA1 and AWA2 benchmark data sets and provides competitive performance on the SUN data set.

Submitted 6 August, 2019; originally announced August 2019.

arXiv:1902.08544

doi 10.1051/0004-6361/201935345

Rapid Classification of TESS Planet Candidates with Convolutional Neural Networks

Authors: Hugh P. Osborn, Megan Ansdell, Yani Ioannou, Michele Sasdelli, Daniel Angerhausen, Douglas A. Caldwell, Jon M. Jenkins, Chedy Räissi, Jeffrey C. Smith

Abstract: Accurately and rapidly classifying exoplanet candidates from transit surveys is a goal of growing importance as the data rates from space-based survey missions increase. This is especially true for NASA's TESS mission, which generates thousands of new candidates each month. Here we created the first deep learning model capable of classifying TESS planet candidates. We adapted the neural network model of Ansdell et al. (2018) to TESS data. We then trained and tested this updated model on 4 sectors of high-fidelity, pixel-level simulation data created using the Lilith simulator and processed using the full TESS SPOC pipeline. We find our model performs very well on our simulated data, with 97% average precision and 92% accuracy on planets in the 2-class model. This accuracy is also boosted by another ~4% if planets found at the wrong periods are included. We also performed 3- and 4-class classification of planets, blended & target eclipsing binaries, and non-astrophysical false positives, which have slightly lower average precision and planet accuracies, but are useful for follow-up decisions. When applied to real TESS data, 61% of TCEs coincident with currently published TOIs are recovered as planets, 4% more are suggested to be EBs, and we propose a further 200 TCEs as planet candidates.

Submitted 22 February, 2019; originally announced February 2019.

Comments: 11 pages, 10 figures, Submitted to A&A

Journal ref: A&A 633, A53 (2020)

arXiv:1902.04570

Real-time tracker with fast recovery from target loss

Authors: Alessandro Bay, Panagiotis Sidiropoulos, Eduard Vazquez, Michele Sasdelli

Abstract: In this paper, we introduce a variation of a state-of-the-art real-time tracker (CFNet), which adds to the original algorithm robustness to target loss without a significant computational overhead. The new method is based on the assumption that the feature map can be used to estimate the tracking confidence more accurately. When the confidence is low, we avoid updating the object's position through the feature map; instead, the tracker passes to a single-frame failure mode, during which the patch's low-level visual content is used to swiftly update the object's position, before recovering from the target loss in the next frame. The experimental evidence provided by evaluating the method on several tracking datasets validates both the theoretical assumption that the feature map is associated with tracking confidence, and that the proposed implementation can achieve target recovery in multiple scenarios, without compromising the real-time performance.

Submitted 12 February, 2019; originally announced February 2019.

Comments: arXiv admin note: substantial text overlap with arXiv:1806.07844

arXiv:1901.04623

Multi-modal Ensemble Classification for Generalized Zero Shot Learning

Authors: Rafael Felix, Michele Sasdelli, Ian Reid, Gustavo Carneiro

Abstract: Generalized zero shot learning (GZSL) is defined by a training process containing a set of visual samples from seen classes and a set of semantic samples from seen and unseen classes, while the testing process consists of the classification of visual samples from seen and unseen classes. Current approaches are based on testing processes that focus on only one of the modalities (visual or semantic), even when the training uses both modalities (mostly for regularizing the training process). This under-utilization of modalities, particularly during testing, can hinder the classification accuracy of the method. In addition, we note that scant attention has been paid to the development of learning methods that explicitly optimize a balanced performance of seen and unseen classes. This issue is one of the reasons behind the vastly superior classification accuracy of seen classes in GZSL methods. In this paper, we mitigate these issues by proposing a new GZSL method based on multi-modal training and testing processes, where the optimization explicitly promotes a balanced classification accuracy between seen and unseen classes. Furthermore, we explore Bayesian inference for the visual and semantic classifiers, which is another novelty of our work in the GZSL framework. Experiments show that our method achieves state-of-the-art (SOTA) results in terms of harmonic mean (H-mean) classification between seen and unseen classes and area under the seen and unseen curve (AUSUC) on several public GZSL benchmarks.

Submitted 5 February, 2019; v1 submitted 14 January, 2019; originally announced January 2019.

Comments: 10 pages, 3 Figures, 4 Tables

arXiv:1811.02949

Instance Retrieval at Fine-grained Level Using Multi-Attribute Recognition

Authors: Roshanak Zakizadeh, Yu Qian, Michele Sasdelli, Eduard Vazquez

Abstract: In this paper, we present a method for instance ranking and retrieval at fine-grained level based on the global features extracted from a multi-attribute recognition model which is not dependent on landmark information or part-based annotations. Further, we make this architecture suitable for mobile-device application by adopting the bilinear CNN to make the multi-attribute recognition model smaller (in terms of the number of parameters). The experiments run on the Dress category of DeepFashion In-Shop Clothes Retrieval and CUB200 datasets show that the results of instance retrieval at fine-grained level are promising for these datasets, especially in terms of texture and color.

Submitted 7 November, 2018; originally announced November 2018.

arXiv:1810.13434

doi 10.3847/2041-8213/aaf23b

Scientific Domain Knowledge Improves Exoplanet Transit Classification with Deep Learning

Authors: Megan Ansdell, Yani Ioannou, Hugh P. Osborn, Michele Sasdelli, Jeffrey C. Smith, Jon M. Jenkins, Chedy Raissi, Daniel Angerhausen

Abstract: Space-based missions such as Kepler, and soon TESS, provide large datasets that must be analyzed efficiently and systematically. Recent work by Shallue & Vanderburg (2018) successfully used state-of-the-art deep learning models to automatically classify Kepler transit signals as either exoplanets or false positives; our application of their model yielded 95.8% accuracy and 95.5% average precision. Here we expand upon that work by including additional scientific domain knowledge into the network architecture and input representations to significantly increase overall model performance to 97.5% accuracy and 98.0% average precision. Notably, we achieve 15-20% gains in recall for the lowest signal-to-noise transits that can correspond to rocky planets in the habitable zone. We input into the network centroid time-series information derived from Kepler data plus key stellar parameters taken from the Kepler DR25 catalogue. We also implement data augmentation techniques to alleviate model over-fitting. These improvements allow us to drastically reduce the size of the model, while still maintaining improved performance; smaller models are better for generalization, for example from Kepler to TESS data. This work illustrates the importance of including expert domain knowledge in even state-of-the-art deep learning models when applying them to scientific research problems that seek to identify weak signals in noisy data. This classification tool will be especially useful for upcoming space-based photometry missions focused on finding small planets, such as TESS and PLATO.

Submitted 21 November, 2018; v1 submitted 31 October, 2018; originally announced October 2018.

Comments: 8 pages, 5 figures, accepted to ApJ Letters

arXiv:1807.11674 [pdf, ps, other]

Improving the Annotation of DeepFashion Images for Fine-grained Attribute Recognition

Authors: Roshanak Zakizadeh, Michele Sasdelli, Yu Qian, Eduard Vazquez

Abstract: DeepFashion is a widely used clothing dataset with 50 categories and more than 200k images overall, where each image is annotated with fine-grained attributes. This dataset is often used for clothes recognition and, although it provides comprehensive annotations, the attribute distribution is unbalanced and repetitive, especially for training fine-grained attribute recognition models. In this work, we tailored DeepFashion for the fine-grained attribute recognition task by focusing on each category separately. After selecting categories with a sufficient number of images for training, we remove very scarce attributes and merge the duplicate ones in each category, then we clean the dataset based on the new list of attributes. We use a bilinear convolutional neural network with a pairwise ranking loss function for multi-label fine-grained attribute recognition and show that the new annotations improve the results for such a task. The detailed annotations for each of the selected categories are provided for public use.

Submitted 31 July, 2018; originally announced July 2018.

arXiv:1806.07844 [pdf, other]

Hide and Seek tracker: Real-time recovery from target loss

Authors: Alessandro Bay, Panagiotis Sidiropoulos, Eduard Vazquez, Michele Sasdelli

Abstract: In this paper, we examine the real-time recovery of a video tracker from a target loss, using information that is already available from the original tracker and without a significant computational overhead. More specifically, before using the tracker output to update the target position we estimate the detection confidence. In the case of a low confidence, the position update is rejected and the tracker passes to a single-frame failure mode, during which the patch low-level visual content is used to swiftly update the object position, before recovering from the target loss in the next frame. Orthogonally to this improvement, we further enhance the running average method used for creating the query model in tracking-through-similarity. The experimental evidence provided by evaluation on standard tracking datasets (OTB-50, OTB-100 and OTB-2013) validates that target recovery can be successfully achieved without compromising the real-time update of the target position.

Submitted 20 June, 2018; originally announced June 2018.

arXiv:1806.07124 [pdf, ps, other]

FineTag: Multi-attribute Classification at Fine-grained Level in Images

Authors: Roshanak Zakizadeh, Michele Sasdelli, Yu Qian, Eduard Vazquez

Abstract: In this paper, we address the extraction of the fine-grained attributes of an instance as a `multi-attribute classification' problem. To this end, we propose an end-to-end architecture by adopting the bi-linear Convolutional Neural Network with the pairwise ranking loss. This is the first time such an architecture is applied to the fine-grained attribute classification problem. We compared the proposed method with a competitive deep Convolutional Neural Network baseline. Extensive experiments show that the proposed method matches or outperforms the compared baseline with significantly fewer parameters ($40\times$ fewer). We demonstrated our approach on the CUB200 birds dataset, whose annotations are adapted in this work for multi-attribute classification at fine-grained level.

Submitted 25 October, 2018; v1 submitted 19 June, 2018; originally announced June 2018.

arXiv:1705.09451 [pdf, other]

Algorithmic clothing: hybrid recommendation, from street-style-to-shop

Authors: Y Qian, P Giaccone, M Sasdelli, E Vasquez, B Sengupta

Abstract: In this paper we detail Cortexica's (https://www.cortexica.com) recommendation framework -- particularly, we describe how a hybrid visual recommender system can be created by combining conditional random fields for segmentation and deep neural networks for object localisation and feature representation. The recommendation system that is built after localisation, segmentation and classification has two properties -- first, it is knowledge based in the sense that it learns a pairwise preference/occurrence matrix by utilising knowledge from experts (images from fashion blogs) and second, it is content-based as it utilises a deep learning based framework for learning feature representation. Such a construct is especially useful when there is a scarcity of user preference data, which forms the foundation of many collaborative recommendation algorithms.

Submitted 12 November, 2017; v1 submitted 26 May, 2017; originally announced May 2017.

Comments: KDD 2017 Workshop on ML meets Fashion

arXiv:1703.02898 [pdf, other]

Large-scale image analysis using docker sandboxing

Authors: B Sengupta, E Vazquez, M Sasdelli, Y Qian, M Peniak, L Netherton, G Delfino

Abstract: With the advent of specialized hardware such as Graphics Processing Units (GPUs), large scale image localization, classification and retrieval have seen increased prevalence. Designing scalable software architecture that co-evolves with such specialized hardware is a challenge in the commercial setting. In this paper, we describe one such architecture (Cortexica) that leverages the scalability of GPUs and the sandboxing offered by docker containers. This allows for the flexibility of mixing different computer architectures as well as computational algorithms with the security of a trusted environment. We illustrate the utility of this framework in a commercial setting, i.e., searching for multiple products in an image by combining image localisation and retrieval.

Submitted 7 March, 2017; originally announced March 2017.

arXiv:1612.07104 [pdf, other]

doi 10.1093/mnras/stw3323

A metric space for type Ia supernova spectra: a new method to assess explosion scenarios

Authors: Michele Sasdelli, W. Hillebrandt, M. Kromer, E. E. O. Ishida, F. K. Roepke, S. A. Simm, R. Pakmor

Abstract: Over the past years type Ia supernovae (SNe Ia) have become a major tool to determine the expansion history of the Universe, and considerable attention has been given to both observations and models of these events. However, until now, their progenitors are not known. The observed diversity of light curves and spectra seems to point at different progenitor channels and explosion mechanisms. Here, we present a new way to compare model predictions with observations in a systematic way. Our method is based on the construction of a metric space for SN Ia spectra by means of linear Principal Component Analysis (PCA), taking care of missing and/or noisy data, and making use of Partial Least Square regression (PLS) to find correlations between spectral properties and photometric data. We investigate realizations of the three major classes of explosion models that are presently discussed: delayed-detonation Chandrasekhar-mass explosions, sub-Chandrasekhar-mass detonations, and double-degenerate mergers, and compare them with data. We show that in the PC space all scenarios have observed counterparts, supporting the idea that different progenitors are likely. However, all classes of models face problems in reproducing the observed correlations between spectral properties and light curves and colors. Possible reasons are briefly discussed.

Submitted 21 December, 2016; originally announced December 2016.

Comments: 28 pages, 28 figures; accepted for publication in MNRAS

arXiv:1605.05507 [pdf, other]

doi 10.1093/mnras/stw1214

Luminosity distributions of Type Ia Supernovae

Authors: Chris Ashall, Paolo Mazzali, Michele Sasdelli, Simon Prentice

Abstract: We have assembled a dataset of 165 low redshift, $z<0.06$, publicly available type Ia supernovae (SNe Ia). We produce maximum light magnitude ($M_{B}$ and $M_{V}$) distributions of SNe Ia to explore the diversity of parameter space that they can fill. Before correction for host galaxy extinction we find that the mean $M_{B}$ and $M_{V}$ of SNe Ia are $-18.58\pm0.07$ mag and $-18.72\pm0.05$ mag respectively. Host galaxy extinction is corrected using a new method based on the SN spectrum. After correction, the mean values of $M_{B}$ and $M_{V}$ of SNe Ia are $-19.10\pm0.06$ and $-19.10\pm0.05$ mag respectively. After correction for host galaxy extinction, `normal' SNe Ia ($\Delta m_{15}(B)<1.6$ mag) fill a larger parameter space in the Width-Luminosity Relation (WLR) than previously suggested, and there is evidence for luminous SNe Ia with large $\Delta m_{15}(B)$. We find a bimodal distribution in $\Delta m_{15}(B)$, with a pronounced lack of transitional events at $\Delta m_{15}(B)=1.6$ mag. We confirm that faster, low-luminosity SNe tend to come from passive galaxies. Dividing the sample by host galaxy type, SNe Ia from star-forming (S-F) galaxies have a mean $M_{B}=-19.20 \pm 0.05$ mag, while SNe Ia from passive galaxies have a mean $M_{B}=-18.57 \pm 0.24$ mag. Even excluding fast declining SNe, `normal' ($M_{B}<-18$ mag) SNe Ia from S-F and passive galaxies are distinct. In the $V$-band, there is a difference of $0.4 \pm 0.13$ mag between the median ($M_{V}$) values of the `normal' SN Ia population from passive and S-F galaxies. This is consistent with ($\sim 15 \pm 10$)% of `normal' SNe Ia from S-F galaxies coming from an old stellar population.

Submitted 20 May, 2016; v1 submitted 18 May, 2016; originally announced May 2016.

Comments: Accepted for publication in MNRAS

arXiv:1604.03899 [pdf, other]

doi 10.1093/mnras/stw900

Breaking the color-reddening degeneracy in type Ia supernovae

Authors: M. Sasdelli, E. E. O. Ishida, W. Hillebrandt, C. Ashall, P. A. Mazzali, S. Prentice

Abstract: A new method to study the intrinsic color and luminosity of type Ia supernovae (SNe Ia) is presented. A metric space built using principal component analysis (PCA) on spectral series of SNe Ia between -12.5 and +17.5 days from B maximum is used as a set of predictors. This metric space is built to be insensitive to reddening. Hence, it does not predict the part of the color excess due to dust extinction. At the same time, the rich variability of SN Ia spectra is a good predictor of a large fraction of the intrinsic color variability. Such a metric space is a good predictor of the epoch when the maximum in the B-V color curve is reached. Multivariate Partial Least Square (PLS) regression predicts the intrinsic B band light-curve and the intrinsic B-V color curve up to a month after maximum. This allows us to study the relation between the light curves of SNe Ia and their spectra. The total-to-selective extinction ratio $R_V$ in the host galaxies of SNe Ia is found, on average, to be consistent with typical Milky-Way values. This analysis shows the importance of collecting spectra to study SNe Ia, even with large samples publicly available. Future automated surveys such as LSST will provide a large number of light curves. The analysis shows that observing accompanying spectra for a significant number of SNe will be important even in the case of "normal" SNe Ia.

Submitted 13 April, 2016; originally announced April 2016.

Comments: 11 pages, 11 figures

arXiv:1512.06810 [pdf, other]

doi 10.1093/mnras/stw1228

Exploring the spectroscopic diversity of type Ia supernovae with DRACULA: a machine learning approach

Authors: Michele Sasdelli, E. E. O. Ishida, R. Vilalta, M. Aguena, V. C. Busti, H. Camacho, A. M. M. Trindade, F. Gieseke, R. S. de Souza, Y. T. Fantaye, P. A. Mazzali

Abstract: The existence of multiple subclasses of type Ia supernovae (SNeIa) has been the subject of great debate in the last decade. One major challenge inevitably met when trying to infer the existence of one or more subclasses is the time consuming, and subjective, process of subclass definition. In this work, we show how machine learning tools facilitate identification of subtypes of SNeIa through the establishment of a hierarchical group structure in the continuous space of spectral diversity formed by these objects. Using Deep Learning, we were capable of performing such identification in a 4 dimensional feature space (+1 for time evolution), while the standard Principal Component Analysis barely achieves similar results using 15 principal components. This is evidence that the progenitor system and the explosion mechanism can be described by a small number of initial physical parameters. As a proof of concept, we show that our results are in close agreement with a previously suggested classification scheme and that our proposed method can grasp the main spectral features behind the definition of such subtypes. This allows the confirmation of the velocity of lines as a first order effect in the determination of SNIa subtypes, followed by 91bg-like events. Given the expected data deluge in the forthcoming years, our proposed approach is essential to allow a quick and statistically coherent identification of SNeIa subtypes (and outliers). All tools used in this work were made publicly available in the Python package Dimensionality Reduction And Clustering for Unsupervised Learning in Astronomy (DRACULA) and can be found within COINtoolbox (https://github.com/COINtoolbox/DRACULA).

Submitted 30 June, 2016; v1 submitted 21 December, 2015; originally announced December 2015.

Comments: 16 pages, 12 figures, accepted for publication in MNRAS

arXiv:1512.03995 [pdf, other]

doi 10.3847/0004-637X/817/2/114

A Luminous Peculiar Type Ia Supernova SN 2011hr: More Like SN 1991T or SN 2007if?

Authors: Jujia Zhang, Xiaofeng Wang, Michele Sasdelli, Tianmeng Zhang, Zhengweei Liu, Paolo A. Mazzali, Xiangcun Meng, Keiichi Maeda, Juncheng Chen, Fang Huang, Xulin Zhao, Kaicheng Zhang, Qian Zhai, Elena Pian, Bo Wang, Liang Chang, Weimin Yi, Chuan-Jun Wang, Xueli Wang, Yuxin Xin, Jianguo Wang, Baoli Lun, Xiangming Zheng, Xiliang Zhang, Yufeng Fan, et al. (1 additional author not shown)

Abstract: Photometric and spectroscopic observations of a slowly declining, luminous Type Ia supernova (SN Ia) SN 2011hr in the starburst galaxy NGC 2691 are presented. SN 2011hr is found to peak at $M_{B}=-19.84 \pm 0.40\,\rm{mag}$, with a post-maximum decline rate $\Delta m_{15}(B) = 0.92 \pm 0.03\,\rm{mag}$. From the maximum-light bolometric luminosity, $L=(2.30 \pm 0.90) \times 10^{43}\,\rm{erg\,s^{-1}}$, we estimate the mass of synthesized $^{56}$Ni in SN 2011hr to be $M(\rm{^{56}Ni})=1.11 \pm 0.43\,M_{\odot}$. SN 2011hr appears more luminous than SN 1991T at around maximum light, and the absorption features from its intermediate-mass elements (IMEs) are noticeably weaker than those of the latter at similar phases. Spectral modeling suggests that SN 2011hr has an IME mass of $\sim 0.07\,M_{\odot}$ in the outer ejecta, which is much lower than the typical value of normal SNe Ia (i.e., 0.3 -- 0.4 $M_{\odot}$) and is also lower than the value of SN 1991T (i.e., $\sim 0.18\,M_{\odot}$). These results indicate that SN 2011hr may arise from a Chandrasekhar-mass white dwarf progenitor that experienced a more efficient burning process in the explosion. Nevertheless, it is still possible that SN 2011hr may serve as a transitional object connecting the SN 1991T-like SNe Ia with the superluminous subclass like SN 2007if given that the latter also shows very weak IMEs at all phases.

Submitted 7 January, 2016; v1 submitted 12 December, 2015; originally announced December 2015.

Comments: 13 pages, 13 figures. Accepted for publication in the ApJ

arXiv:1411.4424 [pdf, other]

doi 10.1093/mnras/stu2416

A metric space for type Ia supernova spectra

Authors: Michele Sasdelli, W. Hillebrandt, G. Aldering, P. Antilogus, C. Aragon, S. Bailey, C. Baltay, S. Benitez-Herrera, S. Bongard, C. Buton, A. Canto, F. Cellier-Holzem, J. Chen, M. Childress, N. Chotard, Y. Copin, H. K. Fakhouri, U. Feindt, M. Fink, M. Fleury, D. Fouchez, E. Gangler, J. Guy, E. E. O. Ishida, A. G. Kim, et al. (21 additional authors not shown)

Abstract: We develop a new framework for use in exploring Type Ia Supernova (SN Ia) spectra. Combining Principal Component Analysis (PCA) and Partial Least Square analysis (PLS) we are able to establish correlations between the Principal Components (PCs) and spectroscopic/photometric SNe Ia features. The technique was applied to ~120 supernovae and ~800 spectra from the Nearby Supernova Factory. The ability of PCA to group together SNe Ia with similar spectral features, already explored in previous studies, is greatly enhanced by two important modifications: (1) the initial data matrix is built using derivatives of spectra over the wavelength, which increases the weight of weak lines and discards extinction, and (2) we extract time evolution information through the use of entire spectral sequences concatenated in each line of the input data matrix. These allow us to define a stable PC parameter space which can be used to characterize synthetic SN Ia spectra by means of real SN features. Using PLS, we demonstrate that the information from important previously known spectral indicators (namely the pseudo-equivalent width (pEW) of Si II 5972 / Si II 6355 and the line velocity of S II 5640 / Si II 6355) at a given epoch, is contained within the PC space and can be determined through a linear combination of the most important PCs. We also show that the PC space encompasses photometric features like B or V magnitudes, B-V color and SALT2 parameters c and x1. The observed colors and magnitudes, which are heavily affected by extinction, cannot be reconstructed using this technique alone. All the above mentioned applications allowed us to construct a metric space for comparing synthetic SN Ia spectra with observations.

Submitted 17 November, 2014; originally announced November 2014.

Comments: 22 pages, 26 figures, 3 tables, accepted for publication in MNRAS

arXiv:1409.0116 [pdf, ps, other]

doi 10.1093/mnras/stu1777

Abundance stratification in Type Ia Supernovae - IV: the luminous, peculiar SN 1991T

Authors: Michele Sasdelli, Paolo A. Mazzali, Elena Pian, Ken'ichi Nomoto, Stephan Hachinger, Enrico Cappellaro, Stefano Benetti

Abstract: The abundance distribution of the elements in the ejecta of the peculiar, luminous Type Ia supernova (SN Ia) 1991T is obtained by modelling spectra from before maximum light until a year after the explosion, with the method of "Abundance Tomography". SN 1991T is different from other slowly declining SNe Ia (e.g. SN 1999ee) in having a weaker Si II 6355 line and strong features of iron group elements before maximum. The distance to the SN is investigated along with the abundances and the density profile. The ionization transition that happens around maximum sets a strict upper limit on the luminosity. Both W7 and the WDD3 delayed detonation model are tested. WDD3 is found to provide marginally better fits. In this model the core of the ejecta is dominated by stable Fe with a mass of about 0.15 solar masses, as in most SNe Ia. The layer above is mainly 56Ni up to v~10000 km/s (~0.78 solar masses). A significant amount of 56Ni (~3 %) is located in the outer layers. A narrow layer between 10000 km/s and ~12000 km/s is dominated by intermediate mass elements (IME), ~0.18 solar masses. This is small for a SN Ia. The high luminosity and the consequently high ionization, and the high 56Ni abundance at high velocities explain the peculiar early-time spectra of SN 1991T. The outer part is mainly oxygen, ~0.3 solar masses. Carbon lines are never detected, yielding an upper limit of 0.01 solar masses for C. The abundances obtained with the W7 density model are qualitatively similar to those of the WDD3 model. Different elements are stratified with moderate mixing, resembling a delayed detonation.

Submitted 30 August, 2014; originally announced September 2014.

Comments: 19 pages, 9 figures, 4 tables, accepted for publication in MNRAS

Showing 1–28 of 28 results for author: Sasdelli, M