
Showing 1–50 of 77 results for author: Petersson, L

  1. arXiv:2404.11256

    cs.CV

    MMCBE: Multi-modality Dataset for Crop Biomass Estimation and Beyond

    Authors: Xuesong Li, Zeeshan Hayder, Ali Zia, Connor Cassidy, Shiming Liu, Warwick Stiller, Eric Stone, Warren Conaty, Lars Petersson, Vivien Rolland

    Abstract: Crop biomass, a critical indicator of plant growth, health, and productivity, is invaluable for crop breeding programs and agronomic research. However, the accurate and scalable quantification of crop biomass remains inaccessible due to limitations in existing measurement methods. One of the obstacles impeding the advancement of current crop biomass prediction methodologies is the scarcity of publ…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: 10 pages, 10 figures, 3 tables

  2. arXiv:2404.09378

    cs.CV

    Orientation-conditioned Facial Texture Mapping for Video-based Facial Remote Photoplethysmography Estimation

    Authors: Sam Cantrill, David Ahmedt-Aristizabal, Lars Petersson, Hanna Suominen, Mohammad Ali Armin

    Abstract: Camera-based remote photoplethysmography (rPPG) enables contactless measurement of important physiological signals such as pulse rate (PR). However, dynamic and unconstrained subject motion introduces significant variability into the facial appearance in video, confounding the ability of video-based methods to accurately extract the rPPG signal. In this study, we leverage the 3D facial surface to…

    Submitted 1 May, 2024; v1 submitted 14 April, 2024; originally announced April 2024.

    Comments: 12 pages, 8 figures, 6 tables; minor corrections

    ACM Class: I.4.9

  3. arXiv:2403.18442

    cs.CV

    Backpropagation-free Network for 3D Test-time Adaptation

    Authors: Yanshuo Wang, Ali Cheraghian, Zeeshan Hayder, Jie Hong, Sameera Ramasinghe, Shafin Rahman, David Ahmedt-Aristizabal, Xuesong Li, Lars Petersson, Mehrtash Harandi

    Abstract: Real-world systems often encounter new data over time, which leads to experiencing target domain shifts. Existing Test-Time Adaptation (TTA) methods tend to apply computationally heavy and memory-intensive backpropagation-based approaches to handle this. Here, we propose a novel method that uses a backpropagation-free approach for TTA for the specific case of 3D data. Our model uses a two-stream a…

    Submitted 24 April, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

    Comments: CVPR 2024

  4. arXiv:2403.14235

    astro-ph.GA astro-ph.CO astro-ph.IM cs.CV cs.LG

    RG-CAT: Detection Pipeline and Catalogue of Radio Galaxies in the EMU Pilot Survey

    Authors: Nikhel Gupta, Ray P. Norris, Zeeshan Hayder, Minh Huynh, Lars Petersson, X. Rosalind Wang, Andrew M. Hopkins, Heinz Andernach, Yjan Gordon, Simone Riggi, Miranda Yew, Evan J. Crawford, Bärbel Koribalski, Miroslav D. Filipović, Anna D. Kapińska, Stanislav Shabala, Tessa Vernstrom, Joshua R. Marvil

    Abstract: We present source detection and catalogue construction pipelines to build the first catalogue of radio galaxies from the 270 $\rm deg^2$ pilot survey of the Evolutionary Map of the Universe (EMU-PS) conducted with the Australian Square Kilometre Array Pathfinder (ASKAP) telescope. The detection pipeline uses Gal-DINO computer-vision networks (Gupta et al., 2024) to predict the categories of radio…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: Accepted for publication in PASA. The paper has 22 pages, 12 figures and 5 tables

  5. arXiv:2312.10930

    cs.CV

    Deep Learning Approaches for Seizure Video Analysis: A Review

    Authors: David Ahmedt-Aristizabal, Mohammad Ali Armin, Zeeshan Hayder, Norberto Garcia-Cairasco, Lars Petersson, Clinton Fookes, Simon Denman, Aileen McGonigal

    Abstract: Seizure events can manifest as transient disruptions in the control of movements which may be organized in distinct behavioral sequences, accompanied or not by other observable features such as altered facial expressions. The analysis of these clinical signs, referred to as semiology, is subject to observer variations when specialists evaluate video-recorded events in the clinical setting. To enha…

    Submitted 4 March, 2024; v1 submitted 17 December, 2023; originally announced December 2023.

    Comments: Accepted in Epilepsy & Behavior

  6. arXiv:2312.06728

    cs.CV astro-ph.CO astro-ph.GA astro-ph.IM

    A Multimodal Dataset and Benchmark for Radio Galaxy and Infrared Host Detection

    Authors: Nikhel Gupta, Zeeshan Hayder, Ray P. Norris, Minh Huynh, Lars Petersson

    Abstract: We present a novel multimodal dataset developed by expert astronomers to automate the detection and localisation of multi-component extended radio galaxies and their corresponding infrared hosts. The dataset comprises 4,155 instances of galaxies in 2,800 images with both radio and infrared modalities. Each instance contains information on the extended radio galaxy class, its corresponding bounding…

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: Accepted in NeurIPS 2023 conference ML4PS workshop (https://nips.cc/). The full version, accepted in PASA, is available at https://doi.org/10.1017/pasa.2023.64

  7. arXiv:2312.00306

    astro-ph.IM astro-ph.CO astro-ph.GA cs.CV

    RadioGalaxyNET: Dataset and Novel Computer Vision Algorithms for the Detection of Extended Radio Galaxies and Infrared Hosts

    Authors: Nikhel Gupta, Zeeshan Hayder, Ray P. Norris, Minh Huynh, Lars Petersson

    Abstract: Creating radio galaxy catalogues from next-generation deep surveys requires automated identification of associated components of extended sources and their corresponding infrared hosts. In this paper, we introduce RadioGalaxyNET, a multimodal dataset, and a suite of novel computer vision algorithms designed to automate the detection and localization of multi-component extended radio galaxies and t…

    Submitted 30 November, 2023; originally announced December 2023.

    Comments: Accepted for publication in PASA. The paper has 17 pages, 6 figures, 5 tables

  8. arXiv:2311.15836

    cs.CV

    Syn3DWound: A Synthetic Dataset for 3D Wound Bed Analysis

    Authors: Léo Lebrat, Rodrigo Santa Cruz, Remi Chierchia, Yulia Arzhaeva, Mohammad Ali Armin, Joshua Goldsmith, Jeremy Oorloff, Prithvi Reddy, Chuong Nguyen, Lars Petersson, Michelle Barakat-Johnson, Georgina Luscombe, Clinton Fookes, Olivier Salvado, David Ahmedt-Aristizabal

    Abstract: Wound management poses a significant challenge, particularly for bedridden patients and the elderly. Accurate diagnostic and healing monitoring can significantly benefit from modern image analysis, providing accurate and precise measurements of wounds. Despite several existing techniques, the shortage of expansive and diverse training datasets remains a significant obstacle to constructing machine…

    Submitted 3 March, 2024; v1 submitted 27 November, 2023; originally announced November 2023.

    Comments: In the IEEE International Symposium on Biomedical Imaging (ISBI) 2024

  9. arXiv:2310.03335

    cs.CV

    Continual Test-time Domain Adaptation via Dynamic Sample Selection

    Authors: Yanshuo Wang, Jie Hong, Ali Cheraghian, Shafin Rahman, David Ahmedt-Aristizabal, Lars Petersson, Mehrtash Harandi

    Abstract: The objective of Continual Test-time Domain Adaptation (CTDA) is to gradually adapt a pre-trained model to a sequence of target domains without accessing the source data. This paper proposes a Dynamic Sample Selection (DSS) method for CTDA. DSS consists of dynamic thresholding, positive learning, and negative learning processes. Traditionally, models learn from unlabeled unknown environment data a…

    Submitted 27 November, 2023; v1 submitted 5 October, 2023; originally announced October 2023.

    Comments: 2024 IEEE/CVF Winter Conference on Applications of Computer Vision

  10. arXiv:2308.12558

    cs.CV

    Hyperbolic Audio-visual Zero-shot Learning

    Authors: Jie Hong, Zeeshan Hayder, Junlin Han, Pengfei Fang, Mehrtash Harandi, Lars Petersson

    Abstract: Audio-visual zero-shot learning aims to classify samples consisting of a pair of corresponding audio and video sequences from classes that are not present during training. An analysis of the audio-visual data reveals a large degree of hyperbolicity, indicating the potential benefit of using a hyperbolic transformation to achieve curvature-aware geometric learning, with the aim of exploring more co…

    Submitted 16 December, 2023; v1 submitted 24 August, 2023; originally announced August 2023.

    Comments: ICCV 2023

  11. arXiv:2308.05166

    astro-ph.IM astro-ph.CO astro-ph.GA cs.CV cs.LG

    Deep Learning for Morphological Identification of Extended Radio Galaxies using Weak Labels

    Authors: Nikhel Gupta, Zeeshan Hayder, Ray P. Norris, Minh Huynh, Lars Petersson, X. Rosalind Wang, Heinz Andernach, Bärbel S. Koribalski, Miranda Yew, Evan J. Crawford

    Abstract: The present work discusses the use of a weakly-supervised deep learning algorithm that reduces the cost of labelling pixel-level masks for complex radio galaxies with multiple components. The algorithm is trained on weak class-level labels of radio galaxies to get class activation maps (CAMs). The CAMs are further refined using an inter-pixel relations network (IRNet) to get instance segmentation…

    Submitted 9 August, 2023; originally announced August 2023.

    Comments: 14 pages, 6 figures, accepted for publication in PASA

  12. arXiv:2305.19538

    cs.CV cs.LG eess.IV

    Automatic Illumination Spectrum Recovery

    Authors: Nariman Habili, Jeremy Oorloff, Lars Petersson

    Abstract: We develop a deep learning network to estimate the illumination spectrum of hyperspectral images under various lighting conditions. To this end, a dataset, IllumNet, was created. Images were captured using a Specim IQ camera under various illumination conditions, both indoor and outdoor. Outdoor images were captured in sunny, overcast, and shady conditions and at different times of the day. For in…

    Submitted 30 May, 2023; originally announced May 2023.

    Comments: CSIRO Technical report, 19 pages

  13. arXiv:2305.16555

    cs.CV

    CVB: A Video Dataset of Cattle Visual Behaviors

    Authors: Ali Zia, Renuka Sharma, Reza Arablouei, Greg Bishop-Hurley, Jody McNally, Neil Bagnall, Vivien Rolland, Brano Kusy, Lars Petersson, Aaron Ingham

    Abstract: Existing image/video datasets for cattle behavior recognition are mostly small, lack well-defined labels, or are collected in unrealistic controlled environments. This limits the utility of machine learning (ML) models learned from them. Therefore, we introduce a new dataset, called Cattle Visual Behaviors (CVB), that consists of 502 video clips, each fifteen seconds long, captured in natural ligh…

    Submitted 3 July, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

  14. Scalable Optimal Transport Methods in Machine Learning: A Contemporary Survey

    Authors: Abdelwahed Khamis, Russell Tsuchida, Mohamed Tarek, Vivien Rolland, Lars Petersson

    Abstract: Optimal Transport (OT) is a mathematical framework that first emerged in the eighteenth century and has led to a plethora of methods for answering many theoretical and applied questions. The last decade has been a witness to the remarkable contributions of this classical optimization problem to machine learning. This paper is about where and how optimal transport is used in machine learning with a…

    Submitted 21 March, 2024; v1 submitted 8 May, 2023; originally announced May 2023.

    Comments: Accepted @ TPAMI 24

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence 2024

  15. Topological Deep Learning: A Review of an Emerging Paradigm

    Authors: Ali Zia, Abdelwahed Khamis, James Nichols, Zeeshan Hayder, Vivien Rolland, Lars Petersson

    Abstract: Topological data analysis (TDA) provides insight into data shape. The summaries obtained by these methods are principled global descriptions of multi-dimensional data whilst exhibiting stable properties such as robustness to deformation and noise. Such properties are desirable in deep learning pipelines but they are typically obtained using non-TDA strategies. This is partly caused by the difficul…

    Submitted 7 February, 2023; originally announced February 2023.

    Comments: 7 pages and 2 references

  16. arXiv:2212.02749

    cs.CV

    A Hyperspectral and RGB Dataset for Building Facade Segmentation

    Authors: Nariman Habili, Ernest Kwan, Weihao Li, Christfried Webers, Jeremy Oorloff, Mohammad Ali Armin, Lars Petersson

    Abstract: Hyperspectral Imaging (HSI) provides detailed spectral information and has been utilised in many real-world applications. This work introduces an HSI dataset of building facades in a light industry environment with the aim of classifying different building materials in a scene. The dataset is called the Light Industrial Building HSI (LIB-HSI) dataset. This dataset consists of nine categories and 4…

    Submitted 5 December, 2022; originally announced December 2022.

  17. arXiv:2212.02011

    cs.CV

    PointCaM: Cut-and-Mix for Open-Set Point Cloud Learning

    Authors: Jie Hong, Shi Qiu, Weihao Li, Saeed Anwar, Mehrtash Harandi, Nick Barnes, Lars Petersson

    Abstract: Point cloud learning is receiving increasing attention; however, most existing point cloud models lack the practical ability to deal with the unavoidable presence of unknown objects. This paper mainly discusses point cloud learning under open-set settings, where we train the model without data from unknown classes and identify them in the inference stage. Basically, we propose to solve open-set po…

    Submitted 24 August, 2023; v1 submitted 4 December, 2022; originally announced December 2022.

  18. arXiv:2211.07625

    cs.CV cs.AI cs.LG

    What Images are More Memorable to Machines?

    Authors: Junlin Han, Huangying Zhan, Jie Hong, Pengfei Fang, Hongdong Li, Lars Petersson, Ian Reid

    Abstract: This paper studies the problem of measuring and predicting how memorable an image is to pattern recognition machines, as a path to explore machine intelligence. Firstly, we propose a self-supervised machine memory quantification pipeline, dubbed "MachineMem measurer", to collect machine memorability scores of images. Similar to humans, machines also tend to memorize certain kinds of images, wher…

    Submitted 11 July, 2023; v1 submitted 14 November, 2022; originally announced November 2022.

    Comments: Code: https://github.com/JunlinHan/MachineMem Project page: https://junlinhan.github.io/projects/machinemem.html

  19. arXiv:2210.06120

    cs.CV

    Efficient Gaussian Process Model on Class-Imbalanced Datasets for Generalized Zero-Shot Learning

    Authors: Changkun Ye, Nick Barnes, Lars Petersson, Russell Tsuchida

    Abstract: Zero-Shot Learning (ZSL) models aim to classify object classes that are not seen during the training process. However, the problem of class imbalance is rarely discussed, despite its presence in several ZSL datasets. In this paper, we propose a Neural Network model that learns a latent feature embedding and a Gaussian Process (GP) regression model that predicts latent feature prototypes of unseen…

    Submitted 11 October, 2022; originally announced October 2022.

    Comments: Paper Accepted in ICPR 2022

  20. arXiv:2209.06469

    cs.CV

    Learning Deep Optimal Embeddings with Sinkhorn Divergences

    Authors: Soumava Kumar Roy, Yan Han, Mehrtash Harandi, Lars Petersson

    Abstract: Deep Metric Learning algorithms aim to learn an efficient embedding space to preserve the similarity relationships among the input data. Whilst these algorithms have achieved significant performance gains across a wide plethora of tasks, they have also failed to consider and increase comprehensive similarity constraints; thus learning a sub-optimal metric in the embedding space. Moreover, up until…

    Submitted 14 September, 2022; originally announced September 2022.

  21. arXiv:2208.01188

    cs.CV

    Curved Geometric Networks for Visual Anomaly Recognition

    Authors: Jie Hong, Pengfei Fang, Weihao Li, Junlin Han, Lars Petersson, Mehrtash Harandi

    Abstract: Learning a latent embedding to understand the underlying nature of data distribution is often formulated in Euclidean spaces with zero curvature. However, the success of the geometry constraints, posed in the embedding space, indicates that curved spaces might encode more structural information, leading to better discriminative power and hence richer representations. In this work, we investigate b…

    Submitted 1 August, 2022; originally announced August 2022.

  22. arXiv:2205.15955

    cs.CV eess.IV

    CropMix: Sampling a Rich Input Distribution via Multi-Scale Cropping

    Authors: Junlin Han, Lars Petersson, Hongdong Li, Ian Reid

    Abstract: We present a simple method, CropMix, for the purpose of producing a rich input distribution from the original dataset distribution. Unlike single random cropping, which may inadvertently capture only limited information, or irrelevant information, like pure background, unrelated objects, etc, we crop an image multiple times using distinct crop scales, thereby ensuring that multi-scale information…

    Submitted 31 May, 2022; originally announced May 2022.

    Comments: Code: https://github.com/JunlinHan/CropMix

  23. Monitoring of Pigmented Skin Lesions Using 3D Whole Body Imaging

    Authors: David Ahmedt-Aristizabal, Chuong Nguyen, Lachlan Tychsen-Smith, Ashley Stacey, Shenghong Li, Joseph Pathikulangara, Lars Petersson, Dadong Wang

    Abstract: Advanced artificial intelligence and machine learning have great potential to redefine how skin lesions are detected, mapped, tracked and documented. Here, we propose a 3D whole-body imaging system known as 3DSkin-mapper to enable automated detection, evaluation and mapping of skin lesions. A modular camera rig arranged in a cylindrical configuration was designed to automatically capture images of…

    Submitted 26 February, 2023; v1 submitted 14 May, 2022; originally announced May 2022.

    Comments: In Computer Methods and Programs in Biomedicine

    Journal ref: Volume 232, April 2023, 107451

  24. arXiv:2204.06788

    cs.CV

    Pyramidal Attention for Saliency Detection

    Authors: Tanveer Hussain, Abbas Anwar, Saeed Anwar, Lars Petersson, Sung Wook Baik

    Abstract: Salient object detection (SOD) extracts meaningful contents from an input image. RGB-based SOD methods lack the complementary depth clues; hence, providing limited performance for complex scenarios. Similarly, RGB-D models process RGB and depth inputs, but the depth data availability during testing may hinder the model's practical applicability. This paper exploits only RGB images, estimates depth…

    Submitted 14 April, 2022; originally announced April 2022.

    Comments: Accepted at CVPRW 2022. (2022 IEEE CVPR Workshop on Fair, Data Efficient and Trusted Computer Vision)

  25. arXiv:2204.05604

    cs.CV

    Towards Open-Set Object Detection and Discovery

    Authors: Jiyang Zheng, Weihao Li, Jie Hong, Lars Petersson, Nick Barnes

    Abstract: With the human pursuit of knowledge, open-set object detection (OSOD) has been designed to identify unknown objects in a dynamic world. However, an issue with the current setting is that all the predicted unknown objects share the same category as "unknown", which requires incremental learning via a human-in-the-loop approach to label novel classes. In order to address this problem, we present a ne…

    Submitted 12 April, 2022; originally announced April 2022.

    Comments: CVPRW 2022

  26. arXiv:2203.12919

    cs.CV cs.GR cs.LG

    Learning Dense Correspondence from Synthetic Environments

    Authors: Mithun Lal, Anthony Paproki, Nariman Habili, Lars Petersson, Olivier Salvado, Clinton Fookes

    Abstract: Estimation of human shape and pose from a single image is a challenging task. It is an even more difficult problem to map the identified human shape onto a 3D human model. Existing methods map manually labelled human pixels in real 2D images onto the 3D surface, which is prone to human error, and the sparsity of available annotated data often leads to sub-optimal results. We propose to solve the p…

    Submitted 24 March, 2022; originally announced March 2022.

    Comments: Submitted to ICIP 2022

  27. arXiv:2203.12116

    cs.CV cs.RO

    GOSS: Towards Generalized Open-set Semantic Segmentation

    Authors: Jie Hong, Weihao Li, Junlin Han, Jiyang Zheng, Pengfei Fang, Mehrtash Harandi, Lars Petersson

    Abstract: In this paper, we present and study a new image segmentation task, called Generalized Open-set Semantic Segmentation (GOSS). Previously, with the well-known open-set semantic segmentation (OSS), the intelligent agent only detects the unknown regions without further processing, limiting their perception of the environment. It stands to reason that a further analysis of the detected unknown pixels w…

    Submitted 22 March, 2022; originally announced March 2022.

  28. arXiv:2202.13096

    cs.CV cs.HC cs.LG

    Continuous Human Action Recognition for Human-Machine Interaction: A Review

    Authors: Harshala Gammulle, David Ahmedt-Aristizabal, Simon Denman, Lachlan Tychsen-Smith, Lars Petersson, Clinton Fookes

    Abstract: With advances in data-driven machine learning research, a wide variety of prediction models have been proposed to capture spatio-temporal features for the analysis of video streams. Recognising actions and detecting action transitions within an input video are challenging but necessary tasks for applications that require real-time human-machine interaction. By reviewing a large body of recent rela…

    Submitted 26 February, 2022; originally announced February 2022.

    Comments: Preprint submitted to ACM Computing Surveys

    Journal ref: 2023, Volume 55, Issue 13s

  29. arXiv:2201.12078

    cs.CV cs.LG

    You Only Cut Once: Boosting Data Augmentation with a Single Cut

    Authors: Junlin Han, Pengfei Fang, Weihao Li, Jie Hong, Mohammad Ali Armin, Ian Reid, Lars Petersson, Hongdong Li

    Abstract: We present You Only Cut Once (YOCO) for performing data augmentations. YOCO cuts one image into two pieces and performs data augmentations individually within each piece. Applying YOCO improves the diversity of the augmentation per sample and encourages neural networks to recognize objects from partial information. YOCO enjoys the properties of parameter-free, easy usage, and boosting almost all a…

    Submitted 15 June, 2022; v1 submitted 28 January, 2022; originally announced January 2022.

    Comments: ICML 2022, Code: https://github.com/JunlinHan/YOCO

  30. arXiv:2112.09600

    cs.CL

    Transcribing Natural Languages for The Deaf via Neural Editing Programs

    Authors: Dongxu Li, Chenchen Xu, Liu Liu, Yiran Zhong, Rong Wang, Lars Petersson, Hongdong Li

    Abstract: This work studies the task of glossification, of which the aim is to transcribe natural spoken language sentences for the Deaf (hard-of-hearing) community to ordered sign language glosses. Previous sequence-to-sequence language models trained with paired sentence-gloss data often fail to capture the rich connections between the two distinct languages, leading to unsatisfactory transcriptions. W…

    Submitted 17 December, 2021; originally announced December 2021.

  31. In-Bed Human Pose Estimation from Unseen and Privacy-Preserving Image Domains

    Authors: Ting Cao, Mohammad Ali Armin, Simon Denman, Lars Petersson, David Ahmedt-Aristizabal

    Abstract: Medical applications have benefited greatly from the rapid advancement in computer vision. Considering patient monitoring in particular, in-bed human posture estimation offers important health-related metrics with potential value in medical condition assessments. Despite great progress in this domain, it remains challenging due to substantial ambiguity during occlusions, and the lack of large corp…

    Submitted 24 January, 2022; v1 submitted 29 November, 2021; originally announced November 2021.

    Comments: In the IEEE International Symposium on Biomedical Imaging (ISBI)

    Journal ref: ISBI 2022

  32. arXiv:2110.12197

    cs.LG cs.CV

    Towards a Robust Differentiable Architecture Search under Label Noise

    Authors: Christian Simon, Piotr Koniusz, Lars Petersson, Yan Han, Mehrtash Harandi

    Abstract: Neural Architecture Search (NAS) is the game changer in designing robust neural architectures. Architectures designed by NAS outperform or compete with the best manual network designs in terms of accuracy, size, memory footprint and FLOPs. That said, previous studies focus on developing NAS algorithms for clean high quality data, a restrictive and somewhat unrealistic assumption. In this paper, fo…

    Submitted 23 October, 2021; originally announced October 2021.

    Comments: Accepted to WACV 2022

  33. arXiv:2109.09300

    cs.LG cs.CV

    Feature Correlation Aggregation: on the Path to Better Graph Neural Networks

    Authors: Jieming Zhou, Tong Zhang, Pengfei Fang, Lars Petersson, Mehrtash Harandi

    Abstract: Prior to the introduction of Graph Neural Networks (GNNs), modeling and analyzing irregular data, particularly graphs, was thought to be the Achilles' heel of deep learning. The core concept of GNNs is to find a representation by recursively aggregating the representations of a central node and those of its neighbors. …

    Submitted 20 September, 2021; originally announced September 2021.

  34. arXiv:2108.11364

    cs.CV eess.IV

    Blind Image Decomposition

    Authors: Junlin Han, Weihao Li, Pengfei Fang, Chunyi Sun, Jie Hong, Mohammad Ali Armin, Lars Petersson, Hongdong Li

    Abstract: We propose and study a novel task named Blind Image Decomposition (BID), which requires separating a superimposed image into constituent underlying images in a blind setting, that is, both the source components involved in mixing as well as the mixing mechanism are unknown. For example, rain may consist of multiple components, such as rain streaks, raindrops, snow, and haze. Rainy images can be tr…

    Submitted 18 July, 2022; v1 submitted 25 August, 2021; originally announced August 2021.

    Comments: ECCV 2022. Project page: https://junlinhan.github.io/projects/BID.html. Code: https://github.com/JunlinHan/BID

  35. A Survey on Graph-Based Deep Learning for Computational Histopathology

    Authors: David Ahmedt-Aristizabal, Mohammad Ali Armin, Simon Denman, Clinton Fookes, Lars Petersson

    Abstract: With the remarkable success of representation learning for prediction problems, we have witnessed a rapid expansion of the use of machine learning and deep learning for the analysis of digital pathology and biopsy image patches. However, learning over patch-wise features using convolutional neural networks limits the ability of the model to capture global contextual information and comprehensively…

    Submitted 27 September, 2021; v1 submitted 1 July, 2021; originally announced July 2021.

    Comments: Preprint submitted to Computerized Medical Imaging and Graphics

    Journal ref: Volume 95, January 2022, 102027

  36. arXiv:2106.10718

    eess.IV cs.CV

    Underwater Image Restoration via Contrastive Learning and a Real-world Dataset

    Authors: Junlin Han, Mehrdad Shoeiby, Tim Malthus, Elizabeth Botha, Janet Anstee, Saeed Anwar, Ran Wei, Mohammad Ali Armin, Hongdong Li, Lars Petersson

    Abstract: Underwater image restoration is of significant importance in unveiling the underwater world. Numerous techniques and algorithms have been developed in the past decades. However, due to fundamental difficulties associated with imaging/sensing, lighting, and refractive geometric distortions, in capturing clear underwater images, no comprehensive evaluations have been conducted of underwater image re…

    Submitted 20 June, 2021; originally announced June 2021.

    Comments: In submission, code/dataset are at https://github.com/JunlinHan/CWR. arXiv admin note: text overlap with arXiv:2103.09697

  37. Video-Based Inpatient Fall Risk Assessment: A Case Study

    Authors: Ziqing Wang, Mohammad Ali Armin, Simon Denman, Lars Petersson, David Ahmedt-Aristizabal

    Abstract: Inpatient falls are a serious safety issue in hospitals and healthcare facilities. Recent advances in video analytics for patient monitoring provide a non-intrusive avenue to reduce this risk through continuous activity monitoring. However, in-bed fall risk assessment systems have received less attention in the literature. The majority of prior studies have focused on fall event detection, and do…

    Submitted 27 May, 2021; originally announced June 2021.

    Journal ref: IEEE Engineering in Medicine & Biology Society (EMBC) 2021

  38. Towards Interpretable Attention Networks for Cervical Cancer Analysis

    Authors: Ruiqi Wang, Mohammad Ali Armin, Simon Denman, Lars Petersson, David Ahmedt-Aristizabal

    Abstract: Recent advances in deep learning have enabled the development of automated frameworks for analysing medical images and signals, including analysis of cervical cancer. Many previous works focus on the analysis of isolated cervical cells, or do not offer sufficient methods to explain and understand how the proposed models reach their classification decisions on multi-cell images. Here, we evaluate v…

    Submitted 27 May, 2021; originally announced June 2021.

    Journal ref: IEEE Engineering in Medicine & Biology Society (EMBC) 2021

  39. arXiv:2105.13137

    cs.LG cs.CV q-bio.QM

    Graph-Based Deep Learning for Medical Diagnosis and Analysis: Past, Present and Future

    Authors: David Ahmedt-Aristizabal, Mohammad Ali Armin, Simon Denman, Clinton Fookes, Lars Petersson

    Abstract: With the advances of data-driven machine learning research, a wide variety of prediction problems have been tackled. It has become critical to explore how machine learning and specifically deep learning methods can be exploited to analyse healthcare data. A major limitation of existing methods has been the focus on grid-like data; however, the structure of physiological recordings is often irregu…

    Submitted 27 May, 2021; originally announced May 2021.

    Journal ref: Sensors 2021, 21, 4758

  40. arXiv:2104.07689

    cs.CV eess.IV

    Dual Contrastive Learning for Unsupervised Image-to-Image Translation

    Authors: Junlin Han, Mehrdad Shoeiby, Lars Petersson, Mohammad Ali Armin

    Abstract: Unsupervised image-to-image translation tasks aim to find a mapping between a source domain X and a target domain Y from unpaired training data. Contrastive learning for Unpaired image-to-image Translation (CUT) yields state-of-the-art results in modeling unsupervised image-to-image translation by maximizing mutual information between input and output patches using only one encoder for both domain…

    Submitted 15 April, 2021; originally announced April 2021.

    Comments: Accepted to NTIRE, CVPRW 2021. Code is available at https://github.com/JunlinHan/DCLGAN

  41. arXiv:2104.04980  [pdf, other]

    cs.CV

    Zero-Shot Learning on 3D Point Cloud Objects and Beyond

    Authors: Ali Cheraghian, Shafin Rahman, Townim F. Chowdhury, Dylan Campbell, Lars Petersson

    Abstract: Zero-shot learning, the task of learning to recognize new classes not seen during training, has received considerable attention in the case of 2D image classification. However, despite the increasing ubiquity of 3D sensors, the corresponding 3D point cloud classification problem has not been meaningfully explored and introduces new challenges. In this paper, we identify some of the challenges and…

    Submitted 11 April, 2021; originally announced April 2021.

    Comments: arXiv admin note: substantial text overlap with arXiv:1912.07161

  42. arXiv:2104.04192  [pdf, other]

    cs.CV

    Reinforced Attention for Few-Shot Learning and Beyond

    Authors: Jie Hong, Pengfei Fang, Weihao Li, Tong Zhang, Christian Simon, Mehrtash Harandi, Lars Petersson

    Abstract: Few-shot learning aims to correctly recognize query samples from unseen classes given a limited number of support samples, often by relying on global embeddings of images. In this paper, we propose to equip the backbone network with an attention agent, which is trained by reinforcement learning. The policy gradient algorithm is employed to train the agent towards adaptively localizing the represen…

    Submitted 9 April, 2021; originally announced April 2021.

  43. arXiv:2103.09697  [pdf, other]

    cs.CV eess.IV

    Single Underwater Image Restoration by Contrastive Learning

    Authors: Junlin Han, Mehrdad Shoeiby, Tim Malthus, Elizabeth Botha, Janet Anstee, Saeed Anwar, Ran Wei, Lars Petersson, Mohammad Ali Armin

    Abstract: Underwater image restoration attracts significant attention due to its importance in unveiling the underwater world. This paper elaborates on a novel method that achieves state-of-the-art results for underwater image restoration based on the unsupervised image-to-image translation framework. We design our method by leveraging contrastive learning and generative adversarial networks to maximiz…

    Submitted 15 April, 2021; v1 submitted 17 March, 2021; originally announced March 2021.

    Comments: Accepted to IGARSS 2021 as oral presentation. Code is available at https://github.com/JunlinHan/CWR

  44. arXiv:2103.04059  [pdf, other]

    cs.CV

    Semantic-aware Knowledge Distillation for Few-Shot Class-Incremental Learning

    Authors: Ali Cheraghian, Shafin Rahman, Pengfei Fang, Soumava Kumar Roy, Lars Petersson, Mehrtash Harandi

    Abstract: Few-shot class incremental learning (FSCIL) portrays the problem of learning new concepts gradually, where only a few examples per concept are available to the learner. Due to the limited number of examples for training, the techniques developed for standard incremental learning cannot be applied verbatim to FSCIL. In this work, we introduce a distillation algorithm to address the problem of FSCIL…

    Submitted 30 March, 2021; v1 submitted 6 March, 2021; originally announced March 2021.

    Comments: Accepted at CVPR 2021

  45. arXiv:2011.00774  [pdf, other]

    cs.CV

    Set Augmented Triplet Loss for Video Person Re-Identification

    Authors: Pengfei Fang, Pan Ji, Lars Petersson, Mehrtash Harandi

    Abstract: Modern video person re-identification (re-ID) machines are often trained using a metric learning approach, supervised by a triplet loss. The triplet loss used in video re-ID is usually based on so-called clip features, each aggregated from a few frame features. In this paper, we propose to model the video clip as a set and instead study the distance between sets in the corresponding triplet loss.…

    Submitted 6 November, 2020; v1 submitted 2 November, 2020; originally announced November 2020.

    Comments: to appear in WACV 2021

  46. arXiv:2010.03108  [pdf, other]

    cs.CV cs.LG

    Channel Recurrent Attention Networks for Video Pedestrian Retrieval

    Authors: Pengfei Fang, Pan Ji, Jieming Zhou, Lars Petersson, Mehrtash Harandi

    Abstract: Full attention, which generates an attention value per element of the input feature maps, has been successfully demonstrated to be beneficial in visual tasks. In this work, we propose a fully attentional network, termed channel recurrent attention network, for the task of video pedestrian retrieval. The main attention unit, channel recurrent attention, identifies attention maps at t…

    Submitted 6 October, 2020; originally announced October 2020.

    Comments: To appear in ACCV 2020

  47. arXiv:2006.09597  [pdf, other]

    cs.CV

    Cross-Correlated Attention Networks for Person Re-Identification

    Authors: Jieming Zhou, Soumava Kumar Roy, Pengfei Fang, Mehrtash Harandi, Lars Petersson

    Abstract: Deep neural networks need to make robust inference in the presence of occlusion, background clutter, pose and viewpoint variations -- to name a few -- when the task of person re-identification is considered. Attention mechanisms have recently proven to be successful in handling the aforementioned challenges to some degree. However, previous designs fail to capture inherent inter-dependencies betwee…

    Submitted 16 June, 2020; originally announced June 2020.

    Comments: Accepted by Image and Vision Computing

    Journal ref: Image and Vision Computing, Vol. 100, 2020, p. 103931

  48. arXiv:2004.13524  [pdf, other]

    cs.CV cs.LG eess.IV

    Attention Based Real Image Restoration

    Authors: Saeed Anwar, Nick Barnes, Lars Petersson

    Abstract: Deep convolutional neural networks perform better on images containing spatially invariant degradations, also known as synthetic degradations; however, their performance is limited on real-degraded photographs and requires multiple-stage network modeling. To advance the practicability of restoration algorithms, this paper proposes a novel single-stage blind real image restoration network (R$^2$Net…

    Submitted 1 October, 2020; v1 submitted 26 April, 2020; originally announced April 2020.

    Comments: arXiv admin note: substantial text overlap with arXiv:1904.07396

  49. arXiv:2004.06853  [pdf, other]

    eess.IV cs.CV cs.LG

    Mosaic Super-resolution via Sequential Feature Pyramid Networks

    Authors: Mehrdad Shoeiby, Mohammad Ali Armin, Sadegh Aliakbarian, Saeed Anwar, Lars Petersson

    Abstract: Advances in the design of multi-spectral cameras have led to great interest in a wide range of applications, from astronomy to autonomous driving. However, such cameras inherently suffer from a trade-off between the spatial and spectral resolution. In this paper, we propose to address this limitation by introducing a novel method to carry out super-resolution on raw mosaic images, multi-spectral…

    Submitted 14 April, 2020; originally announced April 2020.

    Comments: Accepted by IEEE CVPR Workshop

  50. arXiv:2003.11154  [pdf, other]

    cs.CV cs.LG eess.IV

    A Systematic Evaluation: Fine-Grained CNN vs. Traditional CNN Classifiers

    Authors: Saeed Anwar, Nick Barnes, Lars Petersson

    Abstract: To make the best use of the underlying minute and subtle differences, fine-grained classifiers collect information about inter-class variations. The task is very challenging due to the small differences between the colors, viewpoint, and structure in the same class entities. The classification becomes more difficult due to the similarities between the differences in viewpoint with other classes an…

    Submitted 2 November, 2021; v1 submitted 24 March, 2020; originally announced March 2020.