
Showing 1–17 of 17 results for author: Fiameni, G

Searching in archive cs.
  1. arXiv:2407.02075

    cs.CV

    Label Anything: Multi-Class Few-Shot Semantic Segmentation with Visual Prompts

    Authors: Pasquale De Marinis, Nicola Fanelli, Raffaele Scaringi, Emanuele Colonna, Giuseppe Fiameni, Gennaro Vessio, Giovanna Castellano

    Abstract: We present Label Anything, an innovative neural network architecture designed for few-shot semantic segmentation (FSS) that demonstrates remarkable generalizability across multiple classes with minimal examples required per class. Diverging from traditional FSS methods that predominantly rely on masks for annotating support images, Label Anything introduces varied visual prompts -- points, boundin…

    Submitted 2 July, 2024; originally announced July 2024.

  2. arXiv:2403.13479

    cs.CV cs.AI

    Deepfake Detection without Deepfakes: Generalization via Synthetic Frequency Patterns Injection

    Authors: Davide Alessandro Coccomini, Roberto Caldelli, Claudio Gennaro, Giuseppe Fiameni, Giuseppe Amato, Fabrizio Falchi

    Abstract: Deepfake detectors are typically trained on large sets of pristine and generated images, resulting in limited generalization capacity; they excel at identifying deepfakes created through methods encountered during training but struggle with those generated by unknown techniques. This paper introduces a learning approach aimed at significantly enhancing the generalization capabilities of deepfake d…

    Submitted 20 March, 2024; originally announced March 2024.

  3. arXiv:2312.09993

    cs.CL

    LLaMAntino: LLaMA 2 Models for Effective Text Generation in Italian Language

    Authors: Pierpaolo Basile, Elio Musacchio, Marco Polignano, Lucia Siciliani, Giuseppe Fiameni, Giovanni Semeraro

    Abstract: Large Language Models represent state-of-the-art linguistic models designed to equip computers with the ability to comprehend natural language. With its exceptional capacity to capture complex contextual relationships, the LLaMA (Large Language Model Meta AI) family represents a novel advancement in the field of natural language processing by releasing foundational models designed to improve the n…

    Submitted 15 December, 2023; originally announced December 2023.

  4. arXiv:2308.14619

    cs.CV

    Compositional Semantic Mix for Domain Adaptation in Point Cloud Segmentation

    Authors: Cristiano Saltori, Fabio Galasso, Giuseppe Fiameni, Nicu Sebe, Fabio Poiesi, Elisa Ricci

    Abstract: Deep-learning models for 3D point cloud semantic segmentation exhibit limited generalization capabilities when trained and tested on data captured with different sensors or in varying environments due to domain shift. Domain adaptation methods can be employed to mitigate this domain shift, for instance, by simulating sensor noise, developing domain-agnostic generators, or training point cloud comp…

    Submitted 29 August, 2023; v1 submitted 28 August, 2023; originally announced August 2023.

    Comments: TPAMI. arXiv admin note: text overlap with arXiv:2207.09778

  5. arXiv:2307.02392

    cs.CV

    RADiff: Controllable Diffusion Models for Radio Astronomical Maps Generation

    Authors: Renato Sortino, Thomas Cecconello, Andrea DeMarco, Giuseppe Fiameni, Andrea Pilzer, Andrew M. Hopkins, Daniel Magro, Simone Riggi, Eva Sciacca, Adriano Ingallinera, Cristobal Bordiu, Filomena Bufano, Concetto Spampinato

    Abstract: Along with the nearing completion of the Square Kilometre Array (SKA), comes an increasing demand for accurate and reliable automated solutions to extract valuable information from the vast amount of data it will allow acquiring. Automated source finding is a particularly important task in this context, as it enables the detection and classification of astronomical objects. Deep-learning-based obj…

    Submitted 5 July, 2023; originally announced July 2023.

  6. Radio astronomical images object detection and segmentation: A benchmark on deep learning methods

    Authors: Renato Sortino, Daniel Magro, Giuseppe Fiameni, Eva Sciacca, Simone Riggi, Andrea DeMarco, Concetto Spampinato, Andrew M. Hopkins, Filomena Bufano, Francesco Schillirò, Cristobal Bordiu, Carmelo Pino

    Abstract: In recent years, deep learning has been successfully applied in various scientific domains. Following these promising results and performances, it has recently also started being evaluated in the domain of radio astronomy. In particular, since radio astronomy is entering the Big Data era, with the advent of the largest telescope in the world - the Square Kilometre Array (SKA), the task of automati…

    Submitted 25 May, 2023; v1 submitted 8 March, 2023; originally announced March 2023.

  7. arXiv:2212.08830

    cs.CV

    Inductive Attention for Video Action Anticipation

    Authors: Tsung-Ming Tai, Giuseppe Fiameni, Cheng-Kuang Lee, Simon See, Oswald Lanz

    Abstract: Anticipating future actions based on spatiotemporal observations is essential in video understanding and predictive computer vision. Moreover, a model capable of anticipating the future has important applications, it can benefit precautionary systems to react before an event occurs. However, unlike in the action recognition task, future information is inaccessible at observation time -- a model ca…

    Submitted 18 March, 2023; v1 submitted 17 December, 2022; originally announced December 2022.

  8. arXiv:2207.09778

    cs.CV cs.AI cs.LG

    CoSMix: Compositional Semantic Mix for Domain Adaptation in 3D LiDAR Segmentation

    Authors: Cristiano Saltori, Fabio Galasso, Giuseppe Fiameni, Nicu Sebe, Elisa Ricci, Fabio Poiesi

    Abstract: 3D LiDAR semantic segmentation is fundamental for autonomous driving. Several Unsupervised Domain Adaptation (UDA) methods for point cloud data have been recently proposed to improve model generalization for different sensors and environments. Researchers working on UDA problems in the image domain have shown that sample mixing can mitigate domain shift. We propose a new approach of sample mixing…

    Submitted 20 July, 2022; originally announced July 2022.

    Comments: Accepted at ECCV 2022

  9. arXiv:2207.09763

    cs.CV cs.AI cs.LG

    GIPSO: Geometrically Informed Propagation for Online Adaptation in 3D LiDAR Segmentation

    Authors: Cristiano Saltori, Evgeny Krivosheev, Stéphane Lathuilière, Nicu Sebe, Fabio Galasso, Giuseppe Fiameni, Elisa Ricci, Fabio Poiesi

    Abstract: 3D point cloud semantic segmentation is fundamental for autonomous driving. Most approaches in the literature neglect an important aspect, i.e., how to deal with domain shift when handling dynamic scenes. This can significantly hinder the navigation capabilities of self-driving vehicles. This paper advances the state of the art in this research field. Our first contribution consists in analysing a…

    Submitted 20 July, 2022; originally announced July 2022.

    Comments: Accepted at ECCV 2022

  10. arXiv:2206.10869

    cs.CV

    NVIDIA-UNIBZ Submission for EPIC-KITCHENS-100 Action Anticipation Challenge 2022

    Authors: Tsung-Ming Tai, Oswald Lanz, Giuseppe Fiameni, Yi-Kwan Wong, Sze-Sen Poon, Cheng-Kuang Lee, Ka-Chun Cheung, Simon See

    Abstract: In this report, we describe the technical details of our submission for the EPIC-Kitchen-100 action anticipation challenge. Our modelings, the higher-order recurrent space-time transformer and the message-passing neural network with edge learning, are both recurrent-based architectures which observe only 2.5 seconds inference context to form the action anticipation prediction. By averaging the pre…

    Submitted 22 June, 2022; originally announced June 2022.

  11. arXiv:2206.01009

    cs.CV

    Unified Recurrence Modeling for Video Action Anticipation

    Authors: Tsung-Ming Tai, Giuseppe Fiameni, Cheng-Kuang Lee, Simon See, Oswald Lanz

    Abstract: Forecasting future events based on evidence of current conditions is an innate skill of human beings, and key for predicting the outcome of any decision making. In artificial vision for example, we would like to predict the next human action before it happens, without observing the future video frames associated to it. Computer vision models for action anticipation are expected to collect the subt…

    Submitted 2 June, 2022; originally announced June 2022.

  12. Efficient yet Competitive Speech Translation: FBK@IWSLT2022

    Authors: Marco Gaido, Sara Papi, Dennis Fucci, Giuseppe Fiameni, Matteo Negri, Marco Turchi

    Abstract: The primary goal of this FBK's systems submission to the IWSLT 2022 offline and simultaneous speech translation tasks is to reduce model training costs without sacrificing translation quality. As such, we first question the need of ASR pre-training, showing that it is not essential to achieve competitive results. Second, we focus on data filtering, showing that a simple method that looks at the ra…

    Submitted 5 May, 2022; originally announced May 2022.

    Comments: IWSLT 2022 System Description

    Journal ref: Proceedings of the 19th International Conference on Spoken Language Translation (IWSLT 2022)

  13. arXiv:2111.12727

    cs.CV cs.AI cs.CL cs.MM

    Generating More Pertinent Captions by Leveraging Semantics and Style on Multi-Source Datasets

    Authors: Marcella Cornia, Lorenzo Baraldi, Giuseppe Fiameni, Rita Cucchiara

    Abstract: This paper addresses the task of generating fluent descriptions by training on a non-uniform combination of data sources, containing both human-annotated and web-collected captions. Large-scale datasets with noisy image-text pairs, indeed, provide a sub-optimal source of supervision because of their low-quality descriptive style, while human-annotated datasets are cleaner but smaller in scale. To…

    Submitted 30 November, 2023; v1 submitted 24 November, 2021; originally announced November 2021.

    Comments: Accepted to IJCV

  14. arXiv:2109.07247

    cs.RO cs.CV

    Towards Precise Pruning Points Detection using Semantic-Instance-Aware Plant Models for Grapevine Winter Pruning Automation

    Authors: Miguel Fernandes, Antonello Scaldaferri, Paolo Guadagna, Giuseppe Fiameni, Tao Teng, Matteo Gatti, Stefano Poni, Claudio Semini, Darwin Caldwell, Fei Chen

    Abstract: Grapevine winter pruning is a complex task, that requires skilled workers to execute it correctly. The complexity makes it time consuming. It is an operation that requires about 80-120 hours per hectare annually, making an automated robotic system that helps in speeding up the process a crucial tool in large-size vineyards. We will describe (a) a novel expert annotated dataset for grapevine segmen…

    Submitted 15 September, 2021; originally announced September 2021.

    Comments: arXiv admin note: text overlap with arXiv:2106.04208

  15. arXiv:2107.06912

    cs.CV cs.CL

    From Show to Tell: A Survey on Deep Learning-based Image Captioning

    Authors: Matteo Stefanini, Marcella Cornia, Lorenzo Baraldi, Silvia Cascianelli, Giuseppe Fiameni, Rita Cucchiara

    Abstract: Connecting Vision and Language plays an essential role in Generative Intelligence. For this reason, large research efforts have been devoted to image captioning, i.e. describing images with syntactically and semantically meaningful sentences. Starting from 2015 the task has generally been addressed with pipelines composed of a visual encoder and a language model for text generation. During these y…

    Submitted 30 November, 2021; v1 submitted 14 July, 2021; originally announced July 2021.

  16. arXiv:2106.04208

    cs.CV cs.RO

    Grapevine Winter Pruning Automation: On Potential Pruning Points Detection through 2D Plant Modeling using Grapevine Segmentation

    Authors: Miguel Fernandes, Antonello Scaldaferri, Giuseppe Fiameni, Tao Teng, Matteo Gatti, Stefano Poni, Claudio Semini, Darwin Caldwell, Fei Chen

    Abstract: Grapevine winter pruning is a complex task, that requires skilled workers to execute it correctly. The complexity of this task is also the reason why it is time consuming. Considering that this operation takes about 80-120 hours/ha to be completed, and therefore is even more crucial in large-size vineyards, an automated system can help to speed up the process. To this end, this paper presents a no…

    Submitted 8 June, 2021; originally announced June 2021.

  17. arXiv:2104.08665

    cs.CV

    Higher Order Recurrent Space-Time Transformer for Video Action Prediction

    Authors: Tsung-Ming Tai, Giuseppe Fiameni, Cheng-Kuang Lee, Oswald Lanz

    Abstract: Endowing visual agents with predictive capability is a key step towards video intelligence at scale. The predominant modeling paradigm for this is sequence learning, mostly implemented through LSTMs. Feed-forward Transformer architectures have replaced recurrent model designs in ML applications of language processing and also partly in computer vision. In this paper we investigate on the competiti…

    Submitted 21 September, 2021; v1 submitted 17 April, 2021; originally announced April 2021.