Showing 51–100 of 227 results for author: Sebe, N

  1. arXiv:2307.08012  [pdf, other]

    cs.CV

    Householder Projector for Unsupervised Latent Semantics Discovery

    Authors: Yue Song, Jichao Zhang, Nicu Sebe, Wei Wang

    Abstract: Generative Adversarial Networks (GANs), especially the recent style-based generators (StyleGANs), have versatile semantics in the structured latent space. Latent semantics discovery methods emerge to move around the latent code such that only one factor varies during the traversal. Recently, an unsupervised method proposed a promising direction to directly use the eigenvectors of the projection ma…

    Submitted 16 July, 2023; originally announced July 2023.

    Comments: ICCV23

  2. arXiv:2305.15753  [pdf, other]

    cs.CV

    T2TD: Text-3D Generation Model based on Prior Knowledge Guidance

    Authors: Weizhi Nie, Ruidong Chen, Weijie Wang, Bruno Lepri, Nicu Sebe

    Abstract: In recent years, 3D models have been utilized in many applications, such as auto-driver, 3D reconstruction, VR, and AR. However, the scarcity of 3D model data does not meet its practical demands. Thus, generating high-quality 3D models efficiently from textual descriptions is a promising but challenging way to solve this problem. In this paper, inspired by the ability of human beings to complement…

    Submitted 25 May, 2023; originally announced May 2023.

  3. arXiv:2305.14107  [pdf, other]

    cs.CV

    Federated Generalized Category Discovery

    Authors: Nan Pu, Zhun Zhong, Xinyuan Ji, Nicu Sebe

    Abstract: Generalized category discovery (GCD) aims at grouping unlabeled samples from known and unknown classes, given labeled data of known classes. To meet the recent decentralization trend in the community, we introduce a practical yet challenging task, namely Federated GCD (Fed-GCD), where the training data are distributively stored in local clients and cannot be shared among clients. The goal of Fed-G…

    Submitted 23 May, 2023; originally announced May 2023.

    Comments: 17 pages, 3 figures

  4. arXiv:2305.11288  [pdf, other]

    cs.LG

    Riemannian Multinomial Logistics Regression for SPD Neural Networks

    Authors: Ziheng Chen, Yue Song, Gaowen Liu, Ramana Rao Kompella, Xiaojun Wu, Nicu Sebe

    Abstract: Deep neural networks for learning Symmetric Positive Definite (SPD) matrices are gaining increasing attention in machine learning. Despite the significant progress, most existing SPD networks use traditional Euclidean classifiers on an approximated space rather than intrinsic classifiers that accurately capture the geometry of SPD manifolds. Inspired by Hyperbolic Neural Networks (HNNs), we propos…

    Submitted 20 March, 2024; v1 submitted 18 May, 2023; originally announced May 2023.

    Comments: Accepted to CVPR 2024

  5. arXiv:2304.12944  [pdf, other]

    cs.LG cs.CV

    Latent Traversals in Generative Models as Potential Flows

    Authors: Yue Song, T. Anderson Keller, Nicu Sebe, Max Welling

    Abstract: Despite the significant recent progress in deep generative models, the underlying structure of their latent spaces is still poorly understood, thereby making the task of performing semantically meaningful latent traversals an open research challenge. Most prior work has aimed to solve this challenge by modeling latent structures linearly, and finding corresponding linear directions which result in…

    Submitted 1 July, 2023; v1 submitted 25 April, 2023; originally announced April 2023.

    Comments: ICML 2023

  6. arXiv:2304.09228  [pdf]

    cond-mat.mes-hall cond-mat.mtrl-sci

    Spin-orbit readout using thin films of topological insulator Sb2Te3 deposited by industrial magnetron sputtering

    Authors: S. Teresi, N. Sebe, T. Frottier, J. Patterson, A. Kandazoglou, P. Noël, P. Sgarro, D. Térébénec, N. Bernier, F. Hippert, J. -P. Attané, L. Vila, P. Noé, M. Cosset-Chéneau

    Abstract: Driving a spin-logic circuit requires the production of a large output signal by spin-charge interconversion in spin-orbit readout devices. This should be possible by using topological insulators, which are known for their high spin-charge interconversion efficiency. However, high-quality topological insulators have so far only been obtained on a small scale, or with large scale deposition techniq…

    Submitted 23 June, 2023; v1 submitted 18 April, 2023; originally announced April 2023.

  7. arXiv:2303.17546  [pdf, other]

    cs.CV cs.AI cs.LG

    PAIR-Diffusion: A Comprehensive Multimodal Object-Level Image Editor

    Authors: Vidit Goel, Elia Peruzzo, Yifan Jiang, Dejia Xu, Xingqian Xu, Nicu Sebe, Trevor Darrell, Zhangyang Wang, Humphrey Shi

    Abstract: Generative image editing has recently witnessed extremely fast-paced growth. Some works use high-level conditioning such as text, while others use low-level conditioning. Nevertheless, most of them lack fine-grained control over the properties of the different objects present in the image, i.e. object-level image editing. In this work, we tackle the task by perceiving the images as an amalgamation…

    Submitted 8 April, 2024; v1 submitted 30 March, 2023; originally announced March 2023.

    Comments: Accepted in CVPR 2024, Project page https://vidit98.github.io/publication/conference-paper/pair_diff.html

  8. arXiv:2303.17393  [pdf, ps, other]

    cs.CV

    Dynamic Conceptional Contrastive Learning for Generalized Category Discovery

    Authors: Nan Pu, Zhun Zhong, Nicu Sebe

    Abstract: Generalized category discovery (GCD) is a recently proposed open-world problem, which aims to automatically cluster partially labeled data. The main challenge is that the unlabeled data contain instances that are not only from known categories of the labeled data but also from novel categories. This leads traditional novel category discovery (NCD) methods to be incapacitated for GCD, due to their…

    Submitted 30 March, 2023; originally announced March 2023.

    Comments: 10 pages, 5 figures, accepted by CVPR2023

  9. arXiv:2303.15975  [pdf, other]

    cs.CV cs.LG

    Large-scale Pre-trained Models are Surprisingly Strong in Incremental Novel Class Discovery

    Authors: Mingxuan Liu, Subhankar Roy, Zhun Zhong, Nicu Sebe, Elisa Ricci

    Abstract: Discovering novel concepts in unlabelled datasets and in a continuous manner is an important desideratum of lifelong learners. In the literature such problems have been partially addressed under very restricted settings, where novel classes are learned by jointly accessing a related labelled set (e.g., NCD) or by leveraging only a supervisedly pre-trained model (e.g., class-iNCD). In this work we…

    Submitted 23 August, 2024; v1 submitted 28 March, 2023; originally announced March 2023.

    Comments: Accepted as a conference paper to ICPR 2024; The code is opensource

  10. arXiv:2303.15477  [pdf, other]

    cs.LG

    Adaptive Log-Euclidean Metrics for SPD Matrix Learning

    Authors: Ziheng Chen, Yue Song, Tianyang Xu, Zhiwu Huang, Xiao-Jun Wu, Nicu Sebe

    Abstract: Symmetric Positive Definite (SPD) matrices have received wide attention in machine learning due to their intrinsic capacity to encode underlying structural correlation in data. Many successful Riemannian metrics have been proposed to reflect the non-Euclidean geometry of SPD manifolds. However, most existing metric tensors are fixed, which might lead to sub-optimal performance for SPD matrix learn…

    Submitted 29 August, 2024; v1 submitted 26 March, 2023; originally announced March 2023.

    Comments: Accepted by TIP 2024

  11. arXiv:2303.11296  [pdf, other]

    cs.CV

    Attribute-preserving Face Dataset Anonymization via Latent Code Optimization

    Authors: Simone Barattin, Christos Tzelepis, Ioannis Patras, Nicu Sebe

    Abstract: This work addresses the problem of anonymizing the identity of faces in a dataset of images, such that the privacy of those depicted is not violated, while at the same time the dataset is useful for downstream task such as for training machine learning models. To the best of our knowledge, we are the first to explicitly address this issue and deal with two major drawbacks of the existing state-of-…

    Submitted 20 March, 2023; originally announced March 2023.

    Comments: Accepted for publication in CVPR 2023

  12. arXiv:2303.09270  [pdf, other]

    cs.CV

    SpectralCLIP: Preventing Artifacts in Text-Guided Style Transfer from a Spectral Perspective

    Authors: Zipeng Xu, Songlong Xing, Enver Sangineto, Nicu Sebe

    Abstract: Owing to the power of vision-language foundation models, e.g., CLIP, the area of image synthesis has seen recent important advances. Particularly, for style transfer, CLIP enables transferring more general and abstract styles without collecting the style images in advance, as the style can be efficiently described with natural language, and the result is optimized by minimizing the CLIP similarity…

    Submitted 2 November, 2023; v1 submitted 16 March, 2023; originally announced March 2023.

    Comments: WACV 2024

  13. arXiv:2303.09268  [pdf, other]

    cs.CV

    StylerDALLE: Language-Guided Style Transfer Using a Vector-Quantized Tokenizer of a Large-Scale Generative Model

    Authors: Zipeng Xu, Enver Sangineto, Nicu Sebe

    Abstract: Despite the progress made in the style transfer task, most previous work focus on transferring only relatively simple features like color or texture, while missing more abstract concepts such as overall art expression or painter-specific traits. However, these abstract semantics can be captured by models like DALL-E or CLIP, which have been trained using huge datasets of images and textual documen…

    Submitted 9 October, 2023; v1 submitted 16 March, 2023; originally announced March 2023.

    Comments: ICCV 2023

  14. arXiv:2303.08225  [pdf, other]

    cs.CV cs.AI

    Graph Transformer GANs for Graph-Constrained House Generation

    Authors: Hao Tang, Zhenyu Zhang, Humphrey Shi, Bo Li, Ling Shao, Nicu Sebe, Radu Timofte, Luc Van Gool

    Abstract: We present a novel graph Transformer generative adversarial network (GTGAN) to learn effective graph node relations in an end-to-end fashion for the challenging graph-constrained house generation task. The proposed graph-Transformer-based generator includes a novel graph Transformer encoder that combines graph convolutions and self-attentions in a Transformer to model both local and global interac…

    Submitted 14 March, 2023; originally announced March 2023.

    Comments: CVPR 2023

  15. arXiv:2303.03680  [pdf, other]

    cs.CV

    Logit Margin Matters: Improving Transferable Targeted Adversarial Attack by Logit Calibration

    Authors: Juanjuan Weng, Zhiming Luo, Zhun Zhong, Shaozi Li, Nicu Sebe

    Abstract: Previous works have extensively studied the transferability of adversarial samples in untargeted black-box scenarios. However, it still remains challenging to craft targeted adversarial examples with higher transferability than non-targeted ones. Recent studies reveal that the traditional Cross-Entropy (CE) loss function is insufficient to learn transferable targeted adversarial examples due to th…

    Submitted 7 March, 2023; originally announced March 2023.

  16. arXiv:2301.03949  [pdf, other]

    cs.CV

    Modiff: Action-Conditioned 3D Motion Generation with Denoising Diffusion Probabilistic Models

    Authors: Mengyi Zhao, Mengyuan Liu, Bin Ren, Shuling Dai, Nicu Sebe

    Abstract: Diffusion-based generative models have recently emerged as powerful solutions for high-quality synthesis in multiple domains. Leveraging the bidirectional Markov chains, diffusion probabilistic models generate samples by inferring the reversed Markov chain based on the learned distribution mapping at the forward diffusion process. In this work, we propose Modiff, a conditional paradigm that benefi…

    Submitted 28 March, 2023; v1 submitted 10 January, 2023; originally announced January 2023.

  17. arXiv:2212.09068  [pdf, other]

    cs.CV

    Style-Hallucinated Dual Consistency Learning: A Unified Framework for Visual Domain Generalization

    Authors: Yuyang Zhao, Zhun Zhong, Na Zhao, Nicu Sebe, Gim Hee Lee

    Abstract: Domain shift widely exists in the visual world, while modern deep neural networks commonly suffer from severe performance degradation under domain shift due to the poor generalization ability, which limits the real-world applications. The domain shift mainly lies in the limited source environmental variations and the large distribution gap between source and unseen target data. To this end, we pro…

    Submitted 24 November, 2023; v1 submitted 18 December, 2022; originally announced December 2022.

    Comments: Accepted by IJCV. Journal extension of arXiv:2204.02548. Code is available at https://github.com/HeliosZhao/SHADE-VisualDG

  18. arXiv:2212.05599  [pdf, other]

    cs.CV cs.LG

    Orthogonal SVD Covariance Conditioning and Latent Disentanglement

    Authors: Yue Song, Nicu Sebe, Wei Wang

    Abstract: Inserting an SVD meta-layer into neural networks is prone to make the covariance ill-conditioned, which could harm the model in the training stability and generalization abilities. In this paper, we systematically study how to improve the covariance conditioning by enforcing orthogonality to the Pre-SVD layer. Existing orthogonal treatments on the weights are first investigated. However, these tec…

    Submitted 11 December, 2022; originally announced December 2022.

    Comments: Accepted by IEEE T-PAMI. arXiv admin note: substantial text overlap with arXiv:2207.02119

  19. arXiv:2212.04067  [pdf, other]

    cs.CV

    Consistency-Aware Anchor Pyramid Network for Crowd Localization

    Authors: Xinyan Liu, Guorong Li, Yuankai Qi, Zhenjun Han, Qingming Huang, Ming-Hsuan Yang, Nicu Sebe

    Abstract: Crowd localization aims to predict the spatial position of humans in a crowd scenario. We observe that the performance of existing methods is challenged from two aspects: (i) ranking inconsistency between test and training phases; and (ii) fixed anchor resolution may underfit or overfit crowd densities of local regions. To address these problems, we design a supervision target reassignment strateg…

    Submitted 7 December, 2022; originally announced December 2022.

  20. arXiv:2211.10437  [pdf, other]

    cs.CV

    A Structure-Guided Diffusion Model for Large-Hole Image Completion

    Authors: Daichi Horita, Jiaolong Yang, Dong Chen, Yuki Koyama, Kiyoharu Aizawa, Nicu Sebe

    Abstract: Image completion techniques have made significant progress in filling missing regions (i.e., holes) in images. However, large-hole completion remains challenging due to limited structural information. In this paper, we address this problem by integrating explicit structural guidance into diffusion-based image completion, forming our structure-guided diffusion model (SGDM). It consists of two casca…

    Submitted 6 September, 2023; v1 submitted 18 November, 2022; originally announced November 2022.

    Comments: BMVC2023. Code: https://github.com/UdonDa/Structure_Guided_Diffusion_Model

  21. arXiv:2211.06742  [pdf, other]

    cs.CV cs.AI

    Deep Unsupervised Key Frame Extraction for Efficient Video Classification

    Authors: Hao Tang, Lei Ding, Songsong Wu, Bin Ren, Nicu Sebe, Paolo Rota

    Abstract: Video processing and analysis have become an urgent task since a huge amount of videos (e.g., Youtube, Hulu) are uploaded online every day. The extraction of representative key frames from videos is very important in video processing and analysis since it greatly reduces computing resources and time. Although great progress has been made recently, large-scale video classification remains an open p…

    Submitted 12 November, 2022; originally announced November 2022.

    Comments: Accepted to TOMM

  22. arXiv:2211.06719  [pdf, other]

    cs.CV cs.AI

    Bipartite Graph Reasoning GANs for Person Pose and Facial Image Synthesis

    Authors: Hao Tang, Ling Shao, Philip H. S. Torr, Nicu Sebe

    Abstract: We present a novel bipartite graph reasoning Generative Adversarial Network (BiGraphGAN) for two challenging tasks: person pose and facial image synthesis. The proposed graph generator consists of two novel blocks that aim to model the pose-to-pose and pose-to-image relations, respectively. Specifically, the proposed bipartite graph reasoning (BGR) block aims to reason the long-range cross relatio…

    Submitted 12 November, 2022; originally announced November 2022.

    Comments: Accepted to IJCV, an extended version of a paper published in BMVC 2020. arXiv admin note: substantial text overlap with arXiv:2008.04381

  23. arXiv:2210.09836  [pdf, other]

    cs.CV

    Overlap-guided Gaussian Mixture Models for Point Cloud Registration

    Authors: Guofeng Mei, Fabio Poiesi, Cristiano Saltori, Jian Zhang, Elisa Ricci, Nicu Sebe

    Abstract: Probabilistic 3D point cloud registration methods have shown competitive performance in overcoming noise, outliers, and density variations. However, registering point cloud pairs in the case of partial overlap is still a challenge. This paper proposes a novel overlap-guided probabilistic registration approach that computes the optimal transformation from matched Gaussian Mixture Model (GMM) parame…

    Submitted 17 October, 2022; originally announced October 2022.

    Comments: Accepted in WACV 2023

  24. Budget-Aware Pruning for Multi-Domain Learning

    Authors: Samuel Felipe dos Santos, Rodrigo Berriel, Thiago Oliveira-Santos, Nicu Sebe, Jurandy Almeida

    Abstract: Deep learning has achieved state-of-the-art performance on several computer vision tasks and domains. Nevertheless, it still has a high computational cost and demands a significant amount of parameters. Such requirements hinder the use in resource-limited environments and demand both software and hardware optimization. Another limitation is that deep models are usually specialized into a single do…

    Submitted 16 September, 2023; v1 submitted 14 October, 2022; originally announced October 2022.

    Journal ref: 22nd International Conference on Image Analysis and Processing (ICIAP'23), 2023, pp. 477-489

  25. arXiv:2210.02884  [pdf, other]

    cs.CV

    Vision+X: A Survey on Multimodal Learning in the Light of Data

    Authors: Ye Zhu, Yu Wu, Nicu Sebe, Yan Yan

    Abstract: We are perceiving and communicating with the world in a multisensory manner, where different information sources are sophisticatedly processed and interpreted by separate parts of the human brain to constitute a complex, yet harmonious and unified sensing system. To endow the machines with true intelligence, multimodal machine learning that incorporates data from various sources has become an incr…

    Submitted 7 June, 2024; v1 submitted 5 October, 2022; originally announced October 2022.

    Comments: Survey paper on multimodal learning and generation, to appear at IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

  26. arXiv:2210.02798  [pdf, other]

    cs.CV

    Data Augmentation-free Unsupervised Learning for 3D Point Cloud Understanding

    Authors: Guofeng Mei, Cristiano Saltori, Fabio Poiesi, Jian Zhang, Elisa Ricci, Nicu Sebe, Qiang Wu

    Abstract: Unsupervised learning on 3D point clouds has undergone a rapid evolution, especially thanks to data augmentation-based contrastive methods. However, data augmentation is not ideal as it requires a careful selection of the type of augmentations to perform, which in turn can affect the geometric and semantic information learned by the network during self-training. To overcome this issue, we propose…

    Submitted 6 October, 2022; originally announced October 2022.

    Comments: BMVC 2022

  27. arXiv:2210.00841  [pdf, other]

    cs.CV cs.LG

    Smooth image-to-image translations with latent space interpolations

    Authors: Yahui Liu, Enver Sangineto, Yajing Chen, Linchao Bao, Haoxian Zhang, Nicu Sebe, Bruno Lepri, Marco De Nadai

    Abstract: Multi-domain image-to-image (I2I) translations can transform a source image according to the style of a target domain. One important, desired characteristic of these transformations, is their graduality, which corresponds to a smooth change between the source and the target image when their respective latent-space representations are linearly interpolated. However, state-of-the-art methods usually…

    Submitted 14 March, 2023; v1 submitted 3 October, 2022; originally announced October 2022.

  28. arXiv:2209.15402  [pdf, other]

    cs.CV

    Rethinking the Learning Paradigm for Facial Expression Recognition

    Authors: Weijie Wang, Nicu Sebe, Bruno Lepri

    Abstract: Due to the subjective crowdsourcing annotations and the inherent inter-class similarity of facial expressions, the real-world Facial Expression Recognition (FER) datasets usually exhibit ambiguous annotation. To simplify the learning paradigm, most previous methods convert ambiguous annotation results into precise one-hot annotations and train FER models in an end-to-end supervised manner. In this…

    Submitted 30 September, 2022; originally announced September 2022.

  29. arXiv:2209.09132  [pdf, other]

    cond-mat.quant-gas cond-mat.stat-mech gr-qc quant-ph

    Experimental Observation of Curved Light-Cones in a Quantum Field Simulator

    Authors: Mohammadamin Tajik, Marek Gluza, Nicolas Sebe, Philipp Schüttelkopf, Federica Cataldini, João Sabino, Frederik Møller, Si-Cong Ji, Sebastian Erne, Giacomo Guarnieri, Spyros Sotiriadis, Jens Eisert, Jörg Schmiedmayer

    Abstract: We investigate signal propagation in a quantum field simulator of the Klein-Gordon model realized by two strongly coupled parallel one-dimensional quasi-condensates. By measuring local phononic fields after a quench, we observe the propagation of correlations along sharp light-cone fronts. If the local atomic density is inhomogeneous, these propagation fronts are curved. For sharp edges, the propa…

    Submitted 19 September, 2022; originally announced September 2022.

    Comments: 19 pages, 12 figures

    Journal ref: Proceedings of the National Academy of Sciences 120, e2301287120 (2023)

  30. arXiv:2209.08590  [pdf, other]

    cs.LG cs.CV

    RankFeat: Rank-1 Feature Removal for Out-of-distribution Detection

    Authors: Yue Song, Nicu Sebe, Wei Wang

    Abstract: The task of out-of-distribution (OOD) detection is crucial for deploying machine learning models in real-world settings. In this paper, we observe that the singular value distributions of the in-distribution (ID) and OOD features are quite different: the OOD feature matrix tends to have a larger dominant singular value than the ID feature, and the class predictions of OOD samples are largely deter…

    Submitted 18 September, 2022; originally announced September 2022.

    Comments: NeurIPS22

  31. arXiv:2209.02136  [pdf, other]

    cs.CV cs.AI

    Facial Expression Translation using Landmark Guided GANs

    Authors: Hao Tang, Nicu Sebe

    Abstract: We propose a simple yet powerful Landmark guided Generative Adversarial Network (LandmarkGAN) for the facial expression-to-expression translation using a single image, which is an important and challenging task in computer vision since the expression-to-expression translation is a non-linear and non-aligned problem. Moreover, it requires a high-level semantic understanding between the input and ou…

    Submitted 5 September, 2022; originally announced September 2022.

    Comments: Accepted to TAFFC

  32. arXiv:2208.12550  [pdf, other]

    cs.CV cs.GR

    Training and Tuning Generative Neural Radiance Fields for Attribute-Conditional 3D-Aware Face Generation

    Authors: Jichao Zhang, Aliaksandr Siarohin, Yahui Liu, Hao Tang, Nicu Sebe, Wei Wang

    Abstract: Generative Neural Radiance Fields (GNeRF) based 3D-aware GANs have demonstrated remarkable capabilities in generating high-quality images while maintaining strong 3D consistency. Notably, significant advancements have been made in the domain of face generation. However, most existing models prioritize view consistency over disentanglement, resulting in limited semantic/attribute control during gen…

    Submitted 18 October, 2023; v1 submitted 26 August, 2022; originally announced August 2022.

    Comments: 13 pages

  33. arXiv:2208.07591  [pdf, other]

    cs.CV cs.LG

    Uncertainty-guided Source-free Domain Adaptation

    Authors: Subhankar Roy, Martin Trapp, Andrea Pilzer, Juho Kannala, Nicu Sebe, Elisa Ricci, Arno Solin

    Abstract: Source-free domain adaptation (SFDA) aims to adapt a classifier to an unlabelled target data set by only using a pre-trained source model. However, the absence of the source data and the domain shift makes the predictions on the target data unreliable. We propose quantifying the uncertainty in the source model predictions and utilizing it to guide the target adaptation. For this, we construct a pr…

    Submitted 16 August, 2022; originally announced August 2022.

    Comments: ECCV 2022

  34. arXiv:2207.12842  [pdf, other]

    cs.CV

    Unsupervised Domain Adaptation for Video Transformers in Action Recognition

    Authors: Victor G. Turrisi da Costa, Giacomo Zara, Paolo Rota, Thiago Oliveira-Santos, Nicu Sebe, Vittorio Murino, Elisa Ricci

    Abstract: Over the last few years, Unsupervised Domain Adaptation (UDA) techniques have acquired remarkable importance and popularity in computer vision. However, when compared to the extensive literature available for images, the field of videos is still relatively unexplored. On the other hand, the performance of a model in action recognition is heavily affected by domain shift. In this paper, we propose…

    Submitted 26 July, 2022; originally announced July 2022.

    Comments: Accepted at ICPR 2022

  35. arXiv:2207.09778  [pdf, other]

    cs.CV cs.AI cs.LG

    CoSMix: Compositional Semantic Mix for Domain Adaptation in 3D LiDAR Segmentation

    Authors: Cristiano Saltori, Fabio Galasso, Giuseppe Fiameni, Nicu Sebe, Elisa Ricci, Fabio Poiesi

    Abstract: 3D LiDAR semantic segmentation is fundamental for autonomous driving. Several Unsupervised Domain Adaptation (UDA) methods for point cloud data have been recently proposed to improve model generalization for different sensors and environments. Researchers working on UDA problems in the image domain have shown that sample mixing can mitigate domain shift. We propose a new approach of sample mixing…

    Submitted 20 July, 2022; originally announced July 2022.

    Comments: Accepted at ECCV 2022

  36. arXiv:2207.09763  [pdf, other]

    cs.CV cs.AI cs.LG

    GIPSO: Geometrically Informed Propagation for Online Adaptation in 3D LiDAR Segmentation

    Authors: Cristiano Saltori, Evgeny Krivosheev, Stéphane Lathuilière, Nicu Sebe, Fabio Galasso, Giuseppe Fiameni, Elisa Ricci, Fabio Poiesi

    Abstract: 3D point cloud semantic segmentation is fundamental for autonomous driving. Most approaches in the literature neglect an important aspect, i.e., how to deal with domain shift when handling dynamic scenes. This can significantly hinder the navigation capabilities of self-driving vehicles. This paper advances the state of the art in this research field. Our first contribution consists in analysing a…

    Submitted 20 July, 2022; originally announced July 2022.

    Comments: Accepted at ECCV 2022

  37. arXiv:2207.08605  [pdf, other]

    cs.CV

    Class-incremental Novel Class Discovery

    Authors: Subhankar Roy, Mingxuan Liu, Zhun Zhong, Nicu Sebe, Elisa Ricci

    Abstract: We study the new task of class-incremental Novel Class Discovery (class-iNCD), which refers to the problem of discovering novel categories in an unlabelled data set by leveraging a pre-trained model that has been trained on a labelled data set containing disjoint yet related categories. Apart from discovering novel classes, we also aim at preserving the ability of the model to recognize previously…

    Submitted 18 July, 2022; originally announced July 2022.

    Comments: ECCV 2022

  38. arXiv:2207.04892  [pdf, other]

    cs.CV

    Adversarial Style Augmentation for Domain Generalized Urban-Scene Segmentation

    Authors: Zhun Zhong, Yuyang Zhao, Gim Hee Lee, Nicu Sebe

    Abstract: In this paper, we consider the problem of domain generalization in semantic segmentation, which aims to learn a robust model using only labeled synthetic (source) data. The model is expected to perform well on unseen real (target) domains. Our study finds that the image style variation can largely influence the model's performance and the style features can be well represented by the channel-wise…

    Submitted 12 October, 2022; v1 submitted 11 July, 2022; originally announced July 2022.

    Comments: NeurIPS 2022

  39. arXiv:2207.04242  [pdf, other]

    cs.CV

    PI-Trans: Parallel-ConvMLP and Implicit-Transformation Based GAN for Cross-View Image Translation

    Authors: Bin Ren, Hao Tang, Yiming Wang, Xia Li, Wei Wang, Nicu Sebe

    Abstract: For semantic-guided cross-view image translation, it is crucial to learn where to sample pixels from the source view image and where to reallocate them guided by the target view semantic map, especially when there is little overlap or drastic view difference between the source and target images. Hence, one not only needs to encode the long-range dependencies among pixels in both the source view im…

    Submitted 6 March, 2023; v1 submitted 9 July, 2022; originally announced July 2022.

    Comments: 5 pages, 5 figures

    Journal ref: 2023 IEEE International Conference on Acoustics, Speech and Signal Processing

  40. arXiv:2207.04228  [pdf, other]

    cs.CV cs.LG math.NA

    Batch-efficient EigenDecomposition for Small and Medium Matrices

    Authors: Yue Song, Nicu Sebe, Wei Wang

    Abstract: EigenDecomposition (ED) is at the heart of many computer vision algorithms and applications. One crucial bottleneck limiting its usage is the expensive computation cost, particularly for a mini-batch of matrices in the deep neural networks. In this paper, we propose a QR-based ED method dedicated to the application scenarios of computer vision. Our proposed method performs the ED entirely by batch…

    Submitted 9 July, 2022; originally announced July 2022.

    Comments: Accepted by ECCV22

  41. arXiv:2207.02119  [pdf, other]

    cs.CV cs.LG math.NA

    Improving Covariance Conditioning of the SVD Meta-layer by Orthogonality

    Authors: Yue Song, Nicu Sebe, Wei Wang

    Abstract: Inserting an SVD meta-layer into neural networks is prone to make the covariance ill-conditioned, which could harm the model in the training stability and generalization abilities. In this paper, we systematically study how to improve the covariance conditioning by enforcing orthogonality to the Pre-SVD layer. Existing orthogonal treatments on the weights are first investigated. However, these tec…

    Submitted 5 July, 2022; originally announced July 2022.

    Comments: Accepted by ECCV22

  42. arXiv:2207.01685  [pdf, other]

    cs.CV

    Interaction Transformer for Human Reaction Generation

    Authors: Baptiste Chopin, Hao Tang, Naima Otberdout, Mohamed Daoudi, Nicu Sebe

    Abstract: We address the challenging task of human reaction generation, which aims to generate a corresponding reaction based on an input action. Most of the existing works do not focus on generating and predicting the reaction and cannot generate the motion when only the action is given as input. To address this limitation, we propose a novel interaction Transformer (InterFormer) consisting of a Transforme…

    Submitted 1 February, 2023; v1 submitted 4 July, 2022; originally announced July 2022.

    Journal ref: IEEE Transactions On Multimedia 2023

  43. Unsupervised High-Resolution Portrait Gaze Correction and Animation

    Authors: Jichao Zhang, Jingjing Chen, Hao Tang, Enver Sangineto, Peng Wu, Yan Yan, Nicu Sebe, Wei Wang

    Abstract: This paper proposes a gaze correction and animation method for high-resolution, unconstrained portrait images, which can be trained without the gaze angle and the head pose annotations. Common gaze-correction methods usually require annotating training data with precise gaze, and head pose information. Solving this problem using an unsupervised method remains an open problem, especially for high-r…

    Submitted 1 July, 2022; originally announced July 2022.

    Comments: Accepted to TIP. arXiv admin note: text overlap with arXiv:2008.03834

  44. arXiv:2206.04636  [pdf, other]

    cs.CV cs.LG

    Spatial Entropy as an Inductive Bias for Vision Transformers

    Authors: Elia Peruzzo, Enver Sangineto, Yahui Liu, Marco De Nadai, Wei Bi, Bruno Lepri, Nicu Sebe

    Abstract: Recent work on Vision Transformers (VTs) showed that introducing a local inductive bias in the VT architecture helps reducing the number of samples necessary for training. However, the architecture modifications lead to a loss of generality of the Transformer backbone, partially contradicting the push towards the development of uniform architectures, shared, e.g., by both the Computer Vision and t…

    Submitted 14 March, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

  45. On the Eigenvalues of Global Covariance Pooling for Fine-grained Visual Recognition

    Authors: Yue Song, Nicu Sebe, Wei Wang

    Abstract: The Fine-Grained Visual Categorization (FGVC) is challenging because the subtle inter-class variations are difficult to be captured. One notable research line uses the Global Covariance Pooling (GCP) layer to learn powerful representations with second-order statistics, which can effectively model inter-class differences. In our previous conference paper, we show that truncating small eigenvalues o…

    Submitted 26 May, 2022; originally announced May 2022.

    Comments: Accepted by IEEE T-PAMI

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022

  46. arXiv:2205.12551  [pdf, other]

    cs.CV cs.CR

    Masked Jigsaw Puzzle: A Versatile Position Embedding for Vision Transformers

    Authors: Bin Ren, Yahui Liu, Yue Song, Wei Bi, Rita Cucchiara, Nicu Sebe, Wei Wang

    Abstract: Position Embeddings (PEs), an arguably indispensable component in Vision Transformers (ViTs), have been shown to improve the performance of ViTs on many vision tasks. However, PEs have a potentially high risk of privacy leakage since the spatial information of the input patches is exposed. This caveat naturally raises a series of interesting questions about the impact of PEs on the accuracy, priva…

    Submitted 26 May, 2023; v1 submitted 25 May, 2022; originally announced May 2022.

    Comments: Accepted to CVPR2023

  47. arXiv:2205.09180  [pdf, other]

    cs.LG cs.CL cs.CV

    Learning Rate Curriculum

    Authors: Florinel-Alin Croitoru, Nicolae-Catalin Ristea, Radu Tudor Ionescu, Nicu Sebe

    Abstract: Most curriculum learning methods require an approach to sort the data samples by difficulty, which is often cumbersome to perform. In this work, we propose a novel curriculum learning approach termed Learning Rate Curriculum (LeRaC), which leverages the use of a different learning rate for each layer of a neural network to create a data-agnostic curriculum during the initial training epochs. More…

    Submitted 20 July, 2024; v1 submitted 18 May, 2022; originally announced May 2022.

    Comments: Accepted at the International Journal of Computer Vision

  48. arXiv:2204.03525  [pdf, other]

    cs.LG cs.AI

    Temporal Alignment for History Representation in Reinforcement Learning

    Authors: Aleksandr Ermolov, Enver Sangineto, Nicu Sebe

    Abstract: Environments in Reinforcement Learning are usually only partially observable. To address this problem, a possible solution is to provide the agent with information about the past. However, providing complete observations of numerous steps can be excessive. Inspired by human memory, we propose to represent history with only important changes in the environment and, in our approach, to obtain automa…

    Submitted 7 April, 2022; originally announced April 2022.

    Comments: ICPR 2022

  49. arXiv:2204.02548  [pdf, other]

    cs.CV

    Style-Hallucinated Dual Consistency Learning for Domain Generalized Semantic Segmentation

    Authors: Yuyang Zhao, Zhun Zhong, Na Zhao, Nicu Sebe, Gim Hee Lee

    Abstract: In this paper, we study the task of synthetic-to-real domain generalized semantic segmentation, which aims to learn a model that is robust to unseen real-world scenes using only synthetic data. The large domain shift between synthetic and real-world data, including the limited source environmental variations and the large distribution gap between synthetic and real-world data, significantly hinder…

    Submitted 19 July, 2022; v1 submitted 5 April, 2022; originally announced April 2022.

    Comments: ECCV 2022

  50. arXiv:2203.11832  [pdf, other]

    cs.CV cs.MM

    Cross-View Panorama Image Synthesis

    Authors: Songsong Wu, Hao Tang, Xiao-Yuan Jing, Haifeng Zhao, Jianjun Qian, Nicu Sebe, Yan Yan

    Abstract: In this paper, we tackle the problem of synthesizing a ground-view panorama image conditioned on a top-view aerial image, which is a challenging problem due to the large gap between the two image domains with different view-points. Instead of learning cross-view mapping in a feedforward pass, we propose a novel adversarial feedback GAN framework named PanoGAN with two key components: an adversaria…

    Submitted 22 March, 2022; originally announced March 2022.

    Comments: Accepted to IEEE Transactions on Multimedia