Showing 1–50 of 72 results for author: Koniusz, P

Searching in archive cs.
  1. arXiv:2407.03179

    cs.CV cs.AI cs.LG

    Motion meets Attention: Video Motion Prompts

    Authors: Qixiang Chen, Lei Wang, Piotr Koniusz, Tom Gedeon

    Abstract: Videos contain rich spatio-temporal information. Traditional methods for extracting motion, used in tasks such as action recognition, often rely on visual contents rather than precise motion features. This phenomenon is referred to as 'blind motion extraction' behavior, which proves inefficient in capturing motions of interest due to a lack of motion-guided cues. Recently, attention mechanisms hav…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Research report

  2. arXiv:2404.00521

    cs.LG cs.CV

    CHAIN: Enhancing Generalization in Data-Efficient GANs via lipsCHitz continuity constrAIned Normalization

    Authors: Yao Ni, Piotr Koniusz

    Abstract: Generative Adversarial Networks (GANs) significantly advanced image generation but their performance heavily depends on abundant training data. In scenarios with limited data, GANs often struggle with discriminator overfitting and unstable training. Batch Normalization (BN), despite being known for enhancing generalization and training stability, has rarely been used in the discriminator of Data-E…

    Submitted 1 June, 2024; v1 submitted 30 March, 2024; originally announced April 2024.

    Comments: Accepted by CVPR 2024. 26 pages. Improve Lemma 3.1 - Prop. 3.1 logic flow. Code: https://github.com/MaxwellYaoNi/CHAIN

  3. arXiv:2403.14821

    cs.CV

    Learning Gaussian Representation for Eye Fixation Prediction

    Authors: Peipei Song, Jing Zhang, Piotr Koniusz, Nick Barnes

    Abstract: Existing eye fixation prediction methods perform the mapping from input images to the corresponding dense fixation maps generated from raw fixation points. However, due to the stochastic nature of human fixation, the generated dense fixation maps may be a less-than-ideal representation of human fixation. To provide a robust fixation model, we introduce Gaussian Representation for eye fixation mode…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: 11 pages, 7 figures

  4. arXiv:2402.04599

    cs.CV cs.AI cs.LG

    Meet JEANIE: a Similarity Measure for 3D Skeleton Sequences via Temporal-Viewpoint Alignment

    Authors: Lei Wang, Jun Liu, Liang Zheng, Tom Gedeon, Piotr Koniusz

    Abstract: Video sequences exhibit significant nuisance variations (undesired effects) of speed of actions, temporal locations, and subjects' poses, leading to temporal-viewpoint misalignment when comparing two sets of frames or evaluating the similarity of two sequences. Thus, we propose Joint tEmporal and cAmera viewpoiNt alIgnmEnt (JEANIE) for sequence pairs. In particular, we focus on 3D skeleton sequenc…

    Submitted 25 March, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

    Comments: Accepted by the International Journal of Computer Vision (IJCV). An extension of our ACCV'22 paper [arXiv:2210.16820] which was distinguished by the Sang Uk Lee Best Student Paper Award

  5. arXiv:2312.01315

    cs.CV

    Few-shot Shape Recognition by Learning Deep Shape-aware Features

    Authors: Wenlong Shi, Changsheng Lu, Ming Shao, Yinjie Zhang, Siyu Xia, Piotr Koniusz

    Abstract: Traditional shape descriptors have been gradually replaced by convolutional neural networks due to their superior performance in feature extraction and classification. The state-of-the-art methods recognize object shapes via image reconstruction or pixel classification. However, these methods are biased toward texture information and overlook the essential shape descriptions, thus, they fail to g…

    Submitted 3 December, 2023; originally announced December 2023.

    Comments: Accepted by WACV 2024; 8 pages for main paper

  6. arXiv:2310.18737

    cs.CV cs.AI cs.LG

    Pre-training with Random Orthogonal Projection Image Modeling

    Authors: Maryam Haghighat, Peyman Moghadam, Shaheer Mohamed, Piotr Koniusz

    Abstract: Masked Image Modeling (MIM) is a powerful self-supervised strategy for visual pre-training without the use of labels. MIM applies random crops to input images, processes them with an encoder, and then recovers the masked inputs with a decoder, which encourages the network to capture and learn structural information about objects and scenes. The intermediate feature representations obtained from MI…

    Submitted 21 April, 2024; v1 submitted 28 October, 2023; originally announced October 2023.

    Comments: Published as a conference paper at the International Conference on Learning Representations (ICLR) 2024. 19 pages

  7. arXiv:2310.18209

    cs.LG cs.AI

    Alignment and Outer Shell Isotropy for Hyperbolic Graph Contrastive Learning

    Authors: Yifei Zhang, Hao Zhu, Jiahong Liu, Piotr Koniusz, Irwin King

    Abstract: Learning good self-supervised graph representations that are beneficial to downstream tasks is challenging. Among a variety of methods, contrastive learning enjoys competitive performance. The embeddings of contrastive learning are arranged on a hypersphere that enables the Cosine distance measurement in the Euclidean space. However, the underlying structure of many domains such as graphs exhibits…

    Submitted 27 October, 2023; originally announced October 2023.

  8. arXiv:2310.10059

    cs.CV cs.AI cs.LG

    Flow Dynamics Correction for Action Recognition

    Authors: Lei Wang, Piotr Koniusz

    Abstract: Various research studies indicate that action recognition performance highly depends on the types of motions being extracted and how accurately the human actions are represented. In this paper, we investigate different optical flow, and features extracted from these optical flow that capture both short-term and long-term motion dynamics. We perform power normalization on the magnitude component of…

    Submitted 15 December, 2023; v1 submitted 16 October, 2023; originally announced October 2023.

    Comments: Accepted by IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2024)

  9. arXiv:2310.05615

    cs.CV cs.AI cs.LG

    Adaptive Multi-head Contrastive Learning

    Authors: Lei Wang, Piotr Koniusz, Tom Gedeon, Liang Zheng

    Abstract: In contrastive learning, two views of an original image, generated by different augmentations, are considered a positive pair, and their similarity is required to be high. Similarly, two views of distinct images form a negative pair, with encouraged low similarity. Typically, a single similarity measure, provided by a lone projection head, evaluates positive and negative sample pairs. However, due…

    Submitted 10 July, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: Accepted at the 18th European Conference on Computer Vision (ECCV 2024)

  10. arXiv:2309.13563

    cs.CV

    Multivariate Prototype Representation for Domain-Generalized Incremental Learning

    Authors: Can Peng, Piotr Koniusz, Kaiyu Guo, Brian C. Lovell, Peyman Moghadam

    Abstract: Deep learning models suffer from catastrophic forgetting when being fine-tuned with samples of new classes. This issue becomes even more pronounced when faced with the domain shift between training and testing data. In this paper, we study the critical and less explored Domain-Generalized Class-Incremental Learning (DGCIL). We design a DGCIL approach that remembers old classes, adapts to new class…

    Submitted 24 September, 2023; originally announced September 2023.

  11. arXiv:2307.09321

    cs.LG cs.AI cs.DB

    Exploiting Field Dependencies for Learning on Categorical Data

    Authors: Zhibin Li, Piotr Koniusz, Lu Zhang, Daniel Edward Pagendam, Peyman Moghadam

    Abstract: Traditional approaches for learning on categorical data underexploit the dependencies between columns (a.k.a. fields) in a dataset because they rely on the embedding of data points driven alone by the classification/regression loss. In contrast, we propose a novel method for learning on categorical data with the goal of exploiting dependencies between fields. Instead of modelling statistics of featu…

    Submitted 18 July, 2023; originally announced July 2023.

    Comments: IEEE Transactions on Pattern Analysis and Machine Intelligence (submitted June 2022, accepted July 2023)

  12. arXiv:2307.03407

    cs.CV

    Distilling Self-Supervised Vision Transformers for Weakly-Supervised Few-Shot Classification & Segmentation

    Authors: Dahyun Kang, Piotr Koniusz, Minsu Cho, Naila Murray

    Abstract: We address the task of weakly-supervised few-shot image classification and segmentation, by leveraging a Vision Transformer (ViT) pretrained with self-supervision. Our proposed method takes token representations from the self-supervised ViT and leverages their correlations, via self-attention, to produce classification and segmentation predictions through separate task heads. Our model is able to…

    Submitted 7 July, 2023; originally announced July 2023.

    Comments: Accepted at CVPR 2023

    Journal ref: CVPR 2023

  13. arXiv:2305.05740

    cs.LG cs.SI

    Message Passing Neural Networks for Traffic Forecasting

    Authors: Arian Prabowo, Hao Xue, Wei Shao, Piotr Koniusz, Flora D. Salim

    Abstract: A road network, in the context of traffic forecasting, is typically modeled as a graph where the nodes are sensors that measure traffic metrics (such as speed) at that location. Traffic forecasting is interesting because it is complex as the future speed of a road is dependent on a number of different factors. Therefore, to properly forecast traffic, we need a model that is capable of capturing al…

    Submitted 9 May, 2023; originally announced May 2023.

    Comments: 18 pages, 5 figures

  14. Traffic Forecasting on New Roads Using Spatial Contrastive Pre-Training (SCPT)

    Authors: Arian Prabowo, Hao Xue, Wei Shao, Piotr Koniusz, Flora D. Salim

    Abstract: New roads are being constructed all the time. However, the capabilities of previous deep forecasting models to generalize to new roads not seen in the training data (unseen roads) are rarely explored. In this paper, we introduce a novel setup called a spatio-temporal (ST) split to evaluate the models' capabilities to generalize to unseen roads. In this setup, the models are trained on data from a…

    Submitted 21 September, 2023; v1 submitted 9 May, 2023; originally announced May 2023.

    Comments: 25 pages including reference, an additional 3 pages of appendix, 8 figures. ECML PKDD 2023 Journal track special issue: Data Mining and Knowledge Discovery (DAMI)

  15. arXiv:2304.11598

    cs.CV cs.AI cs.LG

    Transductive Few-shot Learning with Prototype-based Label Propagation by Iterative Graph Refinement

    Authors: Hao Zhu, Piotr Koniusz

    Abstract: Few-shot learning (FSL) is popular due to its ability to adapt to novel classes. Compared with inductive few-shot learning, transductive models typically perform better as they leverage all samples of the query set. The two existing classes of methods, prototype-based and graph-based, have the disadvantages of inaccurate prototype estimation and sub-optimal graph construction with kernel functions…

    Submitted 23 April, 2023; originally announced April 2023.

    Comments: This paper is published at CVPR 2023

  16. arXiv:2304.11597

    cs.CV cs.AI cs.LG

    Learning Partial Correlation based Deep Visual Representation for Image Classification

    Authors: Saimunur Rahman, Piotr Koniusz, Lei Wang, Luping Zhou, Peyman Moghadam, Changming Sun

    Abstract: Visual representation based on covariance matrix has demonstrated its efficacy for image classification by characterising the pairwise correlation of different channels in convolutional feature maps. However, pairwise correlation will become misleading once there is another channel correlating with both channels of interest, resulting in the "confounding" effect. For this case, "partial correla…

    Submitted 26 April, 2023; v1 submitted 23 April, 2023; originally announced April 2023.

    Comments: This paper is published at CVPR 2023

  17. arXiv:2304.03140

    cs.CV

    From Saliency to DINO: Saliency-guided Vision Transformer for Few-shot Keypoint Detection

    Authors: Changsheng Lu, Hao Zhu, Piotr Koniusz

    Abstract: Unlike current deep keypoint detectors that are trained to recognize a limited number of body parts, few-shot keypoint detection (FSKD) attempts to localize any keypoints, including novel or base keypoints, depending on the reference samples. FSKD requires the semantically meaningful relations for keypoint similarity learning to overcome the ubiquitous noise and ambiguous local patterns. One rescue…

    Submitted 6 April, 2023; originally announced April 2023.

    Comments: 15 pages, 10 figures

  18. arXiv:2303.14474

    cs.CV cs.AI cs.LG

    3Mformer: Multi-order Multi-mode Transformer for Skeletal Action Recognition

    Authors: Lei Wang, Piotr Koniusz

    Abstract: Many skeletal action recognition models use GCNs to represent the human body by 3D body joints connected by body parts. GCNs aggregate one- or few-hop graph neighbourhoods, and ignore the dependency between not linked body joints. We propose to form a hypergraph to model hyper-edges between graph nodes (e.g., third- and fourth-order hyper-edges capture three and four nodes) which help capture higher-or…

    Submitted 25 March, 2023; originally announced March 2023.

    Comments: This paper is accepted by CVPR 2023

    Journal ref: CVPR 2023

  19. arXiv:2302.09956

    cs.LG cs.CV cs.DB

    Because Every Sensor Is Unique, so Is Every Pair: Handling Dynamicity in Traffic Forecasting

    Authors: Arian Prabowo, Wei Shao, Hao Xue, Piotr Koniusz, Flora D. Salim

    Abstract: Traffic forecasting is a critical task to extract values from cyber-physical infrastructures, which is the backbone of smart transportation. However, owing to external contexts, the dynamics at each sensor are unique. For example, the afternoon peaks at sensors near schools are more likely to occur earlier than those near residential areas. In this paper, we first analyze real-world traffic data to…

    Submitted 28 February, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

    Comments: 20 pages, IoTDI 2023; Correction on Fig. 4

    Journal ref: IoTDI 2023

  20. Event-guided Multi-patch Network with Self-supervision for Non-uniform Motion Deblurring

    Authors: Hongguang Zhang, Limeng Zhang, Yuchao Dai, Hongdong Li, Piotr Koniusz

    Abstract: Contemporary deep learning multi-scale deblurring models suffer from many issues: 1) They perform poorly on non-uniformly blurred images/videos; 2) Simply increasing the model depth with finer-scale levels cannot improve deblurring; 3) Individual RGB frames contain limited motion information for deblurring; 4) Previous models have limited robustness to spatial transformations and noise. Below,…

    Submitted 14 February, 2023; originally announced February 2023.

    Comments: International Journal of Computer Vision. arXiv admin note: substantial text overlap with arXiv:1904.03468

  21. arXiv:2212.01026

    cs.LG cs.AI cs.CV

    Spectral Feature Augmentation for Graph Contrastive Learning and Beyond

    Authors: Yifei Zhang, Hao Zhu, Zixing Song, Piotr Koniusz, Irwin King

    Abstract: Although augmentations (e.g., perturbation of graph edges, image crops) boost the efficiency of Contrastive Learning (CL), feature level augmentation is another plausible, complementary yet not well researched strategy. Thus, we present a novel spectral feature augmentation for contrastive learning on graphs (and images). To this end, for each data view, we estimate a low-rank approximation per f…

    Submitted 2 December, 2022; originally announced December 2022.

    Comments: This paper has been published with the Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI 2023)

  22. arXiv:2211.00005

    cs.CV cs.AI cs.LG

    Uncertainty-DTW for Time Series and Sequences

    Authors: Lei Wang, Piotr Koniusz

    Abstract: Dynamic Time Warping (DTW) is used for matching pairs of sequences and celebrated in applications such as forecasting the evolution of time series, clustering time series or even matching sequence pairs in few-shot action recognition. The transportation plan of DTW contains a set of paths; each path matches frames between two sequences under a varying degree of time warping, to account for varying…

    Submitted 30 October, 2022; originally announced November 2022.

    Comments: Accepted as an oral paper at the 17th European Conference on Computer Vision (ECCV 2022). arXiv admin note: text overlap with arXiv:2210.16820

  23. arXiv:2210.16897

    cs.CV cs.AI cs.LG

    Time-rEversed diffusioN tEnsor Transformer: A new TENET of Few-Shot Object Detection

    Authors: Shan Zhang, Naila Murray, Lei Wang, Piotr Koniusz

    Abstract: In this paper, we tackle the challenging problem of Few-shot Object Detection. Existing FSOD pipelines (i) use average-pooled representations that result in information loss; and/or (ii) discard position information that can help detect object instances. Consequently, such pipelines are sensitive to large intra-class appearance and geometric variations between support and query images. To address…

    Submitted 30 October, 2022; originally announced October 2022.

    Comments: Accepted at the 17th European Conference on Computer Vision (ECCV 2022)

  24. arXiv:2210.16820

    cs.CV cs.AI cs.LG

    Temporal-Viewpoint Transportation Plan for Skeletal Few-shot Action Recognition

    Authors: Lei Wang, Piotr Koniusz

    Abstract: We propose a Few-shot Learning pipeline for 3D skeleton-based action recognition by Joint tEmporal and cAmera viewpoiNt alIgnmEnt (JEANIE). To factor out misalignment between query and support sequences of 3D body joints, we propose an advanced variant of Dynamic Time Warping which jointly models each smooth path between the query and support frames to achieve simultaneously the best alignment in…

    Submitted 30 October, 2022; originally announced October 2022.

    Comments: Accepted as an oral paper at the 16th Asian Conference on Computer Vision (ACCV 2022). It extends our arXiv preprint arXiv:2112.12668 (2021)

  25. COSTA: Covariance-Preserving Feature Augmentation for Graph Contrastive Learning

    Authors: Yifei Zhang, Hao Zhu, Zixing Song, Piotr Koniusz, Irwin King

    Abstract: Graph contrastive learning (GCL) improves graph representation learning, leading to SOTA on various downstream tasks. The graph augmentation step is a vital but scarcely studied step of GCL. In this paper, we show that the node embedding obtained via the graph augmentations is highly biased, somewhat limiting contrastive models from learning discriminative features for downstream tasks. Thus, inst…

    Submitted 13 June, 2022; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: This paper is accepted by the ACM KDD 2022

  26. arXiv:2203.14148

    cs.CV

    Accurate 3-DoF Camera Geo-Localization via Ground-to-Satellite Image Matching

    Authors: Yujiao Shi, Xin Yu, Liu Liu, Dylan Campbell, Piotr Koniusz, Hongdong Li

    Abstract: We address the problem of ground-to-satellite image geo-localization, that is, estimating the camera latitude, longitude and orientation (azimuth angle) by matching a query image captured at the ground level against a large-scale database with geotagged satellite images. Our prior arts treat the above task as pure image retrieval by selecting the most similar satellite reference image matching the…

    Submitted 26 March, 2022; originally announced March 2022.

    Comments: submitted to TPAMI in Jan 2021

  27. Graph-adaptive Rectified Linear Unit for Graph Neural Networks

    Authors: Yifei Zhang, Hao Zhu, Ziqiao Meng, Piotr Koniusz, Irwin King

    Abstract: Graph Neural Networks (GNNs) have achieved remarkable success by extending traditional convolution to learning on non-Euclidean data. The key to the GNNs is adopting the neural message-passing paradigm with two stages: aggregation and update. The current design of GNNs considers the topology information in the aggregation stage. However, in the updating stage, all nodes share the same updating fun…

    Submitted 13 February, 2022; originally announced February 2022.

    Comments: TheWebConf (WWW), 2022

  28. Multi-level Second-order Few-shot Learning

    Authors: Hongguang Zhang, Hongdong Li, Piotr Koniusz

    Abstract: We propose a Multi-level Second-order (MlSo) few-shot learning network for supervised or unsupervised few-shot image classification and few-shot action recognition. We leverage so-called power-normalized second-order base learner streams combined with features that express multiple levels of visual abstraction, and we use self-supervised discriminating mechanisms. As Second-order Pooling (SoP) is…

    Submitted 15 January, 2022; originally announced January 2022.

    Comments: IEEE Transactions on Multimedia

  29. arXiv:2201.05493

    cs.LG

    Contrastive Laplacian Eigenmaps

    Authors: Hao Zhu, Ke Sun, Piotr Koniusz

    Abstract: Graph contrastive learning attracts/disperses node representations for similar/dissimilar node pairs under some notion of similarity. It may be combined with a low-dimensional embedding of nodes to preserve intrinsic and structural properties of a graph. In this paper, we extend the celebrated Laplacian Eigenmaps with contrastive learning, and call them COntrastive Laplacian EigenmapS (COLES). Sta…

    Submitted 14 January, 2022; originally announced January 2022.

    Comments: Accepted by NeurIPS 2021. Includes the main paper and the supplementary material. OpenReview: https://openreview.net/forum?id=iLn-bhP-kKH

    Journal ref: 35th Conference on Neural Information Processing Systems (NeurIPS 2021)

  30. arXiv:2112.12668

    cs.CV cs.HC cs.LG

    3D Skeleton-based Few-shot Action Recognition with JEANIE is not so Naïve

    Authors: Lei Wang, Jun Liu, Piotr Koniusz

    Abstract: In this paper, we propose a Few-shot Learning pipeline for 3D skeleton-based action recognition by Joint tEmporal and cAmera viewpoiNt alIgnmEnt (JEANIE). To factor out misalignment between query and support sequences of 3D body joints, we propose an advanced variant of Dynamic Time Warping which jointly models each smooth path between the query and support frames to achieve simultaneously the bes…

    Submitted 23 December, 2021; originally announced December 2021.

    Comments: Full 17 page version

  31. arXiv:2112.12618

    cs.CV cs.LG

    Manifold Learning Benefits GANs

    Authors: Yao Ni, Piotr Koniusz, Richard Hartley, Richard Nock

    Abstract: In this paper, we improve Generative Adversarial Networks by incorporating a manifold learning step into the discriminator. We consider locality-constrained linear and subspace-based manifolds, and locality-constrained non-linear manifolds. In our design, the manifold learning and coding steps are intertwined with layers of the discriminator, with the goal of attracting intermediate feature repres…

    Submitted 1 April, 2022; v1 submitted 23 December, 2021; originally announced December 2021.

    Comments: CVPR 2022, 32 pages full version

  32. arXiv:2112.06183

    cs.CV

    Few-shot Keypoint Detection with Uncertainty Learning for Unseen Species

    Authors: Changsheng Lu, Piotr Koniusz

    Abstract: Current non-rigid object keypoint detectors perform well on a chosen kind of species and body parts, and require a large amount of labelled keypoints for training. Moreover, their heatmaps, tailored to specific body parts, cannot recognize novel keypoints (keypoints not labelled for training) on unseen species. We raise an interesting yet challenging question: how to detect both base (annotated fo…

    Submitted 1 April, 2022; v1 submitted 12 December, 2021; originally announced December 2021.

    Comments: Accepted by CVPR 2022; 8 pages for main paper, 6 pages for supplementary materials

  33. arXiv:2110.13494

    cs.CV

    Meta-Learning for Multi-Label Few-Shot Classification

    Authors: Christian Simon, Piotr Koniusz, Mehrtash Harandi

    Abstract: Even with the luxury of having abundant data, multi-label classification is widely known to be a challenging task to address. This work targets the problem of multi-label meta-learning, where a model learns to predict multiple labels within a query (e.g., an image) by just observing a few supporting examples. In doing so, we first propose a benchmark for Few-Shot Learning (FSL) with multiple label…

    Submitted 26 October, 2021; originally announced October 2021.

    Comments: Accepted to WACV 2022

  34. arXiv:2110.12197

    cs.LG cs.CV

    Towards a Robust Differentiable Architecture Search under Label Noise

    Authors: Christian Simon, Piotr Koniusz, Lars Petersson, Yan Han, Mehrtash Harandi

    Abstract: Neural Architecture Search (NAS) is the game changer in designing robust neural architectures. Architectures designed by NAS outperform or compete with the best manual network designs in terms of accuracy, size, memory footprint and FLOPs. That said, previous studies focus on developing NAS algorithms for clean high quality data, a restrictive and somewhat unrealistic assumption. In this paper, fo…

    Submitted 23 October, 2021; originally announced October 2021.

    Comments: Accepted to WACV 2022

  35. arXiv:2110.11881

    cs.CV cs.CL

    Simple Dialogue System with AUDITED

    Authors: Yusuf Tas, Piotr Koniusz

    Abstract: We devise a multimodal conversation system for dialogue utterances composed of text, image or both modalities. We leverage Auxiliary UnsuperviseD vIsual and TExtual Data (AUDITED). To improve the performance of text-based task, we utilize translations of target sentences from English to French to form the assisted supervision. For the image-based task, we employ the DeepFashion dataset in which we…

    Submitted 22 October, 2021; originally announced October 2021.

    Comments: Accepted by the BMVC 2021

  36. arXiv:2110.05216

    cs.CV cs.LG

    High-order Tensor Pooling with Attention for Action Recognition

    Authors: Lei Wang, Ke Sun, Piotr Koniusz

    Abstract: We aim at capturing high-order statistics of feature vectors formed by a neural network, and propose end-to-end second- and higher-order pooling to form a tensor descriptor. Tensor descriptors require a robust similarity measure due to low numbers of aggregated vectors and the burstiness phenomenon, when a given feature appears more/less frequently than statistically expected. The Heat Diffusion P…

    Submitted 15 December, 2023; v1 submitted 11 October, 2021; originally announced October 2021.

    Comments: Accepted by IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2024)

  37. arXiv:2108.10703

    cs.LG cs.AI cs.SI

    REFINE: Random RangE FInder for Network Embedding

    Authors: Hao Zhu, Piotr Koniusz

    Abstract: Network embedding approaches have recently attracted considerable interest as they learn low-dimensional vector representations of nodes. Embeddings based on the matrix factorization are effective but they are usually computationally expensive due to the eigen-decomposition step. In this paper, we propose a Random RangE FInder based Network Embedding (REFINE) algorithm, which can perform embedding…

    Submitted 24 August, 2021; originally announced August 2021.

    Journal ref: 30th ACM International Conference on Information and Knowledge Management (CIKM 2021)

  38. arXiv:2107.11666

    cs.LG cs.AI cs.CL

    Graph Convolutional Network with Generalized Factorized Bilinear Aggregation

    Authors: Hao Zhu, Piotr Koniusz

    Abstract: Although Graph Convolutional Networks (GCNs) have demonstrated their power in various applications, the graph convolutional layers, as the most important component of GCN, are still using linear transformations and a simple pooling step. In this paper, we propose a novel generalization of Factorized Bilinear (FB) layer to model the feature interactions in GCNs. FB performs two matrix-vector multip…

    Submitted 24 July, 2021; originally announced July 2021.

  39. Predicting Flight Delay with Spatio-Temporal Trajectory Convolutional Network and Airport Situational Awareness Map

    Authors: Wei Shao, Arian Prabowo, Sichen Zhao, Piotr Koniusz, Flora D. Salim

    Abstract: To model and forecast flight delays accurately, it is crucial to harness various vehicle trajectory and contextual sensor data on airport tarmac areas. These heterogeneous sensor data, if modelled correctly, can be used to generate a situational awareness map. Existing techniques apply traditional supervised learning methods onto historical data, contextual features and route information among dif…

    Submitted 19 May, 2021; originally announced May 2021.

    Comments: single column. Neurocomputing, 2021

  40. arXiv:2104.08572

    cs.LG cs.CV

    On Learning the Geodesic Path for Incremental Learning

    Authors: Christian Simon, Piotr Koniusz, Mehrtash Harandi

    Abstract: Neural networks notoriously suffer from the problem of catastrophic forgetting, the phenomenon of forgetting the past knowledge when acquiring new knowledge. Overcoming catastrophic forgetting is of significant importance to emulate the process of "incremental learning", where the model is capable of learning from sequential experience in an efficient and robust way. State-of-the-art techniques fo…

    Submitted 17 April, 2021; originally announced April 2021.

    Comments: Accepted to CVPR 2021

  41. Tensor Representations for Action Recognition

    Authors: Piotr Koniusz, Lei Wang, Anoop Cherian

    Abstract: Human actions in video sequences are characterized by the complex interplay between spatial features and their temporal dynamics. In this paper, we propose novel tensor representations for compactly capturing such higher-order relationships between visual features for the task of action recognition. We propose two tensor-based feature representations, viz. (i) sequence compatibility kernel (SCK) a…

    Submitted 28 August, 2021; v1 submitted 28 December, 2020; originally announced December 2020.

    Comments: Published with TPAMI, 2020. arXiv admin note: text overlap with arXiv:1604.00239

  42. Power Normalizations in Fine-grained Image, Few-shot Image and Graph Classification

    Authors: Piotr Koniusz, Hongguang Zhang

    Abstract: Power Normalizations (PN) are useful non-linear operators which tackle feature imbalances in classification problems. We study PNs in the deep learning setup via a novel PN layer pooling feature maps. Our layer combines the feature vectors and their respective spatial locations in the feature maps produced by the last convolutional layer of CNN into a positive definite matrix with second-order sta…

    Submitted 28 August, 2021; v1 submitted 27 December, 2020; originally announced December 2020.

    Comments: Accepted by TPAMI, July 2020

  43. arXiv:2009.11260  [pdf, other]

    cs.CL

    A Token-wise CNN-based Method for Sentence Compression

    Authors: Weiwei Hou, Hanna Suominen, Piotr Koniusz, Sabrina Caldwell, Tom Gedeon

    Abstract: Sentence compression is a Natural Language Processing (NLP) task aimed at shortening original sentences and preserving their key information. Its applications can benefit many fields e.g. one can build tools for language education. However, current methods are largely based on Recurrent Neural Network (RNN) models which suffer from poor processing speed. To address this issue, in this paper, we pr…

    Submitted 23 September, 2020; originally announced September 2020.

  44. arXiv:2002.03923  [pdf, other]

    cs.CV

    6DoF Object Pose Estimation via Differentiable Proxy Voting Loss

    Authors: Xin Yu, Zheyu Zhuang, Piotr Koniusz, Hongdong Li

    Abstract: Estimating a 6DOF object pose from a single image is very challenging due to occlusions or textureless appearances. Vector-field based keypoint voting has demonstrated its effectiveness and superiority on tackling those issues. However, direct regression of vector-fields neglects that the distances between pixels and keypoints also affect the deviations of hypotheses dramatically. In other words,…

    Submitted 4 May, 2020; v1 submitted 10 February, 2020; originally announced February 2020.

  45. arXiv:2002.03461  [pdf, other]

    cs.LG cs.DB cs.IR stat.ML

    Relation Embedding for Personalised POI Recommendation

    Authors: Xianjing Wang, Flora D. Salim, Yongli Ren, Piotr Koniusz

    Abstract: Point-of-Interest (POI) recommendation is one of the most important location-based services helping people discover interesting venues or services. However, the extreme user-POI matrix sparsity and the varying spatio-temporal context pose challenges for POI systems, which affects the quality of POI recommendations. To this end, we propose a translation-based relation embedding for POI recommendati…

    Submitted 19 February, 2020; v1 submitted 9 February, 2020; originally announced February 2020.

    Comments: 12 pages, 3 figures, Accepted in the 24th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2020)

  46. Self-supervising Action Recognition by Statistical Moment and Subspace Descriptors

    Authors: Lei Wang, Piotr Koniusz

    Abstract: In this paper, we build on a concept of self-supervision by taking RGB frames as input to learn to predict both action concepts and auxiliary descriptors e.g., object descriptors. So-called hallucination streams are trained to predict auxiliary cues, simultaneously fed into classification layers, and then hallucinated at the testing stage to aid network. We design and hallucinate two descriptors,…

    Submitted 5 August, 2021; v1 submitted 14 January, 2020; originally announced January 2020.

    Comments: ACM MM'21

  47. arXiv:2001.03919  [pdf, other]

    cs.CV

    Rethinking Class Relations: Absolute-relative Supervised and Unsupervised Few-shot Learning

    Authors: Hongguang Zhang, Piotr Koniusz, Songlei Jian, Hongdong Li, Philip H. S. Torr

    Abstract: The majority of existing few-shot learning methods describe image relations with binary labels. However, such binary relations are insufficient to teach the network complicated real-world relations, due to the lack of decision smoothness. Furthermore, current few-shot learning models capture only the similarity via relation labels, but they are not exposed to class concepts associated with objects…

    Submitted 9 June, 2021; v1 submitted 12 January, 2020; originally announced January 2020.

    Comments: IEEE/CVF Conference on Computer Vision and Pattern Recognition 2021

  48. arXiv:2001.03905  [pdf, other]

    cs.CV

    Few-shot Action Recognition with Permutation-invariant Attention

    Authors: Hongguang Zhang, Li Zhang, Xiaojuan Qi, Hongdong Li, Philip H. S. Torr, Piotr Koniusz

    Abstract: Many few-shot learning models focus on recognising images. In contrast, we tackle a challenging task of few-shot action recognition from videos. We build on a C3D encoder for spatio-temporal video blocks to capture short-range action patterns. Such encoded blocks are aggregated by permutation-invariant pooling to make our approach robust to varying action lengths and long-range temporal dependenci…

    Submitted 3 August, 2020; v1 submitted 12 January, 2020; originally announced January 2020.

    Comments: ECCV2020 Spotlight

  49. arXiv:2001.01600  [pdf, other]

    cs.CV

    Improving Few-shot Learning by Spatially-aware Matching and CrossTransformer

    Authors: Hongguang Zhang, Philip H. S. Torr, Piotr Koniusz

    Abstract: Current few-shot learning models capture visual object relations in the so-called meta-learning setting under a fixed-resolution input. However, such models have a limited generalization ability under the scale and location mismatch between objects, as only few samples from target classes are provided. Therefore, the lack of a mechanism to match the scale and location between pairs of compared ima…

    Submitted 8 October, 2022; v1 submitted 6 January, 2020; originally announced January 2020.

    Comments: Asian Conference on Computer Vision 2022

  50. COLTRANE: ConvolutiOnaL TRAjectory NEtwork for Deep Map Inference

    Authors: Arian Prabowo, Piotr Koniusz, Wei Shao, Flora D. Salim

    Abstract: The process of automatic generation of a road map from GPS trajectories, called map inference, remains a challenging task to perform on a geospatial data from a variety of domains as the majority of existing studies focus on road maps in cities. Inherently, existing algorithms are not guaranteed to work on unusual geospatial sites, such as an airport tarmac, pedestrianized paths and shortcuts, or…

    Submitted 24 September, 2019; originally announced September 2019.

    Comments: BuildSys 2019

    Journal ref: BuildSys 2019