Showing 1–22 of 22 results for author: Vosselman, G

  1. arXiv:2407.14920  [pdf, other]

    cs.CV

    RoIPoly: Vectorized Building Outline Extraction Using Vertex and Logit Embeddings

    Authors: Weiqin Jiao, Hao Cheng, Claudio Persello, George Vosselman

    Abstract: Polygonal building outlines are crucial for geographic and cartographic applications. The existing approaches for outline extraction from aerial or satellite imagery are typically decomposed into subtasks, e.g., building masking and vectorization, or treat this task as a sequence-to-sequence prediction of ordered vertices. The former lacks efficiency, and the latter often generates redundant verti…

    Submitted 20 July, 2024; originally announced July 2024.

  2. arXiv:2407.14912  [pdf, other]

    cs.CV

    PolyR-CNN: R-CNN for end-to-end polygonal building outline extraction

    Authors: Weiqin Jiao, Claudio Persello, George Vosselman

    Abstract: Polygonal building outline extraction has been a research focus in recent years. Most existing methods have addressed this challenging task by decomposing it into several subtasks and employing carefully designed architectures. Despite their accuracy, such pipelines often introduce inefficiencies during training and inference. This paper presents an end-to-end framework, denoted as PolyR-CNN, whic…

    Submitted 20 July, 2024; originally announced July 2024.

  3. arXiv:2406.11472  [pdf, other]

    cs.CV

    Learning from Exemplars for Interactive Image Segmentation

    Authors: Kun Li, Hao Cheng, George Vosselman, Michael Ying Yang

    Abstract: Interactive image segmentation enables users to interact minimally with a machine, facilitating the gradual refinement of the segmentation mask for a target of interest. Previous studies have demonstrated impressive performance in extracting a single target mask through interactive segmentation. However, the information cues of previously interacted objects have been overlooked in the existing met…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Under review

  4. arXiv:2403.12848  [pdf, other]

    cs.CV

    Planner3D: LLM-enhanced graph prior meets 3D indoor scene explicit regularization

    Authors: Yao Wei, Martin Renqiang Min, George Vosselman, Li Erran Li, Michael Ying Yang

    Abstract: Compositional 3D scene synthesis has diverse applications across a spectrum of industries such as robotics, films, and video games, as it closely mirrors the complexity of real-world multi-object environments. Conventional works typically employ shape-retrieval-based frameworks, which naturally suffer from limited shape diversity. Recent progress has been made in object shape generation with gen…

    Submitted 26 August, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

    Comments: 16 pages, 10 figures

  5. arXiv:2402.03896  [pdf, other]

    cs.CV

    Convincing Rationales for Visual Question Answering Reasoning

    Authors: Kun Li, George Vosselman, Michael Ying Yang

    Abstract: Visual Question Answering (VQA) is a challenging task of predicting the answer to a question about the content of an image. It requires deep understanding of both the textual question and visual image. Prior works directly evaluate the answering models by simply calculating the accuracy of the predicted answers. However, the inner reasoning behind the prediction is disregarded in such a "black box…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: under review

  6. arXiv:2309.00158  [pdf, other]

    cs.CV

    BuilDiff: 3D Building Shape Generation using Single-Image Conditional Point Cloud Diffusion Models

    Authors: Yao Wei, George Vosselman, Michael Ying Yang

    Abstract: 3D building generation with low data acquisition costs, such as single image-to-3D, is becoming increasingly important. However, most of the existing single image-to-3D building creation works are restricted to images with specific viewing angles; hence, they are difficult to scale to general-view images that commonly appear in practical cases. To fill this gap, we propose a novel 3D building sha…

    Submitted 31 August, 2023; originally announced September 2023.

    Comments: 10 pages, 6 figures, accepted to ICCVW2023

  7. arXiv:2308.05515  [pdf]

    cs.RO cs.AI

    Mono-hydra: Real-time 3D scene graph construction from monocular camera input with IMU

    Authors: U. V. B. L. Udugama, G. Vosselman, F. Nex

    Abstract: The ability of robots to autonomously navigate through 3D environments depends on their comprehension of spatial concepts, ranging from low-level geometry to high-level semantics, such as objects, places, and buildings. To enable such comprehension, 3D scene graphs have emerged as a robust tool for representing the environment as a layered graph of concepts and their relationships. However, buildi…

    Submitted 10 August, 2023; originally announced August 2023.

    Comments: 7 pages, 5 figures, GSW 2023 conference paper

  8. arXiv:2307.02280  [pdf, other]

    cs.CV

    Interactive Image Segmentation with Cross-Modality Vision Transformers

    Authors: Kun Li, George Vosselman, Michael Ying Yang

    Abstract: Interactive image segmentation aims to segment the target from the background with manual guidance, taking as input multimodal data such as images, clicks, scribbles, and bounding boxes. Recently, vision transformers have achieved great success in several downstream visual tasks, and a few efforts have been made to bring this powerful architecture to the interactive segmentation task. Howev…

    Submitted 5 July, 2023; originally announced July 2023.

    Comments: 16 pages

  9. arXiv:2303.10386  [pdf, other]

    cs.RO cs.CV

    Channel-Aware Distillation Transformer for Depth Estimation on Nano Drones

    Authors: Ning Zhang, Francesco Nex, George Vosselman, Norman Kerle

    Abstract: Autonomous navigation of drones using computer vision has achieved promising performance. Nano-sized drones based on edge computing platforms are lightweight, flexible, and cheap, thus suitable for exploring narrow spaces. However, due to their extremely limited computing power and storage, vision algorithms designed for high-performance GPU platforms cannot be used for nano drones. To address thi…

    Submitted 18 March, 2023; originally announced March 2023.

  10. arXiv:2301.09460  [pdf, other]

    cs.CV

    HRVQA: A Visual Question Answering Benchmark for High-Resolution Aerial Images

    Authors: Kun Li, George Vosselman, Michael Ying Yang

    Abstract: Visual question answering (VQA) is an important and challenging multimodal task in computer vision. Recently, a few efforts have been made to bring the VQA task to aerial images, due to its potential real-world applications in disaster monitoring, urban planning, and digital earth product generation. However, not only the huge variation in the appearance, scale and orientation of the concepts in aeria…

    Submitted 23 January, 2023; originally announced January 2023.

  11. arXiv:2211.13202  [pdf, other]

    cs.CV

    Lite-Mono: A Lightweight CNN and Transformer Architecture for Self-Supervised Monocular Depth Estimation

    Authors: Ning Zhang, Francesco Nex, George Vosselman, Norman Kerle

    Abstract: Self-supervised monocular depth estimation that does not require ground truth for training has attracted attention in recent years. It is of high interest to design lightweight but effective models so that they can be deployed on edge devices. Many existing architectures benefit from using heavier backbones at the expense of model sizes. This paper achieves comparable results with a lightweight ar…

    Submitted 15 March, 2023; v1 submitted 23 November, 2022; originally announced November 2022.

    Comments: Accepted to CVPR2023

  12. arXiv:2210.04072  [pdf, other]

    cs.CV

    Flow-based GAN for 3D Point Cloud Generation from a Single Image

    Authors: Yao Wei, George Vosselman, Michael Ying Yang

    Abstract: Generating a 3D point cloud from a single 2D image is of great importance for 3D scene understanding applications. To reconstruct the whole 3D shape of the object shown in the image, the existing deep learning based approaches use either explicit or implicit generative modeling of point clouds, which, however, suffer from limited quality. In this work, we aim to alleviate this issue by introducing…

    Submitted 8 October, 2022; originally announced October 2022.

    Comments: 13 pages, 5 figures, accepted to BMVC2022

  13. arXiv:2102.03099  [pdf, other]

    cs.CV

    Bidirectional Multi-scale Attention Networks for Semantic Segmentation of Oblique UAV Imagery

    Authors: Ye Lyu, George Vosselman, Gui-Song Xia, Michael Ying Yang

    Abstract: Semantic segmentation for aerial platforms has been one of the fundamental scene understanding tasks for earth observation. Most semantic segmentation research has focused on scenes captured in nadir view, in which objects have relatively smaller scale variation compared with scenes captured in oblique view. The huge scale variation of objects in oblique images limits the performance of deep…

    Submitted 5 February, 2021; originally announced February 2021.

  14. arXiv:2012.10192  [pdf]

    cs.CV

    LGENet: Local and Global Encoder Network for Semantic Segmentation of Airborne Laser Scanning Point Clouds

    Authors: Yaping Lin, George Vosselman, Yanpeng Cao, Michael Ying Yang

    Abstract: Interpretation of Airborne Laser Scanning (ALS) point clouds is a critical procedure for producing various geo-information products like 3D city models, digital terrain models and land use maps. In this paper, we present a local and global encoder network (LGENet) for semantic segmentation of ALS point clouds. Adapting the KPConv network, we first extract features by both 2D and 3D point convoluti…

    Submitted 18 December, 2020; originally announced December 2020.

    Comments: Submitted to ISPRS Journal of Photogrammetry and Remote Sensing

  15. arXiv:2003.00981  [pdf, other]

    cs.CV

    Plug & Play Convolutional Regression Tracker for Video Object Detection

    Authors: Ye Lyu, Michael Ying Yang, George Vosselman, Gui-Song Xia

    Abstract: Video object detection aims to simultaneously localize the bounding boxes of the objects and identify their classes in a given video. One challenge for video object detection is to consistently detect all objects across the whole video. As the appearance of objects may deteriorate in some frames, features or detections from the other frames are commonly used to enhance the prediction. In this p…

    Submitted 2 March, 2020; originally announced March 2020.

  16. arXiv:1910.00032  [pdf, other]

    cs.CV

    LIP: Learning Instance Propagation for Video Object Segmentation

    Authors: Ye Lyu, George Vosselman, Gui-Song Xia, Michael Ying Yang

    Abstract: In recent years, the task of segmenting foreground objects from background in a video, i.e. video object segmentation (VOS), has received considerable attention. In this paper, we propose a single end-to-end trainable deep neural network, convolutional gated recurrent Mask-RCNN, for tackling the semi-supervised VOS task. We take advantage of both the instance segmentation network (Mask-RCNN) and t…

    Submitted 30 September, 2019; originally announced October 2019.

    Comments: ICCVW19

  17. arXiv:1904.12586  [pdf]

    cs.CV

    Robust object extraction from remote sensing data

    Authors: Sophie Crommelinck, Mila Koeva, Michael Ying Yang, George Vosselman

    Abstract: The extraction of object outlines has been a research topic during the last decades. In spite of advances in photogrammetry, remote sensing and computer vision, this task remains challenging due to object and data complexity. The development of object extraction approaches is promoted through publicly available benchmark datasets and evaluation frameworks. Many aspects of performance evaluation…

    Submitted 3 April, 2019; originally announced April 2019.

    Comments: unpublished study (15 pages)

  18. arXiv:1904.03692  [pdf, other]

    cs.CV

    Unsupervised Domain Adaptation for Multispectral Pedestrian Detection

    Authors: Dayan Guan, Xing Luo, Yanpeng Cao, Jiangxin Yang, Yanlong Cao, George Vosselman, Michael Ying Yang

    Abstract: Multimodal information (e.g., visible and thermal) can generate robust pedestrian detections to facilitate around-the-clock computer vision applications, such as autonomous driving and video surveillance. However, it still remains a crucial challenge to train a reliable detector working well in different multispectral pedestrian datasets without manual annotations. In this paper, we propose a nove…

    Submitted 7 April, 2019; originally announced April 2019.

  19. arXiv:1810.10438  [pdf, other]

    cs.CV

    UAVid: A Semantic Segmentation Dataset for UAV Imagery

    Authors: Ye Lyu, George Vosselman, Guisong Xia, Alper Yilmaz, Michael Ying Yang

    Abstract: Semantic segmentation has been one of the leading research interests in computer vision recently. It serves as a perception foundation for many fields, such as robotics and autonomous driving. The fast development of semantic segmentation is largely attributable to large-scale datasets, especially for deep-learning-related methods. There already exist several semantic segmentation datasets f…

    Submitted 18 May, 2020; v1 submitted 24 October, 2018; originally announced October 2018.

    Comments: Accepted by ISPRS Journal of Photogrammetry and Remote Sensing

  20. arXiv:1807.09562  [pdf, other]

    cs.CV

    Change Detection between Multimodal Remote Sensing Data Using Siamese CNN

    Authors: Zhenchao Zhang, George Vosselman, Markus Gerke, Devis Tuia, Michael Ying Yang

    Abstract: Detecting topographic changes in the urban environment has always been an important task for urban planning and monitoring. In practice, remote sensing data are often available in different modalities and at different time epochs. Change detection between multimodal data can be very challenging since the data show different characteristics. Given 3D laser scanning point clouds and 2D imagery from…

    Submitted 25 July, 2018; originally announced July 2018.

  21. arXiv:1807.09546  [pdf]

    cs.CV

    Patch-based Evaluation of Dense Image Matching Quality

    Authors: Zhenchao Zhang, Markus Gerke, George Vosselman, Michael Ying Yang

    Abstract: Airborne laser scanning and photogrammetry are two main techniques to obtain 3D data representing the object surface. Due to the high cost of laser scanning, we want to explore the potential of using point clouds derived by dense image matching (DIM), as effective alternatives to laser scanning data. We present a framework to evaluate point clouds from dense image matching and derived Digital Surf…

    Submitted 25 July, 2018; originally announced July 2018.

    Comments: 16 pages

    Journal ref: International Journal of Applied Earth Observation and Geoinformation, 2018

  22. arXiv:1709.01813  [pdf]

    cs.CV

    Towards Automated Cadastral Boundary Delineation from UAV Data

    Authors: Sophie Crommelinck, Michael Ying Yang, Mila Koeva, Markus Gerke, Rohan Bennett, George Vosselman

    Abstract: Unmanned aerial vehicles (UAVs) are evolving as an alternative tool to acquire land tenure data. UAVs can capture geospatial data at high quality and resolution in a cost-effective, transparent and flexible manner, from which visible land parcel boundaries, i.e., cadastral boundaries, are delineable. This delineation is not automated to any extent, even though physical objects automatically retrievable t…

    Submitted 6 September, 2017; originally announced September 2017.

    Comments: Report on current state (August 2017) of PhD work of first author. Further info: https://its4land.com/automate-it-wp5/