Showing 1–18 of 18 results for author: Shimada, S

  1. arXiv:2406.17988

    cs.CV

    DICE: End-to-end Deformation Capture of Hand-Face Interactions from a Single Image

    Authors: Qingxuan Wu, Zhiyang Dou, Sirui Xu, Soshi Shimada, Chen Wang, Zhengming Yu, Yuan Liu, Cheng Lin, Zeyu Cao, Taku Komura, Vladislav Golyanik, Christian Theobalt, Wenping Wang, Lingjie Liu

    Abstract: Reconstructing 3D hand-face interactions with deformations from a single image is a challenging yet crucial task with broad applications in AR, VR, and gaming. The challenges stem from self-occlusions during single-view hand-face interactions, diverse spatial relationships between hands and face, complex deformations, and the ambiguity of the single-view setting. The first and only method for hand…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 23 pages, 9 figures, 3 tables

  2. arXiv:2312.14929

    cs.CV cs.GR

    MACS: Mass Conditioned 3D Hand and Object Motion Synthesis

    Authors: Soshi Shimada, Franziska Mueller, Jan Bednarik, Bardia Doosti, Bernd Bickel, Danhang Tang, Vladislav Golyanik, Jonathan Taylor, Christian Theobalt, Thabo Beeler

    Abstract: The physical properties of an object, such as mass, significantly affect how we manipulate it with our hands. Surprisingly, this aspect has so far been neglected in prior work on 3D motion synthesis. To improve the naturalness of the synthesized 3D hand-object motions, this work proposes MACS, the first MAss Conditioned 3D hand and object motion Synthesis approach. Our approach is based on cascaded…

    Submitted 22 December, 2023; originally announced December 2023.

  3. arXiv:2309.16670

    cs.CV cs.GR cs.HC

    Decaf: Monocular Deformation Capture for Face and Hand Interactions

    Authors: Soshi Shimada, Vladislav Golyanik, Patrick Pérez, Christian Theobalt

    Abstract: Existing methods for 3D tracking from monocular RGB videos predominantly consider articulated and rigid objects. Modelling dense non-rigid object deformations in this setting remained largely unaddressed so far, although such effects can improve the realism of the downstream applications such as AR/VR and avatar communications. This is due to the severe ill-posedness of the monocular view setting…

    Submitted 13 October, 2023; v1 submitted 28 September, 2023; originally announced September 2023.

  4. arXiv:2208.08439

    cs.CV

    MoCapDeform: Monocular 3D Human Motion Capture in Deformable Scenes

    Authors: Zhi Li, Soshi Shimada, Bernt Schiele, Christian Theobalt, Vladislav Golyanik

    Abstract: 3D human motion capture from monocular RGB images respecting interactions of a subject with complex and possibly deformable environments is a very challenging, ill-posed and under-explored problem. Existing methods address it only weakly and do not model possible surface deformations often occurring when humans interact with scene surfaces. In contrast, this paper proposes MoCapDeform, i.e., a new…

    Submitted 17 August, 2022; originally announced August 2022.

    Comments: 11 pages, 8 figures, 3 tables; project page: https://4dqv.mpi-inf.mpg.de/MoCapDeform/

    Journal ref: International Conference on 3D Vision 2022 (Oral)

  5. arXiv:2208.01633

    cs.CV

    UnrealEgo: A New Dataset for Robust Egocentric 3D Human Motion Capture

    Authors: Hiroyasu Akada, Jian Wang, Soshi Shimada, Masaki Takahashi, Christian Theobalt, Vladislav Golyanik

    Abstract: We present UnrealEgo, i.e., a new large-scale naturalistic dataset for egocentric 3D human pose estimation. UnrealEgo is based on an advanced concept of eyeglasses equipped with two fisheye cameras that can be used in unconstrained environments. We design their virtual prototype and attach them to 3D human models for stereo view capture. We next generate a large corpus of human motions. As a conse…

    Submitted 2 August, 2022; originally announced August 2022.

    Comments: 21 pages, 10 figures, 10 tables; project page: https://4dqv.mpi-inf.mpg.de/UnrealEgo/

    Journal ref: European Conference on Computer Vision (ECCV) 2022

  6. arXiv:2206.08368

    cs.CV

    Unbiased 4D: Monocular 4D Reconstruction with a Neural Deformation Model

    Authors: Erik C. M. Johnson, Marc Habermann, Soshi Shimada, Vladislav Golyanik, Christian Theobalt

    Abstract: Capturing general deforming scenes from monocular RGB video is crucial for many computer graphics and vision applications. However, current approaches suffer from drawbacks such as struggling with large scene deformations, inaccurate shape completion or requiring 2D point tracks. In contrast, our method, Ub4D, handles large deformations, performs shape completion in occluded regions, and can opera…

    Submitted 4 May, 2023; v1 submitted 16 June, 2022; originally announced June 2022.

    Comments: 26 pages, 17 figures, 8 tables

  7. arXiv:2205.05677

    cs.CV cs.GR cs.HC

    HULC: 3D Human Motion Capture with Pose Manifold Sampling and Dense Contact Guidance

    Authors: Soshi Shimada, Vladislav Golyanik, Zhi Li, Patrick Pérez, Weipeng Xu, Christian Theobalt

    Abstract: Marker-less monocular 3D human motion capture (MoCap) with scene interactions is a challenging research topic relevant for extended reality, robotics and virtual avatar generation. Due to the inherent depth ambiguity of monocular settings, 3D motions captured with existing methods often contain severe artefacts such as incorrect body-scene inter-penetrations, jitter and body floating. To tackle th…

    Submitted 26 July, 2022; v1 submitted 11 May, 2022; originally announced May 2022.

  8. arXiv:2203.08528

    cs.GR

    Physical Inertial Poser (PIP): Physics-aware Real-time Human Motion Tracking from Sparse Inertial Sensors

    Authors: Xinyu Yi, Yuxiao Zhou, Marc Habermann, Soshi Shimada, Vladislav Golyanik, Christian Theobalt, Feng Xu

    Abstract: Motion capture from sparse inertial sensors has shown great potential compared to image-based approaches since occlusions do not lead to a reduced tracking quality and the recording space is not restricted to be within the viewing frustum of the camera. However, capturing the motion and global position only from a sparse set of inertial sensors is inherently ambiguous and challenging. In consequen…

    Submitted 16 March, 2022; v1 submitted 16 March, 2022; originally announced March 2022.

    Comments: Accepted by CVPR 2022 with 3 strong accepts. Project page: https://xinyu-yi.github.io/PIP/

  9. arXiv:2108.08844

    cs.CV

    Gravity-Aware Monocular 3D Human-Object Reconstruction

    Authors: Rishabh Dabral, Soshi Shimada, Arjun Jain, Christian Theobalt, Vladislav Golyanik

    Abstract: This paper proposes GraviCap, i.e., a new approach for joint markerless 3D human motion capture and object trajectory estimation from monocular RGB videos. We focus on scenes with objects partially observed during a free flight. In contrast to existing monocular methods, we can recover scale, object trajectories as well as human bone lengths in meters and the ground plane's orientation, thanks to…

    Submitted 19 August, 2021; originally announced August 2021.

    Comments: 12 pages, six figures, five tables; project webpage: http://4dqv.mpi-inf.mpg.de/GraviCap/

    Journal ref: International Conference on Computer Vision (ICCV) 2021

  10. arXiv:2107.01205

    cs.CV

    HandVoxNet++: 3D Hand Shape and Pose Estimation using Voxel-Based Neural Networks

    Authors: Jameel Malik, Soshi Shimada, Ahmed Elhayek, Sk Aziz Ali, Christian Theobalt, Vladislav Golyanik, Didier Stricker

    Abstract: 3D hand shape and pose estimation from a single depth map is a new and challenging computer vision problem with many applications. Existing methods addressing it directly regress hand meshes via 2D convolutional neural networks, which leads to artefacts due to perspective distortions in the images. To address the limitations of the existing methods, we develop HandVoxNet++, i.e., a voxel-based dee…

    Submitted 5 December, 2021; v1 submitted 2 July, 2021; originally announced July 2021.

    Comments: 13 pages, 6 tables, 7 figures; project webpage: http://4dqv.mpi-inf.mpg.de/HandVoxNet++/. arXiv admin note: text overlap with arXiv:2004.01588

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021

  11. arXiv:2106.11308

    cs.CV

    Fast Simultaneous Gravitational Alignment of Multiple Point Sets

    Authors: Vladislav Golyanik, Soshi Shimada, Christian Theobalt

    Abstract: The problem of simultaneous rigid alignment of multiple unordered point sets which is unbiased towards any of the inputs has recently attracted increasing interest, and several reliable methods have been newly proposed. While being remarkably robust towards noise and clustered outliers, current approaches require sophisticated initialisation schemes and do not scale well to large point sets. This…

    Submitted 21 June, 2021; originally announced June 2021.

    Comments: Project webpage: http://gvv.mpi-inf.mpg.de/projects/MBGA/

    Journal ref: 3DV 2020

  12. arXiv:2105.01057

    cs.CV cs.GR cs.HC

    Neural Monocular 3D Human Motion Capture with Physical Awareness

    Authors: Soshi Shimada, Vladislav Golyanik, Weipeng Xu, Patrick Pérez, Christian Theobalt

    Abstract: We present a new trainable system for physically plausible markerless 3D human motion capture, which achieves state-of-the-art results in a broad range of challenging scenarios. Unlike most neural methods for human motion capture, our approach, which we dub physionical, is aware of physical and environmental constraints. It combines in a fully differentiable way several key innovations, i.e., 1. a…

    Submitted 3 May, 2021; originally announced May 2021.

  13. arXiv:2008.08880

    cs.CV cs.GR

    PhysCap: Physically Plausible Monocular 3D Motion Capture in Real Time

    Authors: Soshi Shimada, Vladislav Golyanik, Weipeng Xu, Christian Theobalt

    Abstract: Marker-less 3D human motion capture from a single colour camera has seen significant progress. However, it is a very challenging and severely ill-posed problem. In consequence, even the most accurate state-of-the-art approaches have significant limitations. Purely kinematic formulations on the basis of individual joints or skeletons, and the frequent frame-wise reconstruction in state-of-the-art m…

    Submitted 9 December, 2020; v1 submitted 20 August, 2020; originally announced August 2020.

    Comments: 16 pages, 11 figures

  14. arXiv:2004.04023

    cs.CV

    A Deep Learning Approach for Determining Effects of Tuta Absoluta in Tomato Plants

    Authors: Denis P. Rubanga, Loyani K. Loyani, Mgaya Richard, Sawahiko Shimada

    Abstract: Early quantification of Tuta absoluta pest's effects in tomato plants is a very important factor in controlling and preventing serious damages of the pest. The invasion of Tuta absoluta is considered a major threat to tomato production causing heavy loss ranging from 80 to 100 percent when not properly managed. Therefore, real-time and early quantification of tomato leaf miner Tuta absoluta, can p…

    Submitted 8 April, 2020; originally announced April 2020.

    Comments: Paper presented at the ICLR 2020 Workshop on Computer Vision for Agriculture (CV4A)

  15. arXiv:2004.01588

    cs.CV

    HandVoxNet: Deep Voxel-Based Network for 3D Hand Shape and Pose Estimation from a Single Depth Map

    Authors: Jameel Malik, Ibrahim Abdelaziz, Ahmed Elhayek, Soshi Shimada, Sk Aziz Ali, Vladislav Golyanik, Christian Theobalt, Didier Stricker

    Abstract: 3D hand shape and pose estimation from a single depth map is a new and challenging computer vision problem with many applications. The state-of-the-art methods directly regress 3D hand meshes from 2D depth images via 2D convolutional neural networks, which leads to artefacts in the estimations due to perspective distortions in the images. In contrast, we propose a novel architecture with 3D convol…

    Submitted 3 April, 2020; originally announced April 2020.

    Comments: 10 pages, 8 figures, 5 tables, CVPR

  16. arXiv:1907.10367

    cs.CV cs.CG

    DispVoxNets: Non-Rigid Point Set Alignment with Supervised Learning Proxies

    Authors: Soshi Shimada, Vladislav Golyanik, Edgar Tretschk, Didier Stricker, Christian Theobalt

    Abstract: We introduce a supervised-learning framework for non-rigid point set alignment of a new kind - Displacements on Voxels Networks (DispVoxNets) - which abstracts away from the point set representation and regresses 3D displacement fields on regularly sampled proxy 3D voxel grids. Thanks to recently released collections of deformable objects with known intra-state correspondences, DispVoxNets learn a…

    Submitted 6 August, 2019; v1 submitted 24 July, 2019; originally announced July 2019.

  17. arXiv:1904.12144

    cs.CV

    IsMo-GAN: Adversarial Learning for Monocular Non-Rigid 3D Reconstruction

    Authors: Soshi Shimada, Vladislav Golyanik, Christian Theobalt, Didier Stricker

    Abstract: The majority of the existing methods for non-rigid 3D surface regression from monocular 2D images require an object template or point tracks over multiple frames as an input, and are still far from real-time processing rates. In this work, we present the Isometry-Aware Monocular Generative Adversarial Network (IsMo-GAN) - an approach for direct 3D reconstruction from a single image, trained for th…

    Submitted 21 June, 2021; v1 submitted 27 April, 2019; originally announced April 2019.

    Comments: 13 pages, 11 figures, 4 tables, 6 sections, 73 references

  18. arXiv:1803.10193

    cs.CV

    HDM-Net: Monocular Non-Rigid 3D Reconstruction with Learned Deformation Model

    Authors: Vladislav Golyanik, Soshi Shimada, Kiran Varanasi, Didier Stricker

    Abstract: Monocular dense 3D reconstruction of deformable objects is a hard ill-posed problem in computer vision. Current techniques either require dense correspondences and rely on motion and deformation cues, or assume a highly accurate reconstruction (referred to as a template) of at least a single frame given in advance and operate in the manner of non-rigid tracking. Accurate computation of dense point…

    Submitted 5 August, 2019; v1 submitted 27 March, 2018; originally announced March 2018.

    Comments: 9 pages, 9 figures