Showing 1–29 of 29 results for author: Ukita, N

  1. arXiv:2403.19428  [pdf, other]

    cs.CV

    Burst Super-Resolution with Diffusion Models for Improving Perceptual Quality

    Authors: Kyotaro Tokoro, Kazutoshi Akita, Norimichi Ukita

    Abstract: While burst LR images are useful for improving the SR image quality compared with a single LR image, prior SR networks accepting the burst LR images are trained in a deterministic manner, which is known to produce a blurry SR image. In addition, it is difficult to perfectly align the burst LR images, making the SR image more blurry. Since such blurry images are perceptually degraded, we aim to rec…

    Submitted 8 April, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

    Comments: Accepted to IJCNN 2024 (International Joint Conference on Neural Networks)

  2. arXiv:2403.15849  [pdf, other]

    cs.CV

    Inpainting-Driven Mask Optimization for Object Removal

    Authors: Kodai Shimosato, Norimichi Ukita

    Abstract: This paper proposes a mask optimization method for improving the quality of object removal using image inpainting. While many inpainting methods are trained with a set of random masks, a target for inpainting may be an object, such as a person, in many realistic scenarios. This domain gap between masks in training and inference images increases the difficulty of the inpainting task. In our method,…

    Submitted 23 March, 2024; originally announced March 2024.

    Comments: Accepted to IJCNN 2024 (International Joint Conference on Neural Networks)

  3. arXiv:2403.15832  [pdf, other]

    cs.CV

    Time-series Initialization and Conditioning for Video-agnostic Stabilization of Video Super-Resolution using Recurrent Networks

    Authors: Hiroshi Mori, Norimichi Ukita

    Abstract: A Recurrent Neural Network (RNN) for Video Super Resolution (VSR) is generally trained with randomly clipped and cropped short videos extracted from original training videos due to various challenges in learning RNNs. However, since this RNN is optimized to super-resolve short videos, VSR of long videos is degraded due to the domain gap. Our preliminary experiments reveal that such degradation cha…

    Submitted 23 March, 2024; originally announced March 2024.

    Comments: Accepted to IJCNN 2024 (International Joint Conference on Neural Networks)

  4. arXiv:2403.15787  [pdf, other]

    cs.CV

    Depth Estimation fusing Image and Radar Measurements with Uncertain Directions

    Authors: Masaya Kotani, Takeru Oba, Norimichi Ukita

    Abstract: This paper proposes a depth estimation method using radar-image fusion by addressing the uncertain vertical directions of sparse radar measurements. In prior radar-image fusion work, image features are merged with the uncertain sparse depths measured by radar through convolutional layers. This approach is disturbed by the features computed with the uncertain radar depths. Furthermore, since the fe…

    Submitted 23 March, 2024; originally announced March 2024.

    Comments: Accepted to IJCNN 2024 (International Joint Conference on Neural Networks)

  5. arXiv:2403.08995  [pdf, other]

    cs.CV

    NTIRE 2023 Image Shadow Removal Challenge Technical Report: Team IIM_TTI

    Authors: Yuki Kondo, Riku Miyata, Fuma Yasue, Taito Naruki, Norimichi Ukita

    Abstract: In this paper, we analyze and discuss ShadowFormer in preparation for the NTIRE2023 Shadow Removal Challenge [1], implementing five key improvements: image alignment, the introduction of a perceptual quality loss function, the semi-automatic annotation for shadow detection, joint learning of shadow detection and removal, and the introduction of new data augmentation technique "CutShadow" for shado…

    Submitted 14 March, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

    Comments: This version is a brief technical report submitted to the organizers, and there are still some points to be added; please wait for updates until May 2024. The code can be found here (https://github.com/Yuki-11/NTIRE2023_ShadowRemoval_IIM_TTI)

  6. arXiv:2403.02753  [pdf, other]

    cs.CV

    Learning Group Activity Features Through Person Attribute Prediction

    Authors: Chihiro Nakatani, Hiroaki Kawashima, Norimichi Ukita

    Abstract: This paper proposes Group Activity Feature (GAF) learning in which features of multi-person activity are learned as a compact latent vector. Unlike prior work in which the manual annotation of group activities is required for supervised learning, our method learns the GAF through person attribute prediction without group activity annotations. By learning the whole network in an end-to-end manner s…

    Submitted 11 March, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

    Comments: Accepted to CVPR2024

  7. arXiv:2311.05041  [pdf, other]

    cs.CV cs.LG

    Active Transfer Learning for Efficient Video-Specific Human Pose Estimation

    Authors: Hiromu Taketsugu, Norimichi Ukita

    Abstract: Human Pose (HP) estimation is actively researched because of its wide range of applications. However, even estimators pre-trained on large datasets may not perform satisfactorily due to a domain gap between the training and test data. To address this issue, we present our approach combining Active Learning (AL) and Transfer Learning (TL) to adapt HP estimators to individual video domains efficient…

    Submitted 8 November, 2023; originally announced November 2023.

    Comments: 17 pages, 12 figures, Accepted by WACV 2024

    ACM Class: I.2.10; I.4.8

  8. arXiv:2310.08116  [pdf, other]

    cs.RO cs.CV

    Multimodal Active Measurement for Human Mesh Recovery in Close Proximity

    Authors: Takahiro Maeda, Keisuke Takeshita, Norimichi Ukita, Kazuhito Tanaka

    Abstract: For physical human-robot interactions (pHRI), a robot needs to estimate the accurate body pose of a target person. However, in these pHRI scenarios, the robot cannot fully observe the target person's body with equipped cameras because the target person must be close to the robot for physical interaction. This close distance leads to severe truncation and occlusions and thus results in poor accurac…

    Submitted 19 July, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

  9. arXiv:2308.08824  [pdf, other]

    cs.CV

    Fast Inference and Update of Probabilistic Density Estimation on Trajectory Prediction

    Authors: Takahiro Maeda, Norimichi Ukita

    Abstract: Safety-critical applications such as autonomous vehicles and social robots require fast computation and accurate probability density estimation on trajectory prediction. To address both requirements, this paper presents a new normalizing flow-based trajectory prediction model named FlowChain. FlowChain is a stack of conditional continuously-indexed flows (CIFs) that are expressive and allow analyt…

    Submitted 17 August, 2023; originally announced August 2023.

    Comments: Accepted at ICCV2023

  10. arXiv:2308.05382  [pdf, other]

    cs.CV

    Interaction-aware Joint Attention Estimation Using People Attributes

    Authors: Chihiro Nakatani, Hiroaki Kawashima, Norimichi Ukita

    Abstract: This paper proposes joint attention estimation in a single image. Different from related work in which only the gaze-related attributes of people are independently employed, (i) their locations and actions are also employed as contextual cues for weighting their attributes, and (ii) interactions among all of these attributes are explicitly modeled in our method. For the interaction modeling, we pr…

    Submitted 10 August, 2023; originally announced August 2023.

    Comments: Accepted to ICCV2023

  11. MVA2023 Small Object Detection Challenge for Spotting Birds: Dataset, Methods, and Results

    Authors: Yuki Kondo, Norimichi Ukita, Takayuki Yamaguchi, Hao-Yu Hou, Mu-Yi Shen, Chia-Chi Hsu, En-Ming Huang, Yu-Chen Huang, Yu-Cheng Xia, Chien-Yao Wang, Chun-Yi Lee, Da Huo, Marc A. Kastner, Tingwei Liu, Yasutomo Kawanishi, Takatsugu Hirayama, Takahiro Komamizu, Ichiro Ide, Yosuke Shinya, Xinyao Liu, Guang Liang, Syusuke Yasui

    Abstract: Small Object Detection (SOD) is an important machine vision topic because (i) a variety of real-world applications require object detection for distant objects and (ii) SOD is a challenging task due to the noisy, blurred, and less-informative image appearances of small objects. This paper proposes a new SOD dataset consisting of 39,070 images including 137,121 bird instances, which is called the S…

    Submitted 18 July, 2023; originally announced July 2023.

    Comments: This paper is included in the proceedings of the 18th International Conference on Machine Vision Applications (MVA2023). It will be officially published at a later date. Project page: https://www.mva-org.jp/mva2023/challenge

    Journal ref: 2023 18th International Conference on Machine Vision and Applications (MVA)

  12. arXiv:2306.09483  [pdf, other]

    cs.CV cs.LG cs.RO

    R2-Diff: Denoising by diffusion as a refinement of retrieved motion for image-based motion prediction

    Authors: Takeru Oba, Norimichi Ukita

    Abstract: Image-based motion prediction is one of the essential techniques for robot manipulation. Among the various prediction models, we focus on diffusion models because they have achieved state-of-the-art performance in various applications. In image-based motion prediction, diffusion models stochastically predict contextually appropriate motion by gradually denoising random Gaussian noise based on the…

    Submitted 15 June, 2023; originally announced June 2023.

    Comments: 22 pages, preprint submitted to Neurocomputing

    MSC Class: 68T40

  13. arXiv:2302.12491  [pdf, other]

    cs.CV cs.AI eess.IV

    Joint Learning of Blind Super-Resolution and Crack Segmentation for Realistic Degraded Images

    Authors: Yuki Kondo, Norimichi Ukita

    Abstract: This paper proposes crack segmentation augmented by super resolution (SR) with deep neural networks. In the proposed method, a SR network is jointly trained with a binary segmentation network in an end-to-end manner. This joint learning allows the SR network to be optimized for improving segmentation results. For realistic scenarios, the SR network is extended from non-blind to blind for processin…

    Submitted 25 February, 2024; v1 submitted 24 February, 2023; originally announced February 2023.

    Comments: Accepted to IEEE Transactions on Instrumentation and Measurement (TIM) 2024. The project page is located at https://yuki-11.github.io/CSBSR-project-page/

  14. arXiv:2302.11208  [pdf, other]

    cs.CV cs.LG

    KS-DETR: Knowledge Sharing in Attention Learning for Detection Transformer

    Authors: Kaikai Zhao, Norimichi Ukita

    Abstract: Scaled dot-product attention applies a softmax function on the scaled dot-product of queries and keys to calculate weights and then multiplies the weights and values. In this work, we study how to improve the learning of scaled dot-product attention to improve the accuracy of DETR. Our method is based on the following observations: using ground truth foreground-background mask (GT Fg-Bg Mask) as a…

    Submitted 16 March, 2023; v1 submitted 22 February, 2023; originally announced February 2023.

  15. arXiv:2302.08478  [pdf, other]

    eess.IV cs.CV

    Kernelized Back-Projection Networks for Blind Super Resolution

    Authors: Tomoki Yoshida, Yuki Kondo, Takahiro Maeda, Kazutoshi Akita, Norimichi Ukita

    Abstract: Since non-blind Super Resolution (SR) fails to super-resolve Low-Resolution (LR) images degraded by arbitrary degradations, SR with the degradation model is required. However, this paper reveals that non-blind SR that is trained simply with various blur kernels exhibits comparable performance as those with the degradation model for blind SR. This result motivates us to revisit high-performance non…

    Submitted 27 October, 2023; v1 submitted 16 February, 2023; originally announced February 2023.

    Comments: The first two authors contributed equally to this work

  16. Data-Driven Stochastic Motion Evaluation and Optimization with Image by Spatially-Aligned Temporal Encoding

    Authors: Takeru Oba, Norimichi Ukita

    Abstract: This paper proposes a probabilistic motion prediction method for long motions. The motion is predicted so that it accomplishes a task from the initial state observed in the given image. While our method evaluates the task achievability by the Energy-Based Model (EBM), previous EBMs are not designed for evaluating the consistency between different domains (i.e., image and motion in our method). Our…

    Submitted 9 February, 2023; originally announced February 2023.

    Comments: Accepted at ICRA 2023. Font is different from the submitted paper. 8 pages, 8 figures

  17. arXiv:2208.12940  [pdf, other]

    cs.CV

    Actor-identified Spatiotemporal Action Detection -- Detecting Who Is Doing What in Videos

    Authors: Fan Yang, Norimichi Ukita, Sakriani Sakti, Satoshi Nakamura

    Abstract: The success of deep learning on video Action Recognition (AR) has motivated researchers to progressively promote related tasks from the coarse level to the fine-grained level. Compared with conventional AR which only predicts an action label for the entire video, Temporal Action Detection (TAD) has been investigated for estimating the start and end time for each action in videos. Taking TAD a step…

    Submitted 7 September, 2022; v1 submitted 27 August, 2022; originally announced August 2022.

  18. arXiv:2203.09116  [pdf, other]

    cs.CV cs.LG

    MotionAug: Augmentation with Physical Correction for Human Motion Prediction

    Authors: Takahiro Maeda, Norimichi Ukita

    Abstract: This paper presents a motion data augmentation scheme incorporating motion synthesis encouraging diversity and motion correction imposing physical plausibility. This motion synthesis consists of our modified Variational AutoEncoder (VAE) and Inverse Kinematics (IK). In this VAE, our proposed sampling-near-samples method generates various valid motions even with insufficient training motion data. O…

    Submitted 17 August, 2023; v1 submitted 17 March, 2022; originally announced March 2022.

    Comments: Accepted at CVPR2022

  19. arXiv:2106.03839  [pdf, other]

    cs.CV

    NTIRE 2021 Challenge on Burst Super-Resolution: Methods and Results

    Authors: Goutam Bhat, Martin Danelljan, Radu Timofte, Kazutoshi Akita, Wooyeong Cho, Haoqiang Fan, Lanpeng Jia, Daeshik Kim, Bruno Lecouat, Youwei Li, Shuaicheng Liu, Ziluan Liu, Ziwei Luo, Takahiro Maeda, Julien Mairal, Christian Micheloni, Xuan Mo, Takeru Oba, Pavel Ostyakov, Jean Ponce, Sanghyeok Son, Jian Sun, Norimichi Ukita, Rao Muhammad Umer, Youliang Yan, et al. (3 additional authors not shown)

    Abstract: This paper reviews the NTIRE2021 challenge on burst super-resolution. Given a RAW noisy burst as input, the task in the challenge was to generate a clean RGB image with 4 times higher resolution. The challenge contained two tracks; Track 1 evaluating on synthetically generated data, and Track 2 using real-world bursts from mobile camera. In the final testing phase, 6 teams submitted results using…

    Submitted 7 June, 2021; originally announced June 2021.

    Comments: NTIRE 2021 Burst Super-Resolution challenge report

  20. arXiv:2009.06290  [pdf, other]

    cs.CV

    AIM 2020 Challenge on Video Extreme Super-Resolution: Methods and Results

    Authors: Dario Fuoli, Zhiwu Huang, Shuhang Gu, Radu Timofte, Arnau Raventos, Aryan Esfandiari, Salah Karout, Xuan Xu, Xin Li, Xin Xiong, Jinge Wang, Pablo Navarrete Michelini, Wenhao Zhang, Dongyang Zhang, Hanwei Zhu, Dan Xia, Haoyu Chen, Jinjin Gu, Zhi Zhang, Tongtong Zhao, Shanshan Zhao, Kazutoshi Akita, Norimichi Ukita, Hrishikesh P S, Densen Puthussery, et al. (1 additional author not shown)

    Abstract: This paper reviews the video extreme super-resolution challenge associated with the AIM 2020 workshop at ECCV 2020. Common scaling factors for learned video super-resolution (VSR) do not go beyond factor 4. Missing information can be restored well in this region, especially in HR videos, where the high-frequency content mostly consists of texture details. The task in this challenge is to upscale v…

    Submitted 14 September, 2020; originally announced September 2020.

  21. arXiv:2009.00382  [pdf, ps, other]

    eess.IV cs.CV

    Image Super-Resolution using Explicit Perceptual Loss

    Authors: Tomoki Yoshida, Kazutoshi Akita, Muhammad Haris, Norimichi Ukita

    Abstract: This paper proposes an explicit way to optimize the super-resolution network for generating visually pleasing images. The previous approaches use several loss functions which is hard to interpret and has the implicit relationships to improve the perceptual score. We show how to exploit the machine learning based model which is directly trained to provide the perceptual score on generated images. I…

    Submitted 1 September, 2020; originally announced September 2020.

    Comments: 9 pages, 5 figures

  22. arXiv:2005.01056  [pdf, other]

    eess.IV cs.CV

    NTIRE 2020 Challenge on Perceptual Extreme Super-Resolution: Methods and Results

    Authors: Kai Zhang, Shuhang Gu, Radu Timofte, Taizhang Shang, Qiuju Dai, Shengchen Zhu, Tong Yang, Yandong Guo, Younghyun Jo, Sejong Yang, Seon Joo Kim, Lin Zha, Jiande Jiang, Xinbo Gao, Wen Lu, Jing Liu, Kwangjin Yoon, Taegyun Jeon, Kazutoshi Akita, Takeru Ooba, Norimichi Ukita, Zhipeng Luo, Yuehan Yao, Zhenyu Xu, Dongliang He, et al. (38 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2020 challenge on perceptual extreme super-resolution with focus on proposed solutions and results. The challenge task was to super-resolve an input image with a magnification factor 16 based on a set of prior examples of low and corresponding high resolution images. The goal is to obtain a network design capable to produce high resolution results with the best percept…

    Submitted 3 May, 2020; originally announced May 2020.

    Comments: CVPRW 2020

  23. arXiv:2003.13170  [pdf, other]

    cs.CV

    Space-Time-Aware Multi-Resolution Video Enhancement

    Authors: Muhammad Haris, Greg Shakhnarovich, Norimichi Ukita

    Abstract: We consider the problem of space-time super-resolution (ST-SR): increasing spatial resolution of video frames and simultaneously interpolating frames to increase the frame rate. Modern approaches handle these axes one at a time. In contrast, our proposed model called STARnet super-resolves jointly in space and time. This allows us to leverage mutually informative relationships between time and spa…

    Submitted 29 March, 2020; originally announced March 2020.

    Comments: To appear in CVPR2020

  24. arXiv:1906.01399  [pdf, other]

    cs.CV

    Semi- and Weakly-supervised Human Pose Estimation

    Authors: Norimichi Ukita, Yusuke Uematsu

    Abstract: For human pose estimation in still images, this paper proposes three semi- and weakly-supervised learning schemes. While recent advances of convolutional neural networks improve human pose estimation using supervised training data, our focus is to explore the semi- and weakly-supervised schemes. Our proposed schemes initially learn conventional model(s) for pose estimation from a small amount of s…

    Submitted 4 June, 2019; originally announced June 2019.

    Comments: Revised preprint submitted to CVIU

  25. arXiv:1904.05677  [pdf, other]

    cs.CV

    Deep Back-Projection Networks for Single Image Super-resolution

    Authors: Muhammad Haris, Greg Shakhnarovich, Norimichi Ukita

    Abstract: Previous feed-forward architectures of recently proposed deep super-resolution networks learn the features of low-resolution inputs and the non-linear mapping from those to a high-resolution output. However, this approach does not fully address the mutual dependencies of low- and high-resolution images. We propose Deep Back-Projection Networks (DBPN), the winner of two image super-resolution chall…

    Submitted 12 June, 2020; v1 submitted 4 April, 2019; originally announced April 2019.

    Comments: To appear in TPAMI 2020. The code is available at https://github.com/alterzero/DBPN-Pytorch (arXiv admin note: substantial text overlap with arXiv:1803.02735)

  26. arXiv:1903.10128  [pdf, other]

    cs.CV

    Recurrent Back-Projection Network for Video Super-Resolution

    Authors: Muhammad Haris, Greg Shakhnarovich, Norimichi Ukita

    Abstract: We proposed a novel architecture for the problem of video super-resolution. We integrate spatial and temporal contexts from continuous video frames using a recurrent encoder-decoder module, that fuses multi-frame information with the more traditional, single frame super-resolution path for the target frame. In contrast to most prior work where frames are pooled together by stacking or warping, our…

    Submitted 25 March, 2019; originally announced March 2019.

    Comments: To appear in CVPR2019

  27. arXiv:1901.09156  [pdf, ps, other]

    cs.CV

    Human Pose Estimation using Motion Priors and Ensemble Models

    Authors: Norimichi Ukita

    Abstract: Human pose estimation in images and videos is one of the key technologies for realizing a variety of human activity recognition tasks (e.g., human-computer interaction, gesture recognition, surveillance, and video summarization). This paper presents two types of human pose estimation methodologies: 1) 3D human pose tracking using motion priors and 2) 2D human pose estimation with ensemble modeling.

    Submitted 25 January, 2019; originally announced January 2019.

    Comments: 6 pages

    Journal ref: Presented at the 2017 International Conference on Advanced Computer Science and Information Systems (ICACSIS)

  28. arXiv:1803.11316  [pdf, other]

    cs.CV

    Task-Driven Super Resolution: Object Detection in Low-resolution Images

    Authors: Muhammad Haris, Greg Shakhnarovich, Norimichi Ukita

    Abstract: We consider how image super resolution (SR) can contribute to an object detection task in low-resolution images. Intuitively, SR gives a positive impact on the object detection task. While several previous works demonstrated that this intuition is correct, SR and detector are optimized independently in these works. This paper proposes a novel framework to train a deep neural network where the SR s…

    Submitted 29 March, 2018; originally announced March 2018.

  29. arXiv:1803.02735  [pdf, other]

    cs.CV

    Deep Back-Projection Networks For Super-Resolution

    Authors: Muhammad Haris, Greg Shakhnarovich, Norimichi Ukita

    Abstract: The feed-forward architectures of recently proposed deep super-resolution networks learn representations of low-resolution inputs, and the non-linear mapping from those to high-resolution output. However, this approach does not fully address the mutual dependencies of low- and high-resolution images. We propose Deep Back-Projection Networks (DBPN), that exploit iterative up- and down-sampling laye…

    Submitted 7 March, 2018; originally announced March 2018.

    Comments: To appear in CVPR2018