Showing 1–33 of 33 results for author: Ouyang, H

Searching in archive cs.
  1. arXiv:2405.10288  [pdf, other]

    cs.CL cs.AI

    Timeline-based Sentence Decomposition with In-Context Learning for Temporal Fact Extraction

    Authors: Jianhao Chen, Haoyuan Ouyang, Junyang Ren, Wentao Ding, Wei Hu, Yuzhong Qu

    Abstract: Fact extraction is pivotal for constructing knowledge graphs. Recently, the increasing demand for temporal facts in downstream tasks has led to the emergence of the task of temporal fact extraction. In this paper, we specifically address the extraction of temporal facts from natural language text. Previous studies fail to handle the challenge of establishing time-to-fact correspondences in comple…

    Submitted 18 June, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

    Comments: Accepted to ACL2024 main conference

  2. arXiv:2404.11614  [pdf, other]

    cs.CV

    Dynamic Typography: Bringing Text to Life via Video Diffusion Prior

    Authors: Zichen Liu, Yihao Meng, Hao Ouyang, Yue Yu, Bolin Zhao, Daniel Cohen-Or, Huamin Qu

    Abstract: Text animation serves as an expressive medium, transforming static communication into dynamic experiences by infusing words with motion to evoke emotions, emphasize meanings, and construct compelling narratives. Crafting animations that are semantically aware poses significant challenges, demanding expertise in graphic design and animation. We present an automated text animation scheme, termed "Dy…

    Submitted 18 April, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

    Comments: Our demo page is available at: https://animate-your-word.github.io/demo/

  3. arXiv:2404.11613  [pdf, other]

    cs.CV

    InFusion: Inpainting 3D Gaussians via Learning Depth Completion from Diffusion Prior

    Authors: Zhiheng Liu, Hao Ouyang, Qiuyu Wang, Ka Leong Cheng, Jie Xiao, Kai Zhu, Nan Xue, Yu Liu, Yujun Shen, Yang Cao

    Abstract: 3D Gaussians have recently emerged as an efficient representation for novel view synthesis. This work studies its editability with a particular focus on the inpainting task, which aims to supplement an incomplete set of 3D Gaussians with additional points for visually harmonious rendering. Compared to 2D inpainting, the crux of inpainting 3D Gaussians is to figure out the rendering-relevant proper…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: Project page: https://johanan528.github.io/Infusion

  4. arXiv:2402.16370  [pdf, other]

    cs.CV

    DEYO: DETR with YOLO for End-to-End Object Detection

    Authors: Haodong Ouyang

    Abstract: The training paradigm of DETRs is heavily contingent upon pre-training their backbone on the ImageNet dataset. However, the limited supervisory signals provided by the image classification task and one-to-one matching strategy result in an inadequately pre-trained neck for DETRs. Additionally, the instability of matching in the early stages of training engenders inconsistencies in the optimization…

    Submitted 26 February, 2024; originally announced February 2024.

    Comments: arXiv admin note: text overlap with arXiv:2309.11851

  5. arXiv:2402.14000  [pdf, other]

    cs.CV

    Real-time 3D-aware Portrait Editing from a Single Image

    Authors: Qingyan Bai, Zifan Shi, Yinghao Xu, Hao Ouyang, Qiuyu Wang, Ceyuan Yang, Xuan Wang, Gordon Wetzstein, Yujun Shen, Qifeng Chen

    Abstract: This work presents 3DPE, a practical method that can efficiently edit a face image following given prompts, like reference images or text descriptions, in a 3D-aware manner. To this end, a lightweight module is distilled from a 3D portrait generator and a text-to-image model, which provide prior knowledge of face geometry and superior editing capability, respectively. Such a design brings two comp…

    Submitted 18 July, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

    Comments: ECCV 2024 camera-ready version. Project page: https://github.com/EzioBy/3dpe

  6. arXiv:2312.11053  [pdf, other]

    cs.AI cs.DB

    Conflict Detection for Temporal Knowledge Graphs: A Fast Constraint Mining Algorithm and New Benchmarks

    Authors: Jianhao Chen, Junyang Ren, Wentao Ding, Haoyuan Ouyang, Wei Hu, Yuzhong Qu

    Abstract: Temporal facts, which are used to describe events that occur during specific time periods, have become a topic of increased interest in the field of knowledge graph (KG) research. In terms of quality management, the introduction of time restrictions brings new challenges to maintaining the temporal consistency of KGs. Previous studies rely on manually enumerated temporal constraints to detect conf…

    Submitted 18 December, 2023; originally announced December 2023.

  7. arXiv:2312.09242  [pdf, other]

    cs.CV cs.GR

    Text2Immersion: Generative Immersive Scene with 3D Gaussians

    Authors: Hao Ouyang, Kathryn Heal, Stephen Lombardi, Tiancheng Sun

    Abstract: We introduce Text2Immersion, an elegant method for producing high-quality 3D immersive scenes from text prompts. Our proposed pipeline initiates by progressively generating a Gaussian cloud using pre-trained 2D diffusion and depth estimation models. This is followed by a refining stage on the Gaussian cloud, interpolating and refining it to enhance the details of the generated scene. Distinct from…

    Submitted 14 December, 2023; originally announced December 2023.

    Comments: Project page: https://ken-ouyang.github.io/text2immersion/index.html

  8. arXiv:2312.06657  [pdf, other]

    cs.CV

    Learning Naturally Aggregated Appearance for Efficient 3D Editing

    Authors: Ka Leong Cheng, Qiuyu Wang, Zifan Shi, Kecheng Zheng, Yinghao Xu, Hao Ouyang, Qifeng Chen, Yujun Shen

    Abstract: Neural radiance fields, which represent a 3D scene as a color field and a density field, have demonstrated great progress in novel view synthesis yet are unfavorable for editing due to the implicitness. In view of such a deficiency, we propose to replace the color field with an explicit 2D appearance aggregation, also called canonical image, with which users can easily customize their 3D editing v…

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: Project Webpage: https://felixcheng97.github.io/AGAP/, Code: https://github.com/felixcheng97/AGAP

  9. arXiv:2312.01739  [pdf, other]

    cs.LG cs.AI

    Divide-and-Conquer Strategy for Large-Scale Dynamic Bayesian Network Structure Learning

    Authors: Hui Ouyang, Cheng Chen, Ke Tang

    Abstract: Dynamic Bayesian Networks (DBNs), renowned for their interpretability, have become increasingly vital in representing complex stochastic processes in various domains such as gene expression analysis, healthcare, and traffic prediction. Structure learning of DBNs from data is challenging, particularly for datasets with thousands of variables. Most current algorithms for DBN structure learning are a…

    Submitted 4 December, 2023; originally announced December 2023.

  10. arXiv:2309.11851  [pdf, other]

    cs.CV

    DEYOv3: DETR with YOLO for Real-time Object Detection

    Authors: Haodong Ouyang

    Abstract: Recently, end-to-end object detectors have gained significant attention from the research community due to their outstanding performance. However, DETR typically relies on supervised pretraining of the backbone on ImageNet, which limits the practical application of DETR and the design of the backbone, affecting the model's potential generalization ability. In this paper, we propose a new training…

    Submitted 22 September, 2023; v1 submitted 21 September, 2023; originally announced September 2023.

    Comments: Work in progress

  11. arXiv:2308.07926  [pdf, other]

    cs.CV

    CoDeF: Content Deformation Fields for Temporally Consistent Video Processing

    Authors: Hao Ouyang, Qiuyu Wang, Yuxi Xiao, Qingyan Bai, Juntao Zhang, Kecheng Zheng, Xiaowei Zhou, Qifeng Chen, Yujun Shen

    Abstract: We present the content deformation field CoDeF as a new type of video representation, which consists of a canonical content field aggregating the static contents in the entire video and a temporal deformation field recording the transformations from the canonical image (i.e., rendered from the canonical content field) to each individual frame along the time axis. Given a target video, these two fie…

    Submitted 15 August, 2023; originally announced August 2023.

    Comments: Project Webpage: https://qiuyu96.github.io/CoDeF/, Code: https://github.com/qiuyu96/CoDeF

  12. arXiv:2307.02751  [pdf, ps, other]

    cs.SD cs.CR eess.AS

    DSARSR: Deep Stacked Auto-encoders Enhanced Robust Speaker Recognition

    Authors: Zhifeng Wang, Chunyan Zeng, Surong Duan, Hongjie Ouyang, Hongmin Xu

    Abstract: Speaker recognition is a biometric modality that utilizes the speaker's speech segments to recognize the identity, determining whether the test speaker belongs to one of the enrolled speakers. In order to improve the robustness of the i-vector framework on cross-channel conditions and explore a novel method for applying deep learning to speaker recognition, the Stacked Auto-encoders are used to g…

    Submitted 5 July, 2023; originally announced July 2023.

    Comments: 12 pages, 3 figures

  13. arXiv:2306.09165  [pdf, other]

    cs.CV

    DEYOv2: Rank Feature with Greedy Matching for End-to-End Object Detection

    Authors: Haodong Ouyang

    Abstract: This paper presents a novel object detector called DEYOv2, an improved version of the first-generation DEYO (DETR with YOLO) model. Similar to its predecessor, DEYOv2 employs a progressive reasoning approach to accelerate model training and enhance performance. The study delves into the limitations of one-to-one matching in optimization and proposes solutions to effectively address the iss…

    Submitted 2 July, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: SOTA detector

  14. arXiv:2303.02542  [pdf]

    cs.CE

    Physics-informed neural network for friction-involved nonsmooth dynamics problems

    Authors: Zilin Li, Jinshuai Bai, Huajing Ouyang, Saulo Martelli, Jun Zhao, Ming Tang, Yang Yang, Hongtao Wei, Pan Liu, Wei-Ron Han, Yuantong Gu

    Abstract: Friction-induced vibration (FIV) is very common in engineering areas. Analysing the dynamic behaviour of systems containing a multiple-contact point frictional interface is an important topic. However, accurately simulating nonsmooth/discontinuous dynamic behaviour due to friction is challenging. This paper presents a new physics-informed neural network approach for solving nonsmooth friction-indu…

    Submitted 10 October, 2023; v1 submitted 4 March, 2023; originally announced March 2023.

    Comments: 37 Pages, 24 figures

  15. arXiv:2211.15662  [pdf, other]

    cs.CV

    High-fidelity 3D GAN Inversion by Pseudo-multi-view Optimization

    Authors: Jiaxin Xie, Hao Ouyang, Jingtan Piao, Chenyang Lei, Qifeng Chen

    Abstract: We present a high-fidelity 3D generative adversarial network (GAN) inversion framework that can synthesize photo-realistic novel views while preserving specific details of the input image. High-fidelity 3D GAN inversion is inherently challenging due to the geometry-texture trade-off in 3D inversion, where overfitting to a single view input image often damages the estimated geometry during the late…

    Submitted 28 November, 2022; v1 submitted 28 November, 2022; originally announced November 2022.

    Comments: Project website: https://ken-ouyang.github.io/HFGI3D/index.html ; Github link: https://github.com/jiaxinxie97/HFGI3D

  16. arXiv:2211.06588  [pdf, other]

    cs.CV

    DEYO: DETR with YOLO for Step-by-Step Object Detection

    Authors: Haodong Ouyang

    Abstract: Object detection is an important topic in computer vision, with post-processing, an essential part of the typical object detection pipeline, posing a significant bottleneck affecting the performance of traditional object detection models. The detection transformer (DETR), as the first end-to-end target detection model, discards the requirement of manual components like the anchor and non-maximum s…

    Submitted 15 June, 2023; v1 submitted 12 November, 2022; originally announced November 2022.

  17. arXiv:2205.12952  [pdf, other]

    cs.CV

    Pretraining is All You Need for Image-to-Image Translation

    Authors: Tengfei Wang, Ting Zhang, Bo Zhang, Hao Ouyang, Dong Chen, Qifeng Chen, Fang Wen

    Abstract: We propose to use pretraining to boost general image-to-image translation. Prior image-to-image translation methods usually need dedicated architectural design and train individual translation models from scratch, struggling for high-quality generation of complex scenes, especially when paired training data are not abundant. In this paper, we regard each image-to-image translation problem as a dow…

    Submitted 25 May, 2022; originally announced May 2022.

    Comments: Project Page: https://tengfei-wang.github.io/PITI/index.html

  18. arXiv:2204.11820  [pdf, other]

    cs.CV

    Real-Time Neural Character Rendering with Pose-Guided Multiplane Images

    Authors: Hao Ouyang, Bo Zhang, Pan Zhang, Hao Yang, Jiaolong Yang, Dong Chen, Qifeng Chen, Fang Wen

    Abstract: We propose pose-guided multiplane image (MPI) synthesis which can render an animatable character in real scenes with photorealistic quality. We use a portable camera rig to capture the multi-view images along with the driving signal for the moving subject. Our method generalizes the image-to-image translation paradigm, which translates the human pose to a 3D scene representation -- MPIs that can b…

    Submitted 25 April, 2022; originally announced April 2022.

    Comments: Project website: https://ken-ouyang.github.io/cmpi/index.html

  19. arXiv:2201.11632  [pdf, other]

    cs.CV cs.AI

    Deep Video Prior for Video Consistency and Propagation

    Authors: Chenyang Lei, Yazhou Xing, Hao Ouyang, Qifeng Chen

    Abstract: Applying an image processing algorithm independently to each video frame often leads to temporal inconsistency in the resulting video. To address this issue, we present a novel and general approach for blind video temporal consistency. Our method is only trained on a pair of original and processed videos directly instead of a large dataset. Unlike most previous methods that enforce temporal consis…

    Submitted 27 January, 2022; originally announced January 2022.

    Comments: Accepted by TPAMI in Dec 2021; extension of NeurIPS2020 Blind Video Temporal Consistency via Deep Video Prior. arXiv admin note: substantial text overlap with arXiv:2010.11838

  20. arXiv:2111.14946  [pdf, other]

    cs.DC

    Verifying Transactional Consistency of MongoDB

    Authors: Hongrong Ouyang, Hengfeng Wei, Yu Huang, Haixiang Li, Anqun Pan

    Abstract: MongoDB is a popular general-purpose, document-oriented, distributed NoSQL database. It supports transactions in three different deployments: single-document transactions utilizing the WiredTiger storage engine in a standalone node, multi-document transactions in a replica set which consists of a primary node and several secondary nodes, and distributed transactions in a sharded cluster which is a…

    Submitted 15 June, 2022; v1 submitted 29 November, 2021; originally announced November 2021.

    Comments: v0.2, updated with proof of correctness. 17 pages (16 pages excluding references), 8 algorithms, 5 tables and 2 figures

  21. arXiv:2108.01912  [pdf, other]

    cs.CV

    Internal Video Inpainting by Implicit Long-range Propagation

    Authors: Hao Ouyang, Tengfei Wang, Qifeng Chen

    Abstract: We propose a novel framework for video inpainting by adopting an internal learning strategy. Unlike previous methods that use optical flow for cross-frame context propagation to inpaint unknown regions, we show that this can be achieved implicitly by fitting a convolutional neural network to known regions. Moreover, to handle challenging sequences with ambiguous backgrounds or long-term occlusion,…

    Submitted 17 August, 2021; v1 submitted 4 August, 2021; originally announced August 2021.

    Comments: ICCV 2021

  22. Priority prediction of Asian Hornet sighting report using machine learning methods

    Authors: Yixin Liu, Jiaxin Guo, Jieyang Dong, Luoqian Jiang, Haoyuan Ouyang

    Abstract: As an infamous invader of the North American ecosystem, the Asian giant hornet (Vespa mandarinia) is devastating not only to native bee colonies, but also to local apiculture. One of the most effective ways to combat the harmful species is to locate and destroy their nests. By mobilizing the public to actively report possible sightings of the Asian giant hornet, the government could timely send inspec…

    Submitted 28 June, 2021; originally announced July 2021.

    Comments: 2021 IEEE International Conference on Software Engineering and Artificial Intelligence

  23. arXiv:2106.04963  [pdf, other]

    cs.CL

    Psycholinguistic Tripartite Graph Network for Personality Detection

    Authors: Tao Yang, Feifan Yang, Haolan Ouyang, Xiaojun Quan

    Abstract: Most of the recent work on personality detection from online posts adopts multifarious deep neural networks to represent the posts and builds predictive models in a data-driven manner, without the exploitation of psycholinguistic knowledge that may unveil the connections between one's language usage and his psychological traits. In this paper, we propose a psycholinguistic knowledge-based triparti…

    Submitted 9 June, 2021; originally announced June 2021.

    Comments: Accepted by ACL 2021

  24. arXiv:2105.04761  [pdf, other]

    cs.IR cs.LG

    Federated Unbiased Learning to Rank

    Authors: Chang Li, Hua Ouyang

    Abstract: Unbiased Learning to Rank (ULTR) studies the problem of learning a ranking function based on biased user interactions. In this framework, ULTR algorithms have to rely on a large amount of user data that are collected, stored, and aggregated by central servers. In this paper, we consider an on-device search setting, where users search against their personal corpora on their local devices, and the…

    Submitted 10 May, 2021; originally announced May 2021.

  25. arXiv:2104.09068  [pdf, other]

    cs.CV

    Image Inpainting with External-internal Learning and Monochromic Bottleneck

    Authors: Tengfei Wang, Hao Ouyang, Qifeng Chen

    Abstract: Although recent inpainting approaches have demonstrated significant improvements with deep neural networks, they still suffer from artifacts such as blunt structures and abrupt colors when filling in the missing regions. To address these issues, we propose an external-internal inpainting scheme with a monochromic bottleneck that helps image inpainting models remove these artifacts. In the external…

    Submitted 19 April, 2021; originally announced April 2021.

    Comments: CVPR 2021

  26. arXiv:2104.05237  [pdf, other]

    cs.CV eess.IV

    Neural Camera Simulators

    Authors: Hao Ouyang, Zifan Shi, Chenyang Lei, Ka Lung Law, Qifeng Chen

    Abstract: We present a controllable camera simulator based on deep neural networks to synthesize raw image data under different camera settings, including exposure time, ISO, and aperture. The proposed simulator includes an exposure module that utilizes the principle of modern lens designs for correcting the luminance level. It also contains a noise module using the noise level function and an aperture modu…

    Submitted 9 August, 2021; v1 submitted 12 April, 2021; originally announced April 2021.

    Comments: Accepted to CVPR2021

  27. arXiv:1901.01760  [pdf, ps, other]

    cs.CV

    Human Pose Estimation with Spatial Contextual Information

    Authors: Hong Zhang, Hao Ouyang, Shu Liu, Xiaojuan Qi, Xiaoyong Shen, Ruigang Yang, Jiaya Jia

    Abstract: We explore the importance of spatial contextual information in human pose estimation. Most state-of-the-art pose networks are trained in a multi-stage manner and produce several auxiliary predictions for deep supervision. With this principle, we present two conceptually simple and yet computationally efficient modules, namely Cascade Prediction Fusion (CPF) and Pose Graph Neural Network (PGNN), to e…

    Submitted 7 January, 2019; originally announced January 2019.

  28. arXiv:1807.00669  [pdf, other]

    cs.CR cs.LO cs.SE

    Verifying Security Protocols using Dynamic Strategies

    Authors: Yan Xiong, Cheng Su, Wenchao Huang, Fuyou Miao, Wansen Wang, Hengyi Ouyang

    Abstract: Current formal approaches have been successfully used to find design flaws in many security protocols. However, it is still challenging to automatically analyze protocols due to their large or infinite state spaces. In this paper, we propose a novel framework that can automatically verify security protocols without any human intervention. Experimental results show that SmartVerif automatically…

    Submitted 25 August, 2019; v1 submitted 26 June, 2018; originally announced July 2018.

    Comments: arXiv admin note: text overlap with arXiv:1403.1142, arXiv:1703.00426 by other authors

  29. arXiv:1606.00399  [pdf, other]

    cs.LG math.CO stat.ML

    Scaling Submodular Maximization via Pruned Submodularity Graphs

    Authors: Tianyi Zhou, Hua Ouyang, Yi Chang, Jeff Bilmes, Carlos Guestrin

    Abstract: We propose a new random pruning method (called "submodular sparsification (SS)") to reduce the cost of submodular maximization. The pruning is applied via a "submodularity graph" over the $n$ ground elements, where each directed edge is associated with a pairwise dependency defined by the submodular function. In each step, SS prunes a $1-1/\sqrt{c}$ (for $c>1$) fraction of the nodes using weights…

    Submitted 1 June, 2016; originally announced June 2016.

  30. arXiv:1411.2331  [pdf, ps, other]

    stat.ML cs.LG

    N$^3$LARS: Minimum Redundancy Maximum Relevance Feature Selection for Large and High-dimensional Data

    Authors: Makoto Yamada, Avishek Saha, Hua Ouyang, Dawei Yin, Yi Chang

    Abstract: We propose a feature selection method that finds non-redundant features from large and high-dimensional data in a nonlinear way. Specifically, we propose a nonlinear extension of the non-negative least-angle regression (LARS) called N${}^3$LARS, where the similarity between input and output is measured through the normalized version of the Hilbert-Schmidt Independence Criterion (HSIC). An advantag…

    Submitted 10 November, 2014; originally announced November 2014.

    Comments: arXiv admin note: text overlap with arXiv:1202.0515

  31. arXiv:1211.0632  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Stochastic ADMM for Nonsmooth Optimization

    Authors: Hua Ouyang, Niao He, Alexander Gray

    Abstract: We present a stochastic setting for optimization problems with nonsmooth convex separable objective functions over linear equality constraints. To solve such problems, we propose a stochastic Alternating Direction Method of Multipliers (ADMM) algorithm. Our algorithm applies to a more general class of nonsmooth convex functions that does not necessarily have a closed-form solution by minimizing th…

    Submitted 22 January, 2013; v1 submitted 3 November, 2012; originally announced November 2012.

    Comments: A short version of this paper appears in the 5th NIPS Workshop on Optimization for Machine Learning, Lake Tahoe, Nevada, USA, 2012

  32. arXiv:1205.4481  [pdf, ps, other]

    cs.LG stat.CO stat.ML

    Stochastic Smoothing for Nonsmooth Minimizations: Accelerating SGD by Exploiting Structure

    Authors: Hua Ouyang, Alexander Gray

    Abstract: In this work we consider the stochastic minimization of nonsmooth convex loss functions, a central problem in machine learning. We propose a novel algorithm called Accelerated Nonsmooth Stochastic Gradient Descent (ANSGD), which exploits the structure of common nonsmooth loss functions to achieve optimal convergence rates for a class of problems including SVMs. It is the first stochastic algorithm…

    Submitted 1 October, 2012; v1 submitted 20 May, 2012; originally announced May 2012.

    Comments: Full length version of ICML'12 with all proofs. In this version, a bug in proving Theorem 6 is fixed. We'd like to thank Dr. Francesco Orabona for pointing it out

  33. arXiv:1105.2274  [pdf, ps, other]

    cs.LG cs.DC

    Data-Distributed Weighted Majority and Online Mirror Descent

    Authors: Hua Ouyang, Alexander Gray

    Abstract: In this paper, we focus on the question of the extent to which online learning can benefit from distributed computing. We focus on the setting in which $N$ agents online-learn cooperatively, where each agent only has access to its own data. We propose a generic data-distributed online learning meta-algorithm. We then introduce the Distributed Weighted Majority and Distributed Online Mirror Descent…

    Submitted 11 May, 2011; originally announced May 2011.