Showing 1–6 of 6 results for author: Tuo, Z

  1. arXiv:2407.12899  [pdf, other]

    cs.CV cs.AI cs.MM

    DreamStory: Open-Domain Story Visualization by LLM-Guided Multi-Subject Consistent Diffusion

    Authors: Huiguo He, Huan Yang, Zixi Tuo, Yuan Zhou, Qiuyue Wang, Yuhang Zhang, Zeyu Liu, Wenhao Huang, Hongyang Chao, Jian Yin

    Abstract: Story visualization aims to create visually compelling images or videos corresponding to textual narratives. Despite recent advances in diffusion models yielding promising results, existing methods still struggle to create a coherent sequence of subject-consistent frames based solely on a story. To this end, we propose DreamStory, an automatic open-domain story visualization framework by leveragin…

    Submitted 17 July, 2024; originally announced July 2024.

  2. arXiv:2310.13518  [pdf, other]

    cs.SE

    Vision-Based Mobile App GUI Testing: A Survey

    Authors: Shengcheng Yu, Chunrong Fang, Ziyuan Tuo, Quanjun Zhang, Chunyang Chen, Zhenyu Chen, Zhendong Su

    Abstract: Graphical User Interface (GUI) has become one of the most significant parts of mobile applications (apps). It is a direct bridge between mobile apps and end users, which directly affects the end user's experience. Neglecting GUI quality can undermine the value and effectiveness of the entire mobile app solution. Significant research efforts have been devoted to GUI testing, one effective method to…

    Submitted 20 October, 2023; originally announced October 2023.

  3. arXiv:2307.16371  [pdf, other]

    cs.CV

    MobileVidFactory: Automatic Diffusion-Based Social Media Video Generation for Mobile Devices from Text

    Authors: Junchen Zhu, Huan Yang, Wenjing Wang, Huiguo He, Zixi Tuo, Yongsheng Yu, Wen-Huang Cheng, Lianli Gao, Jingkuan Song, Jianlong Fu, Jiebo Luo

    Abstract: Videos for mobile devices become the most popular access to share and acquire information recently. For the convenience of users' creation, in this paper, we present a system, namely MobileVidFactory, to automatically generate vertical mobile videos where users only need to give simple texts mainly. Our system consists of two parts: basic and customized generation. In the basic generation, we take…

    Submitted 30 July, 2023; originally announced July 2023.

    Comments: Accepted by ACM-MM 2023 demo

  4. arXiv:2306.07257  [pdf, other]

    cs.CV

    MovieFactory: Automatic Movie Creation from Text using Large Generative Models for Language and Images

    Authors: Junchen Zhu, Huan Yang, Huiguo He, Wenjing Wang, Zixi Tuo, Wen-Huang Cheng, Lianli Gao, Jingkuan Song, Jianlong Fu

    Abstract: In this paper, we present MovieFactory, a powerful framework to generate cinematic-picture (3072$\times$1280), film-style (multi-scene), and multi-modality (sounding) movies on the demand of natural languages. As the first fully automated movie generation model to the best of our knowledge, our approach empowers users to create captivating movies with smooth transitions using simple text inputs, s…

    Submitted 12 June, 2023; originally announced June 2023.

  5. arXiv:2305.10874  [pdf, other]

    cs.CV

    Swap Attention in Spatiotemporal Diffusions for Text-to-Video Generation

    Authors: Wenjing Wang, Huan Yang, Zixi Tuo, Huiguo He, Junchen Zhu, Jianlong Fu, Jiaying Liu

    Abstract: With the explosive popularity of AI-generated content (AIGC), video generation has recently received a lot of attention. Generating videos guided by text instructions poses significant challenges, such as modeling the complex relationship between space and time, and the lack of large-scale text-video paired data. Existing text-video datasets suffer from limitations in both content quality and scal…

    Submitted 24 April, 2024; v1 submitted 18 May, 2023; originally announced May 2023.

  6. arXiv:2303.09826  [pdf, other]

    cs.CV

    Learning Data-Driven Vector-Quantized Degradation Model for Animation Video Super-Resolution

    Authors: Zixi Tuo, Huan Yang, Jianlong Fu, Yujie Dun, Xueming Qian

    Abstract: Existing real-world video super-resolution (VSR) methods focus on designing a general degradation pipeline for open-domain videos while ignoring data intrinsic characteristics which strongly limit their performance when applying to some specific domains (e.g., animation videos). In this paper, we thoroughly explore the characteristics of animation videos and leverage the rich priors in real-world a…

    Submitted 19 September, 2023; v1 submitted 17 March, 2023; originally announced March 2023.