Showing 1–30 of 30 results for author: Kemelmacher-Shlizerman, I

  1. arXiv:2408.15239  [pdf, other]

    cs.CV

    Generative Inbetweening: Adapting Image-to-Video Models for Keyframe Interpolation

    Authors: Xiaojuan Wang, Boyang Zhou, Brian Curless, Ira Kemelmacher-Shlizerman, Aleksander Holynski, Steven M. Seitz

    Abstract: We present a method for generating video sequences with coherent motion between a pair of input key frames. We adapt a pretrained large-scale image-to-video diffusion model (originally trained to generate videos moving forward in time from a single input image) for key frame interpolation, i.e., to produce a video in between two input frames. We accomplish this adaptation through a lightweight fin…

    Submitted 27 August, 2024; originally announced August 2024.

    Comments: project page: https://svd-keyframe-interpolation.github.io/

  2. arXiv:2406.04542  [pdf, other]

    cs.CV cs.GR

    M&M VTO: Multi-Garment Virtual Try-On and Editing

    Authors: Luyang Zhu, Yingwei Li, Nan Liu, Hao Peng, Dawei Yang, Ira Kemelmacher-Shlizerman

    Abstract: We present M&M VTO, a mix and match virtual try-on method that takes as input multiple garment images, text description for garment layout and an image of a person. An example input includes: an image of a shirt, an image of a pair of pants, "rolled sleeves, shirt tucked in", and an image of a person. The output is a visualization of how those garments (in the desired layout) would look like on th…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: CVPR 2024 Highlight. Project website: https://mmvto.github.io/

  3. arXiv:2404.17104  [pdf, other]

    cs.HC cs.CV

    Don't Look at the Camera: Achieving Perceived Eye Contact

    Authors: Alice Gao, Samyukta Jayakumar, Marcello Maniglia, Brian Curless, Ira Kemelmacher-Shlizerman, Aaron R. Seitz, Steven M. Seitz

    Abstract: We consider the question of how to best achieve the perception of eye contact when a person is captured by camera and then rendered on a 2D display. For single subjects photographed by a camera, conventional wisdom tells us that looking directly into the camera achieves eye contact. Through empirical user studies, we show that it is instead preferable to look just below the camera lens. We q…

    Submitted 25 April, 2024; originally announced April 2024.

  4. arXiv:2311.03560  [pdf, other]

    eess.AS

    HRTF Estimation in the Wild

    Authors: Vivek Jayaram, Ira Kemelmacher-Shlizerman, Steven M. Seitz

    Abstract: Head Related Transfer Functions (HRTFs) play a crucial role in creating immersive spatial audio experiences. However, HRTFs differ significantly from person to person, and traditional methods for estimating personalized HRTFs are expensive, time-consuming, and require specialized equipment. We imagine a world where your personalized HRTF can be determined by capturing data through earbuds in every…

    Submitted 6 November, 2023; originally announced November 2023.

    Comments: 9 Pages. Presented at UIST '23

  5. Animating Street View

    Authors: Mengyi Shan, Brian Curless, Ira Kemelmacher-Shlizerman, Steve Seitz

    Abstract: We present a system that automatically brings street view imagery to life by populating it with naturally behaving, animated pedestrians and vehicles. Our approach is to remove existing people and vehicles from the input image, insert moving objects with proper scale, angle, motion, and appearance, plan paths and traffic behavior, as well as render the scene with plausible occlusion and shadowing…

    Submitted 12 October, 2023; originally announced October 2023.

    Comments: SIGGRAPH Asia 2023 Conference Track

  6. arXiv:2308.14740  [pdf, other]

    cs.CV cs.GR cs.LG

    Total Selfie: Generating Full-Body Selfies

    Authors: Bowei Chen, Brian Curless, Ira Kemelmacher-Shlizerman, Steven M. Seitz

    Abstract: We present a method to generate full-body selfies from photographs originally taken at arm's length. Because self-captured photos are typically taken close up, they have limited field of view and exaggerated perspective that distorts facial shapes. We instead seek to generate the photo someone else would take of you from a few feet away. Our approach takes as input four selfies of your face and bo…

    Submitted 3 April, 2024; v1 submitted 28 August, 2023; originally announced August 2023.

    Comments: Project page: https://homes.cs.washington.edu/~boweiche/project_page/totalselfie/

  7. arXiv:2306.08276  [pdf, other]

    cs.CV cs.GR

    TryOnDiffusion: A Tale of Two UNets

    Authors: Luyang Zhu, Dawei Yang, Tyler Zhu, Fitsum Reda, William Chan, Chitwan Saharia, Mohammad Norouzi, Ira Kemelmacher-Shlizerman

    Abstract: Given two images depicting a person and a garment worn by another person, our goal is to generate a visualization of how the garment might look on the input person. A key challenge is to synthesize a photorealistic detail-preserving visualization of the garment, while warping the garment to accommodate a significant body pose and shape change across the subjects. Previous methods either focus on g…

    Submitted 14 June, 2023; originally announced June 2023.

    Comments: CVPR 2023. Project page: https://tryondiffusion.github.io/

  8. arXiv:2304.06025  [pdf, other]

    cs.CV

    DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion

    Authors: Johanna Karras, Aleksander Holynski, Ting-Chun Wang, Ira Kemelmacher-Shlizerman

    Abstract: We present DreamPose, a diffusion-based method for generating animated fashion videos from still images. Given an image and a sequence of human body poses, our method synthesizes a video containing both human and fabric motion. To achieve this, we transform a pretrained text-to-image model (Stable Diffusion) into a pose-and-image guided video synthesis model, using a novel fine-tuning strategy, a…

    Submitted 30 October, 2023; v1 submitted 12 April, 2023; originally announced April 2023.

    Comments: Project page: https://grail.cs.washington.edu/projects/dreampose/

  9. arXiv:2302.08504  [pdf, other]

    cs.CV cs.GR

    PersonNeRF: Personalized Reconstruction from Photo Collections

    Authors: Chung-Yi Weng, Pratul P. Srinivasan, Brian Curless, Ira Kemelmacher-Shlizerman

    Abstract: We present PersonNeRF, a method that takes a collection of photos of a subject (e.g. Roger Federer) captured across multiple years with arbitrary body poses and appearances, and enables rendering the subject with arbitrary novel combinations of viewpoint, body pose, and appearance. PersonNeRF builds a customized neural volumetric 3D model of the subject that is able to render an entire space spann…

    Submitted 16 February, 2023; originally announced February 2023.

    Comments: Project Page: https://grail.cs.washington.edu/projects/personnerf/

  10. ClearBuds: Wireless Binaural Earbuds for Learning-Based Speech Enhancement

    Authors: Ishan Chatterjee, Maruchi Kim, Vivek Jayaram, Shyamnath Gollakota, Ira Kemelmacher-Shlizerman, Shwetak Patel, Steven M. Seitz

    Abstract: We present ClearBuds, the first hardware and software system that utilizes a neural network to enhance speech streamed from two wireless earbuds. Real-time speech enhancement for wireless earbuds requires high-quality sound separation and background cancellation, operating in real-time and on a mobile phone. ClearBuds bridges state-of-the-art deep learning for blind audio source separation and in…

    Submitted 27 June, 2022; originally announced June 2022.

    Comments: 12 pages. Published in MobiSys 2022

  11. arXiv:2201.04127  [pdf, other]

    cs.CV cs.GR

    HumanNeRF: Free-viewpoint Rendering of Moving People from Monocular Video

    Authors: Chung-Yi Weng, Brian Curless, Pratul P. Srinivasan, Jonathan T. Barron, Ira Kemelmacher-Shlizerman

    Abstract: We introduce a free-viewpoint rendering method -- HumanNeRF -- that works on a given monocular video of a human performing complex body motions, e.g. a video from YouTube. Our method enables pausing the video at any frame and rendering the subject from arbitrary new camera viewpoints or even a full 360-degree camera path for that particular frame and body pose. This task is particularly challengin…

    Submitted 14 June, 2022; v1 submitted 11 January, 2022; originally announced January 2022.

    Comments: CVPR 2022 (oral). Project page with videos: https://grail.cs.washington.edu/projects/humannerf/

  12. arXiv:2112.11427  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG

    StyleSDF: High-Resolution 3D-Consistent Image and Geometry Generation

    Authors: Roy Or-El, Xuan Luo, Mengyi Shan, Eli Shechtman, Jeong Joon Park, Ira Kemelmacher-Shlizerman

    Abstract: We introduce a high resolution, 3D-consistent image and shape generation technique which we call StyleSDF. Our method is trained on single-view RGB data only, and stands on the shoulders of StyleGAN2 for image generation, while solving two main challenges in 3D-aware GANs: 1) high-resolution, view-consistent generation of the RGB images, and 2) detailed 3D shape. We achieve this by merging a SDF-b…

    Submitted 29 March, 2022; v1 submitted 21 December, 2021; originally announced December 2021.

    Comments: Camera-Ready version. Paper was accepted as oral to CVPR 2022. Added discussions and figures from the rebuttal to the supplementary material (sections C & F). Project Webpage: https://stylesdf.github.io/

  13. arXiv:2105.08051  [pdf, other]

    cs.CV cs.GR

    A Light Stage on Every Desk

    Authors: Soumyadip Sengupta, Brian Curless, Ira Kemelmacher-Shlizerman, Steve Seitz

    Abstract: Every time you sit in front of a TV or monitor, your face is actively illuminated by time-varying patterns of light. This paper proposes to use this time-varying illumination for synthetic relighting of your face with any new illumination condition. In doing so, we take inspiration from the light stage work of Debevec et al., who first demonstrated the ability to relight people captured in a contr…

    Submitted 11 November, 2021; v1 submitted 17 May, 2021; originally announced May 2021.

    Comments: Updated citations from v1

  14. arXiv:2101.02285  [pdf, other]

    cs.CV cs.GR

    TryOnGAN: Body-Aware Try-On via Layered Interpolation

    Authors: Kathleen M Lewis, Srivatsan Varadharajan, Ira Kemelmacher-Shlizerman

    Abstract: Given a pair of images -- target person and garment on another person -- we automatically generate the target person in the given garment. Previous methods mostly focused on texture transfer via paired data training, while overlooking body shape deformations, skin color, and seamless blending of garment with the person. This work focuses on those three components, while also not requiring paired data tr…

    Submitted 2 June, 2021; v1 submitted 6 January, 2021; originally announced January 2021.

  15. arXiv:2012.12884  [pdf, other]

    cs.CV cs.GR

    Vid2Actor: Free-viewpoint Animatable Person Synthesis from Video in the Wild

    Authors: Chung-Yi Weng, Brian Curless, Ira Kemelmacher-Shlizerman

    Abstract: Given an "in-the-wild" video of a person, we reconstruct an animatable model of the person in the video. The output model can be rendered in any body pose to any camera view, via the learned controls, without explicit 3D mesh reconstruction. At the core of our method is a volumetric 3D human representation reconstructed with a deep network trained on input video, enabling novel pose/view synthesis…

    Submitted 23 December, 2020; originally announced December 2020.

    Comments: Project Page: https://grail.cs.washington.edu/projects/vid2actor/ Supplementary Video: https://youtu.be/Zec8Us0v23o

  16. arXiv:2012.07810  [pdf, other]

    cs.CV

    Real-Time High-Resolution Background Matting

    Authors: Shanchuan Lin, Andrey Ryabtsev, Soumyadip Sengupta, Brian Curless, Steve Seitz, Ira Kemelmacher-Shlizerman

    Abstract: We introduce a real-time, high-resolution background replacement technique which operates at 30fps in 4K resolution, and 60fps for HD on a modern GPU. Our technique is based on background matting, where an additional frame of the background is captured and used in recovering the alpha matte and the foreground layer. The main challenge is to compute a high-quality alpha matte, preserving strand-lev…

    Submitted 14 December, 2020; originally announced December 2020.

  17. arXiv:2010.06007  [pdf, other]

    cs.SD cs.AI

    The Cone of Silence: Speech Separation by Localization

    Authors: Teerapat Jenrungrot, Vivek Jayaram, Steve Seitz, Ira Kemelmacher-Shlizerman

    Abstract: Given a multi-microphone recording of an unknown number of speakers talking concurrently, we simultaneously localize the sources and separate the individual speakers. At the core of our method is a deep network, in the waveform domain, which isolates sources within an angular region $\theta \pm w/2$, given an angle of interest $\theta$ and angular window size $w$. By exponentially decreasing $w$, we can perf…

    Submitted 12 October, 2020; originally announced October 2020.

    Comments: 9 pages + references + supplementary. Oral presentation at NeurIPS 2020

  18. arXiv:2007.13303  [pdf, other]

    cs.CV

    Reconstructing NBA Players

    Authors: Luyang Zhu, Konstantinos Rematas, Brian Curless, Steve Seitz, Ira Kemelmacher-Shlizerman

    Abstract: Great progress has been made in 3D body pose and shape estimation from a single photo. Yet, state-of-the-art results still suffer from errors due to challenging body poses, modeling clothing, and self occlusions. The domain of basketball games is particularly challenging, as it exhibits all of these challenges. In this paper, we introduce a new approach for reconstruction of basketball players tha…

    Submitted 27 July, 2020; originally announced July 2020.

    Comments: ECCV 2020

  19. arXiv:2004.00626  [pdf, other]

    cs.CV

    Background Matting: The World is Your Green Screen

    Authors: Soumyadip Sengupta, Vivek Jayaram, Brian Curless, Steve Seitz, Ira Kemelmacher-Shlizerman

    Abstract: We propose a method for creating a matte -- the per-pixel foreground color and alpha -- of a person by taking photos or videos in an everyday setting with a handheld camera. Most existing matting methods require a green screen background or a manually created trimap to produce a good matte. Automatic, trimap-free methods are appearing, but are not of comparable quality. In our trimap free approach…

    Submitted 9 April, 2020; v1 submitted 1 April, 2020; originally announced April 2020.

    Comments: Accepted to CVPR 2020

  20. arXiv:2003.09764  [pdf, other]

    cs.CV

    Lifespan Age Transformation Synthesis

    Authors: Roy Or-El, Soumyadip Sengupta, Ohad Fried, Eli Shechtman, Ira Kemelmacher-Shlizerman

    Abstract: We address the problem of single photo age progression and regression -- the prediction of how a person might look in the future, or how they looked in the past. Most existing aging methods are limited to changing the texture, overlooking transformations in head shape that occur during the human aging and growth process. This limits the applicability of previous methods to aging of adults to slightly…

    Submitted 24 July, 2020; v1 submitted 21 March, 2020; originally announced March 2020.

    Comments: ECCV 2020 Camera-Ready version. Main Changes: 1. Added Ethics & Bias statement in the supplementary material 2. Comparison figures to PyGAN [46] and S2GAN [13] were removed due to copyright issues. These figures can be found in the project's webpage (link is provided in the paper). 3. Added links to the code and dataset (Github)

  21. arXiv:1812.02246  [pdf, other]

    cs.CV cs.GR

    Photo Wake-Up: 3D Character Animation from a Single Photo

    Authors: Chung-Yi Weng, Brian Curless, Ira Kemelmacher-Shlizerman

    Abstract: We present a method and application for animating a human subject from a single photo. E.g., the character can walk out, run, sit, or jump in 3D. The key contributions of this paper are: 1) an application of viewing and animating humans in single photos in 3D, 2) a novel 2D warping method to deform a posable template body model to fit the person's complex silhouette to create an animatable mesh, a…

    Submitted 5 December, 2018; originally announced December 2018.

    Comments: The project page is at https://grail.cs.washington.edu/projects/wakeup/, and the supplementary video is at https://youtu.be/G63goXc5MyU

  22. arXiv:1809.04765  [pdf, other]

    cs.CV cs.GR

    Video to Fully Automatic 3D Hair Model

    Authors: Shu Liang, Xiufeng Huang, Xianyu Meng, Kunyao Chen, Linda G. Shapiro, Ira Kemelmacher-Shlizerman

    Abstract: Imagine taking a selfie video with your mobile phone and getting as output a 3D model of your head (face and 3D hair strands) that can be later used in VR, AR, and any other domain. State of the art hair reconstruction methods allow either a single photo (thus compromising 3D quality) or multiple views, but they require manual user interaction (manual hair segmentation and capture of fixed camera…

    Submitted 13 September, 2018; originally announced September 2018.

    Comments: supplementary video: https://www.youtube.com/watch?v=so_CMv7Xd40

  23. arXiv:1809.04764  [pdf, other]

    cs.CV

    3D Face Hallucination from a Single Depth Frame

    Authors: Shu Liang, Ira Kemelmacher-Shlizerman, Linda G. Shapiro

    Abstract: We present an algorithm that takes a single frame of a person's face from a depth camera, e.g., Kinect, and produces a high-resolution 3D mesh of the input face. We leverage a dataset of 3D face meshes of 1204 distinct individuals ranging from age 3 to 40, captured in a neutral expression. We divide the input depth frame into semantically significant regions (eyes, nose, mouth, cheeks) and search…

    Submitted 13 September, 2018; originally announced September 2018.

    Comments: Published at 3DV 2014

  24. arXiv:1809.04763  [pdf, other]

    cs.CV

    Head Reconstruction from Internet Photos

    Authors: Shu Liang, Linda G. Shapiro, Ira Kemelmacher-Shlizerman

    Abstract: 3D face reconstruction from Internet photos has recently produced exciting results. A person's face, e.g., Tom Hanks, can be modeled and animated in 3D from a completely uncalibrated photo collection. Most methods, however, focus solely on face area and mask out the rest of the head. This paper proposes that head modeling from the Internet is a problem we can solve. We target reconstruction of the…

    Submitted 13 September, 2018; originally announced September 2018.

    Comments: Published at ECCV 2016

  25. arXiv:1806.00890  [pdf, other]

    cs.CV

    Soccer on Your Tabletop

    Authors: Konstantinos Rematas, Ira Kemelmacher-Shlizerman, Brian Curless, Steve Seitz

    Abstract: We present a system that transforms a monocular video of a soccer game into a moving 3D reconstruction, in which the players and field can be rendered interactively with a 3D viewer or through an Augmented Reality device. At the heart of our paper is an approach to estimate the depth map of each player, using a CNN that is trained on 3D player data extracted from soccer video games. We compare wit…

    Submitted 3 June, 2018; originally announced June 2018.

    Comments: CVPR'18. Project: http://grail.cs.washington.edu/projects/soccer/

  26. arXiv:1712.09382  [pdf, other]

    eess.AS cs.CV cs.SD

    Audio to Body Dynamics

    Authors: Eli Shlizerman, Lucio M. Dery, Hayden Schoen, Ira Kemelmacher-Shlizerman

    Abstract: We present a method that gets as input an audio of violin or piano playing, and outputs a video of skeleton predictions which are further used to animate an avatar. The key idea is to create an animation of an avatar that moves their hands similarly to how a pianist or violinist would do, just from audio. Aiming for a fully detailed correct arms and fingers motion is a goal, however, it's not clea…

    Submitted 19 December, 2017; originally announced December 2017.

    Comments: Link with videos https://arviolin.github.io/AudioBodyDynamics/

    Journal ref: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018

  27. arXiv:1705.00393  [pdf, other]

    cs.CV

    Level Playing Field for Million Scale Face Recognition

    Authors: Aaron Nech, Ira Kemelmacher-Shlizerman

    Abstract: Face recognition has the perception of a solved problem, however when tested at the million-scale exhibits dramatic variation in accuracies across the different algorithms. Are the algorithms very different? Is access to good/big training data their secret weapon? Where should face recognition improve? To address those questions, we created a benchmark, MF2, that requires all algorithms to be trai…

    Submitted 30 April, 2017; originally announced May 2017.

  28. arXiv:1512.00596  [pdf, other]

    cs.CV

    The MegaFace Benchmark: 1 Million Faces for Recognition at Scale

    Authors: Ira Kemelmacher-Shlizerman, Steve Seitz, Daniel Miller, Evan Brossard

    Abstract: Recent face recognition experiments on a major benchmark LFW show stunning performance--a number of algorithms achieve near to perfect score, surpassing human recognition rates. In this paper, we advocate evaluations at the million scale (LFW includes only 13K photos of 5K people). To this end, we have assembled the MegaFace dataset and created the first MegaFace challenge. Our dataset includes On…

    Submitted 2 December, 2015; originally announced December 2015.

  29. arXiv:1506.00752  [pdf, other]

    cs.CV

    What Makes Kevin Spacey Look Like Kevin Spacey

    Authors: Supasorn Suwajanakorn, Ira Kemelmacher-Shlizerman, Steve Seitz

    Abstract: We reconstruct a controllable model of a person from a large photo collection that captures his or her persona, i.e., physical appearance and behavior. The ability to operate on unstructured photo collections enables modeling a huge number of people, including celebrities and other well photographed people without requiring them to be scanned. Moreover, we show the ability to drive or p…

    Submitted 2 June, 2015; originally announced June 2015.

  30. arXiv:1505.02108  [pdf, other]

    cs.CV

    MegaFace: A Million Faces for Recognition at Scale

    Authors: D. Miller, E. Brossard, S. Seitz, I. Kemelmacher-Shlizerman

    Abstract: Recent face recognition experiments on the LFW benchmark show that face recognition is performing stunningly well, surpassing human recognition rates. In this paper, we study face recognition at scale. Specifically, we have collected from Flickr a Million faces and evaluated state of the art face recognition algorithms on this dataset. We found that the performance of algorithms varies--w…

    Submitted 7 September, 2015; v1 submitted 8 May, 2015; originally announced May 2015.

    Comments: Please see http://megaface.cs.washington.edu/ for code and data