-
A polynomial-time classical algorithm for noisy quantum circuits
Authors:
Thomas Schuster,
Chao Yin,
Xun Gao,
Norman Y. Yao
Abstract:
We provide a polynomial-time classical algorithm for noisy quantum circuits. The algorithm computes the expectation value of any observable for any circuit, with a small average error over input states drawn from an ensemble (e.g. the computational basis). Our approach is based upon the intuition that noise exponentially damps non-local correlations relative to local correlations. This enables one to classically simulate a noisy quantum circuit by only keeping track of the dynamics of local quantum information. Our algorithm also enables sampling from the output distribution of a circuit in quasi-polynomial time, so long as the distribution anti-concentrates. A number of practical implications are discussed, including a fundamental limit on the efficacy of noise mitigation strategies: any quantum circuit for which error mitigation is efficient must be classically simulable.
Submitted 17 July, 2024;
originally announced July 2024.
-
Approximately-symmetric neural networks for quantum spin liquids
Authors:
Dominik S. Kufel,
Jack Kemp,
Simon M. Linsel,
Chris R. Laumann,
Norman Y. Yao
Abstract:
We propose and analyze a family of approximately-symmetric neural networks for quantum spin liquid problems. These tailored architectures are parameter-efficient, scalable, and significantly outperform existing symmetry-unaware neural network architectures. Utilizing the mixed-field toric code model, we demonstrate that our approach is competitive with the state-of-the-art tensor network and quantum Monte Carlo methods. Moreover, at the largest system sizes (N=480), our method allows us to explore Hamiltonians with sign problems beyond the reach of both quantum Monte Carlo and finite-size matrix-product states. The network comprises an exactly symmetric block following a non-symmetric block, which we argue learns a transformation of the ground state analogous to quasiadiabatic continuation. Our work paves the way toward investigating quantum spin liquid problems within interpretable neural network architectures.
Submitted 27 May, 2024;
originally announced May 2024.
-
NOVA-3D: Non-overlapped Views for 3D Anime Character Reconstruction
Authors:
Hongsheng Wang,
Nanjie Yao,
Xinrui Zhou,
Shengyu Zhang,
Huahao Xu,
Fei Wu,
Feng Lin
Abstract:
In the animation industry, 3D modelers typically rely on front and back non-overlapped concept designs to guide the 3D modeling of anime characters. However, there is currently a lack of automated approaches for generating anime characters directly from these 2D designs. In light of this, we explore a novel task of reconstructing anime characters from non-overlapped views. This presents two main challenges: existing multi-view approaches cannot be directly applied due to the absence of overlapping regions, and there is a scarcity of full-body anime character data and standard benchmarks. To bridge the gap, we present Non-Overlapped Views for 3D Anime Character Reconstruction (NOVA-3D), a new framework that implements a method for view-aware feature fusion to learn 3D-consistent features effectively and synthesizes full-body anime characters from non-overlapped front and back views directly. To facilitate this line of research, we collected the NOVA-Human dataset, which comprises multi-view images and accurate camera parameters for 3D anime characters. Extensive experiments demonstrate that the proposed method outperforms baseline approaches, achieving superior reconstruction of anime characters with exceptional detail fidelity. In addition, to further verify the effectiveness of our method, we applied it to the animation head reconstruction task and improved the state-of-the-art baseline to 94.453 in SSIM, 7.726 in LPIPS, and 19.575 in PSNR on average. Code and datasets are available at https://wanghongsheng01.github.io/NOVA-3D/.
Submitted 21 May, 2024;
originally announced May 2024.
-
EmbSum: Leveraging the Summarization Capabilities of Large Language Models for Content-Based Recommendations
Authors:
Chiyu Zhang,
Yifei Sun,
Minghao Wu,
Jun Chen,
Jie Lei,
Muhammad Abdul-Mageed,
Rong Jin,
Angli Liu,
Ji Zhu,
Sem Park,
Ning Yao,
Bo Long
Abstract:
Content-based recommendation systems play a crucial role in delivering personalized content to users in the digital world. In this work, we introduce EmbSum, a novel framework that enables offline pre-computations of users and candidate items while capturing the interactions within the user engagement history. By utilizing a pretrained encoder-decoder model and poly-attention layers, EmbSum derives User Poly-Embedding (UPE) and Content Poly-Embedding (CPE) to calculate relevance scores between users and candidate items. EmbSum actively learns from long user engagement histories by generating user-interest summaries with supervision from a large language model (LLM). The effectiveness of EmbSum is validated on two datasets from different domains, surpassing state-of-the-art (SoTA) methods with higher accuracy and fewer parameters. Additionally, the model's ability to generate summaries of user interests serves as a valuable by-product, enhancing its usefulness for personalized content recommendations.
Submitted 19 August, 2024; v1 submitted 19 May, 2024;
originally announced May 2024.
-
SPAR: Personalized Content-Based Recommendation via Long Engagement Attention
Authors:
Chiyu Zhang,
Yifei Sun,
Jun Chen,
Jie Lei,
Muhammad Abdul-Mageed,
Sinong Wang,
Rong Jin,
Sem Park,
Ning Yao,
Bo Long
Abstract:
Leveraging users' long engagement histories is essential for personalized content recommendations. The success of pretrained language models (PLMs) in NLP has led to their use in encoding user histories and candidate items, framing content recommendations as textual semantic matching tasks. However, existing works still struggle with processing very long user historical text and with insufficient user-item interaction. In this paper, we introduce a content-based recommendation framework, SPAR, which effectively tackles the challenges of holistic user interest extraction from the long user engagement history. It does so by leveraging a PLM, poly-attention layers, and attention sparsity mechanisms to encode the user's history in a session-based manner. The user-side and item-side features are sufficiently fused for engagement prediction while maintaining standalone representations for both sides, which is efficient for practical model deployment. Moreover, we enhance user profiling by exploiting a large language model (LLM) to extract global interests from the user engagement history. Extensive experiments on two benchmark datasets demonstrate that our framework outperforms existing state-of-the-art (SoTA) methods.
Submitted 21 May, 2024; v1 submitted 16 February, 2024;
originally announced February 2024.
-
DiffusionTalker: Personalization and Acceleration for Speech-Driven 3D Face Diffuser
Authors:
Peng Chen,
Xiaobao Wei,
Ming Lu,
Yitong Zhu,
Naiming Yao,
Xingyu Xiao,
Hui Chen
Abstract:
Speech-driven 3D facial animation has been an attractive task in both academia and industry. Traditional methods mostly focus on learning a deterministic mapping from speech to animation. Recent approaches have started to consider the non-deterministic nature of speech-driven 3D face animation and employ diffusion models for the task. However, personalizing facial animation and accelerating animation generation are still two major limitations of existing diffusion-based methods. To address these limitations, we propose DiffusionTalker, a diffusion-based method that utilizes contrastive learning to personalize 3D facial animation and knowledge distillation to accelerate 3D animation generation. Specifically, to enable personalization, we introduce a learnable talking identity to aggregate knowledge in audio sequences. The proposed identity embeddings extract customized facial cues across different people in a contrastive learning manner. During inference, users can obtain personalized facial animation based on input audio, reflecting a specific talking style. For acceleration, we distill a trained diffusion model with hundreds of steps into a lightweight model with 8 steps. Extensive experiments demonstrate that our method outperforms state-of-the-art methods. The code will be released.
Submitted 2 December, 2023; v1 submitted 28 November, 2023;
originally announced November 2023.
-
A Robust Deep Learning Method with Uncertainty Estimation for the Pathological Classification of Renal Cell Carcinoma based on CT Images
Authors:
Ni Yao,
Hang Hu,
Kaicong Chen,
Chen Zhao,
Yuan Guo,
Boya Li,
Jiaofen Nan,
Yanting Li,
Chuang Han,
Fubao Zhu,
Weihua Zhou,
Li Tian
Abstract:
Objectives: To develop and validate a deep learning-based diagnostic model incorporating uncertainty estimation so as to assist radiologists in the preoperative differentiation of the pathological subtypes of renal cell carcinoma (RCC) based on CT images. Methods: Data from 668 consecutive patients with pathologically proven RCC were retrospectively collected from Center 1. By using five-fold cross-validation, a deep learning model incorporating uncertainty estimation was developed to classify RCC subtypes into clear cell RCC (ccRCC), papillary RCC (pRCC), and chromophobe RCC (chRCC). An external validation set of 78 patients from Center 2 further evaluated the model's performance. Results: In the five-fold cross-validation, the model's area under the receiver operating characteristic curve (AUC) for the classification of ccRCC, pRCC, and chRCC was 0.868 (95% CI: 0.826-0.923), 0.846 (95% CI: 0.812-0.886), and 0.839 (95% CI: 0.802-0.88), respectively. In the external validation set, the AUCs were 0.856 (95% CI: 0.838-0.882), 0.787 (95% CI: 0.757-0.818), and 0.793 (95% CI: 0.758-0.831) for ccRCC, pRCC, and chRCC, respectively. Conclusions: The developed deep learning model demonstrated robust performance in predicting the pathological subtypes of RCC, while the incorporated uncertainty emphasized the importance of understanding model confidence, which is crucial for assisting clinical decision-making for patients with renal tumors. Clinical relevance statement: Our deep learning approach, integrated with uncertainty estimation, offers clinicians a dual advantage: accurate RCC subtype predictions complemented by diagnostic confidence references, promoting informed decision-making for patients with RCC.
Submitted 12 November, 2023; v1 submitted 1 November, 2023;
originally announced November 2023.
-
Manta Ray Inspired Flapping-Wing Blimp
Authors:
Kentaro Nojima-Schmunk,
David Turzak,
Kevin Kim,
Andrew Vu,
James Yang,
Sreeauditya Motukuri,
Ningshi Yao,
Daigo Shishika
Abstract:
Lighter-than-air vehicles, or blimps, are an evolving platform in robotics with several beneficial properties such as energy efficiency, collision resistance, and the ability to work in close proximity to human users. While existing blimp designs have mainly used propeller-based propulsion, we focus our attention on an alternative locomotion method: flapping wings. Specifically, this paper introduces a flapping-wing blimp inspired by manta rays, in contrast to existing research on flapping-wing vehicles that draw inspiration from insects or birds. We present the overall design and control scheme of the blimp as well as an analysis of how the wing performs. The effects of wing shape and flapping characteristics on thrust generation are studied experimentally. We also demonstrate that the flapping-wing blimp has a significant range advantage over a propeller-based system.
Submitted 16 October, 2023;
originally announced October 2023.
-
Lighter-Than-Air Autonomous Ball Capture and Scoring Robot -- Design, Development, and Deployment
Authors:
Joseph Prince Mathew,
Dinesh Karri,
James Yang,
Kevin Zhu,
Yojan Gautam,
Kentaro Nojima-Schmunk,
Daigo Shishika,
Ningshi Yao,
Cameron Nowzari
Abstract:
This paper describes the full end-to-end design of our primary scoring agent in an aerial autonomous robotics competition from April 2023. As open-ended robotics competitions become more popular, we wish to begin documenting successful team designs and approaches. The intended audience of this paper is not only future or potential participants in this particular national Defend The Republic (DTR) competition, but also anyone thinking about designing their first robot or system to be entered in a competition with clear goals. Future DTR participants can and should either build on the ideas here or find new alternative strategies that can defeat the most successful design from last time. For students who are not DTR participants but are interested in robotics competitions, identifying the minimum viable system needed to be competitive is still important for managing time and prioritizing the tasks that are crucial to competition success.
Submitted 12 September, 2023;
originally announced September 2023.
-
MLA-BIN: Model-level Attention and Batch-instance Style Normalization for Domain Generalization of Federated Learning on Medical Image Segmentation
Authors:
Fubao Zhu,
Yanhui Tian,
Chuang Han,
Yanting Li,
Jiaofen Nan,
Ni Yao,
Weihua Zhou
Abstract:
The privacy protection mechanism of federated learning (FL) offers an effective solution for cross-center medical collaboration and data sharing. In multi-site medical image segmentation, each medical site serves as a client of FL, and its data naturally forms a domain. FL makes it possible to improve model performance on seen domains. However, domain generalization (DG) is a problem in actual deployment: the performance of a model trained by FL decreases in unseen domains. Hence, MLA-BIN is proposed in this study to solve the DG problem of FL. Specifically, a model-level attention (MLA) module and a batch-instance style normalization (BIN) block were designed. The MLA represents the unseen domain as a linear combination of seen-domain models. An attention mechanism is introduced for the weighting coefficients, so that the optimal coefficients are obtained according to the similarity of inter-domain data features. MLA enables the global model to generalize to unseen domains. In the BIN block, batch normalization (BN) and instance normalization (IN) are combined in the shallow layers of the segmentation network to perform style normalization, mitigating the influence of inter-domain image style differences on DG. Extensive experimental results on two medical image segmentation tasks demonstrate that the proposed MLA-BIN outperforms state-of-the-art methods.
Submitted 29 June, 2023;
originally announced June 2023.
-
Incremental Value and Interpretability of Radiomics Features of Both Lung and Epicardial Adipose Tissue for Detecting the Severity of COVID-19 Infection
Authors:
Ni Yao,
Yanhui Tian,
Daniel Gama das Neves,
Chen Zhao,
Claudio Tinoco Mesquita,
Wolney de Andrade Martins,
Alair Augusto Sarmet Moreira Damas dos Santos,
Yanting Li,
Chuang Han,
Fubao Zhu,
Neng Dai,
Weihua Zhou
Abstract:
Epicardial adipose tissue (EAT) is known for its pro-inflammatory properties and association with Coronavirus Disease 2019 (COVID-19) severity. However, current EAT segmentation methods do not consider positional information. Additionally, the detection of COVID-19 severity lacks consideration for EAT radiomics features, which limits interpretability. This study investigates the use of radiomics features from EAT and lungs to detect the severity of COVID-19 infections. A retrospective analysis of 515 patients with COVID-19 (Cohort1: 415, Cohort2: 100) was conducted using a proposed three-stage deep learning approach for EAT extraction. Lung segmentation was achieved using a published method. A hybrid model for detecting the severity of COVID-19 was built in a derivation cohort, and its performance and uncertainty were evaluated in internal (125, Cohort1) and external (100, Cohort2) validation cohorts. For EAT extraction, the Dice similarity coefficients (DSC) of the two centers were 0.972 (±0.011) and 0.968 (±0.005), respectively. For severity detection, the hybrid model with radiomics features of both lungs and EAT showed improvements in AUC, net reclassification improvement (NRI), and integrated discrimination improvement (IDI) compared to the model with only lung radiomics features. The hybrid model exhibited an increase of 0.1 (p<0.001), 19.3%, and 18.0%, respectively, in the internal validation cohort and an increase of 0.09 (p<0.001), 18.0%, and 18.0%, respectively, in the external validation cohort while outperforming existing detection methods. Uncertainty quantification and radiomics features analysis confirmed the interpretability of case prediction after inclusion of EAT features.
Submitted 6 December, 2023; v1 submitted 28 January, 2023;
originally announced January 2023.
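The abstract above reports segmentation quality as a Dice similarity coefficient (DSC). As a reminder of what that number measures, here is a minimal sketch of the standard definition, DSC = 2|A ∩ B| / (|A| + |B|), applied to two sets of pixel coordinates. The function name `dice_coefficient` and the toy masks are illustrative assumptions and are not taken from the paper.

```python
from typing import Iterable, Tuple

Pixel = Tuple[int, int]


def dice_coefficient(pred: Iterable[Pixel], truth: Iterable[Pixel]) -> float:
    """Standard Dice similarity coefficient between two pixel sets.

    Hypothetical helper for illustration; it is not the authors' code.
    """
    pred_set, truth_set = set(pred), set(truth)
    if not pred_set and not truth_set:
        # Convention assumed here: two empty masks agree perfectly.
        return 1.0
    overlap = len(pred_set & truth_set)
    return 2.0 * overlap / (len(pred_set) + len(truth_set))


# Toy masks: three pixels each, two of them shared.
predicted = [(0, 0), (0, 1), (1, 1)]
reference = [(0, 1), (1, 1), (1, 2)]
score = dice_coefficient(predicted, reference)
print(round(score, 3))
```

A score of 1.0 means the predicted and reference masks coincide exactly, and 0.0 means they share no pixels.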
-
Classically-Verifiable Quantum Advantage from a Computational Bell Test
Authors:
Gregory D. Kahanamoku-Meyer,
Soonwon Choi,
Umesh V. Vazirani,
Norman Y. Yao
Abstract:
We propose and analyze a novel interactive protocol for demonstrating quantum computational advantage, which is efficiently classically verifiable. Our protocol relies upon the cryptographic hardness of trapdoor claw-free functions (TCFs). Through a surprising connection to Bell's inequality, our protocol avoids the need for an adaptive hardcore bit, with essentially no increase in the quantum circuit complexity and no extra cryptographic assumptions. Crucially, this expands the set of compatible TCFs, and we propose two new constructions: one based upon the decisional Diffie-Hellman problem and the other based upon Rabin's function, $x^2 \bmod N$. We also describe two independent innovations which improve the efficiency of our protocol's implementation: (i) a scheme to discard so-called "garbage bits", thereby removing the need for reversibility in the quantum circuits, and (ii) a natural way of performing post-selection which significantly reduces the fidelity needed to demonstrate quantum advantage. These two constructions may also be of independent interest, as they may be applicable to other TCF-based quantum cryptography such as certifiable random number generation. Finally, we design several efficient circuits for $x^2 \bmod N$ and describe a blueprint for their implementation on a Rydberg-atom-based quantum computer.
Submitted 16 August, 2022; v1 submitted 1 April, 2021;
originally announced April 2021.
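The abstract above builds one of its trapdoor claw-free constructions on Rabin's function, $x^2 \bmod N$. The following toy sketch illustrates the well-known property behind that choice: two inputs that collide under $x^2 \bmod N$ and are not negatives of each other reveal a factor of $N$ through a gcd. The modulus 77, the helper names, and the brute-force search are illustrative assumptions only; the protocol in the paper uses cryptographically large moduli and is not reproduced here.

```python
from math import gcd
from typing import List


def rabin(x: int, n: int) -> int:
    """Rabin's function x^2 mod n."""
    return (x * x) % n


def square_roots_mod(y: int, n: int) -> List[int]:
    """Brute-force all x in [0, n) with x^2 = y (mod n). Toy sizes only."""
    return [x for x in range(n) if rabin(x, n) == y]


def factor_from_claw(x0: int, x1: int, n: int) -> int:
    """Given x0^2 = x1^2 (mod n) with x0 != +/- x1, return a nontrivial factor."""
    return gcd(abs(x0 - x1), n)


n = 7 * 11                     # toy modulus (assumption for illustration)
roots = square_roots_mod(4, n)
print(roots)                   # four square roots of 4 modulo 77

# Restricting inputs to x < n/2 leaves exactly two colliding inputs (a "claw").
claw = [x for x in roots if x < n / 2]
print(claw, factor_from_claw(claw[0], claw[1], n))
```

Because a claw yields a factor of the modulus, finding one is at least as hard as factoring, which is the hardness assumption this kind of construction relies on.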
-
Enhancing Scalability of a Matrix-Free Eigensolver for Studying Many-Body Localization
Authors:
Roel Van Beeumen,
Khaled Z. Ibrahim,
Gregory D. Kahanamoku-Meyer,
Norman Y. Yao,
Chao Yang
Abstract:
In [Van Beeumen et al., HPC Asia 2020, https://www.doi.org/10.1145/3368474.3368497] a scalable and matrix-free eigensolver was proposed for studying the many-body localization (MBL) transition of two-level quantum spin chain models with nearest-neighbor $XX+YY$ interactions plus $Z$ terms. This type of problem is computationally challenging because the vector space dimension grows exponentially with the physical system size, and averaging over different configurations of the random disorder is needed to obtain relevant statistical behavior. For each eigenvalue problem, eigenvalues from different regions of the spectrum and their corresponding eigenvectors need to be computed. Traditionally, the interior eigenstates for a single eigenvalue problem are computed via the shift-and-invert Lanczos algorithm. Due to the extremely high memory footprint of the LU factorizations, this technique is not well suited for a large number of spins $L$; e.g., one needs thousands of compute nodes on modern high performance computing infrastructures to go beyond $L = 24$. The matrix-free approach does not suffer from this memory bottleneck; however, its scalability is limited by a computation and communication imbalance. We present a few strategies to reduce this imbalance and to significantly enhance the scalability of the matrix-free eigensolver. To optimize the communication performance, we leverage the consistent space runtime, CSPACER, and show its efficiency in accelerating the MBL irregular communication patterns at scale compared to optimized MPI non-blocking two-sided and one-sided RMA implementation variants. The efficiency and effectiveness of the proposed algorithm are demonstrated by computing eigenstates on a massively parallel many-core high performance computer.
Submitted 30 November, 2020;
originally announced December 2020.
-
Abnormal respiratory patterns classifier may contribute to large-scale screening of people infected with COVID-19 in an accurate and unobtrusive manner
Authors:
Yunlu Wang,
Menghan Hu,
Qingli Li,
Xiao-Ping Zhang,
Guangtao Zhai,
Nan Yao
Abstract:
Research significance: The extended version of this paper has been accepted by the IEEE Internet of Things Journal (DOI: 10.1109/JIOT.2020.2991456); please cite the journal version. During the epidemic prevention and control period, our study can be helpful in prognosis, diagnosis, and screening for patients infected with COVID-19 (the novel coronavirus) based on breathing characteristics. According to the latest clinical research, the respiratory pattern of COVID-19 is different from the respiratory patterns of flu and the common cold. One significant symptom of COVID-19 is tachypnea: people infected with COVID-19 have more rapid respiration. Our study can be utilized to distinguish various respiratory patterns, and our device can be preliminarily put to practical use. Demo videos of this method working with one subject and with two subjects can be downloaded online. Research details: Accurate detection of unexpected abnormal respiratory patterns in a remote and unobtrusive manner has great significance. In this work, we innovatively capitalize on a depth camera and deep learning to achieve this goal. The challenges in this task are twofold: the amount of real-world data is not enough to train the deep model, and the intra-class variation of different types of respiratory patterns is large while the inter-class variation is small. In this paper, considering the characteristics of actual respiratory signals, a novel and efficient Respiratory Simulation Model (RSM) is first proposed to fill the gap between the large amount of training data needed and the scarce real-world data. The proposed deep model and the modeling ideas have great potential to be extended to large-scale applications such as public places, sleep scenarios, and office environments.
Submitted 20 December, 2020; v1 submitted 12 February, 2020;
originally announced February 2020.
-
Geometry-Contrastive GAN for Facial Expression Transfer
Authors:
Fengchun Qiao,
Naiming Yao,
Zirui Jiao,
Zhihao Li,
Hui Chen,
Hongan Wang
Abstract:
In this paper, we propose a Geometry-Contrastive Generative Adversarial Network (GC-GAN) for transferring continuous emotions across different subjects. Given an input face with a certain emotion and a target facial expression from another subject, GC-GAN can generate an identity-preserving face with the target expression. Geometry information is introduced into cGANs as continuous conditions to guide the generation of facial expressions. In order to handle the misalignment across different subjects or emotions, contrastive learning is used to transform the geometry manifold into an embedded semantic manifold of facial expressions. The embedded geometry is then injected into the latent space of GANs and controls the emotion generation effectively. Experimental results demonstrate that our proposed method can be applied to facial expression transfer even when there are large differences in facial shapes and expressions between different subjects.
Submitted 22 October, 2018; v1 submitted 6 February, 2018;
originally announced February 2018.