AniFaceDiff: High-Fidelity Face Reenactment via Facial Parametric Conditioned Diffusion Models

Authors: Ken Chen, Sachith Seneviratne, Wei Wang, Dongting Hu, Sanjay Saha, Md. Tarek Hasan, Sanka Rasnayaka, Tamasha Malepathirana, Mingming Gong, Saman Halgamuge

Abstract: Face reenactment refers to the process of transferring the pose and facial expressions from a reference (driving) video onto a static facial (source) image while maintaining the original identity of the source image. Previous research in this domain has made significant progress by training controllable deep generative models to generate faces based on specific identity, pose and expression conditions. However, the mechanisms used in these methods to control pose and expression often inadvertently introduce identity information from the driving video, while also causing a loss of expression-related details. This paper proposes a new method based on Stable Diffusion, called AniFaceDiff, incorporating a new conditioning module for high-fidelity face reenactment. First, we propose an enhanced 2D facial snapshot conditioning approach by facial shape alignment to prevent the inclusion of identity information from the driving video. Then, we introduce an expression adapter conditioning mechanism to address the potential loss of expression-related information. Our approach effectively preserves pose and expression fidelity from the driving video while retaining the identity and fine details of the source image. Through experiments on the VoxCeleb dataset, we demonstrate that our method achieves state-of-the-art results in face reenactment, showcasing superior image quality, identity preservation, and expression accuracy, especially for cross-identity scenarios.
Considering the ethical concerns surrounding potential misuse, we analyze the implications of our method, evaluate current state-of-the-art deepfake detectors, and identify their shortcomings to guide future research.

Submitted 19 June, 2024; originally announced June 2024.

arXiv:2402.06937 [pdf, other]

Assessing Uncertainty Estimation Methods for 3D Image Segmentation under Distribution Shifts

Authors: Masoumeh Javanbakhat, Md Tasnimul Hasan, Cristoph Lippert

Abstract: In recent years, machine learning has witnessed extensive adoption across various sectors, yet its application in medical image-based disease detection and diagnosis remains challenging due to distribution shifts in real-world data. In practical settings, deployed models encounter samples that differ significantly from the training dataset, especially in the health domain, leading to potential performance issues. This limitation hinders the expressiveness and reliability of deep learning models in health applications. Thus, it becomes crucial to identify methods capable of producing reliable uncertainty estimation in the context of distribution shifts in the health sector. In this paper, we explore the feasibility of using cutting-edge Bayesian and non-Bayesian methods to detect distributionally shifted samples, aiming to achieve reliable and trustworthy diagnostic predictions in segmentation tasks. Specifically, we compare three distinct uncertainty estimation methods, each designed to capture either unimodal or multimodal aspects in the posterior distribution. Our findings demonstrate that methods capable of addressing multimodal characteristics in the posterior distribution offer more dependable uncertainty estimates. This research contributes to enhancing the utility of deep learning in healthcare, making diagnostic predictions more robust and trustworthy.

Submitted 10 February, 2024; originally announced February 2024.

arXiv:2304.11703 [pdf]

An Artificial Intelligence-based Framework to Achieve the Sustainable Development Goals in the Context of Bangladesh

Authors: Md. Tarek Hasan, Mohammad Nazmush Shamael, Arifa Akter, Rokibul Islam, Md. Saddam Hossain Mukta, Salekul Islam

Abstract: Sustainable development is a framework for achieving human development goals. It provides natural systems' ability to deliver natural resources and ecosystem services. Sustainable development is crucial for the economy and society. Artificial intelligence (AI) has attracted increasing attention in recent years, with the potential to have a positive influence across many domains. AI is a commonly employed component in the quest for long-term sustainability. In this study, we explore the impact of AI on three pillars of sustainable development: society, environment, and economy, as well as numerous case studies from which we may deduce the impact of AI in a variety of areas, i.e., agriculture, classifying waste, smart water management, and Heating, Ventilation, and Air Conditioning (HVAC) systems. Furthermore, we present AI-based strategies for achieving Sustainable Development Goals (SDGs) which are effective for developing countries like Bangladesh. The framework that we propose may reduce the negative impact of AI and promote the proactiveness of this technology.

Submitted 23 April, 2023; originally announced April 2023.

Comments: 11 pages, 5 figures, This is a part of the Proceedings of the 5th International Conference on Sustainable Development, Published by Institute of Development Studies and Sustainable Development (IDSS), United International University, United City, Madani Avenue, Badda, Dhaka 1212, Bangladesh, Link: icsd.uiu.ac.bd/wp-content/uploads/2022/11/5th-UIU-ICSD-2022-Proceedings..pdf

arXiv:2206.13982 [pdf]

doi 10.1109/ICONAT53423.2022.9725937

A Proposed Bi-LSTM Method to Fake News Detection

Authors: Taminul Islam, MD Alamin Hosen, Akhi Mony, MD Touhid Hasan, Israt Jahan, Arindom Kundu

Abstract: Recent years have seen an explosion in social media usage, allowing people to connect with others. Since the appearance of platforms such as Facebook and Twitter, such platforms influence how we speak, think, and behave. This problem negatively undermines confidence in content because of the existence of fake news. For instance, false news was a determining factor in influencing the outcome of the U.S. presidential election and other sites. Because this information is so harmful, it is essential to make sure we have the necessary tools to detect and resist it. We applied Bidirectional Long Short-Term Memory (Bi-LSTM) to determine if the news is false or real in order to showcase this study. A number of foreign websites and newspapers were used for data collection. After creating & running the model, the work achieved 84% model accuracy and 62.0 F1-macro scores with training data.

Submitted 15 June, 2022; originally announced June 2022.

Comments: Accepted and published in 2022 International Conference for Advancement in Technology, 5 pages, 8 figures

Journal ref: In 2022 International Conference for Advancement in Technology (ICONAT) (pp. 1-5). IEEE (2022, January)

arXiv:2206.05319 [pdf, other]

Object Instance Identification in Dynamic Environments

Authors: Takuma Yagi, Md Tasnimul Hasan, Yoichi Sato

Abstract: We study the problem of identifying object instances in a dynamic environment where people interact with the objects. In such an environment, objects' appearance changes dynamically by interaction with other entities, occlusion by hands, background change, etc. This leads to a larger intra-instance variation of appearance than in static environments. To discover the challenges in this setting, we newly built a benchmark of more than 1,500 instances built on the EPIC-KITCHENS dataset which includes natural activities and conducted an extensive analysis of it. Experimental results suggest that (i) robustness against instance-specific appearance change, (ii) integration of low-level (e.g., color, texture) and high-level (e.g., object category) features, and (iii) foreground feature selection on overlapping objects are required for further improvement.

Submitted 10 June, 2022; originally announced June 2022.

Comments: Joint 1st Ego4D and 10th EPIC Workshop (EPIC@CVPR2022) Extended Abstract

arXiv:2205.11038 [pdf, other]

Computational Approach of Designing Magnetfree Nonreciprocal Metamaterial

Authors: Swadesh Poddar, Md. Tanvir Hasan, Md. Ragib Shakil Rafi

Abstract: This article aims at discussing a computational approach to design magnet-free nonreciprocal metamaterial. Detailed mathematical derivation on Floquet mode analysis is presented for Faraday and Kerr rotation. Non-reciprocity in the designed metasurface is achieved in the presence of a biased transistor loaded in the gap of a circular ring resonator. Based on the synthesized mathematical model, we extract co-cross polarized components as well as Faraday and Kerr rotation from the developed synthesized model and compare/contrast reciprocal and nonreciprocal systems.

Submitted 23 May, 2022; originally announced May 2022.

Comments: 8 figures, 10 pages

arXiv:2110.10174 [pdf, other]

Hand-Object Contact Prediction via Motion-Based Pseudo-Labeling and Guided Progressive Label Correction

Authors: Takuma Yagi, Md Tasnimul Hasan, Yoichi Sato

Abstract: Every hand-object interaction begins with contact. Although predicting the contact state between hands and objects is useful in understanding hand-object interactions, prior methods on hand-object analysis have assumed that the interacting hands and objects are known, and were not studied in detail. In this study, we introduce a video-based method for predicting contact between a hand and an object. Specifically, given a video and a pair of hand and object tracks, we predict a binary contact state (contact or no-contact) for each frame. However, annotating a large number of hand-object tracks and contact labels is costly. To overcome the difficulty, we propose a semi-supervised framework consisting of (i) automatic collection of training data with motion-based pseudo-labels and (ii) guided progressive label correction (gPLC), which corrects noisy pseudo-labels with a small amount of trusted data. We validated our framework's effectiveness on a newly built benchmark dataset for hand-object contact prediction and showed superior performance against existing baseline methods. Code and data are available at https://github.com/takumayagi/hand_object_contact_prediction.

Submitted 19 October, 2021; originally announced October 2021.

Comments: BMVC 2021

Showing 1–7 of 7 results for author: Hasan, M T