Showing 1–18 of 18 results for author: Pham, N T

Searching in archive cs.
  1. arXiv:2408.13850

    cs.LG cs.AI

    Condensed Sample-Guided Model Inversion for Knowledge Distillation

    Authors: Kuluhan Binici, Shivam Aggarwal, Cihan Acar, Nam Trung Pham, Karianto Leman, Gim Hee Lee, Tulika Mitra

    Abstract: Knowledge distillation (KD) is a key element in neural network compression that allows knowledge transfer from a pre-trained teacher model to a more compact student model. KD relies on access to the training dataset, which may not always be fully available due to privacy concerns or logistical issues related to the size of the data. To address this, "data-free" KD methods use synthetic data, gener…

    Submitted 25 August, 2024; originally announced August 2024.

  2. arXiv:2312.09877

    cs.LG cs.AI cs.DC stat.ML

    Distributed Learning of Mixtures of Experts

    Authors: Faïcel Chamroukhi, Nhat Thien Pham

    Abstract: In modern machine learning problems we deal with datasets that are either distributed by nature or potentially large for which distributing the computations is usually a standard way to proceed, since centralized algorithms are in general ineffective. We propose a distributed learning approach for mixtures of experts (MoE) models with an aggregation strategy to construct a reduction estimator from…

    Submitted 15 December, 2023; originally announced December 2023.

  3. vieCap4H-VLSP 2021: Vietnamese Image Captioning for Healthcare Domain using Swin Transformer and Attention-based LSTM

    Authors: Thanh Tin Nguyen, Long H. Nguyen, Nhat Truong Pham, Liu Tai Nguyen, Van Huong Do, Hai Nguyen, Ngoc Duy Nguyen

    Abstract: This study presents our approach on the automatic Vietnamese image captioning for healthcare domain in text processing tasks of Vietnamese Language and Speech Processing (VLSP) Challenge 2021, as shown in Figure 1. In recent years, image captioning often employs a convolutional neural network-based architecture as an encoder and a long short-term memory (LSTM) as a decoder to generate sentences. T…

    Submitted 2 September, 2022; originally announced September 2022.

    Comments: Accepted for publication in the VNU Journal of Science: Computer Science and Communication Engineering

    Journal ref: VNU Journal of Science: Computer Science and Communication Engineering, 38(2), 2022

  4. arXiv:2202.13934

    stat.ML cs.AI cs.LG

    Functional mixture-of-experts for classification

    Authors: Nhat Thien Pham, Faicel Chamroukhi

    Abstract: We develop a mixtures-of-experts (ME) approach to the multiclass classification where the predictors are univariate functions. It consists of a ME model in which both the gating network and the experts network are constructed upon multinomial logistic activation functions with functional inputs. We perform a regularized maximum likelihood estimation in which the coefficient functions enjoy interpr…

    Submitted 28 February, 2022; originally announced February 2022.

    Comments: Submitted to the 53èmes Journées de la Société Française de Statistique

  5. arXiv:2202.02249

    stat.ME cs.LG stat.CO stat.ML

    Functional Mixtures-of-Experts

    Authors: Faïcel Chamroukhi, Nhat Thien Pham, Van Hà Hoang, Geoffrey J. McLachlan

    Abstract: We consider the statistical analysis of heterogeneous data for prediction in situations where the observations include functions, typically time series. We extend the modeling with Mixtures-of-Experts (ME), as a framework of choice in modeling heterogeneity in data for prediction with vectorial observations, to this functional data analysis context. We first present a new family of ME models, name…

    Submitted 20 December, 2023; v1 submitted 4 February, 2022; originally announced February 2022.

    MSC Class: 62-XX; 62R10 ACM Class: G.3

  6. arXiv:2201.03019

    cs.LG cs.AI

    Robust and Resource-Efficient Data-Free Knowledge Distillation by Generative Pseudo Replay

    Authors: Kuluhan Binici, Shivam Aggarwal, Nam Trung Pham, Karianto Leman, Tulika Mitra

    Abstract: Data-Free Knowledge Distillation (KD) allows knowledge transfer from a trained neural network (teacher) to a more compact one (student) in the absence of original training data. Existing works use a validation set to monitor the accuracy of the student over real data and report the highest performance throughout the entire process. However, validation data may not be available at distillation time…

    Submitted 29 July, 2024; v1 submitted 9 January, 2022; originally announced January 2022.

    Comments: AAAI Conference on Artificial Intelligence

  7. arXiv:2109.09026

    cs.SD cs.HC cs.LG eess.AS

    Hybrid Data Augmentation and Deep Attention-based Dilated Convolutional-Recurrent Neural Networks for Speech Emotion Recognition

    Authors: Nhat Truong Pham, Duc Ngoc Minh Dang, Sy Dzung Nguyen

    Abstract: Speech emotion recognition (SER) has been one of the significant tasks in Human-Computer Interaction (HCI) applications. However, it is hard to choose the optimal features and deal with imbalance labeled data. In this article, we investigate hybrid data augmentation (HDA) methods to generate and balance data based on traditional and generative adversarial networks (GAN) methods. To evaluate the ef…

    Submitted 18 September, 2021; originally announced September 2021.

    Comments: 12 pages, 16 figures, 6 tables

  8. arXiv:2109.03219

    cs.SD cs.LG cs.NE eess.AS

    Fruit-CoV: An Efficient Vision-based Framework for Speedy Detection and Diagnosis of SARS-CoV-2 Infections Through Recorded Cough Sounds

    Authors: Long H. Nguyen, Nhat Truong Pham, Van Huong Do, Liu Tai Nguyen, Thanh Tin Nguyen, Van Dung Do, Hai Nguyen, Ngoc Duy Nguyen

    Abstract: SARS-CoV-2 is colloquially known as COVID-19 that had an initial outbreak in December 2019. The deadly virus has spread across the world, taking part in the global pandemic disease since March 2020. In addition, a recent variant of SARS-CoV-2 named Delta is intractably contagious and responsible for more than four million deaths over the world. Therefore, it is vital to possess a self-testing serv…

    Submitted 6 September, 2021; originally announced September 2021.

    Comments: 4 pages

  9. arXiv:2108.11089

    cs.SD eess.AS

    Detecting Drill Failure in the Small Short-sound Drill Dataset

    Authors: Thanh Tran, Nhat Truong Pham, Jan Lundgren

    Abstract: Monitoring the conditions of machines is vital in the manufacturing industry. Early detection of faulty components in machines for stopping and repairing the failed components can minimize the downtime of the machine. This article presents an approach to detect the failure occurring in drill machines based on drill sounds from Valmet AB. The drill dataset includes three classes: anomalous sounds,…

    Submitted 9 November, 2021; v1 submitted 25 August, 2021; originally announced August 2021.

    Comments: 8 pages, 10 figures, journal

  10. arXiv:2108.05698

    cs.LG cs.CV

    Preventing Catastrophic Forgetting and Distribution Mismatch in Knowledge Distillation via Synthetic Data

    Authors: Kuluhan Binici, Nam Trung Pham, Tulika Mitra, Karianto Leman

    Abstract: With the increasing popularity of deep learning on edge devices, compressing large neural networks to meet the hardware requirements of resource-constrained devices became a significant research direction. Numerous compression methodologies are currently being used to reduce the memory sizes and energy consumption of neural networks. Knowledge distillation (KD) is among such methodologies and it f…

    Submitted 5 November, 2021; v1 submitted 11 August, 2021; originally announced August 2021.

    Comments: Accepted by the 2022 Winter Conference on Applications of Computer Vision (WACV 2022)

    Journal ref: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2022, pp. 663-671

  11. arXiv:2012.02950

    cs.LG cs.AI cs.CY

    Deep Depression Prediction on Longitudinal Data via Joint Anomaly Ranking and Classification

    Authors: Guansong Pang, Ngoc Thien Anh Pham, Emma Baker, Rebecca Bentley, Anton van den Hengel

    Abstract: A wide variety of methods have been developed for identifying depression, but they focus primarily on measuring the degree to which individuals are suffering from depression currently. In this work we explore the possibility of predicting future depression using machine learning applied to longitudinal socio-demographic data. In doing so we show that data such as housing status, and the details of…

    Submitted 20 March, 2022; v1 submitted 5 December, 2020; originally announced December 2020.

    Comments: Accepted to PAKDD 2022

  12. arXiv:2006.08748

    cs.CL

    DynE: Dynamic Ensemble Decoding for Multi-Document Summarization

    Authors: Chris Hokamp, Demian Gholipour Ghalandari, Nghia The Pham, John Glover

    Abstract: Sequence-to-sequence (s2s) models are the basis for extensive work in natural language processing. However, some applications, such as multi-document summarization, multi-modal machine translation, and the automatic post-editing of machine translation, require mapping a set of multiple distinct inputs into a single output sequence. Recent work has introduced bespoke architectures for these multi-i…

    Submitted 15 June, 2020; originally announced June 2020.

  13. arXiv:2005.10070

    cs.CL

    A Large-Scale Multi-Document Summarization Dataset from the Wikipedia Current Events Portal

    Authors: Demian Gholipour Ghalandari, Chris Hokamp, Nghia The Pham, John Glover, Georgiana Ifrim

    Abstract: Multi-document summarization (MDS) aims to compress the content in large document collections into short summaries and has important applications in story clustering for newsfeeds, presentation of search results, and timeline generation. However, there is a lack of datasets that realistically address such use cases at a scale large enough for training supervised models for this task. This work pre…

    Submitted 20 May, 2020; originally announced May 2020.

    Comments: Camera-ready version for ACL 2020

  14. arXiv:1807.11024

    cs.IR cs.AI cs.CL

    Opinion Spam Recognition Method for Online Reviews using Ontological Features

    Authors: L. H. Nguyen, N. T. H. Pham, V. M. Ngo

    Abstract: Nowadays, there are a lot of people using social media opinions to make their decision on buying products or services. Opinion spam detection is a hard problem because fake reviews can be made by organizations as well as individuals for different purposes. They write fake reviews to mislead readers or automated detection system by promoting or demoting target products to promote them or to damage…

    Submitted 29 July, 2018; originally announced July 2018.

    Comments: 15 pages, In Journal of Science, Special Issue: Natural Science and Technology, Ho Chi Minh City University of Education

  15. arXiv:1702.01815

    cs.CL

    Living a discrete life in a continuous world: Reference with distributed representations

    Authors: Gemma Boleda, Sebastian Padó, Nghia The Pham, Marco Baroni

    Abstract: Reference is a crucial property of language that allows us to connect linguistic expressions to the world. Modeling it requires handling both continuous and discrete aspects of meaning. Data-driven models excel at the former, but struggle with the latter, and the reverse is true for symbolic models. This paper (a) introduces a concrete referential task to test both aspects, called cross-modal en…

    Submitted 4 September, 2017; v1 submitted 6 February, 2017; originally announced February 2017.

    Comments: Accepted at IWCS 2017. Final version, 9 pages

  16. arXiv:1605.07133

    cs.CL cs.CV cs.LG

    Towards Multi-Agent Communication-Based Language Learning

    Authors: Angeliki Lazaridou, Nghia The Pham, Marco Baroni

    Abstract: We propose an interactive multimodal framework for language learning. Instead of being passively exposed to large amounts of natural text, our learners (implemented as feed-forward neural networks) engage in cooperative referential games starting from a tabula rasa setup, and thus develop their own language from the need to communicate in order to succeed at the game. Preliminary experiments provi…

    Submitted 23 May, 2016; originally announced May 2016.

    Comments: 9 pages, manuscript under submission

  17. arXiv:1603.02618

    cs.CL cs.CV

    The red one!: On learning to refer to things based on their discriminative properties

    Authors: Angeliki Lazaridou, Nghia The Pham, Marco Baroni

    Abstract: As a first step towards agents learning to communicate about their visual environment, we propose a system that, given visual representations of a referent (cat) and a context (sofa), identifies their discriminative attributes, i.e., properties that distinguish them (has_tail). Moreover, despite the lack of direct supervision at the attribute level, the model learns to assign plausible attributes…

    Submitted 23 May, 2016; v1 submitted 8 March, 2016; originally announced March 2016.

    Comments: Accepted as an ACL-short submission

  18. arXiv:1501.02598

    cs.CL cs.CV cs.LG

    Combining Language and Vision with a Multimodal Skip-gram Model

    Authors: Angeliki Lazaridou, Nghia The Pham, Marco Baroni

    Abstract: We extend the SKIP-GRAM model of Mikolov et al. (2013a) by taking visual information into account. Like SKIP-GRAM, our multimodal models (MMSKIP-GRAM) build vector-based word representations by learning to predict linguistic contexts in text corpora. However, for a restricted set of words, the models are also exposed to visual representations of the objects they denote (extracted from natural imag…

    Submitted 12 March, 2015; v1 submitted 12 January, 2015; originally announced January 2015.

    Comments: accepted at NAACL 2015, camera ready version, 11 pages