Search | arXiv e-print repository

Do We Really Need Graph Convolution During Training? Light Post-Training Graph-ODE for Efficient Recommendation

Authors: Weizhi Zhang, Liangwei Yang, Zihe Song, Henry Peng Zou, Ke Xu, Liancheng Fang, Philip S. Yu

Abstract: The efficiency and scalability of graph convolution networks (GCNs) in training recommender systems (RecSys) have been persistent concerns, hindering their deployment in real-world applications. This paper presents a critical examination of the necessity of graph convolutions during the training phase and introduces an innovative alternative: the Light Post-Training Graph Ordinary-Differential-Equation (LightGODE). Our investigation reveals that the benefits of GCNs are more pronounced during testing rather than training. Motivated by this, LightGODE utilizes a novel post-training graph convolution method that bypasses the computation-intensive message passing of GCNs and employs a non-parametric continuous graph ordinary-differential-equation (ODE) to dynamically model node representations. This approach drastically reduces training time while achieving fine-grained post-training graph convolution to avoid the distortion of the original training embedding space, termed the embedding discrepancy issue. We validate our model across several real-world datasets of different scales, demonstrating that LightGODE not only outperforms GCN-based models in terms of efficiency and effectiveness but also significantly mitigates the embedding discrepancy commonly associated with deeper graph convolution layers. Our LightGODE challenges the prevailing paradigms in RecSys training and suggests re-evaluating the role of graph convolutions, potentially guiding future developments of efficient large-scale graph-based RecSys.

Submitted 28 July, 2024; v1 submitted 26 July, 2024; originally announced July 2024.

Comments: Accepted to CIKM 2024

arXiv:2407.18416 [pdf, other]

PersonaGym: Evaluating Persona Agents and LLMs

Authors: Vinay Samuel, Henry Peng Zou, Yue Zhou, Shreyas Chaudhari, Ashwin Kalyan, Tanmay Rajpurohit, Ameet Deshpande, Karthik Narasimhan, Vishvak Murahari

Abstract: Persona agents, which are LLM agents that act according to an assigned persona, have demonstrated impressive contextual response capabilities across various applications. These persona agents offer significant enhancements across diverse sectors, such as education, healthcare, and entertainment, where model developers can align agent responses to different user requirements, thereby broadening the scope of agent applications. However, evaluating persona agent performance is incredibly challenging due to the complexity of assessing persona adherence in free-form interactions across various environments that are relevant to each persona agent. We introduce PersonaGym, the first dynamic evaluation framework for assessing persona agents, and PersonaScore, the first automated human-aligned metric grounded in decision theory for comprehensive large-scale evaluation of persona agents. Our evaluation of 6 open and closed-source LLMs, using a benchmark encompassing 200 personas and 10,000 questions, reveals significant opportunities for advancement in persona agent capabilities across state-of-the-art models. For example, Claude 3.5 Sonnet shows only a 2.97% relative improvement in PersonaScore over GPT 3.5 despite being a much more advanced model. Importantly, we find that increased model size and complexity do not necessarily imply enhanced persona agent capabilities, thereby highlighting the pressing need for algorithmic and architectural invention towards faithful and performant persona agents.

Submitted 28 July, 2024; v1 submitted 25 July, 2024; originally announced July 2024.

Comments: 21 pages, 5 figures

arXiv:2407.12037 [pdf, other]

A Novel HDL Code Generator for Effectively Testing FPGA Logic Synthesis Compilers

Authors: Zhihao Xu, Shikai Guo, Guilin Zhao, Peiyu Zou, Xiaochen Li, He Jiang

Abstract: Field Programmable Gate Array (FPGA) logic synthesis compilers (e.g., Vivado, Iverilog, Yosys, and Quartus) are widely applied in Electronic Design Automation (EDA), such as the development of FPGA programs. However, defects (i.e., incorrect synthesis) in logic synthesis compilers may lead to unexpected behaviors in target applications, posing security risks. Therefore, it is crucial to thoroughly test logic synthesis compilers to eliminate such defects. Although several Hardware Design Language (HDL) code generators (e.g., Verismith) have been proposed to find defects in logic synthesis compilers, the effectiveness of these generators is still limited by the simple code generation strategy and the monogeneity of the generated HDL code. This paper proposes LegoHDL, a novel method to generate syntax-valid HDL code for comprehensively testing FPGA logic synthesis compilers. LegoHDL can generate more complex and diverse defect-trigger HDL code (e.g., Verilog, VHDL, and SystemVerilog) by leveraging the guidance of abstract syntax tree and the extensive function block libraries of cyber-physical systems. Extensive experiments show that the diversity and defect-trigger capability of HDL code generated by LegoHDL are significantly better than the state-of-the-art method (i.e., Verismith). In three months, LegoHDL has reported 20 new defects, many of which are deep and important; 16 of them have been confirmed.

Submitted 1 July, 2024; originally announced July 2024.

arXiv:2407.05721 [pdf, other]

PsycoLLM: Enhancing LLM for Psychological Understanding and Evaluation

Authors: Jinpeng Hu, Tengteng Dong, Luo Gang, Hui Ma, Peng Zou, Xiao Sun, Dan Guo, Meng Wang

Abstract: Mental health has attracted substantial attention in recent years, and LLMs can be an effective technology for alleviating this problem owing to their capability in text understanding and dialogue. However, existing research in this domain often suffers from limitations, such as training on datasets lacking crucial prior knowledge and evidence, and the absence of comprehensive evaluation methods. In this paper, we propose a specialized psychological large language model (LLM), named PsycoLLM, trained on a proposed high-quality psychological dataset, including single-turn QA, multi-turn dialogues and knowledge-based QA. Specifically, we construct multi-turn dialogues through a three-step pipeline comprising generation, evidence judgment, and refinement. We augment this process with real-world psychological case backgrounds extracted from online platforms, enhancing the relevance and applicability of the generated data. Additionally, to compare the performance of PsycoLLM with other LLMs, we develop a comprehensive psychological benchmark based on authoritative psychological counseling examinations in China, which includes assessments of professional ethics, theoretical proficiency, and case analysis. The experimental results on the benchmark illustrate the effectiveness of PsycoLLM, which demonstrates superior performance compared to other LLMs.

Submitted 7 August, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

Comments: work in progress

arXiv:2407.00869 [pdf, other]

Large Language Models Are Involuntary Truth-Tellers: Exploiting Fallacy Failure for Jailbreak Attacks

Authors: Yue Zhou, Henry Peng Zou, Barbara Di Eugenio, Yang Zhang

Abstract: We find that language models have difficulties generating fallacious and deceptive reasoning. When asked to generate deceptive outputs, language models tend to leak honest counterparts but believe them to be false. Exploiting this deficiency, we propose a jailbreak attack method that elicits malicious output from an aligned language model. Specifically, we query the model to generate a fallacious yet deceptively real procedure for the harmful behavior. Since a fallacious procedure is generally considered fake and thus harmless by LLMs, it helps bypass the safeguard mechanism. Yet the output is factually harmful since the LLM cannot fabricate fallacious solutions but proposes truthful ones. We evaluate our approach over five safety-aligned large language models, comparing against four previous jailbreak methods, and show that our approach achieves competitive performance with more harmful outputs. We believe the findings could be extended beyond model safety, such as self-verification and hallucination.

Submitted 30 June, 2024; originally announced July 2024.

arXiv:2406.16253 [pdf, other]

LLMs Assist NLP Researchers: Critique Paper (Meta-)Reviewing

Authors: Jiangshu Du, Yibo Wang, Wenting Zhao, Zhongfen Deng, Shuaiqi Liu, Renze Lou, Henry Peng Zou, Pranav Narayanan Venkit, Nan Zhang, Mukund Srinath, Haoran Ranran Zhang, Vipul Gupta, Yinghui Li, Tao Li, Fei Wang, Qin Liu, Tianlin Liu, Pengzhi Gao, Congying Xia, Chen Xing, Jiayang Cheng, Zhaowei Wang, Ying Su, Raj Sanjay Shah, Ruohao Guo, et al. (15 additional authors not shown)

Abstract: This work is motivated by two key trends. On one hand, large language models (LLMs) have shown remarkable versatility in various generative tasks such as writing, drawing, and question answering, significantly reducing the time required for many routine tasks. On the other hand, researchers, whose work is not only time-consuming but also highly expertise-demanding, face increasing challenges as they have to spend more time reading, writing, and reviewing papers. This raises the question: how can LLMs potentially assist researchers in alleviating their heavy workload? This study focuses on the topic of LLMs assist NLP Researchers, particularly examining the effectiveness of LLM in assisting paper (meta-)reviewing and its recognizability. To address this, we constructed the ReviewCritique dataset, which includes two types of information: (i) NLP papers (initial submissions rather than camera-ready) with both human-written and LLM-generated reviews, and (ii) each review comes with "deficiency" labels and corresponding explanations for individual segments, annotated by experts. Using ReviewCritique, this study explores two threads of research questions: (i) "LLMs as Reviewers", how do reviews generated by LLMs compare with those written by humans in terms of quality and distinguishability? (ii) "LLMs as Metareviewers", how effectively can LLMs identify potential issues, such as Deficient or unprofessional review segments, within individual paper reviews? To our knowledge, this is the first work to provide such a comprehensive analysis.

Submitted 25 June, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

arXiv:2406.05392 [pdf, other]

Deconstructing The Ethics of Large Language Models from Long-standing Issues to New-emerging Dilemmas

Authors: Chengyuan Deng, Yiqun Duan, Xin Jin, Heng Chang, Yijun Tian, Han Liu, Henry Peng Zou, Yiqiao Jin, Yijia Xiao, Yichen Wang, Shenghao Wu, Zongxing Xie, Kuofeng Gao, Sihong He, Jun Zhuang, Lu Cheng, Haohan Wang

Abstract: Large Language Models (LLMs) have achieved unparalleled success across diverse language modeling tasks in recent years. However, this progress has also intensified ethical concerns, impacting the deployment of LLMs in everyday contexts. This paper provides a comprehensive survey of ethical challenges associated with LLMs, from longstanding issues such as copyright infringement, systematic bias, and data privacy, to emerging problems like truthfulness and social norms. We critically analyze existing research aimed at understanding, examining, and mitigating these ethical risks. Our survey underscores integrating ethical standards and societal values into the development of LLMs, thereby guiding the development of responsible and ethically aligned language models.

Submitted 8 June, 2024; originally announced June 2024.

arXiv:2404.15954 [pdf, other]

Mixed Supervised Graph Contrastive Learning for Recommendation

Authors: Weizhi Zhang, Liangwei Yang, Zihe Song, Henry Peng Zou, Ke Xu, Yuanjie Zhu, Philip S. Yu

Abstract: Recommender systems (RecSys) play a vital role in online platforms, offering users personalized suggestions amidst vast information. Graph contrastive learning aims to learn from high-order collaborative filtering signals with unsupervised augmentation on the user-item bipartite graph, which predominantly relies on the multi-task learning framework involving both the pair-wise recommendation loss and the contrastive loss. This decoupled design can cause inconsistent optimization direction from different losses, which leads to longer convergence time and even sub-optimal performance. Besides, the self-supervised contrastive loss falls short in alleviating the data sparsity issue in RecSys as it learns to differentiate users/items from different views without providing extra supervised collaborative filtering signals during augmentations. In this paper, we propose Mixed Supervised Graph Contrastive Learning for Recommendation (MixSGCL) to address these concerns. MixSGCL originally integrates the training of recommendation and unsupervised contrastive losses into a supervised contrastive learning loss to align the two tasks within one optimization direction. To cope with the data sparsity issue, instead of unsupervised augmentation, we further propose node-wise and edge-wise mixup to mine more direct supervised collaborative filtering signals based on existing user-item interactions. Extensive experiments on three real-world datasets demonstrate that MixSGCL surpasses state-of-the-art methods, achieving top performance on both accuracy and efficiency. It validates the effectiveness of MixSGCL with our coupled design on supervised graph contrastive learning.

Submitted 25 April, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

arXiv:2404.15592 [pdf, other]

ImplicitAVE: An Open-Source Dataset and Multimodal LLMs Benchmark for Implicit Attribute Value Extraction

Authors: Henry Peng Zou, Vinay Samuel, Yue Zhou, Weizhi Zhang, Liancheng Fang, Zihe Song, Philip S. Yu, Cornelia Caragea

Abstract: Existing datasets for attribute value extraction (AVE) predominantly focus on explicit attribute values while neglecting the implicit ones, lack product images, are often not publicly available, and lack an in-depth human inspection across diverse domains. To address these limitations, we present ImplicitAVE, the first, publicly available multimodal dataset for implicit attribute value extraction. ImplicitAVE, sourced from the MAVE dataset, is carefully curated and expanded to include implicit AVE and multimodality, resulting in a refined dataset of 68k training and 1.6k testing data across five domains. We also explore the application of multimodal large language models (MLLMs) to implicit AVE, establishing a comprehensive benchmark for MLLMs on the ImplicitAVE dataset. Six recent MLLMs with eleven variants are evaluated across diverse settings, revealing that implicit value extraction remains a challenging task for MLLMs. The contributions of this work include the development and release of ImplicitAVE, and the exploration and benchmarking of various MLLMs for implicit AVE, providing valuable insights and potential future research directions. Dataset and code are available at https://github.com/HenryPengZou/ImplicitAVE

Submitted 19 July, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

Comments: Accepted by ACL 2024 (Findings) - Scores: Soundness - 4/4/4, Dataset - 4/4/4, Overall Assessment - 4/3.5/3.5, Meta - 4

arXiv:2404.08886 [pdf, other]

EIVEN: Efficient Implicit Attribute Value Extraction using Multimodal LLM

Authors: Henry Peng Zou, Gavin Heqing Yu, Ziwei Fan, Dan Bu, Han Liu, Peng Dai, Dongmei Jia, Cornelia Caragea

Abstract: In e-commerce, accurately extracting product attribute values from multimodal data is crucial for improving user experience and operational efficiency of retailers. However, previous approaches to multimodal attribute value extraction often struggle with implicit attribute values embedded in images or text, rely heavily on extensive labeled data, and can easily confuse similar attribute values. To address these issues, we introduce EIVEN, a data- and parameter-efficient generative framework that pioneers the use of multimodal LLM for implicit attribute value extraction. EIVEN leverages the rich inherent knowledge of a pre-trained LLM and vision encoder to reduce reliance on labeled data. We also introduce a novel Learning-by-Comparison technique to reduce model confusion by enforcing attribute value comparison and difference identification. Additionally, we construct initial open-source datasets for multimodal implicit attribute value extraction. Our extensive experiments reveal that EIVEN significantly outperforms existing methods in extracting implicit attribute values while requiring less labeled data.

Submitted 12 April, 2024; originally announced April 2024.

Comments: Accepted by NAACL 2024 Industry Track

arXiv:2404.08638 [pdf, other]

Age of Information Optimization and State Error Analysis for Correlated Multi-Process Multi-Sensor Systems

Authors: Egemen Erbayat, Ali Maatouk, Peng Zou, Suresh Subramaniam

Abstract: In this paper, we examine a multi-sensor system where each sensor may monitor more than one time-varying information process and send status updates to a remote monitor over a common channel. We consider that each sensor's status update may contain information about more than one information process in the system subject to the system's constraints. To investigate the impact of this correlation on the overall system's performance, we conduct an analysis of both the average Age of Information (AoI) and source state estimation error at the monitor. Building upon this analysis, we subsequently explore the impact of the packet arrivals, correlation probabilities, and rate of processes' state change on the system's performance. Next, we consider the case where sensors have limited sensing abilities and distribute a portion of their sensing abilities across the different processes. We optimize this distribution to minimize the total AoI of the system. Interestingly, we show that monitoring multiple processes from a single source may not always be beneficial. Our results also reveal that the optimal sensing distribution for diverse arrival rates may exhibit a rapid regime switch, rather than smooth transitions, after crossing critical system values. This highlights the importance of identifying these critical thresholds to ensure effective system performance.

Submitted 20 August, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

arXiv:2402.17785 [pdf, other]

ByteComposer: a Human-like Melody Composition Method based on Language Model Agent

Authors: Xia Liang, Xingjian Du, Jiaju Lin, Pei Zou, Yuan Wan, Bilei Zhu

Abstract: Large Language Models (LLMs) have shown encouraging progress in multimodal understanding and generation tasks. However, how to design a human-aligned and interpretable melody composition system is still under-explored. To solve this problem, we propose ByteComposer, an agent framework emulating a human's creative pipeline in four separate steps: "Conception Analysis - Draft Composition - Self-Evaluation and Modification - Aesthetic Selection". This framework seamlessly blends the interactive and knowledge-understanding features of LLMs with existing symbolic music generation models, thereby achieving a melody composition agent comparable to human creators. We conduct extensive experiments on GPT4 and several open-source large language models, which substantiate our framework's effectiveness. Furthermore, professional music composers were engaged in multi-dimensional evaluations; the final results demonstrated that, across various facets of music composition, the ByteComposer agent attains the level of a novice melody composer.

Submitted 6 March, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

arXiv:2310.14627 [pdf, other]

CrisisMatch: Semi-Supervised Few-Shot Learning for Fine-Grained Disaster Tweet Classification

Authors: Henry Peng Zou, Yue Zhou, Cornelia Caragea, Doina Caragea

Abstract: The shared real-time information about natural disasters on social media platforms like Twitter and Facebook plays a critical role in informing volunteers, emergency managers, and response organizations. However, supervised learning models for monitoring disaster events require large amounts of annotated data, making them unrealistic for real-time use in disaster events. To address this challenge, we present a fine-grained disaster tweet classification model under the semi-supervised, few-shot learning setting where only a small number of annotated data is required. Our model, CrisisMatch, effectively classifies tweets into fine-grained classes of interest using few labeled data and large amounts of unlabeled data, mimicking the early stage of a disaster. Through integrating effective semi-supervised learning ideas and incorporating TextMixUp, CrisisMatch achieves performance improvement on two disaster datasets of 11.2% on average. Further analyses are also provided for the influence of the number of labeled data and out-of-domain results.

Submitted 23 October, 2023; originally announced October 2023.

Comments: Accepted by ISCRAM 2023

arXiv:2310.14583 [pdf, other]

JointMatch: A Unified Approach for Diverse and Collaborative Pseudo-Labeling to Semi-Supervised Text Classification

Authors: Henry Peng Zou, Cornelia Caragea

Abstract: Semi-supervised text classification (SSTC) has gained increasing attention due to its ability to leverage unlabeled data. However, existing approaches based on pseudo-labeling suffer from the issues of pseudo-label bias and error accumulation. In this paper, we propose JointMatch, a holistic approach for SSTC that addresses these challenges by unifying ideas from recent semi-supervised learning and the task of learning with noise. JointMatch adaptively adjusts classwise thresholds based on the learning status of different classes to mitigate model bias towards current easy classes. Additionally, JointMatch alleviates error accumulation by utilizing two differently initialized networks to teach each other in a cross-labeling manner. To maintain divergence between the two networks for mutual learning, we introduce a strategy that weighs more disagreement data while also allowing the utilization of high-quality agreement data for training. Experimental results on benchmark datasets demonstrate the superior performance of JointMatch, achieving a significant 5.13% improvement on average. Notably, JointMatch delivers impressive results even in the extremely-scarce-label setting, obtaining 86% accuracy on AG News with only 5 labels per class. We make our code available at https://github.com/HenryPengZou/JointMatch.

Submitted 23 October, 2023; originally announced October 2023.

Comments: Accepted by EMNLP 2023 (Main)

arXiv:2310.14577 [pdf, other]

DeCrisisMB: Debiased Semi-Supervised Learning for Crisis Tweet Classification via Memory Bank

Authors: Henry Peng Zou, Yue Zhou, Weizhi Zhang, Cornelia Caragea

Abstract: During crisis events, people often use social media platforms such as Twitter to disseminate information about the situation, warnings, advice, and support. Emergency relief organizations leverage such information to acquire timely crisis circumstances and expedite rescue operations. While existing works utilize such information to build models for crisis event analysis, fully-supervised approaches require annotating vast amounts of data and are impractical due to limited response time. On the other hand, semi-supervised models can be biased, performing moderately well for certain classes while performing extremely poorly for others, resulting in substantially negative effects on disaster monitoring and rescue. In this paper, we first study two recent debiasing methods on semi-supervised crisis tweet classification. Then we propose a simple but effective debiasing method, DeCrisisMB, that utilizes a Memory Bank to store and perform equal sampling for generated pseudo-labels from each class at each training iteration. Extensive experiments are conducted to compare different debiasing methods' performance and generalization ability in both in-distribution and out-of-distribution settings. The results demonstrate the superior performance of our proposed method. Our code is available at https://github.com/HenryPengZou/DeCrisisMB.

Submitted 23 October, 2023; originally announced October 2023.

Comments: Accepted by EMNLP 2023 (Findings)

arXiv:2310.02593 [pdf]

A ModelOps-based Framework for Intelligent Medical Knowledge Extraction

Authors: Hongxin Ding, Peinie Zou, Zhiyuan Wang, Junfeng Zhao, Yasha Wang, Qiang Zhou

Abstract: Extracting medical knowledge from healthcare texts enhances downstream tasks like medical knowledge graph construction and clinical decision-making. However, the construction and application of knowledge extraction models lack automation, reusability and unified management, leading to inefficiencies for researchers and high barriers for non-AI experts, such as doctors, to utilize knowledge extraction. To address these issues, we propose a ModelOps-based intelligent medical knowledge extraction framework that offers a low-code system for model selection, training, evaluation and optimization. Specifically, the framework includes a dataset abstraction mechanism based on multi-layer callback functions and a reusable model training, monitoring and management mechanism. We also propose a model recommendation method based on dataset similarity, which helps users quickly find potentially suitable models for a given dataset. Our framework provides convenience for researchers to develop models and simplifies model access for non-AI experts such as doctors.

Submitted 4 October, 2023; originally announced October 2023.

arXiv:2309.16247 [pdf, other]

PP-MeT: a Real-world Personalized Prompt based Meeting Transcription System

Authors: Xiang Lyu, Yuhang Cao, Qing Wang, Jingjing Yin, Yuguang Yang, Pengpeng Zou, Yanni Hu, Heng Lu

Abstract: Speaker-attributed automatic speech recognition (SA-ASR) improves the accuracy and applicability of multi-speaker ASR systems in real-world scenarios by assigning speaker labels to transcribed texts. However, SA-ASR poses unique challenges due to factors such as speaker overlap, speaker variability, background noise, and reverberation. In this study, we propose the PP-MeT system, a real-world personalized prompt based meeting transcription system, which consists of a clustering system, target-speaker voice activity detection (TS-VAD), and TS-ASR. Specifically, we utilize target-speaker embedding as a prompt in TS-VAD and TS-ASR modules in our proposed system. In contrast with previous systems, we fully leverage pre-trained models for system initialization, thereby bestowing our approach with heightened generalizability and precision. Experiments on M2MeT2.0 Challenge dataset show that our system achieves a cp-CER of 11.27% on the test set, ranking first in both fixed and open training conditions.

Submitted 28 September, 2023; originally announced September 2023.

arXiv:2304.12256 [pdf, other]

How Costly Was That (In)Decision?

Authors: Peng Zou, Ali Maatouk, Jin Zhang, Suresh Subramaniam

Abstract: In this paper, we introduce a new metric, named Penalty upon Decision (PuD), for measuring the impact of communication delays and state changes at the source on a remote decision maker. Specifically, the metric quantifies the performance degradation at the decision maker's side due to delayed, erroneous, and (possibly) missed decisions. We clarify the rationale for the metric and derive closed-form expressions for its average in M/GI/1 and M/GI/1/1 with blocking settings. Numerical results are then presented to support our expressions and to compare the infinite and zero buffer regimes. Interestingly, comparing these two settings sheds light on a buffer length design challenge that is essential to minimize the average PuD.

Submitted 24 April, 2023; originally announced April 2023.

arXiv:2303.10561 [pdf, other]

Spatial-temporal Transformer for Affective Behavior Analysis

Authors: Peng Zou, Rui Wang, Kehua Wen, Yasi Peng, Xiao Sun

Abstract: The in-the-wild affective behavior analysis has been an important study. In this paper, we submit our solutions for the 5th Workshop and Competition on Affective Behavior Analysis in-the-wild (ABAW), which includes V-A Estimation, Facial Expression Classification and AU Detection Sub-challenges. We propose a Transformer Encoder with Multi-Head Attention framework to learn the distribution of both the spatial and temporal features. Besides, there are various effective data augmentation strategies employed to alleviate the problems of sample imbalance during model training. The results fully demonstrate the effectiveness of our proposed model based on the Aff-Wild2 dataset.

Submitted 19 March, 2023; originally announced March 2023.

arXiv:2208.03051 [pdf, other]

Hybrid Multimodal Feature Extraction, Mining and Fusion for Sentiment Analysis

Authors: Jia Li, Ziyang Zhang, Junjie Lang, Yueqi Jiang, Liuwei An, Peng Zou, Yangyang Xu, Sheng Gao, Jie Lin, Chunxiao Fan, Xiao Sun, Meng Wang

Abstract: In this paper, we present our solutions for the Multimodal Sentiment Analysis Challenge (MuSe) 2022, which includes MuSe-Humor, MuSe-Reaction and MuSe-Stress Sub-challenges. The MuSe 2022 focuses on humor detection, emotional reactions and multimodal emotional stress utilizing different modalities and data sets. In our work, different kinds of multimodal features are extracted, including acoustic, visual, text and biological features. These features are fused by TEMMA and GRU with self-attention mechanism frameworks. In this paper, 1) several new audio features, facial expression features and paragraph-level text embeddings are extracted for accuracy improvement. 2) we substantially improve the accuracy and reliability of multimodal sentiment prediction by mining and blending the multimodal features. 3) effective data augmentation strategies are applied in model training to alleviate the problem of sample imbalance and prevent the model from learning biased subject characters. For the MuSe-Humor sub-challenge, our model obtains the AUC score of 0.8932. For the MuSe-Reaction sub-challenge, the Pearson's Correlations Coefficient of our approach on the test set is 0.3879, which outperforms all other participants. For the MuSe-Stress sub-challenge, our approach outperforms the baseline in both arousal and valence on the test dataset, reaching a final combined result of 0.5151.

Submitted 12 August, 2022; v1 submitted 5 August, 2022; originally announced August 2022.

Comments: 8 pages, 2 figures, to appear in MuSe 2022 (ACM MM2022 co-located workshop)

arXiv:2206.08224 [pdf, other]

Multi scale Feature Extraction and Fusion for Online Knowledge Distillation

Authors: Panpan Zou, Yinglei Teng, Tao Niu

Abstract: Online knowledge distillation conducts knowledge transfer among all student models to alleviate the reliance on pre-trained models. However, existing online methods rely heavily on the prediction distributions and neglect the further exploration of the representational knowledge. In this paper, we propose a novel Multi-scale Feature Extraction and Fusion method (MFEF) for online knowledge distillation, which comprises three key components: Multi-scale Feature Extraction, Dual-attention and Feature Fusion, towards generating more informative feature maps for distillation. The multi-scale feature extraction exploiting divide-and-concatenate in channel dimension is proposed to improve the multi-scale representation ability of feature maps. To obtain more accurate information, we design a dual-attention to strengthen the important channel and spatial regions adaptively. Moreover, we aggregate and fuse the former processed feature maps via feature fusion to assist the training of student models. Extensive experiments on CIFAR-10, CIFAR-100, and CINIC-10 show that MFEF transfers more beneficial representational knowledge for distillation and outperforms alternative methods among various network architectures.

Submitted 16 June, 2022; originally announced June 2022.

Comments: 12 pages, 3 figures

arXiv:2206.08186 [pdf, other]

Asymptotic Soft Cluster Pruning for Deep Neural Networks

Authors: Tao Niu, Yinglei Teng, Panpan Zou

Abstract: Filter pruning method introduces structural sparsity by removing selected filters and is thus particularly effective for reducing complexity. Previous works empirically prune networks from the point of view that filter with smaller norm contributes less to the final results. However, such criteria have been proven sensitive to the distribution of filters, and the accuracy may be hard to recover since the capacity gap is fixed once pruned. In this paper, we propose a novel filter pruning method called Asymptotic Soft Cluster Pruning (ASCP), to identify the redundancy of network based on the similarity of filters. Each filter from over-parameterized network is first distinguished by clustering, and then reconstructed to manually introduce redundancy into it. Several guidelines of clustering are proposed to better preserve feature extraction ability. After reconstruction, filters are allowed to be updated to eliminate the effect caused by mistaken selection. Besides, various decaying strategies of the pruning rate are adopted to stabilize the pruning process and improve the final performance as well. By gradually generating more identical filters within each cluster, ASCP can remove them through channel addition operation with almost no accuracy drop. Extensive experiments on CIFAR-10 and ImageNet datasets show that our method can achieve competitive results compared with many state-of-the-art algorithms.

Submitted 16 June, 2022; originally announced June 2022.

arXiv:2202.12081 [pdf, other]

Community Trend Prediction on Heterogeneous Graph in E-commerce

Authors: Jiahao Yuan, Zhao Li, Pengcheng Zou, Xuan Gao, Jinwei Pan, Wendi Ji, Xiaoling Wang

Abstract: In online shopping, ever-changing fashion trends make merchants need to prepare more differentiated products to meet the diversified demands, and e-commerce platforms need to capture the market trend with a prophetic vision. For the trend prediction, the attribute tags, as the essential description of items, can genuinely reflect the decision basis of consumers. However, few existing works explore the attribute trend in the specific community for e-commerce. In this paper, we focus on the community trend prediction on the item attribute and propose a unified framework that combines the dynamic evolution of two graph patterns to predict the attribute trend in a specific community. Specifically, we first design a community-attribute bipartite graph at each time step to learn the collaboration of different communities. Next, we transform the bipartite graph into a hypergraph to exploit the associations of different attribute tags in one community. Lastly, we introduce a dynamic evolution component based on the recurrent neural networks to capture the fashion trend of attribute tags. Extensive experiments on three real-world datasets in a large e-commerce platform show the superiority of the proposed approach over several strong alternatives and demonstrate the ability to discover the community trend in advance.

Submitted 24 February, 2022; originally announced February 2022.

Comments: Published as a full paper at WSDM 2022

arXiv:2201.02968 [pdf, other]

An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic

Authors: Tao Niu, Yinglei Teng, Zhu Han, Panpan Zou

Abstract: Recently, the applications of deep neural network (DNN) have been very prominent in many fields such as computer vision (CV) and natural language processing (NLP) due to its superior feature extraction performance. However, the high-dimension parameter model and large-scale mathematical calculation restrict the execution efficiency, especially for Internet of Things (IoT) devices. Different from the previous cloud/edge-only pattern that brings huge pressure for uplink communication and device-only fashion that undertakes unaffordable calculation strength, we highlight the collaborative computation between the device and edge for DNN models, which can achieve a good balance between the communication load and execution accuracy. Specifically, a systematic on-demand co-inference framework is proposed to exploit the multi-branch structure, in which the pre-trained Alexnet is right-sized through \emph{early-exit} and partitioned at an intermediate DNN layer. The integer quantization is enforced to further compress transmission bits. As a result, we establish a new Deep Reinforcement Learning (DRL) optimizer-Soft Actor Critic for discrete (SAC-d), which generates the \emph{exit point}, \emph{partition point}, and \emph{compressing bits} by soft policy iterations. Based on the latency and accuracy aware reward design, such an optimizer can well adapt to the complex environment like dynamic wireless channel and arbitrary CPU processing, and is capable of supporting the 5G URLLC. Real-world experiment on Raspberry Pi 4 and PC shows the outperformance of the proposed solution.

Submitted 9 January, 2022; originally announced January 2022.

arXiv:2112.05725 [pdf, ps, other]

Beyond the Longest Letter-duplicated Subsequence Problem

Authors: Wenfeng Lai, Adiesha Liyanage, Binhai Zhu, Peng Zou

Abstract: Given a sequence $S$ of length $n$, a letter-duplicated subsequence is a subsequence of $S$ in the form of $x_1^{d_1}x_2^{d_2}\cdots x_k^{d_k}$ with $x_i\in Σ$, $x_j\neq x_{j+1}$ and $d_i\geq 2$ for all $i$ in $[k]$ and $j$ in $[k-1]$. A linear time algorithm for computing the longest letter-duplicated subsequence (LLDS) of $S$ can be easily obtained. In this paper, we focus on two variants of this problem. We first consider the constrained version when $Σ$ is unbounded, each letter appears in $S$ at least 6 times and all the letters in $Σ$ must appear in the solution. We show that the problem is NP-hard (a further twist indicates that the problem does not admit any polynomial time approximation). The reduction is from possibly the simplest version of SAT that is NP-complete, $(\leq 2,1,\leq 3)$-SAT, where each variable appears at most twice positively and exactly once negatively, and each clause contains at most three literals and some clauses must contain exactly two literals. (We hope that this technique will serve as a general tool to help us prove the NP-hardness for some more tricky sequence problems involving only one sequence -- much harder than with at least two input sequences, which we apply successfully at the end of the paper on some extra variations of the LLDS problem.) We then show that when each letter appears in $S$ at most 3 times, then the problem admits a factor $1.5-O(\frac{1}{n})$ approximation. Finally, we consider the weighted version, where the weight of a block $x_i^{d_i} (d_i\geq 2)$ could be any positive function which might not grow with $d_i$. We give a non-trivial $O(n^2)$ time dynamic programming algorithm for this version, i.e., computing an LD-subsequence of $S$ whose weight is maximized.

Submitted 4 January, 2022; v1 submitted 10 December, 2021; originally announced December 2021.

Comments: 18 pages

MSC Class: 68W01; 68W32

arXiv:2110.05020 [pdf, other]

MELONS: generating melody with long-term structure using transformers and structure graph

Authors: Yi Zou, Pei Zou, Yi Zhao, Kaixiang Zhang, Ran Zhang, Xiaorui Wang

Abstract: The creation of long melody sequences requires effective expression of coherent musical structure. However, there is no clear representation of musical structure. Recent works on music generation have suggested various approaches to deal with the structural information of music, but generating a full-song melody with clear long-term structure remains a challenge. In this paper, we propose MELONS, a melody generation framework based on a graph representation of music structure which consists of eight types of bar-level relations. MELONS adopts a multi-step generation method with transformer-based networks by factoring melody generation into two sub-problems: structure generation and structure conditional melody generation. Experimental results show that MELONS can produce structured melodies with high quality and rich contents.

Submitted 3 November, 2021; v1 submitted 11 October, 2021; originally announced October 2021.

arXiv:2109.14062 [pdf, ps, other]

Overage and Staleness Metrics for Status Update Systems

Authors: Peng Zou, Jin Zhang, Xianglin Wei, Suresh Subramaniam

Abstract: Status update systems consist of sensors that take measurements of a physical parameter and transmit them to a remote receiver. Age of Information (AoI) has been studied extensively as a metric for the freshness of information in such systems with and without an enforced hard or soft deadline. In this paper, we propose three metrics for status update systems to measure the ability of different queuing systems to meet a threshold requirement for the AoI. The {\em overage probability} is defined as the probability that the age of the most recent update packet held by the receiver is larger than the threshold. The {\em stale update probability} is the probability that an update is stale, i.e., its age has exceeded the deadline, when it is delivered to the receiver. Finally, the {\em average overage} is defined as the time average of the overage (i.e., age beyond the threshold), and is a measure of the average ``staleness'' of the update packets held by the receiver. We investigate these metrics in three typical status update queuing systems -- M/G/1/1, M/G/1/$2^*$, and M/M/1. Numerical results show the performances for these metrics under different parameter settings and different service distributions. The differences between the average overage and average AoI are also shown. Our results demonstrate that a lower bound exists for the stale update probability when the buffer size is limited. Further, we observe that the overage probability decreases and the stale update probability increases as the update arrival rate increases.

Submitted 9 October, 2021; v1 submitted 28 September, 2021; originally announced September 2021.

arXiv:2011.04166 [pdf, other]

Distant Supervision for E-commerce Query Segmentation via Attention Network

Authors: Zhao Li, Donghui Ding, Pengcheng Zou, Yu Gong, Xi Chen, Ji Zhang, Jianliang Gao, Youxi Wu, Yucong Duan

Abstract: The booming online e-commerce platforms demand highly accurate approaches to segment queries that carry the product requirements of consumers. Recent works have shown that the supervised methods, especially those based on deep learning, are attractive for achieving better performance on the problem of query segmentation. However, the lack of labeled data is still a big challenge for training a deep segmentation network, and the problem of Out-of-Vocabulary (OOV) also adversely impacts the performance of query segmentation. Different from query segmentation task in an open domain, e-commerce scenario can provide external documents that are closely related to these queries. Thus, to deal with the two challenges, we employ the idea of distant supervision and design a novel method to find contexts in external documents and extract features from these contexts. In this work, we propose a BiLSTM-CRF based model with an attention module to encode external features, such that external context information can be utilized naturally and effectively to help query segmentation. Experiments on two datasets show the effectiveness of our approach compared with several kinds of baselines.

Submitted 8 November, 2020; originally announced November 2020.

arXiv:2003.14069 [pdf, ps, other]

On Age and Value of Information in Status Update Systems

Authors: Peng Zou, Omur Ozel, Suresh Subramaniam

Abstract: Motivated by the inherent value of packets arising in many cyber-physical applications (e.g., due to precision of the information content or an alarm message), we consider status update systems with update packets carrying values as well as their generation time stamps. Once generated, a status update packet has a random initial value and a deterministic deadline after which it is not useful (ultimate staleness). In our model, value of a packet decreases in time (even after reception) starting from its generation to ultimate staleness when it vanishes. The value of information (VoI) at the receiver is additive in that the VoI is the sum of the current values of all packets held by the receiver. We investigate various queuing disciplines under potential dependence between value and service time and provide closed form expressions for average VoI at the receiver. Numerical results illustrate the average VoI for different scenarios and the contrast between average age of information (AoI) and average VoI.

Submitted 31 March, 2020; originally announced March 2020.

arXiv:2003.13577 [pdf, ps, other]

Maintaining Information Freshness in Power-Efficient Status Update Systems

Authors: Parisa Rafiee, Peng Zou, Omur Ozel, Suresh Subramaniam

Abstract: This paper is motivated by emerging edge computing systems which consist of sensor nodes that acquire and process information and then transmit status updates to an edge receiver for possible further processing. As power is a scarce resource at the sensor nodes, the system is modeled as a tandem computation-transmission queue with power-efficient computing. Jobs arrive at the computation server with rate $λ$ as a Poisson process with no available data buffer. The computation server can be in one of three states: (i) OFF: the server is turned off and no jobs are observed or processed, (ii) ON-Idle: the server is turned on but there is no job in the server, (iii) ON-Busy: the server is turned on and a job is processed in the server. These states cost zero, one and $p_c$ units of power, respectively. Under a long-term power constraint, the computation server switches from one state to another in sequence: first a deterministic $T_o$ time units in OFF state, then waiting for a job arrival in ON-Idle state and then in ON-Busy state for an independent identically distributed compute time duration. The transmission server has a single unit data buffer to save incoming packets and applies last come first serve with discarding as well as a packet deadline to discard a sitting packet for maintaining information freshness, which is measured by the Age of Information (AoI). Additionally, there is a monotonic functional relation between the mean time spent in ON-Busy state and the mean transmission time. We obtain closed-form expressions for average AoI and average peak AoI. Our numerical results illustrate various regimes of operation for best AoI performances optimized over packet deadlines with relation to power efficiency.

Submitted 30 March, 2020; originally announced March 2020.

arXiv:2002.04778 [pdf, other]

Genomic Problems Involving Copy Number Profiles: Complexity and Algorithms

Authors: Manuel Lafond, Binhai Zhu, Peng Zou

Abstract: Recently, due to the genomic sequence analysis in several types of cancer, the genomic data based on {\em copy number profiles} ({\em CNP} for short) are getting more and more popular. A CNP is a vector where each component is a non-negative integer representing the number of copies of a specific gene or segment of interest. In this paper, we present two streams of results. The first is the negative results on two open problems regarding the computational complexity of the Minimum Copy Number Generation (MCNG) problem posed by Qingge et al. in 2018. It was shown by Qingge et al. that the problem is NP-hard if the duplications are tandem and they left the open question of whether the problem remains NP-hard if arbitrary duplications are used. We answer this question affirmatively in this paper; in fact, we prove that it is NP-hard to even obtain a constant factor approximation. We also prove that the parameterized version is W[1]-hard, answering another open question by Qingge et al. The other result is positive and is based on a new (and more general) problem regarding CNP's. The \emph{Copy Number Profile Conforming (CNPC)} problem is formally defined as follows: given two CNP's $C_1$ and $C_2$, compute two strings $S_1$ and $S_2$ with $cnp(S_1)=C_1$ and $cnp(S_2)=C_2$ such that the distance between $S_1$ and $S_2$, $d(S_1,S_2)$, is minimized. Here, $d(S_1,S_2)$ is a very general term, which means it could be any genome rearrangement distance (like reversal, transposition, and tandem duplication, etc). We make the first step by showing that if $d(S_1,S_2)$ is measured by the breakpoint distance then the problem is polynomially solvable.

Submitted 11 February, 2020; originally announced February 2020.

Comments: 16 pages, 3 figures

MSC Class: 68

ACM Class: F.2.2; J.3

arXiv:1912.02692 [pdf, ps, other]

Optimizing Information Freshness Through Computation-Transmission Tradeoff and Queue Management in Edge Computing

Authors: Peng Zou, Omur Ozel, Suresh Subramaniam

Abstract: Edge computing applications typically require generated data to be preprocessed at the source and then transmitted to an edge server. In such cases, transmission time and preprocessing time are coupled, yielding a tradeoff between them to achieve the targeted objective. This paper presents analysis of such a system with the objective of optimizing freshness of received data at the edge server. We model this system as two queues in tandem whose service times are independent over time but the transmission service time is monotonically dependent on the computation service time in mean value. This dependence captures the natural decrease in transmission time due to lower offloaded computation. We analyze various queue management schemes in this tandem queue where the first queue has a single server, Poisson packet arrivals, general independent service and no extra buffer to save incoming status update packets. The second queue has a single server receiving packets from the first queue and service is memoryless. We consider the second queue in two forms: (i) No data buffer and (ii) One unit data buffer and last come first serve with discarding. We analyze various non-preemptive as well as preemptive cases. We perform stationary distribution analysis and obtain closed form expressions for average age of information (AoI) and average peak AoI. Our numerical results illustrate analytical findings on how computation and transmission times could be traded off to optimize AoI and reveal a consequent tradeoff between average AoI and average peak AoI.

Submitted 3 December, 2019; originally announced December 2019.

Comments: arXiv admin note: substantial text overlap with arXiv:1907.00928

arXiv:1907.00928 [pdf, ps, other]

Trading Off Computation with Transmission in Status Update Systems

Authors: Peng Zou, Omur Ozel, Suresh Subramaniam

Abstract: This paper is motivated by emerging edge computing applications in which generated data are pre-processed at the source and then transmitted to an edge server. In such a scenario, there is typically a tradeoff between the amount of pre-processing and the amount of data to be transmitted. We model such a system by considering two non-preemptive queues in tandem whose service times are independent over time but the transmission service time is dependent on the computation service time in mean value. The first queue is in M/GI/1/1 form with a single server, memoryless exponential arrivals, general independent service and no extra buffer to save incoming status update packets. The second queue is in GI/M/1/2* form with a single server receiving packets from the first queue, memoryless service and a single data buffer to save incoming packets. Additionally, mean service times of the first and second queues are dependent through a deterministic monotonic function. We perform stationary distribution analysis in this system and obtain closed form expressions for average age of information (AoI) and average peak AoI. Our numerical results illustrate the analytical findings and highlight the tradeoff between average AoI and average peak AoI generated by the tandem nature of the queueing system with dependent service times.

Submitted 1 July, 2019; originally announced July 2019.

arXiv:1906.05266 [pdf, ps, other]

The Tandem Duplication Distance is NP-hard

Authors: Manuel Lafond, Binhai Zhu, Peng Zou

Abstract: In computational biology, tandem duplication is an important biological phenomenon which can occur either at the genome or at the DNA level. A tandem duplication takes a copy of a genome segment and inserts it right after the segment; this can be represented as the string operation $AXB \Rightarrow AXXB$. For example, tandem exon duplications have been found in many species such as human, fly or worm, and have been widely studied in computational biology. The Tandem Duplication (TD) distance problem we investigate in this paper is defined as follows: given two strings $S$ and $T$ over the same alphabet, compute the smallest sequence of tandem duplications required to convert $S$ to $T$. The natural question of whether the TD distance can be computed in polynomial time was posed in 2004 by Leupold et al. and had remained open, despite the fact that tandem duplications have received much attention ever since. In this paper, we prove that this problem is NP-hard. We further show that this hardness holds even if all characters of $S$ are distinct. This is known as the exemplar TD distance, which is of special relevance in bioinformatics. One of the tools we develop for the reduction is a new problem called the Cost-Effective Subgraph, for which we obtain W[1]-hardness results that might be of independent interest. We finally show that computing the exemplar TD distance between $S$ and $T$ is fixed-parameter tractable. Our results open the door to many other questions, and we conclude with several open problems.

Submitted 12 June, 2019; originally announced June 2019.
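
The string operation $AXB \Rightarrow AXXB$ and the TD distance defined in the abstract above can be illustrated with a minimal Python sketch. The function names `tandem_duplicate` and `td_distance` are hypothetical and do not come from the paper, and the breadth-first search is only a brute-force illustration for very short strings, not the authors' method (the abstract's point is that the general problem is NP-hard).

```python
from collections import deque
from typing import Optional


def tandem_duplicate(s: str, start: int, end: int) -> str:
    """Apply AXB => AXXB, where X = s[start:end] is a non-empty segment of s."""
    if not 0 <= start < end <= len(s):
        raise ValueError("segment must be a non-empty slice of s")
    return s[:end] + s[start:end] + s[end:]


def td_distance(source: str, target: str) -> Optional[int]:
    """Brute-force TD distance via breadth-first search (tiny inputs only).

    Returns the minimum number of tandem duplications turning source into
    target, or None if target cannot be reached.
    """
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        current, steps = queue.popleft()
        if current == target:
            return steps
        for start in range(len(current)):
            for end in range(start + 1, len(current) + 1):
                candidate = tandem_duplicate(current, start, end)
                # A duplication always lengthens the string, so anything
                # longer than the target can never become the target.
                if len(candidate) <= len(target) and candidate not in seen:
                    seen.add(candidate)
                    queue.append((candidate, steps + 1))
    return None
```

For instance, `tandem_duplicate("abc", 1, 2)` duplicates the segment `b`, and `td_distance("a", "aaaa")` needs two duplications (`a` to `aa`, then `aa` to `aaaa`). The search space grows quickly with the target length, which is consistent with the hardness result reported in the abstract.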

arXiv:1904.01735 [pdf, other]

Multi-Modal Generative Adversarial Network for Short Product Title Generation in Mobile E-Commerce

Authors: Jian-Guo Zhang, Pengcheng Zou, Zhao Li, Yao Wan, Xiuming Pan, Yu Gong, Philip S. Yu

Abstract: Nowadays, more and more customers prefer to browse and purchase products using mobile E-Commerce apps such as Taobao and Amazon. Since merchants are usually inclined to write redundant and over-informative product titles to attract customers' attention, it is important to concisely display short product titles on the limited screens of mobile phones. To address this discrepancy, previous studies mainly consider the textual information of long product titles and lack a human-like view during the training and evaluation process. In this paper, we propose a Multi-Modal Generative Adversarial Network (MM-GAN) for short product title generation in E-Commerce, which innovatively incorporates image information and attribute tags from the product, as well as textual information from the original long titles. MM-GAN poses short title generation as a reinforcement learning process, where the generated titles are evaluated by the discriminator in a human-like view. Extensive experiments on a large-scale E-Commerce dataset demonstrate that our algorithm outperforms other state-of-the-art methods. Moreover, we deploy our model into a real-world online E-Commerce environment and effectively boost the click-through rate and click conversion rate by 1.66% and 1.87%, respectively.

Submitted 2 April, 2019; originally announced April 2019.

Comments: Accepted by NAACL-HLT 2019. arXiv admin note: substantial text overlap with arXiv:1811.04498

arXiv:1901.05428 [pdf, ps, other]

Relative Age of Information: A New Metric for Status Update Systems

Authors: Peng Zou, Omur Ozel, Suresh Subramaniam

Abstract: In this paper, we introduce a new data freshness metric, relative Age of Information (rAoI), and examine it in a single server system with various packet management schemes. The (classical) AoI metric was introduced to measure the staleness of status updates at the receiving end with respect to their generation at the source. This metric addresses systems where the timings of update generation at the source are absolute and can be designed separately or jointly with the transmission schedules. In many decentralized applications, transmission schedules are blind to update generation timing, and the transmitter can know the timing of an update packet only after it arrives. As such, an update becomes stale after a new one arrives. The rAoI metric measures how fresh the data is at the receiver with respect to the data at the transmitter. It introduces a particularly explicit dependence on the arrival process in the evaluation of age. We investigate several queuing disciplines and provide closed form expressions for rAoI and numerical comparisons.

Submitted 1 July, 2019; v1 submitted 16 January, 2019; originally announced January 2019.

arXiv:1901.02873 [pdf, ps, other]

Waiting before Serving: A Companion to Packet Management in Status Update Systems

Authors: Peng Zou, Omur Ozel, Suresh Subramaniam

Abstract: In this paper, we explore the potential of server waiting before packet transmission in improving the Age of Information (AoI) in status update systems. We consider a non-preemptive queue with Poisson arrivals and independent general service distribution and we incorporate waiting before serving in two packet management schemes: M/GI/1/1 and M/GI/1/$2^*$. In the M/GI/1/1 scheme, the server waits for a deterministic time immediately after a packet enters the server. In the M/GI/1/$2^*$ scheme, depending on idle or busy system state, the server waits for a deterministic time before starting service of the packet. In both cases, if a potential newer arrival is captured, the existing packet is discarded. Different from most existing works, we analyze AoI evolution by indexing the incoming packets, which is enabled by an alternative method of partitioning the area under the evolution of instantaneous AoI to calculate its time average. We obtain expressions for average and average peak AoI for both queueing disciplines with waiting. Our numerical results demonstrate that waiting before service can bring significant improvement in average age, particularly for heavy-tailed service distributions. This improvement comes at the expense of an increase in average peak AoI. We highlight the trade-off between average and average peak AoI generated by waiting before serving.

Submitted 22 April, 2019; v1 submitted 9 January, 2019; originally announced January 2019.

arXiv:1811.04498 [pdf, other]

Product Title Refinement via Multi-Modal Generative Adversarial Learning

Authors: Jianguo Zhang, Pengcheng Zou, Zhao Li, Yao Wan, Ye Liu, Xiuming Pan, Yu Gong, Philip S. Yu

Abstract: Nowadays, an increasing number of customers are in favor of using E-commerce apps to browse and purchase products. Since merchants are usually inclined to employ redundant and over-informative product titles to attract customers' attention, it is of great importance to concisely display short product titles on the limited screens of cell phones. Previous researchers mainly consider the textual information of long product titles and lack a human-like view during the training and evaluation procedures. In this paper, we propose a Multi-Modal Generative Adversarial Network (MM-GAN) for short product title generation, which innovatively incorporates image information, attribute tags from the product and the textual information from the original long titles. MM-GAN treats short title generation as a reinforcement learning process, where the generated titles are evaluated by the discriminator in a human-like view.

Submitted 11 November, 2018; originally announced November 2018.

Comments: Workshop on Visually Grounded Interaction and Language, NIPS, 2018

arXiv:1708.09532 [pdf]

doi 10.1016/j.physa.2018.08.053

Leveraging local h-index to identify and rank influential spreaders in networks

Authors: Qiang Liu, Yuxiao Zhu, Yan Jia, Lu Deng, Bin Zhou, Junxing Zhu, Peng Zou

Abstract: Identifying influential nodes in complex networks has received increasing attention for its great theoretical and practical applications in many fields. Traditional methods, such as degree centrality, betweenness centrality, closeness centrality, and coreness centrality, each have disadvantages in detecting influential nodes, as illustrated in the related literature. Recently, the h-index, which is utilized to measure both the productivity and citation impact of the publications of a scientist or scholar, has been introduced to the network world to evaluate a node's spreading ability. However, this method assigns too many nodes the same value, which leads to a resolution limit problem in distinguishing the real influence of these nodes. In this paper, we propose a local h-index centrality (LH-index) method for identifying and ranking influential nodes in networks. The LH-index method simultaneously takes into account the h-index values of the node itself and its neighbors, based on the idea that a node connected to more influential nodes will also be influential. According to the simulation results with the stochastic Susceptible-Infected-Recovered (SIR) model in four real-world networks and several simulated networks, we demonstrate the effectiveness of the LH-index method in identifying influential nodes in networks.

Submitted 15 September, 2017; v1 submitted 30 August, 2017; originally announced August 2017.

Comments: 15 pages, 6 figures

Journal ref: Q. Liu, Physica A (2018) 379-391
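
To make the idea in the abstract above concrete, here is a rough Python sketch of an h-index computed on a network. It assumes the commonly used definition in which a node's h-index is the largest h such that the node has at least h neighbors of degree at least h. The abstract does not give the LH-index formula, so `local_h_index` below simply adds a node's own h-index to the h-indices of its neighbors; that sum, the function names, and the toy graph are all assumptions for illustration and may differ from the paper's definition.

```python
def h_index(values):
    """Largest h such that at least h of the given values are >= h."""
    h = 0
    for rank, value in enumerate(sorted(values, reverse=True), start=1):
        if value >= rank:
            h = rank
        else:
            break
    return h


def node_h_index(graph, node):
    """h-index of a node, computed over the degrees of its neighbors."""
    return h_index(len(graph[neighbor]) for neighbor in graph[node])


def local_h_index(graph, node):
    """Assumed combination: the node's own h-index plus its neighbors' h-indices."""
    own = node_h_index(graph, node)
    neighbors = sum(node_h_index(graph, neighbor) for neighbor in graph[node])
    return own + neighbors


# Toy undirected graph as an adjacency mapping (hypothetical data).
toy_graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a", "e"},
    "e": {"d"},
}

# Rank nodes from most to least influential under the assumed score.
ranking = sorted(toy_graph, key=lambda n: local_h_index(toy_graph, n), reverse=True)
```

In this toy graph, nodes `b` and `c` both have a plain h-index of 2, the same as node `a`, which mirrors the resolution limit the abstract mentions; adding the neighbors' values separates `a` from them.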

Showing 1–39 of 39 results for author: Zou, P