Search | arXiv e-print repository

Multi-Agent Deep Reinforcement Learning for Energy Efficient Multi-Hop STAR-RIS-Assisted Transmissions

Authors: Pei-Hsiang Liao, Li-Hsiang Shen, Po-Chen Wu, Kai-Ten Feng

Abstract: Simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) provides a promising way to expand coverage in wireless communications. However, the limitations of a single STAR-RIS inspire us to integrate the concept of multi-hop transmissions, as focused on RIS in existing research. Therefore, we propose the novel architecture of multi-hop STAR-RISs to achieve a wider range of full-plane service coverage. In this paper, we intend to solve active beamforming of the base station and passive beamforming of STAR-RISs, aiming to maximize the energy efficiency constrained by the hardware limitations of STAR-RISs. Furthermore, we investigate the impact of the on-off state of STAR-RIS elements on energy efficiency. To tackle the complex problem, a Multi-Agent Global and locAl deep Reinforcement learning (MAGAR) algorithm is designed. The global agent elevates the collaboration among local agents, which focus on individual learning. In numerical results, we observe the significant improvement of MAGAR compared to the other benchmarks, including Q-learning, multi-agent deep Q network (DQN) with global reward, and multi-agent DQN with local rewards. Moreover, the proposed architecture of multi-hop STAR-RISs achieves the highest energy efficiency compared to mode switching based STAR-RISs, conventional RISs and deployment without RISs or STAR-RISs.

Submitted 26 July, 2024; originally announced July 2024.

Comments: Accepted by Proc. IEEE VTC-fall

arXiv:2407.05869 [pdf, other]

PORCA: Root Cause Analysis with Partially Observed Data

Authors: Chang Gong, Di Yao, Jin Wang, Wenbin Li, Lanting Fang, Yongtao Xie, Kaiyu Feng, Peng Han, Jingping Bi

Abstract: Root Cause Analysis (RCA) aims at identifying the underlying causes of system faults by uncovering and analyzing the causal structure from complex systems. It has been widely used in many application domains. Reliable diagnostic conclusions are of great importance in mitigating system failures and financial losses. However, previous studies implicitly assume a full observation of the system, which neglects the effect of partial observation (i.e., missing nodes and latent malfunction). As a result, they fail to derive reliable RCA results. In this paper, we unveil the issues of unobserved confounders and heterogeneity in partial observation and come up with a new problem of root cause analysis with partially observed data. To achieve this, we propose PORCA, a novel RCA framework which can explore reliable root causes under both unobserved confounders and unobserved heterogeneity. PORCA leverages magnified score-based causal discovery to efficiently optimize an acyclic directed mixed graph under unobserved confounders. In addition, we also develop a heterogeneity-aware scheduling strategy to provide adaptive sample weights. Extensive experimental results on one synthetic and two real-world datasets demonstrate the effectiveness and superiority of the proposed framework.

Submitted 11 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

arXiv:2406.09098 [pdf, other]

SciKnowEval: Evaluating Multi-level Scientific Knowledge of Large Language Models

Authors: Kehua Feng, Keyan Ding, Weijie Wang, Xiang Zhuang, Zeyuan Wang, Ming Qin, Yu Zhao, Jianhua Yao, Qiang Zhang, Huajun Chen

Abstract: The burgeoning utilization of Large Language Models (LLMs) in scientific research necessitates advanced benchmarks capable of evaluating their understanding and application of scientific knowledge comprehensively. To address this need, we introduce the SciKnowEval benchmark, a novel framework that systematically evaluates LLMs across five progressive levels of scientific knowledge: studying extensively, inquiring earnestly, thinking profoundly, discerning clearly, and practicing assiduously. These levels aim to assess the breadth and depth of scientific knowledge in LLMs, including knowledge coverage, inquiry and exploration capabilities, reflection and reasoning abilities, ethic and safety considerations, as well as practice proficiency. Specifically, we take biology and chemistry as the two instances of SciKnowEval and construct a dataset encompassing 50K multi-level scientific problems and solutions. By leveraging this dataset, we benchmark 20 leading open-source and proprietary LLMs using zero-shot and few-shot prompting strategies. The results reveal that despite achieving state-of-the-art performance, the proprietary LLMs still have considerable room for improvement, particularly in addressing scientific computations and applications. We anticipate that SciKnowEval will establish a comprehensive standard for benchmarking LLMs in science research and discovery, and promote the development of LLMs that integrate scientific knowledge with strong safety awareness. The dataset and code are publicly available at https://github.com/hicai-zju/sciknoweval .

Submitted 13 June, 2024; originally announced June 2024.

Comments: 48 pages, 2 figures

arXiv:2406.07786 [pdf, ps, other]

doi 10.1364/OE.496966

Field Test of Quantum Key Distribution with High Key Creation Efficiency

Authors: Yung-Cheng Kao, Sheng-Hsuan Huang, Chin-Hsuan Chang, Chih-Hsiang Wu, Shih-Hsien Chu, Jian Jiang, An-Chi Zhang, Sheng-Yao Huang, Jhih-Heng Yan, Kai-Ming Feng, Chih-Sung Chuu

Abstract: Quantum key distribution (QKD) promises unconditional security for communication. However, the random choices of the measurement basis in QKD usually result in low key creation efficiency. This drawback is overcome in the differential-phase-shift QKD, provided that each photon can be prepared in a large number of time bins with a proper waveform. In this work we develop a miniature 1550-nm single-photon source to generate narrowband single photons in 50 time bins with a nearly optimal waveform for achieving unity key creation efficiency. By utilizing these single photons in the field test, we demonstrate the differential-phase-shift QKD with a key creation efficiency of 97%. Our work shows that practical QKD can benefit from narrowband single photons with controllable waveforms.

Submitted 11 June, 2024; originally announced June 2024.

Comments: 9 pages, 4 figures

Journal ref: Opt. Express 31, 30239-30247 (2023)

arXiv:2405.19062 [pdf, other]

SIG: Efficient Self-Interpretable Graph Neural Network for Continuous-time Dynamic Graphs

Authors: Lanting Fang, Yulian Yang, Kai Wang, Shanshan Feng, Kaiyu Feng, Jie Gui, Shuliang Wang, Yew-Soon Ong

Abstract: While dynamic graph neural networks have shown promise in various applications, explaining their predictions on continuous-time dynamic graphs (CTDGs) is difficult. This paper investigates a new research task: self-interpretable GNNs for CTDGs. We aim to predict future links within the dynamic graph while simultaneously providing causal explanations for these predictions. There are two key challenges: (1) capturing the underlying structural and temporal information that remains consistent across both independent and identically distributed (IID) and out-of-distribution (OOD) data, and (2) efficiently generating high-quality link prediction results and explanations. To tackle these challenges, we propose a novel causal inference model, namely the Independent and Confounded Causal Model (ICCM). ICCM is then integrated into a deep learning architecture that considers both effectiveness and efficiency. Extensive experiments demonstrate that our proposed model significantly outperforms existing methods across link prediction accuracy, explanation quality, and robustness to shortcut features. Our code and datasets are anonymously released at https://github.com/2024SIG/SIG.

Submitted 29 May, 2024; originally announced May 2024.

Comments: 19 pages

arXiv:2405.16064 [pdf, other]

Keypoint-based Progressive Chain-of-Thought Distillation for LLMs

Authors: Kaituo Feng, Changsheng Li, Xiaolu Zhang, Jun Zhou, Ye Yuan, Guoren Wang

Abstract: Chain-of-thought distillation is a powerful technique for transferring reasoning abilities from large language models (LLMs) to smaller student models. Previous methods typically require the student to mimic the step-by-step rationale produced by LLMs, often facing the following challenges: (i) Tokens within a rationale vary in significance, and treating them equally may fail to accurately mimic keypoint tokens, leading to reasoning errors. (ii) They usually distill knowledge by consistently predicting all the steps in a rationale, which falls short in distinguishing the learning order of step generation. This diverges from the human cognitive progression of starting with easy tasks and advancing to harder ones, resulting in sub-optimal outcomes. To this end, we propose a unified framework, called KPOD, to address these issues. Specifically, we propose a token weighting module utilizing mask learning to encourage accurate mimicry of keypoint tokens by the student during distillation. Besides, we develop an in-rationale progressive distillation strategy, starting with training the student to generate the final reasoning steps and gradually extending to cover the entire rationale. To accomplish this, a weighted token generation loss is proposed to assess step reasoning difficulty, and a value function is devised to schedule the progressive distillation by considering both step difficulty and question diversity. Extensive experiments on four reasoning benchmarks illustrate that KPOD outperforms previous methods by a large margin.

Submitted 25 May, 2024; originally announced May 2024.

Comments: Accepted by ICML 2024

arXiv:2405.13300

FAITH: Frequency-domain Attention In Two Horizons for Time Series Forecasting

Authors: Ruiqi Li, Maowei Jiang, Kai Wang, Kaiduo Feng, Quangao Liu, Yue Sun, Xiufang Zhou

Abstract: Time Series Forecasting plays a crucial role in various fields such as industrial equipment maintenance, meteorology, energy consumption, traffic flow and financial investment. However, despite their considerable advantages over traditional statistical approaches, current deep learning-based predictive models often exhibit a significant deviation between their forecasting outcomes and the ground truth. This discrepancy is largely due to an insufficient emphasis on extracting the sequence's latent information, particularly its global information within the frequency domain and the relationship between different variables. To address this issue, we propose a novel model Frequency-domain Attention In Two Horizons, which decomposes time series into trend and seasonal components using a multi-scale sequence adaptive decomposition and fusion architecture, and processes them separately. FAITH utilizes Frequency Channel feature Extraction Module and Frequency Temporal feature Extraction Module to capture inter-channel relationships and temporal global information in the sequence, significantly improving its ability to handle long-term dependencies and complex patterns. Furthermore, FAITH achieves theoretically linear complexity by modifying the time-frequency domain transformation method, effectively reducing computational costs. Extensive experiments on 6 benchmarks for long-term forecasting and 3 benchmarks for short-term forecasting demonstrate that FAITH outperforms existing models in many fields, such as electricity, weather and traffic, proving its effectiveness and superiority both in long-term and short-term time series forecasting tasks. Our codes and data are available at https://github.com/LRQ577/FAITH.

Submitted 1 July, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

Comments: We think there are some errors in the experiment result, it may lead to a wrong conclusion. So we think it will be responsible to withdraw it

arXiv:2405.11132 [pdf, ps, other]

Quadratic twists of tiling number elliptic curves

Authors: Keqin Feng, Qiuyue Liu, Jinzhao Pan, Ye Tian

Abstract: A positive integer $n$ is called a tiling number if the equilateral triangle can be dissected into $nk^2$ congruent triangles for some integer $k$. An integer $n>3$ is a tiling number if and only if at least one of the elliptic curves $E^{(\pm n)}:\pm ny^2=x(x-1)(x+3)$ has positive Mordell-Weil rank. Let $A$ denote one of the two curves. In this paper, using the Waldspurger formula and an induction method, for $n\equiv 3,7\mod 24$ positive square-free, as well as some other residue classes, we express the parity of the analytic Sha of $A$ in terms of the genus number $g(m):=\#2\mathrm{Cl}(\mathbb{Q}(\sqrt{-m}))$ as $m$ runs over factors of $n$. Together with the $2$-descent method, which expresses $\mathrm{dim}_{\mathbb{F}_2}\mathrm{Sel}_2(A/\mathbb{Q})/A[2]$ in terms of the corank of a matrix of $\mathbb{F}_2$-coefficients, we show that for $n\equiv 3,7\mod 24$ positive square-free, the analytic Sha of $A$ being odd is equivalent to $\mathrm{Sel}_2(A/\mathbb{Q})/A[2]$ being trivial, as predicted by the BSD conjecture. We also show that, among the residue classes $3$, resp. $7\mod 24$, the subset of $n$ such that both of $E^{(n)}$ and $E^{(-n)}$ have analytic Sha odd is of limit density $0.288\cdots$ and $0.144\cdots$, respectively; in particular, they are non-tiling numbers. This exhibits two new phenomena on tiling number elliptic curves: firstly, the limit density is different from the general phenomenon on elliptic curves predicted by Bhargava-Kane-Lenstra-Poonen-Rains; secondly, the joint distribution has different behavior among different residue classes.

Submitted 17 May, 2024; originally announced May 2024.

Comments: 25 pages

MSC Class: 11G05 (Primary) 11G40 (Secondary)

arXiv:2405.07626 [pdf, other]

AnomalyLLM: Few-shot Anomaly Edge Detection for Dynamic Graphs using Large Language Models

Authors: Shuo Liu, Di Yao, Lanting Fang, Zhetao Li, Wenbin Li, Kaiyu Feng, XiaoWen Ji, Jingping Bi

Abstract: Detecting anomaly edges for dynamic graphs aims to identify edges significantly deviating from the normal pattern and can be applied in various domains, such as cybersecurity, financial transactions and AIOps. Over time, new types of anomaly edges keep emerging, and the labeled anomaly samples are few for each type. Current methods are either designed to detect randomly inserted edges or require sufficient labeled data for model training, which harms their applicability for real-world applications. In this paper, we study this problem by cooperating with the rich knowledge encoded in large language models (LLMs) and propose a method, namely AnomalyLLM. To align the dynamic graph with LLMs, AnomalyLLM pre-trains a dynamic-aware encoder to generate the representations of edges and reprograms the edges using the prototypes of word embeddings. Along with the encoder, we design an in-context learning framework that integrates the information of a few labeled samples to achieve few-shot anomaly detection. Experiments on four datasets reveal that AnomalyLLM can not only significantly improve the performance of few-shot anomaly detection, but also achieve superior results on new anomalies without any update of model parameters.

Submitted 28 August, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

Comments: 13 pages

arXiv:2405.04867 [pdf, other]

MIPI 2024 Challenge on Demosaic for HybridEVS Camera: Methods and Results

Authors: Yaqi Wu, Zhihao Fan, Xiaofeng Chu, Jimmy S. Ren, Xiaoming Li, Zongsheng Yue, Chongyi Li, Shangcheng Zhou, Ruicheng Feng, Yuekun Dai, Peiqing Yang, Chen Change Loy, Senyan Xu, Zhijing Sun, Jiaying Zhu, Yurui Zhu, Xueyang Fu, Zheng-Jun Zha, Jun Cao, Cheng Li, Shu Chen, Liang Ma, Shiyang Zhou, Haijin Zeng, Kai Feng, et al. (24 additional authors not shown)

Abstract: The increasing demand for computational photography and imaging on mobile platforms has led to the widespread development and integration of advanced image sensors with novel algorithms in camera systems. However, the scarcity of high-quality data for research and the rare opportunity for in-depth exchange of views from industry and academia constrain the development of mobile intelligent photography and imaging (MIPI). Building on the achievements of the previous MIPI Workshops held at ECCV 2022 and CVPR 2023, we introduce our third MIPI challenge including three tracks focusing on novel image sensors and imaging algorithms. In this paper, we summarize and review the Nighttime Flare Removal track on MIPI 2024. In total, 170 participants were successfully registered, and 14 teams submitted results in the final testing phase. The developed solutions in this challenge achieved state-of-the-art performance on Nighttime Flare Removal. More details of this challenge and the link to the dataset can be found at https://mipi-challenge.org/MIPI2024/.

Submitted 8 May, 2024; originally announced May 2024.

Comments: MIPI@CVPR2024. Website: https://mipi-challenge.org/MIPI2024/

arXiv:2404.18440 [pdf, other]

Potential Paradigm Shift in Hazard Risk Management: AI-Based Weather Forecast for Tropical Cyclone Hazards

Authors: Kairui Feng, Dazhi Xi, Wei Ma, Cao Wang, Yuanlong Li, Xuanhong Chen

Abstract: The advent of Artificial Intelligence (AI)-driven models marks a paradigm shift in risk management strategies for meteorological hazards. This study specifically employs tropical cyclones (TCs) as a focal example. We engineer a perturbation-based method to produce ensemble forecasts using the advanced Pangu AI weather model. Unlike traditional approaches that often generate fewer than 20 scenarios from Weather Research and Forecasting (WRF) simulations for one event, our method exploits the speed of AI-driven models to create thousands of scenarios. We offer open-source access to our model and evaluate its effectiveness through retrospective case studies of significant TC events: Hurricane Irma (2017), Typhoon Mangkhut (2018), and TC Debbie (2017), affecting regions across North America, East Asia, and Australia. Our findings indicate that the AI-generated ensemble forecasts align closely with the European Centre for Medium-Range Weather Forecasts (ECMWF) ensemble predictions up to seven days prior to landfall. This approach could substantially enhance the effectiveness of weather forecast-driven risk analysis and management, providing unprecedented operational speed, user-friendliness, and global applicability.

Submitted 29 April, 2024; originally announced April 2024.

arXiv:2404.08008 [pdf, other]

Sample-Efficient Human Evaluation of Large Language Models via Maximum Discrepancy Competition

Authors: Kehua Feng, Keyan Ding, Kede Ma, Zhihua Wang, Qiang Zhang, Huajun Chen

Abstract: The past years have witnessed a proliferation of large language models (LLMs). Yet, automated and unbiased evaluation of LLMs is challenging due to the inaccuracy of standard metrics in reflecting human preferences and the inefficiency in sampling informative and diverse test examples. While human evaluation remains the gold standard, it is expensive and time-consuming, especially when dealing with a large number of testing samples. To address this problem, we propose a sample-efficient human evaluation method based on MAximum Discrepancy (MAD) competition. MAD automatically selects a small set of informative and diverse instructions, each adapted to two LLMs, whose responses are subject to three-alternative forced choice by human subjects. The pairwise comparison results are then aggregated into a global ranking using the Elo rating system. We select eight representative LLMs and compare them in terms of four skills: knowledge understanding, mathematical reasoning, writing, and coding. Experimental results show that the proposed method achieves a reliable and sensible ranking of LLMs' capabilities, identifies their relative strengths and weaknesses, and offers valuable insights for further LLM advancement.

Submitted 9 April, 2024; originally announced April 2024.

Comments: 32 pages, 6 figures

arXiv:2404.05952 [pdf, other]

Robot Safe Planning In Dynamic Environments Based On Model Predictive Control Using Control Barrier Function

Authors: Zetao Lu, Kaijun Feng, Jun Xu, Haoyao Chen, Yunjiang Lou

Abstract: Implementing obstacle avoidance in dynamic environments is a challenging problem for robots. Model predictive control (MPC) is a popular strategy for dealing with this type of problem, and recent work mainly uses control barrier functions (CBFs) as hard constraints to ensure that the system state remains in the safe set. However, in crowded scenarios, effective solutions may not be obtained due to infeasibility problems, resulting in degraded controller performance. We propose a new MPC framework that integrates CBF to tackle the issue of obstacle avoidance in dynamic environments, in which the infeasibility problem induced by hard constraints operating over the whole prediction horizon is solved by softening the constraints and introducing an exact penalty, prompting the robot to actively seek out new paths. At the same time, generalized CBF is extended as a single-step safety constraint of the controller to enhance the safety of the robot during navigation. The efficacy of the proposed method is first shown through simulation experiments, in which a double-integrator system and a unicycle system are employed, and the proposed method outperforms other controllers in terms of safety, feasibility, and navigation efficiency. Furthermore, a real-world experiment on an MR1000 robot is implemented to demonstrate the effectiveness of the proposed method.

Submitted 8 April, 2024; originally announced April 2024.

arXiv:2404.05268 [pdf, other]

MC$^2$: Multi-concept Guidance for Customized Multi-concept Generation

Authors: Jiaxiu Jiang, Yabo Zhang, Kailai Feng, Xiaohe Wu, Wangmeng Zuo

Abstract: Customized text-to-image generation aims to synthesize instantiations of user-specified concepts and has achieved unprecedented progress in handling individual concepts. However, when extending to multiple customized concepts, existing methods exhibit limitations in terms of flexibility and fidelity, only accommodating the combination of limited types of models and potentially resulting in a mix of characteristics from different concepts. In this paper, we introduce the Multi-concept guidance for Multi-concept customization, termed MC$^2$, for improved flexibility and fidelity. MC$^2$ decouples the requirements for model architecture via inference time optimization, allowing the integration of various heterogeneous single-concept customized models. It adaptively refines the attention weights between visual and textual tokens, directing image regions to focus on their associated words while diminishing the impact of irrelevant ones. Extensive experiments demonstrate that MC$^2$ even surpasses previous methods that require additional training in terms of consistency with input prompt and reference images. Moreover, MC$^2$ can be extended to elevate the compositional capabilities of text-to-image generation, yielding appealing results. Code will be publicly available at https://github.com/JIANGJiaXiu/MC-2.

Submitted 12 April, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

arXiv:2403.18248 [pdf, other]

Statistical Inference of Optimal Allocations I: Regularities and their Implications

Authors: Kai Feng, Han Hong

Abstract: In this paper, we develop a functional differentiability approach for solving statistical optimal allocation problems. We first derive Hadamard differentiability of the value function through a detailed analysis of the general properties of the sorting operator. Central to our framework are the concept of Hausdorff measure and the area and coarea integration formulas from geometric measure theory. Building on our Hadamard differentiability results, we demonstrate how the functional delta method can be used to directly derive the asymptotic properties of the value function process for binary constrained optimal allocation problems, as well as the two-step ROC curve estimator. Moreover, leveraging profound insights from geometric functional analysis on convex and local Lipschitz functionals, we obtain additional generic Fréchet differentiability results for the value functions of optimal allocation problems. These compelling findings motivate us to study carefully the first order approximation of the optimal social welfare. In this paper, we then present a double / debiased estimator for the value functions. Importantly, the conditions outlined in the Hadamard differentiability section validate the margin assumption from the statistical classification literature employing plug-in methods that justifies a faster convergence rate.

Submitted 7 April, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

arXiv:2403.16427 [pdf, other]

Re2LLM: Reflective Reinforcement Large Language Model for Session-based Recommendation

Authors: Ziyan Wang, Yingpeng Du, Zhu Sun, Haoyan Chua, Kaidong Feng, Wenya Wang, Jie Zhang

Abstract: Large Language Models (LLMs) are emerging as promising approaches to enhance session-based recommendation (SBR), where both prompt-based and fine-tuning-based methods have been widely investigated to align LLMs with SBR. However, the former methods struggle with optimal prompts to elicit the correct reasoning of LLMs due to the lack of task-specific feedback, leading to unsatisfactory recommendations. Although the latter methods attempt to fine-tune LLMs with domain-specific knowledge, they face limitations such as high computational costs and reliance on open-source backbones. To address such issues, we propose a Reflective Reinforcement Large Language Model (Re2LLM) for SBR, guiding LLMs to focus on specialized knowledge essential for more accurate recommendations effectively and efficiently. In particular, we first design the Reflective Exploration Module to effectively extract knowledge that is readily understandable and digestible by LLMs. To be specific, we direct LLMs to examine recommendation errors through self-reflection and construct a knowledge base (KB) comprising hints capable of rectifying these errors. To efficiently elicit the correct reasoning of LLMs, we further devise the Reinforcement Utilization Module to train a lightweight retrieval agent. It learns to select hints from the constructed KB based on the task-specific feedback, where the hints can serve as guidance to help correct LLMs' reasoning for better recommendations. Extensive experiments on multiple real-world datasets demonstrate that our method consistently outperforms state-of-the-art methods.

Submitted 19 April, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

Comments: 11 pages, 4 figures

arXiv:2403.01238 [pdf, other]

On the Road to Portability: Compressing End-to-End Motion Planner for Autonomous Driving

Authors: Kaituo Feng, Changsheng Li, Dongchun Ren, Ye Yuan, Guoren Wang

Abstract: End-to-end motion planning models equipped with deep neural networks have shown great potential for enabling full autonomous driving. However, the oversized neural networks render them impractical for deployment on resource-constrained systems, as they unavoidably require more computational time and resources during inference. To handle this, knowledge distillation offers a promising approach that compresses models by enabling a smaller student model to learn from a larger teacher model. Nevertheless, how to apply knowledge distillation to compress motion planners has not been explored so far. In this paper, we propose PlanKD, the first knowledge distillation framework tailored for compressing end-to-end motion planners. First, driving scenes are inherently complex and often contain planning-irrelevant or even noisy information, and transferring such information is not beneficial for the student planner. Thus, we design an information bottleneck based strategy to only distill planning-relevant information, rather than transfer all information indiscriminately. Second, different waypoints in an output planned trajectory may hold varying degrees of importance for motion planning, where a slight deviation in certain crucial waypoints might lead to a collision. Therefore, we devise a safety-aware waypoint-attentive distillation module that assigns adaptive weights to different waypoints based on their importance, to encourage the student to accurately mimic more crucial waypoints, thereby improving overall safety. Experiments demonstrate that our PlanKD can boost the performance of smaller planners by a large margin, and significantly reduce their inference time.

Submitted 15 April, 2024; v1 submitted 2 March, 2024; originally announced March 2024.

Comments: Accepted by CVPR 2024

arXiv:2402.13011 [pdf, other]

doi 10.1016/j.amc.2024.128618

An evolutionary game with reputation-based imitation-mutation dynamics

Authors: Kehuan Feng, Songlin Han, Minyu Feng, Attila Szolnoki

Abstract: Reputation plays a crucial role in social interactions by affecting the fitness of individuals during an evolutionary process. Previous works have extensively studied the result of imitation dynamics without focusing on potential irrational choices in strategy updates. We now fill this gap and explore the consequence of this kind of randomness, which one may interpret as autonomous thinking. In particular, we study how this extended dynamics alters the evolution of cooperation when individual reputation is directly linked to collected payoff, hence providing a general fitness function. For a broadly valid conclusion, our spatial populations cover different types of interaction topologies, including lattices, small-world and scale-free graphs. By means of intensive simulations we can detect a substantial increase in cooperation level that shows a reasonable stability in the presence of a notable strategy mutation.

Submitted 20 February, 2024; originally announced February 2024.

Comments: 13 pages, 8 figures, to be published in Applied Mathematics and Computation

Journal ref: Appl. Math. Comput. 472 (2024) 128618

arXiv:2402.02963 [pdf, other]

One-class anomaly detection through color-to-thermal AI for building envelope inspection

Authors: Polina Kurtser, Kailun Feng, Thomas Olofsson, Aitor De Andres

Abstract: We present a label-free method for detecting anomalies during thermographic inspection of building envelopes. It is based on the AI-driven prediction of thermal distributions from color images. Effectively, the method performs as a one-class classifier of the thermal image regions with high mismatch between the predicted and actual thermal distributions. The algorithm can learn to identify certain features as normal or anomalous by selecting the target sample used for training. We demonstrated this principle by training the algorithm with data collected at different outdoor temperatures, which led to the detection of thermal bridges. The method can be implemented to assist human professionals during routine building inspections or combined with mobile platforms for automating examination of large areas.

Submitted 5 February, 2024; originally announced February 2024.

arXiv:2402.01864 [pdf, other]

(A)I Am Not a Lawyer, But...: Engaging Legal Experts towards Responsible LLM Policies for Legal Advice

Authors: Inyoung Cheong, King Xia, K. J. Kevin Feng, Quan Ze Chen, Amy X. Zhang

Abstract: Large language models (LLMs) are increasingly capable of providing users with advice in a wide range of professional domains, including legal advice. However, relying on LLMs for legal queries raises concerns due to the significant expertise required and the potential real-world consequences of the advice. To explore when and why LLMs should or should not provide advice to users, we conducted workshops with 20 legal experts using methods inspired by case-based reasoning. The provided realistic queries ("cases") allowed experts to examine granular, situation-specific concerns and overarching technical and legal constraints, producing a concrete set of contextual considerations for LLM developers. By synthesizing the factors that impacted LLM response appropriateness, we present a 4-dimension framework: (1) User attributes and behaviors, (2) Nature of queries, (3) AI capabilities, and (4) Social impacts. We share experts' recommendations for LLM response strategies, which center around helping users identify "right questions to ask" and relevant information rather than providing definitive legal judgments. Our findings reveal novel legal considerations, such as unauthorized practice of law, confidentiality, and liability for inaccurate advice, that have been overlooked in the literature. The case-based deliberation method enabled us to elicit fine-grained, practice-informed insights that surpass those from de-contextualized surveys or speculative principles. These findings underscore the applicability of our method for translating domain-specific professional knowledge and practices into policies that can guide LLM behavior in a more responsible direction.

Submitted 3 May, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

Comments: 14 pages

arXiv:2401.15221 [pdf, other]

Designing and Testing a Mobile Application for Collecting WhatsApp Chat Data While Preserving Privacy

Authors: Brennan Schaffner, Archie Brohn, Jason Chee, K. J. Feng, Marshini Chetty

Abstract: It is common practice for researchers to join public WhatsApp chats and scrape their contents for analysis. However, research shows collecting data this way contradicts user expectations and preferences, even if the data is effectively public. To overcome these issues, we outline design considerations for collecting WhatsApp chat data with improved user privacy by heightening user control and oversight of data collection and taking care to minimize the data researchers collect and process off a user's device. We refer to these design principles as User-Centered Data Sharing (UCDS). To evaluate our UCDS principles, we first implemented a mobile application representing one possible instance of these improved data collection techniques and evaluated the viability of using the app to collect WhatsApp chat data. Second, we surveyed WhatsApp users to gather user perceptions on common existing WhatsApp data collection methods as well as UCDS methods. Our results show that, using UCDS principles in our prototype app, we were able to glean insights into WhatsApp chats that are similarly informative to those from common, less privacy-preserving methods. Our survey showed that methods following the UCDS principles are preferred by users because they offered users more control over the data collection process. Future user studies could further expand upon UCDS principles to overcome complications of researcher-to-group communication in research on WhatsApp chats and evaluate these principles in other data sharing contexts.

Submitted 26 January, 2024; originally announced January 2024.

arXiv:2401.14656 [pdf, other]

Scientific Large Language Models: A Survey on Biological & Chemical Domains

Authors: Qiang Zhang, Keyang Ding, Tianwen Lyv, Xinda Wang, Qingyu Yin, Yiwen Zhang, Jing Yu, Yuhao Wang, Xiaotong Li, Zhuoyi Xiang, Kehua Feng, Xiang Zhuang, Zeyuan Wang, Ming Qin, Mengyao Zhang, Jinlu Zhang, Jiyu Cui, Tao Huang, Pengju Yan, Renjun Xu, Hongyang Chen, Xiaolin Li, Xiaohui Fan, Huabin Xing, Huajun Chen

Abstract: Large Language Models (LLMs) have emerged as a transformative power in enhancing natural language comprehension, representing a significant stride toward artificial general intelligence. The application of LLMs extends beyond conventional linguistic boundaries, encompassing specialized linguistic systems developed within various scientific disciplines. This growing interest has led to the advent of scientific LLMs, a novel subclass specifically engineered for facilitating scientific discovery. As a burgeoning area in the community of AI for Science, scientific LLMs warrant comprehensive exploration. However, a systematic and up-to-date survey introducing them is currently lacking. In this paper, we endeavor to methodically delineate the concept of "scientific language", whilst providing a thorough review of the latest advancements in scientific LLMs. Given the expansive realm of scientific disciplines, our analysis adopts a focused lens, concentrating on the biological and chemical domains. This includes an in-depth examination of LLMs for textual knowledge, small molecules, macromolecular proteins, genomic sequences, and their combinations, analyzing them in terms of model architectures, capabilities, datasets, and evaluation. Finally, we critically examine the prevailing challenges and point out promising research directions along with the advances of LLMs. By offering a comprehensive overview of technical developments in this field, this survey aspires to be an invaluable resource for researchers navigating the intricate landscape of scientific LLMs.

Submitted 23 July, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

arXiv:2401.14000 [pdf, other]

doi 10.1145/3613904.3642120

Mapping the Design Space of Teachable Social Media Feed Experiences

Authors: K. J. Kevin Feng, Xander Koo, Lawrence Tan, Amy Bruckman, David W. McDonald, Amy X. Zhang

Abstract: Social media feeds are deeply personal spaces that reflect individual values and preferences. However, top-down, platform-wide content algorithms can reduce users' sense of agency and fail to account for nuanced experiences and values. Drawing on the paradigm of interactive machine teaching (IMT), an interaction framework for non-expert algorithmic adaptation, we map out a design space for teachable social media feed experiences to empower agential, personalized feed curation. To do so, we conducted a think-aloud study (N=24) featuring four social media platforms -- Instagram, Mastodon, TikTok, and Twitter -- to understand key signals users leveraged to determine the value of a post in their feed. We synthesized users' signals into taxonomies that, when combined with user interviews, inform five design principles that extend IMT into the social media setting. We finally embodied our principles into three feed designs that we present as sensitizing concepts for teachable feed experiences moving forward.

Submitted 29 January, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

Comments: CHI 2024

arXiv:2401.11338 [pdf, other]

doi 10.1063/5.0199112

ENN's Roadmap for Proton-Boron Fusion Based on Spherical Torus

Authors: Min-sheng Liu, Hua-sheng Xie, Yu-min Wang, Jia-qi Dong, Kai-ming Feng, Xiang Gu, Xian-li Huang, Xin-chen Jiang, Ying-ying Li, Zhi Li, Bing Liu, Wen-jun Liu, Di Luo, Yueng-Kay Martin Peng, Yue-jiang Shi, Shao-dong Song, Xian-ming Song, Tian-tian Sun, Mu-zhi Tan, Xue-yun Wang, Yuan-ming Yang, Gang Yin, Han-yue Zhao, ENN fusion team

Abstract: ENN Science and Technology Development Co., Ltd. (ENN) is committed to generating fusion energy in an environmentally friendly and cost-effective manner, which requires abundant aneutronic fuel. Proton-boron (p-$^{11}$B or p-B) fusion is considered an ideal choice for this purpose. Recent studies have suggested that p-B fusion, although challenging, is feasible based on new cross-section data, provided that a hot ion mode and high wall reflection can be achieved to reduce electron radiation loss. The high beta and good confinement of the spherical torus (ST) make it an ideal candidate for p-B fusion. By utilizing the new spherical torus energy confinement scaling law, a reactor with a major radius $R_0=4$ m, central magnetic field $B_0=6$ T, central temperature $T_{i0}=150$ keV, plasma current $I_p=30$ MA, and hot ion mode $T_i/T_e=4$ can yield p-B fusion with $Q>10$. A roadmap for p-B fusion has been developed, with the next-generation device named EHL-2. EHL stands for ENN He-Long, which literally means "peaceful Chinese Loong". The main target parameters include $R_0\simeq1.05$ m, $A\simeq1.85$, $B_0\simeq3$ T, $T_{i0}\simeq30$ keV, $I_p\simeq3$ MA, and $T_i/T_e\geq2$. The existing ST device EXL-50 was simultaneously upgraded to provide experimental support for the new roadmap, involving the installation and upgrading of the central solenoid, vacuum chamber, and magnetic systems. The construction of the upgraded ST fusion device, EXL-50U, was completed at the end of 2023, and it achieved its first plasma in January 2024. The construction of EHL-2 is estimated to be completed by 2026.

Submitted 10 June, 2024; v1 submitted 20 January, 2024; originally announced January 2024.

Comments: 16 pages, 8 figures

Journal ref: Phys. Plasmas 31, 062507 (2024)

arXiv:2401.10418 [pdf, other]

Hazard resistance-based spatiotemporal risk analysis for distribution network outages during hurricanes

Authors: Luo Xu, Ning Lin, Dazhi Xi, Kairui Feng, H. Vincent Poor

Abstract: Blackouts in recent decades show an increasing prevalence of power outages due to extreme weather events such as hurricanes. Precisely assessing the spatiotemporal outages in distribution networks, the most vulnerable part of power systems, is critical to enhance power system resilience. The Sequential Monte Carlo (SMC) simulation method is widely used for spatiotemporal risk analysis of power systems during extreme weather hazards. However, it is found here that the SMC method can lead to large errors by directly applying the fragility function or failure probability of system components in time-sequential analysis, particularly overestimating damages under evolving hazards with high-frequency sampling. To address this issue, a novel hazard resistance-based spatiotemporal risk analysis (HRSRA) method is proposed. This method converts the time-varying failure probability of a component into a hazard resistance as a time-invariant value during the simulation of evolving hazards. The proposed HRSRA provides an adaptive framework for incorporating high-spatiotemporal-resolution meteorology models into power outage simulations. By leveraging the geographic information system data of the power system and a physics-based hurricane wind field model, the superiority of the proposed method is validated using real-world time-series power outage data from Puerto Rico during Hurricane Fiona 2022.

Submitted 18 January, 2024; originally announced January 2024.

Comments: 10 pages, 10 figures
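
The contrast drawn in the abstract above, between re-sampling a component's failure probability at every time step and assigning it a single time-invariant hazard resistance, can be illustrated with a toy simulation. The sketch below is not the authors' HRSRA implementation: the piecewise-linear `fragility` curve, its thresholds, the wind series, and all function names are hypothetical, and the resistance is assumed to be one uniform draw per component per trial.

```python
import random


def fragility(wind_speed, v_low=30.0, v_high=70.0):
    """Hypothetical piecewise-linear fragility curve: failure probability vs. wind speed (m/s)."""
    if wind_speed <= v_low:
        return 0.0
    if wind_speed >= v_high:
        return 1.0
    return (wind_speed - v_low) / (v_high - v_low)


def naive_sequential_failure(wind_series, rng):
    """Re-draw a random number at every time step and compare it with the fragility value."""
    return any(rng.random() < fragility(v) for v in wind_series)


def resistance_based_failure(wind_series, rng):
    """Draw one time-invariant resistance (assumed uniform on [0, 1)); fail once fragility exceeds it."""
    resistance = rng.random()
    return any(fragility(v) > resistance for v in wind_series)


def failure_rate(simulate, wind_series, trials=20000, seed=0):
    """Monte Carlo estimate of the probability that the component fails during the storm."""
    rng = random.Random(seed)
    return sum(simulate(wind_series, rng) for _ in range(trials)) / trials


def resample(wind_series, factor):
    """Repeat each sample `factor` times to mimic a finer time step over the same storm."""
    return [v for v in wind_series for _ in range(factor)]


# Hypothetical wind-speed history (m/s) at one component.
storm = [20.0, 35.0, 50.0, 40.0, 25.0]

if __name__ == "__main__":
    for factor in (1, 10):
        series = resample(storm, factor)
        print(
            f"sampling x{factor}:",
            "naive =", round(failure_rate(naive_sequential_failure, series), 3),
            "resistance-based =", round(failure_rate(resistance_based_failure, series), 3),
        )
```

Under these assumptions, the resistance-based failure rate depends only on the peak wind speed, so it stays near the peak fragility value however finely the storm is sampled, while the per-step re-draw drifts upward as the time step shrinks.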

arXiv:2401.09051 [pdf, other]

Canvil: Designerly Adaptation for LLM-Powered User Experiences

Authors: K. J. Kevin Feng, Q. Vera Liao, Ziang Xiao, Jennifer Wortman Vaughan, Amy X. Zhang, David W. McDonald

Abstract: Advancements in large language models (LLMs) are poised to spark a proliferation of LLM-powered user experiences. In product teams, designers are often tasked with crafting user experiences that align with user needs. To involve designers and leverage their user-centered perspectives to create effective and responsible LLM-powered products, we introduce the practice of designerly adaptation for engaging with LLMs as an adaptable design material. We first identify key characteristics of designerly adaptation through a formative study with designers experienced in designing for LLM-powered products (N=12). These characteristics are 1) have a low technical barrier to entry, 2) leverage designers' unique perspectives bridging users and technology, and 3) encourage model tinkering. Based on this characterization, we build Canvil, a Figma widget that operationalizes designerly adaptation. Canvil supports structured authoring of system prompts to adapt LLM behavior, testing of adapted models on diverse user inputs, and integration of model outputs into interface designs. We use Canvil as a technology probe in a group-based design study (6 groups, N=17) to investigate the implications of integrating designerly adaptation into design workflows. We find that designers are able to iteratively tinker with different adaptation approaches and reason about interface affordances to enhance end-user interaction with LLMs. Furthermore, designers identified promising collaborative workflows for designerly adaptation. Our work opens new avenues for collaborative processes and tools that foreground designers' user-centered expertise in the crafting and deployment of LLM-powered user experiences.

Submitted 17 January, 2024; originally announced January 2024.

arXiv:2401.03188 [pdf, other]

doi 10.1109/TAI.2024.3351798

A Survey on Verification and Validation, Testing and Evaluations of Neurosymbolic Artificial Intelligence

Authors: Justus Renkhoff, Ke Feng, Marc Meier-Doernberg, Alvaro Velasquez, Houbing Herbert Song

Abstract: Neurosymbolic artificial intelligence (AI) is an emerging branch of AI that combines the strengths of symbolic AI and sub-symbolic AI. A major drawback of sub-symbolic AI is that it acts as a "black box", meaning that predictions are difficult to explain, making the testing & evaluation (T&E) and validation & verification (V&V) processes of a system that uses sub-symbolic AI a challenge. Since neurosymbolic AI combines the advantages of both symbolic and sub-symbolic AI, this survey explores how neurosymbolic applications can ease the V&V process. This survey considers two taxonomies of neurosymbolic AI, evaluates them, and analyzes which algorithms are commonly used as the symbolic and sub-symbolic components in current applications. Additionally, an overview of current techniques for the T&E and V&V processes of these components is provided. Furthermore, it is investigated how the symbolic part is used for T&E and V&V purposes in current neurosymbolic applications. Our research shows that neurosymbolic AI has great potential to ease the T&E and V&V processes of sub-symbolic AI by leveraging the possibilities of symbolic AI. Additionally, the applicability of current T&E and V&V methods to neurosymbolic AI is assessed, and how different neurosymbolic architectures can impact these methods is explored. It is found that current T&E and V&V techniques are partly sufficient to test, evaluate, verify, or validate the symbolic and sub-symbolic part of neurosymbolic applications independently, while some of them use approaches where current T&E and V&V methods are not applicable by default, and adjustments or even new approaches are needed. Our research shows that there is great potential in using symbolic AI to test, evaluate, verify, or validate the predictions of a sub-symbolic model, making neurosymbolic AI an interesting research direction for safe, secure, and trustworthy AI.

Submitted 10 January, 2024; v1 submitted 6 January, 2024; originally announced January 2024.

Comments: 16 pages, 8 figures

arXiv:2312.16262 [pdf, other]

doi 10.1145/3626772.3657808

Dynamic In-Context Learning from Nearest Neighbors for Bundle Generation

Authors: Zhu Sun, Kaidong Feng, Jie Yang, Xinghua Qu, Hui Fang, Yew-Soon Ong, Wenyuan Liu

Abstract: Product bundling has evolved into a crucial marketing strategy in e-commerce. However, current studies are limited to generating (1) fixed-size or single bundles, and most importantly, (2) bundles that do not reflect consistent user intents, thus being less intelligible or useful to users. This paper explores two interrelated tasks, i.e., personalized bundle generation and the underlying intent inference based on users' interactions in a session, leveraging the logical reasoning capability of large language models. We introduce a dynamic in-context learning paradigm, which enables ChatGPT to seek tailored and dynamic lessons from closely related sessions as demonstrations while performing tasks in the target session. Specifically, it first harnesses retrieval augmented generation to identify nearest neighbor sessions for each target session. Then, proper prompts are designed to guide ChatGPT to perform the two tasks on neighbor sessions. To enhance reliability and mitigate the hallucination issue, we develop (1) a self-correction strategy to foster mutual improvement in both tasks without supervision signals; and (2) an auto-feedback mechanism to recurrently offer dynamic supervision based on the distinct mistakes made by ChatGPT on various neighbor sessions. Thus, the target session can receive customized and dynamic lessons for improved performance by observing the demonstrations of its neighbor sessions. Finally, experimental results on three real-world datasets verify the effectiveness of our methods on both tasks. Additionally, the inferred intents can prove beneficial for other intriguing downstream tasks, such as crafting appealing bundle names.

Submitted 26 December, 2023; originally announced December 2023.

arXiv:2312.09937 [pdf]

doi 10.1039/D3CP04560A

Structural dimerization and charge-orbital ordering in a ferromagnetic semiconductor LiV2S4 monolayer

Authors: Rui Song, Bili Wang, Kai Feng, Jia Yao, Mengjie Lu, Jing Bai, Shuai Dong, Ming An

Abstract: With the rise of two-dimensional (2D) materials, unique properties that are completely distinct from bulk counterparts continue to emerge at low-dimensional scales, presenting numerous opportunities and challenges. This also provides a new perspective for the study of transition metal systems. Here, based on density functional theory (DFT), the physical properties of 2D monolayer LiV2S4 have been studied. Remarkable changes have been observed, i.e., vanadium dimerization, ferromagnetism, charge distribution and metal-insulator transition (MIT). It is argued that the electronic instability leads to the V dimerization, which further lifts the degeneracy of charge distribution and stabilizes the charge and spin ordering state.

Submitted 15 December, 2023; originally announced December 2023.

Comments: 6 pages, 5 Figures

arXiv:2312.07552 [pdf, other]

doi 10.1145/3626772.3657688

Large Language Models for Intent-Driven Session Recommendations

Authors: Zhu Sun, Hongyang Liu, Xinghua Qu, Kaidong Feng, Yan Wang, Yew-Soon Ong

Abstract: Intent-aware session recommendation (ISR) is pivotal in discerning user intents within sessions for precise predictions. Traditional approaches, however, face limitations due to their presumption of a uniform number of intents across all sessions. This assumption overlooks the dynamic nature of user sessions, where the number and type of intentions can significantly vary. In addition, these methods typically operate in latent spaces, thus hindering the model's transparency. Addressing these challenges, we introduce a novel ISR approach, utilizing the advanced reasoning capabilities of large language models (LLMs). First, this approach begins by generating an initial prompt that guides LLMs to predict the next item in a session, based on the varied intents manifested in user sessions. Then, to refine this process, we introduce an innovative prompt optimization mechanism that iteratively self-reflects and adjusts prompts. Furthermore, our prompt selection module, built upon the LLMs' broad adaptability, swiftly selects the most optimized prompts across diverse domains. This new paradigm empowers LLMs to discern diverse user intents at a semantic level, leading to more accurate and interpretable session recommendations. Our extensive experiments on three real-world datasets demonstrate the effectiveness of our method, marking a significant advancement in ISR systems.

Submitted 6 December, 2023; originally announced December 2023.

arXiv:2311.10934 [pdf, other]

Case Repositories: Towards Case-Based Reasoning for AI Alignment

Authors: K. J. Kevin Feng, Quan Ze Chen, Inyoung Cheong, King Xia, Amy X. Zhang

Abstract: Case studies commonly form the pedagogical backbone in law, ethics, and many other domains that face complex and ambiguous societal questions informed by human values. Similar complexities and ambiguities arise when we consider how AI should be aligned in practice: when faced with vast quantities of diverse (and sometimes conflicting) values from different individuals and communities, with whose values is AI to align, and how should AI do so? We propose a complementary approach to constitutional AI alignment, grounded in ideas from case-based reasoning (CBR), that focuses on the construction of policies through judgments on a set of cases. We present a process to assemble such a case repository by: 1) gathering a set of "seed" cases -- questions one may ask an AI system -- in a particular domain, 2) eliciting domain-specific key dimensions for cases through workshops with domain experts, 3) using LLMs to generate variations of cases not seen in the wild, and 4) engaging with the public to judge and improve cases. We then discuss how such a case repository could assist in AI alignment, both through directly acting as precedents to ground acceptable behaviors, and as a medium for individuals and communities to engage in moral reasoning around AI.

Submitted 26 November, 2023; v1 submitted 17 November, 2023; originally announced November 2023.

Comments: MP2 workshop @ NeurIPS 2023

arXiv:2311.04241 [pdf, ps, other]

AI-Enabled Unmanned Vehicle-Assisted Reconfigurable Intelligent Surfaces: Deployment, Prototyping, Experiments, and Opportunities

Authors: Li-Hsiang Shen, Kai-Ten Feng, Ta-Sung Lee, Yuan-Chun Lin, Shih-Cheng Lin, Chia-Chan Chang, Sheng-Fuh Chang

Abstract: The demand for wireless data grows increasingly high as the sixth-generation (6G) technology evolves. Reconfigurable intelligent surface (RIS) is deemed one of the promising 6G techniques for extending service coverage, reducing power consumption, and enhancing spectral efficiency. In this article, we provide some fundamentals of RIS deployment from theoretical and hardware perspectives, as well as the utilization of artificial intelligence (AI) and machine learning. We implemented an intelligent deployment of RIS (i-Dris) prototype, including dual-band auto-guided vehicle (AGV) assisted RISs associated with an mmWave base station (BS) and a receiver. The RISs are deployed on the AGV with configured incident/reflection angles. Meanwhile, both the mmWave BS and receiver are associated with an edge server monitoring downlink packets for obtaining system throughput. We have designed a federated multi-agent reinforcement learning scheme associated with several AGV-RIS agents and sub-agents per AGV-RIS consisting of the deployment of position, height, orientation and elevation angles. The experimental results present stationary measurements under different aspects and scenarios. The i-Dris can reach up to 980 Mbps transmission throughput under a bandwidth of 100 MHz with comparably low complexity as well as rapid deployment, outperforming other existing works. Finally, we highlight some opportunities and future issues in leveraging RIS-empowered wireless communication networks.

Submitted 6 November, 2023; originally announced November 2023.

arXiv:2311.03579 [pdf, ps, other]

Downlink Rate Maximization with Reconfigurable Intelligent Surface Assisted Full-Duplex Transmissions

Authors: Li-Hsiang Shen, Chia-Jou Ku, Kai-Ten Feng

Abstract: Reconfigurable intelligent surfaces (RIS) are an effective technique for intelligently manipulating channel paths through reflection to serve desired users. Full-duplex (FD) systems, enabling simultaneous transmission and reception from a base station (BS), offer the theoretical advantage of doubled spectrum efficiency. However, the presence of strong self-interference (SI) in FD systems significantly degrades performance, which can be mitigated by leveraging the capabilities of RIS. In this work, we consider joint BS and RIS beamforming for maximizing the downlink (DL) transmission rate while guaranteeing the uplink (UL) rate requirement. We propose an FD-RIS beamforming (FRIS) scheme by adopting penalty convex-concave programming. Simulation results demonstrate the UL/DL rate improvements achieved by considering various levels of imperfect CSI. The proposed FRIS scheme validates its effectiveness across different RIS deployments and RIS/BS configurations. FRIS achieves the highest rate compared to the other approximation method, conventional beamforming techniques, half-duplex (HD) systems, and deployment without RIS.

Submitted 6 November, 2023; originally announced November 2023.

Comments: arXiv admin note: substantial text overlap with arXiv:2306.05693

arXiv:2310.11862 [pdf, other]

Learning to Generate Parameters of ConvNets for Unseen Image Data

Authors: Shiye Wang, Kaituo Feng, Changsheng Li, Ye Yuan, Guoren Wang

Abstract: Typical Convolutional Neural Networks (ConvNets) depend heavily on large amounts of image data and resort to an iterative optimization algorithm (e.g., SGD or Adam) to learn network parameters, which makes training very time- and resource-intensive. In this paper, we propose a new training paradigm and formulate the parameter learning of ConvNets into a prediction task: given a ConvNet architecture, we observe there exist correlations between image datasets and their corresponding optimal network parameters, and explore if we can learn a hyper-mapping between them to capture the relations, such that we can directly predict the parameters of the network for an image dataset never seen during the training phase. To do this, we put forward a new hypernetwork-based model, called PudNet, which intends to learn a mapping between datasets and their corresponding network parameters, and then predicts parameters for unseen data with only a single forward propagation. Moreover, our model benefits from a series of adaptive hyper recurrent units sharing weights to capture the dependencies of parameters among different network layers. Extensive experiments demonstrate that our proposed method achieves good efficacy for unseen image datasets in two kinds of settings: intra-dataset prediction and inter-dataset prediction. Our PudNet can also scale well to large-scale datasets, e.g., ImageNet-1K. It takes 8967 GPU seconds to train ResNet-18 on ImageNet-1K using GC from scratch and obtain a top-5 accuracy of 44.65%. However, our PudNet costs only 3.89 GPU seconds to predict the network parameters of ResNet-18 achieving comparable performance (44.92%), more than 2,300 times faster than the traditional training paradigm.

Submitted 9 August, 2024; v1 submitted 18 October, 2023; originally announced October 2023.

Comments: Accepted by IEEE TIP
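
The speedup quoted in the PudNet abstract above can be checked with one division. The sketch below uses only the two timings stated in the abstract; the variable names are illustrative and not taken from the paper.

```python
# Timings quoted in the abstract, in GPU seconds.
train_from_scratch_s = 8967.0  # training ResNet-18 on ImageNet-1K from scratch
pudnet_predict_s = 3.89        # PudNet predicting the ResNet-18 parameters

# Ratio of the two timings; the abstract states "more than 2,300 times faster".
speedup = train_from_scratch_s / pudnet_predict_s
print(f"speedup: {speedup:.0f}x")
```

The ratio comes out slightly above 2,300, which is consistent with the figure in the abstract.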

arXiv:2310.10200 [pdf, ps, other]

Circular External Difference Families: Construction and Non-Existence

Authors: Huawei Wu, Jing Yang, Keqin Feng

Abstract: The circular external difference family and its strong version, which themselves are of independent combinatorial interest, were proposed as variants of the difference family to construct new unconditionally secure non-malleable threshold schemes. In this paper, we present new results regarding the construction and non-existence of (strong) circular external difference families, thereby solving several open problems on this topic.

Submitted 30 October, 2023; v1 submitted 16 October, 2023; originally announced October 2023.

arXiv:2310.06754 [pdf, other]

A Stochastic Geometry Framework for Performance Analysis of RIS-assisted OFDM Cellular Networks

Authors: Guodong Sun, Francois Baccelli, Ke Feng, Luis Uzeda Garcia, Stefano Paris

Abstract: The reconfigurable intelligent surface (RIS) technology allows one to engineer spatial diversity in complex cellular networks. This paper provides a framework for the system-level performance assessment of RIS-assisted networks and in particular downlink coverage probability and ergodic rate. To account for the inherent randomness in the spatial deployments of base stations (BSs) and RISs, we model the placements of the RISs as point processes (PPs) conditioned on the associated BSs, which are modeled by a Poisson point process (PPP). These RIS PPs can be adapted based on the deployment strategy. We focus on modeling the RISs as a Matérn cluster process (MCP), where each RIS cluster is a finite PPP with support a disc centered on the association BS. We assume that the system uses the orthogonal frequency division multiplexing (OFDM) technique to exploit the multipath diversity provided by RISs. The coverage probability and the ergodic rate can be evaluated when RISs operate as batched powerless beamformers. The resulting analytical expressions provide a general methodology to evaluate the impact of key RIS-related parameters, such as the batch size and the density of RISs, on system-level performance. To demonstrate the framework's broad applicability, we also analyze a RIS placement variant where RISs are deployed around coverage holes. Numerical evaluations of the analytical expressions and Monte-Carlo simulations jointly validate the proposed analytical approach and provide valuable insights into the design of future RIS-assisted cellular networks.

Submitted 8 July, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

arXiv:2309.11861 [pdf]

Data-driven quantitative analysis of an integrated open digital ecosystems platform for user-centric energy retrofits: A case study in Northern Sweden

Authors: Bokai Liu, Santhan Reddy Penaka, Weizhuo Lu, Kailun Feng, Anders Rebbling, Thomas Olofsson

Abstract: We present an open digital ecosystem based on a web framework with a functional back-end server for user-centric energy retrofits. This data-driven web framework is proposed for building energy renovation benchmarking as part of an energy advisory service development for the Västerbotten region, Sweden. A four-tier architecture is developed and programmed to achieve users' interactive design and visualization via a web browser. Six data-driven methods are integrated into this framework as back-end server functions. Based on those functions, this decision-making system supports users who want to know whether a building needs to be renovated. Meanwhile, influential factors (input values) from databases that affect energy usage in buildings are analyzed via quantitative analysis, i.e., sensitivity analysis. The contributions of this open ecosystem platform to energy renovation are: 1) a systematic framework that can be applied to energy efficiency with data-driven approaches, 2) a user-friendly web-based platform that is easy and flexible to use, and 3) quantitative analysis integrated into the framework to obtain the importance among all the relevant factors. This computational framework is designed for stakeholders who would like to get preliminary information in energy advisory. The improved energy advisor service enabled by the developed platform can significantly reduce the cost of decision-making, enabling decision-makers to participate in such professional knowledge-required decisions in a deliberate and efficient manner. This work is funded by the AURORAL project, which integrates an open and interoperable digital platform, demonstrated through regional large-scale pilots in different countries of Europe by interdisciplinary applications.

Submitted 21 September, 2023; originally announced September 2023.

Comments: 28 pages

arXiv:2309.10990 [pdf, ps, other]

Arithmetic crosscorrelation of binary $m$-sequences with coprime periods

Authors: Xiaoyan Jing, Keqin Feng

Abstract: The arithmetic crosscorrelation of binary $m$-sequences with coprime periods $2^{n_1}-1$ and $2^{n_2}-1$\ ($\gcd(n_1,n_2)=1$) is determined. The result shows that the absolute value of arithmetic crosscorrelation of such binary $m$-sequences is not greater than $2^{\min(n_1,n_2)}-1$.

Submitted 19 September, 2023; originally announced September 2023.

arXiv:2308.16777 [pdf, other]

Ref-Diff: Zero-shot Referring Image Segmentation with Generative Models

Authors: Minheng Ni, Yabo Zhang, Kailai Feng, Xiaoming Li, Yiwen Guo, Wangmeng Zuo

Abstract: Zero-shot referring image segmentation is a challenging task because it aims to find an instance segmentation mask based on the given referring descriptions, without training on this type of paired data. Current zero-shot methods mainly focus on using pre-trained discriminative models (e.g., CLIP). However, we have observed that generative models (e.g., Stable Diffusion) have potentially understood the relationships between various visual elements and text descriptions, which are rarely investigated in this task. In this work, we introduce a novel Referring Diffusional segmentor (Ref-Diff) for this task, which leverages the fine-grained multi-modal information from generative models. We demonstrate that without a proposal generator, a generative model alone can achieve comparable performance to existing SOTA weakly-supervised models. When we combine both generative and discriminative models, our Ref-Diff outperforms these competing methods by a significant margin. This indicates that generative models are also beneficial for this task and can complement discriminative models for better referring segmentation. Our code is publicly available at https://github.com/kodenii/Ref-Diff.

Submitted 1 September, 2023; v1 submitted 31 August, 2023; originally announced August 2023.

arXiv:2308.15516 [pdf, other]

doi 10.1103/PhysRevA.109.023314

Coexistence of ergodic and weakly ergodic states in finite-height Wannier-Stark ladders

Authors: Xingbo Wei, Liangqing Wu, Kewei Feng, Tong Liu, Yunbo Zhang

Abstract: We investigate a single particle in one-dimensional Wannier-Stark ladders with either a linear potential or a mosaic potential with spacing $κ=2$. In both cases, we exactly determine the critical energies separating the weakly ergodic states from ergodic states for a finite potential height. Especially in the latter case, we demonstrate a rich phase diagram with ergodic states, weakly ergodic states, and strongly Wannier-Stark localized states. Our results also show that critical energies are highly dependent on the height of the ladder and ergodic states only survive at $E\approx0$ for the high ladder. Importantly, we find that the number of ergodic states can be adjusted by changing the interval of the non-zero potential. These interesting features will shed light on the study of disorder-free systems.

Submitted 19 February, 2024; v1 submitted 29 August, 2023; originally announced August 2023.

Comments: 10 pages, 12 figures, accepted by Physical Review A

Journal ref: Phys. Rev. A 109, 023314 (2024)

arXiv:2308.13945 [pdf, other]

Complex Antiferromagnetic Order in the Metallic Triangular Lattice Compound SmAuAl$_4$Ge$_2$

Authors: Keke Feng, Caleb Bush, Olatunde Oladehin, Minhyea Lee, Ryan Baumbach

Abstract: The compounds $Ln$AuAl$_4$Ge$_2$ ($Ln$ $=$ lanthanide) form in a structure that features two-dimensional triangular lattices of $Ln$ ions that are stacked along the crystalline $c$ axis. Together with crystal electric field effects, magnetic anisotropy, and electron-mediated spin exchange interactions, this sets the stage for the emergence of strongly correlated spin and electron phenomena. Here we investigate SmAuAl$_4$Ge$_2$, which exhibits weak paramagnetism that strongly deviates from conventional Curie-Weiss behavior. Complex antiferromagnetic ordering emerges at $T_{\rm{N1}}$ $=$ 13.2 K and $T_{\rm{N2}}$ $=$ 7.4 K, where heat capacity measurements show that these transitions are first and second order, respectively. These measurements also reveal that the Sommerfeld coefficient is not enhanced compared to the nonmagnetic analog YAuAl$_4$Ge$_2$, consistent with the charge carrier quasiparticles exhibiting typical Fermi liquid behavior. The temperature-dependent electrical resistivity follows standard metallic behavior, but linear magnetoresistance unexpectedly appears within the ordered state. We compare these results to other $Ln$AuAl$_4$Ge$_2$ materials, which have already been established as localized $f$-electron magnets that are hosts for interesting magnetic and electronic phases. From this, SmAuAl$_4$Ge$_2$ emerges as a complex quantum spin metal, inviting further investigations into its properties and the broader family of related materials.

Submitted 1 January, 2024; v1 submitted 26 August, 2023; originally announced August 2023.

Comments: 9 pages, 6 figures

arXiv:2307.16096 [pdf, ps, other]

D-STAR: Dual Simultaneously Transmitting and Reflecting Reconfigurable Intelligent Surfaces for Joint Uplink/Downlink Transmission

Authors: Li-Hsiang Shen, Po-Chen Wu, Chia-Jou Ku, Yu-Ting Li, Kai-Ten Feng, Yuanwei Liu, Lajos Hanzo

Abstract: The joint uplink/downlink (JUD) design of simultaneously transmitting and reflecting reconfigurable intelligent surfaces (STAR-RIS) is conceived in support of both uplink (UL) and downlink (DL) users. Furthermore, the dual STAR-RISs (D-STAR) concept is conceived as a promising architecture for 360-degree full-plane service coverage, including UL/DL users located between the base station (BS) and the D-STAR as well as beyond. The corresponding regions are termed as primary (P) and secondary (S) regions. Both BS/users exist in the P-region, but only users are located in the S-region. The primary STAR-RIS (STAR-P) plays an important role in terms of tackling the P-region inter-user interference, the self-interference (SI) from the BS and from the reflective as well as refractive UL users imposed on the DL receiver. By contrast, the secondary STAR-RIS (STAR-S) aims for mitigating the S-region interferences. The non-linear and non-convex rate-maximization problem formulated is solved by alternating optimization amongst the decomposed convex sub-problems of the BS beamformer, and the D-STAR amplitude as well as phase shift configurations. We also propose a D-STAR based active beamforming and passive STAR-RIS amplitude/phase (DBAP) optimization scheme to solve the respective sub-problems by Lagrange dual with Dinkelbach's transformation, alternating direction method of multipliers (ADMM) with successive convex approximation (SCA), and penalty convex-concave procedure (PCCP). Our simulation results reveal that the proposed D-STAR architecture outperforms the conventional single RIS, single STAR-RIS, and half-duplex networks. The proposed DBAP of D-STAR outperforms the state-of-the-art solutions found in the open literature for different numbers of quantization levels, geographic deployment, transmit power and for diverse numbers of transmit antennas, patch partitions as well as D-STAR elements.

Submitted 8 February, 2024; v1 submitted 29 July, 2023; originally announced July 2023.

Comments: Accepted by IEEE TCOM

arXiv:2307.15876 [pdf, other]

GraphDAC: A Graph-Analytic Approach to Dynamic Airspace Configuration

Authors: Ke Feng, Dahai Liu, Yongxin Liu, Hong Liu, Houbing Song

Abstract: The current National Airspace System (NAS) is reaching capacity due to increased air traffic, and is based on outdated pre-tactical planning. This study proposes a more dynamic airspace configuration (DAC) approach that could increase throughput and accommodate fluctuating traffic, ideal for emergencies. The proposed approach constructs the airspace as a constraints-embedded graph, compresses its dimensions, and applies a spectral clustering-enabled adaptive algorithm to generate collaborative airport groups and evenly distribute workloads among them. Under various traffic conditions, our experiments demonstrate a 50% reduction in workload imbalances. This research could ultimately form the basis for a recommendation system for optimized airspace configuration. Code available at https://github.com/KeFenge2022/GraphDAC.git

Submitted 28 July, 2023; originally announced July 2023.

Comments: Accepted for publication by IEEE IRI'23

arXiv:2307.13528 [pdf, other]

FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios

Authors: I-Chun Chern, Steffi Chern, Shiqi Chen, Weizhe Yuan, Kehua Feng, Chunting Zhou, Junxian He, Graham Neubig, Pengfei Liu

Abstract: The emergence of generative pre-trained models has facilitated the synthesis of high-quality text, but it has also posed challenges in identifying factual errors in the generated text. In particular: (1) A wider range of tasks now face an increasing risk of containing factual errors when handled by generative models. (2) Generated texts tend to be lengthy and lack a clearly defined granularity for individual facts. (3) There is a scarcity of explicit evidence available during the process of fact checking. With the above challenges in mind, in this paper, we propose FacTool, a task and domain agnostic framework for detecting factual errors of texts generated by large language models (e.g., ChatGPT). Experiments on four different tasks (knowledge-based QA, code generation, mathematical reasoning, and scientific literature review) show the efficacy of the proposed method. We release the code of FacTool associated with ChatGPT plugin interface at https://github.com/GAIR-NLP/factool.

Submitted 26 July, 2023; v1 submitted 25 July, 2023; originally announced July 2023.

arXiv:2307.12452 [pdf, other]

Characterizing non-Markovian Quantum Process by Fast Bayesian Tomography

Authors: R. Y. Su, J. Y. Huang, N. Dumoulin Stuyck, M. K. Feng, W. Gilbert, T. J. Evans, W. H. Lim, F. E. Hudson, K. W. Chan, W. Huang, Kohei M. Itoh, R. Harper, S. D. Bartlett, C. H. Yang, A. Laucht, A. Saraiva, T. Tanttu, A. S. Dzurak

Abstract: To push gate performance to levels beyond the thresholds for quantum error correction, it is important to characterize the error sources occurring on quantum gates. However, the characterization of non-Markovian error poses a challenge to current quantum process tomography techniques. Fast Bayesian Tomography (FBT) is a self-consistent gate set tomography protocol that can be bootstrapped from earlier characterization knowledge and be updated in real-time with arbitrary gate sequences. Here we demonstrate how FBT allows for the characterization of key non-Markovian error processes. We introduce two experimental protocols for FBT to diagnose the non-Markovian behavior of two-qubit systems on silicon quantum dots. To increase the efficiency and scalability of the experiment-analysis loop, we develop an online FBT software stack. To reduce experiment cost and analysis time, we also introduce a native readout method and warm boot strategy. Our results demonstrate that FBT is a useful tool for probing non-Markovian errors that can be detrimental to the ultimate realization of fault-tolerant operation on quantum computing.

Submitted 4 October, 2023; v1 submitted 23 July, 2023; originally announced July 2023.

arXiv:2307.07650 [pdf, ps, other]

SALC: Skeleton-Assisted Learning-Based Clustering for Time-Varying Indoor Localization

Authors: An-Hung Hsiao, Li-Hsiang Shen, Chen-Yi Chang, Chun-Jie Chiu, Kai-Ten Feng

Abstract: Wireless indoor localization has attracted a significant amount of attention in recent years. Using received signal strength (RSS) obtained from WiFi access points (APs) for establishing a fingerprinting database is a widely utilized method in indoor localization. However, the time-variant problem for indoor positioning systems is not well-investigated in existing literature. Compared to conventional static fingerprinting, the dynamically reconstructed database can adapt to a highly-changing environment, which achieves sustainability of localization accuracy. To deal with the time-varying issue, we propose a skeleton-assisted learning-based clustering localization (SALC) system, including RSS-oriented map-assisted clustering (ROMAC), cluster-based online database establishment (CODE), and cluster-scaled location estimation (CsLE). The SALC scheme jointly considers similarities from the skeleton-based shortest path (SSP) and the time-varying RSS measurements across the reference points (RPs). ROMAC clusters RPs into different feature sets and therefore selects suitable monitor points (MPs) for enhancing location estimation. Moreover, the CODE algorithm aims for establishing an adaptive fingerprint database to alleviate the time-varying problem. Finally, CsLE is adopted to acquire the target position by leveraging the benefits of clustering information and estimated signal variations in order to rescale the weights from the weighted k-nearest neighbors (WkNN) method. Both simulation and experimental results demonstrate that the proposed SALC system can effectively reconstruct the fingerprint database with an enhanced location estimation accuracy, which outperforms the other existing schemes in the open literature.

Submitted 14 July, 2023; originally announced July 2023.
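
The SALC abstract above builds on weighted k-nearest neighbors (WkNN) fingerprinting. As a rough illustration of that baseline only, the sketch below estimates a position from an RSS vector using inverse-distance weights in signal space. It is not the paper's CsLE weight rescaling; the function name `wknn_locate`, the weighting rule, and the toy database are all assumptions made for this example.

```python
import math

def wknn_locate(rss, fingerprints, k=3, eps=1e-9):
    """Estimate a 2-D position from an RSS vector with plain weighted k-NN.

    fingerprints: list of ((x, y), [rss_ap1, rss_ap2, ...]) reference points.
    Weights are inverse Euclidean distances in signal space (an assumption).
    """
    # Distance in signal space between the query and every reference point.
    scored = [(math.dist(rss, ref_rss), pos) for pos, ref_rss in fingerprints]
    # Keep the k reference points whose fingerprints are closest to the query.
    nearest = sorted(scored, key=lambda item: item[0])[:k]
    # eps avoids division by zero when the query matches a fingerprint exactly.
    weights = [1.0 / (d + eps) for d, _ in nearest]
    total = sum(weights)
    x = sum(w * pos[0] for w, (_, pos) in zip(weights, nearest)) / total
    y = sum(w * pos[1] for w, (_, pos) in zip(weights, nearest)) / total
    return x, y

# Hypothetical toy database: four reference points, two access points (dBm).
toy_db = [
    ((0.0, 0.0), [-40.0, -70.0]),
    ((4.0, 0.0), [-55.0, -60.0]),
    ((0.0, 4.0), [-60.0, -55.0]),
    ((4.0, 4.0), [-70.0, -40.0]),
]
estimate = wknn_locate([-40.0, -70.0], toy_db, k=3)
print(estimate)
```

Because the query here equals the fingerprint stored for the reference point at the origin, that point dominates the weighted average and the estimate lands essentially at (0, 0). A time-varying scheme such as the one described would additionally update the database and adjust these weights.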

arXiv:2307.01990 [pdf]

Unsupervised Spectral Demosaicing with Lightweight Spectral Attention Networks

Authors: Kai Feng, Yongqiang Zhao, Seong G. Kong, Haijin Zeng

Abstract: This paper presents a deep learning-based spectral demosaicing technique trained in an unsupervised manner. Many existing deep learning-based techniques, relying on supervised learning with synthetic images, often underperform on real-world images, especially when the number of spectral bands increases. According to the characteristics of the spectral mosaic image, this paper proposes a mosaic loss function, the corresponding model structure, a transformation strategy, and an early stopping strategy, which form a complete unsupervised spectral demosaicing framework. A challenge in real-world spectral demosaicing is inconsistency between the model parameters and the computational resources of the imager. We reduce the complexity and parameters of the spectral attention module by dividing the spectral attention tensor into spectral attention matrices in the spatial dimension and a spectral attention vector in the channel dimension, which is more suitable for an unsupervised framework. This paper also presents Mosaic25, a real 25-band hyperspectral mosaic image dataset of various objects, illuminations, and materials for benchmarking. Extensive experiments on synthetic and real-world datasets demonstrate that the proposed method outperforms conventional unsupervised methods in terms of spatial distortion suppression, spectral fidelity, robustness, and computational cost.

Submitted 4 July, 2023; originally announced July 2023.

arXiv:2307.00534 [pdf, other]

Shared Growth of Graph Neural Networks via Prompted Free-direction Knowledge Distillation

Authors: Kaituo Feng, Yikun Miao, Changsheng Li, Ye Yuan, Guoren Wang

Abstract: Knowledge distillation (KD) has been shown to be effective in boosting the performance of graph neural networks (GNNs), where the typical objective is to distill knowledge from a deeper teacher GNN into a shallower student GNN. However, it is often quite challenging to train a satisfactory deeper GNN due to the well-known over-parametrized and over-smoothing issues, leading to invalid knowledge transfer in practical applications. In this paper, we propose the first Free-direction Knowledge Distillation framework via reinforcement learning for GNNs, called FreeKD, which no longer requires a deeper well-optimized teacher GNN. Our core idea is to collaboratively learn two shallower GNNs to exchange knowledge between them. As we observe that one typical GNN model often exhibits better and worse performances at different nodes during training, we devise a dynamic and free-direction knowledge transfer strategy that involves two levels of actions: 1) node-level action determines the directions of knowledge transfer between the corresponding nodes of two networks; and then 2) structure-level action determines which of the local structures generated by the node-level actions are to be propagated. Additionally, considering that different augmented graphs can potentially capture distinct perspectives of the graph data, we propose FreeKD-Prompt that learns undistorted and diverse augmentations based on prompt learning for exchanging varied knowledge. Furthermore, instead of confining knowledge exchange within two GNNs, we develop FreeKD++ to enable free-direction knowledge transfer among multiple GNNs. Extensive experiments on five benchmark datasets demonstrate that our approaches outperform the base GNNs by a large margin. More surprisingly, our FreeKD has comparable or even better performance than traditional KD algorithms that distill knowledge from a deeper and stronger teacher GNN.

Submitted 16 November, 2023; v1 submitted 2 July, 2023; originally announced July 2023.

Comments: arXiv admin note: substantial text overlap with arXiv:2206.06561

arXiv:2306.17589 [pdf, other]

Effect of thermal fluctuations on spectra and predictability in compressible decaying isotropic turbulence

Authors: Qihan Ma, Chunxin Yang, Song Chen, Kaikai Feng, Ziqi Cui, Jun Zhang

Abstract: This study investigates the impact of molecular thermal fluctuations on compressible decaying isotropic turbulence using the unified stochastic particle (USP) method, encompassing both two-dimensional (2D) and three-dimensional (3D) scenarios. The findings reveal that the turbulent spectra of velocity and thermodynamic variables follow the wavenumber scaling law of ${k}^{(d-1)}$ for different spatial dimensions $d$ within the high wavenumber range, indicating the impact of thermal fluctuations on small-scale turbulent statistics. With the application of Helmholtz decomposition, it is found that the thermal fluctuation spectra of solenoidal and compressible velocity components (${\vec{u}}_{s}$ and ${\vec{u}}_{c}$) follow an energy ratio of 1:1 for 2D cases, while the ratio changes to 2:1 for 3D cases. Comparisons between 3D turbulent spectra obtained through USP simulations and direct numerical simulations of the Navier-Stokes equations demonstrate that thermal fluctuations dominate the spectra at length scales comparable to the Kolmogorov length scale. Additionally, the effect of thermal fluctuations on the spectrum of ${\vec{u}}_{c}$ is significantly influenced by variations in the turbulent Mach number. We further study the impact of thermal fluctuations on the predictability of turbulence. With initial differences caused by thermal fluctuations, different flow realizations display significant disparities in velocity and thermodynamic fields at larger scales after a certain period of time, which can be characterized by "inverse error cascades". Moreover, the results suggest a strong correlation between the predictabilities of thermodynamic fields and the predictability of ${\vec{u}}_{c}$.

Submitted 30 June, 2023; originally announced June 2023.

arXiv:2306.14108 [pdf, other]

SpikeCodec: An End-to-end Learned Compression Framework for Spiking Camera

Authors: Kexiang Feng, Chuanmin Jia, Siwei Ma, Wen Gao

Abstract: Recently, the bio-inspired spike camera with continuous motion recording capability has attracted tremendous attention due to its ultra-high temporal resolution imaging characteristic. This imaging feature results in a huge data storage and transmission burden compared to that of a traditional camera, raising a severe challenge and an imminent need for compression of spike camera captured content. Existing lossy data compression methods cannot be applied to compress spike streams efficiently due to their integrate-and-fire characteristic and binarized data structure. Considering the imaging principle and information fidelity of spike cameras, we introduce an effective and robust representation of spike streams. Based on this representation, we propose a novel learned spike compression framework using scene recovery, a variational auto-encoder, and a spike simulator. To our knowledge, it is the first data-trained model for efficient and robust spike stream compression. Extensive experimental results show that our method outperforms the conventional and learning-based codecs, contributing a strong baseline for learned spike data compression.

Submitted 24 June, 2023; originally announced June 2023.

Comments: 13 pages, 11 figures and 5 tables

Showing 1–50 of 205 results for author: Feng, K