Search | arXiv e-print repository

TrajFM: A Vehicle Trajectory Foundation Model for Region and Task Transferability

Authors: Yan Lin, Tonglong Wei, Zeyu Zhou, Haomin Wen, Jilin Hu, Shengnan Guo, Youfang Lin, Huaiyu Wan

Abstract: Vehicle trajectories provide valuable movement information that supports various downstream tasks and powers real-world applications. A desirable trajectory learning model should transfer between different regions and tasks without retraining, thus improving computational efficiency and effectiveness with limited training data. However, a model's ability to transfer across regions is limited by the unique spatial features and POI arrangements of each region, which are closely linked to vehicle movement patterns and difficult to generalize. Additionally, achieving task transferability is challenging due to the differing generation schemes required for various tasks. Existing efforts towards transferability primarily involve learning embedding vectors for trajectories, which perform poorly in region transfer and still require retraining of prediction modules for task transfer. To address these challenges, we propose TrajFM, a vehicle trajectory foundation model that excels in both region and task transferability. For region transferability, we introduce STRFormer as the main learnable model within TrajFM. It integrates spatial, temporal, and POI modalities of trajectories to effectively manage variations in POI arrangements across regions and includes a learnable spatio-temporal Rotary position embedding module for handling spatial features. For task transferability, we propose a trajectory masking and recovery scheme. This scheme unifies the generation processes of various tasks into the masking and recovery of modalities and sub-trajectories, allowing TrajFM to be pre-trained once and transferred to different tasks without retraining. Experiments on two real-world vehicle trajectory datasets under various settings demonstrate the effectiveness of TrajFM. Code is available at https://anonymous.4open.science/r/TrajFM-30E4.

Submitted 9 August, 2024; originally announced August 2024.

arXiv:2408.12809 [pdf, other]

DutyTTE: Deciphering Uncertainty in Origin-Destination Travel Time Estimation

Authors: Xiaowei Mao, Yan Lin, Shengnan Guo, Yubin Chen, Xingyu Xian, Haomin Wen, Qisen Xu, Youfang Lin, Huaiyu Wan

Abstract: Uncertainty quantification in travel time estimation (TTE) aims to estimate the confidence interval for travel time, given the origin (O), destination (D), and departure time (T). Accurately quantifying this uncertainty requires generating the most likely path and assessing travel time uncertainty along the path. This involves two main challenges: 1) predicting a path that aligns with the ground truth, and 2) modeling the impact of travel time in each segment on overall uncertainty under varying conditions. We propose DutyTTE to address these challenges. For the first challenge, we introduce a deep reinforcement learning method to improve alignment between the predicted path and the ground truth, providing more accurate travel time information from road segments to improve TTE. For the second challenge, we propose a mixture-of-experts-guided uncertainty quantification mechanism to better capture travel time uncertainty for each segment under varying contexts. Additionally, we calibrate our results using Hoeffding's upper-confidence bound to provide statistical guarantees for the estimated confidence intervals. Extensive experiments on two real-world datasets demonstrate the superiority of our proposed method.
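The Hoeffding-based calibration mentioned in the abstract can be illustrated with a generic bound. The sketch below is not DutyTTE's procedure. It only shows, assuming n i.i.d. calibration outcomes bounded in [0, 1], how Hoeffding's inequality turns an empirical miscoverage rate into an upper confidence bound that holds with probability at least 1 - delta. The function name and the example numbers are hypothetical.

```python
import math


def hoeffding_upper_bound(empirical_rate: float, n: int, delta: float) -> float:
    """Upper confidence bound on the true mean of n i.i.d. samples in [0, 1].

    Hoeffding's inequality gives P(true_mean > empirical + margin) <= delta
    for margin = sqrt(ln(1 / delta) / (2 * n)).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    margin = math.sqrt(math.log(1.0 / delta) / (2.0 * n))
    # A rate cannot exceed 1, so clip the bound.
    return min(1.0, empirical_rate + margin)


# Hypothetical example: 8% of 500 calibration trips fell outside their interval.
bound = hoeffding_upper_bound(empirical_rate=0.08, n=500, delta=0.05)
print(f"miscoverage is at most {bound:.3f} with probability >= 0.95")
```

The margin shrinks as 1/sqrt(n), so a larger calibration set gives a tighter guarantee for the same delta.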

Submitted 22 August, 2024; originally announced August 2024.

Comments: 7 pages

arXiv:2408.12374 [pdf]

Doping-free Janus homojunction solar cell with efficiency exceeding 23%

Authors: Lei Li, Zi-Xuan Yang, Tao Huang, Hui Wan, Wu-Yu Chen, Tao Zhang, Gui-Fang Huang, Wangyu Hu, Wei-Qing Huang

Abstract: The photovoltaic solar cell is one of the main renewable energy sources, and its power conversion efficiency (PCE) is improved by employing doping or heterojunctions to reduce photogenerated carrier recombination. Here, we propose a doping-free homojunction solar cell utilizing two-dimensional Janus semiconductors to achieve high PCE. Thanks to the intrinsic dipole of the Janus structure, the doping-free Janus homojunction naturally has not only a type-II band alignment to promote photoexciton dissociation, but also a smaller effective bandgap to enhance light absorption. More importantly, the intrinsic electric field across the Janus structure will drive photoinduced electron and hole transfer from the interface to the opposite transport layers respectively, significantly enhancing the efficiency of carrier separation and transport. We illustrate the concept in titanium-based Janus monolayer homojunctions, where the theoretical PCE of the TiSSe homojunction reaches 23.22%. Our work opens a novel avenue to design low-cost, high-efficiency solar cells.

Submitted 22 August, 2024; originally announced August 2024.

Comments: 16 pages, 5 figures

arXiv:2408.09613 [pdf, other]

How Do Social Bots Participate in Misinformation Spread? A Comprehensive Dataset and Analysis

Authors: Herun Wan, Minnan Luo, Zihan Ma, Guang Dai, Xiang Zhao

Abstract: Information spreads faster through social media platforms than traditional media, thus becoming an ideal medium to spread misinformation. Meanwhile, automated accounts, known as social bots, contribute more to the misinformation dissemination. In this paper, we explore the interplay between social bots and misinformation on the Sina Weibo platform. We propose a comprehensive and large-scale misinformation dataset, containing 11,393 pieces of misinformation and 16,416 pieces of unbiased real information with multiple modality information, with 952,955 related users. We propose a scalable weak-supervised method to annotate social bots, obtaining 68,040 social bots and 411,635 genuine accounts. To the best of our knowledge, this dataset is the largest dataset containing misinformation and social bots. We conduct comprehensive experiments and analysis on this dataset. Results show that social bots play a central role in misinformation dissemination, participating in news discussions to amplify echo chambers, manipulate public sentiment, and reverse public stances.

Submitted 18 August, 2024; originally announced August 2024.

arXiv:2408.08378 [pdf, other]

CNUCTRAN: A program for computing final nuclide concentrations using a direct simulation approach

Authors: K. A. Bala, M. R. Omar, John Y. H. Soon, W. M. H. Wan

Abstract: It is essential to precisely determine the evolving concentrations of radioactive nuclides within transmutation problems. It is also a crucial aspect of nuclear physics with widespread applications in nuclear waste management and energy production. This paper introduces CNUCTRAN, a novel computer program that employs a probabilistic approach to estimate nuclide concentrations in transmutation problems. CNUCTRAN directly simulates nuclei transformations arising from various nuclear reactions, diverging from the traditional deterministic methods that solve the Bateman equation using matrix exponential approximation. This approach effectively addresses numerical challenges associated with solving the Bateman equations, thereby circumventing the need for matrix exponential approximations that risk producing nonphysical concentrations. Our sample calculations using CNUCTRAN show that the concentration predictions of CNUCTRAN have a relative error of less than 0.001% compared to the state-of-the-art method, CRAM, in different test cases. This makes CNUCTRAN a valuable alternative tool for transmutation analysis.
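The contrast between a probabilistic update and the closed-form Bateman solution can be illustrated on a toy decay chain. The sketch below is not CNUCTRAN's algorithm. It assumes a hypothetical two-step chain A -> B -> (stable) with made-up decay constants, and moves a fraction 1 - exp(-lambda * dt) of each nuclide per substep, which is the probability that a single nucleus transforms within dt. All function names are illustrative.

```python
import math


def bateman_two_step(n0, lam_a, lam_b, t):
    """Closed-form Bateman solution for A -> B -> (stable), starting from pure A.

    Requires lam_a != lam_b.
    """
    n_a = n0 * math.exp(-lam_a * t)
    n_b = n0 * lam_a / (lam_b - lam_a) * (math.exp(-lam_a * t) - math.exp(-lam_b * t))
    return n_a, n_b


def probabilistic_two_step(n0, lam_a, lam_b, t, steps):
    """Step expected concentrations using per-substep transformation probabilities."""
    dt = t / steps
    p_a = 1.0 - math.exp(-lam_a * dt)  # chance that one A nucleus decays within dt
    p_b = 1.0 - math.exp(-lam_b * dt)  # chance that one B nucleus decays within dt
    n_a, n_b = n0, 0.0
    for _ in range(steps):
        decayed_a = p_a * n_a
        decayed_b = p_b * n_b
        n_a -= decayed_a
        n_b += decayed_a - decayed_b
    return n_a, n_b


reference = bateman_two_step(1.0, 1.0, 0.5, 2.0)
estimate = probabilistic_two_step(1.0, 1.0, 0.5, 2.0, steps=20000)
print("Bateman:      ", reference)
print("Probabilistic:", estimate)
```

Because every update moves a fraction between 0 and 1 of an existing concentration, the stepped values cannot become negative. The estimate for B carries a discretization error that shrinks as the number of substeps grows.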

Submitted 15 August, 2024; originally announced August 2024.

arXiv:2408.05503 [pdf, other]

Disentangled Noisy Correspondence Learning

Authors: Zhuohang Dang, Minnan Luo, Jihong Wang, Chengyou Jia, Haochen Han, Herun Wan, Guang Dai, Xiaojun Chang, Jingdong Wang

Abstract: Cross-modal retrieval is crucial in understanding latent correspondences across modalities. However, existing methods implicitly assume well-matched training data, which is impractical as real-world data inevitably involves imperfect alignments, i.e., noisy correspondences. Although some works explore similarity-based strategies to address such noise, they suffer from sub-optimal similarity predictions influenced by modality-exclusive information (MEI), e.g., background noise in images and abstract definitions in texts. This issue arises as MEI is not shared across modalities, thus aligning it in training can markedly mislead similarity predictions. Moreover, although intuitive, directly applying previous cross-modal disentanglement methods suffers from limited noise tolerance and disentanglement efficacy. Inspired by the robustness of information bottlenecks against noise, we introduce DisNCL, a novel information-theoretic framework for feature Disentanglement in Noisy Correspondence Learning, to adaptively balance the extraction of modality-invariant information (MII) and MEI with certifiable optimal cross-modal disentanglement efficacy. DisNCL then enhances similarity predictions in the modality-invariant subspace, thereby greatly boosting similarity-based alleviation strategies for noisy correspondences. Furthermore, DisNCL introduces soft matching targets to model noisy many-to-many relationships inherent in multi-modal input for noise-robust and accurate cross-modal alignment. Extensive experiments confirm DisNCL's efficacy with a 2% average recall improvement. Mutual information estimation and visualization results show that DisNCL learns meaningful MII/MEI subspaces, validating our theoretical analyses.

Submitted 10 August, 2024; originally announced August 2024.

arXiv:2408.04916 [pdf, other]

PTrajM: Efficient and Semantic-rich Trajectory Learning with Pretrained Trajectory-Mamba

Authors: Yan Lin, Yichen Liu, Zeyu Zhou, Haomin Wen, Erwen Zheng, Shengnan Guo, Youfang Lin, Huaiyu Wan

Abstract: Vehicle trajectories provide crucial movement information for various real-world applications. To better utilize vehicle trajectories, it is essential to develop a trajectory learning approach that can effectively and efficiently extract rich semantic information, including movement behavior and travel purposes, to support accurate downstream applications. However, creating such an approach presents two significant challenges. First, movement behaviors are inherently spatio-temporally continuous, making them difficult to extract efficiently from irregular and discrete trajectory points. Second, travel purposes are related to the functionalities of areas and road segments traversed by vehicles. These functionalities are not available from the raw spatio-temporal trajectory features and are hard to extract directly from complex textual features associated with these areas and road segments. To address these challenges, we propose PTrajM, a novel method capable of efficient and semantic-rich vehicle trajectory learning. To support efficient modeling of movement behavior, we introduce Trajectory-Mamba as the learnable model of PTrajM, which effectively extracts continuous movement behavior while being more computationally efficient than existing structures. To facilitate efficient extraction of travel purposes, we propose a travel purpose-aware pre-training procedure, which enables PTrajM to discern the travel purposes of trajectories without additional computational resources during its embedding process. Extensive experiments on two real-world datasets and comparisons with several state-of-the-art trajectory learning methods demonstrate the effectiveness of PTrajM. Code is available at https://anonymous.4open.science/r/PTrajM-C973.

Submitted 9 August, 2024; originally announced August 2024.

arXiv:2408.04499 [pdf, other]

Knowledge-Aided Semantic Communication Leveraging Probabilistic Graphical Modeling

Authors: Haowen Wan, Qianqian Yang, Jiancheng Tang, Zhiguo Shi

Abstract: In this paper, we propose a semantic communication approach based on a probabilistic graphical model (PGM). The proposed approach involves constructing a PGM from a training dataset, which is then shared as common knowledge between the transmitter and receiver. We evaluate the importance of various semantic features and present a PGM-based compression algorithm designed to eliminate predictable portions of semantic information. Furthermore, we introduce a technique to reconstruct the discarded semantic information at the receiver end, generating approximate results based on the PGM. Simulation results indicate a significant improvement in transmission efficiency over existing methods, while maintaining the quality of the transmitted images.

Submitted 8 August, 2024; originally announced August 2024.

arXiv:2408.03717 [pdf, other]

Pick of the Bunch: Detecting Infrared Small Targets Beyond Hit-Miss Trade-Offs via Selective Rank-Aware Attention

Authors: Yimian Dai, Peiwen Pan, Yulei Qian, Yuxuan Li, Xiang Li, Jian Yang, Huan Wan

Abstract: Infrared small target detection faces the inherent challenge of precisely localizing dim targets amidst complex background clutter. Traditional approaches struggle to balance detection precision and false alarm rates. To break this dilemma, we propose SeRankDet, a deep network that achieves high accuracy beyond the conventional hit-miss trade-off, by following the "Pick of the Bunch" principle. At its core lies our Selective Rank-Aware Attention (SeRank) module, employing a non-linear Top-K selection process that preserves the most salient responses, preventing target signal dilution while maintaining constant complexity. Furthermore, we replace the static concatenation typical in U-Net structures with our Large Selective Feature Fusion (LSFF) module, a dynamic fusion strategy that empowers SeRankDet with adaptive feature integration, enhancing its ability to discriminate true targets from false alarms. The network's discernment is further refined by our Dilated Difference Convolution (DDC) module, which merges differential convolution aimed at amplifying subtle target characteristics with dilated convolution to expand the receptive field, thereby substantially improving target-background separation. Despite its lightweight architecture, the proposed SeRankDet sets new benchmarks in state-of-the-art performance across multiple public datasets. The code is available at https://github.com/GrokCV/SeRankDet.

Submitted 7 August, 2024; originally announced August 2024.

arXiv:2407.20976 [pdf, other]

An iterative transversal CNOT decoder

Authors: Kwok Ho Wan, Mark Webber, Austin G. Fowler, Winfried K. Hensinger

Abstract: Modern platforms for potential qubit candidates, such as trapped ions or neutral atoms, allow long range connectivity between distant physical qubits through shuttling. This opens up an avenue for transversal logical CNOT gates between distant logical qubits, whereby physical CNOT gates are performed between each corresponding physical qubit on the control and target logical qubits. However, the transversal CNOT can propagate errors from one logical qubit to another, leading to correlated errors between logical qubits. We have developed a multi-pass iterative decoder that decodes each logical qubit separately to deal with this correlated error. We show that under circuit-level noise and only $\mathcal{O}(1)$ code cycles, a threshold can still persist, and the logical error rate will not be significantly degraded, matching the sub-threshold logical error rate scaling of $p^{\lfloor\frac{d}{2}\rfloor}$ for a distance $d$ rotated surface code.
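The scaling $p^{\lfloor\frac{d}{2}\rfloor}$ quoted in the abstract can be made concrete with a few numbers. The sketch below only evaluates that expression. The function names are hypothetical, and the prefactor of 1 is an assumption, since the abstract gives the exponent but no constant.

```python
def suppression_exponent(distance: int) -> int:
    """Exponent floor(d / 2) from the scaling quoted in the abstract."""
    if distance < 1:
        raise ValueError("code distance must be at least 1")
    return distance // 2


def logical_error_scaling(p: float, distance: int, prefactor: float = 1.0) -> float:
    """Evaluate prefactor * p ** floor(d / 2); the prefactor is an assumed constant."""
    return prefactor * p ** suppression_exponent(distance)


for d in (3, 5, 7):
    print(d, suppression_exponent(d), logical_error_scaling(1e-3, d))
```

Under these assumptions, raising the distance from 3 to 5 lifts the exponent from 1 to 2, so at a physical error rate of 1e-3 the scaling term drops from about 1e-3 to about 1e-6.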

Submitted 30 August, 2024; v1 submitted 30 July, 2024; originally announced July 2024.

arXiv:2407.18108 [pdf, other]

Graph Neural Ordinary Differential Equations for Coarse-Grained Socioeconomic Dynamics

Authors: James Koch, Pranab Roy Chowdhury, Heng Wan, Parin Bhaduri, Jim Yoon, Vivek Srikrishnan, W. Brent Daniel

Abstract: We present a data-driven machine-learning approach for modeling space-time socioeconomic dynamics. Through coarse-graining fine-scale observations, our modeling framework simplifies these complex systems to a set of tractable mechanistic relationships -- in the form of ordinary differential equations -- while preserving critical system behaviors. This approach allows for expedited 'what if' studies and sensitivity analyses, essential for informed policy-making. Our findings, from a case study of Baltimore, MD, indicate that this machine learning-augmented coarse-grained model serves as a powerful instrument for deciphering the complex interactions between social factors, geography, and exogenous stressors, offering a valuable asset for system forecasting and resilience planning.

Submitted 25 July, 2024; originally announced July 2024.

arXiv:2407.15899 [pdf, other]

doi 10.1109/TKDE.2024.3434565

Spatial-Temporal Cross-View Contrastive Pre-training for Check-in Sequence Representation Learning

Authors: Letian Gong, Huaiyu Wan, Shengnan Guo, Xiucheng Li, Yan Lin, Erwen Zheng, Tianyi Wang, Zeyu Zhou, Youfang Lin

Abstract: The rapid growth of location-based services (LBS) has yielded massive amounts of data on human mobility. Effectively extracting meaningful representations for user-generated check-in sequences is pivotal for facilitating various downstream services. However, the user-generated check-in data are simultaneously influenced by the surrounding objective circumstances and the user's subjective intention. Specifically, the temporal uncertainty and spatial diversity exhibited in check-in data make it difficult to capture the macroscopic spatial-temporal patterns of users and to understand the semantics of user mobility activities. Furthermore, the distinct characteristics of the temporal and spatial information in check-in sequences call for an effective fusion method to incorporate these two types of information. In this paper, we propose a novel Spatial-Temporal Cross-view Contrastive Representation (STCCR) framework for check-in sequence representation learning. Specifically, STCCR addresses the above challenges by employing self-supervision from "spatial topic" and "temporal intention" views, facilitating effective fusion of spatial and temporal information at the semantic level. Besides, STCCR leverages contrastive clustering to uncover users' shared spatial topics from diverse mobility activities, while employing angular momentum contrast to mitigate the impact of temporal uncertainty and noise. We extensively evaluate STCCR on three real-world datasets and demonstrate its superior performance across three downstream tasks.

Submitted 25 July, 2024; v1 submitted 22 July, 2024; originally announced July 2024.

Comments: This paper has been accepted as a regular paper at IEEE TKDE

arXiv:2407.12550 [pdf, other]

UniTE: A Survey and Unified Pipeline for Pre-training ST Trajectory Embeddings

Authors: Yan Lin, Zeyu Zhou, Yicheng Liu, Haochen Lv, Haomin Wen, Tianyi Li, Yushuai Li, Christian S. Jensen, Shengnan Guo, Youfang Lin, Huaiyu Wan

Abstract: Spatio-temporal (ST) trajectories are sequences of timestamped locations, which enable a variety of analyses that in turn enable important real-world applications. It is common to map trajectories to vectors, called embeddings, before subsequent analyses. Thus, the qualities of embeddings are very important. Methods for pre-training embeddings, which leverage unlabeled trajectories for training universal embeddings, have shown promising applicability across different tasks, thus attracting considerable interest. However, research progress on this topic faces two key challenges: a lack of a comprehensive overview of existing methods, resulting in several related methods not being well-recognized, and the absence of a unified pipeline, complicating the development of new methods and the analysis of methods. To overcome these obstacles and advance the field of pre-training of trajectory embeddings, we present UniTE, a survey and a unified pipeline for this domain. In doing so, we present a comprehensive list of existing methods for pre-training trajectory embeddings, which includes methods that either explicitly or implicitly employ pre-training techniques. Further, we present a unified and modular pipeline with publicly available underlying code, simplifying the process of constructing and evaluating methods for pre-training trajectory embeddings. Additionally, we contribute a selection of experimental results using the proposed pipeline on real-world datasets.

Submitted 17 July, 2024; originally announced July 2024.

arXiv:2407.09096 [pdf, other]

STD-PLM: Understanding Both Spatial and Temporal Properties of Spatial-Temporal Data with PLM

Authors: YiHeng Huang, Xiaowei Mao, Shengnan Guo, Yubin Chen, Junfeng Shen, Tiankuo Li, Youfang Lin, Huaiyu Wan

Abstract: Spatial-temporal forecasting and imputation are important for real-world intelligent systems. Most existing methods are tailored for individual forecasting or imputation tasks but are not designed for both. Additionally, they are less effective for zero-shot and few-shot learning. While pre-trained language models (PLMs) have exhibited strong pattern recognition and reasoning abilities across various tasks, including few-shot and zero-shot learning, their application in spatial-temporal data understanding has been constrained by insufficient modeling of complex correlations such as the temporal correlations, spatial connectivity, and non-pairwise and high-order spatial-temporal correlations within data. In this paper, we propose STD-PLM for understanding both spatial and temporal properties of Spatial-Temporal Data with PLM, which is capable of implementing both spatial-temporal forecasting and imputation tasks. STD-PLM understands spatial-temporal correlations via explicitly designed spatial and temporal tokenizers. Topology-aware node embeddings are designed for the PLM to comprehend and exploit the topology structure of data in an inductive manner. Furthermore, to mitigate the efficiency issues introduced by the PLM, we design a sandglass attention module (SGA) combined with a specific constrained loss function, which significantly improves the model's efficiency while ensuring performance. Extensive experiments demonstrate that STD-PLM exhibits competitive performance and generalization capabilities across the forecasting and imputation tasks on various datasets. Moreover, STD-PLM achieves promising results on both few-shot and zero-shot tasks.

Submitted 27 August, 2024; v1 submitted 12 July, 2024; originally announced July 2024.

arXiv:2407.05228 [pdf]

Thermal Preconditioning of Membrane Stress to Control the Shapes of Ultrathin Crystals

Authors: Hao Wan, Geunwoong Jeon, Gregory M. Grason, Maria M. Santore

Abstract: We employ the phospholipid bilayer membranes of giant unilamellar vesicles as a free-standing environment for the growth of membrane-integrated ultrathin phospholipid crystals possessing a variety of shapes with 6-fold symmetry. Crystal growth within vesicle membranes, where more elaborate shapes grow on larger vesicles, is dominated by the bending energy of the membrane itself, creating a means to manipulate crystal morphology. Here we demonstrate how cooling rate preconditions the membrane tension before nucleation, in turn regulating nucleation and growth, and directing the morphology of crystals by the time they are large enough to be visualized. The crystals retain their shapes during further growth through the two-phase region. Experiments demonstrate this behavior for single crystals growing within the membrane of each vesicle, ultimately comprising up to 13% of the vesicle area and length scales of up to 50 microns. A model for stress evolution, employing only physical property data, reveals how the competition between thermal membrane contraction and water diffusion from tensed vesicles produces a size- and time-dependence of the membrane tension as a result of cooling history. The tension, critical in the contribution of bending energy in the fluid membrane regions, in turn selects for crystal shape for vesicles of a given size. The model reveals unanticipated behaviors including a low steady state tension on small vesicles that allows compact domains to develop, rapid tension development on large vesicles producing flower-shaped domains, and a stress relaxation through water diffusion across the membrane with a time constant scaling as the square of the vesicle radius, consistent with measurable tensions only in the largest vesicles.

Submitted 6 July, 2024; originally announced July 2024.

Comments: Main text 32 pages and 7 figures; SI 4 pages and 4 figures

arXiv:2406.20015 [pdf, other]

ToolBeHonest: A Multi-level Hallucination Diagnostic Benchmark for Tool-Augmented Large Language Models

Authors: Yuxiang Zhang, Jing Chen, Junjie Wang, Yaxin Liu, Cheng Yang, Chufan Shi, Xinyu Zhu, Zihao Lin, Hanwen Wan, Yujiu Yang, Tetsuya Sakai, Tian Feng, Hayato Yamana

Abstract: Tool-augmented large language models (LLMs) are rapidly being integrated into real-world applications. Due to the lack of benchmarks, the community still needs to fully understand the hallucination issues within these models. To address this challenge, we introduce a comprehensive diagnostic benchmark, ToolBH. Specifically, we assess the LLM's hallucinations through two perspectives: depth and breadth. In terms of depth, we propose a multi-level diagnostic process, including (1) solvability detection, (2) solution planning, and (3) missing-tool analysis. For breadth, we consider three scenarios based on the characteristics of the toolset: missing necessary tools, potential tools, and limited functionality tools. Furthermore, we developed seven tasks and collected 700 evaluation samples through multiple rounds of manual annotation. The results show the significant challenges presented by the ToolBH benchmark. The current advanced models Gemini-1.5-Pro and GPT-4o achieve total scores of only 45.3 and 37.0, respectively, on a scale of 100. In this benchmark, larger model parameters do not guarantee better performance; the training data and response strategies also play a crucial role in tool-enhanced LLM scenarios. Our diagnostic analysis indicates that the primary reason for model errors lies in assessing task solvability. Additionally, open-weight models suffer from performance drops with verbose replies, whereas proprietary models excel with longer reasoning.

Submitted 28 June, 2024; originally announced June 2024.

arXiv:2406.00734 [pdf, other]

GLADformer: A Mixed Perspective for Graph-level Anomaly Detection

Authors: Fan Xu, Nan Wang, Hao Wu, Xuezhi Wen, Dalin Zhang, Siyang Lu, Binyong Li, Wei Gong, Hai Wan, Xibin Zhao

Abstract: Graph-Level Anomaly Detection (GLAD) aims to distinguish anomalous graphs within a graph dataset. However, current methods are constrained by their receptive fields, struggling to learn global features within the graphs. Moreover, most contemporary methods are based on spatial domain and lack exploration of spectral characteristics. In this paper, we propose a multi-perspective hybrid graph-level anomaly detector namely GLADformer, consisting of two key modules. Specifically, we first design a Graph Transformer module with global spectrum enhancement, which ensures balanced and resilient parameter distributions by fusing global features and spectral distribution characteristics. Furthermore, to uncover local anomalous attributes, we customize a band-pass spectral GNN message passing module that further enhances the model's generalization capability. Through comprehensive experiments on ten real-world datasets from multiple domains, we validate the effectiveness and robustness of GLADformer. This demonstrates that GLADformer outperforms current state-of-the-art models in graph-level anomaly detection, particularly in effectively capturing global anomaly representations and spectral characteristics.

Submitted 3 July, 2024; v1 submitted 2 June, 2024; originally announced June 2024.

arXiv:2405.16835 [pdf]

Superionic surface Li-ion transport in carbonaceous materials

Authors: Jianbin Zhou, Shen Wang, Chaoshan Wu, Ji Qi, Hongli Wan, Shen Lai, Shijie Feng, Tsz Wai Ko, Zhaohui Liang, Ke Zhou, Nimrod Harpak, Nick Solan, Mengchen Liu, Zeyu Hui, Paulina J. Ai, Kent Griffith, Chunsheng Wang, Shyue Ping Ong, Yan Yao, Ping Liu

Abstract: Unlike Li-ion transport in the bulk of carbonaceous materials, little is known about Li-ion diffusion on their surface. In this study, we have discovered an ultra-fast Li-ion transport phenomenon on the surface of carbonaceous materials, particularly when they have limited Li insertion capacity along with a high surface area. This is exemplified by a carbon black, Ketjen Black (KB). An ionic conductivity of 18.1 mS cm⁻¹ at room temperature is observed, far exceeding most solid-state ion conductors. Theoretical calculations reveal a low diffusion barrier for the surface Li species. The species is also identified as Li*, which features a partial positive charge. As a result, lithiated KB functions effectively as an interlayer between Li and solid-state electrolytes (SSE) to mitigate dendrite growth and cell shorting. This function is found to be electrolyte agnostic, effective for both sulfide and halide SSEs. Further, lithiated KB can act as a high-performance mixed ion/electron conductor that is thermodynamically stable at potentials near Li metal. A graphite anode mixed with KB instead of a solid electrolyte demonstrates full utilization with a capacity retention of ~85% over 300 cycles. The discovery of this surface-mediated ultra-fast Li-ion transport mechanism provides new directions for the design of solid-state ion conductors and solid-state batteries.

Submitted 27 May, 2024; originally announced May 2024.

Comments: 21 pages, 6 figures

arXiv:2405.12459 [pdf, other]

TrajCogn: Leveraging LLMs for Cognizing Movement Patterns and Travel Purposes from Trajectories

Authors: Zeyu Zhou, Yan Lin, Haomin Wen, Qisen Xu, Shengnan Guo, Jilin Hu, Youfang Lin, Huaiyu Wan

Abstract: Spatio-temporal trajectories are crucial in various data mining tasks. It is important to develop a versatile trajectory learning method that performs different tasks with high accuracy. This involves effectively extracting two core aspects of information--movement patterns and travel purposes--from trajectories. However, this is challenging due to limitations in model capacity and the quality and scale of trajectory datasets. Meanwhile, large language models (LLMs) have shown great success in versatility by training on large-scale, high-quality datasets. Given the similarities between trajectories and sentences, there's potential to leverage LLMs to develop an effective trajectory learning method. However, standard LLMs are not designed to handle the unique spatio-temporal features of trajectories and cannot extract movement patterns and travel purposes. To address these challenges, we propose a model called TrajCogn that effectively utilizes LLMs to model trajectories. TrajCogn leverages the strengths of LLMs to create a versatile trajectory learning approach while addressing the limitations of standard LLMs. First, TrajCogn incorporates a novel trajectory semantic embedder that enables LLMs to process spatio-temporal features and extract movement patterns and travel purposes. Second, TrajCogn introduces a new trajectory prompt that integrates these patterns and purposes into LLMs, allowing the model to adapt to various tasks. Extensive experiments on two real-world datasets and two representative tasks demonstrate that TrajCogn successfully achieves its design goals. Codes are available at https://anonymous.4open.science/r/TrajCogn-5021.

Submitted 9 August, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

arXiv:2404.19141 [pdf, other]

Micro-Macro Spatial-Temporal Graph-based Encoder-Decoder for Map-Constrained Trajectory Recovery

Authors: Tonglong Wei, Youfang Lin, Yan Lin, Shengnan Guo, Lan Zhang, Huaiyu Wan

Abstract: Recovering intermediate missing GPS points in a sparse trajectory, while adhering to the constraints of the road network, could offer deep insights into users' moving behaviors in intelligent transportation systems. Although recent studies have demonstrated the advantages of achieving map-constrained trajectory recovery in an end-to-end manner, they still face two significant challenges. Firstly, existing methods are mostly sequence-based models. It is extremely hard for them to comprehensively capture the micro-semantics of an individual trajectory, including the information of each GPS point and the movement between two GPS points. Secondly, existing approaches ignore the impact of the macro-semantics, i.e., the road conditions and people's shared travel preferences reflected by a group of trajectories. To address the above challenges, we propose a Micro-Macro Spatial-Temporal Graph-based Encoder-Decoder (MM-STGED). Specifically, we model each trajectory as a graph to efficiently describe the micro-semantics of the trajectory and design a novel message-passing mechanism to learn trajectory representations. Additionally, we extract the macro-semantics of trajectories and further incorporate them into a well-designed graph-based decoder to guide trajectory recovery. Extensive experiments conducted on sparse trajectories with three different sampling intervals, constructed from two real-world trajectory datasets, demonstrate the superiority of our proposed model.

Submitted 29 April, 2024; originally announced April 2024.

Comments: This paper has been accepted as a regular paper at IEEE TKDE

arXiv:2404.09355 [pdf, other]

Shape equilibria of vesicles with rigid planar inclusions

Authors: Geunwoong Jeon, Justin Fagnoni, Hao Wan, Maria M. Santore, Gregory M. Grason

Abstract: Motivated by recent studies of two-phase lipid vesicles possessing 2D solid domains integrated within a fluid bilayer phase, we study the shape equilibria of closed vesicles possessing a single planar, circular inclusion. While 2D solid elasticity tends to expel Gaussian curvature, topology requires closed vesicles to maintain an average, non-zero Gaussian curvature leading to an elementary mechanism of shape frustration that increases with inclusion size. We study elastic ground states of the Helfrich model of the planar-fluid composite vesicles, analytically and computationally, as a function of planar fraction and reduced volume. Notably, we show that incorporation of a planar inclusion of only a few percent dramatically shifts the ground state shapes of vesicles from predominantly prolate to oblate, and moreover, shifts the optimal surface to volume ratio far from spherical shapes. We show that for sufficiently small planar inclusions, the elastic ground states break symmetry via a complex variety of asymmetric oblate, prolate, and triaxial shapes, while inclusion sizes above about 8% drive composite vesicles to adopt axisymmetric oblate shapes. These predictions cast useful light on the emergent shape and mechanical responses of fluid-solid composite vesicles.

Submitted 14 April, 2024; originally announced April 2024.

Comments: 15 pages, 13 figures, 3 appendices

arXiv:2403.13432 [pdf]

Influence of chemical composition on the room temperature plasticity of C15 Ca-Al-Mg Laves phases

Authors: Martina Freund, Zhuocheng Xie, Pei-Ling Sun, Lukas Berners, Joshua Spille, Hexin Wan, Carsten Thomas, Michael Feuerbacher, Marta Lipinska-Chwalek, Joachim Mayer, Sandra Korte-Kerzel

Abstract: The influence of chemical composition changes on the room temperature mechanical properties in the C15 CaAl2 Laves phase was investigated in two off-stoichiometric compositions with 5.7 at.-% Mg addition (Ca33Al61Mg6) and 10.8 at.-% Mg and 3.0 at.-% Ca addition (Ca36Al53Mg11) and compared to the stoichiometric (Ca33Al67) composition. Cubic Ca-Al-Mg Laves phases with multiple crystallographic orientations were characterised and deformed using nanoindentation. The hardness and indentation modulus were measured to be 4.1 ± 0.3 GPa and 71.3 ± 1.5 GPa for Ca36Al53Mg11, 4.6 ± 0.2 GPa and 80.4 ± 3.8 GPa for Ca33Al61Mg6 and 4.9 ± 0.3 GPa and 85.5 ± 4.0 GPa for Ca33Al67, respectively. The resulting surface traces, as well as slip and crack planes, were distinguished on the indentation surfaces, revealing the activation of several different {11n} slip systems, as further confirmed by conventional transmission electron microscopic observations. Additionally, the deformation mechanisms and corresponding energy barriers of activated slip systems were evaluated by atomistic simulations.

Submitted 20 March, 2024; originally announced March 2024.

arXiv:2403.11432 [pdf, other]

Demystifying the Physics of Deep Reinforcement Learning-Based Autonomous Vehicle Decision-Making

Authors: Hanxi Wan, Pei Li, Arpan Kusari

Abstract: With the advent of universal function approximators in the domain of reinforcement learning, the number of practical applications leveraging deep reinforcement learning (DRL) has exploded. Decision-making in autonomous vehicles (AVs) has emerged as a chief application among them, taking the sensor data or the higher-order kinematic variables as the input and providing a discrete choice or continuous control output. There has been a continuous effort to understand the black-box nature of the DRL models, but so far, there hasn't been any discussion (to the best of authors' knowledge) about how the models learn the physical process. This presents an overwhelming limitation that restricts the real-world deployment of DRL in AVs. Therefore, in this research work, we try to decode the knowledge learnt by the attention-based DRL framework about the physical process. We use a continuous proximal policy optimization-based DRL algorithm as the baseline model and add a multi-head attention framework in an open-source AV simulation environment. We provide some analytical techniques for discussing the interpretability of the trained models in terms of explainability and causality for spatial and temporal correlations. We show that the weights in the first head encode the positions of the neighboring vehicles while the second head focuses on the leader vehicle exclusively. Also, the ego vehicle's action is causally dependent on the vehicles in the target lane spatially and temporally. Through these findings, we reliably show that these techniques can help practitioners decipher the results of the DRL algorithms.

Submitted 13 June, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

Comments: Submitted for peer-review

arXiv:2403.09733 [pdf, other]

OverleafCopilot: Empowering Academic Writing in Overleaf with Large Language Models

Authors: Haomin Wen, Zhenjie Wei, Yan Lin, Jiyuan Wang, Yuxuan Liang, Huaiyu Wan

Abstract: The rapid development of Large Language Models (LLMs) has facilitated a variety of applications from different domains. In this technical report, we explore the integration of LLMs and the popular academic writing tool, Overleaf, to enhance the efficiency and quality of academic writing. To achieve the above goal, there are three challenges: i) enabling seamless interaction between Overleaf and LLMs, ii) establishing reliable communication with the LLM provider, and iii) ensuring user privacy. To address these challenges, we present OverleafCopilot, the first-ever tool (i.e., a browser extension) that seamlessly integrates LLMs and Overleaf, enabling researchers to leverage the power of LLMs while writing papers. Specifically, we first propose an effective framework to bridge LLMs and Overleaf. Then, we develop PromptGenius, a website for researchers to easily find and share high-quality up-to-date prompts. Thirdly, we propose an agent command system to help researchers quickly build their customizable agents. OverleafCopilot (https://chromewebstore.google.com/detail/overleaf-copilot/eoadabdpninlhkkbhngoddfjianhlghb ) is available on the Chrome Extension Store and now serves thousands of researchers. Additionally, the code of PromptGenius is released at https://github.com/wenhaomin/ChatGPT-PromptGenius. We believe our work has the potential to revolutionize academic writing practices, empowering researchers to produce higher-quality papers in less time.

Submitted 13 March, 2024; originally announced March 2024.

arXiv:2403.05268 [pdf, ps, other]

Deep Prompt Multi-task Network for Abuse Language Detection

Authors: Jian Zhu, Yuping Ruan, Jingfei Chang, Wenhui Sun, Hui Wan, Jian Long, Cheng Luo

Abstract: The detection of abusive language remains a long-standing challenge with the extensive use of social networks. The detection task of abusive language suffers from limited accuracy. We argue that the existing detection methods utilize the fine-tuning technique of the pre-trained language models (PLMs) to handle downstream tasks. Hence, these methods fail to stimulate the general knowledge of the PLMs. To address the problem, we propose a novel Deep Prompt Multi-task Network (DPMN) for abuse language detection. Specifically, DPMN first attempts to design two forms of deep prompt tuning and light prompt tuning for the PLMs. The effects of different prompt lengths, tuning strategies, and prompt initialization methods on detecting abusive language are studied. In addition, we propose a Task Head based on Bi-LSTM and FFN, which can be used as a short text classifier. Eventually, DPMN utilizes multi-task learning to improve detection metrics further. The multi-task network has the function of transferring effective knowledge. The proposed DPMN is evaluated against eight typical methods on three public datasets: OLID, SOLID, and AbuseAnalyzer. The experimental results show that our DPMN outperforms the state-of-the-art methods.

Submitted 24 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

Comments: Accepted by the International Conference on Pattern Recognition (ICPR) 2024

arXiv:2402.12795 [pdf]

Symmetry-breaking-induced giant Stark effect in 2D Janus materials

Authors: Jiang-Yu Lu, Wu-Yu Chen, Lei Li, Tao Huang, Hui Wan, Zi-Xuan Yang, Gui-Fang Huang, Wangyu Hu, Wei-Qing Huang

Abstract: Symmetry breaking generally induces exotic physical properties, particularly for low-dimensional materials. Herein we demonstrate that symmetry breaking induces a giant Stark effect in 2D Janus materials using group IV-V monolayers with a four-atom-layer structure as a model system, which are constructed by Ge and As element substitution of symmetrical SnSb monolayer. A linear giant Stark effect is found in Janus semiconductor monolayers, as verified by the band gap variation of up to 134 meV in the Sn2SbAs monolayer, which is 30 times larger than that of the SnSb monolayer (4 meV) when the applied electric field is increased from -0.30 to 0.30 V/Å. By considering the induced electronic field, we propose a generalized and effective formula that efficiently determines the band gap variation owing to the Stark effect. The results calculated from the proposed formula agree well with those from the DFT-HSE06 functional. The giant Stark effect originates from the large spatial separation of the centers of the conduction band minimum and valence band maximum states of the Janus structure due to its intrinsic potential gradient. The wide-range tuning of band gap under electronic field shows potential applications of 2D Janus materials in optoelectronic devices.

Submitted 20 February, 2024; originally announced February 2024.

Comments: 10 pages, 5 figures

arXiv:2402.10426 [pdf, other]

DELL: Generating Reactions and Explanations for LLM-Based Misinformation Detection

Authors: Herun Wan, Shangbin Feng, Zhaoxuan Tan, Heng Wang, Yulia Tsvetkov, Minnan Luo

Abstract: Large language models are limited by challenges in factuality and hallucinations to be directly employed off-the-shelf for judging the veracity of news articles, where factual accuracy is paramount. In this work, we propose DELL that identifies three key stages in misinformation detection where LLMs could be incorporated as part of the pipeline: 1) LLMs could generate news reactions to represent diverse perspectives and simulate user-news interaction networks; 2) LLMs could generate explanations for proxy tasks (e.g., sentiment, stance) to enrich the contexts of news articles and produce experts specializing in various aspects of news understanding; 3) LLMs could merge task-specific experts and provide an overall prediction by incorporating the predictions and confidence scores of varying experts. Extensive experiments on seven datasets with three LLMs demonstrate that DELL outperforms state-of-the-art baselines by up to 16.8% in macro F1-score. Further analysis reveals that the generated reactions and explanations are greatly helpful in misinformation detection, while our proposed LLM-guided expert merging helps produce better-calibrated predictions.

Submitted 4 July, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

arXiv:2402.07369 [pdf, other]

Diff-RNTraj: A Structure-aware Diffusion Model for Road Network-constrained Trajectory Generation

Authors: Tonglong Wei, Youfang Lin, Shengnan Guo, Yan Lin, Yiheng Huang, Chenyang Xiang, Yuqing Bai, Menglu Ya, Huaiyu Wan

Abstract: Trajectory data is essential for various applications as it records the movement of vehicles. However, publicly available trajectory datasets remain limited in scale due to privacy concerns, which hinders the development of trajectory data mining and trajectory-based applications. To address this issue, some methods for generating synthetic trajectories have been proposed to expand the scale of the dataset. However, all existing methods generate trajectories in the geographical coordinate system, which poses two limitations for their utilization in practical applications: 1) the inability to ensure that the generated trajectories are constrained on the road; 2) the lack of road-related information. In this paper, we propose a new problem to meet the practical application need, i.e., road network-constrained trajectory (RNTraj) generation, which can directly generate trajectories on the road network with road-related information. RNTraj is a hybrid type of data, in which each point is represented by a discrete road segment and a continuous moving rate. To generate RNTraj, we design a diffusion model called Diff-RNTraj. This model can effectively handle the hybrid RNTraj using a continuous diffusion framework by incorporating a pre-training strategy to embed hybrid RNTraj into continuous representations. During the sampling stage, an RNTraj decoder is designed to map the continuous representation generated by the diffusion model back to the hybrid RNTraj format. Furthermore, Diff-RNTraj introduces a novel loss function to enhance the spatial validity of the generated trajectories. Extensive experiments conducted on two real-world trajectory datasets demonstrate the effectiveness of the proposed model.

Submitted 11 February, 2024; originally announced February 2024.

arXiv:2402.07232 [pdf, other]

UVTM: Universal Vehicle Trajectory Modeling with ST Feature Domain Generation

Authors: Yan Lin, Jilin Hu, Shengnan Guo, Bin Yang, Christian S. Jensen, Youfang Lin, Huaiyu Wan

Abstract: Vehicle movement is frequently captured in the form of trajectories, i.e., sequences of timestamped locations. Numerous methods exist that target different tasks involving trajectories such as travel-time estimation, trajectory recovery, and trajectory prediction. However, most methods target only one specific task and cannot be applied universally. Existing efforts to create a universal trajectory model often involve adding prediction modules for adapting to different tasks, while also struggling with incomplete or sparse trajectories. To address these shortcomings, we propose the Universal Vehicle Trajectory Model (UVTM) designed to support different tasks based on incomplete or sparse trajectories without the need for retraining or extra prediction modules. To address task adaptability on incomplete trajectories, UVTM divides the spatio-temporal features of trajectories into three distinct domains. Each domain can be masked and generated independently to suit the input and output needs of specific tasks. To handle sparse trajectories effectively, UVTM is pre-trained by reconstructing densely sampled trajectories from sparsely sampled ones, allowing it to extract detailed spatio-temporal information from sparse trajectories. Experiments involving three representative trajectory-related tasks on two real-world vehicle trajectory datasets provide insight into the intended properties and performance of UVTM and offer evidence that UVTM is capable of meeting its objectives.

Submitted 23 April, 2024; v1 submitted 11 February, 2024; originally announced February 2024.

arXiv:2402.04454 [pdf, other]

Evolving Mobile Cloud Gaming with 5G Standalone Network Telemetry

Authors: Haoran Wan, Kyle Jamieson

Abstract: Mobile cloud gaming places the simultaneous demands of high capacity and low latency on the wireless network, demands that Private and Metropolitan-Area Standalone 5G networks are poised to meet. However, lacking introspection into the 5G Radio Access Network (RAN), cloud gaming servers are ill-poised to cope with the vagaries of the wireless last hop to a mobile client, while 5G network operators run mostly closed networks, limiting their potential for co-design with the wider internet and user applications. This paper presents Telesa, a passive, incrementally-deployable, and independently-deployable Standalone 5G network telemetry system that streams fine-grained RAN capacity, latency, and retransmission information to application servers to enable better millisecond scale, application-level decisions on offered load and bit rate adaptation than end-to-end latency measurements or end-to-end packet losses currently permit. We design, implement, and evaluate a Telesa telemetry-enhanced game streaming platform, demonstrating exact congestion-control that can better adapt game video bitrate while simultaneously controlling end-to-end latency, thus maximizing game quality of experience. Our experimental evaluation on a production 5G Standalone network demonstrates a 178-249% Quality of Experience improvement versus two state-of-the-art cloud gaming applications.

Submitted 6 February, 2024; originally announced February 2024.

arXiv:2402.00371 [pdf, other]

What Does the Bot Say? Opportunities and Risks of Large Language Models in Social Media Bot Detection

Authors: Shangbin Feng, Herun Wan, Ningnan Wang, Zhaoxuan Tan, Minnan Luo, Yulia Tsvetkov

Abstract: Social media bot detection has always been an arms race between advancements in machine learning bot detectors and adversarial bot strategies to evade detection. In this work, we bring the arms race to the next level by investigating the opportunities and risks of state-of-the-art large language models (LLMs) in social bot detection. To investigate the opportunities, we design novel LLM-based bot detectors by proposing a mixture-of-heterogeneous-experts framework to divide and conquer diverse user information modalities. To illuminate the risks, we explore the possibility of LLM-guided manipulation of user textual and structured information to evade detection. Extensive experiments with three LLMs on two datasets demonstrate that instruction tuning on merely 1,000 annotated examples produces specialized LLMs that outperform state-of-the-art baselines by up to 9.1% on both datasets, while LLM-guided manipulation strategies could significantly bring down the performance of existing bot detectors by up to 29.6% and harm the calibration and reliability of bot detection systems.

Submitted 4 July, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

Comments: ACL 2024

arXiv:2401.17188 [pdf, other]

Nested Construction of Polar Codes via Transformers

Authors: Sravan Kumar Ankireddy, S Ashwin Hebbar, Heping Wan, Joonyoung Cho, Charlie Zhang

Abstract: Tailoring polar code construction for decoding algorithms beyond successive cancellation has remained a topic of significant interest in the field. However, despite the inherent nested structure of polar codes, the use of sequence models in polar code construction is understudied. In this work, we propose using a sequence modeling framework to iteratively construct a polar code for any given length and rate under various channel conditions. Simulations show that polar codes designed via sequential modeling using transformers outperform both 5G-NR sequence and Density Evolution based approaches for both AWGN and Rayleigh fading channels.

Submitted 30 January, 2024; originally announced January 2024.

Comments: 7 pages; 8 figures

arXiv:2401.13880 [pdf, other]

Principal Component Regression to Study the Impact of Economic Factors on Disadvantaged Communities

Authors: Narmadha M. Mohankumar, Milan Jain, Heng Wan, Sumitrra Ganguli, Kyle D. Wilson, David M. Anderson

Abstract: The Council on Environmental Quality's Climate and Economic Justice Screening Tool defines "disadvantaged communities" (DAC) in the USA, highlighting census tracts where benefits of climate and energy investments are not accruing. We use a principal component generalized linear model, which addresses the intertwined nature of economic factors, income and employment and model their relationship to DAC status. Our study 1) identifies the most significant income groups and employment industries that impact DAC status, 2) provides the probability of DAC status across census tracts and compares the predictive accuracy with widely used machine learning approaches, 3) obtains historical predictions of the probability of DAC status, 4) obtains spatial downscaling of DAC status across block groups. Our study provides valuable insights for policymakers and stakeholders to develop strategies that promote sustainable development and address inequities in climate and energy investments in the USA.

Submitted 24 January, 2024; originally announced January 2024.

Comments: 13 pages, 9 figures, 2 tables

arXiv:2312.06441 [pdf, other]

Revisiting Graph-Based Fraud Detection in Sight of Heterophily and Spectrum

Authors: Fan Xu, Nan Wang, Hao Wu, Xuezhi Wen, Xibin Zhao, Hai Wan

Abstract: Graph-based fraud detection (GFD) can be regarded as a challenging semi-supervised node binary classification task. In recent years, Graph Neural Networks (GNNs) have been widely applied to GFD, characterizing the anomalous possibility of a node by aggregating neighbor information. However, fraud graphs are inherently heterophilic, so most GNNs perform poorly due to their assumption of homophily. In addition, due to the existence of heterophily and the class imbalance problem, the existing models do not fully utilize the precious node label information. To address the above issues, this paper proposes a semi-supervised GNN-based fraud detector, SEC-GFD. This detector includes a hybrid filtering module and a local environmental constraint module, which are used to address the heterophily and label utilization problems, respectively. The first module starts from the perspective of the spectral domain and solves the heterophily problem to a certain extent. Specifically, it divides the spectrum into various mixed-frequency bands based on the correlation between spectrum energy distribution and heterophily. Then, in order to make full use of the node label information, a local environmental constraint module is adaptively designed. The comprehensive experimental results on four real-world fraud detection datasets show that SEC-GFD outperforms other competitive graph-based fraud detectors. We release our code at https://github.com/Sunxkissed/SEC-GFD.

Submitted 8 July, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

arXiv:2312.01601 [pdf, other]

Local-Global History-aware Contrastive Learning for Temporal Knowledge Graph Reasoning

Authors: Wei Chen, Huaiyu Wan, Yuting Wu, Shuyuan Zhao, Jiayaqi Cheng, Yuxin Li, Youfang Lin

Abstract: Temporal knowledge graphs (TKGs) have been identified as a promising approach to represent the dynamics of facts along the timeline. The extrapolation of TKG is to predict unknowable facts happening in the future, holding significant practical value across diverse fields. Most extrapolation studies in TKGs focus on modeling global historical fact repeating and cyclic patterns, as well as local historical adjacent fact evolution patterns, showing promising performance in predicting future unknown facts. Yet, existing methods still face two major challenges: (1) They usually neglect the importance of historical information in KG snapshots related to the queries when encoding the local and global historical information; (2) They exhibit weak anti-noise capabilities, which hinders their performance when the inputs are contaminated with noise. To this end, we propose a novel Local-global history-aware Contrastive Learning model (LogCL) for TKG reasoning, which adopts contrastive learning to better guide the fusion of local and global historical information and enhance the ability to resist interference. Specifically, for the first challenge, LogCL proposes an entity-aware attention mechanism applied to the local and global historical facts encoder, which captures the key historical information related to queries. For the latter issue, LogCL designs four historical query contrast patterns, effectively improving the robustness of the model. The experimental results on four benchmark datasets demonstrate that LogCL delivers better and more robust performance than the state-of-the-art baselines.

Submitted 3 December, 2023; originally announced December 2023.

Comments: 14 pages, Accept ICDE2024

arXiv:2311.08705 [pdf, other]

Evaluating Robustness of Dialogue Summarization Models in the Presence of Naturally Occurring Variations

Authors: Ankita Gupta, Chulaka Gunasekara, Hui Wan, Jatin Ganhotra, Sachindra Joshi, Marina Danilevsky

Abstract: Dialogue summarization task involves summarizing long conversations while preserving the most salient information. Real-life dialogues often involve naturally occurring variations (e.g., repetitions, hesitations) and existing dialogue summarization models suffer from performance drop on such conversations. In this study, we systematically investigate the impact of such variations on state-of-the-art dialogue summarization models using publicly available datasets. To simulate real-life variations, we introduce two types of perturbations: utterance-level perturbations that modify individual utterances with errors and language variations, and dialogue-level perturbations that add non-informative exchanges (e.g., repetitions, greetings). We conduct our analysis along three dimensions of robustness: consistency, saliency, and faithfulness, which capture different aspects of the summarization model's performance. We find that both fine-tuned and instruction-tuned models are affected by input variations, with the latter being more susceptible, particularly to dialogue-level perturbations. We also validate our findings via human evaluation. Finally, we investigate if the robustness of fine-tuned models can be improved by training them with a fraction of perturbed data and observe that this approach is insufficient to address robustness challenges with current models and thus warrants a more thorough investigation to identify better solutions. Overall, our work highlights robustness challenges in dialogue summarization and provides insights for future research.

Submitted 15 November, 2023; originally announced November 2023.

arXiv:2311.01759 [pdf, other]

TinyFormer: Efficient Transformer Design and Deployment on Tiny Devices

Authors: Jianlei Yang, Jiacheng Liao, Fanding Lei, Meichen Liu, Junyi Chen, Lingkun Long, Han Wan, Bei Yu, Weisheng Zhao

Abstract: Developing deep learning models on tiny devices (e.g. Microcontroller units, MCUs) has attracted much attention in various embedded IoT applications. However, it is challenging to efficiently design and deploy recent advanced models (e.g. transformers) on tiny devices due to their severe hardware resource constraints. In this work, we propose TinyFormer, a framework specifically designed to develop and deploy resource-efficient transformers on MCUs. TinyFormer mainly consists of SuperNAS, SparseNAS and SparseEngine. Separately, SuperNAS aims to search for an appropriate supernet from a vast search space. SparseNAS evaluates the best sparse single-path model including transformer architecture from the identified supernet. Finally, SparseEngine efficiently deploys the searched sparse models onto MCUs. To the best of our knowledge, SparseEngine is the first deployment framework capable of performing inference of sparse models with transformer on MCUs. Evaluation results on the CIFAR-10 dataset demonstrate that TinyFormer can develop efficient transformers with an accuracy of 96.1% while adhering to hardware constraints of 1MB storage and 320KB memory. Additionally, TinyFormer achieves significant speedups in sparse inference, up to 12.2×, when compared to the CMSIS-NN library. TinyFormer is believed to bring powerful transformers into TinyML scenarios and greatly expand the scope of deep learning applications.

Submitted 3 November, 2023; originally announced November 2023.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2311.01654 [pdf]

Correlation between spin state and activity for hydrogen evolution

Authors: Tao Zhang, Lei Li, Tao Huang, Hui Wan, Wu-Yu Chen, Zi-Xuan Yang, Gui-Fang Huang, Wangyu Hu, Wei-Qing Hang

Abstract: Spin plays a key role in physical and chemical reactions, such as oxygen evolution and hydrogen evolution reactions (OER/HER); but the spin-activity correlation has remained unclear. Based on a transition metal (TM)-doped PtN2 monolayer model with a well-defined spin center as adsorption site, we here reveal that only active spin state can enhance the strength of hydrogen adsorption, while inert spin state offers very little influence. Specifically, the unpaired electron along the out-of-plane direction such as in dZ2 orbital, acting as an active spin state, will strongly hybridize with hydrogen, resulting in enhanced hydrogen binding energy because dZ2 orbital is just enough to accommodate two electrons to form a bonding orbital. While the in-plane unpaired electron such as in dX2-Y2 orbital, plays a negligible role in adsorbing hydrogen atom. This is verified by a series of single atom catalysts comprising of PtN2 monolayer by replacing Pt atom with a TM (Fe, Co, Ni, Ru, Rh, Pd, Os, or Ir) atom, or subsequent adsorbing a Cl atom. One of the most promising materials is Pd@PtN2-Cl that offers superior HER activity, even better than pure Pt. This work uncovers the nature of spin-activity correlation, thus paving the way for the design of high-performance catalysts through spin-engineering.

Submitted 2 November, 2023; originally announced November 2023.

Comments: 5 figures

arXiv:2310.13411 [pdf, other]

Towards Enhancing Relational Rules for Knowledge Graph Link Prediction

Authors: Shuhan Wu, Huaiyu Wan, Wei Chen, Yuting Wu, Junfeng Shen, Youfang Lin

Abstract: Graph neural networks (GNNs) have shown promising performance for knowledge graph reasoning. A recent variant of GNN called progressive relational graph neural network (PRGNN) utilizes relational rules to infer missing knowledge in relational digraphs and achieves notable results. However, during reasoning with PRGNN, two important properties are often overlooked: (1) the sequentiality of relation composition, where the order of combining different relations affects the semantics of the relational rules, and (2) the lagged entity information propagation, where the transmission speed of required information lags behind the appearance speed of new entities. Ignoring these properties leads to incorrect relational rule learning and decreased reasoning accuracy. To address these issues, we propose a novel knowledge graph reasoning approach, the Relational rUle eNhanced Graph Neural Network (RUN-GNN). Specifically, RUN-GNN employs a query related fusion gate unit to model the sequentiality of relation composition and utilizes a buffering update mechanism to alleviate the negative effect of lagged entity information propagation, resulting in higher-quality relational rule learning. Experimental results on multiple datasets demonstrate that RUN-GNN is superior on both transductive and inductive link prediction tasks.

Submitted 20 October, 2023; originally announced October 2023.

Comments: Accepted at Findings of EMNLP2023

arXiv:2310.06832 [pdf, other]

Flexible entangled state generation in linear optics

Authors: Brendan Pankovich, Alex Neville, Angus Kan, Srikrishna Omkar, Kwok Ho Wan, Kamil Brádler

Abstract: Fault-tolerant quantum computation can be achieved by creating constant-sized, entangled resource states and performing entangling measurements on subsets of their qubits. Linear optical quantum computers can be designed based on this approach, even though entangling operations at the qubit level are non-deterministic in this platform. Probabilistic generation and measurement of entangled states must be pushed beyond the required threshold by some combination of scheme optimisation, introduction of redundancy and auxiliary state assistance. We report progress in each of these areas. We explore multi-qubit fusion measurements on dual-rail photonic qubits and their role in measurement-based resource state generation, showing that it is possible to boost the success probability of photonic GHZ state analysers with single photon auxiliary states. By incorporating generators of basic entangled "seed" states, we provide a method that simplifies the process of designing and optimising generators of complex, encoded resource states by establishing links to ZX diagrams.

Submitted 10 October, 2023; originally announced October 2023.

Comments: Comments welcome

arXiv:2309.11902 [pdf, other]

A Switch Architecture for Time-Triggered Transmission with Best-Effort Delivery

Authors: Zonghui Li, Wenlin Zhu, Kang G. Shin, Hai Wan, Xiaoyu Song, Dong Yang, Bo Ai

Abstract: In Time-Triggered (TT) or time-sensitive networks, the transmission of a TT frame is required to be scheduled at a precise time instant for industrial distributed real-time control systems. Other (or best-effort (BE)) frames are forwarded in a BE manner. Under this scheduling strategy, the transmission of a TT frame must wait until its scheduled instant even if it could have been transmitted sooner. On the other hand, BE frames are transmitted whenever possible but may miss deadlines or may even be dropped due to congestion. As a result, TT transmission and BE delivery are incompatible with each other. To remedy this incompatibility, we propose a synergistic switch architecture (SWA) for TT transmission with BE delivery to dynamically improve the end-to-end (e2e) latency of TT frames by opportunistically exploiting BE delivery. Given a TT frame, the SWA generates and transmits a cloned copy with BE delivery. The first frame arriving at the receiver device is delivered with a configured jitter and the other copy is ignored. So, the SWA achieves shorter latency and controllable jitter, the best of both worlds. We have implemented SWA using FPGAs in industry-strength TT switches and used four test scenarios to demonstrate SWA's improvements of e2e latency and controllable jitter over the state-of-the-art TT transmission scheme.

Submitted 21 September, 2023; originally announced September 2023.

Comments: 14 pages

arXiv:2309.04891 [pdf, other]

How to Evaluate Semantic Communications for Images with ViTScore Metric?

Authors: Tingting Zhu, Bo Peng, Jifan Liang, Tingchen Han, Hai Wan, Jingqiao Fu, Junjie Chen

Abstract: Semantic communications (SC) have been expected to be a new paradigm shifting to catalyze the next generation communication, whose main concerns shift from accurate bit transmission to effective semantic information exchange in communications. However, the previous and widely-used metrics for images are not applicable to evaluate the image semantic similarity in SC. Classical metrics to measure the similarity between two images usually rely on the pixel level or the structural level, such as the PSNR and the MS-SSIM. Straightforwardly using some tailored metrics based on deep-learning methods in CV community, such as the LPIPS, is infeasible for SC. To tackle this, inspired by BERTScore in NLP community, we propose a novel metric for evaluating image semantic similarity, named Vision Transformer Score (ViTScore). We prove theoretically that ViTScore has 3 important properties, including symmetry, boundedness, and normalization, which make ViTScore convenient and intuitive for image measurement. To evaluate the performance of ViTScore, we compare ViTScore with 3 typical metrics (PSNR, MS-SSIM, and LPIPS) through 4 classes of experiments: (i) correlation with BERTScore through evaluation of image caption downstream CV task, (ii) evaluation in classical image communications, (iii) evaluation in image semantic communication systems, and (iv) evaluation in image semantic communication systems with semantic attack. Experimental results demonstrate that ViTScore is robust and efficient in evaluating the semantic similarity of images. Particularly, ViTScore outperforms the other 3 typical metrics in evaluating the image semantic changes by semantic attack, such as image inverse with Generative Adversarial Networks (GANs). This indicates that ViTScore is an effective performance metric when deployed in SC scenarios.

Submitted 20 April, 2024; v1 submitted 9 September, 2023; originally announced September 2023.

arXiv:2309.01194 [pdf, other]

A Survey on Service Route and Time Prediction in Instant Delivery: Taxonomy, Progress, and Prospects

Authors: Haomin Wen, Youfang Lin, Lixia Wu, Xiaowei Mao, Tianyue Cai, Yunfeng Hou, Shengnan Guo, Yuxuan Liang, Guangyin Jin, Yiji Zhao, Roger Zimmermann, Jieping Ye, Huaiyu Wan

Abstract: Instant delivery services, such as food delivery and package delivery, have achieved explosive growth in recent years by providing customers with daily-life convenience. An emerging research area within these services is service Route&Time Prediction (RTP), which aims to estimate the future service route as well as the arrival time of a given worker. As one of the most crucial tasks in those service platforms, RTP stands central to enhancing user satisfaction and trimming operational expenditures on these platforms. Despite a plethora of algorithms developed to date, there is no systematic, comprehensive survey to guide researchers in this domain. To fill this gap, our work presents the first comprehensive survey that methodically categorizes recent advances in service route and time prediction. We start by defining the RTP challenge and then delve into the metrics that are often employed. Following that, we scrutinize the existing RTP methodologies, presenting a novel taxonomy of them. We categorize these methods based on three criteria: (i) type of task, subdivided into only-route prediction, only-time prediction, and joint route&time prediction; (ii) model architecture, which encompasses sequence-based and graph-based models; and (iii) learning paradigm, including Supervised Learning (SL) and Deep Reinforcement Learning (DRL). Conclusively, we highlight the limitations of current research and suggest prospective avenues. We believe that the taxonomy, progress, and prospects introduced in this paper can significantly promote the development of this field.

Submitted 3 September, 2023; originally announced September 2023.

arXiv:2308.13760 [pdf, other]

How Can Context Help? Exploring Joint Retrieval of Passage and Personalized Context

Authors: Hui Wan, Hongkang Li, Songtao Lu, Xiaodong Cui, Marina Danilevsky

Abstract: The integration of external personalized context information into document-grounded conversational systems has significant potential business value, but has not been well-studied. Motivated by the concept of personalized context-aware document-grounded conversational systems, we introduce the task of context-aware passage retrieval. We also construct a dataset specifically curated for this purpose. We describe multiple baseline systems to address this task, and propose a novel approach, Personalized Context-Aware Search (PCAS), that effectively harnesses contextual information during passage retrieval. Experimental evaluations conducted on multiple popular dense retrieval systems demonstrate that our proposed approach not only outperforms the baselines in retrieving the most relevant passage but also excels at identifying the pertinent context among all the available contexts. We envision that our contributions will serve as a catalyst for inspiring future research endeavors in this promising direction.

Submitted 26 August, 2023; originally announced August 2023.

arXiv:2308.04192 [pdf, other]

doi 10.1103/PhysRevLett.133.050604

High photon-loss threshold quantum computing using GHZ-state measurements

Authors: Brendan Pankovich, Angus Kan, Kwok Ho Wan, Maike Ostmann, Alex Neville, Srikrishna Omkar, Adel Sohbi, Kamil Brádler

Abstract: We propose fault-tolerant architectures based on performing projective measurements in the Greenberger-Horne-Zeilinger (GHZ) basis on constant-sized, entangled resource states. We present linear-optical constructions of the architectures, where the GHZ-state measurements are encoded to suppress the errors induced by photon loss and the probabilistic nature of linear optics. Simulations of our constructions demonstrate high single-photon loss thresholds compared to the state-of-the-art linear-optical architecture realized with encoded two-qubit fusion measurements performed on constant-sized resource states. We believe this result shows a resource-efficient path to achieving photonic fault-tolerant quantum computing.

Submitted 8 August, 2023; originally announced August 2023.

Journal ref: Phys. Rev. Lett. 133, 050604 (2024)

arXiv:2307.16246 [pdf, other]

DRL4Route: A Deep Reinforcement Learning Framework for Pick-up and Delivery Route Prediction

Authors: Xiaowei Mao, Haomin Wen, Hengrui Zhang, Huaiyu Wan, Lixia Wu, Jianbin Zheng, Haoyuan Hu, Youfang Lin

Abstract: Pick-up and Delivery Route Prediction (PDRP), which aims to estimate the future service route of a worker given his current task pool, has received rising attention in recent years. Deep neural networks based on supervised learning have emerged as the dominant model for the task because of their powerful ability to capture workers' behavior patterns from massive historical data. Though promising, they fail to introduce the non-differentiable test criteria into the training process, leading to a mismatch in training and test criteria, which considerably reduces their performance when applied in practical systems. To tackle the above issue, we present the first attempt to generalize Reinforcement Learning (RL) to the route prediction task, leading to a novel RL-based framework called DRL4Route. It combines the behavior-learning abilities of previous deep learning models with the non-differentiable objective optimization ability of reinforcement learning. DRL4Route can serve as a plug-and-play component to boost the existing deep learning models. Based on the framework, we further implement a model named DRL4Route-GAE for PDRP in logistic service. It follows the actor-critic architecture which is equipped with a Generalized Advantage Estimator that can balance the bias and variance of the policy gradient estimates, thus achieving a more optimal policy. Extensive offline experiments and the online deployment show that DRL4Route-GAE improves Location Square Deviation (LSD) by 0.9%-2.7%, and Accuracy@3 (ACC@3) by 2.4%-3.2% over existing methods on the real-world dataset.

Submitted 30 July, 2023; originally announced July 2023.

Comments: Accepted by KDD23

arXiv:2307.03048 [pdf, other]

Origin-Destination Travel Time Oracle for Map-based Services

Authors: Yan Lin, Huaiyu Wan, Jilin Hu, Shengnan Guo, Bin Yang, Youfang Lin, Christian S. Jensen

Abstract: Given an origin (O), a destination (D), and a departure time (T), an Origin-Destination (OD) travel time oracle (ODT-Oracle) returns an estimate of the time it takes to travel from O to D when departing at T. ODT-Oracles serve important purposes in map-based services. To enable the construction of such oracles, we provide a travel-time estimation (TTE) solution that leverages historical trajectories to estimate time-varying travel times for OD pairs. The problem is complicated by the fact that multiple historical trajectories with different travel times may connect an OD pair, while trajectories may vary from one another. To solve the problem, it is crucial to remove outlier trajectories when doing travel time estimation for future queries. We propose a novel, two-stage framework called Diffusion-based Origin-destination Travel Time Estimation (DOT), that solves the problem. First, DOT employs a conditioned Pixelated Trajectories (PiT) denoiser that enables building a diffusion-based PiT inference process by learning correlations between OD pairs and historical trajectories. Specifically, given an OD pair and a departure time, we aim to infer a PiT. Next, DOT encompasses a Masked Vision Transformer (MViT) that effectively and efficiently estimates a travel time based on the inferred PiT. We report on extensive experiments on two real-world datasets that offer evidence that DOT is capable of outperforming baseline methods in terms of accuracy, scalability, and explainability.

Submitted 6 July, 2023; originally announced July 2023.

Comments: 15 pages, 12 figures, accepted by SIGMOD International Conference on Management of Data 2024

arXiv:2307.02606 [pdf]

Flowering of Developable 2D Crystal Shapes in Closed, Fluid Membranes

Authors: Hao Wan, Geunwoong Jeon, Weiyue Xin, Gregory M. Grason, Maria M. Santore

Abstract: The morphologies of two-dimensional (2D) crystals, nucleated, grown, and integrated within 2D elastic fluids, for instance in giant vesicle membranes, are dictated by an interplay of mechanics, permeability, and thermal contraction. Mitigation of solid strain drives formation of crystals with developable shapes (e.g. planar or cylindrical) that expel Gaussian curvature into the 2D fluid. However, upon cooling to grow the crystals, large vesicles sustain greater inflation and tension because their small area to volume ratio slows water permeation. As a result, more elaborate shapes, for instance flowers with bendable but inextensible petals form on large vesicles despite their more gradual curvature, while small vesicles harbor compact planar crystals. This size dependence runs counter to the known cumulative growth of strain energy of 2D colloidal crystals on rigid spherical templates. This interplay of intra-membrane mechanics and processing points to the scalable production of flexible molecular crystals of controllable complex shape.

Submitted 5 July, 2023; originally announced July 2023.

Comments: Main text 27 pages and 6 figures; SI 25 pages and 14 figures

arXiv:2306.10675 [pdf, other]

LaDe: The First Comprehensive Last-mile Delivery Dataset from Industry

Authors: Lixia Wu, Haomin Wen, Haoyuan Hu, Xiaowei Mao, Yutong Xia, Ergang Shan, Jianbin Zhen, Junhong Lou, Yuxuan Liang, Liuqing Yang, Roger Zimmermann, Youfang Lin, Huaiyu Wan

Abstract: Real-world last-mile delivery datasets are crucial for research in logistics, supply chain management, and spatio-temporal data mining. Despite a plethora of algorithms developed to date, no widely accepted, publicly available last-mile delivery dataset exists to support research in this field. In this paper, we introduce LaDe, the first publicly available last-mile delivery dataset with millions of packages from the industry. LaDe has three unique characteristics: (1) Large-scale. It involves 10,677k packages of 21k couriers over 6 months of real-world operation. (2) Comprehensive information. It offers original package information, such as its location and time requirements, as well as task-event information, which records when and where the courier is while events such as task-accept and task-finish events happen. (3) Diversity. The dataset includes data from various scenarios, including package pick-up and delivery, and from multiple cities, each with its unique spatio-temporal patterns due to their distinct characteristics such as populations. We verify LaDe on three tasks by running several classical baseline models per task. We believe that the large-scale, comprehensive, diverse feature of LaDe can offer unparalleled opportunities to researchers in the supply chain community, data mining community, and beyond. The dataset homepage is publicly available at https://huggingface.co/datasets/Cainiao-AI/LaDe.

Submitted 2 January, 2024; v1 submitted 18 June, 2023; originally announced June 2023.

arXiv:2306.05377 [pdf, other]

Numerical coupling of aerosol emissions, dry removal, and turbulent mixing in the E3SM Atmosphere Model version 1 (EAMv1), part I: dust budget analyses and the impacts of a revised coupling scheme

Authors: Hui Wan, Kai Zhang, Christopher J. Vogl, Carol S. Woodward, Richard C. Easter, Philip J. Rasch, Yan Feng, Hailong Wang

Abstract: An earlier study evaluating the dust life cycle in the Energy Exascale Earth System Model (E3SM) Atmosphere Model version 1 (EAMv1) has revealed that the simulated global mean dust lifetime is substantially shorter when higher vertical resolution is used, primarily due to significant strengthening of dust dry removal in source regions. This paper demonstrates that the sequential splitting of aerosol emissions, dry removal, and turbulent mixing in the model's time integration loop, especially the calculation of dry removal after surface emissions and before turbulent mixing, is the primary reason for the vertical resolution sensitivity reported in that earlier study. Based on this reasoning, we propose a simple revision to the numerical process coupling scheme, which moves the application of the surface emissions to after dry removal and before turbulent mixing. The revised scheme allows newly emitted particles to be transported aloft by turbulence before being removed from the atmosphere, and hence better resembles the dust life cycle in the real world. Sensitivity experiments are conducted and analyzed to evaluate the impact of the revised coupling on the simulated aerosol climatology in EAMv1.

Submitted 17 June, 2023; v1 submitted 8 June, 2023; originally announced June 2023.

Showing 1–50 of 172 results for author: Wan, H