doi 10.5194/gmd-17-1409-2024

Numerical coupling of aerosol emissions, dry removal, and turbulent mixing in the E3SM Atmosphere Model version 1 (EAMv1), part II: a semi-discrete error analysis framework for assessing coupling schemes

Authors: Christopher J. Vogl, Hui Wan, Carol S. Woodward, Quan M. Bui

Abstract: This paper complements the empirical justification of the revised scheme in Part I of this work with a mathematical justification leveraging a semi-discrete analysis framework for assessing the splitting error of process coupling methods. The novelty of the framework is that splitting error is distinguished from the process time integration errors, i.e., the errors caused by discrete time integration of individual processes, leading to expressions that are more easily interpreted utilizing existing physical understanding of the processes that the terms represent. The application of this framework to the dust life cycle in EAMv1 showcases such an interpretation, using the leading-order splitting error that results from the framework to confirm (i) that the original EAMv1 scheme artificially strengthens the effect of dry removal processes, and (ii) that the revised splitting reduces that artificial strengthening. While the error analysis framework is presented in the context of the dust life cycle in EAMv1, the framework can be broadly leveraged to evaluate process coupling schemes, both in other physical problems and for any number of processes. This framework will be particularly powerful when the various process implementations support a variety of time integration approaches. Whereas traditional local truncation error approaches require separate consideration of each combination of time integration methods, this framework enables evaluation of coupling schemes independent of particular time integration approaches for each process while still allowing for the incorporation of these specific time integration errors if so desired. The framework also explains how the splitting error terms result from (i) the integration of individual processes in isolation from other processes, and (ii) the choices of input state and timestep size for the isolated integration of processes.

Submitted 20 February, 2024; v1 submitted 8 June, 2023; originally announced June 2023.

Journal ref: Geosci. Model Dev. 17, 3, 1409-1428 (2024)

arXiv:2306.02780 [pdf, other]

Stochastic p-Bits Based on Spin-Orbit Torque Magnetic Tunnel Junctions

Authors: X. H. Li, M. K. Zhao, R. Zhang, C. H. Wan, Y. Z. Wang, X. M. Luo, S. Q. Liu, J. H. Xia, G. Q. Yu, X. F. Han

Abstract: Stochastic p-Bit devices play a pivotal role in solving NP-hard problems, neural network computing, and hardware accelerators for algorithms such as simulated annealing. In this work, we focus on stochastic p-Bits based on high-barrier magnetic tunnel junctions (HB-MTJs) with identical stack structure and cell geometry, but employing different spin-orbit torque (SOT) switching schemes. We conducted a comparative study of their switching probability as a function of pulse amplitude and width of the applied voltage. Through experimental and theoretical investigations, we have observed that the Y-type SOT-MTJs exhibit the gentlest dependence of the switching probability on the external voltage. This characteristic indicates superior tunability in randomness and enhanced robustness against external disturbances when Y-type SOT-MTJs are employed as stochastic p-Bits. Furthermore, the random numbers generated by these Y-type SOT-MTJs, following XOR pretreatment, have successfully passed the National Institute of Standards and Technology (NIST) SP800-22 test. This comprehensive study demonstrates the high performance and immense potential of Y-type SOT-MTJs for the implementation of stochastic p-Bits.

Submitted 5 June, 2023; originally announced June 2023.

arXiv:2305.04750 [pdf, other]

Sense, Imagine, Act: Multimodal Perception Improves Model-Based Reinforcement Learning for Head-to-Head Autonomous Racing

Authors: Elena Shrestha, Chetan Reddy, Hanxi Wan, Yulun Zhuang, Ram Vasudevan

Abstract: Model-based reinforcement learning (MBRL) techniques have recently yielded promising results for real-world autonomous racing using high-dimensional observations. MBRL agents, such as Dreamer, solve long-horizon tasks by building a world model and planning actions by latent imagination. This approach involves explicitly learning a model of the system dynamics and using it to learn the optimal policy for continuous control over multiple timesteps. As a result, MBRL agents may converge to sub-optimal policies if the world model is inaccurate. To improve state estimation for autonomous racing, this paper proposes a self-supervised sensor fusion technique that combines egocentric LiDAR and RGB camera observations collected from the F1TENTH Gym. The zero-shot performance of MBRL agents is empirically evaluated on unseen tracks and against a dynamic obstacle. This paper illustrates that multimodal perception improves robustness of the world model without requiring additional training data. The resulting multimodal Dreamer agent safely avoided collisions and won the most races compared to other tested baselines in zero-shot head-to-head autonomous racing.

Submitted 8 May, 2023; originally announced May 2023.

arXiv:2304.12090 [pdf, other]

Reinforcement Learning with Knowledge Representation and Reasoning: A Brief Survey

Authors: Chao Yu, Xuejing Zheng, Hankz Hankui Zhuo, Hai Wan, Weilin Luo

Abstract: Reinforcement Learning (RL) has achieved tremendous development in recent years, but still faces significant obstacles in addressing complex real-life problems due to the issues of poor system generalization, low sample efficiency, as well as safety and interpretability concerns. The core reason underlying such dilemmas can be attributed to the fact that most of the work has focused on the computational aspect of value functions or policies using a representational model to describe atomic components of rewards, states, actions, etc., thus neglecting the rich high-level declarative domain knowledge of facts, relations and rules that can be either provided a priori or acquired through reasoning over time. Recently, there has been a rapidly growing interest in the use of Knowledge Representation and Reasoning (KRR) methods, usually using logical languages, to enable more abstract representation and efficient learning in RL. In this survey, we provide a preliminary overview of these endeavors that leverage the strengths of KRR to help solve various problems in RL, and discuss the challenging open problems and possible directions for future work in this area.

Submitted 24 April, 2023; originally announced April 2023.

arXiv:2303.03677 [pdf, other]

Training Machine Learning Models to Characterize Temporal Evolution of Disadvantaged Communities

Authors: Milan Jain, Narmadha Meenu Mohankumar, Heng Wan, Sumitrra Ganguly, Kyle D Wilson, David M Anderson

Abstract: The disadvantaged communities (DAC) designation, as defined by the Justice40 initiative of the Department of Energy (DOE), USA, identifies census tracts across the USA to determine where benefits of climate and energy investments are or are not currently accruing. The DAC status not only helps in determining the eligibility for future Justice40-related investments but is also critical for exploring ways to achieve equitable distribution of resources. However, designing inclusive and equitable strategies requires not just a good understanding of current demographics, but also a deeper analysis of the transformations that happened in those demographics over the years. In this paper, machine learning (ML) models are trained on publicly available census data from recent years to classify the DAC status at the census-tract level, and then the trained model is used to classify DAC status for historical years. A detailed analysis of the feature and model selection along with the evolution of disadvantaged communities between 2013 and 2018 is presented in this study.

Submitted 7 March, 2023; originally announced March 2023.

arXiv:2302.00381 [pdf, other]

BotPercent: Estimating Bot Populations in Twitter Communities

Authors: Zhaoxuan Tan, Shangbin Feng, Melanie Sclar, Herun Wan, Minnan Luo, Yejin Choi, Yulia Tsvetkov

Abstract: Twitter bot detection is vital in combating misinformation and safeguarding the integrity of social media discourse. While malicious bots are becoming more and more sophisticated and personalized, standard bot detection approaches are still agnostic to the social environments (henceforth, communities) in which the bots operate. In this work, we introduce community-specific bot detection, estimating the percentage of bots given the context of a community. Our method -- BotPercent -- is an amalgamation of Twitter bot detection datasets and feature-, text-, and graph-based models, adjusted to a particular community on Twitter. We introduce an approach that performs confidence calibration across bot detection models, which addresses generalization issues in existing community-agnostic models targeting individual bots and leads to more accurate community-level bot estimations. Experiments demonstrate that BotPercent achieves state-of-the-art performance in community-level Twitter bot detection across both balanced and imbalanced class distribution settings, outperforming existing approaches and presenting a less biased estimator of Twitter bot populations within the communities we analyze. We then analyze bot rates in several Twitter groups, including users who engage with partisan news media, political communities in different countries, and more. Our results reveal that the presence of Twitter bots is not homogeneous, but exhibits a spatial-temporal distribution with considerable heterogeneity that should be taken into account for content moderation and social media policy making. The implementation of BotPercent is available at https://github.com/TamSiuhin/BotPercent.

Submitted 18 October, 2023; v1 submitted 1 February, 2023; originally announced February 2023.

Comments: Accepted to findings of EMNLP 2023

arXiv:2301.13629 [pdf, other]

DiffSTG: Probabilistic Spatio-Temporal Graph Forecasting with Denoising Diffusion Models

Authors: Haomin Wen, Youfang Lin, Yutong Xia, Huaiyu Wan, Qingsong Wen, Roger Zimmermann, Yuxuan Liang

Abstract: Spatio-temporal graph neural networks (STGNN) have emerged as the dominant model for spatio-temporal graph (STG) forecasting. Despite their success, they fail to model intrinsic uncertainties within STG data, which cripples their practicality in downstream tasks for decision-making. To this end, this paper focuses on probabilistic STG forecasting, which is challenging due to the difficulty in modeling uncertainties and complex ST dependencies. In this study, we present the first attempt to generalize the popular denoising diffusion probabilistic models to STGs, leading to a novel non-autoregressive framework called DiffSTG, along with the first denoising network UGnet for STG in the framework. Our approach combines the spatio-temporal learning capabilities of STGNNs with the uncertainty measurements of diffusion models. Extensive experiments validate that DiffSTG reduces the Continuous Ranked Probability Score (CRPS) by 4%-14%, and Root Mean Squared Error (RMSE) by 2%-7% over existing methods on three real-world datasets.

Submitted 9 March, 2024; v1 submitted 31 January, 2023; originally announced January 2023.

Comments: Accepted to the 31st ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems

arXiv:2301.03128 [pdf, ps, other]

Compress-and-Forward via Multilevel Coding and Trellis Coded Quantization

Authors: Heping Wan, Anders Host-Madsen, Aria Nosratinia

Abstract: Compress-forward (CF) relays can improve communication rates even when the relay cannot decode the source signal. Efficient implementation of CF is a topic of contemporary interest, in part because of its potential impact on wireless technologies such as cloud-RAN. There exists a gap between the performance of CF implementations in the high spectral efficiency regime and the corresponding information theoretic achievable rates. We begin by re-framing a dilemma causing this gap, and propose an approach for its mitigation. We utilize trellis coded quantization (TCQ) at the relay together with multi-level coding at the source and relay, in a manner that facilitates the calculation of bit LLRs at the destination for joint decoding. The contributions of this work include designing TCQ for end-to-end relay performance, since a distortion-minimizing TCQ is suboptimum. The reported improvements include a 1dB gain over prior results for PSK modulation.

Submitted 8 January, 2023; originally announced January 2023.

arXiv:2301.01015 [pdf, other]

Semi-Structured Object Sequence Encoders

Authors: Rudra Murthy V, Riyaz Bhat, Chulaka Gunasekara, Siva Sankalp Patel, Hui Wan, Tejas Indulal Dhamecha, Danish Contractor, Marina Danilevsky

Abstract: In this paper we explore the task of modeling semi-structured object sequences; in particular, we focus our attention on the problem of developing a structure-aware input representation for such sequences. Examples of such data include user activity on websites, machine logs, and many others. This type of data is often represented as a sequence of sets of key-value pairs over time and can present modeling challenges due to an ever-increasing sequence length. We propose a two-part approach, which first considers each key independently and encodes a representation of its values over time; we then self-attend over these value-aware key representations to accomplish a downstream task. This allows us to operate on longer object sequences than existing methods. We introduce a novel shared-attention-head architecture between the two modules and present an innovative training schedule that interleaves the training of both modules with shared weights for some attention heads. Our experiments on multiple prediction tasks using real-world data demonstrate that our approach outperforms a unified network with hierarchical encoding, as well as other methods including a record-centric representation and a flattened representation of the sequence.

Submitted 22 May, 2023; v1 submitted 3 January, 2023; originally announced January 2023.

arXiv:2212.00373 [pdf, other]

A Noise-tolerant Differentiable Learning Approach for Single Occurrence Regular Expression with Interleaving

Authors: Rongzhen Ye, Tianqu Zhuang, Hai Wan, Jianfeng Du, Weilin Luo, Pingjia Liang

Abstract: We study the problem of learning a single occurrence regular expression with interleaving (SOIRE) from a set of text strings possibly with noise. SOIRE fully supports interleaving and covers a large portion of regular expressions used in practice. Learning SOIREs is challenging because it requires heavy computation and text strings usually contain noise in practice. Most of the previous studies only learn restricted SOIREs and are not robust to noisy data. To tackle these issues, we propose a noise-tolerant differentiable learning approach, SOIREDL, for SOIRE. We design a neural network to simulate SOIRE matching and theoretically prove that certain assignments of the set of parameters learnt by the neural network, called faithful encodings, are in one-to-one correspondence with SOIREs of bounded size. Based on this correspondence, we interpret the target SOIRE from an assignment of the set of parameters of the neural network by exploring the nearest faithful encodings. Experimental results show that SOIREDL outperforms the state-of-the-art approaches, especially on noisy data.

Submitted 11 January, 2023; v1 submitted 1 December, 2022; originally announced December 2022.

arXiv:2211.15666 [pdf, other]

Learning Visual Planning Models from Partially Observed Images

Authors: Kebing Jin, Zhanhao Xiao, Hankui Hankz Zhuo, Hai Wan, Jiaran Cai

Abstract: There has been increasing attention on planning model learning in classical planning. Most existing approaches, however, focus on learning planning models from structured data in symbolic representations. It is often difficult to obtain such structured data in real-world scenarios. Although a number of approaches have been developed for learning planning models from fully observed unstructured data (e.g., images), in many scenarios raw observations are often incomplete. In this paper, we provide a novel framework, Recplan, for learning a transition model from partially observed raw image traces. More specifically, by considering the preceding and subsequent images in a trace, we learn the latent state representations of raw observations and then build a transition model based on such representations. Additionally, we propose a neural-network-based approach to learn a heuristic model that estimates the distance toward a given goal observation. Based on the learned transition model and heuristic model, we implement a classical planner for images. We show empirically that our approach is more effective than a state-of-the-art approach for learning visual planning models in environments with incomplete observations.

Submitted 25 November, 2022; originally announced November 2022.

Comments: 25 pages, 5 figures

arXiv:2210.12415 [pdf, other]

ALT: Boosting Deep Learning Performance by Breaking the Wall between Graph and Operator Level Optimizations

Authors: Zhiying Xu, Jiafan Xu, Hongding Peng, Wei Wang, Xiaoliang Wang, Haoran Wan, Haipeng Dai, Yixu Xu, Hao Cheng, Kun Wang, Guihai Chen

Abstract: Deep learning models rely on highly optimized tensor libraries for efficient inference on heterogeneous hardware. Current deep compilers typically predetermine layouts of tensors and then optimize loops of operators. However, such a unidirectional and one-off workflow strictly separates graph-level optimization and operator-level optimization into different system layers, missing opportunities for unified tuning. This paper proposes ALT, a compiler that performs joint graph- and operator-level optimizations for deep models. ALT provides a generic transformation module to manipulate layouts and loops with easy-to-use primitive functions. ALT further integrates an auto-tuning module that jointly optimizes graph-level data layouts and operator-level loops while guaranteeing efficiency. Experimental results show that ALT significantly outperforms state-of-the-art compilers (e.g., Ansor) in terms of both single operator performance (e.g., 1.5x speedup on average) and end-to-end inference performance (e.g., 1.4x speedup on average).

Submitted 29 October, 2022; v1 submitted 22 October, 2022; originally announced October 2022.

arXiv:2210.04174 [pdf, other]

Grow and Merge: A Unified Framework for Continuous Categories Discovery

Authors: Xinwei Zhang, Jianwen Jiang, Yutong Feng, Zhi-Fan Wu, Xibin Zhao, Hai Wan, Mingqian Tang, Rong Jin, Yue Gao

Abstract: Although a number of studies are devoted to novel category discovery, most of them assume a static setting where both labeled and unlabeled data are given at once for finding new categories. In this work, we focus on the application scenarios where unlabeled data are continuously fed into the category discovery system. We refer to it as the Continuous Category Discovery (CCD) problem, which is significantly more challenging than the static setting. A common challenge faced by novel category discovery is that different sets of features are needed for classification and category discovery: class discriminative features are preferred for classification, while rich and diverse features are more suitable for new category mining. This challenge becomes more severe in the dynamic setting, as the system is asked to deliver good performance for known classes over time, and at the same time continuously discover new classes from unlabeled data. To address this challenge, we develop a framework of Grow and Merge (GM) that works by alternating between a growing phase and a merging phase: in the growing phase, it increases the diversity of features through continuous self-supervised learning for effective category mining, and in the merging phase, it merges the grown model with a static one to ensure satisfying performance for known classes. Our extensive studies verify that the proposed GM framework is significantly more effective than the state-of-the-art approaches for continuous category discovery.

Submitted 9 October, 2022; originally announced October 2022.

Comments: This paper has already been accepted by 36th Conference on Neural Information Processing Systems (NeurIPS 2022)

arXiv:2209.15166 [pdf, other]

Reward Shaping for User Satisfaction in a REINFORCE Recommender

Authors: Konstantina Christakopoulou, Can Xu, Sai Zhang, Sriraj Badam, Trevor Potter, Daniel Li, Hao Wan, Xinyang Yi, Ya Le, Chris Berg, Eric Bencomo Dixon, Ed H. Chi, Minmin Chen

Abstract: How might we design Reinforcement Learning (RL)-based recommenders that encourage aligning user trajectories with the underlying user satisfaction? Three research questions are key: (1) measuring user satisfaction, (2) combatting sparsity of satisfaction signals, and (3) adapting the training of the recommender agent to maximize satisfaction. For measurement, it has been found that surveys explicitly asking users to rate their experience with consumed items can provide valuable orthogonal information to the engagement/interaction data, acting as a proxy to the underlying user satisfaction. For sparsity, i.e., only being able to observe how satisfied users are with a tiny fraction of user-item interactions, imputation models can be useful in predicting satisfaction level for all items users have consumed. For learning satisfying recommender policies, we postulate that reward shaping in RL recommender agents is powerful for driving satisfying user experiences. Putting everything together, we propose to jointly learn a policy network and a satisfaction imputation network: the role of the imputation network is to learn which actions are satisfying to the user, while the policy network, built on top of REINFORCE, decides which items to recommend, with the reward utilizing the imputed satisfaction. We use both offline analysis and live experiments in an industrial large-scale recommendation platform to demonstrate the promise of our approach for satisfying user experiences.

Submitted 29 September, 2022; originally announced September 2022.

Comments: Accepted in Reinforcement Learning for Real Life (RL4RealLife) Workshop in the 38th International Conference on Machine Learning, 2021

arXiv:2209.11853 [pdf]

Multiplexed control of spin quantum memories in a photonic circuit

Authors: D. Andrew Golter, Genevieve Clark, Tareq El Dandachi, Stefan Krastanov, Andrew J. Leenheer, Noel H. Wan, Hamza Raniwala, Matthew Zimmermann, Mark Dong, Kevin C. Chen, Linsen Li, Matt Eichenfield, Gerald Gilbert, Dirk Englund

Abstract: A central goal in many quantum information processing applications is a network of quantum memories that can be entangled with each other while being individually controlled and measured with high fidelity. This goal has motivated the development of programmable photonic integrated circuits (PICs) with integrated spin quantum memories using diamond color center spin-photon interfaces. However, this approach introduces a challenge in the microwave control of individual spins within closely packed registers. Here, we present a quantum-memory-integrated photonics platform capable of (i) the integration of multiple diamond color center spins into a cryogenically compatible, high-speed programmable PIC platform; (ii) selective manipulation of individual spin qubits addressed via tunable magnetic field gradients; and (iii) simultaneous control of multiple qubits using numerically optimized microwave pulse shaping. The combination of localized optical control, enabled by the PIC platform, together with selective spin manipulation opens the path to scalable quantum networks on intra-chip and inter-chip platforms.

Submitted 21 April, 2023; v1 submitted 23 September, 2022; originally announced September 2022.

Comments: 10 pages, 4 figures

arXiv:2208.09027 [pdf, other]

doi 10.1145/3511808.3557337

GraTO: Graph Neural Network Framework Tackling Over-smoothing with Neural Architecture Search

Authors: Xinshun Feng, Herun Wan, Shangbin Feng, Hongrui Wang, Jun Zhou, Qinghua Zheng, Minnan Luo

Abstract: Current Graph Neural Networks (GNNs) suffer from the over-smoothing problem, which results in indistinguishable node representations and low model performance with more GNN layers. Many methods have been put forward to tackle this problem in recent years. However, existing methods for tackling over-smoothing emphasize model performance and neglect the over-smoothness of node representations. Additionally, different approaches are applied one at a time, and an overall framework to jointly leverage multiple solutions to the over-smoothing challenge is lacking. To solve these problems, we propose GraTO, a framework based on neural architecture search to automatically search for GNN architectures. GraTO adopts a novel loss function to facilitate striking a balance between model performance and representation smoothness. In addition to existing methods, our search space also includes DropAttribute, a novel scheme for alleviating the over-smoothing challenge, to fully leverage diverse solutions. We conduct extensive experiments on six real-world datasets to evaluate GraTO, which demonstrate that GraTO outperforms baselines in the over-smoothing metrics and achieves competitive performance in accuracy. GraTO is especially effective and robust with increasing numbers of GNN layers. Further experiments bear out the quality of node representations learned with GraTO and the effectiveness of the model architecture. We make the code of GraTO available on GitHub (https://github.com/fxsxjtu/GraTO).

Submitted 22 October, 2022; v1 submitted 18 August, 2022; originally announced August 2022.

Comments: accepted at CIKM2022

arXiv:2208.08320 [pdf, other]

BIC: Twitter Bot Detection with Text-Graph Interaction and Semantic Consistency

Authors: Zhenyu Lei, Herun Wan, Wenqian Zhang, Shangbin Feng, Zilong Chen, Jundong Li, Qinghua Zheng, Minnan Luo

Abstract: Twitter bots are automatic programs operated by malicious actors to manipulate public opinion and spread misinformation. Research efforts have been made to automatically identify bots based on texts and networks on social media. Existing methods only leverage texts or networks alone, and while few works explored the shallow combination of the two modalities, we hypothesize that the interaction and information exchange between texts and graphs could be crucial for holistically evaluating bot activities on social media. In addition, according to a recent survey (Cresci, 2020), Twitter bots are constantly evolving while advanced bots steal genuine users' tweets and dilute their malicious content to evade detection. This results in greater inconsistency across the timeline of novel Twitter bots, which warrants more attention. In light of these challenges, we propose BIC, a Twitter Bot detection framework with text-graph Interaction and semantic Consistency. Specifically, in addition to separately modeling the two modalities on social media, BIC employs a text-graph interaction module to enable information exchange across modalities in the learning process. In addition, given the stealing behavior of novel Twitter bots, BIC proposes to model semantic consistency in tweets based on attention weights while using it to augment the decision process. Extensive experiments demonstrate that BIC consistently outperforms state-of-the-art baselines on two widely adopted datasets. Further analyses reveal that text-graph interactions and modeling semantic consistency are essential improvements and help combat bot evolution.

Submitted 17 February, 2023; v1 submitted 17 August, 2022; originally announced August 2022.

arXiv:2207.14539 [pdf, other]

Pre-training General Trajectory Embeddings with Maximum Multi-view Entropy Coding

Authors: Yan Lin, Huaiyu Wan, Shengnan Guo, Jilin Hu, Christian S. Jensen, Youfang Lin

Abstract: Spatio-temporal trajectories provide valuable information about movement and travel behavior, enabling various downstream tasks that in turn power real-world applications. Learning trajectory embeddings can improve task performance but may incur high computational costs and face limited training data availability. Pre-training learns generic embeddings by means of specially constructed pretext tasks that enable learning from unlabeled data. Existing pre-training methods face (i) difficulties in learning general embeddings due to biases towards certain downstream tasks incurred by the pretext tasks, (ii) limitations in capturing both travel semantics and spatio-temporal correlations, and (iii) the complexity of long, irregularly sampled trajectories. To tackle these challenges, we propose Maximum Multi-view Trajectory Entropy Coding (MMTEC) for learning general and comprehensive trajectory embeddings. We introduce a pretext task that reduces biases in pre-trained trajectory embeddings, yielding embeddings that are useful for a wide variety of downstream tasks. We also propose an attention-based discrete encoder and a NeuralCDE-based continuous encoder that extract and represent travel behavior and continuous spatio-temporal correlations from trajectories in embeddings, respectively. Extensive experiments on two real-world datasets and three downstream tasks offer insight into the design properties of our proposal and indicate that it is capable of outperforming existing trajectory embedding methods.

Submitted 25 December, 2023; v1 submitted 29 July, 2022; originally announced July 2022.

Comments: 15 pages, 7 figures, accepted by IEEE Trans. on Knowledge and Data Engineering

arXiv:2207.07722 [pdf, other]

The Distribution of Error Terms of Smoothed Summatory Totient Functions

Authors: Sanjana Das, Hannah Lang, Hamilton Wan, Nancy Xu

Abstract: We consider the summatory function of the totient function after applications of a suitable smoothing operator and study the limiting behavior of the associated error term. Under several conditional assumptions, we show that the smoothed error term possesses a limiting logarithmic distribution through a framework consolidated by Akbary--Ng--Shahabi. To obtain this result, we prove a truncated version of Perron's inversion formula for arbitrary Riesz typical means. We conclude with a conditional proof that at least two applications of the smoothing operator are necessary and sufficient to bound the growth of the error term by $\sqrt{x}$.

Submitted 15 July, 2022; originally announced July 2022.

Comments: 15 pages, 2 figures

MSC Class: 11N64; 11M26; 11N56

arXiv:2207.06486 [pdf, other]

Distributions of Hook Lengths Divisible by Two or Three

Authors: Hannah Lang, Hamilton Wan, Nancy Xu

Abstract: For fixed $t = 2$ or $3$, we investigate the statistical properties of $\{Y_t(n)\}$, the sequence of random variables corresponding to the number of hook lengths divisible by $t$ among the partitions of $n$. We characterize the support of $Y_t(n)$ and show, in accordance with empirical observations, that the support is vanishingly small for large $n$. Moreover, we demonstrate that the nonzero values of the mass functions of $Y_2(n)$ and $Y_3(n)$ approximate continuous functions. Finally, we prove that although the mass functions fail to converge, the cumulative distribution functions of $\{Y_2(n)\}$ and $\{Y_3(n)\}$ converge pointwise to shifted Gamma distributions, completing a characterization initiated by Griffin--Ono--Tsai for $t \geq 4$.

Submitted 11 November, 2022; v1 submitted 13 July, 2022; originally announced July 2022.

Comments: 20 pages, 11 figures. v2: Incorporates referee comments

MSC Class: 11P82; 05A17

arXiv:2207.00047 [pdf, ps, other]

The Distribution of $k$-Free Effective Divisors and the Summatory Totient Function in Function Fields

Authors: Sanjana Das, Hannah Lang, Hamilton Wan, Nancy Xu

Abstract: Motivated by the study of the summatory $k$-free indicator and totient functions in the classical setting, we investigate their function field analogues. First, we derive an expression for the error terms of the summatory functions in terms of the zeros of the associated zeta function. Under the Linear Independence hypothesis, we explicitly construct the limiting distributions of these error terms and compute the frequency with which they occur in an interval $[-\beta, \beta]$ for a real $\beta > 0$. We also show that these error terms are unbiased, that is, they are positive and negative equally often. Finally, we examine the average behavior of these error terms across families of hyperelliptic curves of fixed genus. We obtain these results by following a general framework initiated by Cha and Humphries.

Submitted 25 July, 2023; v1 submitted 30 June, 2022; originally announced July 2022.

Comments: 37 pages. v3: incorporates referee comments

arXiv:2206.04564 [pdf, other]

TwiBot-22: Towards Graph-Based Twitter Bot Detection

Authors: Shangbin Feng, Zhaoxuan Tan, Herun Wan, Ningnan Wang, Zilong Chen, Binchi Zhang, Qinghua Zheng, Wenqian Zhang, Zhenyu Lei, Shujie Yang, Xinshun Feng, Qingyue Zhang, Hongrui Wang, Yuhan Liu, Yuyang Bai, Heng Wang, Zijian Cai, Yanbo Wang, Lijing Zheng, Zihan Ma, Jundong Li, Minnan Luo

Abstract: Twitter bot detection has become an increasingly important task to combat misinformation, facilitate social media moderation, and preserve the integrity of the online discourse. State-of-the-art bot detection methods generally leverage the graph structure of the Twitter network, and they exhibit promising performance when confronting novel Twitter bots that traditional methods fail to detect. However, very few of the existing Twitter bot detection datasets are graph-based, and even these few graph-based datasets suffer from limited dataset scale, incomplete graph structure, as well as low annotation quality. In fact, the lack of a large-scale graph-based Twitter bot detection benchmark that addresses these issues has seriously hindered the development and evaluation of novel graph-based bot detection approaches. In this paper, we propose TwiBot-22, a comprehensive graph-based Twitter bot detection benchmark that presents the largest dataset to date, provides diversified entities and relations on the Twitter network, and has considerably better annotation quality than existing datasets. In addition, we re-implement 35 representative Twitter bot detection baselines and evaluate them on 9 datasets, including TwiBot-22, to promote a fair comparison of model performance and a holistic understanding of research progress. To facilitate further research, we consolidate all implemented codes and datasets into the TwiBot-22 evaluation framework, where researchers could consistently evaluate new models and datasets. The TwiBot-22 Twitter bot detection benchmark and evaluation framework are publicly available at https://twibot22.github.io/

Submitted 12 February, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

Comments: NeurIPS 2022, Datasets and Benchmarks Track

arXiv:2205.14748 [pdf, other]

Learning as Conversation: Dialogue Systems Reinforced for Information Acquisition

Authors: Pengshan Cai, Hui Wan, Fei Liu, Mo Yu, Hong Yu, Sachindra Joshi

Abstract: We propose novel AI-empowered chat bots for learning as conversation where a user does not read a passage but gains information and knowledge through conversation with a teacher bot. Our information-acquisition-oriented dialogue system employs a novel adaptation of reinforced self-play so that the system can be transferred to various domains without in-domain dialogue data, and can carry out conversations both informative and attentive to users. Our extensive subjective and objective evaluations on three large public data corpora demonstrate the effectiveness of our system to deliver knowledge-intensive and attentive conversations and help end users substantially gain knowledge without reading passages. Our code and datasets are publicly available for follow-up research.

Submitted 29 May, 2022; originally announced May 2022.

Comments: 10 pages, accepted by NAACL 2022

arXiv:2205.14226 [pdf, other]

Fast and Light-Weight Answer Text Retrieval in Dialogue Systems

Authors: Hui Wan, Siva Sankalp Patel, J. William Murdock, Saloni Potdar, Sachindra Joshi

Abstract: Dialogue systems can benefit from being able to search through a corpus of text to find information relevant to user requests, especially when encountering a request for which no manually curated response is available. The state-of-the-art technology for neural dense retrieval or re-ranking involves deep learning models with hundreds of millions of parameters. However, it is difficult and expensive to get such models to operate at an industrial scale, especially for cloud services that often need to support a large number of individually customized dialogue systems, each with its own text corpus. We report our work on enabling advanced neural dense retrieval systems to operate effectively at scale on relatively inexpensive hardware. We compare with leading alternative industrial solutions and show that we can provide a solution that is effective, fast, and cost-efficient.

Submitted 31 May, 2022; v1 submitted 27 May, 2022; originally announced May 2022.

Comments: Accepted in NAACL-HLT 2022 Industry Track

arXiv:2203.06802 [pdf]

In-depth characterization and analysis of simple shear flows over regularly arranged micro pillars, II. Effect of pillar arrangement

Authors: Yanxing Wang, Hui Wan, Tie Wei, Fangjun Shu

Abstract: Through high-fidelity numerical simulation, the effect of the arrangement of micropillars on the flow characteristics and momentum transport has been extensively investigated. The surface friction due to the complex flow characteristics and momentum transport mechanism has also been studied in depth. The micropillars are arranged in a quadrilateral, and different arrangements are acquired by changing the streamwise and spanwise distances between pillar rows. The results show that the streamwise and spanwise pillar distances have their own different influences. When the streamwise pillar distance is small, the micro eddies in the gaps between the streamwise neighboring pillars are significantly suppressed. The increase in the spanwise pillar distance enhances the momentum transport from the flow above the pillar array to the flow in the spaces among micro pillars. When the spanwise pillar distance is small, the micro eddies in the gaps between the streamwise neighboring pillars connect with each other and form a tubular eddy between each pair of spanwise pillar rows. The tubular eddies significantly reduce the momentum transport from the upper flow to the lower flow. The increase in the streamwise pillar distance increases the momentum flux slightly. The surface friction can be decomposed into three components which are associated with two factors, the dilution effect of the number density of micro pillars and the multi-faceted effects of micro eddies. These two factors are determined by the streamwise and spanwise pillar distances. The dependence of the total friction and its components on the pillar distances has been thoroughly examined.

Submitted 13 March, 2022; originally announced March 2022.

arXiv:2203.05607 [pdf]

In-depth characterization and analysis of simple shear flows over regularly arranged micro pillars, I. Effect of fluid inertia

Authors: Yanxing Wang, Hui Wan, Tie Wei, Fangjun Shu

Abstract: Through high-fidelity numerical simulation, the simple shear flow over regularly arranged micro pillars has been investigated. The essential issues to be addressed include the characteristics of a simple shear flow over a quadrilateral array of micro pillars, the effect of fluid inertia on the basic flow pattern, and the decomposition of the complex surface friction. The results show that the flow is characterized by a series of microscale recirculating eddies in the gaps between the streamwise neighboring pillars. The recirculation of the micro eddies and the oscillation of the overhead flow climbing over the pillar tips create a local flow advection. At smaller Reynolds numbers, the fluid inertia is weak and the flow patterns are symmetrical about the pillar center. When the Reynolds number is sufficiently large, the fluid inertia takes effect and breaks the symmetrical patterns. The overhead flow tilts downward, forming a spiral long-range advection between the fluid flow above the pillar array and the flow in the spaces among micro pillars. The local advection and long-range advection constitute the transport mechanism in the wall-normal direction. On micro-structured walls, the total friction includes the reaction forces of micro pillars due to flow shear and flow pressure at pillar surfaces and the reaction force of the bottom plane due to flow shear on the bottom surface. For larger Reynolds numbers, fluid inertia prevents the fluid from flowing along the curved surface of micro pillars and reduces the equivalent shear stress of the pillar reaction force due to flow shear. At the same time, the fluid inertia makes the overhead flow impact the windward side of micro pillars more strongly and therefore increases the equivalent shear stress of the pillar reaction force due to flow pressure.

Submitted 10 March, 2022; originally announced March 2022.

arXiv:2203.05113 [pdf]

doi 10.1063/5.0094725

Enhancement of heat and mass transfer by herringbone microstructures in a simple shear flow

Authors: Yanxing Wang, Hui Wan, Tie Wei, John Abraham

Abstract: The heat and mass transfer characteristics in a simple shear flow over staggered herringbone structures are numerically investigated with the lattice Boltzmann method. Two flow motions are identified. The first is a spiral flow oscillation above the herringbone structures that advects heat and mass from the top plane to herringbone structures. The second is a flow recirculation in the grooves between herringbone ridges that advects heat and mass from the area around herringbone tips to the side walls of herringbone ridges and the bottom surfaces. These two basic flow motions couple together to form complex transport mechanisms. The results show that when advective heat and mass transfer takes effect at relatively larger Reynolds and Schmidt numbers, the dependence of the total transfer rate on the Schmidt number follows a power law, with the power being the same as that in the Dittus-Boelter equation for turbulent heat transfer. As Reynolds number increases, the dependence of the total transfer rate on Reynolds number also approaches a power law, and the power is close to that in the Dittus-Boelter equation.

Submitted 9 March, 2022; originally announced March 2022.

arXiv:2203.01414 [pdf, other]

ICARUS: A Specialized Architecture for Neural Radiance Fields Rendering

Authors: Chaolin Rao, Huangjie Yu, Haochuan Wan, Jindong Zhou, Yueyang Zheng, Yu Ma, Anpei Chen, Minye Wu, Binzhe Yuan, Pingqiang Zhou, Xin Lou, Jingyi Yu

Abstract: The practical deployment of Neural Radiance Fields (NeRF) in rendering applications faces several challenges, with the most critical one being low rendering speed on even high-end graphic processing units (GPUs). In this paper, we present ICARUS, a specialized accelerator architecture tailored for NeRF rendering. Unlike GPUs using general purpose computing and memory architectures for NeRF, ICARUS executes the complete NeRF pipeline using dedicated plenoptic cores (PLCore) consisting of a positional encoding unit (PEU), a multi-layer perceptron (MLP) engine, and a volume rendering unit (VRU). A PLCore takes in positions and directions and renders the corresponding pixel colors without any intermediate data going off-chip for temporary storage and exchange, which can be time and power consuming. To implement the most expensive component of NeRF, i.e., the MLP, we transform the fully connected operations to approximated reconfigurable multiple constant multiplications (MCMs), where common subexpressions are shared across different multiplications to improve the computation efficiency. We build a prototype ICARUS using Synopsys HAPS-80 S104, a field programmable gate array (FPGA)-based prototyping system for large-scale integrated circuits and systems design. We evaluate the power-performance-area (PPA) of a PLCore using 40nm LP CMOS technology. Working at 400 MHz, a single PLCore occupies 16.5 $mm^2$ and consumes 282.8 mW, translating to 0.105 uJ/sample. The results are compared with those of GPU and tensor processing unit (TPU) implementations.

Submitted 26 September, 2022; v1 submitted 28 February, 2022; originally announced March 2022.

arXiv:2202.10850

How to Verify Identity in the Continuous Variable Quantum System?

Authors: Xing-Qiang Zhao, Hai Wan, Lv-Zhou Li

Abstract: Continuous variable quantum cryptography has developed rapidly in recent decades, but how to verify identity in the continuous variable quantum system is still an urgent issue to be solved. To solve this problem, we propose a continuous variable quantum identification (CV-QI) protocol based on the correlation of two-mode squeezed vacuum state and the continuous variable teleportation. The bidirectional identity verification between two participants of the communication can be achieved by the proposed CV-QI protocol. In order to guarantee the security, we make full use of the decoy state sequences during the whole process of the proposed CV-QI protocol. Besides, we provide the security analyses of the proposed CV-QI protocol, and analyses indicate that the security of the proposed CV-QI protocol is guaranteed.

Submitted 16 March, 2022; v1 submitted 22 February, 2022; originally announced February 2022.

Comments: Further correction

arXiv:2111.00207 [pdf, other]

PatchFormer: An Efficient Point Transformer with Patch Attention

Authors: Zhang Cheng, Haocheng Wan, Xinyi Shen, Zizhao Wu

Abstract: The point cloud learning community witnesses a modeling shift from CNNs to Transformers, where pure Transformer architectures have achieved top accuracy on the major learning benchmarks. However, existing point Transformers are computationally expensive since they need to generate a large attention map, which has quadratic complexity (both in space and time) with respect to input size. To solve this shortcoming, we introduce Patch ATtention (PAT) to adaptively learn a much smaller set of bases upon which the attention maps are computed. By a weighted summation upon these bases, PAT not only captures the global shape context but also achieves linear complexity with respect to input size. In addition, we propose a lightweight Multi-Scale aTtention (MST) block to build attentions among features of different scales, providing the model with multi-scale features. Equipped with the PAT and MST, we construct our neural architecture called PatchFormer that integrates both modules into a joint framework for point cloud learning. Extensive experiments demonstrate that our network achieves comparable accuracy on general point cloud learning tasks with a 9.2x speed-up over previous point Transformers.

Submitted 24 March, 2022; v1 submitted 30 October, 2021; originally announced November 2021.

Comments: 10 pages

arXiv:2110.10007 [pdf, other]

doi 10.1016/j.artint.2022.103789

Gradient-Based Mixed Planning with Symbolic and Numeric Action Parameters

Authors: Kebing Jin, Hankz Hankui Zhuo, Zhanhao Xiao, Hai Wan, Subbarao Kambhampati

Abstract: Dealing with planning problems with both logical relations and numeric changes in real-world dynamic environments is challenging. Existing numeric planning systems for the problem often discretize numeric variables or impose convex constraints on numeric variables, which harms the performance when solving problems. In this paper, we propose a novel algorithm framework to solve numeric planning problems mixed with logical relations and numeric changes based on gradient descent. We cast the numeric planning with logical relations and numeric changes as an optimization problem. Specifically, we extend syntax to allow parameters of action models to be either objects or real-valued numbers, which enhances the ability to model real-world numeric effects. Based on the extended modeling language, we propose a gradient-based framework to simultaneously optimize numeric parameters and compute appropriate actions to form candidate plans. The gradient-based framework is composed of an algorithmic heuristic module based on propositional operations to select actions and generate constraints for gradient descent, an algorithmic transition module to update states to next ones, and a loss module to compute loss. We repeatedly minimize loss by updating numeric parameters and compute candidate plans until it converges into a valid plan for the planning problem. In the empirical study, we exhibit that our algorithm framework is both effective and efficient in solving planning problems mixed with logical relations and numeric changes, especially when the problems contain obstacles and non-linear numeric effects.

Submitted 9 October, 2022; v1 submitted 19 October, 2021; originally announced October 2021.

Comments: 41 pages, 22 figures. Accepted by Artificial Intelligence

arXiv:2110.03772 [pdf, other]

CondiDiag1.0: A flexible online diagnostic tool for conditional sampling and budget analysis in the E3SM atmosphere model (EAM)

Authors: Hui Wan, Kai Zhang, Philip J. Rasch, Vincent E. Larson, Xubin Zeng, Shixuan Zhang, Ross Dixon

Abstract: Numerical models used in weather and climate prediction take into account a comprehensive set of atmospheric processes such as the resolved and unresolved fluid dynamics, radiative transfer, cloud and aerosol life cycles, and mass or energy exchanges with the Earth's surface. In order to identify model deficiencies and improve predictive skills, it is important to obtain process-level understanding of the interactions between different processes. Conditional sampling and budget analysis are powerful tools for process-oriented model evaluation, but they often require tedious ad hoc coding and large amounts of instantaneous model output, resulting in inefficient use of human and computing resources. This paper presents an online diagnostic tool that addresses this challenge by monitoring model variables in a generic manner as they evolve within the time integration cycle. The tool is convenient to use. It allows users to select sampling conditions and specify monitored variables at run time. Both the evolving values of the model variables and their increments caused by different atmospheric processes can be monitored and archived. Online calculation of vertical integrals is also supported. Multiple sampling conditions can be monitored in a single simulation in combination with unconditional sampling. The paper explains in detail the design and implementation of the tool in the Energy Exascale Earth System Model (E3SM) version 1. The usage is demonstrated through three examples: a global budget analysis of dust aerosol mass concentration, a composite analysis of sea salt emission and its dependency on surface wind speed, and a conditionally sampled relative humidity budget. The tool is expected to be easily portable to closely related atmospheric models that use the same or similar data structures and time integration methods.

Submitted 7 October, 2021; originally announced October 2021.

arXiv:2109.12595 [pdf, other]

doi 10.18653/v1/2021.emnlp-main.498

MultiDoc2Dial: Modeling Dialogues Grounded in Multiple Documents

Authors: Song Feng, Siva Sankalp Patel, Hui Wan, Sachindra Joshi

Abstract: We propose MultiDoc2Dial, a new task and dataset on modeling goal-oriented dialogues grounded in multiple documents. Most previous works treat document-grounded dialogue modeling as a machine reading comprehension task based on a single given document or passage. In this work, we aim to address more realistic scenarios where a goal-oriented information-seeking conversation involves multiple topics, and hence is grounded on different documents. To facilitate such a task, we introduce a new dataset that contains dialogues grounded in multiple documents from four different domains. We also explore modeling the dialogue-based and document-based context in the dataset. We present strong baseline approaches and various experimental results, aiming to support further research efforts on such a task.

Submitted 26 September, 2021; originally announced September 2021.

Journal ref: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

arXiv:2109.04495 [pdf, other]

doi 10.3934/dcds.2022141

Slope Gap Distribution of Saddle Connections on the 2n-gon

Authors: Jonah Berman, Taylor McAdam, Ananth Miller-Murthy, Caglar Uyanik, Hamilton Wan

Abstract: We explicitly compute the limiting slope gap distribution for saddle connections on any 2n-gon. Our calculations show that the slope gap distribution for a translation surface is not always unimodal, answering a question of Athreya. We also give linear lower and upper bounds for the number of non-differentiability points as n grows. The latter result exhibits the first example of a non-trivial bound on an infinite family of translation surfaces and answers a question by Kumanduri-Sanchez-Wang.

Submitted 18 June, 2022; v1 submitted 9 September, 2021; originally announced September 2021.

Comments: 51 pages, 34 figures. v2: Incorporates referee comments, improves upper bound and provides high level summaries at various points

Journal ref: Discrete and Continuous Dynamical Systems, 2023, 43(1): 1-56

arXiv:2109.02776 [pdf, other]

Net Buying Pressure and the Information in Bitcoin Option Trades

Authors: Carol Alexander, Jun Deng, Jianfen Feng, Huning Wan

Abstract: How do supply and demand from informed traders drive market prices of bitcoin options? Deribit options tick-level data supports the limits-to-arbitrage hypothesis about the market maker's supply. The main demand-side effects are that at-the-money option prices are largely driven by volatility traders and out-of-the-money options are simultaneously driven by volatility traders and those with proprietary information about the direction of future bitcoin price movements. The demand-side trading results contrast with prior studies on established options markets in the US and Asia, but we also show that Deribit is rapidly evolving into a more efficient channel for aggregating information from informed traders.

Submitted 25 March, 2022; v1 submitted 6 September, 2021; originally announced September 2021.

Comments: 35 pages

arXiv:2109.01993 [pdf, ps, other]

Statistical computation methods for microbiome compositional data network inference

Authors: Liang Chen, Qiuyan He, Hui Wan, Shun He, Minghua Deng

Abstract: Microbes can affect processes from food production to human health. Such microbes are not isolated, but rather interact with each other and establish connections with their living environments. Understanding these interactions is essential to an understanding of the organization and complex interplay of microbial communities, as well as the structure and dynamics of various ecosystems. A common and essential approach toward this objective involves the inference of microbiome interaction networks. Although network inference methods in other fields have been studied before, applying these methods to estimate microbiome associations based on compositional data will not yield valid results. On the one hand, features of microbiome data such as compositionality, sparsity and high-dimensionality challenge the data normalization and the design of computational methods. On the other hand, several issues like microbial community heterogeneity, external environmental interference and biological concerns also make it more difficult to deal with the network inference. In this paper, we provide a comprehensive review of emerging microbiome interaction network inference methods. According to various assumptions and research targets, estimated networks are divided into four main categories: correlation networks, conditional correlation networks, mixture networks and differential networks. Their scope of applications, advantages and limitations are presented in this review. Since real microbial interactions can be complex and dynamic, no unifying method has captured all the aspects of interest to date. In addition, we discuss the challenges now confronting current microbial associations study and future prospects. Finally, we highlight that the research in microbial network inference requires the joint promotion of statistical computation methods and experimental techniques.

Submitted 5 September, 2021; originally announced September 2021.

arXiv:2108.06076 [pdf, other]

PVT: Point-Voxel Transformer for Point Cloud Learning

Authors: Cheng Zhang, Haocheng Wan, Xinyi Shen, Zizhao Wu

Abstract: The recently developed pure Transformer architectures have attained promising accuracy on point cloud learning benchmarks compared to convolutional neural networks. However, existing point cloud Transformers are computationally expensive since they waste a significant amount of time on structuring the irregular data. To solve this shortcoming, we present the Sparse Window Attention (SWA) module to gather coarse-grained local features from non-empty voxels, which not only bypasses the expensive irregular data structuring and invalid empty voxel computation, but also obtains linear computational complexity with respect to voxel resolution. Meanwhile, to gather fine-grained features about the global shape, we introduce the relative attention (RA) module, a more robust self-attention variant for rigid transformations of objects. Equipped with the SWA and RA, we construct our neural architecture called PVT that integrates both modules into a joint framework for point cloud learning. Compared with previous Transformer-based and attention-based models, our method attains top accuracy of 94.0% on classification benchmark and 10x inference speedup on average. Extensive experiments also validate the effectiveness of PVT on part and semantic segmentation benchmarks (86.6% and 69.2% mIoU, respectively).

Submitted 25 May, 2022; v1 submitted 13 August, 2021; originally announced August 2021.

Comments: 29 pages

arXiv:2106.13092 [pdf, other]

doi 10.1145/3487351.3488336

BotRGCN: Twitter Bot Detection with Relational Graph Convolutional Networks

Authors: Shangbin Feng, Herun Wan, Ningnan Wang, Minnan Luo

Abstract: Twitter bot detection is an important and challenging task. Existing bot detection measures fail to address the challenge of community and disguise, falling short of detecting bots that disguise as genuine users and attack collectively. To address these two challenges of Twitter bot detection, we propose BotRGCN, which is short for Bot detection with Relational Graph Convolutional Networks. BotRGCN addresses the challenge of community by constructing a heterogeneous graph from follow relationships and applies relational graph convolutional networks. Apart from that, BotRGCN makes use of multi-modal user semantic and property information to avoid feature engineering and augment its ability to capture bots with diversified disguise. Extensive experiments demonstrate that BotRGCN outperforms competitive baselines on a comprehensive benchmark TwiBot-20 which provides follow relationships.

Submitted 27 September, 2021; v1 submitted 24 June, 2021; originally announced June 2021.

Comments: accepted at ASONAM 2021

arXiv:2106.13089 [pdf, other]

doi 10.1145/3459637.3481949

SATAR: A Self-supervised Approach to Twitter Account Representation Learning and its Application in Bot Detection

Authors: Shangbin Feng, Herun Wan, Ningnan Wang, Jundong Li, Minnan Luo

Abstract: Twitter has become a major social media platform since its launching in 2006, while complaints about bot accounts have increased recently. Although extensive research efforts have been made, the state-of-the-art bot detection methods fall short of generalizability and adaptability. Specifically, previous bot detectors leverage only a small fraction of user information and are often trained on datasets that only cover few types of bots. As a result, they fail to generalize to real-world scenarios on the Twittersphere where different types of bots co-exist. Additionally, bots in Twitter are constantly evolving to evade detection. Previous efforts, although effective once in their context, fail to adapt to new generations of Twitter bots. To address the two challenges of Twitter bot detection, we propose SATAR, a self-supervised representation learning framework of Twitter users, and apply it to the task of bot detection. In particular, SATAR generalizes by jointly leveraging the semantics, property and neighborhood information of a specific user. Meanwhile, SATAR adapts by pre-training on a massive number of self-supervised users and fine-tuning on detailed bot detection scenarios. Extensive experiments demonstrate that SATAR outperforms competitive baselines on different bot detection datasets of varying information completeness and collection time. SATAR is also proved to generalize in real-world scenarios and adapt to evolving generations of social media bots.

Submitted 27 August, 2021; v1 submitted 24 June, 2021; originally announced June 2021.

Comments: accepted at CIKM 2021

arXiv:2106.13088 [pdf, other]

doi 10.1145/3459637.3482019

TwiBot-20: A Comprehensive Twitter Bot Detection Benchmark

Authors: Shangbin Feng, Herun Wan, Ningnan Wang, Jundong Li, Minnan Luo

Abstract: Twitter has become a vital social media platform while an ample amount of malicious Twitter bots exist and induce undesirable social effects. Successful Twitter bot detection proposals are generally supervised, which rely heavily on large-scale datasets. However, existing benchmarks generally suffer from low levels of user diversity, limited user information and data scarcity. Therefore, these datasets are not sufficient to train and stably benchmark bot detection measures. To alleviate these problems, we present TwiBot-20, a massive Twitter bot detection benchmark, which contains 229,573 users, 33,488,192 tweets, 8,723,736 user property items and 455,958 follow relationships. TwiBot-20 covers diversified bots and genuine users to better represent the real-world Twittersphere. TwiBot-20 also includes three modals of user information to support both binary classification of single users and community-aware approaches. To the best of our knowledge, TwiBot-20 is the largest Twitter bot detection benchmark to date. We reproduce competitive bot detection methods and conduct a thorough evaluation on TwiBot-20 and two other public datasets. Experiment results demonstrate that existing bot detection measures fail to match their previously claimed performance on TwiBot-20, which suggests that Twitter bot detection remains a challenging task and requires further research efforts.

Submitted 27 August, 2021; v1 submitted 24 June, 2021; originally announced June 2021.

Comments: accepted at CIKM 2021

arXiv:2106.10837 [pdf]

doi 10.1021/acs.nanolett.3c04104

Electrochemical control of ferroelectricity in hafnia-based ferroelectric devices using reversible oxygen migration

Authors: M. H. Shao, H. F. Liu, R. He, X. M. Li, L. Wu, J. Ma, X. C. Hu, R. T. Zhao, Z. C. Zhong, Y. Yu, C. H. Wan, Y. Yang, C. -W. Nan, X. D. Bai, T. -L. Ren, X. Renshaw Wang

Abstract: Ferroelectricity, especially in hafnia-based thin films at nanosizes, has been rejuvenated in the fields of low-power, nonvolatile and Si-compatible modern memory and logic applications. Despite tremendous efforts to explore the formation of the metastable ferroelectric phase and the polarization degradation during field cycling, the ability of oxygen vacancy to exactly engineer and switch polarization remains to be elucidated. Here we report reversible electrochemical control of ferroelectricity in Hf$_{0.5}$Zr$_{0.5}$O$_2$ (HZO) heterostructures with a mixed ionic-electronic LaSrMnO$_3$ electrode, achieving a hard breakdown field more than 18 MV/cm, over fourfold as high as that of typical HZO. The electrical extraction and insertion of oxygen into HZO is macroscopically characterized and atomically imaged in situ. Utilizing this reversible process, we achieved multiple polarization states and even repeatedly repaired the damaged ferroelectricity by reversed negative electric fields. Our study demonstrates the robust and switchable ferroelectricity in hafnia oxide distinctly associated with oxygen vacancy and opens up opportunities to recover, manipulate, and utilize rich ferroelectric functionalities for advanced ferroelectric functionality to empower the existing Si-based electronics such as multi-bit storage.

Submitted 20 June, 2021; originally announced June 2021.

arXiv:2105.04341 [pdf, other]

doi 10.1063/5.0056534

Quantum networks based on color centers in diamond

Authors: Maximilian Ruf, Noel H. Wan, Hyeongrak Choi, Dirk Englund, Ronald Hanson

Abstract: With the ability to transfer and process quantum information, large-scale quantum networks will enable a suite of fundamentally new applications, from quantum communications to distributed sensing, metrology, and computing. This perspective reviews requirements for quantum network nodes and color centers in diamond as suitable node candidates. We give a brief overview of state-of-the-art quantum network experiments employing color centers in diamond, and discuss future research directions, focusing in particular on the control and coherence of qubits that distribute and store entangled states, and on efficient spin-photon interfaces. We discuss a route towards large-scale integrated devices combining color centers in diamond with other photonic materials and give an outlook towards realistic future quantum network protocol implementations and applications.

Submitted 10 May, 2021; originally announced May 2021.

Comments: This is the draft text of an invited perspective article. We appreciate comments and suggestions to improve the work

arXiv:2104.05666 [pdf, other]

View-Guided Point Cloud Completion

Authors: Xuancheng Zhang, Yutong Feng, Siqi Li, Changqing Zou, Hai Wan, Xibin Zhao, Yandong Guo, Yue Gao

Abstract: This paper presents a view-guided solution for the task of point cloud completion. Unlike most existing methods directly inferring the missing points using shape priors, we address this task by introducing ViPC (view-guided point cloud completion) that takes the missing crucial global structure information from an extra single-view image. By leveraging a framework that sequentially performs effective cross-modality and cross-level fusions, our method achieves significantly superior results over typical existing solutions on a new large-scale dataset we collect for the view-guided point cloud completion task.

Submitted 13 April, 2021; v1 submitted 12 April, 2021; originally announced April 2021.

Comments: 10 pages, 8 figures, CVPR2021

arXiv:2104.04488 [pdf, other]

Explaining Neural Network Predictions on Sentence Pairs via Learning Word-Group Masks

Authors: Hanjie Chen, Song Feng, Jatin Ganhotra, Hui Wan, Chulaka Gunasekara, Sachindra Joshi, Yangfeng Ji

Abstract: Explaining neural network models is important for increasing their trustworthiness in real-world applications. Most existing methods generate post-hoc explanations for neural network models by identifying individual feature attributions or detecting interactions between adjacent features. However, for models with text pairs as inputs (e.g., paraphrase identification), existing methods are not sufficient to capture feature interactions between two texts and their simple extension of computing all word-pair interactions between two texts is computationally inefficient. In this work, we propose the Group Mask (GMASK) method to implicitly detect word correlations by grouping correlated words from the input text pair together and measure their contribution to the corresponding NLP tasks as a whole. The proposed method is evaluated with two different model architectures (decomposable attention model and BERT) across four datasets, including natural language inference and paraphrase identification tasks. Experiments show the effectiveness of GMASK in providing faithful explanations to these models.

Submitted 13 April, 2021; v1 submitted 9 April, 2021; originally announced April 2021.

Comments: NAACL-HLT 2021

arXiv:2103.02384 [pdf, ps, other]

How to Identify Boundary Conditions with Contrasty Metric?

Authors: Weilin Luo, Hai Wan, Xiaotong Song, Binhao Yang, Hongzhen Zhong, Yin Chen

Abstract: The boundary conditions (BCs) have shown great potential in requirements engineering because a BC captures the particular combination of circumstances, i.e., divergence, in which the goals of the requirement cannot be satisfied as a whole. Existing research has attempted to automatically identify lots of BCs. Unfortunately, a large number of identified BCs make assessing and resolving divergences expensive. Existing methods adopt a coarse-grained metric, generality, to filter out less general BCs. However, the results still retain a large number of redundant BCs since a general BC potentially captures redundant circumstances that do not lead to a divergence. Furthermore, the likelihood of BC can be misled by redundant BCs, resulting in costly repeated assessing and resolving of divergences. In this paper, we present a fine-grained metric to filter out the redundant BCs. We first introduce the concept of contrasty of BC. Intuitively, if two BCs are contrastive, they capture different divergences. We argue that a set of contrastive BCs should be recommended to engineers, rather than a set of general BCs that potentially only indicates the same divergence. Then we design a post-processing framework (PPAc) to produce a set of contrastive BCs after identifying BCs. Experimental results show that the contrasty metric dramatically reduces the number of BCs recommended to engineers. Results also demonstrate that lots of BCs identified by the state-of-the-art method are redundant in most cases. Besides, to improve efficiency, we propose a joint framework (JAc) to interleave assessing based on the contrasty metric with identifying BCs. The primary intuition behind JAc is that it considers the search bias toward contrastive BCs during identifying BCs, thereby pruning the BCs capturing the same divergence. Experiments confirm the improvements of JAc in identifying contrastive BCs.

Submitted 3 March, 2021; originally announced March 2021.

Comments: to be published in ICSE21

arXiv:2102.11482 [pdf, other]

Structural Similarity of Boundary Conditions and an Efficient Local Search Algorithm for Goal Conflict Identification

Authors: Hongzhen Zhong, Hai Wan, Weilin Luo, Zhanhao Xiao, Jia Li, Biqing Fang

Abstract: In goal-oriented requirements engineering, goal conflict identification is of fundamental importance for requirements analysis. The task aims to find the feasible situations which make the goals diverge within the domain, called boundary conditions (BCs). However, the existing approaches for goal conflict identification fail to find sufficient BCs and general BCs which cover more combinations of circumstances. From the BCs found by these existing approaches, we have observed an interesting phenomenon: some pairs of BCs are similar in formula structure, which occurs frequently in the experimental cases. In other words, once a BC is found, a new BC may be discovered quickly by slightly changing the former. It inspires us to develop a local search algorithm named LOGION to find BCs, in which the structural similarity is captured by the neighborhood relation of formulae. Based on structural similarity, LOGION can find a lot of BCs in a short time. Moreover, due to the large number of BCs identified, it potentially selects more general BCs from them. By taking experiments on a set of cases, we show that LOGION effectively exploits the structural similarity of BCs. We also compare our algorithm against the two state-of-the-art approaches. The experimental results show that LOGION produces one order of magnitude more BCs than the state-of-the-art approaches and confirm that LOGION finds out more general BCs thanks to a large number of BCs.

Submitted 22 February, 2021; originally announced February 2021.

arXiv:2011.06623 [pdf, other]

doc2dial: A Goal-Oriented Document-Grounded Dialogue Dataset

Authors: Song Feng, Hui Wan, Chulaka Gunasekara, Siva Sankalp Patel, Sachindra Joshi, Luis A. Lastras

Abstract: We introduce doc2dial, a new dataset of goal-oriented dialogues that are grounded in the associated documents. Inspired by how the authors compose documents for guiding end users, we first construct dialogue flows based on the content elements that correspond to higher-level relations across text sections as well as lower-level relations between discourse units within a section. Then we present these dialogue flows to crowd contributors to create conversational utterances. The dataset includes about 4800 annotated conversations with an average of 14 turns that are grounded in over 480 documents from four domains. Compared to the prior document-grounded dialogue datasets, this dataset covers a variety of dialogue scenes in information-seeking conversations. For evaluating the versatility of the dataset, we introduce multiple dialogue modeling tasks and present baseline approaches.

Submitted 18 November, 2020; v1 submitted 12 November, 2020; originally announced November 2020.

Comments: EMNLP 2020

arXiv:2011.03956 [pdf]

Continuous nucleation switching driven by spin-orbit torques

Authors: C. H. Wan, M. E. Stebliy, X. Wang, G. Q. Yu, X. F. Han, A. G. Kolesnikov, M. A. Bazrov, M. E. Letushev, A. V. Ognev, A. S. Samardak

Abstract: Continuous switching driven by spin-orbit torque (SOT) is preferred to realize neuromorphic computing in a spintronic manner. Here we have applied focused ion beam (FIB) to selectively illuminate patterned regions in a Pt/Co/MgO strip with perpendicular magnetic anisotropy (PMA), soften the illuminated areas and realize the continuous switching by a SOT-driven nucleation process. It is found that a large in-plane field is a benefit to reduce the nucleation barrier, increase the number of nucleated domains and intermediate states during the switching progress, and finally flatten the switching curve. We proposed a phenomenological model for describing the current dependence of magnetization and the dependence of the number of nucleation domains on the applied current and magnetic field. This study can thus promote the birth of SOT devices, which are promising in neuromorphic computing architectures.

Submitted 8 November, 2020; originally announced November 2020.

Comments: 12 pages with 3 figures

arXiv:2010.07479 [pdf, other]

Quantifying and attributing time step sensitivities in present-day climate simulations conducted with EAMv1

Authors: Hui Wan, Shixuan Zhang, Philip J. Rasch, Vincent E. Larson, Xubin Zeng, Huiping Yan

Abstract: This study assesses the relative importance of time integration error in present-day climate simulations conducted with the atmosphere component of the Energy Exascale Earth System Model version 1 (EAMv1) at 1-degree horizontal resolution. We show that a factor-of-6 reduction of time step size in all major parts of the model leads to significant changes in the long-term mean climate. These changes imply that the reduction of temporal truncation errors leads to a notable although unsurprising degradation of agreement between the simulated and observed present-day climate; the model would require retuning to regain optimal climate fidelity in the absence of those truncation errors. A coarse-grained attribution of the time step sensitivities is carried out by separately shortening time steps used in various components of EAM or by revising the numerical coupling between some processes. The results provide useful clues to help better understand the root causes of time step sensitivities in EAM. The experimentation strategy used here can also provide a pathway for other models to identify and reduce time integration errors.

Submitted 28 February, 2021; v1 submitted 14 October, 2020; originally announced October 2020.

arXiv:2007.12497 [pdf, other]

Advanced Mapping Robot and High-Resolution Dataset

Authors: Hongyu Chen, Zhijie Yang, Xiting Zhao, Guangyuan Weng, Haochuan Wan, Jianwen Luo, Xiaoya Ye, Zehao Zhao, Zhenpeng He, Yongxia Shen, Sören Schwertfeger

Abstract: This paper presents a fully hardware synchronized mapping robot with support for a hardware synchronized external tracking system, for super-precise timing and localization. Nine high-resolution cameras and two 32-beam 3D Lidars were used along with a professional, static 3D scanner for ground truth map collection. With all the sensors calibrated on the mapping robot, three datasets are collected to evaluate the performance of mapping algorithms within a room and between rooms. Based on these datasets we generate maps and trajectory data, which is then fed into evaluation algorithms. We provide the datasets for download and the mapping and evaluation procedures are made in a very easily reproducible manner for maximum comparability. We have also conducted a survey on available robotics-related datasets and compiled a big table with those datasets and a number of properties of them.

Submitted 23 July, 2020; originally announced July 2020.

Comments: arXiv admin note: substantial text overlap with arXiv:1905.09483

Showing 51–100 of 172 results for author: Wan, H