Showing 1–14 of 14 results for author: Lou, S

Searching in archive cs.

  1. arXiv:2405.11880  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV

    Quantifying In-Context Reasoning Effects and Memorization Effects in LLMs

    Authors: Siyu Lou, Yuntian Chen, Xiaodan Liang, Liang Lin, Quanshi Zhang

    Abstract: In this study, we propose an axiomatic system to define and quantify the precise memorization and in-context reasoning effects used by the large language model (LLM) for language generation. These effects are formulated as non-linear interactions between tokens/words encoded by the LLM. Specifically, the axiomatic system enables us to categorize the memorization effects into foundational memorizat…

    Submitted 20 May, 2024; originally announced May 2024.

  2. arXiv:2401.16318  [pdf, other]

    cs.LG cs.AI cs.CV

    Defining and Extracting generalizable interaction primitives from DNNs

    Authors: Lu Chen, Siyu Lou, Benhao Huang, Quanshi Zhang

    Abstract: Faithfully summarizing the knowledge encoded by a deep neural network (DNN) into a few symbolic primitive patterns without losing much information represents a core challenge in explainable AI. To this end, Ren et al. (2023c) have derived a series of theorems to prove that the inference score of a DNN can be explained as a small set of interactions between input variables. However, the lack of gen…

    Submitted 29 January, 2024; originally announced January 2024.

  3. arXiv:2401.13904  [pdf, other]

    cs.LG cs.AI cs.DB stat.AP

    Empowering Machines to Think Like Chemists: Unveiling Molecular Structure-Polarity Relationships with Hierarchical Symbolic Regression

    Authors: Siyu Lou, Chengchun Liu, Yuntian Chen, Fanyang Mo

    Abstract: Thin-layer chromatography (TLC) is a crucial technique in molecular polarity analysis. Despite its importance, the interpretability of predictive models for TLC, especially those driven by artificial intelligence, remains a challenge. Current approaches, utilizing either high-dimensional molecular fingerprints or domain-knowledge-driven feature engineering, often face a dilemma between expressiven…

    Submitted 24 January, 2024; originally announced January 2024.

    Comments: 33 pages, 6 figures

  4. arXiv:2310.16376  [pdf, other]

    cs.LG cs.AI

    GADY: Unsupervised Anomaly Detection on Dynamic Graphs

    Authors: Shiqi Lou, Qingyue Zhang, Shujie Yang, Yuyang Tian, Zhaoxuan Tan, Minnan Luo

    Abstract: Anomaly detection on dynamic graphs refers to detecting entities whose behaviors obviously deviate from the norms observed within graphs and their temporal information. This field has drawn increasing attention due to its application in finance, network security, social networks, and more. However, existing methods face two challenges: dynamic structure constructing challenge - difficulties in cap…

    Submitted 25 October, 2023; originally announced October 2023.

  5. arXiv:2310.16251  [pdf, other]

    cs.CL cs.AI

    Speakerly: A Voice-based Writing Assistant for Text Composition

    Authors: Dhruv Kumar, Vipul Raheja, Alice Kaiser-Schatzlein, Robyn Perry, Apurva Joshi, Justin Hugues-Nuger, Samuel Lou, Navid Chowdhury

    Abstract: We present Speakerly, a new real-time voice-based writing assistance system that helps users with text composition across various use cases such as emails, instant messages, and notes. The user can interact with the system through instructions or dictation, and the system generates a well-formatted and coherent document. We describe the system architecture and detail how we address the various cha…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: Accepted at EMNLP 2023 Industry Track

  6. arXiv:2310.09725  [pdf, other]

    cs.CL

    KGQuiz: Evaluating the Generalization of Encoded Knowledge in Large Language Models

    Authors: Yuyang Bai, Shangbin Feng, Vidhisha Balachandran, Zhaoxuan Tan, Shiqi Lou, Tianxing He, Yulia Tsvetkov

    Abstract: Large language models (LLMs) demonstrate remarkable performance on knowledge-intensive tasks, suggesting that real-world knowledge is encoded in their model parameters. However, besides explorations on a few probing tasks in limited knowledge domains, it is not well understood how to evaluate LLMs' knowledge systematically and how well their knowledge abilities generalize, across a spectrum of kno…

    Submitted 23 March, 2024; v1 submitted 15 October, 2023; originally announced October 2023.

    Comments: TheWebConf 2024

  7. arXiv:2309.07672  [pdf]

    cs.LG math.NA stat.AP

    Physics-constrained robust learning of open-form partial differential equations from limited and noisy data

    Authors: Mengge Du, Yuntian Chen, Longfeng Nie, Siyu Lou, Dongxiao Zhang

    Abstract: Unveiling the underlying governing equations of nonlinear dynamic systems remains a significant challenge. Insufficient prior knowledge hinders the determination of an accurate candidate library, while noisy observations lead to imprecise evaluations, which in turn result in redundant function terms or erroneous equations. This study proposes a framework to robustly uncover open-form partial diffe…

    Submitted 29 April, 2024; v1 submitted 14 September, 2023; originally announced September 2023.

  8. arXiv:2306.03516  [pdf, other]

    cs.IR cs.LG

    COPR: Consistency-Oriented Pre-Ranking for Online Advertising

    Authors: Zhishan Zhao, Jingyue Gao, Yu Zhang, Shuguang Han, Siyuan Lou, Xiang-Rong Sheng, Zhe Wang, Han Zhu, Yuning Jiang, Jian Xu, Bo Zheng

    Abstract: Cascading architecture has been widely adopted in large-scale advertising systems to balance efficiency and effectiveness. In this architecture, the pre-ranking model is expected to be a lightweight approximation of the ranking model, which handles more candidates with strict latency requirements. Due to the gap in model capacity, the pre-ranking and ranking models usually generate inconsistent ra…

    Submitted 9 October, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

  9. arXiv:2305.12837  [pdf, other]

    cs.IR cs.AI cs.LG

    Capturing Conversion Rate Fluctuation during Sales Promotions: A Novel Historical Data Reuse Approach

    Authors: Zhangming Chan, Yu Zhang, Shuguang Han, Yong Bai, Xiang-Rong Sheng, Siyuan Lou, Jiacen Hu, Baolin Liu, Yuning Jiang, Jian Xu, Bo Zheng

    Abstract: Conversion rate (CVR) prediction is one of the core components in online recommender systems, and various approaches have been proposed to obtain accurate and well-calibrated CVR estimation. However, we observe that a well-trained CVR prediction model often performs sub-optimally during sales promotions. This can be largely ascribed to the problem of the data distribution shift, in which the conve…

    Submitted 26 June, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: Accepted at KDD 2023. This work has already been deployed on the display advertising system in Alibaba, bringing substantial economic gains

  10. arXiv:2304.01811  [pdf, other]

    cs.LG cs.AI cs.CV

    HarsanyiNet: Computing Accurate Shapley Values in a Single Forward Propagation

    Authors: Lu Chen, Siyu Lou, Keyan Zhang, Jin Huang, Quanshi Zhang

    Abstract: The Shapley value is widely regarded as a trustworthy attribution metric. However, when people use Shapley values to explain the attribution of input variables of a deep neural network (DNN), it usually requires a very high computational cost to approximate relatively accurate Shapley values in real-world applications. Therefore, we propose a novel network architecture, the HarsanyiNet, which make…

    Submitted 1 December, 2023; v1 submitted 4 April, 2023; originally announced April 2023.

  11. arXiv:2302.13095  [pdf, other]

    cs.LG cs.AI cs.CV

    Bayesian Neural Networks Avoid Encoding Complex and Perturbation-Sensitive Concepts

    Authors: Qihan Ren, Huiqi Deng, Yunuo Chen, Siyu Lou, Quanshi Zhang

    Abstract: In this paper, we focus on mean-field variational Bayesian Neural Networks (BNNs) and explore the representation capacity of such BNNs by investigating which types of concepts are less likely to be encoded by the BNN. It has been observed and studied that a relatively small set of interactive concepts usually emerge in the knowledge representation of a sufficiently-trained neural network, and such…

    Submitted 1 December, 2023; v1 submitted 25 February, 2023; originally announced February 2023.

  12. HiPAL: A Deep Framework for Physician Burnout Prediction Using Activity Logs in Electronic Health Records

    Authors: Hanyang Liu, Sunny S. Lou, Benjamin C. Warner, Derek R. Harford, Thomas Kannampallil, Chenyang Lu

    Abstract: Burnout is a significant public health concern affecting nearly half of the healthcare workforce. This paper presents the first end-to-end deep learning framework for predicting physician burnout based on electronic health record (EHR) activity logs, digital traces of physician work activities that are available in any EHR system. In contrast to prior approaches that exclusively relied on surveys…

    Submitted 8 July, 2022; v1 submitted 23 May, 2022; originally announced May 2022.

    Comments: 11 pages including appendices. Accepted by KDD'22

    Journal ref: KDD 2022

  13. arXiv:2203.13645  [pdf, other]

    cs.SD cs.CL eess.AS

    Audio-text Retrieval in Context

    Authors: Siyu Lou, Xuenan Xu, Mengyue Wu, Kai Yu

    Abstract: Audio-text retrieval based on natural language descriptions is a challenging task. It involves learning cross-modality alignments between long sequences under inadequate data conditions. In this work, we investigate several audio features as well as sequence aggregation methods for better audio-text alignment. Moreover, through a qualitative analysis we observe that semantic mapping is more import…

    Submitted 29 March, 2022; v1 submitted 25 March, 2022; originally announced March 2022.

  14. arXiv:1811.09618  [pdf, other]

    cs.CV

    NeuroTreeNet: A New Method to Explore Horizontal Expansion Network

    Authors: Shenlong Lou, Yan Luo, Qiancong Fan, Feng Chen, Yiping Chen, Cheng Wang, Jonathan Li

    Abstract: It is widely recognized that the deeper networks or networks with more feature maps have better performance. Existing studies mainly focus on extending the network depth and increasing the feature maps of networks. At the same time, horizontal expansion network (e.g. Inception Model) as an alternative way to improve network performance has not been fully investigated. Accordingly, we proposed Neur…

    Submitted 22 November, 2018; originally announced November 2018.