Zum Hauptinhalt springen

Showing 1–13 of 13 results for author: Hu, J Y

Searching in archive cs. Search in all archives.
.
  1. arXiv:2407.01079  [pdf, other

    stat.ML cs.AI cs.LG

    On Statistical Rates and Provably Efficient Criteria of Latent Diffusion Transformers (DiTs)

    Authors: Jerry Yao-Chieh Hu, Weimin Wu, Zhao Song, Han Liu

    Abstract: We investigate the statistical and computational limits of latent \textbf{Di}ffusion \textbf{T}ransformers (\textbf{DiT}s) under the low-dimensional linear latent space assumption. Statistically, we study the universal approximation and sample complexity of the DiTs score function, as well as the distribution recovery property of the initial data. Specifically, under mild data assumptions, we deri… ▽ More

    Submitted 22 August, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

    Comments: v2 fixed typos, added Fig. 1 and added clarifications

  2. arXiv:2406.03136  [pdf, ps, other

    cs.LG cs.AI cs.CC stat.ML

    Computational Limits of Low-Rank Adaptation (LoRA) for Transformer-Based Models

    Authors: Jerry Yao-Chieh Hu, Maojiang Su, En-Jui Kuo, Zhao Song, Han Liu

    Abstract: We study the computational limits of Low-Rank Adaptation (LoRA) update for finetuning transformer-based models using fine-grained complexity theory. Our key observation is that the existence of low-rank decompositions within the gradient computation of LoRA adaptation leads to possible algorithmic speedup. This allows us to (i) identify a phase transition behavior and (ii) prove the existence of n… ▽ More

    Submitted 5 June, 2024; originally announced June 2024.

  3. arXiv:2406.01514  [pdf, other

    cs.CL cs.AI cs.CR

    Decoupled Alignment for Robust Plug-and-Play Adaptation

    Authors: Haozheng Luo, Jiahao Yu, Wenxin Zhang, Jialong Li, Jerry Yao-Chieh Hu, Xinyu Xing, Han Liu

    Abstract: We introduce a low-resource safety enhancement method for aligning large language models (LLMs) without the need for supervised fine-tuning (SFT) or reinforcement learning from human feedback (RLHF). Our main idea is to exploit knowledge distillation to extract the alignment information from existing well-aligned LLMs and integrate it into unaligned LLMs in a plug-and-play fashion. Methodology, we… ▽ More

    Submitted 6 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  4. arXiv:2405.20653  [pdf, other

    cs.AI

    Enhancing Jailbreak Attack Against Large Language Models through Silent Tokens

    Authors: Jiahao Yu, Haozheng Luo, Jerry Yao-Chieh Hu, Wenbo Guo, Han Liu, Xinyu Xing

    Abstract: Along with the remarkable successes of Language language models, recent research also started to explore the security threats of LLMs, including jailbreaking attacks. Attackers carefully craft jailbreaking prompts such that a target LLM will respond to the harmful question. Existing jailbreaking attacks require either human experts or leveraging complicated algorithms to craft jailbreaking prompts… ▽ More

    Submitted 4 June, 2024; v1 submitted 31 May, 2024; originally announced May 2024.

  5. arXiv:2404.03900  [pdf, other

    stat.ML cs.AI cs.LG cs.NE

    Nonparametric Modern Hopfield Models

    Authors: Jerry Yao-Chieh Hu, Bo-Yu Chen, Dennis Wu, Feng Ruan, Han Liu

    Abstract: We present a nonparametric construction for deep learning compatible modern Hopfield models and utilize this framework to debut an efficient variant. Our key contribution stems from interpreting the memory storage and retrieval processes in modern Hopfield models as a nonparametric regression problem subject to a set of query-memory pairs. Crucially, our framework not only recovers the known resul… ▽ More

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: 59 pages; Code available at https://github.com/MAGICS-LAB/NonparametricHopfield

  6. arXiv:2404.03830  [pdf, other

    cs.LG cs.AI stat.ML

    BiSHop: Bi-Directional Cellular Learning for Tabular Data with Generalized Sparse Modern Hopfield Model

    Authors: Chenwei Xu, Yu-Chao Huang, Jerry Yao-Chieh Hu, Weijian Li, Ammar Gilani, Hsi-Sheng Goan, Han Liu

    Abstract: We introduce the \textbf{B}i-Directional \textbf{S}parse \textbf{Hop}field Network (\textbf{BiSHop}), a novel end-to-end framework for deep tabular learning. BiSHop handles the two major challenges of deep tabular learning: non-rotationally invariant data structure and feature sparsity in tabular data. Our key motivation comes from the recent established connection between associative memory and a… ▽ More

    Submitted 12 July, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

    Comments: 31 pages; Code available at https://github.com/MAGICS-LAB/BiSHop

  7. arXiv:2404.03828  [pdf, other

    cs.LG cs.AI stat.ML

    Outlier-Efficient Hopfield Layers for Large Transformer-Based Models

    Authors: Jerry Yao-Chieh Hu, Pei-Hsuan Chang, Robin Luo, Hong-Yu Chen, Weijian Li, Wei-Po Wang, Han Liu

    Abstract: We introduce an Outlier-Efficient Modern Hopfield Model (termed $\mathrm{OutEffHop}$) and use it to address the outlier inefficiency problem of {training} gigantic transformer-based models. Our main contribution is a novel associative memory model facilitating \textit{outlier-efficient} associative memory retrievals. Interestingly, this memory model manifests a model-based interpretation of an out… ▽ More

    Submitted 26 June, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

    Comments: Accepted at ICML 2024; v2 updated to camera-ready version; Code available at https://github.com/MAGICS-LAB/OutEffHop; Models are on Hugging Face: https://huggingface.co/collections/magicslabnu/outeffhop-6610fcede8d2cda23009a98f

  8. arXiv:2404.03827  [pdf, other

    cs.LG cs.AI stat.ML

    Uniform Memory Retrieval with Larger Capacity for Modern Hopfield Models

    Authors: Dennis Wu, Jerry Yao-Chieh Hu, Teng-Yun Hsiao, Han Liu

    Abstract: We propose a two-stage memory retrieval dynamics for modern Hopfield models, termed $\mathtt{U\text{-}Hop}$, with enhanced memory capacity. Our key contribution is a learnable feature map $Φ$ which transforms the Hopfield energy function into kernel space. This transformation ensures convergence between the local minima of energy and the fixed points of retrieval dynamics within the kernel space.… ▽ More

    Submitted 12 June, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

    Comments: Accepted at ICML 2024; v2 updated to camera-ready version; Code available at https://github.com/MAGICS-LAB/UHop

  9. arXiv:2402.04520  [pdf, ps, other

    cs.LG cs.AI stat.ML

    On Computational Limits of Modern Hopfield Models: A Fine-Grained Complexity Analysis

    Authors: Jerry Yao-Chieh Hu, Thomas Lin, Zhao Song, Han Liu

    Abstract: We investigate the computational limits of the memory retrieval dynamics of modern Hopfield models from the fine-grained complexity analysis. Our key contribution is the characterization of a phase transition behavior in the efficiency of all possible modern Hopfield models based on the norm of patterns. Specifically, we establish an upper bound criterion for the norm of input query patterns and m… ▽ More

    Submitted 31 May, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

    Comments: Accepted at ICML 2024; v2 corrected typos; v3 added clarifications and references; v4,5 updated to camera-ready version

  10. arXiv:2312.17372  [pdf, other

    cs.LG cs.AI physics.acc-ph

    Beyond PID Controllers: PPO with Neuralized PID Policy for Proton Beam Intensity Control in Mu2e

    Authors: Chenwei Xu, Jerry Yao-Chieh Hu, Aakaash Narayanan, Mattson Thieme, Vladimir Nagaslaev, Mark Austin, Jeremy Arnold, Jose Berlioz, Pierrick Hanlet, Aisha Ibrahim, Dennis Nicklaus, Jovan Mitrevski, Jason Michael St. John, Gauri Pradhan, Andrea Saewert, Kiyomi Seiya, Brian Schupbach, Randy Thurman-Keup, Nhan Tran, Rui Shi, Seda Ogrenci, Alexis Maya-Isabelle Shuping, Kyle Hazelwood, Han Liu

    Abstract: We introduce a novel Proximal Policy Optimization (PPO) algorithm aimed at addressing the challenge of maintaining a uniform proton beam intensity delivery in the Muon to Electron Conversion Experiment (Mu2e) at Fermi National Accelerator Laboratory (Fermilab). Our primary objective is to regulate the spill process to ensure a consistent intensity profile, with the ultimate goal of creating an aut… ▽ More

    Submitted 28 December, 2023; originally announced December 2023.

    Comments: 10 pages, accepted at NeurIPS 2023 ML4Phy Workshop

  11. arXiv:2312.17346  [pdf, other

    cs.LG cs.AI cs.CV cs.NE stat.ML

    STanHop: Sparse Tandem Hopfield Model for Memory-Enhanced Time Series Prediction

    Authors: Dennis Wu, Jerry Yao-Chieh Hu, Weijian Li, Bo-Yu Chen, Han Liu

    Abstract: We present STanHop-Net (Sparse Tandem Hopfield Network) for multivariate time series prediction with memory-enhanced capabilities. At the heart of our approach is STanHop, a novel Hopfield-based neural network block, which sparsely learns and stores both temporal and cross-series representations in a data-dependent fashion. In essence, STanHop sequentially learn temporal representation and cross-s… ▽ More

    Submitted 28 December, 2023; originally announced December 2023.

  12. arXiv:2309.12673  [pdf, other

    cs.LG cs.AI cs.CV stat.ML

    On Sparse Modern Hopfield Model

    Authors: Jerry Yao-Chieh Hu, Donglin Yang, Dennis Wu, Chenwei Xu, Bo-Yu Chen, Han Liu

    Abstract: We introduce the sparse modern Hopfield model as a sparse extension of the modern Hopfield model. Like its dense counterpart, the sparse modern Hopfield model equips a memory-retrieval dynamics whose one-step approximation corresponds to the sparse attention mechanism. Theoretically, our key contribution is a principled derivation of a closed-form sparse Hopfield energy using the convex conjugate… ▽ More

    Submitted 29 November, 2023; v1 submitted 22 September, 2023; originally announced September 2023.

    Comments: 37 pages, accepted at NeurIPS 2023. [v2] updated to match with camera-ready version. Code is available at https://github.com/MAGICS-LAB/SparseModernHopfield

  13. arXiv:2306.06252  [pdf, other

    cs.LG stat.ML

    Feature Programming for Multivariate Time Series Prediction

    Authors: Alex Reneau, Jerry Yao-Chieh Hu, Chenwei Xu, Weijian Li, Ammar Gilani, Han Liu

    Abstract: We introduce the concept of programmable feature engineering for time series modeling and propose a feature programming framework. This framework generates large amounts of predictive features for noisy multivariate time series while allowing users to incorporate their inductive bias with minimal effort. The key motivation of our framework is to view any multivariate time series as a cumulative su… ▽ More

    Submitted 9 June, 2023; originally announced June 2023.

    Comments: 21 pages, accepted to ICML2023. Code is available at https://github.com/SirAlex900/FeatureProgramming