Showing 1–5 of 5 results for author: Somayajula, S A

  1. arXiv:2403.12918

    cs.CL cs.AI cs.LG

    Generalizable and Stable Finetuning of Pretrained Language Models on Low-Resource Texts

    Authors: Sai Ashish Somayajula, Youwei Liang, Abhishek Singh, Li Zhang, Pengtao Xie

    Abstract: Pretrained Language Models (PLMs) have advanced Natural Language Processing (NLP) tasks significantly, but finetuning PLMs on low-resource datasets poses significant challenges such as instability and overfitting. Previous methods tackle these issues by finetuning a strategically chosen subnetwork on a downstream task, while keeping the remaining weights fixed to the pretrained weights. However, t…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: Accepted as a long paper to NAACL 2024 Main Conference; 18 pages, 11 tables, 3 figures

  2. arXiv:2403.09113

    cs.CL cs.AI cs.LG

    AutoLoRA: Automatically Tuning Matrix Ranks in Low-Rank Adaptation Based on Meta Learning

    Authors: Ruiyi Zhang, Rushi Qiang, Sai Ashish Somayajula, Pengtao Xie

    Abstract: Large-scale pretraining followed by task-specific finetuning has achieved great success in various NLP tasks. Since finetuning all parameters of large pretrained models poses substantial computational and memory challenges, several efficient finetuning methods have been developed. Among them, low-rank adaptation (LoRA), which finetunes low-rank incremental update matrices on top of frozen pretrain…

    Submitted 17 March, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

  3. arXiv:2402.18128

    cs.CV cs.LG

    Downstream Task Guided Masking Learning in Masked Autoencoders Using Multi-Level Optimization

    Authors: Han Guo, Ramtin Hosseini, Ruiyi Zhang, Sai Ashish Somayajula, Ranak Roy Chowdhury, Rajesh K. Gupta, Pengtao Xie

    Abstract: Masked Autoencoder (MAE) is a notable method for self-supervised pretraining in visual representation learning. It operates by randomly masking image patches and reconstructing these masked patches using the unmasked ones. A key limitation of MAE lies in its disregard for the varying informativeness of different patches, as it uniformly selects patches to mask. To overcome this, some approaches pr…

    Submitted 28 February, 2024; originally announced February 2024.

  4. arXiv:2402.18059

    cs.LG cs.CL cs.CR

    Token-Specific Watermarking with Enhanced Detectability and Semantic Coherence for Large Language Models

    Authors: Mingjia Huo, Sai Ashish Somayajula, Youwei Liang, Ruisi Zhang, Farinaz Koushanfar, Pengtao Xie

    Abstract: Large language models generate high-quality responses with potential misinformation, underscoring the need for regulation by distinguishing AI-generated and human-written texts. Watermarking is pivotal in this context, which involves embedding hidden markers in texts during the LLM inference phase, which is imperceptible to humans. Achieving both the detectability of inserted watermarks and the se…

    Submitted 6 June, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: 22 pages, 13 figures, 5 tables

  5. arXiv:2112.00171

    cs.LG cs.CV

    Improving Differentiable Architecture Search with a Generative Model

    Authors: Ruisi Zhang, Youwei Liang, Sai Ashish Somayajula, Pengtao Xie

    Abstract: In differentiable neural architecture search (NAS) algorithms like DARTS, the training set used to update model weight and the validation set used to update model architectures are sampled from the same data distribution. Thus, the uncommon features in the dataset fail to receive enough attention during training. In this paper, instead of introducing more complex NAS algorithms, we explore the ide…

    Submitted 30 November, 2021; originally announced December 2021.