Showing 1–5 of 5 results for author: Azeemi, A H

Searching in archive cs.
  1. arXiv:2407.04459  [pdf, other]

    cs.CL

    Generalists vs. Specialists: Evaluating Large Language Models for Urdu

    Authors: Samee Arif, Abdul Hameed Azeemi, Agha Ali Raza, Awais Athar

    Abstract: In this paper, we compare general-purpose pretrained models, GPT-4-Turbo and Llama-3-8b-Instruct, with special-purpose models fine-tuned on specific tasks, XLM-Roberta-large, mT5-large, and Llama-3-8b-Instruct. We focus on seven classification and six generation tasks to evaluate the performance of these models on the Urdu language. Urdu has 70 million native speakers, yet it remains underrepresented i…

    Submitted 5 July, 2024; originally announced July 2024.

  2. arXiv:2403.09259  [pdf, other]

    cs.CL cs.LG

    To Label or Not to Label: Hybrid Active Learning for Neural Machine Translation

    Authors: Abdul Hameed Azeemi, Ihsan Ayyub Qazi, Agha Ali Raza

    Abstract: Active learning (AL) techniques reduce labeling costs for training neural machine translation (NMT) models by selecting smaller representative subsets from unlabeled data for annotation. Diversity sampling techniques select heterogeneous instances, while uncertainty sampling methods select instances with the highest model uncertainty. Both approaches have limitations: diversity methods may extrac…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: 11 pages, 3 figures

  3. arXiv:2203.09829  [pdf, other]

    cs.LG cs.SD eess.AS

    Representative Subset Selection for Efficient Fine-Tuning in Self-Supervised Speech Recognition

    Authors: Abdul Hameed Azeemi, Ihsan Ayyub Qazi, Agha Ali Raza

    Abstract: Self-supervised speech recognition models require considerable labeled training data for learning high-fidelity representations for Automatic Speech Recognition (ASR), which is computationally demanding and time-consuming. We consider the task of identifying an optimal subset of data for efficient fine-tuning in self-supervised speech models for ASR. We discover that the dataset pruning strategies…

    Submitted 11 April, 2023; v1 submitted 18 March, 2022; originally announced March 2022.

    Comments: 16 pages, 8 figures

  4. arXiv:2103.04390  [pdf, other]

    cs.IR

    RevDet: Robust and Memory Efficient Event Detection and Tracking in Large News Feeds

    Authors: Abdul Hameed Azeemi, Muhammad Hamza Sohail, Talha Zubair, Muaz Maqbool, Irfan Younas, Omair Shafiq

    Abstract: With the ever-growing volume of online news feeds, event-based organization of news articles has many practical applications, including better information navigation and the ability to view and analyze events as they develop. Automatically tracking the evolution of events in large news corpora remains a challenging task, and the existing techniques for Event Detection and Tracking do not plac…

    Submitted 7 March, 2021; originally announced March 2021.

    Comments: 9 pages, 9 figures

  5. arXiv:2103.00199  [pdf, other]

    cs.CL cs.IR cs.LG

    COVID-19 Tweets Analysis through Transformer Language Models

    Authors: Abdul Hameed Azeemi, Adeel Waheed

    Abstract: Understanding the public sentiment and perception in a healthcare crisis is essential for developing appropriate crisis management techniques. While some studies have used Twitter data for predictive modelling during COVID-19, fine-grained sentiment analysis of the opinion of people on social media during this pandemic has not yet been done. In this study, we perform an in-depth, fine-grained sent…

    Submitted 27 February, 2021; originally announced March 2021.

    Comments: 5 pages, 5 figures