
Showing 1–12 of 12 results for author: Masry, A

  1. arXiv:2407.04172  [pdf, other]

    cs.AI cs.CL cs.CV

    ChartGemma: Visual Instruction-tuning for Chart Reasoning in the Wild

    Authors: Ahmed Masry, Megh Thakkar, Aayush Bajaj, Aaryaman Kartha, Enamul Hoque, Shafiq Joty

    Abstract: Given the ubiquity of charts as a data analysis, visualization, and decision-making tool across industries and sciences, there has been a growing interest in developing pre-trained foundation models as well as general purpose instruction-tuned models for chart understanding and reasoning. However, existing methods suffer crucial drawbacks across two critical axes affecting the performance of chart…

    Submitted 4 July, 2024; originally announced July 2024.

  2. arXiv:2406.00257  [pdf, other]

    cs.CL

    Are Large Vision Language Models up to the Challenge of Chart Comprehension and Reasoning? An Extensive Investigation into the Capabilities and Limitations of LVLMs

    Authors: Mohammed Saidul Islam, Raian Rahman, Ahmed Masry, Md Tahmid Rahman Laskar, Mir Tafseer Nayeem, Enamul Hoque

    Abstract: Natural language is a powerful complementary modality of communication for data visualizations, such as bar and line charts. To facilitate chart-based reasoning using natural language, various downstream tasks have been introduced recently such as chart question answering, chart summarization, and fact-checking with charts. These tasks pose a unique challenge, demanding both vision-language reason…

    Submitted 31 May, 2024; originally announced June 2024.

  3. arXiv:2403.09028  [pdf, other]

    cs.CL

    ChartInstruct: Instruction Tuning for Chart Comprehension and Reasoning

    Authors: Ahmed Masry, Mehrad Shahmohammadi, Md Rizwan Parvez, Enamul Hoque, Shafiq Joty

    Abstract: Charts provide visual representations of data and are widely used for analyzing information, addressing queries, and conveying insights to others. Various chart-related downstream tasks have emerged recently, such as question-answering and summarization. A common strategy to solve these tasks is to fine-tune various models originally trained on vision tasks language. However, such task-specific mo…

    Submitted 13 March, 2024; originally announced March 2024.

  4. arXiv:2401.15050  [pdf, other]

    cs.CL

    LongFin: A Multimodal Document Understanding Model for Long Financial Domain Documents

    Authors: Ahmed Masry, Amir Hajian

    Abstract: Document AI is a growing research field that focuses on the comprehension and extraction of information from scanned and digital documents to make everyday business operations more efficient. Numerous downstream tasks and datasets have been introduced to facilitate the training of AI models capable of parsing and extracting information from various document types such as receipts and scanned forms…

    Submitted 26 January, 2024; originally announced January 2024.

    Comments: Accepted at AAAI 2024 Workshop on AI in Finance for Social Impact

  5. arXiv:2312.10610  [pdf, other]

    cs.CL

    Do LLMs Work on Charts? Designing Few-Shot Prompts for Chart Question Answering and Summarization

    Authors: Xuan Long Do, Mohammad Hassanpour, Ahmed Masry, Parsa Kavehzadeh, Enamul Hoque, Shafiq Joty

    Abstract: A number of tasks have been proposed recently to facilitate easy access to charts such as chart QA and summarization. The dominant paradigm to solve these tasks has been to fine-tune a pretrained model on the task data. However, this approach is not only expensive but also not generalizable to unseen tasks. On the other hand, large language models (LLMs) have shown impressive generalization capabi…

    Submitted 17 December, 2023; originally announced December 2023.

    Comments: 23 pages

  6. arXiv:2305.14761  [pdf, other]

    cs.CL

    UniChart: A Universal Vision-language Pretrained Model for Chart Comprehension and Reasoning

    Authors: Ahmed Masry, Parsa Kavehzadeh, Xuan Long Do, Enamul Hoque, Shafiq Joty

    Abstract: Charts are very popular for analyzing data, visualizing key insights and answering complex reasoning questions about data. To facilitate chart-based data analysis using natural language, several downstream tasks have been introduced recently such as chart question answering and chart summarization. However, most of the methods that solve these tasks use pretraining on language or vision-language t…

    Submitted 10 October, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

  7. arXiv:2205.03966  [pdf, other]

    cs.CL

    Chart Question Answering: State of the Art and Future Directions

    Authors: Enamul Hoque, Parsa Kavehzadeh, Ahmed Masry

    Abstract: Information visualizations such as bar charts and line charts are very common for analyzing data and discovering critical insights. Often people analyze charts to answer questions that they have in mind. Answering such questions can be challenging as they often require a significant amount of perceptual and cognitive effort. Chart Question Answering (CQA) systems typically take a chart and a natur…

    Submitted 21 May, 2022; v1 submitted 8 May, 2022; originally announced May 2022.

  8. arXiv:2203.10244  [pdf, other]

    cs.CL

    ChartQA: A Benchmark for Question Answering about Charts with Visual and Logical Reasoning

    Authors: Ahmed Masry, Do Xuan Long, Jia Qing Tan, Shafiq Joty, Enamul Hoque

    Abstract: Charts are very popular for analyzing data. When exploring charts, people often ask a variety of complex reasoning questions that involve several logical and arithmetic operations. They also commonly refer to visual features of a chart in their questions. However, most existing datasets do not focus on such complex reasoning questions as their questions are template-based and answers come from a f…

    Submitted 19 March, 2022; originally announced March 2022.

    Comments: Accepted by ACL 2022 Findings

  9. arXiv:2203.07452  [pdf, other]

    eess.IV cs.CV

    A deep learning pipeline for breast cancer ki-67 proliferation index scoring

    Authors: Khaled Benaggoune, Zeina Al Masry, Jian Ma, Christine Devalland, L. H Mouss, Noureddine Zerhouni

    Abstract: The Ki-67 proliferation index is an essential biomarker that helps pathologists to diagnose and select appropriate treatments. However, automatic evaluation of Ki-67 is difficult due to nuclei overlapping and complex variations in their properties. This paper proposes an integrated pipeline for accurate automatic counting of Ki-67, where the impact of nuclei separation techniques is highlighted. F…

    Submitted 14 March, 2022; originally announced March 2022.

  10. arXiv:2203.06486  [pdf, other]

    cs.CL

    Chart-to-Text: A Large-Scale Benchmark for Chart Summarization

    Authors: Shankar Kantharaj, Rixie Tiffany Ko Leong, Xiang Lin, Ahmed Masry, Megh Thakkar, Enamul Hoque, Shafiq Joty

    Abstract: Charts are commonly used for exploring data and communicating insights. Generating natural language summaries from charts can be very helpful for people in inferring key insights that would otherwise require a lot of cognitive and perceptual efforts. We present Chart-to-text, a large-scale benchmark with two datasets and a total of 44,096 charts covering a wide range of topics and chart types. We…

    Submitted 14 April, 2022; v1 submitted 12 March, 2022; originally announced March 2022.

    Comments: Accepted by ACL 2022 Main Conference

  11. A Survey of Breast Cancer Screening Techniques: Thermography and Electrical Impedance Tomography

    Authors: Juan Zuluaga-Gomez, N. Zerhouni, Z. Al Masry, C. Devalland, C. Varnier

    Abstract: Breast cancer is a disease that threatens many women's life, thus, early and accurate detection plays a key role in reducing the mortality rate. Mammography stands as the reference technique for breast cancer screening; nevertheless, many countries still lack access to mammograms due to economic, social, and cultural issues. Last advances in computational tools, infrared cameras, and devices for b…

    Submitted 8 February, 2022; originally announced February 2022.

    Comments: Article published at: Journal of Medical Engineering & Technology (Volume 43, 2019 - Issue 5)

  12. arXiv:1910.13757  [pdf, other]

    cs.CV eess.IV

    A CNN-based methodology for breast cancer diagnosis using thermal images

    Authors: Juan Zuluaga-Gomez, Zeina Al Masry, Khaled Benaggoune, Safa Meraghni, Noureddine Zerhouni

    Abstract: Micro Abstract: A recent study from GLOBOCAN disclosed that during 2018 two million women worldwide had been diagnosed from breast cancer. This study presents a computer-aided diagnosis system based on convolutional neural networks as an alternative diagnosis methodology for breast cancer diagnosis with thermal images. Experimental results showed that lower false-positives and false-negatives clas…

    Submitted 30 October, 2019; originally announced October 2019.

    Comments: 19 pages, 7 figures, 5 tables. Clinical Breast Cancer