Showing 1–2 of 2 results for author: Abdelhalim, I

Search v0.5.6 released 2020-02-24

arXiv:2406.04493 [pdf, other]

cs.CV cs.CL

CORU: Comprehensive Post-OCR Parsing and Receipt Understanding Dataset

Authors: Abdelrahman Abdallah, Mahmoud Abdalla, Mahmoud SalahEldin Kasem, Mohamed Mahmoud, Ibrahim Abdelhalim, Mohamed Elkasaby, Yasser ElBendary, Adam Jatowt

Abstract: In the fields of Optical Character Recognition (OCR) and Natural Language Processing (NLP), integrating multilingual capabilities remains a critical challenge, especially when considering languages with complex scripts such as Arabic. This paper introduces the Comprehensive Post-OCR Parsing and Receipt Understanding Dataset (CORU), a novel dataset specifically designed to enhance OCR and informati… ▽ More In the fields of Optical Character Recognition (OCR) and Natural Language Processing (NLP), integrating multilingual capabilities remains a critical challenge, especially when considering languages with complex scripts such as Arabic. This paper introduces the Comprehensive Post-OCR Parsing and Receipt Understanding Dataset (CORU), a novel dataset specifically designed to enhance OCR and information extraction from receipts in multilingual contexts involving Arabic and English. CORU consists of over 20,000 annotated receipts from diverse retail settings, including supermarkets and clothing stores, alongside 30,000 annotated images for OCR that were utilized to recognize each detected line, and 10,000 items annotated for detailed information extraction. These annotations capture essential details such as merchant names, item descriptions, total prices, receipt numbers, and dates. They are structured to support three primary computational tasks: object detection, OCR, and information extraction. We establish the baseline performance for a range of models on CORU to evaluate the effectiveness of traditional methods, like Tesseract OCR, and more advanced neural network-based approaches. These baselines are crucial for processing the complex and noisy document layouts typical of real-world receipts and for advancing the state of automated multilingual document processing. Our datasets are publicly accessible (https://github.com/Update-For-Integrated-Business-AI/CORU). △ Less

Submitted 6 June, 2024; originally announced June 2024.
arXiv:1910.11960 [pdf, other]

eess.IV cs.CV

Data Augmentation for Skin Lesion using Self-Attention based Progressive Generative Adversarial Network

Authors: Ibrahim Saad Ali, Mamdouh Farouk Mohamed, Yousef Bassyouni Mahdy

Abstract: Deep Neural Networks (DNNs) show a significant impact on medical imaging. One significant problem with adopting DNNs for skin cancer classification is that the class frequencies in the existing datasets are imbalanced. This problem hinders the training of robust and well-generalizing models. Data Augmentation addresses this by using existing data more effectively. However, standard data augmentati… ▽ More Deep Neural Networks (DNNs) show a significant impact on medical imaging. One significant problem with adopting DNNs for skin cancer classification is that the class frequencies in the existing datasets are imbalanced. This problem hinders the training of robust and well-generalizing models. Data Augmentation addresses this by using existing data more effectively. However, standard data augmentation implementations are manually designed and produce only limited reasonably alternative data. Instead, Generative Adversarial Networks (GANs) is utilized to generate a much broader set of augmentations. This paper proposes a novel enhancement for the progressive generative adversarial networks (PGAN) using self-attention mechanism. Self-attention mechanism is used to directly model the long-range dependencies in the feature maps. Accordingly, self-attention complements PGAN to generate fine-grained samples that comprise clinically-meaningful information. Moreover, the stabilization technique was applied to the enhanced generative model. To train the generative models, ISIC 2018 skin lesion challenge dataset was used to synthesize highly realistic skin lesion samples for boosting further the classification result. We achieve an accuracy of 70.1% which is 2.8% better than the non-augmented one of 67.3%. △ Less

Submitted 25 October, 2019; originally announced October 2019.

Search v0.5.6 released 2020-02-24