-
Code Vulnerability Detection: A Comparative Analysis of Emerging Large Language Models
Authors:
Shaznin Sultana,
Sadia Afreen,
Nasir U. Eisty
Abstract:
The growing trend of vulnerability issues in software development as a result of a large dependence on open-source projects has received considerable attention recently. This paper investigates the effectiveness of Large Language Models (LLMs) in identifying vulnerabilities within codebases, with a focus on the latest advancements in LLM technology. Through a comparative analysis, we assess the performance of emerging LLMs, specifically Llama, CodeLlama, Gemma, and CodeGemma, alongside established state-of-the-art models such as BERT, RoBERTa, and GPT-3. Our study aims to shed light on the capabilities of LLMs in vulnerability detection, contributing to the enhancement of software security practices across diverse open-source repositories. We observe that CodeGemma achieves the highest F1-score of 58% and a Recall of 87%, amongst the recent additions of large language models to detect software security vulnerabilities.
Submitted 16 September, 2024;
originally announced September 2024.
-
Survey and Analysis of IoT Operating Systems: A Comparative Study on the Effectiveness and Acquisition Time of Open Source Digital Forensics Tools
Authors:
Jeffrey Fairbanks,
Md Mashrur Arifin,
Sadia Afreen,
Alex Curtis
Abstract:
The main goal of this research project is to evaluate the effectiveness and speed of open-source forensic tools for collecting digital evidence from various Internet-of-Things (IoT) devices. To accomplish this goal, the project will create and configure many IoT environments across popular IoT operating systems and run common forensics tasks. To validate these forensic analysis operations, a variety of open-source forensic tools covering four standard digital forensics tasks will be used. These tasks will be run on each sample IoT operating system, and the time spent on each will be carefully recorded and examined, allowing for a thorough evaluation of the effectiveness and speed of performing forensics on each type of IoT device. The research also aims to offer recommendations to IoT security experts and digital forensic practitioners about the most efficient open-source tools for forensic investigations of IoT devices while maintaining the integrity of gathered evidence and identifying challenges that exist with these new device types. The results will be shared widely and well documented in order to provide significant contributions to IoT device makers and the field of digital forensics.
Submitted 1 July, 2024;
originally announced July 2024.
-
Alzheimer's Magnetic Resonance Imaging Classification Using Deep and Meta-Learning Models
Authors:
Nida Nasir,
Muneeb Ahmed,
Neda Afreen,
Mustafa Sameer
Abstract:
Deep learning, a cutting-edge machine learning approach, outperforms traditional machine learning in identifying intricate structures in complex high-dimensional data, particularly in the domain of healthcare. This study focuses on classifying Magnetic Resonance Imaging (MRI) data for Alzheimer's disease (AD) by leveraging deep learning techniques characterized by state-of-the-art CNNs. Brain imaging techniques such as MRI have enabled the measurement of pathophysiological brain changes related to Alzheimer's disease. Alzheimer's disease is the leading cause of dementia in the elderly, and it is an irreversible brain illness that causes gradual cognitive function disorder. In this paper, we train several benchmark deep models individually and then use an ensembling approach to combine multiple CNNs in pursuit of higher recall and accuracy. The model's effectiveness is evaluated using various methods, including stacking, majority voting, and the combination of models with high recall values. Majority voting performs better than the alternative modelling approaches, as it typically reduces the variance in the predictions. We report a test accuracy of 90% with a precision score of 0.90 and a recall score of 0.89 for our proposed approach. In future, this study can be extended to incorporate other types of medical data, including signals and images. The same or alternative datasets can be used with additional classifiers, neural networks, and AI techniques to enhance Alzheimer's detection.
Submitted 20 May, 2024;
originally announced May 2024.
-
A Genetic Algorithm-Based Support Vector Machine Approach for Intelligent Usability Assessment of m-Learning Applications
Authors:
Muhammad Asghar,
Imran Sarwar Bajwa,
Shabana Ramzan,
Hina Afreen,
Saima Abdullah
Abstract:
In the field of human-computer interaction (HCI), the usability assessment of m-learning (mobile-learning) applications is a real challenge. Such assessment typically involves extraction of the best features of an application, such as efficiency, effectiveness, learnability, cognition, and memorability, and further ranking of those features for an overall assessment of the quality of the mobile application. The previous literature offers neither a theory nor a tool to measure or assess a user's perception and assessment of the usability features of an m-learning application for the sake of ranking the graphical user interface of a mobile application in terms of user acceptance and satisfaction. In this paper, a novel approach is presented that performs a quantitative and qualitative analysis of mobile applications. Based on user requirements and perception, a criterion is defined from a set of important features. Afterward, for the qualitative analysis, a genetic algorithm (GA) is used to score prescribed features for the usability assessment of a mobile application. The approach assigns a score to each usability feature according to the user requirement and the weight of each feature. GA performs the rank assessment process initially by performing feature selection and scoring the best features of the application. The assessment analysis of GA is compared with that of various machine learning models: K-nearest neighbours, Naive Bayes, and Random Forests. It was found that a GA-based support vector machine (SVM) provides more accuracy in the extraction of the best features of a mobile application and further ranking of those features.
Submitted 4 April, 2024;
originally announced April 2024.
-
Solvability of the Inverse Optimal Control problem based on the minimum principle
Authors:
Afreen Islam,
Guido Herrmann,
Joaquin Carrasco
Abstract:
In this paper, the solvability of the Inverse Optimal Control (IOC) problem based on two existing minimum principle methods is analysed. The aim of this work is to answer the question of what kinds of trajectories of the original optimal control problem, depending on the initial conditions of the closed-loop system and the system dynamics, will result in the recovery of the true weights of the reward function for both the soft and the hard-constrained methods [1], [2]. Analytical conditions are provided that make it possible to verify whether a trajectory is sufficiently conditioned, that is, holds sufficient information to recover the true weights of an optimal control problem. It was found that the open-loop system of the original optimal problem has a stronger influence on the solvability of the Inverse Optimal Control problem for the hard-constrained method as compared to the soft-constrained method. These analytical results were validated via simulation.
Submitted 14 March, 2024;
originally announced March 2024.
-
RAGGED: Towards Informed Design of Retrieval Augmented Generation Systems
Authors:
Jennifer Hsia,
Afreen Shaikh,
Zhiruo Wang,
Graham Neubig
Abstract:
Retrieval-augmented generation (RAG) can significantly improve the performance of language models (LMs) by providing additional context for tasks such as document-based question answering (DBQA). However, the effectiveness of RAG is highly dependent on its configuration. To systematically find the optimal configuration, we introduce RAGGED, a framework for analyzing RAG configurations across various DBQA tasks. Using the framework, we discover distinct LM behaviors in response to varying context quantities, context qualities, and retrievers. For instance, while some models are robust to noisy contexts, monotonically performing better with more contexts, others are more noise-sensitive and can effectively use only a few contexts before declining in performance. This framework also provides a deeper analysis of these differences by evaluating the LMs' sensitivity to signal and noise under specific context quality conditions. Using RAGGED, researchers and practitioners can derive actionable insights about how to optimally configure their RAG systems for their specific question-answering tasks.
Submitted 12 August, 2024; v1 submitted 13 March, 2024;
originally announced March 2024.
-
Stochastic controllability for a non-autonomous fractional neutral differential equation with infinite delay in abstract space
Authors:
Areefa Khatoon,
Abdur Raheem,
Asma Afreen
Abstract:
This paper deals with the controllability of a class of non-autonomous neutral differential equations of fractional order with infinite delay in an abstract space. The semi-group theory of bounded linear operators, fractional calculus, and stochastic analysis techniques have been implemented to achieve the main result. We prove the existence of a mild solution and the controllability of the system by using the theory of measure of non-compactness, fixed point theorems, and $k$-set contractive mapping. An example is given to demonstrate the effectiveness of the abstract result.
Submitted 10 March, 2024;
originally announced March 2024.
-
Time-discretization method for a multi-term time fractional differential equation with delay
Authors:
Areefa Khatoon,
Abdur Raheem,
Asma Afreen
Abstract:
This paper discusses a multi-term time-fractional delay differential equation in a real Hilbert space. An iterative scheme for a multi-term time-fractional differential equation is established using Rothe's method. The method of semi-discretization is extended to this kind of time fractional problem with delay in the case that the time delay parameter $ν>0$ satisfies $ν\leq T$, where $T$ denotes the final time. We apply the accretivity of the operator $A$ in an iterative scheme to establish the existence and regularity of strong solutions to the considered problem. Finally, an example is provided to demonstrate the abstract result.
Submitted 12 March, 2024;
originally announced March 2024.
-
Controllability of a second-order impulsive neutral differential equation via resolvent operator technique
Authors:
Asma Afreen,
Abdur Raheem,
Areefa Khatoon
Abstract:
This paper uses the resolvent operator technique to investigate second-order non-autonomous neutral integrodifferential equations with impulsive conditions in a Banach space. We study the existence of a mild solution and the system's approximate controllability. The semigroup and resolvent operator theory, graph norm, and Krasnoselskii's fixed point theorem are used to demonstrate the results. Finally, we present our findings with an example.
Submitted 4 December, 2023;
originally announced December 2023.
-
Optimal controllability for multi-term time-fractional stochastic systems with non-instantaneous impulses
Authors:
Asma Afreen,
Abdur Raheem,
Areefa Khatoon
Abstract:
In the present paper, we study the existence and optimal controllability of a multi-term time-fractional stochastic system with non-instantaneous impulses. Using semigroup theory, stochastic analysis theory, and Krasnoselskii's fixed point theorem, we first establish the existence of a mild solution. Further, we obtain that there exists an optimal state-control pair of the system. Some examples are given to illustrate the abstract results.
Submitted 29 November, 2023;
originally announced November 2023.
-
A study of nonlocal fractional neutral stochastic integrodifferential inclusions of order $1<α<2$ with impulses
Authors:
Asma Afreen,
Abdur Raheem,
Areefa Khatoon
Abstract:
This paper considers a class of nonlocal fractional neutral stochastic integrodifferential inclusions of order $1<α<2$ with impulses in a Hilbert space. We study the existence of the mild solution for the cases when the multi-valued map has convex and non-convex values. The results are obtained by combining fixed-point theorems with the fractional order cosine family, semigroup theory, and stochastic techniques. A new set of sufficient conditions is developed to demonstrate the approximate controllability of the system. Finally, an example is given to illustrate the obtained results.
Submitted 29 November, 2023;
originally announced November 2023.
-
Healthcare Security Breaches in the United States: Insights and their Socio-Technical Implications
Authors:
Megha M. Moncy,
Sadia Afreen,
Saptarshi Purkayastha
Abstract:
This research examines the pivotal role of human behavior in the realm of healthcare data management, situated at the confluence of technological advancements and human conduct. An in-depth analysis of security breaches in the United States from 2009 to the present elucidates the dominance of human-induced security breaches. While technological weak points are certainly a concern, our study highlights that a significant proportion of breaches are precipitated by human errors and practices, thus pinpointing a conspicuous deficiency in training, awareness, and organizational architecture. In spite of stringent federal mandates, such as the Health Insurance Portability and Accountability Act (HIPAA) and the Health Information Technology for Economic and Clinical Health (HITECH) Act, breaches persist, emphasizing the indispensable role of human factors within this domain. Such oversights not only jeopardize patient data confidentiality but also undermine the foundational trust inherent in the healthcare infrastructure. By probing the socio-technical facets of healthcare security infringements, this article advocates for an integrated, dynamic, and holistic approach to healthcare data security. The findings underscore the imperative of augmenting technological defenses while concurrently elevating human conduct and institutional ethos, thereby cultivating a robust and impervious healthcare data management environment.
Submitted 6 November, 2023;
originally announced November 2023.
-
Enhancing the ultrafast third order nonlinear optical response by charge transfer in VSe2-reduced graphene oxide hybrid
Authors:
Vinod Kumar,
Afreen,
K. A. Sree Raj,
Pratap Mane,
Brahmananda Chakraborty,
Chandra S. Rout,
K. V. Adarsh
Abstract:
Nonlinear optical phenomena play a critical role in understanding microscopic light-matter interactions and have far-reaching applications across various fields, such as biosensing, quantum information, optical switching, and all-optical data processing. Most of these applications require materials with high third-order absorptive and refractive optical nonlinearities. However, most materials show weak nonlinear optical responses due to their perturbative nature and often need to be improved for practical applications. Here, we demonstrate that the VSe2-reduced graphene oxide (rGO) charge donor-acceptor hybrid exhibits enhanced ultrafast third-order absorptive and refractive nonlinearities, by at least one order of magnitude compared to the pristine systems. Through density functional theory and Bader charge analysis, we elucidate the strong electronic coupling in the VSe2-rGO hybrid, involving the transfer of electrons from VSe2 to rGO. Steady-state and time-resolved photoluminescence (PL) measurements confirm the electronic coupling and charge transfer. Furthermore, we fabricate an ultrafast optical limiter device with better performance parameters, such as an onset threshold of 2.5 mJ cm$^{-2}$ and differential transmittance of 0.42.
Submitted 7 July, 2023;
originally announced July 2023.
-
Financial Numeric Extreme Labelling: A Dataset and Benchmarking for XBRL Tagging
Authors:
Soumya Sharma,
Subhendu Khatuya,
Manjunath Hegde,
Afreen Shaikh,
Koustuv Dasgupta,
Pawan Goyal,
Niloy Ganguly
Abstract:
The U.S. Securities and Exchange Commission (SEC) mandates all public companies to file periodic financial statements that should contain numerals annotated with a particular label from a taxonomy. In this paper, we formulate the task of automating the assignment of a label to a particular numeral span in a sentence from an extremely large label set. Towards this task, we release a dataset, Financial Numeric Extreme Labelling (FNXL), annotated with 2,794 labels. We benchmark the performance of the FNXL dataset by formulating the task as (a) a sequence labelling problem and (b) a pipeline with span extraction followed by Extreme Classification. Although the two approaches perform comparably, the pipeline solution provides a slight edge for the least frequent labels.
Submitted 6 June, 2023;
originally announced June 2023.
-
ECTSum: A New Benchmark Dataset For Bullet Point Summarization of Long Earnings Call Transcripts
Authors:
Rajdeep Mukherjee,
Abhinav Bohra,
Akash Banerjee,
Soumya Sharma,
Manjunath Hegde,
Afreen Shaikh,
Shivani Shrivastava,
Koustuv Dasgupta,
Niloy Ganguly,
Saptarshi Ghosh,
Pawan Goyal
Abstract:
Despite tremendous progress in automatic summarization, state-of-the-art methods are predominantly trained to excel in summarizing short newswire articles, or documents with strong layout biases such as scientific articles or government reports. Efficient techniques to summarize financial documents, including facts and figures, have largely been unexplored, mainly due to the unavailability of suitable datasets. In this work, we present ECTSum, a new dataset with transcripts of earnings calls (ECTs), hosted by publicly traded companies, as documents, and short expert-written telegram-style bullet point summaries derived from corresponding Reuters articles. ECTs are long unstructured documents without any prescribed length limit or format. We benchmark our dataset with state-of-the-art summarizers across various metrics evaluating the content quality and factual consistency of the generated summaries. Finally, we present a simple-yet-effective approach, ECT-BPS, to generate a set of bullet points that precisely capture the important facts discussed in the calls.
Submitted 26 October, 2022; v1 submitted 22 October, 2022;
originally announced October 2022.
-
Otsu based Differential Evolution Method for Image Segmentation
Authors:
Afreen Shaikh,
Sharmila Botcha,
Murali Krishna
Abstract:
This paper proposes an Otsu-based differential evolution (DE) method for satellite image segmentation and compares it with four other methods, namely Modified Artificial Bee Colony Optimizer (MABC), Artificial Bee Colony (ABC), Genetic Algorithm (GA), and Particle Swarm Optimization (PSO), using the objective function proposed by Otsu for optimal multilevel thresholding. The experiments conducted and their results illustrate that our proposed DE and Otsu segmentation algorithm can effectively and precisely segment the input image, with results close to those obtained by the other methods. In the proposed DE and Otsu algorithm, instead of passing the fitness function variables, the entire image is passed as an input to the DE algorithm after obtaining the threshold values for the input number of levels in the Otsu algorithm. The image segmentation results are obtained after learning about the image instead of learning about the fitness variables. In comparison to the other segmentation methods examined, the proposed DE and Otsu algorithm yields promising results with reduced computational time compared to some algorithms.
Submitted 18 October, 2022;
originally announced October 2022.
-
Development and Validation of ML-DQA -- a Machine Learning Data Quality Assurance Framework for Healthcare
Authors:
Mark Sendak,
Gaurav Sirdeshmukh,
Timothy Ochoa,
Hayley Premo,
Linda Tang,
Kira Niederhoffer,
Sarah Reed,
Kaivalya Deshpande,
Emily Sterrett,
Melissa Bauer,
Laurie Snyder,
Afreen Shariff,
David Whellan,
Jeffrey Riggio,
David Gaieski,
Kristin Corey,
Megan Richards,
Michael Gao,
Marshall Nichols,
Bradley Heintze,
William Knechtle,
William Ratliff,
Suresh Balu
Abstract:
The approaches by which the machine learning and clinical research communities utilize real world data (RWD), including data captured in the electronic health record (EHR), vary dramatically. While clinical researchers cautiously use RWD for clinical investigations, ML for healthcare teams consume public datasets with minimal scrutiny to develop new algorithms. This study bridges this gap by developing and validating ML-DQA, a data quality assurance framework grounded in RWD best practices. The ML-DQA framework is applied to five ML projects across two geographies, different medical conditions, and different cohorts. A total of 2,999 quality checks and 24 quality reports were generated on RWD gathered on 247,536 patients across the five projects. Five generalizable practices emerge: all projects used a similar method to group redundant data element representations; all projects used automated utilities to build diagnosis and medication data elements; all projects used a common library of rules-based transformations; all projects used a unified approach to assign data quality checks to data elements; and all projects used a similar approach to clinical adjudication. An average of 5.8 individuals, including clinicians, data scientists, and trainees, were involved in implementing ML-DQA for each project and an average of 23.4 data elements per project were either transformed or removed in response to ML-DQA. This study demonstrates the important role of ML-DQA in healthcare projects and provides teams a framework to conduct these essential activities.
Submitted 4 August, 2022;
originally announced August 2022.
-
High-Performance Computing for SKA Transient Search: Use of FPGA based Accelerators -- a brief review
Authors:
R. Aafreen,
R. Abhishek,
B. Ajithkumar,
Arunkumar M. Vaidyanathan,
Indrajit V. Barve,
Sahana Bhattramakki,
Shashank Bhat,
B. S. Girish,
Atul Ghalame,
Y. Gupta,
Harshal G. Hayatnagarkar,
P. A. Kamini,
A. Karastergiou,
L. Levin,
S. Madhavi,
M. Mekhala,
M. Mickaliger,
V. Mugundhan,
Arun Naidu,
J. Oppermann,
B. Arul Pandian,
N. Patra,
A. Raghunathan,
Jayanta Roy,
Shiv Sethi
, et al. (12 additional authors not shown)
Abstract:
This paper presents the high-performance computing efforts with FPGAs for the accelerated pulsar/transient search for the SKA. Case studies are presented from within SKA and pathfinder telescopes, highlighting future opportunities. It reviews the scenario that has shifted from offline processing of the radio telescope data to digitizing several hundreds/thousands of antenna outputs over huge bandwidths, forming several hundreds of beams, and processing the data in the SKA real-time pulsar search pipelines. A brief account is given of the different architectures of the accelerators, primarily the new-generation Field Programmable Gate Array-based accelerators, showing their critical roles in achieving high-performance computing and in handling the enormous data volume problems of the SKA. It also presents the power-performance efficiency of this emerging technology and potential future scenarios.
Submitted 17 January, 2023; v1 submitted 14 July, 2022;
originally announced July 2022.
-
Mapping the landscape of histomorphological cancer phenotypes using self-supervised learning on unlabeled, unannotated pathology slides
Authors:
Adalberto Claudio Quiros,
Nicolas Coudray,
Anna Yeaton,
Xinyu Yang,
Bojing Liu,
Hortense Le,
Luis Chiriboga,
Afreen Karimkhan,
Navneet Narula,
David A. Moore,
Christopher Y. Park,
Harvey Pass,
Andre L. Moreira,
John Le Quesne,
Aristotelis Tsirigos,
Ke Yuan
Abstract:
Definitive cancer diagnosis and management depend upon the extraction of information from microscopy images by pathologists. These images contain complex information requiring time-consuming expert human interpretation that is prone to human bias. Supervised deep learning approaches have proven powerful for classification tasks, but they are inherently limited by the cost and quality of annotations used for training these models. To address this limitation of supervised methods, we developed Histomorphological Phenotype Learning (HPL), a fully self-supervised methodology that requires no expert labels or annotations and operates via the automatic discovery of discriminatory image features in small image tiles. Tiles are grouped into morphologically similar clusters which constitute a library of histomorphological phenotypes, revealing trajectories from benign to malignant tissue via inflammatory and reactive phenotypes. These clusters have distinct features which can be identified using orthogonal methods, linking histologic, molecular and clinical phenotypes. Applied to lung cancer tissues, we show that they align closely with patient survival, with histopathologically recognised tumor types and growth patterns, and with transcriptomic measures of immunophenotype. We then demonstrate that these properties are maintained in a multi-cancer study. These results show the clusters represent recurrent host responses and modes of tumor growth emerging under natural selection. Code, pre-trained models, learned embeddings, and documentation are available to the community at https://github.com/AdalbertoCq/Histomorphological-Phenotype-Learning
Submitted 1 September, 2023; v1 submitted 4 May, 2022;
originally announced May 2022.
-
Handling tree-structured text: parsing directory pages
Authors:
Sarang Shrivastava,
Afreen Shaikh,
Shivani Shrivastava,
Chung Ming Ho,
Pradeep Reddy,
Vijay Saraswat
Abstract:
The determination of the reading sequence of text is fundamental to document understanding. This problem is easily solved in pages where the text is organized into a sequence of lines and vertical alignment runs the height of the page (producing multiple columns which can be read from left to right). We present a situation -- the directory page parsing problem -- where information is presented on the page in an irregular, visually-organized, two-dimensional format. Directory pages are fairly common in financial prospectuses and carry information about organizations, their addresses and relationships that is key to business tasks in client onboarding. Interestingly, directory pages sometimes have hierarchical structure, motivating the need to generalize the reading sequence to a reading tree. We present solutions to the problem of identifying directory pages and constructing the reading tree, using (learnt) classifiers for text segments and a bottom-up (right to left, bottom-to-top) traversal of segments. The solution is a key part of a production service supporting automatic extraction of organization, address and relationship information from client onboarding documents.
Submitted 24 November, 2021;
originally announced November 2021.
-
Review of Cost Reduction Methods in Photoacoustic Computed Tomography
Authors:
Afreen Fatima,
Karl Kratkiewicz,
Rayyan Manwar,
Mohsin Zafar,
Ruiying Zhang,
Bin Huang,
Neda Dadashzadesh,
Jun Xia,
Mohammad Avanaki
Abstract:
Photoacoustic Computed Tomography (PACT) is a major configuration of photoacoustic imaging, a hybrid noninvasive modality for both functional and molecular imaging. PACT has rapidly gained importance in the field of biomedical imaging owing to its superior performance compared with conventional optical imaging counterparts. However, the overall cost of developing a PACT system is one of the challenges to the clinical translation of this novel technique. The cost of a typical commercial PACT system originates from the optical source, the ultrasound detector, and the data acquisition unit. With growing applications of photoacoustic imaging, there is a tremendous demand for reducing its cost. In this review article, we discuss various approaches to reducing the overall cost of a PACT system and provide a cost estimate for building a low-cost PACT system.
Submitted 29 May, 2019; v1 submitted 26 February, 2019;
originally announced February 2019.
-
Interdisciplinary collaboration in research networks: Empirical analysis of energy-related research in Greece
Authors:
Georgios A. Tritsaris,
Afreen Siddiqi
Abstract:
Technological innovation is intimately related to knowledge creation and recombination. In this work we introduce a combined statistical and network-based approach to study collaboration in scientific authorship. We apply it to characterize recent research efforts in renewable energy technology and its intersections with the domains of nanoscience and nanotechnology with focus on materials, and electrical engineering and computer science in Greece and its broader European and international environment as a case study. Using our methods we attempt to illuminate the processes which underlie knowledge creation and diversification in these research networks: a (positive) relationship between expenditure on research and development and the extent and diversity of team-based research at the intersections of the three domains is established. Our specific findings collectively provide insights into the collaboration structure and evolution of energy-related research activity in Greece, while our methodology can be used for evidence-based design, monitoring, and evaluation of interdisciplinary research programs.
Submitted 31 March, 2019; v1 submitted 13 May, 2018;
originally announced May 2018.
-
A Review on Elliptic Curve Cryptography for Embedded Systems
Authors:
Rahat Afreen,
S. C. Mehrotra
Abstract:
The use of elliptic curves in cryptography was independently proposed by Neal Koblitz and Victor Miller in 1985. Since then, elliptic curve cryptography (ECC) has evolved into a vast field for public key cryptography (PKC) systems. In a PKC system, separate keys are used to encode and decode the data. Since one of the keys is distributed publicly in PKC systems, the strength of security depends on a large key size. The mathematical problems of prime factorization and discrete logarithm were previously used in PKC systems. ECC has been shown to provide the same level of security with relatively small key sizes. Research in the field of ECC is mostly focused on its implementation on application-specific systems. Such systems have restricted resources, such as storage and processing speed, and domain-specific CPU architectures.
Submitted 19 July, 2011;
originally announced July 2011.