Search | arXiv e-print repository

Functional H_\infty Filtering for Descriptor Systems with Monotone Nonlinearities

Authors: Rishabh Sharma, Mahendra Kumar Gupta, Nutan Kumar Tomar

Abstract: This paper introduces a novel approach to the design of functional H_\infty filters for a class of nonlinear descriptor systems subjected to disturbances. Departing from conventional assumptions regarding system regularity, we adopt a more inclusive approach by considering general descriptor systems that satisfy a rank condition on their coefficient matrices. Under this rank condition, we establish a linear matrix inequality (LMI) as a sufficient criterion ensuring the stability of the error system and constraining the L_2 gain of the mapping from disturbances to errors to a predetermined level. The efficacy of the proposed approach is demonstrated through a practical example involving a simple constrained mechanical system.

Submitted 9 September, 2024; originally announced September 2024.

Comments: 13 pages, 4 figures

MSC Class: 93B51; 93C15; 93C10

arXiv:2408.07407 [pdf, ps, other]

Probing 4 X 4 quark mixing matrix

Authors: Gurjit Kaur, Gulsheen Ahuja, Dheeraj Shukla, Manmohan Gupta

Abstract: Without adhering to any specific model, we have presented 4 X 4 quark mixing matrix as an extension of the 3 X 3 PDG parametrization of the CKM matrix. Using unitarity constraints as well as the hierarchy among the elements of the 3 X 3 CKM matrix, we have found the hierarchy among the 4th row and 4th column elements of the 4 X 4 quark mixing matrix. Further, for the fourth generation case, we have explicitly found the 9 independent rephasing invariant parameters J_4X4. Also, using phenomenological estimates of the 4th row and 4th column elements, we have numerically evaluated these 9 parameters.

Submitted 14 August, 2024; originally announced August 2024.

Comments: 16 pages

arXiv:2408.03838 [pdf, other]

Using a Distance Sensor to Detect Deviations in a Planar Surface

Authors: Carter Sifferman, William Sun, Mohit Gupta, Michael Gleicher

Abstract: We investigate methods for determining if a planar surface contains geometric deviations (e.g., protrusions, objects, divots, or cliffs) using only an instantaneous measurement from a miniature optical time-of-flight sensor. The key to our method is to utilize the entirety of information encoded in raw time-of-flight data captured by off-the-shelf distance sensors. We provide an analysis of the problem in which we identify the key ambiguity between geometry and surface photometrics. To overcome this challenging ambiguity, we fit a Gaussian mixture model to a small dataset of planar surface measurements. This model implicitly captures the expected geometry and distribution of photometrics of the planar surface and is used to identify measurements that are likely to contain deviations. We characterize our method on a variety of surfaces and planar deviations across a range of scenarios. We find that our method utilizing raw time-of-flight data outperforms baselines which use only derived distance estimates. We build an example application in which our method enables mobile robot obstacle and cliff avoidance over a wide field-of-view.

Submitted 7 August, 2024; originally announced August 2024.

arXiv:2408.01013 [pdf, other]

Understanding and Enhancing Linux Kernel-based Packet Switching on WiFi Access Points

Authors: Shiqi Zhang, Mridul Gupta, Behnam Dezfouli

Abstract: As the number of WiFi devices and their traffic demands continue to rise, the need for a scalable and high-performance wireless infrastructure becomes increasingly essential. Central to this infrastructure are WiFi Access Points (APs), which facilitate packet switching between Ethernet and WiFi interfaces. Despite APs' reliance on the Linux kernel's data plane for packet switching, the detailed operations and complexities of switching packets between Ethernet and WiFi interfaces have not been investigated in existing works. This paper makes the following contributions towards filling this research gap. Through macro and micro-analysis of empirical experiments, our study reveals insights in two distinct categories. Firstly, while the kernel's statistics offer valuable insights into system operations, we identify and discuss potential pitfalls that can severely affect system analysis. For instance, we reveal the implications of device drivers on the meaning and accuracy of the statistics related to packet-switching tasks and processor utilization. Secondly, we analyze the impact of the packet switching path and core configuration on performance and power consumption. Specifically, we identify the differences in Ethernet-to-WiFi and WiFi-to-Ethernet data paths regarding processing components, multi-core utilization, and energy efficiency. We show that the WiFi-to-Ethernet data path leverages better multi-core processing and exhibits lower power consumption.

Submitted 2 August, 2024; originally announced August 2024.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

Report number: SIOTLAB-SUM-31J2024

arXiv:2408.00394 [pdf, ps, other]

Hierarchy of CKM matrix elements and implications of unitarity

Authors: Gurjit Kaur, Gulsheen Ahuja, Dheeraj Shukla, Manmohan Gupta

Abstract: The hierarchy amongst the CKM matrix elements, highlighted recently by Luo and Xing, has been rigorously revisited using the PDG parameterization incorporating unitarity constraints. Further, we have explored the evaluation of the CP violating parameter ε_k for the 9 possible unitarity ensured equivalent independent parametrizations of CKM matrix. Interestingly, we find that not all of these reproduce the value of parameter ε_k, this being presumably due to the hierarchical nature of the CKM matrix elements. This situation is echoing similar conclusions regarding the evaluation of Jarlskog's rephasing invariant parameter J through unitarity ensured equivalent possibilities.

Submitted 1 August, 2024; originally announced August 2024.

Comments: 13 pages. arXiv admin note: text overlap with arXiv:2310.11152

arXiv:2407.12877 [pdf, other]

Review-Feedback-Reason (ReFeR): A Novel Framework for NLG Evaluation and Reasoning

Authors: Yaswanth Narsupalli, Abhranil Chandra, Sreevatsa Muppirala, Manish Gupta, Pawan Goyal

Abstract: Assessing the quality of Natural Language Generation (NLG) outputs, such as those produced by large language models (LLMs), poses significant challenges. Traditional approaches involve either resource-intensive human evaluations or automatic metrics, which often exhibit a low correlation with human judgment. In this study, we propose Review-Feedback-Reason (ReFeR), a novel evaluation framework for NLG using LLM agents. We rigorously test ReFeR using two pre-existing benchmark datasets on diverse NLG tasks. The proposed framework not only enhances the accuracy of NLG evaluation, surpassing previous benchmarks by $\sim$20\%, but also generates constructive feedback and significantly improves collective reasoning. This feedback is then leveraged for the creation of instruction-tuning datasets, which, when used to fine-tune smaller models like Mistral-7B, makes them extremely good evaluators, yielding a better correlation with human evaluations and performance nearly on par with GPT-3.5. We highlight the effectiveness of our methodology through its application on three reasoning benchmarks, where it outperforms most of the state-of-the-art methods, and also outperforms the reasoning capabilities of models like GPT-3.5 Turbo by $\sim$11.67\% and GPT-4 by $\sim$1\% on an average.

Submitted 16 July, 2024; originally announced July 2024.

Comments: Paper Under Review

arXiv:2407.12165 [pdf, other]

Building AI Agents for Autonomous Clouds: Challenges and Design Principles

Authors: Manish Shetty, Yinfang Chen, Gagan Somashekar, Minghua Ma, Yogesh Simmhan, Xuchao Zhang, Jonathan Mace, Dax Vandevoorde, Pedro Las-Casas, Shachee Mishra Gupta, Suman Nath, Chetan Bansal, Saravan Rajmohan

Abstract: The rapid growth in the use of Large Language Models (LLMs) and AI Agents as part of software development and deployment is revolutionizing the information technology landscape. While code generation receives significant attention, a higher-impact application lies in using AI agents for operational resilience of cloud services, which currently require significant human effort and domain knowledge. There is a growing interest in AI for IT Operations (AIOps) which aims to automate complex operational tasks, like fault localization and root cause analysis, thereby reducing human intervention and customer impact. However, achieving the vision of autonomous and self-healing clouds through AIOps is hampered by the lack of standardized frameworks for building, evaluating, and improving AIOps agents. This vision paper lays the groundwork for such a framework by first framing the requirements and then discussing design decisions that satisfy them. We also propose AIOpsLab, a prototype implementation leveraging agent-cloud-interface that orchestrates an application, injects real-time faults using chaos engineering, and interfaces with an agent to localize and resolve the faults. We report promising results and lay the groundwork to build a modular and robust framework for building, evaluating, and improving agents for autonomous clouds.

Submitted 31 July, 2024; v1 submitted 16 July, 2024; originally announced July 2024.

arXiv:2407.09463 [pdf, other]

Interactive Coding with Unbounded Noise

Authors: Eden Fargion, Ran Gelles, Meghal Gupta

Abstract: Interactive coding allows two parties to conduct a distributed computation despite noise corrupting a certain fraction of their communication. Dani et al. (Inf. and Comp., 2018) suggested a novel setting in which the amount of noise is unbounded and can significantly exceed the length of the (noise-free) computation. While no solution is possible in the worst case, under the restriction of oblivious noise, Dani et al. designed a coding scheme that succeeds with a polynomially small failure probability. We revisit the question of conducting computations under this harsh type of noise and devise a computationally-efficient coding scheme that guarantees the success of the computation, except with an exponentially small probability. This higher degree of correctness matches the case of coding schemes with a bounded fraction of noise. Our simulation of an $N$-bit noise-free computation in the presence of $T$ corruptions, communicates an optimal number of $O(N+T)$ bits and succeeds with probability $1-2^{-Ω(N)}$. We design this coding scheme by introducing an intermediary noise model, where an oblivious adversary can choose the locations of corruptions in a worst-case manner, but the effect of each corruption is random: the noise either flips the transmission with some probability or otherwise erases it. This randomized abstraction turns out to be instrumental in achieving an optimal coding scheme.

Submitted 12 July, 2024; originally announced July 2024.

arXiv:2407.09386 [pdf, other]

Radiance Fields from Photons

Authors: Sacha Jungerman, Mohit Gupta

Abstract: Neural radiance fields, or NeRFs, have become the de facto approach for high-quality view synthesis from a collection of images captured from multiple viewpoints. However, many issues remain when capturing images in-the-wild under challenging conditions, such as low light, high dynamic range, or rapid motion leading to smeared reconstructions with noticeable artifacts. In this work, we introduce quanta radiance fields, a novel class of neural radiance fields that are trained at the granularity of individual photons using single-photon cameras (SPCs). We develop theory and practical computational techniques for building radiance fields and estimating dense camera poses from unconventional, stochastic, and high-speed binary frame sequences captured by SPCs. We demonstrate, both via simulations and a SPC hardware prototype, high-fidelity reconstructions under high-speed motion, in low light, and for extreme dynamic range settings.

Submitted 12 July, 2024; originally announced July 2024.

arXiv:2407.06446 [pdf, ps, other]

Tight bounds for stream decodable error-correcting codes

Authors: Meghal Gupta, Venkatesan Guruswami, Mihir Singhal

Abstract: In order to communicate a message over a noisy channel, a sender (Alice) uses an error-correcting code to encode her message $x$ into a codeword. The receiver (Bob) decodes it correctly whenever there is at most a small constant fraction of adversarial error in the transmitted codeword. This work investigates the setting where Bob is computationally bounded. Specifically, Bob receives the message as a stream and must process it and write $x$ in order to a write-only tape while using low (say polylogarithmic) space. We show three basic results about this setting, which are informally as follows: (1) There is a stream decodable code of near-quadratic length. (2) There is no stream decodable code of sub-quadratic length. (3) If Bob need only compute a private linear function of the input bits, instead of writing them all to the output tape, there is a stream decodable code of near-linear length.

Submitted 8 July, 2024; originally announced July 2024.

arXiv:2407.02683 [pdf, other]

Generalized Event Cameras

Authors: Varun Sundar, Matthew Dutson, Andrei Ardelean, Claudio Bruschini, Edoardo Charbon, Mohit Gupta

Abstract: Event cameras capture the world at high time resolution and with minimal bandwidth requirements. However, event streams, which only encode changes in brightness, do not contain sufficient scene information to support a wide variety of downstream tasks. In this work, we design generalized event cameras that inherently preserve scene intensity in a bandwidth-efficient manner. We generalize event cameras in terms of when an event is generated and what information is transmitted. To implement our designs, we turn to single-photon sensors that provide digital access to individual photon detections; this modality gives us the flexibility to realize a rich space of generalized event cameras. Our single-photon event cameras are capable of high-speed, high-fidelity imaging at low readout rates. Consequently, these event cameras can support plug-and-play downstream inference, without capturing new event datasets or designing specialized event-vision models. As a practical implication, our designs, which involve lightweight and near-sensor-compatible computations, provide a way to use single-photon sensors without exorbitant bandwidth costs.

Submitted 2 July, 2024; originally announced July 2024.

Comments: CVPR 2024

arXiv:2406.19709 [pdf, ps, other]

Near Optimal Dual Fault Tolerant Distance Oracle

Authors: Dipan Dey, Manoj Gupta

Abstract: We present a dual fault-tolerant distance oracle for undirected and unweighted graphs. Given a set $F$ of two edges, as well as a source node $s$ and a destination node $t$, our oracle returns the length of the shortest path from $s$ to $t$ that avoids $F$ in $O(1)$ time with a high probability. The space complexity of our oracle is $\Tilde{O}(n^2)$ (where $\Tilde{O}$ hides poly$\log n$ factors), making it nearly optimal in terms of both space and query time. Prior to our work, Pettie and Duan [SODA 2009] designed a dual fault-tolerant distance oracle that required $\Tilde{O}(n^2)$ space and $O(\log n)$ query time. In addition to improving the query time, our oracle is much simpler than the previous approach.

Submitted 1 July, 2024; v1 submitted 28 June, 2024; originally announced June 2024.

Comments: Accepted in ESA 2024

arXiv:2406.16890 [pdf, other]

TextAge: A Curated and Diverse Text Dataset for Age Classification

Authors: Shravan Cheekati, Mridul Gupta, Vibha Raghu, Pranav Raj

Abstract: Age-related language patterns play a crucial role in understanding linguistic differences and developing age-appropriate communication strategies. However, the lack of comprehensive and diverse datasets has hindered the progress of research in this area. To address this issue, we present TextAge, a curated text dataset that maps sentences to the age and age group of the producer, as well as an underage (under 13) label. TextAge covers a wide range of ages and includes both spoken and written data from various sources such as CHILDES, Meta, Poki Poems-by-kids, JUSThink, and the TV show "Survivor." The dataset undergoes extensive cleaning and preprocessing to ensure data quality and consistency. We demonstrate the utility of TextAge through two applications: Underage Detection and Generational Classification. For Underage Detection, we train a Naive Bayes classifier, fine-tuned RoBERTa, and XLNet models to differentiate between language patterns of minors and young-adults and over. For Generational Classification, the models classify language patterns into different age groups (kids, teens, twenties, etc.). The models excel at classifying the "kids" group but struggle with older age groups, particularly "fifties," "sixties," and "seventies," likely due to limited data samples and less pronounced linguistic differences. TextAge offers a valuable resource for studying age-related language patterns and developing age-sensitive language models.
The dataset's diverse composition and the promising results of the classification tasks highlight its potential for various applications, such as content moderation, targeted advertising, and age-appropriate communication. Future work aims to expand the dataset further and explore advanced modeling techniques to improve performance on older age groups.

Submitted 2 May, 2024; originally announced June 2024.

arXiv:2406.16833 [pdf, other]

USDC: A Dataset of $\underline{U}$ser $\underline{S}$tance and $\underline{D}$ogmatism in Long $\underline{C}$onversations

Authors: Mounika Marreddy, Subba Reddy Oota, Venkata Charan Chinni, Manish Gupta, Lucie Flek

Abstract: Identifying user's opinions and stances in long conversation threads on various topics can be extremely critical for enhanced personalization, market research, political campaigns, customer service, conflict resolution, targeted advertising, and content moderation. Hence, training language models to automate this task is critical. However, to train such models, gathering manual annotations has multiple challenges: 1) It is time-consuming and costly; 2) Conversation threads could be very long, increasing chances of noisy annotations; and 3) Interpreting instances where a user changes their opinion within a conversation is difficult because often such transitions are subtle and not expressed explicitly. Inspired by the recent success of large language models (LLMs) for complex natural language processing (NLP) tasks, we leverage Mistral Large and GPT-4 to automate the human annotation process on the following two tasks while also providing reasoning: i) User Stance classification, which involves labeling a user's stance of a post in a conversation on a five-point scale; ii) User Dogmatism classification, which deals with labeling a user's overall opinion in the conversation on a four-point scale. The majority voting on zero-shot, one-shot, and few-shot annotations from these two LLMs on 764 multi-user Reddit conversations helps us curate the USDC dataset. USDC is then used to finetune and instruction-tune multiple deployable small language models for the 5-class stance and 4-class dogmatism classification tasks.
We make the code and dataset publicly available [https://anonymous.4open.science/r/USDC-0F7F].

Submitted 24 June, 2024; originally announced June 2024.

Comments: 32 pages, 18 figures

arXiv:2406.16051 [pdf, other]

Entropy-driven decision-making dynamics sheds light on the emergence of the "paradox of choice"

Authors: Manish Gupta, Arnab Barua, Haralampos Hatzikirou

Abstract: Decision making is the cognitive process of selecting a course of action among multiple alternatives. As the decision maker belongs to a complex microenvironment (which contains multiple decision makers), they have to make a decision where multiple options are present, which often leads to a phenomenon known as the "paradox of choice". The latter refers to the case where too many options can lead to negative outcomes, such as increased uncertainty, decision paralysis, and frustration. Here, we employ an entropy driven mechanism within a statistical physics framework to explain the premises of the paradox. In turn, we focus on the emergence of a collective "paradox of choice", in the case of interacting decision-making agents, quantified as the decision synchronization time. Our findings reveal a trade-off between synchronization time and the sensing radius, indicating the optimal conditions for information transfer among group members, which significantly depends on the individual sensitivity parameters. Interestingly, when agents sense their microenvironment in a biased way or their decisions are influenced by their past choices, then the collective "paradox of choice" does not occur. In a nutshell, our theory offers a low-dimensional and unified statistical explanation of the "paradox of choice" at the individual and at the collective level.

Submitted 23 June, 2024; originally announced June 2024.

Comments: 20 pages, 8 figures

arXiv:2406.15499 [pdf, other]

doi 10.1016/j.mtcomm.2024.109583

Exploring large language models for microstructure evolution in materials

Authors: Prathamesh Satpute, Saurabh Tiwari, Maneet Gupta, Supriyo Ghosh

Abstract: There is a significant potential for coding skills to transition fully to natural language in the future. In this context, large language models (LLMs) have shown impressive natural language processing abilities to generate sophisticated computer code for research tasks in various domains. We report the first study on the applicability of LLMs to perform computer experiments on microstructure pattern formation in model materials. In particular, we exploit LLM's ability to generate code for solving various types of phase-field-based partial differential equations (PDEs) that integrate additional physics to model material microstructures. The results indicate that LLMs have a remarkable capacity to generate multi-physics code and can effectively deal with materials microstructure problems up to a certain complexity. However, for complex multi-physics coupled PDEs for which a detailed understanding of the problem is required, LLMs fail to perform the task efficiently, since much more detailed instructions with many iterations of the same query are required to generate the desired output. Nonetheless, at their current stage of development and potential future advancements, LLMs offer a promising outlook for accelerating materials education and research by supporting beginners and experts in their physics-based methodology. We hope this paper will spur further interest to leverage LLMs as a supporting tool in the integrated computational materials engineering (ICME) approach to materials modeling and design.

Submitted 19 June, 2024; originally announced June 2024.

Journal ref: Materials Today Communications, 2024

arXiv:2406.00859 [pdf, other]

Streaming quanta sensors for online, high-performance imaging and vision

Authors: Tianyi Zhang, Matthew Dutson, Vivek Boominathan, Mohit Gupta, Ashok Veeraraghavan

Abstract: Recently quanta image sensors (QIS) -- ultra-fast, zero-read-noise binary image sensors -- have demonstrated remarkable imaging capabilities in many challenging scenarios. Despite their potential, the adoption of these sensors is severely hampered by (a) high data rates and (b) the need for new computational pipelines to handle the unconventional raw data. We introduce a simple, low-bandwidth computational pipeline to address these challenges. Our approach is based on a novel streaming representation with a small memory footprint, efficiently capturing intensity information at multiple temporal scales. Updating the representation requires only 16 floating-point operations/pixel, which can be efficiently computed online at the native frame rate of the binary frames. We use a neural network operating on this representation to reconstruct videos in real-time (10-30 fps). We illustrate why such representation is well-suited for these emerging sensors, and how it offers low latency and high frame rate while retaining flexibility for downstream computer vision. Our approach results in significant data bandwidth reductions ~100X and real-time image reconstruction and computer vision -- $10^4$-$10^5$ reduction in computation than existing state-of-the-art approach while maintaining comparable quality. To the best of our knowledge, our approach is the first to achieve online, real-time image reconstruction on QIS.

Submitted 2 June, 2024; originally announced June 2024.

arXiv:2405.17190 [pdf, other]

SoK: Leveraging Transformers for Malware Analysis

Authors: Pradip Kunwar, Kshitiz Aryal, Maanak Gupta, Mahmoud Abdelsalam, Elisa Bertino

Abstract: The introduction of transformers has been an important breakthrough for AI research and application as transformers are the foundation behind Generative AI. A promising application domain for transformers is cybersecurity, in particular the malware domain analysis. The reason is the flexibility of the transformer models in handling long sequential features and understanding contextual relationships. However, as the use of transformers for malware analysis is still in the infancy stage, it is critical to evaluate, systematize, and contextualize existing literature to foster future research. This Systematization of Knowledge (SoK) paper aims to provide a comprehensive analysis of transformer-based approaches designed for malware analysis. Based on our systematic analysis of existing knowledge, we structure and propose taxonomies based on: (a) how different transformers are adapted, organized, and modified across various use cases; and (b) how diverse feature types and their representation capabilities are reflected. We also provide an inventory of datasets used to explore multiple research avenues in the use of transformers for malware analysis and discuss open challenges with future research directions. We believe that this SoK paper will assist the research community in gaining detailed insights from existing work and will serve as a foundational resource for implementing novel research using transformers for malware analysis.

Submitted 27 May, 2024; originally announced May 2024.

arXiv:2405.13630 [pdf, other]

STONKS: Quasi-real time XMM-Newton transient detection system

Authors: E. Quintin, N. A. Webb, I. Georgantopoulos, M. Gupta, E. Kammoun, L. Michel, A. Schwope, H. Tranin, I. Traulsen

Abstract: Over recent decades, astronomy has entered the era of massive data and real-time surveys. This is improving the study of transient objects - although they still contain some of the most poorly understood phenomena in astrophysics, as it is inherently more difficult to obtain data on them. In order to help detect these objects in their brightest state, we have built a quasi-real time transient detection system for the XMM-Newton pipeline: the Search for Transient Objects in New detections using Known Sources (STONKS) pipeline. STONKS detects long-term X-ray transients by automatically comparing new XMM-Newton detections to any available archival X-ray data at this position, sending out an alert if the amplitude of variability between observations is over 5. This required an initial careful cross-correlation and flux calibration of various X-ray catalogs from different observatories (XMM-Newton, Chandra, Swift, ROSAT, and eROSITA). We also systematically computed the XMM-Newton upper limits at the position of any X-ray source covered by the XMM-Newton observational footprint, even without any XMM-Newton counterpart. The behavior of STONKS was then tested on all 483 observations performed with imaging mode in 2021. Over the 2021 testing run, STONKS provided $0.7^{+0.7}_{-0.5}$ alerts per day, about 80% of them being serendipitous. STONKS also detected targeted tidal disruption events, ensuring its ability to detect other serendipitous events. As a byproduct of our method, the archival multi-instrument catalog contains about one million X-ray sources, with 15% of them involving several catalogs and 60% of them having XMM-Newton upper limits. STONKS demonstrates a great potential for revealing future serendipitous transient X-ray sources, providing the community with the ability to follow-up on these objects a few days after their detection.

Submitted 22 May, 2024; originally announced May 2024.

Comments: 30 pages, 30 figures, accepted in A&A

arXiv:2405.08123 [pdf, other]

Modeling sea ice in the marginal ice zone as a dense granular flow with rheology inferred from a discrete element model

Authors: Gonzalo G. de Diego, Mukund Gupta, Skylar A. Gering, Rohaiz Haris, Georg Stadler

Abstract: The marginal ice zone (MIZ) represents the periphery of the sea ice cover. Here, the macroscale behavior of the sea ice results from collisions and enduring contact between ice floes. This configuration closely resembles that of dense granular flows, which have been modeled successfully with the $μ(I)$ rheology. Here, we present a continuous model based on the $μ(I)$ rheology which treats sea ice as a compressible fluid, with the local sea ice concentration given by a dilatancy function $Φ(I)$. We infer expressions for $μ(I)$ and $Φ(I)$ from a discrete element method (DEM) which considers polygonal-shaped ice floes. We do this by driving the sea ice with a one-dimensional shearing ocean current. The resulting continuous model is a nonlinear system of equations with the sea ice velocity, local concentration, and pressure as unknowns. The rheology is given by the sum of a plastic and a viscous term. In the context of a periodic patch of ocean, which is effectively a one dimensional problem, and under steady conditions, we prove this system to be well-posed, present a numerical algorithm for solving it, and compare its solutions to those of the DEM. These comparisons demonstrate the continuous model's ability to capture most of the DEM's results accurately. The continuous model is particularly accurate for ocean currents faster than 0.25 m/s; however, for low concentrations and slow ocean currents, the continuous model is less effective in capturing the DEM results. In the latter case, the lack of accuracy of the continuous model is found to be accompanied by the breakdown of a balance between the average shear stress and the integrated ocean drag extracted from the DEM.

Submitted 13 May, 2024; originally announced May 2024.

arXiv:2405.06038 [pdf, other]

From Algorithm to Hardware: A Survey on Efficient and Safe Deployment of Deep Neural Networks

Authors: Xue Geng, Zhe Wang, Chunyun Chen, Qing Xu, Kaixin Xu, Chao Jin, Manas Gupta, Xulei Yang, Zhenghua Chen, Mohamed M. Sabry Aly, Jie Lin, Min Wu, Xiaoli Li

Abstract: Deep neural networks (DNNs) have been widely used in many artificial intelligence (AI) tasks. However, deploying them brings significant challenges due to the huge cost of memory, energy, and computation. To address these challenges, researchers have developed various model compression techniques such as model quantization and model pruning. Recently, there has been a surge in research of compression methods to achieve model efficiency while retaining the performance. Furthermore, more and more works focus on customizing the DNN hardware accelerators to better leverage the model compression techniques. In addition to efficiency, preserving security and privacy is critical for deploying DNNs. However, the vast and diverse body of related works can be overwhelming. This inspires us to conduct a comprehensive survey on recent research toward the goal of high-performance, cost-efficient, and safe deployment of DNNs. Our survey first covers the mainstream model compression techniques such as model quantization, model pruning, knowledge distillation, and optimizations of non-linear operations. We then introduce recent advances in designing hardware accelerators that can adapt to efficient model compression approaches. Additionally, we discuss how homomorphic encryption can be integrated to secure DNN deployment. Finally, we discuss several issues, such as hardware evaluation, generalization, and integration of various compression approaches. Overall, we aim to provide a big picture of efficient DNNs, from algorithm to hardware accelerators and security perspectives.

Submitted 9 May, 2024; originally announced May 2024.

Comments: This manuscript is the accepted version for TNNLS (IEEE Transactions on Neural Networks and Learning Systems)

arXiv:2405.04010 [pdf, other]

Explainability-Informed Targeted Malware Misclassification

Authors: Quincy Card, Kshitiz Aryal, Maanak Gupta

Abstract: In recent years, there has been a surge in malware attacks across critical infrastructures, requiring further research and development of appropriate response and remediation strategies in malware detection and classification. Several works have used machine learning models for malware classification into categories, and deep neural networks have shown promising results. However, these models have shown vulnerabilities to intentionally crafted adversarial attacks, which yield misclassification of a malicious file. Our paper explores such adversarial vulnerabilities of a neural network based malware classification system in the dynamic and online analysis environments. To evaluate our approach, we trained Feed Forward Neural Networks (FFNN) to classify malware categories based on features obtained from dynamic and online analysis environments. We use the state-of-the-art method, SHapley Additive exPlanations (SHAP), for the feature attribution for malware classification, to inform the adversarial attackers about the features with significant importance on classification decision. Using the explainability-informed features, we perform targeted misclassification adversarial white-box evasion attacks using the Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD) attacks against the trained classifier. Our results demonstrated a high evasion rate for some instances of attacks, showing a clear vulnerability of a malware classifier for such attacks. We offer recommendations for a balanced approach and a benchmark for much-needed future research into evasion attacks against malware classifiers, and develop more robust and trustworthy solutions.

Submitted 7 May, 2024; originally announced May 2024.

arXiv:2405.01728 [pdf, other]

Explainability Guided Adversarial Evasion Attacks on Malware Detectors

Authors: Kshitiz Aryal, Maanak Gupta, Mahmoud Abdelsalam, Moustafa Saleh

Abstract: As the focus on security of Artificial Intelligence (AI) is becoming paramount, research on crafting and inserting optimal adversarial perturbations has become increasingly critical. In the malware domain, this adversarial sample generation relies heavily on the accuracy and placement of crafted perturbation with the goal of evading a trained classifier. This work focuses on applying explainability techniques to enhance the adversarial evasion attack on a machine-learning-based Windows PE malware detector. The explainable tool identifies the regions of PE malware files that have the most significant impact on the decision-making process of a given malware detector, and therefore, the same regions can be leveraged to inject the adversarial perturbation for maximum efficiency. Profiling all the PE malware file regions based on their impact on the malware detector's decision enables the derivation of an efficient strategy for identifying the optimal location for perturbation injection. The strategy should incorporate the region's significance in influencing the malware detector's decision and the sensitivity of the PE malware file's integrity towards modifying that region. To assess the utility of explainable AI in crafting an adversarial sample of Windows PE malware, we utilize the DeepExplainer module of SHAP for determining the contribution of each region of PE malware to its detection by a CNN-based malware detector, MalConv. Furthermore, we analyzed the significance of SHAP values at a more granular level by subdividing each section of Windows PE into small subsections. We then performed an adversarial evasion attack on the subsections based on the corresponding SHAP values of the byte sequences.

Submitted 2 May, 2024; originally announced May 2024.

arXiv:2404.16595 [pdf, other]

Enhancement of spin current to charge current conversion in Ferromagnet/Graphene interface

Authors: Mahammad Tahir, Subhakanta Das, Mukul Gupta, Rohit Medwal, Soumik Mukhopadhyay

Abstract: The use of graphene in spintronic devices is contingent on its ability to convert a spin current into a charge current. We have systematically investigated the spin pumping induced spin-to-charge current conversion at the Graphene/FM interface and the effect of interface modification through high spin orbit coupling (SOC) material (Pt) as an interlayer (IL) of varying thicknesses by using broadband FMR spectroscopy. The spin mixing conductance is enhanced from $1.66 \times 10^{18}$ m$^{-2}$ to $2.72 \times 10^{18}$ m$^{-2}$ whereas the spin current density is enhanced from 0.135$\pm$0.003 to 0.242$\pm$0.004 MA/m$^{2}$ at the Graphene/FM interface due to the interface modification using high SOC material Pt as an interlayer. The spin current to charge current conversion efficiency turns out to be $\approx 0.003$ nm for the Graphene/FM interface. These findings support the idea that Graphene in combination with high SOC material (Pt) could be a potential candidate for spintronic applications, specifically for spin-torque-based memory applications.

Submitted 25 May, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

Comments: 8 Pages, 6 figures

arXiv:2404.13865 [pdf, other]

doi 10.1007/978-3-031-49601-1_6

Context-Enhanced Language Models for Generating Multi-Paper Citations

Authors: Avinash Anand, Kritarth Prasad, Ujjwal Goel, Mohit Gupta, Naman Lal, Astha Verma, Rajiv Ratn Shah

Abstract: Citation text plays a pivotal role in elucidating the connection between scientific documents, demanding an in-depth comprehension of the cited paper. Constructing citations is often time-consuming, requiring researchers to delve into extensive literature and grapple with articulating relevant content. To address this challenge, the field of citation text generation (CTG) has emerged. However, while earlier methods have primarily centered on creating single-sentence citations, practical scenarios frequently necessitate citing multiple papers within a single paragraph. To bridge this gap, we propose a method that leverages Large Language Models (LLMs) to generate multi-citation sentences. Our approach involves a single source paper and a collection of target papers, culminating in a coherent paragraph containing multi-sentence citation text. Furthermore, we introduce a curated dataset named MCG-S2ORC, composed of English-language academic research papers in Computer Science, showcasing multiple citation instances. In our experiments, we evaluate three LLMs LLaMA, Alpaca, and Vicuna to ascertain the most effective model for this endeavor. Additionally, we exhibit enhanced performance by integrating knowledge graphs from target papers into the prompts for generating citation text. This research underscores the potential of harnessing LLMs for citation generation, opening a compelling avenue for exploring the intricate connections between scientific documents.

Submitted 22 April, 2024; originally announced April 2024.

Comments: 14 pages, 7 figures, 11th International Conference, BDA 2023, Delhi, India

Journal ref: Big Data and Artificial Intelligence 2023, Delhi, India, December 7, 80-94

arXiv:2404.13099 [pdf, other]

Mathify: Evaluating Large Language Models on Mathematical Problem Solving Tasks

Authors: Avinash Anand, Mohit Gupta, Kritarth Prasad, Navya Singla, Sanjana Sanjeev, Jatin Kumar, Adarsh Raj Shivam, Rajiv Ratn Shah

Abstract: The rapid progress in the field of natural language processing (NLP) systems and the expansion of large language models (LLMs) have opened up numerous opportunities in the field of education and instructional methods. These advancements offer the potential for tailored learning experiences and immediate feedback, all delivered through accessible and cost-effective services. One notable application area for this technological advancement is in the realm of solving mathematical problems. Mathematical problem-solving not only requires the ability to decipher complex problem statements but also the skill to perform precise arithmetic calculations at each step of the problem-solving process. However, the evaluation of the arithmetic capabilities of large language models remains an area that has received relatively little attention. In response, we introduce an extensive mathematics dataset called "MathQuest" sourced from the 11th and 12th standard Mathematics NCERT textbooks. This dataset encompasses mathematical challenges of varying complexity and covers a wide range of mathematical concepts. Utilizing this dataset, we conduct fine-tuning experiments with three prominent LLMs: LLaMA-2, WizardMath, and MAmmoTH. These fine-tuned models serve as benchmarks for evaluating their performance on our dataset. Our experiments reveal that among the three models, MAmmoTH-13B emerges as the most proficient, achieving the highest level of competence in solving the presented mathematical problems. Consequently, MAmmoTH-13B establishes itself as a robust and dependable benchmark for addressing NCERT mathematics problems.

Submitted 19 April, 2024; originally announced April 2024.

Comments: 10 pages, 3 figures, NeurIPS 2023 Workshop on Generative AI for Education (GAIED)

Journal ref: NeurIPS 2023 Workshop on Generative AI for Education (GAIED)

arXiv:2404.12473 [pdf, other]

Explainable Deep Learning Models for Dynamic and Online Malware Classification

Authors: Quincy Card, Daniel Simpson, Kshitiz Aryal, Maanak Gupta, Sheikh Rabiul Islam

Abstract: In recent years, there has been a significant surge in malware attacks, necessitating more advanced preventive measures and remedial strategies. While several successful AI-based malware classification approaches exist categorized into static, dynamic, or online analysis, most successful AI models lack easily interpretable decisions and explanations for their processes. Our paper aims to delve into explainable malware classification across various execution environments (such as dynamic and online), thoroughly analyzing their respective strengths, weaknesses, and commonalities. To evaluate our approach, we train Feed Forward Neural Networks (FFNN) and Convolutional Neural Networks (CNN) to classify malware based on features obtained from dynamic and online analysis environments. The feature attribution for malware classification is performed by explainability tools, SHAP, LIME and Permutation Importance. We perform a detailed evaluation of the calculated global and local explanations from the experiments, discuss limitations and, ultimately, offer recommendations for achieving a balanced approach.

Submitted 18 April, 2024; originally announced April 2024.

arXiv:2404.11691 [pdf, ps, other]

doi 10.1109/INCET51464.2021.9456342

Improvement in Semantic Address Matching using Natural Language Processing

Authors: Vansh Gupta, Mohit Gupta, Jai Garg, Nitesh Garg

Abstract: Address matching is an important task for many businesses, especially delivery and take-out companies, which helps them to retrieve a certain address from their data warehouse. Existing solutions use string similarity and edit distance algorithms to find similar addresses in the address database, but these algorithms do not work effectively with redundant, unstructured, or incomplete address data. This paper discusses a semantic address matching technique, by which we can find a particular address from a list of possible addresses. We have also reviewed existing practices and their shortcomings. Semantic address matching is essentially an NLP task in the field of deep learning. Through this technique we are able to overcome the drawbacks of existing methods, such as redundant or abbreviated data problems. The solution uses OCR on invoices to extract the address and create the data pool of addresses. Then this data is fed to the BM-25 algorithm for scoring the best matching entries. Then, to obtain the best result, this is passed through BERT to give the best possible result from the similar queries. Our investigation exhibits that our methodology enormously improves both accuracy and review of cutting-edge technology existing techniques.

Submitted 17 April, 2024; originally announced April 2024.

Comments: 5 pages, 7 tables, 2021 2nd International Conference for Emerging Technology (INCET)

Journal ref: 2021 2nd International Conference for Emerging Technology (INCET), Belagavi, India, 2021, pp. 1-5

arXiv:2404.11661 [pdf, other]

doi 10.1109/GlobConET53749.2022.9872449

Designing an Intelligent Parcel Management System using IoT & Machine Learning

Authors: Mohit Gupta, Nitesh Garg, Jai Garg, Vansh Gupta, Devraj Gautam

Abstract: Parcel delivery is a critical activity in railways. More importantly, each parcel must be thoroughly checked and sorted according to its destination address. We require an efficient and robust IoT system capable of doing all of these tasks with great precision and minimal human interaction. This paper discusses a fully-fledged solution that we created using IoT and machine learning to assist trains in performing this operation efficiently. In this study, we cover the product, which consists mostly of two phases. Scanning is the first step, followed by sorting. During the scanning process, the parcel will be passed through three scanners that will look for explosives, drugs, and any dangerous materials in the parcel and will trash it if any of the tests fail. When the scanning step is over, the parcel moves on to the sorting phase, where we use QR codes to retrieve the details of the parcels and sort them properly. The simulation of the system is done using the Blender software. Our research shows that our procedure significantly improves accuracy as well as the assessment of cutting-edge technology and existing techniques.

Submitted 17 April, 2024; originally announced April 2024.

Comments: 6 pages, 6 figures, 2022 IEEE IAS Global Conference on Emerging Technologies (GlobConET)

Journal ref: 2022 IEEE IAS Global Conference on Emerging Technologies (GlobConET), Arad, Romania, 2022, pp. 751-756

arXiv:2404.10305 [pdf, other]

doi 10.1145/3606040.3617444

TC-OCR: TableCraft OCR for Efficient Detection & Recognition of Table Structure & Content

Authors: Avinash Anand, Raj Jaiswal, Pijush Bhuyan, Mohit Gupta, Siddhesh Bangar, Md. Modassir Imam, Rajiv Ratn Shah, Shin'ichi Satoh

Abstract: The automatic recognition of tabular data in document images presents a significant challenge due to the diverse range of table styles and complex structures. Tables offer valuable content representation, enhancing the predictive capabilities of various systems such as search engines and Knowledge Graphs. Addressing the two main problems, namely table detection (TD) and table structure recognition (TSR), has traditionally been approached independently. In this research, we propose an end-to-end pipeline that integrates deep learning models, including DETR, CascadeTabNet, and PP OCR v2, to achieve comprehensive image-based table recognition. This integrated approach effectively handles diverse table styles, complex structures, and image distortions, resulting in improved accuracy and efficiency compared to existing methods like Table Transformers. Our system achieves simultaneous table detection (TD), table structure recognition (TSR), and table content recognition (TCR), preserving table structures and accurately extracting tabular data from document images. The integration of multiple models addresses the intricacies of table recognition, making our approach a promising solution for image-based table understanding, data extraction, and information retrieval applications. Our proposed approach achieves an IOU of 0.96 and an OCR Accuracy of 78%, showcasing a remarkable improvement of approximately 25% in the OCR Accuracy compared to the previous Table Transformer approach.

Submitted 19 April, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

Comments: 8 pages, 2 figures, Workshop of 1st MMIR Deep Multimodal Learning for Information Retrieval

arXiv:2404.09763 [pdf, other]

KG-CTG: Citation Generation through Knowledge Graph-guided Large Language Models

Authors: Avinash Anand, Mohit Gupta, Kritarth Prasad, Ujjwal Goel, Naman Lal, Astha Verma, Rajiv Ratn Shah

Abstract: Citation Text Generation (CTG) is a task in natural language processing (NLP) that aims to produce text that accurately cites or references a cited document within a source document. In CTG, the generated text draws upon contextual cues from both the source document and the cited paper, ensuring accurate and relevant citation information is provided. Previous work in the field of citation generation is mainly based on the text summarization of documents. Following this, this paper presents a framework, and a comparative study to demonstrate the use of Large Language Models (LLMs) for the task of citation generation. Also, we have shown the improvement in the results of citation generation by incorporating the knowledge graph relations of the papers in the prompt for the LLM to better learn the relationship between the papers. To assess how well our model is performing, we have used a subset of standard S2ORC dataset, which only consists of computer science academic research papers in the English Language. Vicuna performs best for this task with 14.15 Meteor, 12.88 Rouge-1, 1.52 Rouge-2, and 10.94 Rouge-L. Also, Alpaca performs best, and improves the performance by 36.98% in Rouge-1, and 33.14% in Meteor by including knowledge graphs.

Submitted 15 April, 2024; originally announced April 2024.

arXiv:2404.09530 [pdf, other]

doi 10.1145/3595916.3626448

RanLayNet: A Dataset for Document Layout Detection used for Domain Adaptation and Generalization

Authors: Avinash Anand, Raj Jaiswal, Mohit Gupta, Siddhesh S Bangar, Pijush Bhuyan, Naman Lal, Rajeev Singh, Ritika Jha, Rajiv Ratn Shah, Shin'ichi Satoh

Abstract: Large ground-truth datasets and recent advances in deep learning techniques have been useful for layout detection. However, because of the restricted layout diversity of these datasets, training on them requires a sizable number of annotated instances, which is both expensive and time-consuming. As a result, differences between the source and target domains may significantly impact how well these models function. To solve this problem, domain adaptation approaches have been developed that use a small quantity of labeled data to adjust the model to the target domain. In this research, we introduced a synthetic document dataset called RanLayNet, enriched with automatically assigned labels denoting spatial positions, ranges, and types of layout elements. The primary aim of this endeavor is to develop a versatile dataset capable of training models with robustness and adaptability to diverse document formats. Through empirical experimentation, we demonstrate that a deep layout identification model trained on our dataset exhibits enhanced performance compared to a model trained solely on actual documents. Moreover, we conduct a comparative analysis by fine-tuning inference models using both PubLayNet and IIIT-AR-13K datasets on the Doclaynet dataset. Our findings emphasize that models enriched with our dataset are optimal for tasks such as achieving 0.398 and 0.588 mAP95 score in the scientific document domain for the TABLE class.

Submitted 19 April, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

Comments: 8 pages, 6 figures, MMAsia 2023 Proceedings of the 5th ACM International Conference on Multimedia in Asia

Journal ref: In Proceedings of the 5th ACM International Conference on Multimedia in Asia 2023. Association for Computing Machinery, NY, USA, Article 74, pp. 1-6

arXiv:2404.07959 [pdf]

Damage identification of offshore jacket platforms in a digital twin framework considering optimal sensor placement

Authors: Mengmeng Wang, Atilla Incecik, Shizhe Feng, M. K. Gupta, Grzegorz Krlolczyk, Z Li

Abstract: A new digital twin (DT) framework with optimal sensor placement (OSP) is proposed to accurately calculate the modal responses and identify the damage ratios of the offshore jacket platforms. The proposed damage identification framework consists of two models (namely one OSP model and one damage identification model). The OSP model adopts the multi-objective Lichtenberg algorithm (MOLA) to perform the sensor number/location optimization to make a good balance between the sensor cost and the modal calculation accuracy. In the damage identification model, the Markov Chain Monte Carlo (MCMC)-Bayesian method is developed to calculate the structural damage ratios based on the modal information obtained from the sensory measurements, where the uncertainties of the structural parameters are quantified. The proposed method is validated using an offshore jacket platform, and the analysis results demonstrate efficient identification of the structural damage location and severity.

Submitted 26 March, 2024; originally announced April 2024.

arXiv:2404.07670 [pdf, ps, other]

On Naisargik Images of Varshamov-Tenengolts and Helberg Codes

Authors: Kalp Pandya, Devdeep Shetranjiwala, Naisargi Savaliya, Manish K. Gupta

Abstract: The VT and Helberg codes, both in binary and non-binary forms, stand as elegant solutions for rectifying insertion and deletion errors. In this paper we consider the quaternary versions of these codes. It is well known that many optimal binary non-linear codes like Kerdock and Preparata can be depicted as Gray images (isometry) of codes defined over $\mathbb{Z}_4$. Thus a natural question arises: Can we find similar maps between quaternary and binary spaces that give interesting properties when applied to the VT and Helberg codes? We found several such maps, called Naisargik (natural) maps, and we study the images of quaternary VT and Helberg codes under these maps. Naisargik and inverse Naisargik images give interesting error-correcting properties for VT and Helberg codes. If two Naisargik images of a VT code generate an intersecting one-deletion sphere, then the images hold the same weights. A quaternary Helberg code designed to correct $s$ deletions can effectively rectify $s+1$ deletion errors when considering its Naisargik image, and an $s$-deletion correcting binary Helberg code can correct $\lfloor\frac{s}{2}\rfloor$ errors with the inverse Naisargik image.

Submitted 11 April, 2024; originally announced April 2024.

Comments: 20 pages, 18 Tables, draft, data is at https://github.com/guptalab/GrayVT

arXiv:2404.04877 [pdf, other]

A Bird-Eye view on DNA Storage Simulators

Authors: Sanket Doshi, Mihir Gohel, Manish K. Gupta

Abstract: In the current world, due to the huge demand for storage, DNA-based storage solutions sound quite promising because of their longevity, low power consumption, and high capacity. However, in real life, storing data in the form of DNA is quite expensive and challenging. Therefore, researchers and developers build software that helps simulate real-life DNA storage without worrying about the cost. This paper aims to review some of the software that performs DNA storage simulations in different domains. The paper also explains core concepts such as synthesis, sequencing, clustering, reconstruction, GC window, K-mer window, etc., and gives an overview of existing algorithms. Further, we present 3 different software tools on the basis of domain, implementation techniques, and customer/commercial usability.

Submitted 7 April, 2024; originally announced April 2024.

Comments: 19 pages, 19 figures, draft, review

arXiv:2404.03847 [pdf, other]

Optimal quantile estimation: beyond the comparison model

Authors: Meghal Gupta, Mihir Singhal, Hongxun Wu

Abstract: Estimating quantiles is one of the foundational problems of data sketching. Given $n$ elements $x_1, x_2, \dots, x_n$ from some universe of size $U$ arriving in a data stream, a quantile sketch estimates the rank of any element with additive error at most $\varepsilon n$. A low-space algorithm solving this task has applications in database systems, network measurement, load balancing, and many other practical scenarios. Current quantile estimation algorithms described as optimal include the GK sketch (Greenwald and Khanna 2001) using $O(\varepsilon^{-1} \log n)$ words (deterministic) and the KLL sketch (Karnin, Lang, and Liberty 2016) using $O(\varepsilon^{-1} \log\log(1/δ))$ words (randomized, with failure probability $δ$). However, both algorithms are only optimal in the comparison-based model, whereas most typical applications involve streams of integers that the sketch can use aside from making comparisons. If we go beyond the comparison-based model, the deterministic q-digest sketch (Shrivastava, Buragohain, Agrawal, and Suri 2004) achieves a space complexity of $O(\varepsilon^{-1}\log U)$ words, which is incomparable to the previously-mentioned sketches. It has long been asked whether there is a quantile sketch using $O(\varepsilon^{-1})$ words of space (which is optimal as long as $n \leq \mathrm{poly}(U)$). In this work, we present a deterministic algorithm using $O(\varepsilon^{-1})$ words, resolving this line of work.

Submitted 4 April, 2024; originally announced April 2024.

arXiv:2404.03598 [pdf, other]

Intent Detection and Entity Extraction from BioMedical Literature

Authors: Ankan Mullick, Mukur Gupta, Pawan Goyal

Abstract: Biomedical queries have become increasingly prevalent in web searches, reflecting the growing interest in accessing biomedical literature. Despite recent research on large-language models (LLMs) motivated by endeavours to attain generalized intelligence, their efficacy in replacing task and domain-specific natural language understanding approaches remains questionable. In this paper, we address this question by conducting a comprehensive empirical evaluation of intent detection and named entity recognition (NER) tasks from biomedical text. We show that Supervised Fine Tuned approaches are still relevant and more effective than general-purpose LLMs. Biomedical transformer models such as PubMedBERT can surpass ChatGPT on NER task with only 5 supervised examples.

Submitted 5 August, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

Comments: Accepted to CL4Health LREC-COLING 2024

arXiv:2403.18128 [pdf, other]

HealthGAT: Node Classifications in Electronic Health Records using Graph Attention Networks

Authors: Fahmida Liza Piya, Mehak Gupta, Rahmatollah Beheshti

Abstract: While electronic health records (EHRs) are widely used across various applications in healthcare, most applications use the EHRs in their raw (tabular) format. Relying on raw or simple data pre-processing can greatly limit the performance or even applicability of downstream tasks using EHRs. To address this challenge, we present HealthGAT, a novel graph attention network framework that utilizes a hierarchical approach to generate embeddings from EHR, surpassing traditional graph-based methods. Our model iteratively refines the embeddings for medical codes, resulting in improved EHR data analysis. We also introduce customized EHR-centric auxiliary pre-training tasks to leverage the rich medical knowledge embedded within the data. This approach provides a comprehensive analysis of complex medical relationships and offers significant advancement over standard data representation techniques. HealthGAT has demonstrated its effectiveness in various healthcare scenarios through comprehensive evaluations against established methodologies. Specifically, our model shows outstanding performance in node classification and downstream tasks such as predicting readmissions and diagnosis classifications. Our code is available at https://github.com/healthylaife/HealthGAT

Submitted 26 March, 2024; originally announced March 2024.

arXiv:2403.17801 [pdf, other]

Towards 3D Vision with Low-Cost Single-Photon Cameras

Authors: Fangzhou Mu, Carter Sifferman, Sacha Jungerman, Yiquan Li, Mark Han, Michael Gleicher, Mohit Gupta, Yin Li

Abstract: We present a method for reconstructing 3D shape of arbitrary Lambertian objects based on measurements by miniature, energy-efficient, low-cost single-photon cameras. These cameras, operating as time resolved image sensors, illuminate the scene with a very fast pulse of diffuse light and record the shape of that pulse as it returns back from the scene at a high temporal resolution. We propose to model this image formation process, account for its non-idealities, and adapt neural rendering to reconstruct 3D geometry from a set of spatially distributed sensors with known poses. We show that our approach can successfully recover complex 3D shapes from simulated data. We further demonstrate 3D object reconstruction from real-world captures, utilizing measurements from a commodity proximity sensor. Our work draws a connection between image-based modeling and active range scanning and is a step towards 3D vision with single-photon cameras.

Submitted 29 March, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

arXiv:2403.10092 [pdf, other]

Specification and Enforcement of Activity Dependency Policies using XACML

Authors: Tanjila Mawla, Maanak Gupta, Ravi Sandhu

Abstract: The evolving smart and interconnected systems are designed to operate with minimal human intervention. Devices within these smart systems often engage in prolonged operations based on sensor data and contextual factors. Recently, an Activity-Centric Access Control (ACAC) model has been introduced to regulate these prolonged operations, referred to as activities, which undergo state changes over extended durations of time. Dependencies among different activities can influence and restrict the execution of one another, necessitating active and real-time monitoring of the dependencies between activities to prevent security violations. In the ACAC model, the activity dependencies, denoted as "D", are considered as a decision parameter for controlling a requested activity. These dependencies must be evaluated throughout all phases of an activity's life cycle. To ensure the consistency of access control rules across diverse domains and applications, a standard policy language is essential. We propose a policy framework adapting the widely-used eXtensible Access Control Markup Language (XACML), referred to as $\mathrm{XACML_{AD}}$, to specify the activity dependency policies. This work involves extending the syntax and semantics of XACML by introducing new elements to check dependent activities' states and handle state updates on dependent activities. In addition to the language extension, we present the enforcement architecture and data flow model of evaluating policies for activity dependencies.
The integration of the proposed $\mathrm{XACML_{AD}}$ policy framework and the enforcement of the policies supports dependency evaluation, necessary updates and continuous enforcement of policies to control an activity throughout its life cycle. We implement the enforcement architecture exploiting the $\mathrm{XACML_{AD}}$ policy framework and discuss the performance evaluation results.

Submitted 15 March, 2024; originally announced March 2024.

Comments: 10 pages, Accepted by ISSSR 2024 (The 10th International Symposium on System Security, Safety, and Reliability) sponsored by the IEEE Reliability Society

arXiv:2403.10025 [pdf, other]

doi 10.1103/PhysRevB.110.054441

Inter-chain Interactions, Multi-magnon condensation and Strain effect in chain compound NaVOPO$_4$

Authors: Manoj Gupta, Manodip Routh, Manoranjan Kumar, Tanusri Saha Dasgupta

Abstract: Employing first-principles modelling and many-body methods, the magnetic properties of spin-1/2 chain compound NaVOPO$_4$ are explored. The extensive first-principles calculations establish an intricate three-dimensionally coupled model that consists of weakly alternating $J$-$J^{\prime}$ antiferromagnetic chains running along criss-cross directions between two consecutive $ab$ planes, connected via two subleading couplings, a ferromagnetic exchange along the $c$ direction ($J_c$) and a weaker antiferromagnetic exchange ($J_a$) along the body diagonal direction. The exact diagonalization and density matrix renormalization group study has been carried out on a two-dimensional spin model with $J$-$J^{\prime}$-$J_c$ and effective $J_d$ couplings, constructed based on the full model, for numerical ease. The $J_c$-$J_d$ phase diagram is found to host a {\it disorder} phase with a finite spin gap for comparable values of $J_c$ and $J_d$, arising out of the competing nature of these two interactions, other than two ordered phases. The calculated thermodynamic properties of this model provide a fair description of experimentally measured data. The predominant manifestation of $J_c$ and $J_d$ in the disorder phase happens in the stabilisation of a multi-magnon condensed phase, upon gap closing by application of an external magnetic field. We further explore the effect of tensile uniaxial strain, which is found to drive the system from gapful to gapless ground state.

Submitted 29 August, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

arXiv:2403.08848 [pdf, other]

FocusMAE: Gallbladder Cancer Detection from Ultrasound Videos with Focused Masked Autoencoders

Authors: Soumen Basu, Mayuna Gupta, Chetan Madan, Pankaj Gupta, Chetan Arora

Abstract: In recent years, automated Gallbladder Cancer (GBC) detection has gained the attention of researchers. Current state-of-the-art (SOTA) methodologies relying on ultrasound sonography (US) images exhibit limited generalization, emphasizing the need for transformative approaches. We observe that individual US frames may lack sufficient information to capture disease manifestation. This study advocates for a paradigm shift towards video-based GBC detection, leveraging the inherent advantages of spatiotemporal representations. Employing the Masked Autoencoder (MAE) for representation learning, we address shortcomings in conventional image-based methods. We propose a novel design called FocusMAE to systematically bias the selection of masking tokens from high-information regions, fostering a more refined representation of malignancy. Additionally, we contribute the most extensive US video dataset for GBC detection. We also note that this is the first study on US video-based GBC detection. We validate the proposed methods on the curated dataset, and report a new state-of-the-art (SOTA) accuracy of 96.4% for the GBC detection problem, against an accuracy of 84% by current image-based SOTA methods (GBCNet and RadFormer) and 94.7% by the video-based SOTA method (AdaMAE). We further demonstrate the generality of the proposed FocusMAE on a public CT-based Covid detection dataset, reporting an improvement in accuracy by 3.3% over current baselines. The source code and pretrained models are available at: https://gbc-iitd.github.io/focusmae

Submitted 29 March, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

Comments: To Appear at CVPR 2024

arXiv:2403.06428 [pdf, other]

Intra-Section Code Cave Injection for Adversarial Evasion Attacks on Windows PE Malware File

Authors: Kshitiz Aryal, Maanak Gupta, Mahmoud Abdelsalam, Moustafa Saleh

Abstract: Windows malware is predominantly available in cyberspace and is a prime target for deliberate adversarial evasion attacks. Although researchers have investigated the adversarial malware attack problem, a multitude of important questions remain unanswered, including (a) Are the existing techniques to inject adversarial perturbations in Windows Portable Executable (PE) malware files effective enough for evasion purposes?; (b) Does the attack process preserve the original behavior of malware?; (c) Are there unexplored approaches/locations that can be used to carry out adversarial evasion attacks on Windows PE malware?; and (d) What are the optimal locations and sizes of adversarial perturbations required to evade an ML-based malware detector without significant structural change in the PE file? To answer some of these questions, this work proposes a novel approach that injects a code cave within the section (i.e., intra-section) of Windows PE malware files to make space for adversarial perturbations. In addition, a code loader is also injected inside the PE file, which reverts adversarial malware to its original form during the execution, preserving the malware's functionality and executability. To understand the effectiveness of our approach, we injected adversarial perturbations inside the .text, .data and .rdata sections, generated using the gradient descent and Fast Gradient Sign Method (FGSM), to target the two popular CNN-based malware detectors, MalConv and MalConv2.
Our experiments yielded notable results, achieving a 92.31% evasion rate with gradient descent and 96.26% with FGSM against MalConv, compared to the 16.17% evasion rate for append attacks. Similarly, when targeting MalConv2, our approach achieved a remarkable maximum evasion rate of 97.93% with gradient descent and 94.34% with FGSM, significantly surpassing the 4.01% evasion rate observed with append attacks.

Submitted 11 March, 2024; originally announced March 2024.

arXiv:2402.12832 [pdf, ps, other]

Nearly Optimal Fault Tolerant Distance Oracle

Authors: Dipan Dey, Manoj Gupta

Abstract: We present an $f$-fault tolerant distance oracle for an undirected weighted graph where each edge has an integral weight from $[1 \dots W]$. Given a set $F$ of $f$ edges, as well as a source node $s$ and a destination node $t$, our oracle returns the \emph{shortest path} from $s$ to $t$ avoiding $F$ in $O((cf \log (nW))^{O(f^2)})$ time, where $c > 1$ is a constant. The space complexity of our oracle is $O(f^4n^2\log^2 (nW))$. For a constant $f$, our oracle is nearly optimal both in terms of space and time (barring some logarithmic factor).

Submitted 16 July, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

Comments: accepted in STOC, 2024

arXiv:2402.12254 [pdf, other]

doi 10.1051/0004-6361/202346212

Stability of the coronal magnetic field around large confined and eruptive solar flares

Authors: Manu Gupta, J. K. Thalmann, A. M. Veronig

Abstract: In order to improve our understanding of the pre-requisites of eruptive solar flares, we study and compare different measures that characterize the eruptive potential of solar active regions - the critical height for torus instability as a local measure and the helicity ratio as a global measure - with the structural properties of the underlying magnetic field, namely the altitude of the center of the current-carrying magnetic structure. Using time series of 3D optimization-based nonlinear force-free magnetic field models for 10 different active regions (ARs) around the time of large solar flares, we determine the altitudes of the current-weighted centers of the non-potential model structures. Based on the potential magnetic field, we inspect the decay index, $n$, in multiple vertical planes oriented along or perpendicular to the flare-relevant polarity inversion line, and estimate the critical height ($h_{\mathrm{crit}}$) for torus instability (TI) using different thresholds of $n$. The critical heights are interpreted with respect to the altitudes of the current-weighted centers of the associated non-potential structures, as well as the eruptive character of the associated flares, and the eruptive potential of the host AR, as characterized by the helicity ratio. Our most important findings are that (i) $h_{\mathrm{crit}}$ is more segregated in terms of flare type than the helicity ratio, and that (ii) coronal field configurations with a higher eruptive potential (in terms of the helicity ratio) also appear to be more prone to TI.
Furthermore, we find no pronounced differences in the altitudes of the non-potential structures prior to confined and eruptive flares.

Submitted 19 February, 2024; originally announced February 2024.

Comments: Accepted for publication in A & A journal, 16 pages, and 8 figures

Journal ref: A&A 686, A115 (2024)

arXiv:2402.09728 [pdf, other]

AbuseGPT: Abuse of Generative AI ChatBots to Create Smishing Campaigns

Authors: Ashfak Md Shibli, Mir Mehedi A. Pritom, Maanak Gupta

Abstract: SMS phishing, also known as "smishing", is a growing threat that tricks users into disclosing private information or clicking into URLs with malicious content through fraudulent mobile text messages. In the recent past, we have also observed a rapid advancement of conversational generative AI chatbot services (e.g., OpenAI's ChatGPT, Google's BARD), which are powered by pre-trained large language models (LLMs). These AI chatbots certainly have a lot of utilities, but it is not systematically understood how they can play a role in creating threats and attacks. In this paper, we propose the AbuseGPT method to show how the existing generative AI-based chatbot services can be exploited by attackers in the real world to create smishing texts and eventually lead to craftier smishing campaigns. To the best of our knowledge, there is no pre-existing work that evidently shows the impacts of these generative text-based models on creating SMS phishing. Thus, we believe this study is the first of its kind to shed light on this emerging cybersecurity threat. We have found strong empirical evidence to show that attackers can exploit ethical standards in the existing generative AI-based chatbot services by crafting prompt injection attacks to create newer smishing campaigns. We also discuss some future research directions and guidelines to prevent the abuse of generative AI-based services and safeguard users from smishing attacks.

Submitted 15 February, 2024; originally announced February 2024.

Comments: 6 pages, 12 figures, published in ISDFS 2024

arXiv:2401.09349 [pdf, other]

Doping induced singlet to triplet superconducting transition in Ba$_{2}$CuO$_{3+δ}$

Authors: Priyo Adhikary, Mayank Gupta, B. R. K. Nanda, Shantanu Mukherjee

Abstract: In this study, we perform a numerical simulation on the recently discovered high-temperature superconductor ($T_c$= 73K) Ba$_2$CuO$_{3.2}$ \cite{lietal} while focusing on doping dependence of alternating CuO$_6$ octahedra and CuO chain-like states. Employing the multiband random-phase approximation, we compute the spin-fluctuation mediated pairing interaction, subsequently determining its pairing eigenvalues and eigenfunctions relative to oxygen-doping levels. We find that, for a certain range of hole doping in Ba$_2$CuO$_{3+δ}$, a singlet $d_{x^2-y^2}$-wave pairing symmetry emerges as long as we keep the doping below the critical value $x_{c}$. Interestingly, upon hole doping, the dominant pairing symmetry undergoes a transition to a triplet (odd pairing) type from the singlet state. This change in pairing is driven by the competition between the nesting vectors coming from the Fermi surface of $d_{z^2}$ and $d_{x^2-y^2}$ orbitals within the CuO$_6$ octahedra. This triplet state is attainable through hole doping, while suppressing inter-layer self-doping effects. Furthermore, we present the density of states within the superconducting phase, offering a potential comparison with tunnelling spectra in Ba$_2$CuO$_{3+δ}$. Our research provides novel insights into the intricate pairing symmetries in Ba$_2$CuO$_{3+δ}$ and their underlying pairing mechanisms.

Submitted 17 January, 2024; originally announced January 2024.

arXiv:2401.04213 [pdf]

Comments regarding Transonic dislocation propagation in diamond by Katagiri, et al. (Science 382, 69-72, 2023)

Authors: James A. Hawreliak, J. M. Winey, Surinder. M. Sharma, Yogendra M. Gupta

Abstract: We have carefully examined the above-referenced paper and find the claims of stacking fault formation and transonic dislocation propagation in diamond to be not valid. Additionally, it is quite puzzling that 14 authors on this paper are also co-authors on another recent paper that directly conflicts with the dislocation claims in the Science paper.

Submitted 8 January, 2024; originally announced January 2024.

Comments: 11 pages, 3 figures

arXiv:2312.17703 [pdf, other]

Evidence for $π$-shifted Cooper quartets and few-mode transport in PbTe nanowire three-terminal Josephson junctions

Authors: Mohit Gupta, Vipin Khade, Colin Riggert, Lior Shani, Gavin Menning, Pim Lueb, Jason Jung, Régis Mélin, Erik P. A. M. Bakkers, Vlad S. Pribiag

Abstract: Josephson junctions are typically characterized by a single phase difference across two superconductors. This conventional two-terminal Josephson junction can be generalized to a multi-terminal device where the Josephson energy contains terms with contributions from multiple independent phase variables. Such multi-terminal Josephson junctions (MTJJs) are being considered as platforms for engineering effective Hamiltonians with non-trivial topologies, such as Weyl crossings and higher-order Chern numbers. This approach offers unique possibilities that are complementary to phenomena attainable in bulk crystals, including topological states in more than three dimensions and real-time gate-tunability of the Hamiltonians. However, these prospects rely on the ability to create MTJJs with non-classical multi-terminal couplings in which only a handful of quantum modes are populated. Here, we demonstrate these requirements by using a three-terminal Josephson junction fabricated on selective-area-grown (SAG) PbTe nanowires. We observe signatures of a $π$-shifted Josephson effect, consistent with inter-terminal couplings mediated by four-particle quantum states called Cooper quartets. We further observe supercurrent co-existent with a non-monotonic evolution of the conductance with gate voltage, indicating transport mediated by a few quantum modes in both two- and three-terminal devices. These results establish a platform for investigations of topological Hamiltonians based on Andreev bound states.

Submitted 22 April, 2024; v1 submitted 29 December, 2023; originally announced December 2023.

arXiv:2312.15426 [pdf, other]

The Group Access Bounds for Binary Search Trees

Authors: Parinya Chalermsook, Manoj Gupta, Wanchote Jiamjitrak, Akash Pareek, Sorrachai Yingchareonthawornchai

Abstract: The access lemma (Sleator and Tarjan, JACM 1985) is a property of binary search trees that implies interesting consequences such as static optimality, static finger, and the working set property. However, there are known corollaries of dynamic optimality that cannot be derived via the access lemma, such as the dynamic finger, and any $o(\log n)$-competitive ratio to the optimal BST where $n$ is the number of keys. In this paper, we introduce the group access bound that can be defined with respect to a reference group access tree. Group access bounds generalize the access lemma and imply properties that are far stronger than those implied by the access lemma. For each of the following results, there is a group access tree whose group access bound: (1) is $O(\sqrt{\log n})$-competitive to the optimal BST; (2) achieves the $k$-finger bound with an additive term of $O(m \log k \log \log n)$ (randomized) when the reference tree is an almost complete binary tree; (3) satisfies the unified bound with an additive term of $O(m \log \log n)$; (4) matches the unified bound with a time window $k$ with an additive term of $O(m \log k \log \log n)$ (randomized). Furthermore, we prove a simulation theorem: for every group access tree, there is an online BST algorithm that is $O(1)$-competitive with its group access bound. In particular, any new group access bound will automatically imply a new BST algorithm achieving the same bound. Thereby, we obtain an improved $k$-finger bound (when the reference tree is an almost complete binary tree) and an improved unified bound with a time window $k$, and we match the best-known bound for the unified bound in the BST model. Since any dynamically optimal BST must achieve the group access bounds, we believe our results provide a new direction towards proving $o(\log n)$-competitiveness of Splay tree and Greedy.

Submitted 24 December, 2023; originally announced December 2023.

Showing 1–50 of 680 results for author: Gupta, M