Generating novel experimental hypotheses from language models: A case study on cross-dative generalization

Abstract: Neural network language models (LMs) have been shown to successfully capture complex linguistic knowledge. However, their utility for understanding language acquisition is still debated. We contribute to this debate by presenting a case study where we use LMs as simulated learners to derive novel experimental hypotheses to be tested with humans. We apply this paradigm to study cross-dative generalization (CDG): productive generalization of novel verbs across dative constructions (she pilked me the ball/she pilked the ball to me) -- acquisition of which is known to involve a large space of contextual features -- using LMs trained on child-directed speech. We specifically ask: "what properties of the training exposure facilitate a novel verb's generalization to the (unmodeled) alternate construction?" To answer this, we systematically vary the exposure context in which a novel dative verb occurs in terms of the properties of the theme and recipient, and then analyze the LMs' usage of the novel verb in the unmodeled dative construction. We find LMs to replicate known patterns of children's CDG, as a precondition to exploring novel hypotheses. Subsequent simulations reveal a nuanced role of the features of the novel verbs' exposure context on the LMs' CDG. We find CDG to be facilitated when the first postverbal argument of the exposure context is pronominal, definite, short, and conforms to the prototypical animacy expectations of the exposure dative. These patterns are characteristic of harmonic alignment in datives, where the argument with features ranking higher on the discourse prominence scale tends to precede the other. This gives rise to a novel hypothesis that CDG is facilitated insofar as the features of the exposure context -- in particular, its first postverbal argument -- are harmonically aligned. We conclude by proposing future experiments that can test this hypothesis in children.

Submitted 9 August, 2024; originally announced August 2024.

arXiv:2403.19827 [pdf, other]

Language Models Learn Rare Phenomena from Less Rare Phenomena: The Case of the Missing AANNs

Authors: Kanishka Misra, Kyle Mahowald

Abstract: Language models learn rare syntactic phenomena, but the extent to which this is attributable to generalization vs. memorization is a major open question. To that end, we iteratively trained transformer language models on systematically manipulated corpora which were human-scale in size, and then evaluated their learning of a rare grammatical phenomenon: the English Article+Adjective+Numeral+Noun (AANN) construction ("a beautiful five days"). We compared how well this construction was learned on the default corpus relative to a counterfactual corpus in which AANN sentences were removed. We found that AANNs were still learned better than systematically perturbed variants of the construction. Using additional counterfactual corpora, we suggest that this learning occurs through generalization from related constructions (e.g., "a few days"). An additional experiment showed that this learning is enhanced when there is more variability in the input. Taken together, our results provide an existence proof that LMs can learn rare grammatical phenomena by generalization from less rare phenomena. Data and code: https://github.com/kanishkamisra/aannalysis.

Submitted 10 August, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

Comments: Updated version

arXiv:2401.06640 [pdf, other]

Experimental Contexts Can Facilitate Robust Semantic Property Inference in Language Models, but Inconsistently

Authors: Kanishka Misra, Allyson Ettinger, Kyle Mahowald

Abstract: Recent zero-shot evaluations have highlighted important limitations in the abilities of language models (LMs) to perform meaning extraction. However, it is now well known that LMs can demonstrate radical improvements in the presence of experimental contexts such as in-context examples and instructions. How well does this translate to previously studied meaning-sensitive tasks? We present a case study on the extent to which experimental contexts can improve LMs' robustness in performing property inheritance -- predicting semantic properties of novel concepts, a task that they have been previously shown to fail on. Upon carefully controlling the nature of the in-context examples and the instructions, our work reveals that they can indeed lead to non-trivial property inheritance behavior in LMs. However, this ability is inconsistent: with a minimal reformulation of the task, some LMs were found to pick up on shallow, non-semantic heuristics from their inputs, suggesting that the computational principles of semantic property inference are yet to be mastered by LMs.

Submitted 12 January, 2024; originally announced January 2024.

arXiv:2312.03708 [pdf, other]

Abstraction via exemplars? A representational case study on lexical category inference in BERT

Authors: Kanishka Misra, Najoung Kim

Abstract: Exemplar-based accounts are often considered to be in direct opposition to pure linguistic abstraction in explaining language learners' ability to generalize to novel expressions. However, the recent success of neural network language models on linguistically sensitive tasks suggests that perhaps abstractions can arise via the encoding of exemplars. We provide empirical evidence for this claim by adapting an existing experiment that studies how an LM (BERT) generalizes the usage of novel tokens that belong to lexical categories such as Noun/Verb/Adjective/Adverb from exposure to only a single instance of their usage. We analyze the representational behavior of the novel tokens in these experiments, and find that BERT's capacity to generalize to unseen expressions involving the use of these novel tokens constitutes the movement of novel token representations towards regions of known category exemplars in two-dimensional space. Our results suggest that learners' encoding of exemplars can indeed give rise to abstraction-like behavior.

Submitted 3 November, 2023; originally announced December 2023.

Comments: 2-page abstract, to appear in BUCLD48

arXiv:2310.18736 [pdf, other]

A Gale-Shapley View of Unique Stable Marriages

Authors: Kartik Gokhale, Amit Kumar Mallik, Ankit Kumar Misra, Swaprava Nath

Abstract: Stable marriage of a two-sided market with unit demand is a classic problem that arises in many real-world scenarios. In addition, a unique stable marriage in this market simplifies a host of downstream desiderata. In this paper, we explore a new set of sufficient conditions for unique stable matching (USM) under this setup. Unlike other approaches that also address this question using the structure of preference profiles, we use an algorithmic viewpoint and investigate if this question can be answered using the lens of the deferred acceptance (DA) algorithm (Gale and Shapley, 1962). Our results yield a set of sufficient conditions for USM (viz., MaxProp and MaxRou) and show that these are disjoint from the previously known sufficiency conditions like sequential preference and no crossing. We also provide a characterization of MaxProp that makes it efficiently verifiable, and shows the gap between MaxProp and the entire USM class. These results give a more detailed view of the sub-structures of the USM class.

Submitted 2 August, 2024; v1 submitted 28 October, 2023; originally announced October 2023.

Comments: 20 pages, 1 figure, In Proceedings, ECAI 2024

arXiv:2306.04009 [pdf, other]

Triggering Multi-Hop Reasoning for Question Answering in Language Models using Soft Prompts and Random Walks

Authors: Kanishka Misra, Cicero Nogueira dos Santos, Siamak Shakeri

Abstract: Despite readily memorizing world knowledge about entities, pre-trained language models (LMs) struggle to compose together two or more facts to perform multi-hop reasoning in question-answering tasks. In this work, we propose techniques that improve upon this limitation by relying on random walks over structured knowledge graphs. Specifically, we use soft prompts to guide LMs to chain together their encoded knowledge by learning to map multi-hop questions to random walk paths that lead to the answer. Applying our methods on two T5 LMs shows substantial improvements over standard tuning approaches in answering questions that require 2-hop reasoning.

Submitted 6 June, 2023; originally announced June 2023.

Comments: Findings of ACL 2023

arXiv:2302.00093 [pdf, other]

Large Language Models Can Be Easily Distracted by Irrelevant Context

Authors: Freda Shi, Xinyun Chen, Kanishka Misra, Nathan Scales, David Dohan, Ed Chi, Nathanael Schärli, Denny Zhou

Abstract: Large language models have achieved impressive performance on various natural language processing tasks. However, so far they have been evaluated primarily on benchmarks where all information in the input context is relevant for solving the task. In this work, we investigate the distractibility of large language models, i.e., how the model problem-solving accuracy can be influenced by irrelevant context. In particular, we introduce Grade-School Math with Irrelevant Context (GSM-IC), an arithmetic reasoning dataset with irrelevant information in the problem description. We use this benchmark to measure the distractibility of cutting-edge prompting techniques for large language models, and find that the model performance is dramatically decreased when irrelevant information is included. We also identify several approaches for mitigating this deficiency, such as decoding with self-consistency and adding to the prompt an instruction that tells the language model to ignore the irrelevant information.

Submitted 6 June, 2023; v1 submitted 31 January, 2023; originally announced February 2023.

Comments: Published in ICML 2023

arXiv:2212.08979 [pdf, other]

Language model acceptability judgements are not always robust to context

Authors: Koustuv Sinha, Jon Gauthier, Aaron Mueller, Kanishka Misra, Keren Fuentes, Roger Levy, Adina Williams

Abstract: Targeted syntactic evaluations of language models ask whether models show stable preferences for syntactically acceptable content over minimal-pair unacceptable inputs. Most targeted syntactic evaluation datasets ask models to make these judgements with just a single context-free sentence as input. This does not match language models' training regime, in which input sentences are always highly contextualized by the surrounding corpus. This mismatch raises an important question: how robust are models' syntactic judgements in different contexts? In this paper, we investigate the stability of language models' performance on targeted syntactic evaluations as we vary properties of the input context: the length of the context, the types of syntactic phenomena it contains, and whether or not there are violations of grammaticality. We find that model judgements are generally robust when placed in randomly sampled linguistic contexts. However, they are substantially unstable for contexts containing syntactic structures matching those in the critical test content. Among all tested models (GPT-2 and five variants of OPT), we significantly improve models' judgements by providing contexts with matching syntactic structures, and conversely significantly worsen them using unacceptable contexts with matching but violated syntactic structures. This effect is amplified by the length of the context, except for unrelated inputs. We show that these changes in model performance are not explainable by simple features matching the context and the test inputs, such as lexical overlap and dependency overlap. This sensitivity to highly specific syntactic features of the context can only be explained by the models' implicit in-context learning abilities.

Submitted 17 December, 2022; originally announced December 2022.

arXiv:2210.01963 [pdf, other]

COMPS: Conceptual Minimal Pair Sentences for testing Robust Property Knowledge and its Inheritance in Pre-trained Language Models

Authors: Kanishka Misra, Julia Taylor Rayz, Allyson Ettinger

Abstract: A characteristic feature of human semantic cognition is its ability to not only store and retrieve the properties of concepts observed through experience, but to also facilitate the inheritance of properties (can breathe) from superordinate concepts (animal) to their subordinates (dog) -- i.e. demonstrate property inheritance. In this paper, we present COMPS, a collection of minimal pair sentences that jointly tests pre-trained language models (PLMs) on their ability to attribute properties to concepts and their ability to demonstrate property inheritance behavior. Analyses of 22 different PLMs on COMPS reveal that they can easily distinguish between concepts on the basis of a property when they are trivially different, but find it relatively difficult when concepts are related on the basis of nuanced knowledge representations. Furthermore, we find that PLMs can demonstrate behavior consistent with property inheritance to a great extent, but fail in the presence of distracting information, which decreases the performance of many models, sometimes even below chance. This lack of robustness in demonstrating simple reasoning raises important questions about PLMs' capacity to make correct inferences even when they appear to possess the prerequisite knowledge.

Submitted 8 February, 2023; v1 submitted 4 October, 2022; originally announced October 2022.

Comments: EACL 2023 Camera Ready version. Code can be found at https://github.com/kanishkamisra/comps

arXiv:2205.06910 [pdf, other]

A Property Induction Framework for Neural Language Models

Authors: Kanishka Misra, Julia Taylor Rayz, Allyson Ettinger

Abstract: To what extent can experience from language contribute to our conceptual knowledge? Computational explorations of this question have shed light on the ability of powerful neural language models (LMs) -- informed solely through text input -- to encode and elicit information about concepts and properties. To extend this line of research, we present a framework that uses neural-network language models (LMs) to perform property induction -- a task in which humans generalize novel property knowledge (has sesamoid bones) from one or more concepts (robins) to others (sparrows, canaries). Patterns of property induction observed in humans have shed considerable light on the nature and organization of human conceptual knowledge. Inspired by this insight, we use our framework to explore the property inductions of LMs, and find that they show an inductive preference to generalize novel properties on the basis of category membership, suggesting the presence of a taxonomic bias in their representations.

Submitted 13 May, 2022; originally announced May 2022.

Comments: CogSci 2022 camera ready version, with hyperref-compatible citations. Code and Supplemental Material can be found in https://github.com/kanishkamisra/lm-induction

arXiv:2203.13112 [pdf, other]

minicons: Enabling Flexible Behavioral and Representational Analyses of Transformer Language Models

Authors: Kanishka Misra

Abstract: We present minicons, an open source library that provides a standard API for researchers interested in conducting behavioral and representational analyses of transformer-based language models (LMs). Specifically, minicons enables researchers to apply analysis methods at two levels: (1) at the prediction level -- by providing functions to efficiently extract word/sentence level probabilities; and (2) at the representational level -- by also facilitating efficient extraction of word/phrase level vectors from one or more layers. In this paper, we describe the library and apply it to two motivating case studies: One focusing on the learning dynamics of the BERT architecture on relative grammatical judgments, and the other on benchmarking 23 different LMs on zero-shot abductive reasoning. minicons is available at https://github.com/kanishkamisra/minicons

Submitted 24 March, 2022; originally announced March 2022.

Comments: To be submitted; Code to reproduce experiments can be found on https://github.com/kanishkamisra/minicons-experiments

arXiv:2203.12606 [pdf]

Journey of Cryptocurrency in India In View of Financial Budget 2022-23

Authors: Varun Shukla, Manoj Kumar Misra, Atul Chaturvedi

Abstract: Recently, the Indian Finance Minister Nirmala Sitharaman announced in the Union Budget 2022-23 that the Indian government will impose a 30% tax (the highest tax slab in India) on income generated from cryptocurrencies. Big financial institutions, experts and academicians have different opinions in this regard. They claim that it could be the end of the cryptocurrency market in India, or that the RBI (Reserve Bank of India) may launch its own crypto or digital currency. In this context, this article discusses the journey and future aspects of cryptocurrency in India, and we hope that it will be a reference for further research and discussion in this area.

Submitted 4 February, 2022; originally announced March 2022.

arXiv:2111.02603 [pdf, ps, other]

On Semantic Cognition, Inductive Generalization, and Language Models

Authors: Kanishka Misra

Abstract: My doctoral research focuses on understanding semantic knowledge in neural network models trained solely to predict natural language (referred to as language models, or LMs), by drawing on insights from the study of concepts and categories grounded in cognitive science. I propose a framework inspired by 'inductive reasoning,' a phenomenon that sheds light on how humans utilize background knowledge to make inductive leaps and generalize from new pieces of information about concepts and their properties. Drawing from experiments that study inductive reasoning, I propose to analyze semantic inductive generalization in LMs using phenomena observed in human-induction literature, investigate inductive behavior on tasks such as implicit reasoning and emergent feature recognition, and analyze and relate induction dynamics to the learned conceptual representation space.

Submitted 3 November, 2021; originally announced November 2021.

Comments: Accepted at AAAI 2022 Doctoral Consortium

arXiv:2105.02987 [pdf, other]

Do language models learn typicality judgments from text?

Authors: Kanishka Misra, Allyson Ettinger, Julia Taylor Rayz

Abstract: Building on research arguing for the possibility of conceptual and categorical knowledge acquisition through statistics contained in language, we evaluate predictive language models (LMs) -- informed solely by textual input -- on a prevalent phenomenon in cognitive science: typicality. Inspired by experiments that involve language processing and show robust typicality effects in humans, we propose two tests for LMs. Our first test targets whether typicality modulates LM probabilities in assigning taxonomic category memberships to items. The second test investigates sensitivities to typicality in LMs' probabilities when extending new information about items to their categories. Both tests show modest -- but not completely absent -- correspondence between LMs and humans, suggesting that text-based exposure alone is insufficient to acquire typicality knowledge.

Submitted 6 May, 2021; originally announced May 2021.

Comments: Accepted as a talk to CogSci 2021

arXiv:2104.10813 [pdf, other]

Finding Fuzziness in Neural Network Models of Language Processing

Authors: Kanishka Misra, Julia Taylor Rayz

Abstract: Humans often communicate by using imprecise language, suggesting that fuzzy concepts with unclear boundaries are prevalent in language use. In this paper, we test the extent to which models trained to capture the distributional statistics of language show correspondence to fuzzy-membership patterns. Using the task of natural language inference, we test a recent state-of-the-art model on the classical case of temperature, by examining its mapping of temperature data to fuzzy perceptions such as "cool", "hot", etc. We find the model to show patterns that are similar to classical fuzzy-set theoretic formulations of linguistic hedges, albeit with a substantial amount of noise, suggesting that models trained solely on language show promise in encoding fuzziness.

Submitted 21 April, 2021; originally announced April 2021.

Comments: To appear at NAFIPS 2021

arXiv:2101.07397 [pdf, ps, other]

Exploring Lexical Irregularities in Hypothesis-Only Models of Natural Language Inference

Authors: Qingyuan Hu, Yi Zhang, Kanishka Misra, Julia Rayz

Abstract: Natural Language Inference (NLI) or Recognizing Textual Entailment (RTE) is the task of predicting the entailment relation between a pair of sentences (premise and hypothesis). This task has been described as a valuable testing ground for the development of semantic representations, and is a key component in natural language understanding evaluation benchmarks. Models that understand entailment should encode both the premise and the hypothesis. However, experiments by Poliak et al. revealed a strong preference of these models towards patterns observed only in the hypothesis, based on a 10-dataset comparison. Their results indicated the existence of statistical irregularities present in the hypothesis that bias the model into performing competitively with the state of the art. While recast datasets provide large scale generation of NLI instances due to minimal human intervention, the papers that generate them do not provide fine-grained analysis of the potential statistical patterns that can bias NLI models. In this work, we analyze hypothesis-only models trained on one of the recast datasets provided in Poliak et al. for word-level patterns. Our results indicate the existence of potential lexical biases that could contribute to inflating the model performance.

Submitted 21 January, 2021; v1 submitted 18 January, 2021; originally announced January 2021.

Comments: Accepted by 2020 IEEE 19th International Conference on Cognitive Informatics & Cognitive Computing (ICCI*CC)

arXiv:2101.01693 [pdf, other]

COVID-19 Tests Gone Rogue: Privacy, Efficacy, Mismanagement and Misunderstandings

Authors: Manuel Morales, Rachel Barbar, Darshan Gandhi, Sanskruti Landage, Joseph Bae, Arpita Vats, Jil Kothari, Sheshank Shankar, Rohan Sukumaran, Himi Mathur, Krutika Misra, Aishwarya Saxena, Parth Patwa, Sethuraman T. V., Maurizio Arseni, Shailesh Advani, Kasia Jakimowicz, Sunaina Anand, Priyanshi Katiyar, Ashley Mehra, Rohan Iyer, Srinidhi Murali, Aryan Mahindra, Mikhail Dmitrienko, Saurish Srivastava, et al. (5 additional authors not shown)

Abstract: COVID-19 testing, the cornerstone for effective screening and identification of COVID-19 cases, remains paramount as an intervention tool to curb the spread of COVID-19 both at local and national levels. However, the speed at which the pandemic struck and the response was rolled out, the widespread impact on healthcare infrastructure, the lack of sufficient preparation within the public health system, and the complexity of the crisis led to utter confusion among test-takers. Invasion of privacy remains a crucial concern. The user experience of test-takers remains low. User friction affects user behavior and discourages participation in testing programs. Test efficacy has been overstated. Test results are poorly understood, resulting in inappropriate follow-up recommendations. Herein, we review the current landscape of COVID-19 testing, identify four key challenges, and discuss the consequences of the failure to address these challenges. The current infrastructure around testing and information propagation is highly privacy-invasive and does not leverage scalable digital components. In this work, we discuss challenges complicating the existing COVID-19 testing ecosystem and highlight the need to improve the testing experience for the user and reduce privacy invasions. Digital tools will play a critical role in resolving these challenges.

Submitted 7 May, 2021; v1 submitted 5 January, 2021; originally announced January 2021.

Comments: 22 pages, 2 figures

arXiv:2012.01772 [pdf, other]

Digital Landscape of COVID-19 Testing: Challenges and Opportunities

Authors: Darshan Gandhi, Rohan Sukumaran, Priyanshi Katiyar, Alex Radunsky, Sunaina Anand, Shailesh Advani, Jil Kothari, Kasia Jakimowicz, Sheshank Shankar, Sethuraman T. V., Krutika Misra, Aishwarya Saxena, Sanskruti Landage, Richa Sonker, Parth Patwa, Aryan Mahindra, Mikhail Dmitrienko, Kanishka Vaish, Ashley Mehra, Srinidhi Murali, Rohan Iyer, Joseph Bae, Vivek Sharma, Abhishek Singh, Rachel Barbar, et al. (1 additional author not shown)

Abstract: The COVID-19 pandemic has left a devastating trail all over the world, in terms of loss of lives, economic decline, travel restrictions, trade deficit, and collapsing economy including real-estate, job loss, loss of health benefits, the decline in quality of access to care and services and overall quality of life. Immunization from the anticipated vaccines will not be the stand-alone guideline that will help surpass the pandemic and return to normalcy. Four pillars of effective public health intervention include diagnostic testing for both asymptomatic and symptomatic individuals, contact tracing, quarantine of individuals with symptoms or who are exposed to COVID-19, and maintaining strict hygiene standards at the individual and community level. Digital technologies currently being used for COVID-19 testing include certain mobile apps, web dashboards, and online self-assessment tools. Herein, we look into various digital solutions adopted by communities across universities, businesses, and other organizations. We summarize the challenges experienced using these tools in terms of quality of information, privacy, and user-centric issues. Despite numerous digital solutions available and being developed, many vary in both the quality and quantity of information being shared, which can be overwhelming to the users. Understanding the testing landscape through a digital lens will give a clear insight into the multiple challenges that we face including data privacy, cost, and miscommunication. It is the destiny of digitalization to navigate testing for COVID-19. Block-chain based systems can be used for privacy preservation and ensuring ownership of the data to remain with the user. Another solution involves having digital health passports with relevant and correct information. In this early draft, we summarize the challenges and propose possible solutions to address the same.

Submitted 3 December, 2020; originally announced December 2020.

Comments: 28 pages, 4 figures

arXiv:2010.03010 [pdf, other]

doi 10.18653/v1/2020.findings-emnlp.415

Exploring BERT's Sensitivity to Lexical Cues using Tests from Semantic Priming

Authors: Kanishka Misra, Allyson Ettinger, Julia Taylor Rayz

Abstract: Models trained to estimate word probabilities in context have become ubiquitous in natural language processing. How do these models use lexical cues in context to inform their word probabilities? To answer this question, we present a case study analyzing the pre-trained BERT model with tests informed by semantic priming. Using English lexical stimuli that show priming in humans, we find that BERT too shows "priming," predicting a word with greater probability when the context includes a related word versus an unrelated one. This effect decreases as the amount of information provided by the context increases. Follow-up analysis shows BERT to be increasingly distracted by related prime words as context becomes more informative, assigning lower probabilities to related words. Our findings highlight the importance of considering contextual constraint effects when studying word prediction in these models, and highlight possible parallels with human processing.

Submitted 6 October, 2020; originally announced October 2020.

Comments: Accepted for publication in Findings of ACL: EMNLP 2020

arXiv:2010.01666 [pdf, other]

Multi-Modal Retrieval using Graph Neural Networks

Authors: Aashish Kumar Misraa, Ajinkya Kale, Pranav Aggarwal, Ali Aminian

Abstract: Most real world applications of image retrieval such as Adobe Stock, which is a marketplace for stock photography and illustrations, need a way for users to find images which are both visually (i.e. aesthetically) and conceptually (i.e. containing the same salient objects) as a query image. Learning visual-semantic representations from images is a well studied problem for image retrieval. Filtering based on image concepts or attributes is traditionally achieved with index-based filtering (e.g. on textual tags) or by re-ranking after an initial visual embedding based retrieval. In this paper, we learn a joint vision and concept embedding in the same high-dimensional space. This joint model gives the user fine-grained control over the semantics of the result set, allowing them to explore the catalog of images more rapidly. We model the visual and concept relationships as a graph structure, which captures the rich information through node neighborhood. This graph structure helps us learn multi-modal node embeddings using Graph Neural Networks. We also introduce a novel inference time control, based on selective neighborhood connectivity allowing the user control over the retrieval algorithm. We evaluate these multi-modal embeddings quantitatively on the downstream relevance task of image retrieval on MS-COCO dataset and qualitatively on MS-COCO and an Adobe Stock dataset.

Submitted 4 October, 2020; originally announced October 2020.

arXiv:1809.00367 [pdf, ps, other]

doi 10.2514/1.G003541

Momentum Model-based Minimal Parameter Identification of a Space Robot

Authors: B. Naveen, Suril V. Shah, Arun K. Misra

Abstract: Accurate information of inertial parameters is critical to motion planning and control of space robots. Before the launch, only a rudimentary estimate of the inertial parameters is available from experiments and computer-aided design (CAD) models. After the launch, on-orbit operations substantially alter the value of inertial parameters. In this work, we propose a new momentum model-based method for identifying the minimal parameters of a space robot while on orbit. Minimal parameters are combinations of the inertial parameters of the links and uniquely define the momentum and dynamic models. Consequently, they are sufficient for motion planning and control of both the satellite and robotic arms mounted on it. The key to the proposed framework is the unique formulation of momentum model in the linear form of minimal parameters. Further, to estimate the minimal parameters, we propose a novel joint trajectory planning and optimization technique based on direction combinations of joints' velocity. The efficacy of the identification framework is demonstrated on a 12 degrees-of-freedom, spatial, dual-arm space robot. The methodology is developed for tree-type space robots, requires just the pose and twist data, and scalable with increasing number of joints.

Submitted 2 September, 2018; originally announced September 2018.

Comments: Accepted for publication in AIAA Journal of Guidance, Control, and Dynamics

arXiv:1509.04618 [pdf]

Cost Efficient Design of Reversible Adder Circuits for Low Power Applications

Authors: Neeraj Kumar Misra, Mukesh Kumar Kushwaha, Subodh Wairya, Amit Kumar

Abstract: A large amount of research is currently going on in the field of reversible logic, whose low heat dissipation and low power consumption are the main reasons for applying it in digital VLSI circuit design. This paper introduces a reversible gate named the Inventive0 gate. The novel gate is used to synthesize efficient adder modules with minimum garbage output and gate count. The Inventive0 gate is capable of implementing a 4-bit ripple carry adder and carry skip adders. It is shown that the Inventive0 gate is a much more efficient and optimized approach compared to existing designs, in terms of gate count, garbage outputs and constant inputs. In addition, some popular available reversible gates are implemented in MOS transistor design; the implementations aim for minimum MOS transistor count and are completely reversible in behavior, that is, capable of both forward and backward computation. The lower architectural complexity shows that the novel designs are compact, fast, and low power.

Submitted 11 September, 2015; originally announced September 2015.

Comments: 9 pages, 12 figures, journal

arXiv:1509.04328 [pdf]

doi 10.5121/vlsic.2014.5502

Evolution of structure of some binary group based n bit comparator, n-to-2n decoder by reversible technique

Authors: Neeraj Kumar Misra, Subodh Wairya, Vinod Kumar Singh

Abstract: Reversible logic has attracted substantial interest due to its low power consumption, which is the main concern of low power VLSI circuit design. In this paper, a novel 4x4 reversible gate called the inventive gate is introduced, and using this gate 1-bit, 2-bit, 8-bit, 32-bit and n-bit group-based reversible comparators are constructed with low values of the reversible parameters. The MOS transistor realizations of the 1-bit, 2-bit, and 8-bit reversible comparators are also presented, along with their power, delay and power delay product (PDP) at an appropriate aspect ratio W/L. The novel inventive gate can also be used as an n-to-2n decoder. The proposed novel reversible circuit design styles are compared with the existing ones. The comparative results show that the novel reversible gate has wide utility and that the group-based reversible comparator outperforms the present design style in terms of number of gates, garbage outputs and constant inputs.

Submitted 11 September, 2015; originally announced September 2015.

Comments: 22 pages, 19 figures, journal

Journal ref: International Journal of VLSI design & Communication Systems (VLSICS) Vol.5, No.5, October 2014

arXiv:1509.04240 [pdf]

doi 10.5121/vlsic.2015.6401

Feasible methodology for optimization of a novel reversible binary compressor

Authors: Neeraj Kumar Misra, Mukesh Kumar Kushwaha, Subodh Wairya, Amit Kumar

Abstract: Nowadays, reversible logic is an attractive research area due to its low power consumption in VLSI circuit design. The reversible logic gate is utilized to optimize power consumption through its ability to retrieve the input logic from the output logic, owing to the bijective mapping between input and output. In this manuscript, we design 4:2 and 5:2 reversible compressor circuits using a new type of reversible gate. In addition, we propose a new gate, named the inventive0 gate, for optimizing a compressor circuit. The utility of the inventive0 gate is that it can be used as a full adder and full subtractor with low values of garbage outputs and quantum cost. An algorithm is shown for designing a compressor structure. The comparative study shows that the proposed compressor structure outperforms the existing ones in terms of garbage outputs, number of gates and quantum cost. The compressor can reduce the effect of the carry (produced from the full adder) in the arithmetic frame design. In addition, we implement a basic reversible gate in MOS transistors with a lower MOS transistor count.

Submitted 11 September, 2015; originally announced September 2015.

Comments: 13 pages, 9 figures

Journal ref: International Journal of VLSI design & Communication Systems (VLSICS) Vol.6, No.4, August 2015

arXiv:1412.0691 [pdf, other]

RoboBrain: Large-Scale Knowledge Engine for Robots

Authors: Ashutosh Saxena, Ashesh Jain, Ozan Sener, Aditya Jami, Dipendra K. Misra, Hema S. Koppula

Abstract: In this paper we introduce a knowledge engine, which learns and shares knowledge representations, for robots to carry out a variety of tasks. Building such an engine brings with it the challenge of dealing with multiple data modalities including symbols, natural language, haptic senses, robot trajectories, visual features and many others. The knowledge stored in the engine comes from multiple sources including physical interactions that robots have while performing tasks (perception, planning and control), knowledge bases from the Internet and learned representations from several robotics research groups. We discuss various technical aspects and associated challenges such as modeling the correctness of knowledge, inferring latent information and formulating different robotic tasks as queries to the knowledge engine. We describe the system architecture and how it supports different mechanisms for users and robots to interact with the engine. Finally, we demonstrate its use in three important research areas: grounding natural language, perception, and planning, which are the key building blocks for many robotic tasks. This knowledge engine is a collaborative effort and we call it RoboBrain.

Submitted 12 April, 2015; v1 submitted 1 December, 2014; originally announced December 2014.

Comments: 10 pages, 9 figures

arXiv:1001.2270 [pdf]

An Improved Approach to High Level Privacy Preserving Itemset Mining

Authors: Rajesh Kumar Boora, Ruchi Shukla, A. K. Misra

Abstract: Privacy preserving association rule mining has triggered the development of many privacy preserving data mining techniques. A large fraction of them use randomized data distortion techniques to mask the data for preserving privacy. This paper proposes a new transaction randomization method which is a combination of the fake transaction randomization method and a new per-transaction randomization method. This method distorts the items within each transaction and ensures a higher level of data privacy in comparison to the previous approaches. The per-transaction randomization method involves a randomization function that replaces each item with a random number, guaranteeing privacy within the transaction as well. A tool has also been developed to implement the proposed approach to mine frequent itemsets and association rules from the data, guaranteeing the antimonotonic property.

Submitted 13 January, 2010; originally announced January 2010.

Comments: 8 pages IEEE format, International Journal of Computer Science and Information Security, IJCSIS December 2009, ISSN 1947 5500, http://sites.google.com/site/ijcsis/

Report number: Volume 6, No. 3, ISSN 1947 5500

Journal ref: International Journal of Computer Science and Information Security, IJCSIS, Vol. 6, No. 3, pp. 216-223, December 2009, USA

Showing 1–26 of 26 results for author: Misra, K