Search | arXiv e-print repository

Condensed Sample-Guided Model Inversion for Knowledge Distillation

Authors: Kuluhan Binici, Shivam Aggarwal, Cihan Acar, Nam Trung Pham, Karianto Leman, Gim Hee Lee, Tulika Mitra

Abstract: Knowledge distillation (KD) is a key element in neural network compression that allows knowledge transfer from a pre-trained teacher model to a more compact student model. KD relies on access to the training dataset, which may not always be fully available due to privacy concerns or logistical issues related to the size of the data. To address this, "data-free" KD methods use synthetic data, generated through model inversion, to mimic the target data distribution. However, conventional model inversion methods are not designed to utilize supplementary information from the target dataset, and thus, cannot leverage it to improve performance, even when it is available. In this paper, we consider condensed samples, as a form of supplementary information, and introduce a method for using them to better approximate the target data distribution, thereby enhancing the KD performance. Our approach is versatile, evidenced by improvements of up to 11.4% in KD accuracy across various datasets and model inversion-based methods. Importantly, it remains effective even when using as few as one condensed sample per class, and can also enhance performance in few-shot scenarios where only limited real data samples are available.

Submitted 25 August, 2024; originally announced August 2024.
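
For readers unfamiliar with the distillation objective these abstracts refer to, here is a minimal sketch of the widely used temperature-scaled KD loss (KL divergence between softened teacher and student outputs). It is a generic illustration, not the method of the paper above; the function names `softmax` and `kd_loss` and the default temperature are assumptions chosen for the example.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(teacher_logits, student_logits, temperature=4.0):
    """KL(teacher || student) on softened outputs, scaled by temperature squared.

    Hypothetical helper for illustration; the T^2 factor is the common
    convention for keeping gradient magnitudes comparable across temperatures.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2
```

The loss is zero when the student reproduces the teacher's logits and grows as the two softened distributions diverge. In data-free settings the inputs that produce these logits would be synthetic samples rather than real training data.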

arXiv:2408.06201 [pdf, other]

doi 10.1145/3687041

Investigating Characteristics of Media Recommendation Solicitation in r/ifyoulikeblank

Authors: Md Momen Bhuiyan, Donghan Hu, Andrew Jelson, Tanushree Mitra, Sang Won Lee

Abstract: Despite the existence of search-based recommender systems like Google, Netflix, and Spotify, online users sometimes may turn to crowdsourced recommendations in places like the r/ifyoulikeblank subreddit. In this exploratory study, we probe why users go to r/ifyoulikeblank, how they look for recommendations, and how the subreddit users respond to recommendation requests. To answer these questions, we collected sample posts from r/ifyoulikeblank and analyzed them using a qualitative approach. Our analysis reveals that users come to this subreddit for various reasons, such as exhausting popular search systems, not knowing what or how to search for an item, and believing that the crowd has better knowledge than search systems. Examining users' queries and their descriptions, we found novel information that users provide when seeking recommendations on r/ifyoulikeblank. For example, they sometimes ask for artifact recommendations based on the tools used to create them. Or, sometimes indicating a recommendation seeker's time constraints can help better suit recommendations to their needs. Finally, recommendation responses and interactions revealed patterns of how requesters and responders refine queries and recommendations. Our work informs future intelligent recommender systems design.

Submitted 12 August, 2024; originally announced August 2024.

Comments: page 23

arXiv:2408.01962 [pdf, other]

The Implications of Open Generative Models in Human-Centered Data Science Work: A Case Study with Fact-Checking Organizations

Authors: Robert Wolfe, Tanushree Mitra

Abstract: Calls to use open generative language models in academic research have highlighted the need for reproducibility and transparency in scientific research. However, the impact of generative AI extends well beyond academia, as corporations and public interest organizations have begun integrating these models into their data science pipelines. We expand this lens to include the impact of open models on organizations, focusing specifically on fact-checking organizations, which use AI to observe and analyze large volumes of circulating misinformation, yet must also ensure the reproducibility and impartiality of their work. We wanted to understand where fact-checking organizations use open models in their data science pipelines; what motivates their use of open models or proprietary models; and how their use of open or proprietary models can inform research on the societal impact of generative AI. To answer these questions, we conducted an interview study with N=24 professionals at 20 fact-checking organizations on six continents. Based on these interviews, we offer a five-component conceptual model of where fact-checking organizations employ generative AI to support or automate parts of their data science pipeline, including Data Ingestion, Data Analysis, Data Retrieval, Data Delivery, and Data Sharing.
We then provide taxonomies of fact-checking organizations' motivations for using open models and the limitations that prevent them from further adopting open models, finding that they prefer open models for Organizational Autonomy, Data Privacy and Ownership, Application Specificity, and Capability Transparency. However, they nonetheless use proprietary models due to perceived advantages in Performance, Usability, and Safety, as well as Opportunity Costs related to participation in emerging generative AI ecosystems. Our work provides a novel perspective on open models in data-driven organizations.

Submitted 4 August, 2024; originally announced August 2024.

Comments: Accepted at Artificial Intelligence, Ethics, and Society 2024

arXiv:2407.16040 [pdf, other]

Generalizing Teacher Networks for Effective Knowledge Distillation Across Student Architectures

Authors: Kuluhan Binici, Weiming Wu, Tulika Mitra

Abstract: Knowledge distillation (KD) is a model compression method that entails training a compact student model to emulate the performance of a more complex teacher model. However, the architectural capacity gap between the two models limits the effectiveness of knowledge transfer. Addressing this issue, previous works focused on customizing teacher-student pairs to improve compatibility, a computationally expensive process that needs to be repeated every time either model changes. Hence, these methods are impractical when a teacher model has to be compressed into different student models for deployment on multiple hardware devices with distinct resource constraints. In this work, we propose Generic Teacher Network (GTN), a one-off KD-aware training to create a generic teacher capable of effectively transferring knowledge to any student model sampled from a given finite pool of architectures. To this end, we represent the student pool as a weight-sharing supernet and condition our generic teacher to align with the capacities of various student architectures sampled from this supernet. Experimental evaluation shows that our method both improves overall KD effectiveness and amortizes the minimal additional training cost of the generic teacher across students in the pool.

Submitted 22 July, 2024; originally announced July 2024.

Comments: Accepted by BMVC-24

arXiv:2407.02472 [pdf, other]

ValueScope: Unveiling Implicit Norms and Values via Return Potential Model of Social Interactions

Authors: Chan Young Park, Shuyue Stella Li, Hayoung Jung, Svitlana Volkova, Tanushree Mitra, David Jurgens, Yulia Tsvetkov

Abstract: This study introduces ValueScope, a framework leveraging language models to quantify social norms and values within online communities, grounded in social science perspectives on normative structures. We employ ValueScope to dissect and analyze linguistic and stylistic expressions across 13 Reddit communities categorized under gender, politics, science, and finance. Our analysis provides a quantitative foundation showing that even closely related communities exhibit remarkably diverse norms. This diversity supports existing theories and adds a new dimension--community preference--to understanding community interactions. ValueScope not only delineates differing social norms among communities but also effectively traces their evolution and the influence of significant external events like the U.S. presidential elections and the emergence of new sub-communities. The framework thus highlights the pivotal role of social norms in shaping online interactions, presenting a substantial advance in both the theory and application of social norm studies in digital spaces.

Submitted 2 July, 2024; originally announced July 2024.

Comments: First three authors contributed equally. 33 pages. In submission

arXiv:2406.06543 [pdf, other]

SparrowSNN: A Hardware/software Co-design for Energy Efficient ECG Classification

Authors: Zhanglu Yan, Zhenyu Bai, Tulika Mitra, Weng-Fai Wong

Abstract: Heart disease is one of the leading causes of death worldwide. Given its high risk and often asymptomatic nature, real-time continuous monitoring is essential. Unlike traditional artificial neural networks (ANNs), spiking neural networks (SNNs) are well-known for their energy efficiency, making them ideal for wearable devices and energy-constrained edge computing platforms. However, current energy measurements of SNN implementations for detecting heart diseases typically rely on empirical values, often overlooking hardware overhead. Additionally, the integer and fire activations in SNNs require multiple memory accesses and repeated computations, which can further compromise energy efficiency. In this paper, we propose sparrowSNN, a redesign of the standard SNN workflow from a hardware perspective, and present a dedicated ASIC design for SNNs, optimized for ultra-low power wearable devices used in heartbeat classification. Using the MIT-BIH dataset, our SNN achieves a state-of-the-art accuracy of 98.29% for SNNs, with energy consumption of 31.39nJ per inference and power usage of 6.1uW, giving sparrowSNN the highest accuracy and the lowest energy use among comparable systems. We also compare the energy-to-accuracy trade-offs between SNNs and quantized ANNs, offering insights and recommendations on how best to use SNNs.

Submitted 6 May, 2024; originally announced June 2024.

arXiv:2405.17025 [pdf, other]

SWAT: Scalable and Efficient Window Attention-based Transformers Acceleration on FPGAs

Authors: Zhenyu Bai, Pranav Dangi, Huize Li, Tulika Mitra

Abstract: Efficiently supporting long context length is crucial for Transformer models. The quadratic complexity of the self-attention computation plagues traditional Transformers. Sliding window-based static sparse attention mitigates the problem by limiting the attention scope of the input tokens, reducing the theoretical complexity from quadratic to linear. Although the sparsity induced by window attention is highly structured, it does not align perfectly with the microarchitecture of the conventional accelerators, leading to suboptimal implementation. In response, we propose a dataflow-aware FPGA-based accelerator design, SWAT, that efficiently leverages the sparsity to achieve scalable performance for long inputs. The proposed microarchitecture is based on a design that maximizes data reuse by using a combination of row-wise dataflow, kernel fusion optimization, and an input-stationary design considering the distributed memory and computation resources of FPGA. Consequently, it achieves up to 22$\times$ and 5.7$\times$ improvement in latency and energy efficiency compared to the baseline FPGA-based accelerator and 15$\times$ energy efficiency compared to a GPU-based solution.

Submitted 27 May, 2024; originally announced May 2024.

Comments: Accepted paper for DAC'22
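
The quadratic-to-linear claim in the abstract above can be checked by counting query-key pairs. The sketch below assumes a symmetric window of `window` positions on each side of every token, which is one common formulation and not necessarily the exact pattern SWAT accelerates; `window_pairs` and `full_pairs` are hypothetical helper names.

```python
def window_pairs(seq_len, window):
    """Count (query, key) pairs when each query attends only to keys
    within `window` positions on either side (symmetric window assumed)."""
    count = 0
    for q in range(seq_len):
        lo = max(0, q - window)            # clip the window at the sequence start
        hi = min(seq_len - 1, q + window)  # and at the sequence end
        count += hi - lo + 1
    return count

def full_pairs(seq_len):
    """Count (query, key) pairs for dense self-attention."""
    return seq_len * seq_len
```

For a fixed window, doubling the sequence length roughly doubles `window_pairs` while it quadruples `full_pairs`, which is the scaling gap that sparse-attention accelerators aim to exploit.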

arXiv:2405.15985 [pdf, other]

doi 10.1145/3630106.3658987

The Impact and Opportunities of Generative AI in Fact-Checking

Authors: Robert Wolfe, Tanushree Mitra

Abstract: Generative AI appears poised to transform white collar professions, with more than 90% of Fortune 500 companies using OpenAI's flagship GPT models, which have been characterized as "general purpose technologies" capable of effecting epochal changes in the economy. But how will such technologies impact organizations whose job is to verify and report factual information, and to ensure the health of the information ecosystem? To investigate this question, we conducted 30 interviews with N=38 participants working at 29 fact-checking organizations across six continents, asking about how they use generative AI and the opportunities and challenges they see in the technology. We found that uses of generative AI envisioned by fact-checkers differ based on organizational infrastructure, with applications for quality assurance in Editing, for trend analysis in Investigation, and for information literacy in Advocacy. We used the TOE framework to describe participant concerns ranging from the Technological (lack of transparency), to the Organizational (resource constraints), to the Environmental (uncertain and evolving policy). Building on the insights of our participants, we describe value tensions between fact-checking and generative AI, and propose a novel Verification dimension to the design space of generative models for information verification work. Finally, we outline an agenda for fairness, accountability, and transparency research to support the responsible use of generative AI in fact-checking.
Throughout, we highlight the importance of human infrastructure and labor in producing verified information in collaboration with AI. We expect that this work will not only inform the scientific literature on fact-checking, but also contribute to the understanding of organizational adaptation to a powerful but unreliable new technology.

Submitted 24 May, 2024; originally announced May 2024.

Comments: To be published at the ACM Conference on Fairness, Accountability, and Transparency (FAccT) 2024

arXiv:2405.05378 [pdf, other]

"They are uncultured": Unveiling Covert Harms and Social Threats in LLM Generated Conversations

Authors: Preetam Prabhu Srikar Dammu, Hayoung Jung, Anjali Singh, Monojit Choudhury, Tanushree Mitra

Abstract: Large language models (LLMs) have emerged as an integral part of modern societies, powering user-facing applications such as personal assistants and enterprise applications like recruitment tools. Despite their utility, research indicates that LLMs perpetuate systemic biases. Yet, prior works on LLM harms predominantly focus on Western concepts like race and gender, often overlooking cultural concepts from other parts of the world. Additionally, these studies typically investigate "harm" as a singular dimension, ignoring the various and subtle forms in which harms manifest. To address this gap, we introduce the Covert Harms and Social Threats (CHAST), a set of seven metrics grounded in social science literature. We utilize evaluation models aligned with human assessments to examine the presence of covert harms in LLM-generated conversations, particularly in the context of recruitment. Our experiments reveal that seven out of the eight LLMs included in this study generated conversations riddled with CHAST, characterized by malign views expressed in seemingly neutral language unlikely to be detected by existing methods. Notably, these LLMs manifested more extreme views and opinions when dealing with non-Western concepts like caste, compared to Western ones such as race.

Submitted 8 May, 2024; originally announced May 2024.

arXiv:2402.17218 [pdf, other]

doi 10.1145/3613904.3642490

Viblio: Introducing Credibility Signals and Citations to Video-Sharing Platforms

Authors: Emelia Hughes, Renee Wang, Prerna Juneja, Tony Li, Tanu Mitra, Amy Zhang

Abstract: As more users turn to video-sharing platforms like YouTube as an information source, they may consume misinformation despite their best efforts. In this work, we investigate ways that users can better assess the credibility of videos by first exploring how users currently determine credibility using existing signals on platforms and then by introducing and evaluating new credibility-based signals. We conducted 12 contextual inquiry interviews with YouTube users, determining that participants used a combination of existing signals, such as the channel name, the production quality, and prior knowledge, to evaluate credibility, yet sometimes stumbled in their efforts to do so. We then developed Viblio, a prototype system that enables YouTube users to view and add citations and related information while watching a video based on our participants' needs. From an evaluation with 12 people, all participants found Viblio to be intuitive and useful in the process of evaluating a video's credibility and could see themselves using Viblio in the future.

Submitted 27 February, 2024; originally announced February 2024.

arXiv:2401.03533 [pdf, other]

Characterizing Political Campaigning with Lexical Mutants on Indian Social Media

Authors: Shruti Phadke, Tanushree Mitra

Abstract: Online platforms are increasingly becoming popular arenas of political amplification in India. With known instances of pre-organized coordinated operations, researchers are questioning the legitimacy of political expression and its consequences on the democratic processes in India. In this paper, we study an evolved form of political amplification by first identifying and then characterizing political campaigns with lexical mutations. By lexical mutation, we mean content that is reframed, paraphrased, or altered while preserving the same underlying message. Using multilingual embeddings and network analysis, we detect over 3.8K political campaigns with text mutations spanning multiple languages and social media platforms in India. By further assessing the political leanings of accounts repeatedly involved in such amplification campaigns, we contribute a broader understanding of how political amplification is used across various political parties in India. Moreover, our temporal analysis of the largest amplification campaigns suggests that political campaigning can evolve as temporally ordered arguments and counter-arguments between groups with competing political interests. Overall, our work contributes insights into how lexical mutations can be leveraged to bypass the platform manipulation policies and how such competing campaigning can provide an exaggerated sense of political divide on Indian social media.

Submitted 7 January, 2024; originally announced January 2024.

Journal ref: THE 18TH INTERNATIONAL AAAI CONFERENCE ICWSM 2024

arXiv:2311.14272 [pdf, other]

CRISP: Hybrid Structured Sparsity for Class-aware Model Pruning

Authors: Shivam Aggarwal, Kuluhan Binici, Tulika Mitra

Abstract: Machine learning pipelines for classification tasks often train a universal model to achieve accuracy across a broad range of classes. However, a typical user encounters only a limited selection of classes regularly. This disparity provides an opportunity to enhance computational efficiency by tailoring models to focus on user-specific classes. Existing works rely on unstructured pruning, which introduces randomly distributed non-zero values in the model, making it unsuitable for hardware acceleration. Alternatively, some approaches employ structured pruning, such as channel pruning, but these tend to provide only minimal compression and may lead to reduced model accuracy. In this work, we propose CRISP, a novel pruning framework leveraging a hybrid structured sparsity pattern that combines both fine-grained N:M structured sparsity and coarse-grained block sparsity. Our pruning strategy is guided by a gradient-based class-aware saliency score, allowing us to retain weights crucial for user-specific classes. CRISP achieves high accuracy with minimal memory consumption for popular models like ResNet-50, VGG-16, and MobileNetV2 on ImageNet and CIFAR-100 datasets. Moreover, CRISP delivers up to 14$\times$ reduction in latency and energy consumption compared to existing pruning methods while maintaining comparable accuracy. Our code is available at https://github.com/shivmgg/CRISP/.

Submitted 18 March, 2024; v1 submitted 23 November, 2023; originally announced November 2023.

Comments: 6 pages, accepted in Design, Automation & Test in Europe Conference & Exhibition (DATE) 2024
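
As a rough illustration of the fine-grained N:M structured sparsity mentioned in the CRISP abstract, the sketch below keeps the N largest-magnitude weights in every consecutive group of M and zeroes the rest. Plain magnitude is used here as a stand-in saliency score; CRISP itself describes a gradient-based class-aware score, which is not reproduced. The function name `prune_n_m` and the 2:4 default are assumptions for the example.

```python
def prune_n_m(weights, n=2, m=4):
    """Zero all but the n largest-magnitude weights in each consecutive group of m.

    Magnitude is an illustrative stand-in for a saliency score.
    """
    if len(weights) % m != 0:
        raise ValueError("length of weights must be a multiple of m")
    pruned = []
    for start in range(0, len(weights), m):
        group = weights[start:start + m]
        # indices of the n entries with the largest absolute value
        keep = set(sorted(range(m), key=lambda i: abs(group[i]), reverse=True)[:n])
        pruned.extend(w if i in keep else 0.0 for i, w in enumerate(group))
    return pruned
```

Because every group of M holds exactly N non-zeros, the positions of the survivors can be encoded compactly, which is what makes this pattern friendlier to hardware than unstructured pruning.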

arXiv:2311.12359 [pdf, other]

Shedding the Bits: Pushing the Boundaries of Quantization with Minifloats on FPGAs

Authors: Shivam Aggarwal, Hans Jakob Damsgaard, Alessandro Pappalardo, Giuseppe Franco, Thomas B. Preußer, Michaela Blott, Tulika Mitra

Abstract: Post-training quantization (PTQ) is a powerful technique for model compression, reducing the numerical precision in neural networks without additional training overhead. Recent works have investigated adopting 8-bit floating-point formats (FP8) in the context of PTQ for model inference. However, floating-point formats smaller than 8 bits and their relative comparison in terms of accuracy-hardware cost with integers remain unexplored on FPGAs. In this work, we present minifloats, which are reduced-precision floating-point formats capable of further reducing the memory footprint, latency, and energy cost of a model while approaching full-precision model accuracy. We implement a custom FPGA-based multiply-accumulate operator library and explore the vast design space, comparing minifloat and integer representations across 3 to 8 bits for both weights and activations. We also examine the applicability of various integer-based quantization techniques to minifloats. Our experiments show that minifloats offer a promising alternative for emerging workloads such as vision transformers.

Submitted 5 July, 2024; v1 submitted 21 November, 2023; originally announced November 2023.

Comments: Accepted in FPL (International Conference on Field-Programmable Logic and Applications) 2024 conference. Revised with updated results
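
To make the idea of a sub-8-bit floating-point format concrete, the sketch below enumerates the values of a sign/exponent/mantissa minifloat and rounds a number to the nearest one. The encoding details are assumptions for illustration only: an IEEE-style bias of 2^(exp_bits-1) - 1, subnormals at exponent field 0, and no codes reserved for infinity or NaN. The formats evaluated in the paper may differ, and `minifloat_values` and `quantize` are hypothetical names.

```python
def minifloat_values(exp_bits, man_bits):
    """Enumerate all values of a 1-sign / exp_bits / man_bits minifloat.

    Assumed encoding: IEEE-style bias, subnormals when the exponent
    field is 0, and no infinity or NaN codes.
    """
    bias = 2 ** (exp_bits - 1) - 1
    positives = set()
    for e in range(2 ** exp_bits):
        for mant in range(2 ** man_bits):
            frac = mant / 2 ** man_bits
            if e == 0:
                value = frac * 2.0 ** (1 - bias)        # subnormal: no implicit 1
            else:
                value = (1 + frac) * 2.0 ** (e - bias)  # normal: implicit leading 1
            positives.add(value)
    return sorted({-v for v in positives} | positives)

def quantize(x, grid):
    """Round x to the nearest representable value, saturating at the extremes."""
    return min(grid, key=lambda v: abs(v - x))
```

With `exp_bits=2` and `man_bits=1` (4 bits including the sign), the non-negative grid is 0, 0.5, 1, 1.5, 2, 3, 4, 6: spacing widens with magnitude, unlike a uniform integer grid of the same width.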

arXiv:2311.03826 [pdf, other]

Accelerating Unstructured SpGEMM using Structured In-situ Computing

Authors: Huize Li, Tulika Mitra

Abstract: Sparse matrix-matrix multiplication (SpGEMM) is a critical kernel widely employed in machine learning and graph algorithms. However, real-world matrices' high sparsity makes SpGEMM memory-intensive. In-situ computing offers the potential to accelerate memory-intensive applications through high bandwidth and parallelism. Nevertheless, the irregular distribution of non-zeros renders SpGEMM a typical case of unstructured software. In contrast, in-situ computing platforms follow a fixed calculation manner, making them structured hardware. The mismatch between unstructured software and structured hardware leads to sub-optimal performance of current solutions. In this paper, we propose SPLIM, a novel in-situ computing SpGEMM accelerator. SPLIM involves two innovations. First, we present a novel computation paradigm that converts SpGEMM into structured in-situ multiplication and unstructured accumulation. Second, we develop a unique coordinates alignment method utilizing in-situ search operations, effectively transforming unstructured accumulation into highly parallel search operations. Our experimental results demonstrate that SPLIM achieves 275.74$\times$ performance improvement and 687.19$\times$ energy saving compared to NVIDIA RTX A6000 GPU.

Submitted 7 November, 2023; originally announced November 2023.

arXiv:2309.11071 [pdf, other]

InkStream: Real-time GNN Inference on Streaming Graphs via Incremental Update

Authors: Dan Wu, Zhaoying Li, Tulika Mitra

Abstract: Classic Graph Neural Network (GNN) inference approaches, designed for static graphs, are ill-suited for streaming graphs that evolve with time. The dynamism intrinsic to streaming graphs necessitates constant updates, posing unique challenges to acceleration on GPU. We address these challenges based on two key insights: (1) Inside the $k$-hop neighborhood, a significant fraction of the nodes is not impacted by the modified edges when the model uses min or max as aggregation function; (2) When the model weights remain static while the graph structure changes, node embeddings can incrementally evolve over time by computing only the impacted part of the neighborhood. With these insights, we propose a novel method, InkStream, designed for real-time inference with minimal memory access and computation, while ensuring an identical output to conventional methods. InkStream operates on the principle of propagating and fetching data only when necessary. It uses an event-based system to control inter-layer effect propagation and intra-layer incremental updates of node embedding. InkStream is highly extensible and easily configurable by allowing users to create and process customized events. We showcase that less than 10 lines of additional user code are needed to support popular GNN models such as GCN, GraphSAGE, and GIN.
Our experiments with three GNN models on four large graphs demonstrate that InkStream accelerates by 2.5-427$\times$ on a CPU cluster and 2.4-343$\times$ on two different GPU clusters while producing outputs identical to GNN model inference on the latest graph snapshot.

Submitted 20 September, 2023; originally announced September 2023.
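
Insight (1) in the InkStream abstract can be illustrated with a toy max aggregation over scalar neighbor features: many edge changes leave the aggregated value untouched, so nothing needs to propagate further. This is a simplified sketch with hypothetical helper names, not InkStream's event system; it assumes scalar features and at least one remaining neighbor after a removal.

```python
def add_neighbor(current_max, new_value):
    """Edge insertion under max aggregation.

    Returns (updated_max, changed). If the new neighbor does not exceed
    the current max, the aggregate is unaffected and no update is needed.
    """
    if new_value <= current_max:
        return current_max, False
    return new_value, True

def remove_neighbor(current_max, removed_value, remaining_values):
    """Edge deletion under max aggregation.

    Only when the removed neighbor held the max does the aggregate need
    recomputing from the remaining neighbors (assumed non-empty).
    """
    if removed_value < current_max:
        return current_max, False
    new_max = max(remaining_values)
    return new_max, new_max != current_max
```

When `changed` is False, the node's embedding is unchanged and its downstream neighbors can be skipped, which is the source of the savings the abstract attributes to min and max aggregation.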

arXiv:2309.10623 [pdf, other]

Flip: Data-Centric Edge CGRA Accelerator

Authors: Dan Wu, Peng Chen, Thilini Kaushalya Bandara, Zhaoying Li, Tulika Mitra

Abstract: Coarse-Grained Reconfigurable Arrays (CGRAs) are promising edge accelerators due to their outstanding balance of flexibility, performance, and energy efficiency. Classic CGRAs statically map compute operations onto the processing elements (PEs) and route the data dependencies among the operations through the Network-on-Chip. However, CGRAs are designed for fine-grained static instruction-level parallelism and struggle to accelerate applications with dynamic and irregular data-level parallelism, such as graph processing. To address this limitation, we present Flip, a novel accelerator that enhances traditional CGRA architectures to boost the performance of graph applications. Flip retains the classic CGRA execution model while introducing a special data-centric mode for efficient graph processing. Specifically, it exploits the natural data parallelism of graph algorithms by mapping graph vertices onto PEs rather than the operations, and supporting dynamic routing of temporary data according to the runtime evolution of the graph frontier. Experimental results demonstrate that Flip achieves up to 36$\times$ speedup with merely 19% more area compared to classic CGRAs. Compared to state-of-the-art large-scale graph processors, Flip has similar energy efficiency and 2.2$\times$ better area efficiency at a much-reduced power/area budget.

Submitted 19 September, 2023; originally announced September 2023.

arXiv:2309.06127 [pdf, other]

Accelerating Edge AI with Morpher: An Integrated Design, Compilation and Simulation Framework for CGRAs

Authors: Dhananjaya Wijerathne, Zhaoying Li, Tulika Mitra

Abstract: Coarse-Grained Reconfigurable Arrays (CGRAs) hold great promise as power-efficient edge accelerators, offering versatility beyond AI applications. Morpher, an open-source, architecture-adaptive CGRA design framework, is specifically designed to explore the vast design space of CGRAs. The comprehensive ecosystem of Morpher includes a tailored compiler, simulator, accelerator synthesis, and validation framework. This study provides an overview of Morpher, highlighting its capabilities in automatically compiling AI application kernels onto user-defined CGRA architectures and verifying their functionality. Through the Morpher framework, the versatility of CGRAs is harnessed to facilitate efficient compilation and verification of edge AI applications, covering important kernels representative of a wide range of embedded AI workloads. Morpher is available online at https://github.com/ecolab-nus/morpher-v2.

Submitted 12 September, 2023; originally announced September 2023.

Comments: This work was accepted by the Workshop on Compilers, Deployment, and Tooling for Edge AI (CODAI 2023), co-hosted at Embedded Systems Week on September 21st, 2023

arXiv:2302.08351 [pdf, other]

doi 10.1145/3584741

A Survey on Event-based News Narrative Extraction

Authors: Brian Keith Norambuena, Tanushree Mitra, Chris North

Abstract: Narratives are fundamental to our understanding of the world, providing us with a natural structure for knowledge representation over time. Computational narrative extraction is a subfield of artificial intelligence that makes heavy use of information retrieval and natural language processing techniques. Despite the importance of computational narrative extraction, relatively little scholarly work exists on synthesizing previous research and strategizing future research in the area. In particular, this article focuses on extracting news narratives from an event-centric perspective. Extracting narratives from news data has multiple applications in understanding the evolving information landscape. This survey presents an extensive study of research in the area of event-based news narrative extraction. In particular, we screened over 900 articles that yielded 54 relevant articles. These articles are synthesized and organized by representation model, extraction criteria, and evaluation approaches. Based on the reviewed studies, we identify recent trends, open challenges, and potential research lines.

Submitted 10 March, 2023; v1 submitted 16 February, 2023; originally announced February 2023.

Comments: 37 pages, 3 figures, to be published in the journal ACM CSUR

arXiv:2302.07836 [pdf, other]

doi 10.1145/3544548.3580846

Assessing enactment of content regulation policies: A post hoc crowd-sourced audit of election misinformation on YouTube

Authors: Prerna Juneja, Md Momen Bhuiyan, Tanushree Mitra

Abstract: With the 2022 US midterm elections approaching, conspiratorial claims about the 2020 presidential elections continue to threaten users' trust in the electoral process. To regulate election misinformation, YouTube introduced policies to remove such content from its searches and recommendations. In this paper, we conduct a 9-day crowd-sourced audit on YouTube to assess the extent of enactment of such policies. We recruited 99 users who installed a browser extension that enabled us to collect up-next recommendation trails and search results for 45 videos and 88 search queries about the 2020 elections. We find that YouTube's search results, irrespective of search query bias, contain more videos that oppose rather than support election misinformation. However, watching misinformative election videos still leads users to a small number of misinformative videos in the up-next trails. Our results imply that while YouTube largely seems successful in regulating election misinformation, there is still room for improvement.

Submitted 15 February, 2023; originally announced February 2023.

Comments: 22 pages

ACM Class: H.0

arXiv:2302.06452 [pdf, other]

doi 10.1145/3581641.3584076

Mixed Multi-Model Semantic Interaction for Graph-based Narrative Visualizations

Authors: Brian Keith Norambuena, Tanushree Mitra, Chris North

Abstract: Narrative sensemaking is an essential part of understanding sequential data. Narrative maps are a visual representation model that can assist analysts to understand narratives. In this work, we present a semantic interaction (SI) framework for narrative maps that can support analysts through their sensemaking process. In contrast to traditional SI systems which rely on dimensionality reduction and work on a projection space, our approach has an additional abstraction layer -- the structure space -- that builds upon the projection space and encodes the narrative in a discrete structure. This extra layer introduces additional challenges that must be addressed when integrating SI with the narrative extraction pipeline. We address these challenges by presenting the general concept of Mixed Multi-Model Semantic Interaction (3MSI) -- an SI pipeline, where the highest-level model corresponds to an abstract discrete structure and the lower-level models are continuous. To evaluate the performance of our 3MSI models for narrative maps, we present a quantitative simulation-based evaluation and a qualitative evaluation with case studies and expert feedback. We find that our SI system can model the analysts' intent and support incremental formalism for narrative maps.

Submitted 13 February, 2023; originally announced February 2023.

Comments: 24 pages, 9 figures, IUI 2023

arXiv:2302.04219 [pdf, other]

doi 10.1145/3544548.3581244

NewsComp: Facilitating Diverse News Reading through Comparative Annotation

Authors: Md Momen Bhuiyan, Sang Won Lee, Nitesh Goyal, Tanushree Mitra

Abstract: To support efficient, balanced news consumption, merging articles from diverse sources into one, potentially through crowdsourcing, could alleviate some hurdles. However, the merging process could also impact annotators' attitudes towards the content. To test this theory, we propose comparative news annotation, i.e., annotating similarities and differences between a pair of articles. By developing and deploying NewsComp -- a prototype system -- we conducted a between-subjects experiment (N=109) to examine how users' annotations compare to experts', and how comparative annotation affects users' perceptions of article credibility and quality. We found that comparative annotation can marginally impact users' credibility perceptions in certain cases. While users' annotations were not on par with experts', they showed greater precision in finding similarities than in identifying disparate important statements. The comparison process led users to notice differences in information placement/depth, degree of factuality/opinion, and empathetic/inflammatory language use. We discuss implications for the design of future comparative annotation tasks.

Submitted 8 February, 2023; originally announced February 2023.

Comments: 2023 ACM CHI Conference on Human Factors in Computing Systems, 17 pages

arXiv:2208.04465 [pdf, other]

Characterizing Social Movement Narratives in Online Communities: The 2021 Cuban Protests on Reddit

Authors: Brian Felipe Keith Norambuena, Tanushree Mitra, Chris North

Abstract: Social movements are dominated by storytelling, as narratives play a key role in how communities involved in these movements shape their identities. Thus, recognizing the accepted narratives of different communities is central to understanding social movements. In this context, journalists face the challenge of making sense of these emerging narratives in social media when they seek to report social protests. Thus, they would benefit from support tools that allow them to identify and explore such narratives. In this work, we propose a narrative extraction algorithm from social media that incorporates the concept of community acceptance. Using our method, we study the 2021 Cuban protests and characterize five relevant communities. The extracted narratives differ in both structure and content across communities. Our work has implications in the study of social movements, intelligence analysis, computational journalism, and misinformation research.

Submitted 8 August, 2022; originally announced August 2022.

Comments: 2022 Computation + Journalism Conference (C+J 2022), 6 pages

arXiv:2207.10192 [pdf]

doi 10.1145/3632297

Building Human Values into Recommender Systems: An Interdisciplinary Synthesis

Authors: Jonathan Stray, Alon Halevy, Parisa Assar, Dylan Hadfield-Menell, Craig Boutilier, Amar Ashar, Lex Beattie, Michael Ekstrand, Claire Leibowicz, Connie Moon Sehat, Sara Johansen, Lianne Kerlin, David Vickrey, Spandana Singh, Sanne Vrijenhoek, Amy Zhang, McKane Andrus, Natali Helberger, Polina Proutskova, Tanushree Mitra, Nina Vasan

Abstract: Recommender systems are the algorithms which select, filter, and personalize content across many of the world's largest platforms and apps. As such, their positive and negative effects on individuals and on societies have been extensively theorized and studied. Our overarching question is how to ensure that recommender systems enact the values of the individuals and societies that they serve. Addressing this question in a principled fashion requires technical knowledge of recommender design and operation, and also critically depends on insights from diverse fields including social science, ethics, economics, psychology, policy and law. This paper is a multidisciplinary effort to synthesize theory and practice from different perspectives, with the goal of providing a shared language, articulating current design approaches, and identifying open problems. It is not a comprehensive survey of this large space, but a set of highlights identified by our diverse author cohort. We collect a set of values that seem most relevant to recommender systems operating across different domains, then examine them from the perspectives of current industry practice, measurement, product design, and policy approaches. Important open problems include multi-stakeholder processes for defining values and resolving trade-offs, better values-driven measurements, recommender controls that people use, non-behavioral algorithmic feedback, optimization for long-term outcomes, causal inference of recommender effects, academic-industry research collaborations, and interdisciplinary policy-making.

Submitted 20 July, 2022; originally announced July 2022.

ACM Class: J.4; H.3.3; K.4.2

Journal ref: ACM Trans. Recomm. Syst. 2, 3, Article 20 (September 2024), 57 pages

arXiv:2205.10894 [pdf, other]

Human and technological infrastructures of fact-checking

Authors: Prerna Juneja, Tanushree Mitra

Abstract: Increasing demand for fact-checking has led to a growing interest in developing systems and tools to automate the fact-checking process. However, such systems are limited in practice because their system design often does not take into account how fact-checking is done in the real world and ignores the insights and needs of various stakeholder groups core to the fact-checking process. This paper unpacks the fact-checking process by revealing the infrastructures -- both human and technological -- that support and shape fact-checking work. We interviewed 26 participants belonging to 16 fact-checking teams and organizations with representation from 4 continents. Through these interviews, we describe the human infrastructure of fact-checking by identifying and presenting, in-depth, the roles of six primary stakeholder groups, 1) Editors, 2) External fact-checkers, 3) In-house fact-checkers, 4) Investigators and researchers, 5) Social media managers, and 6) Advocators. Our findings highlight that the fact-checking process is a collaborative effort among various stakeholder groups and associated technological and informational infrastructures. By rendering visibility to the infrastructures, we reveal how fact-checking has evolved to include both short-term claims centric and long-term advocacy centric fact-checking. Our work also identifies key social and technical needs and challenges faced by each stakeholder group. Based on our findings, we suggest that improving the quality of fact-checking requires systematic changes in the civic, informational, and technological contexts.

Submitted 22 May, 2022; originally announced May 2022.

arXiv:2204.10729 [pdf, other]

Pathways through Conspiracy: The Evolution of Conspiracy Radicalization through Engagement in Online Conspiracy Discussions

Authors: Shruti Phadke, Mattia Samory, Tanushree Mitra

Abstract: The disruptive offline mobilization of participants in online conspiracy theory (CT) discussions has highlighted the importance of understanding how online users may form radicalized conspiracy beliefs. While prior work researched the factors leading up to joining online CT discussions and provided theories of how conspiracy beliefs form, we have little understanding of how conspiracy radicalization evolves after users join CT discussion communities. In this paper, we provide the empirical modeling of various radicalization phases in online CT discussion participants. To unpack how conspiracy engagement is related to radicalization, we first characterize the users' journey through CT discussions via conspiracy engagement pathways. Specifically, by studying 36K Reddit users through their 169M contributions, we uncover four distinct pathways of conspiracy engagement: steady high, increasing, decreasing, and steady low. We further model three successive stages of radicalization guided by prior theoretical works. Specific sub-populations of users, namely those on steady high and increasing conspiracy engagement pathways, progress successively through various radicalization stages. In contrast, users on the decreasing engagement pathway show distinct behavior: they limit their CT discussions to specialized topics, participate in diverse discussion groups, and show reduced conformity with conspiracy subreddits. By examining users who disengage from online CT discussions, this paper provides promising insights about the conspiracy recovery process.

Submitted 22 April, 2022; originally announced April 2022.

Journal ref: Proceedings of the International AAAI Conference on Web and Social Media (ICWSM) 2022

arXiv:2202.02479 [pdf, other]

Algorithmic nudge to make better choices: Evaluating effectiveness of XAI frameworks to reveal biases in algorithmic decision making to users

Authors: Prerna Juneja, Tanushree Mitra

Abstract: In this position paper, we propose the use of existing XAI frameworks to design interventions in scenarios where algorithms expose users to problematic content (e.g. anti-vaccine videos). Our intervention design includes facts (to indicate algorithmic justification of what happened) accompanied by either forewarnings or counterfactual explanations. While forewarnings indicate potential risks of an action to users, the counterfactual explanations will indicate what actions users should perform to change the algorithmic outcome. We envision the use of such interventions as 'decision aids' to users which will help them make informed choices.

Submitted 4 February, 2022; originally announced February 2022.

arXiv:2201.11709 [pdf, other]

doi 10.1145/3491102.3502028

OtherTube: Facilitating Content Discovery and Reflection by Exchanging YouTube Recommendations with Strangers

Authors: Md Momen Bhuiyan, Carlos Augusto Bautista Isaza, Tanushree Mitra, Sang Won Lee

Abstract: To promote engagement, recommendation algorithms on platforms like YouTube increasingly personalize users' feeds, limiting users' exposure to diverse content and depriving them of opportunities to reflect on their interests compared to others'. In this work, we investigate how exchanging recommendations with strangers can help users discover new content and reflect. We tested this idea by developing OtherTube -- a browser extension for YouTube that displays strangers' personalized YouTube recommendations. OtherTube allows users to (i) create an anonymized profile for social comparison, (ii) share their recommended videos with others, and (iii) browse strangers' YouTube recommendations. We conducted a 10-day-long user study (n=41) followed by a post-study interview (n=11). Our results reveal that users discovered and developed new interests from seeing OtherTube recommendations. We identified user and content characteristics that affect interaction and engagement with exchanged recommendations; for example, younger users interacted more with OtherTube, while the perceived irrelevance of some content discouraged users from watching certain videos. Users reflected on their interests as well as others', recognizing similarities and differences. Our work shows promise for designs leveraging the exchange of personalized recommendations with strangers.

Submitted 27 January, 2022; originally announced January 2022.

Comments: CHI 2022, 17 pages

arXiv:2201.03019 [pdf, other]

Robust and Resource-Efficient Data-Free Knowledge Distillation by Generative Pseudo Replay

Authors: Kuluhan Binici, Shivam Aggarwal, Nam Trung Pham, Karianto Leman, Tulika Mitra

Abstract: Data-Free Knowledge Distillation (KD) allows knowledge transfer from a trained neural network (teacher) to a more compact one (student) in the absence of original training data. Existing works use a validation set to monitor the accuracy of the student over real data and report the highest performance throughout the entire process. However, validation data may not be available at distillation time either, making it infeasible to record the student snapshot that achieved the peak accuracy. Therefore, a practical data-free KD method should be robust and ideally provide monotonically increasing student accuracy during distillation. This is challenging because the student experiences knowledge degradation due to the distribution shift of the synthetic data. A straightforward approach to overcome this issue is to store and rehearse the generated samples periodically, which increases the memory footprint and creates privacy concerns. We propose to model the distribution of the previously observed synthetic samples with a generative network. In particular, we design a Variational Autoencoder (VAE) with a training objective that is customized to learn the synthetic data representations optimally. The student is rehearsed by the generative pseudo replay technique, with samples produced by the VAE. Hence knowledge degradation can be prevented without storing any samples. Experiments on image classification benchmarks show that our method optimizes the expected value of the distilled model accuracy while eliminating the large memory overhead incurred by the sample-storing methods.

Submitted 29 July, 2024; v1 submitted 9 January, 2022; originally announced January 2022.

Comments: AAAI Conference on Artificial Intelligence

arXiv:2112.12205 [pdf, other]

Design guidelines for narrative maps in sensemaking tasks

Authors: Brian Felipe Keith Norambuena, Tanushree Mitra, Chris North

Abstract: Narrative sensemaking is a fundamental process to understand sequential information. Narrative maps are a visual representation framework that can aid analysts in their narrative sensemaking process. Narrative maps allow analysts to understand the big picture of a narrative, uncover new relationships between events, and model the connection between storylines. We seek to understand how analysts create and use narrative maps in order to obtain design guidelines for an interactive visualization tool for narrative maps that can aid analysts in narrative sensemaking. We perform two experiments with a data set of news articles. The insights extracted from our studies can be used to design narrative maps, extraction algorithms, and visual analytics tools to support the narrative sensemaking process. The contributions of this paper are three-fold: (1) an analysis of how analysts construct narrative maps; (2) a user evaluation of specific narrative map features; and (3) design guidelines for narrative maps. Our findings suggest ways for designing narrative maps and extraction algorithms, as well as providing insights towards useful interactions. We discuss these insights and design guidelines and reflect on the potential challenges involved. As key highlights, we find that narrative maps should avoid redundant connections that can be inferred by using the transitive property of event connections, reducing the overall complexity of the map. Moreover, narrative maps should use multiple types of cognitive connections between events such as topical and causal connections, as this emulates the strategies that analysts use in the narrative sensemaking process.

Submitted 22 December, 2021; originally announced December 2021.

Comments: Accepted paper in SAGE Information Visualization Journal, 20 pages, 8 tables, 9 figures. arXiv admin note: text overlap with arXiv:2108.06035

arXiv:2108.06035 [pdf, other]

Narrative Sensemaking: Strategies for Narrative Maps Construction

Authors: Brian Felipe Keith Norambuena, Tanushree Mitra, Chris North

Abstract: Narrative sensemaking is a fundamental process to understand sequential information. Narrative maps are a visual representation framework that can aid analysts in this process. They allow analysts to understand the big picture of a narrative, uncover new relationships between events, and model connections between storylines. As a sensemaking tool, narrative maps have applications in intelligence analysis, misinformation modeling, and computational journalism. In this work, we seek to understand how analysts construct narrative maps in order to improve narrative map representation and extraction methods. We perform an experiment with a data set of news articles. Our main contribution is an analysis of how analysts construct narrative maps. The insights extracted from our study can be used to design narrative map visualizations, extraction algorithms, and visual analytics tools to support the sensemaking process.

Submitted 1 September, 2021; v1 submitted 12 August, 2021; originally announced August 2021.

Comments: Accepted as a short paper in IEEE VIS 2021; added citation information

arXiv:2108.05698 [pdf, other]

Preventing Catastrophic Forgetting and Distribution Mismatch in Knowledge Distillation via Synthetic Data

Authors: Kuluhan Binici, Nam Trung Pham, Tulika Mitra, Karianto Leman

Abstract: With the increasing popularity of deep learning on edge devices, compressing large neural networks to meet the hardware requirements of resource-constrained devices became a significant research direction. Numerous compression methodologies are currently being used to reduce the memory sizes and energy consumption of neural networks. Knowledge distillation (KD) is among such methodologies and it functions by using data samples to transfer the knowledge captured by a large model (teacher) to a smaller one (student). However, due to various reasons, the original training data might not be accessible at the compression stage. Therefore, data-free model compression is an ongoing research problem that has been addressed by various works. In this paper, we point out that catastrophic forgetting is a problem that can potentially be observed in existing data-free distillation methods. Moreover, the sample generation strategies in some of these methods could result in a mismatch between the synthetic and real data distributions. To prevent such problems, we propose a data-free KD framework that maintains a dynamic collection of generated samples over time. Additionally, we add the constraint of matching the real data distribution in sample generation strategies that target maximum information gain. Our experiments demonstrate that we can improve the accuracy of the student models obtained via KD when compared with state-of-the-art approaches on the SVHN, Fashion MNIST and CIFAR100 datasets.

Submitted 5 November, 2021; v1 submitted 11 August, 2021; originally announced August 2021.

Comments: Accepted by the 2022 Winter Conference on Applications of Computer Vision (WACV 2022)

Journal ref: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2022, pp. 663-671

arXiv:2108.02325 [pdf, other]

doi 10.1145/3479539

Designing Transparency Cues in Online News Platforms to Promote Trust: Journalists' & Consumers' Perspectives

Authors: Md Momen Bhuiyan, Hayden Whitley, Michael Horning, Sang Won Lee, Tanushree Mitra

Abstract: As news organizations embrace transparency practices on their websites to distinguish themselves from those spreading misinformation, HCI designers have the opportunity to help them effectively utilize the ideals of transparency to build trust. How can we utilize transparency to promote trust in news? We examine this question through a qualitative lens by interviewing journalists and news consumers -- the two stakeholders in a news system. We designed a scenario to demonstrate transparency features using two fundamental news attributes that convey the trustworthiness of a news article: source and message. In the interviews, our news consumers expressed the idea that news transparency could be best shown by providing indicators of objectivity in two areas (news selection and framing) and by providing indicators of evidence in four areas (presence of source materials, anonymous sourcing, verification, and corrections upon erroneous reporting). While our journalists agreed with news consumers' suggestions of using evidence indicators, they also suggested additional transparency indicators in areas such as the news reporting process and personal/organizational conflicts of interest. Prompted by our scenario, participants offered new design considerations for building trustworthy news platforms, such as designing for easy comprehension, presenting appropriate details in news articles (e.g., showing the number and nature of corrections made to an article), and comparing attributes across news organizations to highlight diverging practices. Comparing the responses from our two stakeholder groups reveals conflicting suggestions with trade-offs between them. Our study has implications for HCI designers in building trustworthy news systems.

Submitted 20 September, 2021; v1 submitted 4 August, 2021; originally announced August 2021.

Comments: 31 pages, CSCW 2021

arXiv:2108.01536 [pdf, other]

doi 10.1145/3479571

NudgeCred: Supporting News Credibility Assessment on Social Media Through Nudges

Authors: Md Momen Bhuiyan, Michael Horning, Sang Won Lee, Tanushree Mitra

Abstract: Struggling to curb misinformation, social media platforms are experimenting with design interventions to enhance consumption of credible news on their platforms. Some of these interventions, such as the use of warning messages, are examples of nudges -- a choice-preserving technique to steer behavior. Despite their application, we do not know whether nudges could steer people into making conscious news credibility judgments online and if they do, under what constraints. To answer, we combine nudge techniques with heuristic-based information processing to design NudgeCred -- a browser extension for Twitter. NudgeCred directs users' attention to two design cues: authority of a source and other users' collective opinion on a report by activating three design nudges -- Reliable, Questionable, and Unreliable, each denoting particular levels of credibility for news tweets. In a controlled experiment, we found that NudgeCred significantly helped users (n=430) distinguish news tweets' credibility, unrestricted by three behavioral confounds -- political ideology, political cynicism, and media skepticism. A five-day field deployment with twelve participants revealed that NudgeCred improved their recognition of news items and attention towards all of our nudges, particularly towards Questionable. Among other considerations, participants proposed that designers should incorporate heuristics that users would trust. Our work informs nudge-based system design approaches for online media.

Submitted 20 September, 2021; v1 submitted 3 August, 2021; originally announced August 2021.

Comments: 30 pages, CSCW 2021

arXiv:2107.10204 [pdf, other]

Characterizing Social Imaginaries and Self-Disclosures of Dissonance in Online Conspiracy Discussion Communities

Authors: Shruti Phadke, Mattia Samory, Tanushree Mitra

Abstract: Online discussion platforms offer a forum to strengthen and propagate belief in misinformed conspiracy theories. Yet, they also offer avenues for conspiracy theorists to express their doubts and experiences of cognitive dissonance. Such expressions of dissonance may shed light on who abandons misguided beliefs and under which circumstances. This paper characterizes self-disclosures of dissonance about QAnon, a conspiracy theory initiated by a mysterious leader Q and popularized by their followers, anons in conspiracy theory subreddits. To understand what dissonance and disbelief mean within conspiracy communities, we first characterize their social imaginaries, a broad understanding of how people collectively imagine their social existence. Focusing on 2K posts from two image boards, 4chan and 8chan, and 1.2 M comments and posts from 12 subreddits dedicated to QAnon, we adopt a mixed methods approach to uncover the symbolic language representing the movement, expectations, practices, heroes and foes of the QAnon community. We use these social imaginaries to create a computational framework for distinguishing belief and dissonance from general discussion about QAnon. Further, analyzing user engagement with QAnon conspiracy subreddits, we find that self-disclosures of dissonance correlate with a significant decrease in user contributions and ultimately with their departure from the community. We contribute a computational framework for identifying dissonance self-disclosures and measuring the changes in user engagement surrounding dissonance.
Our work can provide insights into designing dissonance-based interventions that can potentially dissuade conspiracists from online conspiracy discussion communities.

Submitted 21 July, 2021; originally announced July 2021.

Comments: Accepted at CSCW 2021

arXiv:2105.08827 [pdf, other]

Educators, Solicitors, Flamers, Motivators, Sympathizers: Characterizing Roles in Online Extremist Movements

Authors: Shruti Phadke, Tanushree Mitra

Abstract: Social media provides the means by which extremist social movements, such as white supremacy and anti LGBTQ, thrive online. Yet, we know little about the roles played by the participants of such movements. In this paper, we investigate these participants to characterize their roles, their role dynamics, and their influence in spreading online extremism. Our participants, online extremist accounts, are 4,876 public Facebook pages or groups that have shared information from the websites of 289 Southern Poverty Law Center designated extremist groups. By clustering the quantitative features followed by qualitative expert validation, we identify five roles surrounding extremist activism: educators, solicitors, flamers, motivators, sympathizers. For example, solicitors use links from extremist websites to attract donations and participation in extremist issues, whereas flamers share inflammatory extremist content inciting anger. We further investigate role dynamics such as, how stable these roles are over time and how likely will extremist accounts transition from one role into another. We find that roles core to the movement, educators and solicitors, are more stable, while flamers and motivators can transition to sympathizers with high probability. We further find that educators and solicitors exert the most influence in triggering extremist link posts, whereas flamers are influential in triggering the spread of information from fake news sources.
Our results help in situating various roles on the trajectory of deeper engagement into the extremist movements and understanding the potential effect of various counter extremism interventions. Our findings have implications for understanding how online extremist movements flourish through participatory activism and how they gain a spectrum of allies for mobilizing extremism online.

Submitted 18 May, 2021; originally announced May 2021.

Comments: Accepted at Computer Supported Cooperative Work (CSCW 2021)

arXiv:2101.08419 [pdf, other]

doi 10.1145/3411764.3445250

Auditing E-Commerce Platforms for Algorithmically Curated Vaccine Misinformation

Authors: Prerna Juneja, Tanushree Mitra

Abstract: There is a growing concern that e-commerce platforms are amplifying vaccine-misinformation. To investigate, we conduct two sets of algorithmic audits for vaccine misinformation on the search and recommendation algorithms of Amazon -- the world's leading e-retailer. First, we systematically audit search-results belonging to vaccine-related search-queries without logging into the platform -- unpersonalized audits. We find 10.47% of search-results promote misinformative health products. We also observe ranking-bias, with Amazon ranking misinformative search-results higher than debunking search-results. Next, we analyze the effects of personalization due to account-history, where history is built progressively by performing various real-world user-actions, such as clicking a product. We find evidence of a filter-bubble effect in Amazon's recommendations; accounts performing actions on misinformative products are presented with more misinformation compared to accounts performing actions on neutral and debunking products. Interestingly, once a user clicks on a misinformative product, homepage recommendations become more contaminated compared to when the user shows an intention to buy that product.

Submitted 29 January, 2021; v1 submitted 20 January, 2021; originally announced January 2021.

Journal ref: CHI Conference on Human Factors in Computing Systems 2021

arXiv:2009.04527 [pdf, other]

What Makes People Join Conspiracy Communities?: Role of Social Factors in Conspiracy Engagement

Authors: Shruti Phadke, Mattia Samory, Tanushree Mitra

Abstract: Widespread conspiracy theories, like those motivating anti-vaccination attitudes or climate change denial, propel collective action and bear society-wide consequences. Yet, empirical research has largely studied conspiracy theory adoption as an individual pursuit, rather than as a socially mediated process. What makes users join communities endorsing and spreading conspiracy theories? We leverage longitudinal data from 56 conspiracy communities on Reddit to compare individual and social factors determining which users join the communities. Using a quasi-experimental approach, we first identify 30K future conspiracists (FC) and 30K matched non-conspiracists (NC). We then provide empirical evidence of the importance of social factors across six dimensions relative to the individual factors by analyzing 6 million Reddit comments and posts. Specifically in social factors, we find that dyadic interactions with members of the conspiracy communities and marginalization outside of the conspiracy communities are the most important social precursors to conspiracy joining -- even outperforming individual factor baselines. Our results offer quantitative backing to understand social processes and echo chamber effects in conspiratorial engagement, with important implications for democratic institutions and online communities.

Submitted 6 October, 2020; v1 submitted 9 September, 2020; originally announced September 2020.

Comments: Accepted at ACM CSCW 2020

Journal ref: Computer Supported Cooperative Work and Social Computing 2020

arXiv:2009.04508 [pdf, other]

Narrative Maps: An Algorithmic Approach to Represent and Extract Information Narratives

Authors: Brian Keith, Tanushree Mitra

Abstract: Narratives are fundamental to our perception of the world and are pervasive in all activities that involve the representation of events in time. Yet, modern online information systems do not incorporate narratives in their representation of events occurring over time. This article aims to bridge this gap, combining the theory of narrative representations with the data from modern online systems. We make three key contributions: a theory-driven computational representation of narratives, a novel extraction algorithm to obtain these representations from data, and an evaluation of our approach. In particular, given the effectiveness of visual metaphors, we employ a route map metaphor to design a narrative map representation. The narrative map representation illustrates the events and stories in the narrative as a series of landmarks and routes on the map. Each element of our representation is backed by a corresponding element from formal narrative theory, thus providing a solid theoretical background to our method. Our approach extracts the underlying graph structure of the narrative map using a novel optimization technique focused on maximizing coherence while respecting structural and coverage constraints. We showcase the effectiveness of our approach by performing a user evaluation to assess the quality of the representation, metaphor, and visualization. Evaluation results indicate that the Narrative Map representation is a powerful method to communicate complex narratives to individuals.
Our findings have implications for intelligence analysts, computational journalists, and misinformation researchers.

Submitted 26 October, 2020; v1 submitted 9 September, 2020; originally announced September 2020.

Comments: 33 pages, 15 figures, CSCW 2020

arXiv:2008.09533 [pdf, other]

doi 10.1145/3415164

Investigating Differences in Crowdsourced News Credibility Assessment: Raters, Tasks, and Expert Criteria

Authors: Md Momen Bhuiyan, Amy X. Zhang, Connie Moon Sehat, Tanushree Mitra

Abstract: Misinformation about critical issues such as climate change and vaccine safety is oftentimes amplified on online social and search platforms. The crowdsourcing of content credibility assessment by laypeople has been proposed as one strategy to combat misinformation by attempting to replicate the assessments of experts at scale. In this work, we investigate news credibility assessments by crowds versus experts to understand when and how ratings between them differ. We gather a dataset of over 4,000 credibility assessments taken from 2 crowd groups -- journalism students and Upwork workers -- as well as 2 expert groups -- journalists and scientists -- on a varied set of 50 news articles related to climate science, a topic with widespread disconnect between public opinion and expert consensus. Examining the ratings, we find differences in performance due to the makeup of the crowd, such as rater demographics and political leaning, as well as the scope of the tasks that the crowd is assigned to rate, such as the genre of the article and partisanship of the publication. Finally, we find differences between expert assessments due to differing expert criteria that journalism versus science experts use -- differences that may contribute to crowd discrepancies, but that also suggest a way to reduce the gap by designing crowd tasks tailored to specific expert criteria. From these findings, we outline future research directions to better design crowd processes that are tailored to specific crowds and types of content.

Submitted 21 August, 2020; originally announced August 2020.

arXiv:2006.05676 [pdf, other]

Position Masking for Language Models

Authors: Andy Wagner, Tiyasa Mitra, Mrinal Iyer, Godfrey Da Costa, Marc Tremblay

Abstract: Masked language modeling (MLM) pre-training models such as BERT corrupt the input by replacing some tokens with [MASK] and then train a model to reconstruct the original tokens. This is an effective technique which has led to good results on all NLP benchmarks. We propose to expand upon this idea by masking the positions of some tokens along with the masked input token ids. We follow the same standard approach as BERT, masking a percentage of the token positions and then predicting their original values using an additional fully connected classifier stage. This approach has shown good performance gains (0.3% improvement) on SQuAD, along with additional improvement in convergence times. For the Graphcore IPU, the convergence of BERT Base with position masking requires only 50% of the tokens from the original BERT paper.

Submitted 2 June, 2020; originally announced June 2020.

arXiv:2004.07678 [pdf, other]

doi 10.1109/ACCESS.2020.3047337

Online Social Deception and Its Countermeasures for Trustworthy Cyberspace: A Survey

Authors: Zhen Guo, Jin-Hee Cho, Ing-Ray Chen, Srijan Sengupta, Michin Hong, Tanushree Mitra

Abstract: We are living in an era when online communication over social network services (SNSs) has become an indispensable part of people's everyday lives. As a consequence, online social deception (OSD) in SNSs has emerged as a serious threat in cyberspace, particularly for users vulnerable to such cyberattacks. Cyber attackers have exploited the sophisticated features of SNSs to carry out harmful OSD activities, such as financial fraud, privacy threat, or sexual/labor exploitation. Therefore, it is critical to understand OSD and develop effective countermeasures against OSD for building trustworthy SNSs. In this paper, we conducted an extensive survey, covering (i) the multidisciplinary concepts of social deception; (ii) types of OSD attacks and their unique characteristics compared to other social network attacks and cybercrimes; (iii) comprehensive defense mechanisms embracing prevention, detection, and response (or mitigation) against OSD attacks along with their pros and cons; (iv) datasets/metrics used for validation and verification; and (v) legal and ethical concerns related to OSD research. Based on this survey, we provide insights into the effectiveness of countermeasures and the lessons from existing literature. We conclude this survey paper with an in-depth discussion of the limitations of the state-of-the-art and recommend future research directions in this area.

Submitted 16 April, 2020; originally announced April 2020.

Comments: 35 pages, 8 figures, submitted to ACM Computing Surveys

arXiv:2003.01841 [pdf, other]

IsoRAN: Isolation and Scaling for 5G RAN via User-Level Data Plane Virtualization

Authors: Nishant Budhdev, Mun Choon Chan, Tulika Mitra

Abstract: 5G presents a unique set of challenges for cellular network architecture. The architecture needs to be versatile in order to handle a variety of use cases. While network slicing has been proposed as a way to provide such versatility, it is also important to ensure that slices do not adversely interfere with each other. In other words, isolation among network slices is needed. Additionally, the large number of use cases also implies a large number of users, making it imperative that 5G architectures scale efficiently. In this paper we propose IsoRAN, which provides isolation and scaling along with the flexibility needed for 5G architecture. In IsoRAN, users are processed by daemon threads in the Cloud Radio Access Network (CRAN) architecture. Our design allows users from different use cases to be executed, in a distributed manner, on the most efficient hardware to ensure that the Service Level Agreements (SLAs) are met while minimising power consumption. Our experiments show that IsoRAN handles users with different SLAs while providing isolation to reduce interference. This increased isolation reduces the drop rate for different users from 42% to nearly 0% in some cases. Finally, we run large scale simulations on real traces to show the benefits for power consumption and cost reduction scale while increasing the number of base stations.

Submitted 3 March, 2020; originally announced March 2020.

Comments: 9 pages

arXiv:1909.09457 [pdf, other]

Simultaneous Progressing Switching Protocols for Timing Predictable Real-Time Network-on-Chips

Authors: Niklas Ueter, Georg von der Brueggen, Jian-Jia Chen, Tulika Mitra, Vanchinathan Venkataramani

Abstract: Many-core systems require inter-core communication, and network-on-chips (NoCs) have been demonstrated to provide good scalability. However, not only the distributed structure but also the link switching on the NoCs have imposed a great challenge in the design and analysis for real-time systems. With scalability and flexibility in mind, the existing link switching protocols usually consider each single link to be scheduled independently, e.g., the worm-hole switching protocol. The flexibility of such link-based arbitrations allows each packet to be distributed over multiple routers but also increases the number of possible link states (the number of flits in a buffer) that have to be considered in the worst-case timing analysis. For achieving timing predictability, we propose less flexible switching protocols, called Simultaneous Progressing Switching Protocols (SP$^2$), in which the links used by a flow either all simultaneously transmit one flit (if it exists) of this flow or none of them transmits any flit of this flow. Such an all-or-nothing property of the SP$^2$ relates the scheduling behavior on the network to the uniprocessor self-suspension scheduling problem. We provide rigorous proofs which confirm the equivalence of these two problems. Moreover, our approaches are not limited to any specific underlying routing protocols, which are usually constructed for deadlock avoidance instead of timing predictability.
We demonstrate the analytical dominance of the fixed-priority SP$^2$ over some of the existing sufficient schedulability analyses for fixed-priority wormhole switched network-on-chips.

Submitted 21 October, 2019; v1 submitted 19 September, 2019; originally announced September 2019.

arXiv:1909.00647 [pdf, other]

KLEESPECTRE: Detecting Information Leakage through Speculative Cache Attacks via Symbolic Execution

Authors: Guanhua Wang, Sudipta Chattopadhyay, Arnab Kumar Biswas, Tulika Mitra, Abhik Roychoudhury

Abstract: Spectre attacks disclosed in early 2018 expose data leakage scenarios via cache side channels. Specifically, speculatively executed paths due to branch mis-prediction may bring secret data into the cache which are then exposed via cache side channels even after the speculative execution is squashed. Symbolic execution is a well-known test generation method to cover program paths at the level of the application software. In this paper, we extend symbolic execution with modeling of cache and speculative execution. Our tool KLEESPECTRE, built on top of the KLEE symbolic execution engine, can thus provide a testing engine to check for the data leakage through cache side-channel as shown via Spectre attacks. Our symbolic cache model can verify whether the sensitive data leakage due to speculative execution can be observed by an attacker at a given program point. Our experiments show that KLEESPECTRE can effectively detect data leakage along speculatively executed paths and our cache model can further make the leakage detection much more precise.

Submitted 2 September, 2019; originally announced September 2019.

Journal ref: ACM Transactions on Software Engineering and Methodology, 2020

arXiv:1908.11450 [pdf, other]

doi 10.1109/MDAT.2020.2968258

Neural Network Inference on Mobile SoCs

Authors: Siqi Wang, Anuj Pathania, Tulika Mitra

Abstract: The ever-increasing demand from mobile Machine Learning (ML) applications calls for ever more powerful on-chip computing resources. Mobile devices are empowered with heterogeneous multi-processor Systems-on-Chips (SoCs) to process ML workloads such as Convolutional Neural Network (CNN) inference. Mobile SoCs house several different types of ML-capable components on-die, such as CPU, GPU, and accelerators. These different components are capable of independently performing inference but with very different power-performance characteristics. In this article, we provide a quantitative evaluation of the inference capabilities of the different components on mobile SoCs. We also present insights behind their respective power-performance behavior. Finally, we explore the performance limit of the mobile SoCs by synergistically engaging all the components concurrently. We observe that a mobile SoC provides up to 2x improvement with parallel inference when all its components are engaged, as opposed to engaging only one component.

Submitted 22 January, 2020; v1 submitted 24 August, 2019; originally announced August 2019.

Comments: Accepted to IEEE Design & Test

Journal ref: in IEEE Design & Test, vol. 37, no. 5, pp. 50-57, Oct. 2020

arXiv:1903.05898 [pdf, other]

doi 10.1109/TCAD.2019.2944584

High-Throughput CNN Inference on Embedded ARM big.LITTLE Multi-Core Processors

Authors: Siqi Wang, Gayathri Ananthanarayanan, Yifan Zeng, Neeraj Goel, Anuj Pathania, Tulika Mitra

Abstract: IoT Edge intelligence requires Convolutional Neural Network (CNN) inference to take place on the edge devices themselves. ARM big.LITTLE architecture is at the heart of prevalent commercial edge devices. It comprises single-ISA heterogeneous cores grouped into multiple homogeneous clusters that enable power and performance trade-offs. All cores are expected to be simultaneously employed in inference to attain maximal throughput. However, high communication overhead involved in parallelization of computations from convolution kernels across clusters is detrimental to throughput. We present an alternative framework called Pipe-it that employs pipelined design to split convolutional layers across clusters while limiting parallelization of their respective kernels to the assigned cluster. We develop a performance-prediction model that utilizes only the convolutional layer descriptors to predict the execution time of each layer individually on all permitted core configurations (type and count). Pipe-it then exploits the predictions to create a balanced pipeline using an efficient design space exploration algorithm. Pipe-it on average results in a 39% higher throughput than the highest antecedent throughput.

Submitted 22 January, 2020; v1 submitted 14 March, 2019; originally announced March 2019.

Comments: Accepted to IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

Journal ref: in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 39, no. 10, pp. 2254-2267, Oct. 2020

arXiv:1807.05843 [pdf, other]

oo7: Low-overhead Defense against Spectre Attacks via Program Analysis

Authors: Guanhua Wang, Sudipta Chattopadhyay, Ivan Gotovchits, Tulika Mitra, Abhik Roychoudhury

Abstract: The Spectre vulnerability in modern processors has been widely reported. The key insight in this vulnerability is that speculative execution in processors can be misused to access the secrets. Subsequently, even though the speculatively executed instructions are squashed, the secret may linger in micro-architectural states such as cache, and can potentially be accessed by an attacker via side channels. In this paper, we propose oo7, a static analysis approach that can mitigate Spectre attacks by detecting potentially vulnerable code snippets in program binaries and protecting them against the attack by patching them. Our key contribution is to balance the concerns of effectiveness, analysis time and run-time overheads. We employ control flow extraction, taint analysis, and address analysis to detect tainted conditional branches and speculative memory accesses. oo7 can detect all fifteen purpose-built Spectre-vulnerable code patterns, whereas the Microsoft compiler with the Spectre mitigation option can only detect two of them. We also report the results of a large-scale study on applying oo7 to over 500 program binaries (average binary size 261 KB) from different real-world projects. We protect programs against Spectre attacks by selectively inserting fences only at vulnerable conditional branches to prevent speculative execution. Our approach is experimentally observed to incur around 5.9% performance overhead on SPECint benchmarks.

Submitted 12 November, 2019; v1 submitted 16 July, 2018; originally announced July 2018.

Comments: To appear in IEEE Transactions on Software Engineering, 2020

arXiv:1804.00706 [pdf, other]

doi 10.1145/3301278

Synergy: A HW/SW Framework for High Throughput CNNs on Embedded Heterogeneous SoC

Authors: Guanwen Zhong, Akshat Dubey, Tan Cheng, Tulika Mitra

Abstract: Convolutional Neural Networks (CNN) have been widely deployed in diverse application domains. There has been significant progress in accelerating both their training and inference using high-performance GPUs, FPGAs, and custom ASICs for datacenter-scale environments. The recent proliferation of mobile and IoT devices has necessitated real-time, energy-efficient deep neural network inference on embedded-class, resource-constrained platforms. In this context, we present Synergy, an automated, hardware-software co-designed, pipelined, high-throughput CNN inference framework on embedded heterogeneous system-on-chip (SoC) architectures (Xilinx Zynq). Synergy leverages, through multi-threading, all the available on-chip resources, which include the dual-core ARM processor along with the FPGA and the NEON SIMD engines as accelerators. Moreover, Synergy provides a unified abstraction of the heterogeneous accelerators (FPGA and NEON) and can adapt to different network configurations at runtime without changing the underlying hardware accelerator architecture by balancing workload across accelerators through work-stealing. Synergy achieves 7.3X speedup, averaged across seven CNN models, over a well-optimized software-only solution. Synergy demonstrates substantially better throughput and energy-efficiency compared to the contemporary CNN implementations on the same SoC architecture.

Submitted 28 March, 2018; originally announced April 2018.

Comments: 34 pages, submitted to ACM Transactions on Embedded Computing Systems (TECS)

ACM Class: C.1.3

Journal ref: TECS, 18 (2019) 13-39

arXiv:1612.08440 [pdf, other]

Credibility and Dynamics of Collective Attention

Authors: Tanushree Mitra, Graham Wright, Eric Gilbert

Abstract: Today, social media provide the means by which billions of people experience news and events happening around the world. However, the absence of traditional journalistic gatekeeping allows information to flow unencumbered through these platforms, often raising questions of veracity and credibility of the reported information. Here we ask: How do the dynamics of collective attention directed toward an event reported on social media vary with its perceived credibility? By examining the first large-scale, systematically tracked credibility database of public Twitter messages (47M messages corresponding to 1,138 real-world events over a span of three months), we established a relationship between the temporal dynamics of events reported on social media and their associated level of credibility judgments. Representing collective attention by the aggregate temporal signatures of an event reportage, we found that the amount of continued attention focused on an event provides information about its associated levels of perceived credibility. Events exhibiting sustained, intermittent bursts of attention were found to be associated with lower levels of perceived credibility. In other words, as more people showed interest during moments of transient collective attention, the associated uncertainty surrounding these events also increased.

Submitted 26 December, 2016; originally announced December 2016.

Showing 1–49 of 49 results for author: Mitra, T