How Do Analysts Understand and Verify AI-Assisted Data Analyses?

Authors: Ken Gu, Ruoxi Shang, Tim Althoff, Chenglong Wang, Steven M. Drucker

Abstract: Data analysis is challenging as it requires synthesizing domain knowledge, statistical expertise, and programming skills. Assistants powered by large language models (LLMs), such as ChatGPT, can assist analysts by translating natural language instructions into code. However, AI-assistant responses and analysis code can be misaligned with the analyst's intent or be seemingly correct but lead to incorrect conclusions. Therefore, validating AI assistance is crucial and challenging. Here, we explore how analysts understand and verify the correctness of AI-generated analyses. To observe analysts in diverse verification approaches, we develop a design probe equipped with natural language explanations, code, visualizations, and interactive data tables with common data operations. Through a qualitative user study (n=22) using this probe, we uncover common behaviors within verification workflows and how analysts' programming, analysis, and tool backgrounds reflect these behaviors. Additionally, we provide recommendations for analysts and highlight opportunities for designers to improve future AI-assistant experiences.

Submitted 4 March, 2024; v1 submitted 19 September, 2023; originally announced September 2023.

Comments: Accepted to CHI 2024

arXiv:2301.11178 [pdf, other]

On the Design of AI-powered Code Assistants for Notebooks

Authors: Andrew M. McNutt, Chenglong Wang, Robert A. DeLine, Steven M. Drucker

Abstract: AI-powered code assistants, such as Copilot, are quickly becoming a ubiquitous component of contemporary coding contexts. Among these environments, computational notebooks, such as Jupyter, are of particular interest as they provide rich interface affordances that interleave code and output in a manner that allows for both exploratory and presentational work. Despite their popularity, little is known about the appropriate design of code assistants in notebooks. We investigate the potential of code assistants in computational notebooks by creating a design space (reified from a survey of extant tools) and through an interview-design study (with 15 practicing data scientists). Through this work, we identify challenges and opportunities for future systems in this space, such as the value of disambiguation for tasks like data visualization, the potential of tightly scoped domain-specific tools (like linters), and the importance of polite assistants.

Submitted 26 January, 2023; originally announced January 2023.

Comments: To be published in Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems (CHI '23), April 23–28, 2023, Hamburg, Germany. 16 pages with 7 figures, 1 table, and a 2-page appendix (consisting of 4 figures)

arXiv:2110.00680 [pdf, other]

doi 10.1145/3411764.3445400

Collecting and Characterizing Natural Language Utterances for Specifying Data Visualizations

Authors: Arjun Srinivasan, Nikhila Nyapathy, Bongshin Lee, Steven M. Drucker, John Stasko

Abstract: Natural language interfaces (NLIs) for data visualization are becoming increasingly popular both in academic research and in commercial software. Yet, there is a lack of empirical understanding of how people specify visualizations through natural language. To bridge this gap, we conducted an online study with 102 participants. We showed participants a series of ten visualizations for a given dataset and asked them to provide utterances they would pose to generate the displayed charts. The curated list of utterances generated from the study is provided below. This corpus of utterances can be used to evaluate existing NLIs for data visualization as well as for creating new systems and models to generate visualizations from natural language utterances.

Submitted 1 October, 2021; originally announced October 2021.

Comments: Paper appeared at the 2021 ACM Conference on Human Factors in Computing Systems (CHI 2021), 10 pages (5 figures, 3 tables)

arXiv:2001.06423 [pdf, other]

InChorus: Designing Consistent Multimodal Interactions for Data Visualization on Tablet Devices

Authors: Arjun Srinivasan, Bongshin Lee, Nathalie Henry Riche, Steven M. Drucker, Ken Hinckley

Abstract: While tablet devices are a promising platform for data visualization, supporting consistent interactions across different types of visualizations on tablets remains an open challenge. In this paper, we present multimodal interactions that function consistently across different visualizations, supporting common operations during visual data analysis. By considering standard interface elements (e.g., axes, marks) and grounding our design in a set of core concepts including operations, parameters, targets, and instruments, we systematically develop interactions applicable to different visualization types. To exemplify how the proposed interactions collectively facilitate data exploration, we employ them in a tablet-based system, InChorus that supports pen, touch, and speech input. Based on a study with 12 participants performing replication and fact-checking tasks with InChorus, we discuss how participants adapted to using multimodal input and highlight considerations for future multimodal visualization systems.

Submitted 17 January, 2020; originally announced January 2020.

Comments: To appear in ACM CHI 2020 Conference on Human Factors in Computing Systems; 13 pages (10 content + 3 references); 4 Figures, 1 Table

Showing 1–4 of 4 results for author: Drucker, S M