
Beyond Preferences in AI Alignment

Authors: Tan Zhi-Xuan, Micah Carroll, Matija Franklin, Hal Ashton

Abstract: The dominant practice of AI alignment assumes (1) that preferences are an adequate representation of human values, (2) that human rationality can be understood in terms of maximizing the satisfaction of preferences, and (3) that AI systems should be aligned with the preferences of one or more humans to ensure that they behave safely and in accordance with our values. Whether implicitly followed or explicitly endorsed, these commitments constitute what we term a preferentist approach to AI alignment. In this paper, we characterize and challenge the preferentist approach, describing conceptual and technical alternatives that are ripe for further research. We first survey the limits of rational choice theory as a descriptive model, explaining how preferences fail to capture the thick semantic content of human values, and how utility representations neglect the possible incommensurability of those values. We then critique the normativity of expected utility theory (EUT) for humans and AI, drawing upon arguments showing how rational agents need not comply with EUT, while highlighting how EUT is silent on which preferences are normatively acceptable. Finally, we argue that these limitations motivate a reframing of the targets of AI alignment: Instead of alignment with the preferences of a human user, developer, or humanity-writ-large, AI systems should be aligned with normative standards appropriate to their social roles, such as the role of a general-purpose assistant. Furthermore, these standards should be negotiated and agreed upon by all relevant stakeholders. On this alternative conception of alignment, a multiplicity of AI systems will be able to serve diverse ends, aligned with normative standards that promote mutual benefit and limit harm despite our plural and divergent values.

Submitted 29 August, 2024; originally announced August 2024.

Comments: 26 pages (excl. references), 5 figures

arXiv:2408.04195 [pdf, other]

Design and Implementation of Smart Infrastructures and Connected Vehicles in A Mini-city Platform

Authors: Daniel Vargas, Ethan Haque, Matthew Carroll, Daniel Perez, Tyler Roman, Phong Nguyen, Golnaz Habibi

Abstract: This paper presents a 1/10th scale mini-city platform used as a testing bed for evaluating autonomous and connected vehicles. Using the mini-city platform, we can evaluate different driving scenarios including human-driven and autonomous driving. We provide a unique, visual feature-rich environment for evaluating computer vision methods. The conducted experiments utilize onboard sensors mounted on a robotic platform we built, allowing them to navigate in a controlled real-world urban environment. The designed city is occupied by cars, stop signs, a variety of residential and business buildings, and complex intersections mimicking an urban area. Furthermore, we have designed an intelligent infrastructure at one of the intersections in the city which helps safer and more efficient navigation in the presence of multiple cars and pedestrians. We have used the mini-city platform for the analysis of three different applications: city mapping, depth estimation in challenging occluded environments, and smart infrastructure for connected vehicles. Our smart infrastructure is among the first to develop and evaluate Vehicle-to-Infrastructure (V2I) communication at intersections. The intersection-related result shows how inaccuracy in perception, including mapping and localization, can affect safety. The proposed mini-city platform can be considered as a baseline environment for developing research and education in intelligent transportation systems.

Submitted 7 August, 2024; originally announced August 2024.

Comments: 8 pages, 9 figures, Presented at 2024 IEEE ITSC Conference, 23 Citations

MSC Class: 68F00 (Primary); 68F11 (Secondary)

arXiv:2407.14925 [pdf, other]

When Qualitative Research Meets Large Language Model: Exploring the Potential of QualiGPT as a Tool for Qualitative Coding

Authors: He Zhang, Chuhao Wu, Jingyi Xie, Fiona Rubino, Sydney Graver, ChanMin Kim, John M. Carroll, Jie Cai

Abstract: Qualitative research, renowned for its in-depth exploration of complex phenomena, often involves time-intensive analysis, particularly during the coding stage. Existing software for qualitative evaluation frequently lacks automatic coding capabilities, user-friendliness, and cost-effectiveness. The advent of Large Language Models (LLMs) like GPT-3 and its successors marks a transformative era for enhancing qualitative analysis. This paper introduces QualiGPT, a tool developed to address the challenges associated with using ChatGPT for qualitative analysis. Through a comparative analysis of traditional manual coding and QualiGPT's performance on both simulated and real datasets, incorporating both inductive and deductive coding approaches, we demonstrate that QualiGPT significantly improves the qualitative analysis process. Our findings show that QualiGPT enhances efficiency, transparency, and accessibility in qualitative coding. The tool's performance was evaluated using inter-rater reliability (IRR) measures, with results indicating substantial agreement between human coders and QualiGPT in various coding scenarios. In addition, we discuss the implications of integrating AI into qualitative research workflows and outline future directions for enhancing human-AI collaboration in this field.

Submitted 20 July, 2024; originally announced July 2024.

Comments: arXiv admin note: substantial text overlap with arXiv:2310.07061

arXiv:2407.12723 [pdf, ps, other]

The Future of Learning: Large Language Models through the Lens of Students

Authors: He Zhang, Jingyi Xie, Chuhao Wu, Jie Cai, ChanMin Kim, John M. Carroll

Abstract: As Large-Scale Language Models (LLMs) continue to evolve, they demonstrate significant enhancements in performance and an expansion of functionalities, impacting various domains, including education. In this study, we conducted interviews with 14 students to explore their everyday interactions with ChatGPT. Our preliminary findings reveal that students grapple with the dilemma of utilizing ChatGPT's efficiency for learning and information seeking, while simultaneously experiencing a crisis of trust and ethical concerns regarding the outcomes and broader impacts of ChatGPT. The students perceive ChatGPT as being more "human-like" compared to traditional AI. This dilemma, characterized by mixed emotions, inconsistent behaviors, and an overall positive attitude towards ChatGPT, underscores its potential for beneficial applications in education and learning. However, we argue that despite its human-like qualities, the advanced capabilities of such intelligence might lead to adverse consequences. Therefore, it's imperative to approach its application cautiously and strive to mitigate potential harms in future developments.

Submitted 17 July, 2024; originally announced July 2024.

arXiv:2407.08882 [pdf, ps, other]

Emerging Practices for Large Multimodal Model (LMM) Assistance for People with Visual Impairments: Implications for Design

Authors: Jingyi Xie, Rui Yu, He Zhang, Sooyeon Lee, Syed Masum Billah, John M. Carroll

Abstract: People with visual impairments perceive their environment non-visually and often use AI-powered assistive tools to obtain textual descriptions of visual information. Recent large vision-language model-based AI-powered tools like Be My AI are more capable of understanding users' inquiries in natural language and describing the scene in audible text; however, the extent to which these tools are useful to visually impaired users is currently understudied. This paper aims to fill this gap. Our study with 14 visually impaired users reveals that they are adapting these tools organically -- not only can these tools facilitate complex interactions in household, spatial, and social contexts, but they also act as an extension of users' cognition, as if the cognition were distributed in the visual information. We also found that although the tools are currently not goal-oriented, users accommodate this limitation and embrace the tools' capabilities for broader use. These findings enable us to envision design implications for creating more goal-oriented, real-time processing, and reliable AI-powered assistive technology.

Submitted 11 July, 2024; originally announced July 2024.

arXiv:2405.17713 [pdf, other]

AI Alignment with Changing and Influenceable Reward Functions

Authors: Micah Carroll, Davis Foote, Anand Siththaranjan, Stuart Russell, Anca Dragan

Abstract: Existing AI alignment approaches assume that preferences are static, which is unrealistic: our preferences change, and may even be influenced by our interactions with AI systems themselves. To clarify the consequences of incorrectly assuming static preferences, we introduce Dynamic Reward Markov Decision Processes (DR-MDPs), which explicitly model preference changes and the AI's influence on them. We show that despite its convenience, the static-preference assumption may undermine the soundness of existing alignment techniques, leading them to implicitly reward AI systems for influencing user preferences in ways users may not truly want. We then explore potential solutions. First, we offer a unifying perspective on how an agent's optimization horizon may partially help reduce undesirable AI influence. Then, we formalize different notions of AI alignment that account for preference change from the outset. Comparing the strengths and limitations of 8 such notions of alignment, we find that they all either err towards causing undesirable AI influence, or are overly risk-averse, suggesting that a straightforward solution to the problems of changing preferences may not exist. As there is no avoiding grappling with changing preferences in real-world settings, this makes it all the more important to handle these issues with care, balancing risks and capabilities. We hope our work can provide conceptual clarity and constitute a first step towards AI alignment practices which explicitly account for (and contend with) the changing and influenceable nature of human preferences.

Submitted 27 May, 2024; originally announced May 2024.

Comments: Accepted to ICML 2024

arXiv:2404.14305 [pdf, other]

"I Upload...All Types of Different Things to Say, the World of Blindness Is More Than What They Think It Is": A Study of Blind TikTokers' Identity Work from a Flourishing Perspective

Authors: Yao Lyu, Jie Cai, Bryan Dosono, Davis Yadav, John M. Carroll

Abstract: Identity work in Human-Computer Interaction (HCI) has focused on the marginalized group to explore designs to support their assets (what they have). However, little has been explored on the identity work of people with disabilities, specifically visual impairments. In this study, we interviewed 45 BlindTokers (blind users on TikTok) from various backgrounds to understand their identity work from a positive design perspective. We found that BlindTokers leverage the affordance of the platform to create positive content, share their identities, and build the community with the desire to flourish. We proposed flourishing labor to present the work conducted by BlindTokers for their community's flourishing with implications to support the flourishing labor. This work contributes to understanding blind users' experience in short video platforms and highlights that flourishing is not just an activity for any single Blind user but also a job that needs serious and committed contribution from all stakeholders, including all user groups and the TikTok platform.

Submitted 22 April, 2024; originally announced April 2024.

Comments: ACM CSCW

arXiv:2401.15222 [pdf, other]

Transfer Learning for the Prediction of Entity Modifiers in Clinical Text: Application to Opioid Use Disorder Case Detection

Authors: Abdullateef I. Almudaifer, Whitney Covington, JaMor Hairston, Zachary Deitch, Ankit Anand, Caleb M. Carroll, Estera Crisan, William Bradford, Lauren Walter, Eaton Ellen, Sue S. Feldman, John D. Osborne

Abstract: Background: The semantics of entities extracted from a clinical text can be dramatically altered by modifiers, including entity negation, uncertainty, conditionality, severity, and subject. Existing models for determining modifiers of clinical entities involve regular expressions or feature weights that are trained independently for each modifier. Methods: We develop and evaluate a multi-task transformer architecture design where modifiers are learned and predicted jointly using the publicly available SemEval 2015 Task 14 corpus and a new Opioid Use Disorder (OUD) data set that contains modifiers shared with SemEval as well as novel modifiers specific for OUD. We evaluate the effectiveness of our multi-task learning approach versus previously published systems and assess the feasibility of transfer learning for clinical entity modifiers when only a portion of clinical modifiers are shared. Results: Our approach achieved state-of-the-art results on the ShARe corpus from SemEval 2015 Task 14, showing an increase of 1.1% on weighted accuracy, 1.7% on unweighted accuracy, and 10% on micro F1 scores. Conclusions: We show that learned weights from our shared model can be effectively transferred to a new partially matched data set, validating the use of transfer learning for clinical text modifiers.

Submitted 5 February, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

Comments: 18 pages, 2 figures, 6 tables. To be submitted to the Journal of Biomedical Semantics

arXiv:2401.12521 [pdf, ps, other]

doi 10.1007/978-3-031-57860-1_6

Exploring Virtual Reality through Ihde's Instrumental Realism

Authors: He Zhang, John M. Carroll

Abstract: Based on Ihde's theory, this paper explores the relationship between virtual reality (VR) as an instrument and phenomenology. It reviews the "technological revolution" spurred by the development of VR technology and discusses how VR has been used to study subjective experience, explore perception and embodiment, enhance empathy and perspective, and investigate altered states of consciousness. The paper emphasizes the role of VR as an instrumental technology, particularly its ability to expand human perception and cognition. Reflecting on this in conjunction with the work of Husserl and Ihde, among others, it revisits the potential of VR to provide new avenues for scientific inquiry and experience and to transform our understanding of the world through VR.

Submitted 23 January, 2024; originally announced January 2024.

Comments: Accepted to iConference 2024 as a short paper

arXiv:2401.12133 [pdf, other]

VRMN-bD: A Multi-modal Natural Behavior Dataset of Immersive Human Fear Responses in VR Stand-up Interactive Games

Authors: He Zhang, Xinyang Li, Yuanxi Sun, Xinyi Fu, Christine Qiu, John M. Carroll

Abstract: Understanding and recognizing emotions are important and challenging issues in the metaverse era. Understanding, identifying, and predicting fear, which is one of the fundamental human emotions, in virtual reality (VR) environments plays an essential role in immersive game development, scene development, and next-generation virtual human-computer interaction applications. In this article, we used VR horror games as a medium to analyze fear emotions by collecting multi-modal data (posture, audio, and physiological signals) from 23 players. We used an LSTM-based model to predict fear with accuracies of 65.31% and 90.47% under 6-level classification (no fear and five different levels of fear) and 2-level classification (no fear and fear), respectively. We constructed a multi-modal natural behavior dataset of immersive human fear responses (VRMN-bD) and compared it with existing relevant advanced datasets. The results show that our dataset has fewer limitations in terms of collection method, data scale and audience scope. We are unique and advanced in targeting multi-modal datasets of fear and behavior in VR stand-up interactive environments. Moreover, we discussed the implications of this work for communities and applications. The dataset and pre-trained model are available at https://github.com/KindOPSTAR/VRMN-bD.

Submitted 22 January, 2024; originally announced January 2024.

Comments: Accepted to IEEE VR 2024

arXiv:2401.11663 [pdf, other]

doi 10.1145/3613904.3642148

"I Got Flagged for Supposed Bullying, Even Though It Was in Response to Someone Harassing Me About My Disability.": A Study of Blind TikTokers' Content Moderation Experiences

Authors: Yao Lyu, Jie Cai, Anisa Callis, Kelley Cotter, John M. Carroll

Abstract: The Human-Computer Interaction (HCI) community has consistently focused on the experiences of users moderated by social media platforms. Recently, scholars have noticed that moderation practices could perpetuate biases, resulting in the marginalization of user groups undergoing moderation. However, most studies have primarily addressed marginalization related to issues such as racism or sexism, with little attention given to the experiences of people with disabilities. In this paper, we present a study on the moderation experiences of blind users on TikTok, also known as "BlindToker," to address this gap. We conducted semi-structured interviews with 20 BlindTokers and used thematic analysis to analyze the data. Two main themes emerged: BlindTokers' situated content moderation experiences and their reactions to content moderation. We reported on the lack of accessibility on TikTok's platform, contributing to the moderation and marginalization of BlindTokers. Additionally, we discovered instances of harassment from trolls that prompted BlindTokers to respond with harsh language, triggering further moderation. We discussed these findings in the context of the literature on moderation, marginalization, and transformative justice, seeking solutions to address such issues.

Submitted 21 January, 2024; originally announced January 2024.

Comments: 24 pages, 1 figure, accepted by CHI'24

arXiv:2401.11317 [pdf, other]

doi 10.1145/3613904.3642787

Third-Party Developers and Tool Development For Community Management on Live Streaming Platform Twitch

Authors: Jie Cai, Ya-Fang Lin, He Zhang, John M. Carroll

Abstract: Community management is critical for stakeholders to collaboratively build and sustain communities with socio-technical support. However, most of the existing research has mainly focused on the community members and the platform, with little attention given to the developers who act as intermediaries between the platform and community members and develop tools to support community management. This study focuses on third-party developers (TPDs) for the live streaming platform Twitch and explores their tool development practices. Using a mixed method with in-depth qualitative analysis, we found that TPDs maintain complex relationships with different stakeholders (streamers, viewers, platform, professional developers), and the multi-layered policy restricts their agency regarding idea innovation and tool development. We argue that HCI research should shift its focus from tool users to tool developers with regard to community management. We propose designs to support closer collaboration between TPDs and the platform and professional developers and streamline TPDs' development process with unified toolkits and policy documentation.

Submitted 17 March, 2024; v1 submitted 20 January, 2024; originally announced January 2024.

Comments: Accepted by ACM CHI 2024

arXiv:2312.16697 [pdf, other]

Multi-channel Sensor Network Construction, Data Fusion and Challenges for Smart Home

Authors: He Zhang, Robin Ananda, Xinyi Fu, Zhe Sun, Xiaoyu Wang, Keqi Chen, John M. Carroll

Abstract: Both sensor networks and data fusion are essential foundations for developing the smart home Internet of Things (IoT) and related fields. We proposed a multi-channel sensor network construction method involving hardware, acquisition, and synchronization in the smart home environment and a smart home data fusion method (SHDFM) for multi-modal data (position, gait, voice, pose, facial expression, temperature, and humidity) generated in the smart home environment to address the configuration of a multi-channel sensor network, improve the quality and efficiency of various human activities and environmental data collection, and reduce the difficulty of multi-modal data fusion in the smart home. SHDFM contains 5 levels, with inputs and outputs as criteria to provide recommendations for multi-modal data fusion strategies in the smart home. To validate our method, we built a real experimental environment using the proposed method: a physical setup in a home-like scenario where the multi-channel sensor network and data fusion techniques were deployed and evaluated. The acceptance and testing results show that the proposed construction and data fusion methods can be applied to the examples with high robustness, replicability, and scalability. Besides, we discuss how smart homes with multi-channel sensor networks can support digital twins.

Submitted 27 December, 2023; originally announced December 2023.

Comments: 8 pages, accepted by CHCHI2023

arXiv:2312.12338 [pdf, other]

Smart Connected Farms and Networked Farmers to Tackle Climate Challenges Impacting Agricultural Production

Authors: Behzad J. Balabaygloo, Barituka Bekee, Samuel W. Blair, Suzanne Fey, Fateme Fotouhi, Ashish Gupta, Kevin Menke, Anusha Vangala, Jorge C. M. Palomares, Aaron Prestholt, Vishesh K. Tanwar, Xu Tao, Matthew E. Carroll, Sajal Das, Gil Depaula, Peter Kyveryga, Soumik Sarkar, Michelle Segovia, Simone Sylvestri, Corinne Valdivia, Asheesh K. Singh

Abstract: To meet the grand challenges of agricultural production, including climate change impacts on crop production, a tight integration of social science, technology and agriculture experts, including farmers, is needed. There are rapid advances in information and communication technology, precision agriculture and data analytics, which are creating a fertile field for the creation of smart connected farms (SCF) and networked farmers. A coordinated farmer network provides unique advantages to farmers to enhance farm production and profitability, while tackling adverse climate events. The aim of this article is to provide a comprehensive overview of the state of the art in SCF including the advances in engineering, computer sciences, data sciences, social sciences and economics including data privacy, sharing and technology adoption.

Submitted 19 December, 2023; originally announced December 2023.

arXiv:2310.07154 [pdf, other]

doi 10.1145/3637297

"Because Some Sighted People, They Don't Know What the Heck You're Talking About:" A Study of Blind TikTokers' Infrastructuring Work to Build Independence

Authors: Yao Lyu, John M. Carroll

Abstract: There has been extensive research on the experiences of individuals with visual impairments on text- and image-based social media platforms, such as Facebook and Twitter. However, little is known about the experiences of visually impaired users on short-video platforms like TikTok. To bridge this gap, we conducted an interview study with 30 BlindTokers (the nickname of blind TikTokers). Our study aimed to explore the various activities of BlindTokers on TikTok, including everyday entertainment, professional development, and community engagement. The widespread usage of TikTok among participants demonstrated that they considered TikTok and its associated experiences as the infrastructure for their activities. Additionally, participants reported experiencing breakdowns in this infrastructure due to accessibility issues. They had to carry out infrastructuring work to resolve the breakdowns. Blind users' various practices on TikTok also foregrounded their perceptions of independence. We then discussed blind users' nuanced understanding of the TikTok-mediated independence; we also critically examined BlindTokers' infrastructuring work for such independence.

Submitted 11 December, 2023; v1 submitted 10 October, 2023; originally announced October 2023.

Comments: Accepted at CSCW'24, 29 pages, 2 figures, and 2 tables

arXiv:2310.07061 [pdf, other]

QualiGPT: GPT as an easy-to-use tool for qualitative coding

Authors: He Zhang, Chuhao Wu, Jingyi Xie, ChanMin Kim, John M. Carroll

Abstract: Qualitative research delves deeply into individual complex perspectives on technology and various phenomena. However, a meticulous analysis of qualitative data often requires a significant amount of time, especially during the crucial coding stage. Although there is software specifically designed for qualitative evaluation, many of these platforms fall short in terms of automatic coding, intuitive usability, and cost-effectiveness. With the rise of Large Language Models (LLMs) such as GPT-3 and its successors, we are at the forefront of a transformative era for enhancing qualitative analysis. In this paper, we introduce QualiGPT, a specialized tool designed after considering challenges associated with ChatGPT and qualitative analysis. It harnesses the capabilities of the Generative Pretrained Transformer (GPT) and its API for thematic analysis of qualitative data. By comparing traditional manual coding with QualiGPT's analysis on both simulated and actual datasets, we verify that QualiGPT not only refines the qualitative analysis process but also elevates its transparency, credibility, and accessibility. Notably, compared to existing analytical platforms, QualiGPT stands out with its intuitive design, significantly reducing the learning curve and operational barriers for users.

Submitted 10 October, 2023; originally announced October 2023.

Comments: 25 pages, 7 figures, 1 table, under review

arXiv:2309.10771 [pdf, other]

Redefining Qualitative Analysis in the AI Era: Utilizing ChatGPT for Efficient Thematic Analysis

Authors: He Zhang, Chuhao Wu, Jingyi Xie, Yao Lyu, Jie Cai, John M. Carroll

Abstract: AI tools, particularly large-scale language model (LLM) based applications such as ChatGPT, have the potential to simplify qualitative research. Through semi-structured interviews with seventeen participants, we identified challenges and concerns in integrating ChatGPT into the qualitative analysis process. Collaborating with thirteen qualitative researchers, we developed a framework for designing prompts to enhance the effectiveness of ChatGPT in thematic analysis. Our findings indicate that improving transparency, providing guidance on prompts, and strengthening users' understanding of LLMs' capabilities significantly enhance the users' ability to interact with ChatGPT. We also discovered and revealed the reasons behind researchers' shift in attitude towards ChatGPT from negative to positive. This research not only highlights the importance of well-designed prompts in LLM applications but also offers reflections for qualitative researchers on the perception of AI's role. Finally, we emphasize the potential ethical risks and the impact of constructing AI ethical expectations by researchers, particularly those who are novices, on future research and AI development.

Submitted 27 May, 2024; v1 submitted 19 September, 2023; originally announced September 2023.

arXiv:2308.14014 [pdf, other]

Reconnecting An International Travel Network: The Personal Infrastructuring Work of International Travelers in A Multi-facet Crisis

Authors: Yao Lyu, He Zhang, John M. Carroll

Abstract: In times of crisis, international travel becomes tenuous and anxiety provoking. The crisis informatics and Human-Computer Interaction (HCI) community has paid increasing attention to the use of Information and Communication Technologies (ICTs) in various crisis settings. However, little is known about the travelers' actual experiences in whole trips in crises. In this paper, we bridge the gap by presenting a study on Chinese travelers' encounters in their international journeys to the US during a multifacet crisis and their use of ICTs to overcome difficulties in the journeys. We interviewed 22 Chinese travelers who had successfully come to the US during the crisis. The findings showed how travelers improvised to reconnect the broken international travel infrastructure. We also discuss the findings with the literature on infrastructure, and crisis informatics, and provide design implications for travel authorities and agencies.

Submitted 27 August, 2023; originally announced August 2023.

arXiv:2307.15217 [pdf, other]

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback

Authors: Stephen Casper, Xander Davies, Claudia Shi, Thomas Krendl Gilbert, Jérémy Scheurer, Javier Rando, Rachel Freedman, Tomasz Korbak, David Lindner, Pedro Freire, Tony Wang, Samuel Marks, Charbel-Raphaël Segerie, Micah Carroll, Andi Peng, Phillip Christoffersen, Mehul Damani, Stewart Slocum, Usman Anwar, Anand Siththaranjan, Max Nadeau, Eric J. Michaud, Jacob Pfau, Dmitrii Krasheninnikov, Xin Chen, et al. (7 additional authors not shown)

Abstract: Reinforcement learning from human feedback (RLHF) is a technique for training AI systems to align with human goals. RLHF has emerged as the central method used to finetune state-of-the-art large language models (LLMs). Despite this popularity, there has been relatively little public work systematizing its flaws. In this paper, we (1) survey open problems and fundamental limitations of RLHF and related methods; (2) overview techniques to understand, improve, and complement RLHF in practice; and (3) propose auditing and disclosure standards to improve societal oversight of RLHF systems. Our work emphasizes the limitations of RLHF and highlights the importance of a multi-faceted approach to the development of safer AI systems.

Submitted 11 September, 2023; v1 submitted 27 July, 2023; originally announced July 2023.

arXiv:2306.09309 [pdf, other]

Who Needs to Know? Minimal Knowledge for Optimal Coordination

Authors: Niklas Lauffer, Ameesh Shah, Micah Carroll, Michael Dennis, Stuart Russell

Abstract: To optimally coordinate with others in cooperative games, it is often crucial to have information about one's collaborators: successful driving requires understanding which side of the road to drive on. However, not every feature of collaborators is strategically relevant: the fine-grained acceleration of drivers may be ignored while maintaining optimal coordination. We show that there is a well-defined dichotomy between strategically relevant and irrelevant information. Moreover, we show that, in dynamic games, this dichotomy has a compact representation that can be efficiently computed via a Bellman backup operator. We apply this algorithm to analyze the strategically relevant information for tasks in both a standard and a partially observable version of the Overcooked environment. Theoretical and empirical results show that our algorithms are significantly more efficient than baselines. Videos are available at https://minknowledge.github.io.

Submitted 13 July, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

Comments: To be published at ICML 2023

ACM Class: I.2.6; I.2.11

arXiv:2305.16941 [pdf, other]

Engagement, User Satisfaction, and the Amplification of Divisive Content on Social Media

Authors: Smitha Milli, Micah Carroll, Yike Wang, Sashrika Pandey, Sebastian Zhao, Anca D. Dragan

Abstract: In a pre-registered randomized experiment, we found that, relative to a reverse-chronological baseline, Twitter's engagement-based ranking algorithm amplifies emotionally charged, out-group hostile content that users say makes them feel worse about their political out-group. Furthermore, we find that users do not prefer the political tweets selected by the algorithm, suggesting that the engagement-based algorithm underperforms in satisfying users' stated preferences. Finally, we explore the implications of an alternative approach that ranks content based on users' stated preferences and find a reduction in angry, partisan, and out-group hostile content but also a potential reinforcement of echo chambers. The evidence underscores the necessity for a more nuanced approach to content ranking that balances engagement, users' stated preferences, and sociopolitical outcomes.

Submitted 22 December, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

arXiv:2305.09933 [pdf, other]

Impact of ROS 2 Node Composition in Robotic Systems

Authors: Steve Macenski, Alberto Soragna, Michael Carroll, Zhenpeng Ge

Abstract: The Robot Operating System 2 (ROS 2) is the second generation of ROS representing a step forward in the robotic framework. Several new types of nodes and executor models are integral to control where, how, and when information is processed in the computational graph. This paper explores and benchmarks one of these new node types -- the Component node -- which allows nodes to be composed manually or dynamically into processes while retaining separation of concerns in a codebase for distributed development. Composition is shown to achieve a high degree of performance optimization, particularly valuable for resource-constrained systems and sensor processing pipelines, enabling distributed tasks that would not be otherwise possible in ROS 2. In this work, we briefly introduce the significance and design of node composition, then our contribution of benchmarking is provided to analyze its impact on robotic systems. Its compelling influence on performance is shown through several experiments on the latest Long Term Support (LTS) ROS 2 distribution, Humble Hawksbill.

Submitted 16 May, 2023; originally announced May 2023.

Comments: IEEE Robotics and Automation Letters, 2023

arXiv:2303.09387 [pdf, other]

Characterizing Manipulation from AI Systems

Authors: Micah Carroll, Alan Chan, Henry Ashton, David Krueger

Abstract: Manipulation is a common concern in many domains, such as social media, advertising, and chatbots. As AI systems mediate more of our interactions with the world, it is important to understand the degree to which AI systems might manipulate humans without the intent of the system designers. Our work clarifies challenges in defining and measuring manipulation in the context of AI systems. Firstly, we build upon prior literature on manipulation from other fields and characterize the space of possible notions of manipulation, which we find to depend upon the concepts of incentives, intent, harm, and covertness. We review proposals on how to operationalize each factor. Second, we propose a definition of manipulation based on our characterization: a system is manipulative if it acts as if it were pursuing an incentive to change a human (or another agent) intentionally and covertly. Third, we discuss the connections between manipulation and related concepts, such as deception and coercion. Finally, we contextualize our operationalization of manipulation in some applications. Our overall assessment is that while some progress has been made in defining and measuring manipulation from AI systems, many gaps remain. In the absence of a consensus definition and reliable tools for measurement, we cannot rule out the possibility that AI systems learn to manipulate humans without the intent of the system designers. We argue that such manipulation poses a significant threat to human autonomy, suggesting that precautionary actions to mitigate it are warranted.

Submitted 30 October, 2023; v1 submitted 16 March, 2023; originally announced March 2023.

Comments: Presented at EAAMO 2023; The first two authors contributed equally; author order was decided with a coin flip

arXiv:2302.10329 [pdf, other]

doi 10.1145/3593013.3594033

Harms from Increasingly Agentic Algorithmic Systems

Authors: Alan Chan, Rebecca Salganik, Alva Markelius, Chris Pang, Nitarshan Rajkumar, Dmitrii Krasheninnikov, Lauro Langosco, Zhonghao He, Yawen Duan, Micah Carroll, Michelle Lin, Alex Mayhew, Katherine Collins, Maryam Molamohammadi, John Burden, Wanru Zhao, Shalaleh Rismani, Konstantinos Voudouris, Umang Bhatt, Adrian Weller, David Krueger, Tegan Maharaj

Abstract: Research in Fairness, Accountability, Transparency, and Ethics (FATE) has established many sources and forms of algorithmic harm, in domains as diverse as health care, finance, policing, and recommendations. Much work remains to be done to mitigate the serious harms of these systems, particularly those disproportionately affecting marginalized communities. Despite these ongoing harms, new systems are being developed and deployed which threaten the perpetuation of the same harms and the creation of novel ones. In response, the FATE community has emphasized the importance of anticipating harms. Our work focuses on the anticipation of harms from increasingly agentic systems. Rather than providing a definition of agency as a binary property, we identify 4 key characteristics which, particularly in combination, tend to increase the agency of a given algorithmic system: underspecification, directness of impact, goal-directedness, and long-term planning. We also discuss important harms which arise from increasing agency -- notably, these include systemic and/or long-range impacts, often on marginalized stakeholders. We emphasize that recognizing agency of algorithmic systems does not absolve or shift the human responsibility for algorithmic harms. Rather, we use the term agency to highlight the increasingly evident fact that ML systems are not fully under human control. Our work explores increasingly agentic algorithmic systems in three parts. First, we explain the notion of an increase in agency for algorithmic systems in the context of diverse perspectives on agency across disciplines. Second, we argue for the need to anticipate harms from increasingly agentic systems. Third, we discuss important harms from increasingly agentic systems and ways forward for addressing them. We conclude by reflecting on implications of our work for anticipating algorithmic harms from emerging systems.

Submitted 11 May, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

Comments: Accepted at FAccT 2023

arXiv:2302.07343 [pdf, other]

Agile and Versatile Robot Locomotion via Kernel-based Residual Learning

Authors: Milo Carroll, Zhaocheng Liu, Mohammadreza Kasaei, Zhibin Li

Abstract: This work developed a kernel-based residual learning framework for quadrupedal robotic locomotion. Initially, a kernel neural network is trained with data collected from an MPC controller. Alongside a frozen kernel network, a residual controller network is trained via reinforcement learning to acquire generalized locomotion skills and resilience against external perturbations. With this proposed framework, a robust quadrupedal locomotion controller is learned with high sample efficiency and controllability, providing omnidirectional locomotion at continuous velocities. Its versatility and robustness are validated on unseen terrains that the expert MPC controller fails to traverse. Furthermore, the learned kernel can produce a range of functional locomotion behaviors and can generalize to unseen gaits.

Submitted 14 February, 2023; originally announced February 2023.

arXiv:2212.00169 [pdf, other]

Time-Efficient Reward Learning via Visually Assisted Cluster Ranking

Authors: David Zhang, Micah Carroll, Andreea Bobu, Anca Dragan

Abstract: One of the most successful paradigms for reward learning uses human feedback in the form of comparisons. Although these methods hold promise, human comparison labeling is expensive and time consuming, constituting a major bottleneck to their broader applicability. Our insight is that we can greatly improve how effectively human time is used in these approaches by batching comparisons together, rather than having the human label each comparison individually. To do so, we leverage data dimensionality-reduction and visualization techniques to provide the human with an interactive GUI displaying the state space, in which the user can label subportions of the state space. Across some simple MuJoCo tasks, we show that this high-level approach holds promise and is able to greatly increase the performance of the resulting agents, provided the same amount of human labeling time.

Submitted 30 November, 2022; originally announced December 2022.

Comments: Presented at the NeurIPS 2022 Human in the Loop Learning (HiLL) Workshop

arXiv:2211.10869 [pdf, other]

UniMASK: Unified Inference in Sequential Decision Problems

Authors: Micah Carroll, Orr Paradise, Jessy Lin, Raluca Georgescu, Mingfei Sun, David Bignell, Stephanie Milani, Katja Hofmann, Matthew Hausknecht, Anca Dragan, Sam Devlin

Abstract: Randomly masking and predicting word tokens has been a successful approach in pre-training language models for a variety of downstream tasks. In this work, we observe that the same idea also applies naturally to sequential decision-making, where many well-studied tasks like behavior cloning, offline reinforcement learning, inverse dynamics, and waypoint conditioning correspond to different sequence maskings over a sequence of states, actions, and returns. We introduce the UniMASK framework, which provides a unified way to specify models which can be trained on many different sequential decision-making tasks. We show that a single UniMASK model is often capable of carrying out many tasks with performance similar to or better than single-task models. Additionally, after fine-tuning, our UniMASK models consistently outperform comparable single-task models. Our code is publicly available at https://github.com/micahcarroll/uniMASK.

Submitted 19 November, 2022; originally announced November 2022.

Comments: NeurIPS 2022 (Oral). A prior version was published at an ICML Workshop, available at arXiv:2204.13326

arXiv:2211.01602 [pdf, other]

Optimal Behavior Prior: Data-Efficient Human Models for Improved Human-AI Collaboration

Authors: Mesut Yang, Micah Carroll, Anca Dragan

Abstract: AI agents designed to collaborate with people benefit from models that enable them to anticipate human behavior. However, realistic models tend to require vast amounts of human data, which is often hard to collect. A good prior or initialization could make for more data-efficient training, but what makes for a good prior on human behavior? Our work leverages a very simple assumption: people generally act closer to optimal than to random chance. We show that using optimal behavior as a prior for human models makes these models vastly more data-efficient and able to generalize to new environments. Our intuition is that such a prior enables the training to focus one's precious real-world data on capturing the subtle nuances of human suboptimality, instead of on the basics of how to do the task in the first place. We also show that using these improved human models often leads to better human-AI collaboration performance compared to using models based on real human data alone.

Submitted 19 November, 2022; v1 submitted 3 November, 2022; originally announced November 2022.

Comments: Presented at the NeurIPS 2022 Human in the Loop Learning (HiLL) Workshop

arXiv:2210.16381 [pdf, other]

Not Another Day Zero: Design Hackathons for Community-Based Water Quality Monitoring

Authors: Srishti Gupta, Chun-Hua Tsai, John M. Carroll

Abstract: This study looks at water quality monitoring and management as a new form of community engagement. Through a series of a unique research method called 'design hackathons', we engaged with a hyperlocal community of citizens who are actively involved in monitoring and management of their local watershed. These design hackathons sought to understand the motivation, practices, collaboration and experiences of these citizens. Qualitative analysis of data revealed the nature of the complex stakeholder network, workflow practices, initiatives to engage with a larger community, current state of technological infrastructure being used, and innovative design scenarios proposed by the hackathon participants. Based on this comprehensive analysis, we conceptualize water quality monitoring and management as community-based monitoring and management, and water data as community data. Such a conceptualization sheds light on how these practices can help in preempting water crisis by empowering citizens through increased awareness, active participation and informal learning of water data and resources.

Submitted 28 October, 2022; originally announced October 2022.

Comments: 21 pages, 3 figures, 3 tables

arXiv:2210.01647 [pdf, other]

Codeless App Development: Evaluating A Cloud-Native Domain-Specific Functions Approach

Authors: Chuhao Wu, Jose Miguel Perez-Alvarez, Adrian Mos, John M. Carroll

Abstract: Mobile applications play an important role in the economy today and there is an increasing trend for app enablement on multiple platforms. However, creating, distributing, and maintaining an application remain expert tasks. Even for software developers, the process can be error-prone and resource-consuming, especially when targeting different platforms simultaneously. Researchers have proposed several frameworks to facilitate cross-platform app development, but little attention has been paid to non-technical users. In this paper, we described the Flow framework, which takes advantage of domain-specific languages to enable no-code specification for app modeling. The cloud-native coordination mechanism further supports non-technical users to execute, monitor, and maintain apps for any target platforms. User evaluations were conducted to assess the usability and user experience with the system. The results indicated that users can develop apps in Flow with ease, but the prototype could be optimized to reduce learning time and workload.

Submitted 4 October, 2022; originally announced October 2022.

arXiv:2206.07760 [pdf, other]

doi 10.3390/e24081116

Multiscale methods for signal selection in single-cell data

Authors: Renee S. Hoekzema, Lewis Marsh, Otto Sumray, Thomas M. Carroll, Xin Lu, Helen M. Byrne, Heather A. Harrington

Abstract: Analysis of single-cell transcriptomics often relies on clustering cells and then performing differential gene expression (DGE) to identify genes that vary between these clusters. These discrete analyses successfully determine cell types and markers; however, continuous variation within and between cell types may not be detected. We propose three topologically motivated mathematical methods for unsupervised feature selection that consider discrete and continuous transcriptional patterns on an equal footing across multiple scales simultaneously. Eigenscores ($\text{eig}_i$) rank signals or genes based on their correspondence to low-frequency intrinsic patterning in the data using the spectral decomposition of the Laplacian graph. The multiscale Laplacian score (MLS) is an unsupervised method for locating relevant scales in data and selecting the genes that are coherently expressed at these respective scales. The persistent Rayleigh quotient (PRQ) takes data equipped with a filtration, allowing the separation of genes with different roles in a bifurcation process (e.g., pseudo-time). We demonstrate the utility of these techniques by applying them to published single-cell transcriptomics data sets. The methods validate previously identified genes and detect additional biologically meaningful genes with coherent expression patterns. By studying the interaction between gene signals and the geometry of the underlying space, the three methods give multidimensional rankings of the genes and visualisation of relationships between them.

Submitted 6 October, 2022; v1 submitted 15 June, 2022; originally announced June 2022.

Comments: 32 pages, 15 figures, 1 table. Revised and published in Entropy, special issue Applications of Topological Data Analysis in the Life Sciences

Journal ref: Entropy 2022, 24(8), 1116

arXiv:2204.13326 [pdf, other]

Towards Flexible Inference in Sequential Decision Problems via Bidirectional Transformers

Authors: Micah Carroll, Jessy Lin, Orr Paradise, Raluca Georgescu, Mingfei Sun, David Bignell, Stephanie Milani, Katja Hofmann, Matthew Hausknecht, Anca Dragan, Sam Devlin

Abstract: Randomly masking and predicting word tokens has been a successful approach in pre-training language models for a variety of downstream tasks. In this work, we observe that the same idea also applies naturally to sequential decision making, where many well-studied tasks like behavior cloning, offline RL, inverse dynamics, and waypoint conditioning correspond to different sequence maskings over a sequence of states, actions, and returns. We introduce the FlexiBiT framework, which provides a unified way to specify models which can be trained on many different sequential decision making tasks. We show that a single FlexiBiT model is simultaneously capable of carrying out many tasks with performance similar to or better than specialized models. Additionally, we show that performance can be further improved by fine-tuning our general model on specific tasks of interest.

Submitted 9 December, 2022; v1 submitted 28 April, 2022; originally announced April 2022.

Comments: Superseded by arXiv:2211.10869

arXiv:2204.11966 [pdf, other]

Estimating and Penalizing Induced Preference Shifts in Recommender Systems

Authors: Micah Carroll, Anca Dragan, Stuart Russell, Dylan Hadfield-Menell

Abstract: The content that a recommender system (RS) shows to users influences them. Therefore, when choosing a recommender to deploy, one is implicitly also choosing to induce specific internal states in users. Even more, systems trained via long-horizon optimization will have direct incentives to manipulate users: in this work, we focus on the incentive to shift user preferences so they are easier to satisfy. We argue that - before deployment - system designers should: estimate the shifts a recommender would induce; evaluate whether such shifts would be undesirable; and perhaps even actively optimize to avoid problematic shifts. These steps involve two challenging ingredients: estimation requires anticipating how hypothetical algorithms would influence user preferences if deployed - we do this by using historical user interaction data to train a predictive user model which implicitly contains their preference dynamics; evaluation and optimization additionally require metrics to assess whether such influences are manipulative or otherwise unwanted - we use the notion of "safe shifts", that define a trust region within which behavior is safe: for instance, the natural way in which users would shift without interference from the system could be deemed "safe". In simulated experiments, we show that our learned preference dynamics model is effective in estimating user preferences and how they would respond to new recommenders. Additionally, we show that recommenders that optimize for staying in the trust region can avoid manipulative behaviors while still generating engagement.

Submitted 14 July, 2022; v1 submitted 25 April, 2022; originally announced April 2022.

Comments: Accepted to ICML 2022 (Spotlight)

Journal ref: Proceedings of the 39th International Conference on Machine Learning, PMLR 162:2686-2708, 2022

arXiv:2202.01365 [pdf, other]

Feasibility of Interactive 3D Map for Remote Sighted Assistance

Authors: Jingyi Xie, Rui Yu, Sooyeon Lee, Yao Lyu, Syed Masum Billah, John M. Carroll

Abstract: Remote sighted assistance (RSA) has emerged as a conversational assistive technology, where remote sighted workers, i.e., agents, provide real-time assistance to users with vision impairments via video-chat-like communication. Researchers found that agents' lack of environmental knowledge, the difficulty of orienting users in their surroundings, and the inability to estimate distances from users' camera feeds are key challenges to sighted agents. To address these challenges, researchers have suggested assisting agents with computer vision technologies, especially 3D reconstruction. This paper presents a high-fidelity prototype of such an RSA, where agents use interactive 3D maps with localization capability. We conducted a walkthrough study with thirteen agents and one user with simulated vision impairment using this prototype. The study revealed that, compared to baseline RSA, the agents were significantly faster in providing navigational assistance to users, and their mental workload was significantly reduced -- all indicate the feasibility and prospect of 3D maps in RSA.

Submitted 2 February, 2022; originally announced February 2022.

arXiv:2105.10754 [pdf]

doi 10.1109/GEM.2019.8811554

Effects of VR Gaming and Game Genre on Player Experience

Authors: Michael Carroll, Ethan Osborne, Caglar Yildirim

Abstract: With the increasing availability of modern virtual reality (VR) headsets, the use and applications of VR technology for gaming purposes have become more pervasive than ever. Despite the growing popularity of VR gaming, user studies into how it might affect the player experience (PX) during the gameplay are scarce. Accordingly, the current study investigated the effects of VR gaming and game genre on PX. We compared PX metrics for two game genres, strategy (more interactive) and racing (less interactive), across two gaming platforms, VR and traditional desktop gaming. Participants were randomly assigned to one of the gaming platforms, played both a strategy and racing game on their corresponding platform, and provided PX ratings. Results revealed that, regardless of the game genre, participants in the VR gaming condition experienced a greater level of sense of presence than did those in the desktop gaming condition. That said, results showed that the two gaming platforms did not significantly differ from one another in PX ratings. As for the effect of game genre, participants provided greater PX ratings for the strategy game than for the racing game, regardless of whether the game was played on a VR headset or desktop computer. Collectively, these results indicate that although VR gaming affords a greater sense of presence in the game environment, this increase in presence does not seem to translate into a more satisfactory PX when playing either a strategy or racing game.

Submitted 22 May, 2021; originally announced May 2021.

Comments: 2019 IEEE Games, Entertainment, Media Conference (GEM)

arXiv:2103.13921 [pdf, ps, other]

The Resh Programming Language for Multirobot Orchestration

Authors: Martin Carroll, Kedar S. Namjoshi, Itai Segall

Abstract: This paper describes Resh, a new, statically typed, interpreted programming language and associated runtime for orchestrating multirobot systems. The main features of Resh are: (1) It offloads much of the tedious work of programming such systems away from the programmer and into the language runtime; (2) It is based on a small set of temporal and locational operators; and (3) It is not restricted to specific robot types or tasks. The Resh runtime consists of three engines that collaborate to run a Resh program using the available robots in their current environment. This paper describes both Resh and its runtime and gives examples of its use.

Submitted 25 March, 2021; originally announced March 2021.

Comments: Accepted for publication at ICRA'21

arXiv:2101.05507 [pdf, other]

Evaluating the Robustness of Collaborative Agents

Authors: Paul Knott, Micah Carroll, Sam Devlin, Kamil Ciosek, Katja Hofmann, A. D. Dragan, Rohin Shah

Abstract: In order for agents trained by deep reinforcement learning to work alongside humans in realistic settings, we will need to ensure that the agents are robust. Since the real world is very diverse, and human behavior often changes in response to agent deployment, the agent will likely encounter novel situations that have never been seen during training. This results in an evaluation challenge: if we cannot rely on the average training or validation reward as a metric, then how can we effectively evaluate robustness? We take inspiration from the practice of unit testing in software engineering. Specifically, we suggest that when designing AI agents that collaborate with humans, designers should search for potential edge cases in possible partner behavior and possible states encountered, and write tests which check that the behavior of the agent in these edge cases is reasonable. We apply this methodology to build a suite of unit tests for the Overcooked-AI environment, and use this test suite to evaluate three proposals for improving robustness. We find that the test suite provides significant insight into the effects of these proposals that were generally not revealed by looking solely at the average validation reward.

Submitted 14 January, 2021; originally announced January 2021.

arXiv:2011.07118 [pdf, ps, other]

Deep Multi-view Image Fusion for Soybean Yield Estimation in Breeding Applications

Authors: Luis G Riera, Matthew E. Carroll, Zhisheng Zhang, Johnathon M. Shook, Sambuddha Ghosal, Tianshuang Gao, Arti Singh, Sourabh Bhattacharya, Baskar Ganapathysubramanian, Asheesh K. Singh, Soumik Sarkar

Abstract: Reliable seed yield estimation is an indispensable step in plant breeding programs geared towards cultivar development in major row crops. The objective of this study is to develop a machine learning (ML) approach adept at soybean [Glycine max L. (Merr.)] pod counting to enable genotype seed yield rank prediction from in-field video data collected by a ground robot. To meet this goal, we developed a multi-view image-based yield estimation framework utilizing deep learning architectures. Plant images captured from different angles were fused to estimate the yield and subsequently to rank soybean genotypes for application in breeding decisions. We used data from controlled imaging environment in field, as well as from plant breeding test plots in field to demonstrate the efficacy of our framework via comparing performance with manual pod counting and yield estimation. Our results demonstrate the promise of ML models in making breeding decisions with significant reduction of time and human effort, and opening new breeding methods avenues to develop cultivars.

Submitted 13 November, 2020; originally announced November 2020.

Comments: 18 pages, 8 figures, and 3 Tables

arXiv:2009.11247 [pdf, other]

Novel Computational Linguistic Measures, Dialogue System and the Development of SOPHIE: Standardized Online Patient for Healthcare Interaction Education

Authors: Mohammad Rafayet Ali, Taylan Sen, Benjamin Kane, Shagun Bose, Thomas M Carroll, Ronald Epstein, Lenhart Schubert, Ehsan Hoque

Abstract: In this paper, we describe the iterative participatory design of SOPHIE, an online virtual patient for feedback-based practice of sensitive patient-physician conversations, and discuss an initial qualitative evaluation of the system by professional end users. The design of SOPHIE was motivated from a computational linguistic analysis of the transcripts of 383 patient-physician conversations from an essential office visit of late stage cancer patients with their oncologists. We developed methods for the automatic detection of two behavioral paradigms, lecturing and positive language usage patterns (sentiment trajectory of conversation), that are shown to be significantly associated with patient prognosis understanding. These automated metrics associated with effective communication were incorporated into SOPHIE, and a pilot user study identified that SOPHIE was favorably reviewed by a user group of practicing physicians.

Submitted 23 September, 2020; originally announced September 2020.

arXiv:2009.09086 [pdf, other]

Focused Clinical Query Understanding and Retrieval of Medical Snippets powered through a Healthcare Knowledge Graph

Authors: Maulik R. Kamdar, Michael Carroll, Will Dowling, Linda Wogulis, Cailey Fitzgerald, Matt Corkum, Danielle Walsh, David Conrad, Craig E. Stanley, Jr., Steve Ross, Dru Henke, Mevan Samarasinghe

Abstract: Clinicians face several significant barriers to search and synthesize accurate, succinct, updated, and trustworthy medical information from several literature sources during the practice of medicine and patient care. In this talk, we will be presenting our research behind the development of a Focused Clinical Search Service, powered by a Healthcare Knowledge Graph, to interpret the query intent behind clinical search queries and retrieve relevant medical snippets from a diverse corpus of medical literature.

Submitted 17 September, 2020; originally announced September 2020.

Comments: Under Review as a Podium Talk at the AMIA Informatics Summit 2021

arXiv:1910.05789 [pdf, other]

On the Utility of Learning about Humans for Human-AI Coordination

Authors: Micah Carroll, Rohin Shah, Mark K. Ho, Thomas L. Griffiths, Sanjit A. Seshia, Pieter Abbeel, Anca Dragan

Abstract: While we would like agents that can coordinate with humans, current algorithms such as self-play and population-based training create agents that can coordinate with themselves. Agents that assume their partner to be optimal or similar to them can converge to coordination protocols that fail to understand and be understood by humans. To demonstrate this, we introduce a simple environment that requires challenging coordination, based on the popular game Overcooked, and learn a simple model that mimics human play. We evaluate the performance of agents trained via self-play and population-based training. These agents perform very well when paired with themselves, but when paired with our human model, they are significantly worse than agents designed to play with the human model. An experiment with a planning algorithm yields the same conclusion, though only when the human-aware planner is given the exact human model that it is playing with. A user study with real humans shows this pattern as well, though less strongly. Qualitatively, we find that the gains come from having the agent adapt to the human's gameplay. Given this result, we suggest several approaches for designing agents that learn about humans in order to better coordinate with them. Code is available at https://github.com/HumanCompatibleAI/overcooked_ai.

Submitted 8 January, 2020; v1 submitted 13 October, 2019; originally announced October 2019.

Comments: Published at NeurIPS 2019 (http://papers.nips.cc/paper/8760-on-the-utility-of-learning-about-humans-for-human-ai-coordination)

arXiv:1902.05630 [pdf, other]

Using Key Player Analysis as a Method for Examining the Role of Community Animators in Technology Adoption

Authors: Jomara Sandbulte, Jessica Kropczynski, John M. Carroll

Abstract: This paper examines the role of community animators in technology adoption. Community animators are individuals that actively build social networks and broker ties between nodes in those networks. The present study observes technology adoption patterns through data collected from a mobile application at a local arts festival. A social network was constructed through photo-sharing and interaction within the app. Given this data, we propose the use of key player analysis to identify community animators. In addition, we use a graph invariant (i.e., fragmentation in the network) to describe the role and impact of key players on the full network of interactions. Our results contribute to literature on technology adoption in usability studies by proposing a method to quantify and identify the theoretical concept of community animators. We further analyze the types of community animators to be found in early adoption of technology: the early adopters themselves, and the initiating developers.

Submitted 14 February, 2019; originally announced February 2019.

Comments: 10 pages, 10 figures, 4 tables

arXiv:1902.02842 [pdf, other]

Community Animation: Exploring a design space that leverages geosocial networking to increase community engagement

Authors: Jomara Sandbulte, Jessica Kropczynski, John M. Carroll

Abstract: This paper explores a design study of a smartphone enabled meet-up app meant to inspire engagement in community innovation. Community hubs such as co-working spaces, incubators, and maker spaces attract community members with diverse interests. This paper presents these spaces as a design opportunity for an application that helps host community-centered meet-ups in smart and connected communities. Our design study explores three scenarios of use, inspired by previous literature, for organizing meet-ups and compares them by surveying potential users. Based on the results of our survey, we propose several design implications and implement them in the Community Animator geosocial networking application, which identifies nearby individuals that are willing to chat or perform community-centered activities. We present the results of both our survey and our prototype, discuss our design goals, and provide design implications for civic-minded, geosocial networking applications. Our contribution in this work is the development process, proposed design of a mobile application to support community-centered meet-ups, and insights for future work.

Submitted 7 February, 2019; originally announced February 2019.

Comments: 10 pages, 3 figures

arXiv:1812.00148 [pdf, ps, other]

Conversations for Vision: Remote Sighted Assistants Helping People with Visual Impairments

Authors: Sooyeon Lee, Madison Reddie, Krish Gurdasani, Xiying Wang, Jordan Beck, Mary Beth Rosson, John M. Carroll

Abstract: People with visual impairment (PVI) must interact with a world they cannot see. Remote sighted assistance has emerged as a conversational/social support system. We interviewed participants who either provide or receive assistance via a conversational/social prosthetic called Aira (https://aira.io/). We identified four types of support provided: scene description, performance, social interaction, and navigation. We found that conversational style is context-dependent. Sighted assistants make intentional efforts to elicit PVI's personal knowledge and leverage it in the guidance they provide. PVI used non-verbal behaviors (e.g. hand gestures) as a parallel communication channel to provide feedback or guidance to sighted assistants. We also discuss implications for design.

Submitted 1 December, 2018; originally announced December 2018.

Comments: 19 pages

arXiv:1609.09767 [pdf, other]

Internet Scale Research Studies using SDL-RX

Authors: James Kizer, Arnaud Sahaguet, Neil Lakin, Michael Carroll, JP Pollak, Deborah Estrin

Abstract: Medical research is one area where collecting data is usually hard and expensive. With the launch of ResearchKit, Apple and Sage Bionetworks made large-scale personal data collection increasingly popular via simple text-based survey apps running on mobile phones. But such surveys can be a barrier in terms of usability and richness of the data being collected. In this paper, we present SDL-RX, a powerful software library designed for ResearchKit that enables study-specific, personalized, and rich visual surveys, for both iOS and Android platforms.

Submitted 30 September, 2016; originally announced September 2016.

Comments: Presented at the Data For Good Exchange 2016

arXiv:1304.1513 [pdf]

Hierarchical Evidence Accumulation in the Pseiki System and Experiments in Model-Driven Mobile Robot Navigation

Authors: A. C. Kak, K. M. Andress, C. Lopez-Abadia, M. S. Carroll, J. R. Lewis

Abstract: In this paper, we will review the process of evidence accumulation in the PSEIKI system for expectation-driven interpretation of images of 3-D scenes. Expectations are presented to PSEIKI as a geometrical hierarchy of abstractions. PSEIKI's job is then to construct abstraction hierarchies in the perceived image taking cues from the abstraction hierarchies in the expectations. The Dempster-Shafer formalism is used for associating belief values with the different possible labels for the constructed abstractions in the perceived image. This system has been used successfully for autonomous navigation of a mobile robot in indoor environments.

Submitted 27 March, 2013; originally announced April 2013.

Comments: Appears in Proceedings of the Fifth Conference on Uncertainty in Artificial Intelligence (UAI1989)

Report number: UAI-P-1989-PG-194-207

arXiv:cs/0111007 [pdf, ps, other]

Explaining Scenarios for Information Personalization

Authors: Naren Ramakrishnan, Mary Beth Rosson, John M. Carroll

Abstract: Personalization customizes information access. The PIPE ("Personalization is Partial Evaluation") modeling methodology represents interaction with an information space as a program. The program is then specialized to a user's known interests or information seeking activity by the technique of partial evaluation. In this paper, we elaborate PIPE by considering requirements analysis in the personalization lifecycle. We investigate the use of scenarios as a means of identifying and analyzing personalization requirements. As our first result, we show how designing a PIPE representation can be cast as a search within a space of PIPE models, organized along a partial order. This allows us to view the design of a personalization system, itself, as specialized interpretation of an information space. We then exploit the underlying equivalence of explanation-based generalization (EBG) and partial evaluation to realize high-level goals and needs identified in scenarios; in particular, we specialize (personalize) an information space based on the explanation of a user scenario in that information space, just as EBG specializes a theory based on the explanation of an example in that theory. In this approach, personalization becomes the transformation of information spaces to support the explanation of usage scenarios. An example application is described.

Submitted 5 November, 2001; originally announced November 2001.

ACM Class: H.3.5; H.4.2; H.5.4; I.2.6; K.8

Showing 1–47 of 47 results for author: Carroll, M