Search | arXiv e-print repository

InstructExcel: A Benchmark for Natural Language Instruction in Excel

Authors: Justin Payan, Swaroop Mishra, Mukul Singh, Carina Negreanu, Christian Poelitz, Chitta Baral, Subhro Roy, Rasika Chakravarthy, Benjamin Van Durme, Elnaz Nouri

Abstract: With the evolution of Large Language Models (LLMs) we can solve increasingly more complex NLP tasks across various domains, including spreadsheets. This work investigates whether LLMs can generate code (Excel OfficeScripts, a TypeScript API for executing many tasks in Excel) that solves Excel specific tasks provided via natural language user instructions. To do so we introduce a new large-scale be… ▽ More With the evolution of Large Language Models (LLMs) we can solve increasingly more complex NLP tasks across various domains, including spreadsheets. This work investigates whether LLMs can generate code (Excel OfficeScripts, a TypeScript API for executing many tasks in Excel) that solves Excel specific tasks provided via natural language user instructions. To do so we introduce a new large-scale benchmark, InstructExcel, created by leveraging the 'Automate' feature in Excel to automatically generate OfficeScripts from users' actions. Our benchmark includes over 10k samples covering 170+ Excel operations across 2,000 publicly available Excel spreadsheets. Experiments across various zero-shot and few-shot settings show that InstructExcel is a hard benchmark for state of the art models like GPT-4. We observe that (1) using GPT-4 over GPT-3.5, (2) providing more in-context examples, and (3) dynamic prompting can help improve performance on this benchmark. △ Less

Submitted 22 October, 2023; originally announced October 2023.

Comments: Findings of EMNLP 2023, 18 pages

arXiv:2310.01297 [pdf, other]

Co-audit: tools to help humans double-check AI-generated content

Authors: Andrew D. Gordon, Carina Negreanu, José Cambronero, Rasika Chakravarthy, Ian Drosos, Hao Fang, Bhaskar Mitra, Hannah Richardson, Advait Sarkar, Stephanie Simmons, Jack Williams, Ben Zorn

Abstract: Users are increasingly being warned to check AI-generated content for correctness. Still, as LLMs (and other generative models) generate more complex output, such as summaries, tables, or code, it becomes harder for the user to audit or evaluate the output for quality or correctness. Hence, we are seeing the emergence of tool-assisted experiences to help the user double-check a piece of AI-generat… ▽ More Users are increasingly being warned to check AI-generated content for correctness. Still, as LLMs (and other generative models) generate more complex output, such as summaries, tables, or code, it becomes harder for the user to audit or evaluate the output for quality or correctness. Hence, we are seeing the emergence of tool-assisted experiences to help the user double-check a piece of AI-generated content. We refer to these as co-audit tools. Co-audit tools complement prompt engineering techniques: one helps the user construct the input prompt, while the other helps them check the output response. As a specific example, this paper describes recent research on co-audit tools for spreadsheet computations powered by generative models. We explain why co-audit experiences are essential for any application of generative AI where quality is important and errors are consequential (as is common in spreadsheet computations). We propose a preliminary list of principles for co-audit, and outline research challenges. △ Less

Submitted 2 October, 2023; originally announced October 2023.

arXiv:2110.01659 [pdf, other]

Cross-Modal Virtual Sensing for Combustion Instability Monitoring

Authors: Tryambak Gangopadhyay, Vikram Ramanan, Satyanarayanan R Chakravarthy, Soumik Sarkar

Abstract: In many cyber-physical systems, imaging can be an important but expensive or 'difficult to deploy' sensing modality. One such example is detecting combustion instability using flame images, where deep learning frameworks have demonstrated state-of-the-art performance. The proposed frameworks are also shown to be quite trustworthy such that domain experts can have sufficient confidence to use these… ▽ More In many cyber-physical systems, imaging can be an important but expensive or 'difficult to deploy' sensing modality. One such example is detecting combustion instability using flame images, where deep learning frameworks have demonstrated state-of-the-art performance. The proposed frameworks are also shown to be quite trustworthy such that domain experts can have sufficient confidence to use these models in real systems to prevent unwanted incidents. However, flame imaging is not a common sensing modality in engine combustors today. Therefore, the current roadblock exists on the hardware side regarding the acquisition and processing of high-volume flame images. On the other hand, the acoustic pressure time series is a more feasible modality for data collection in real combustors. To utilize acoustic time series as a sensing modality, we propose a novel cross-modal encoder-decoder architecture that can reconstruct cross-modal visual features from acoustic pressure time series in combustion systems. With the "distillation" of cross-modal features, the results demonstrate that the detection accuracy can be enhanced using the virtual visual sensing modality. By providing the benefit of cross-modal reconstruction, our framework can prove to be useful in different domains well beyond the power generation and transportation industries. △ Less

Submitted 6 October, 2021; v1 submitted 4 October, 2021; originally announced October 2021.

arXiv:2101.01877 [pdf, other]

3D Convolutional Selective Autoencoder For Instability Detection in Combustion Systems

Authors: Tryambak Gangopadhyay, Vikram Ramanan, Adedotun Akintayo, Paige K Boor, Soumalya Sarkar, Satyanarayanan R Chakravarthy, Soumik Sarkar

Abstract: While analytical solutions of critical (phase) transitions in physical systems are abundant for simple nonlinear systems, such analysis remains intractable for real-life dynamical systems. A key example of such a physical system is thermoacoustic instability in combustion, where prediction or early detection of an onset of instability is a hard technical challenge, which needs to be addressed to b… ▽ More While analytical solutions of critical (phase) transitions in physical systems are abundant for simple nonlinear systems, such analysis remains intractable for real-life dynamical systems. A key example of such a physical system is thermoacoustic instability in combustion, where prediction or early detection of an onset of instability is a hard technical challenge, which needs to be addressed to build safer and more energy-efficient gas turbine engines powering aerospace and energy industries. The instabilities arising in combustion chambers of engines are mathematically too complex to model. To address this issue in a data-driven manner instead, we propose a novel deep learning architecture called 3D convolutional selective autoencoder (3D-CSAE) to detect the evolution of self-excited oscillations using spatiotemporal data, i.e., hi-speed videos taken from a swirl-stabilized combustor (laboratory surrogate of gas turbine engine combustor). 3D-CSAE consists of filters to learn, in a hierarchical fashion, the complex visual and dynamic features related to combustion instability. We train the 3D-CSAE on frames of videos obtained from a limited set of operating conditions. We select the 3D-CSAE hyper-parameters that are effective for characterizing hierarchical and multiscale instability structure evolution by utilizing the dynamic information available in the video. The proposed model clearly shows performance improvement in detecting the precursors of instability. The machine learning-driven results are verified with physics-based off-line measures. Advanced active control mechanisms can directly leverage the proposed online detection capability of 3D-CSAE to mitigate the adverse effects of combustion instabilities on the engine operating under various stringent requirements and conditions. △ Less

Submitted 6 January, 2021; originally announced January 2021.

Showing 1–4 of 4 results for author: Chakravarthy, R