Instructions for ACL 2023 Proceedings

First Author
Affiliation / Address line 1
Affiliation / Address line 2
Affiliation / Address line 3
email@domain
Second Author
Affiliation / Address line 1
Affiliation / Address line 2
Affiliation / Address line 3
email@domain
Abstract

This document is a supplement to the general instructions for *ACL authors. It contains instructions for using the style file for ACL 2023. The document itself conforms to its own specifications, and is, therefore, an example of what your manuscript should look like. These instructions should be used both for papers submitted for review and for final versions of accepted papers.

1 Introduction

The recent success in Natural Language Processing (NLP) comes with a variety of Large Language Models (LLMs) such as GPT-3 (175B) (brown2020language), PaLM (540B) (chowdhery2023palm), and GPT-4 (reportedly 1.7T) (achiam2023gpt). These LLMs have demonstrated superior performance on various downstream tasks. However, alongside the performance, there is a rising concern about their occasionally undesired behaviors, like hallucinations in their responses (ji-etal-2023-towards) and misalignment with human expectations (vafa2024large). These phenomena coincide with the long-standing issue of training deep learning models, which are known to be vulnerable to spurious correlations with artifacts, shortcuts, and biases prevalent in real-world training data (geirhos2020shortcut; hermann2020shapes). Hence, there is a growing demand for AI transparency, particularly in high-stakes applications. This demand underscores the need for research to understand model decisions and enhance their robustness.

Counterfactual generation has emerged as an effective means to probe and understand the reasoning behind the predictions of a model by highlighting which part of the input influences the outcomes (wachter2017counterfactual; miller2019explanation). It makes minimal modifications to an original instance to create counterfactual examples (CFEs) with different predicted classes. CFEs can be used to detect model fairness issues within minority groups (NIPS2017_a486cd07; NIPS2017_1271a702) and enhance the robustness and generalizability of the model by augmenting the training dataset (sen-etal-2021-counterfactually; wang2021robustness; qiu2024paircfr).
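The description above can be written as a constrained optimization problem. The following is only a sketch under assumed notation that does not appear in the text: f is the model being explained, x the original instance, y' the desired target class, \mathcal{X} the space of valid texts, and d a distance between texts (for example, a token-level edit distance). It assumes the amsmath package is loaded.

```latex
% Assumed notation (not from the source): f, x, y', \mathcal{X}, d.
\begin{equation*}
  x^{\mathrm{cf}} \in \operatorname*{arg\,min}_{x' \in \mathcal{X}} \; d(x, x')
  \quad \text{subject to} \quad f(x') = y', \;\; y' \neq f(x)
\end{equation*}
```

Under this reading, "minimal modifications" corresponds to minimizing d(x, x'), and "different predicted classes" corresponds to the constraint on f(x').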

In the NLP domain, early studies (jung2022counterfactual; robeer2021generating) were inspired by traditional CFE generators for tabular data. However, because the perturbation space of each word is vast and discrete, directly applying these techniques to text is both less effective and inefficient. Additionally, textual CFEs should adhere to lexicon and grammar rules, and follow the language context and logic (sudhakar2019transforming; wu-etal-2021-polyjuice; ross-etal-2021-explaining). Subsequent research has begun to utilize controlled text generation models conditioned on a sentence and a label (robeer2021generating; madaan2021generate), or to replace influential words with alternatives for the target prediction (ross-etal-2022-tailor; zhu-etal-2023-explain). More recently, the rise of LLMs enables users to craft sophisticated prompts to obtain desired CFEs (chen2023disco; sachdeva-etal-2024-catfood).

As research on textual CFE generation rapidly grows, existing surveys that primarily focus on tabular data (verma2020counterfactual; stepin2021survey; 10.1145/3527848; guidotti2022counterfactual) fall short in providing clear guidelines for researchers and developers interested in this area. There is a pressing need for a systematic review specifically tailored to textual CFE generation.

The challenge of reviewing this area arises from the following factors. Firstly, the generation methods are inherently tied to the task definitions; different applications such as sentiment analysis and natural language inference require tailored generation strategies. Secondly, the formulation of the generation problem varies depending on the modification strategies and language models chosen. Finally, fully comprehending and evaluating the various algorithms requires a broad and deep understanding that spans multiple disciplines beyond NLP, including generative modeling, causality, and AI explanation, which adds to the complexity of conducting a comprehensive review.

In this survey, we review past research on natural language counterfactual generation and categorize these methods into four groups, as shown in Figure 1: (1) Manual generation, where a human annotator is asked to edit a limited number of words for a given text to change its label (kaushik2019learning); (2) Gradient-based optimization, which fine-tunes a controlled text generation model using gradient descent, given the input sentence encoding and a desired target (robeer2021generating; yan2024counterfactual); (3) Identify and then generate, a two-stage approach that pinpoints and then substitutes words to alter the labels (malmi-etal-2020-unsupervised; gilo2024general; martens2014explaining); and (4) LLMs as counterfactual generators, which directly create the counterfactuals via prompting LLMs (bhattacharjee2024llmguided; gat2023faithful; sachdeva-etal-2024-catfood). We also summarize the qualitative and quantitative metrics used to evaluate the quality of the generated counterfactuals. Finally, we discuss the remaining challenges in this field and outline promising research directions, particularly in the era of LLMs.

The rest of this paper is organized as follows. We first introduce the definition of CFEs and practical considerations during generation. We then present our novel taxonomy, describe each group, and summarize the metrics used to evaluate generation quality. Section 2 discusses ongoing challenges and promising research directions. Finally, Section 3 concludes the paper.

Figure 1: Overview of the proposed taxonomy for natural language counterfactual generation. The complete taxonomy is shown in the full-tree figure in the Appendix.

2 Challenges and Future Directions

Fair evaluation. Counterfactuals are inherently speculative, making it difficult to compare CFEs from different methods due to the absence of ground truth. This challenge arises from two main aspects: (1) Existing metrics evaluate CFEs from various, often non-comparable perspectives. For example, prioritizing higher proximity (minimal changes to the original text) typically results in lower diversity. Because optimizing one metric often compromises another, no single method is likely to dominate across all metrics, which makes it hard to conclusively identify the best one. (2) Many methods use filtering techniques to discard undesired results. Direct comparisons between filtered and unfiltered CFEs may introduce bias in the evaluation process. For instance, methods employing GPT-2 to filter out grammatically incorrect or nonsensical sentences (radford2019language; ross-etal-2022-tailor) often outperform those that do not use such filters on fluency score.

Model privacy and security. Model privacy and security are crucial considerations in the development and deployment of machine learning systems. CFEs, which reveal sensitive changes near the decision boundary, can be exploited by adversaries to extract high-fidelity surrogate models (aivodji2020model; 10.1145/3531146.3533188), posing risks to model integrity. Future research should focus on strategies to mitigate model extraction risks while maintaining the utility of CFEs.

Counterfactual multiplicity. Multiple counterfactuals can exist with similar evaluation scores. For example, replacing ‘terrible’ with ‘good’ or ‘excellent’ results in similar edit distances and flip rates. While current research often centers on generating CFEs, having diverse CFEs is crucial for understanding models from various perspectives (wachter2017counterfactual), enhancing fairness detection with higher test coverage (10.1145/3351095.3372850), and training robust models (joshi-he-2022-investigation; qiu2024paircfr). Future work should focus on selecting diverse CFEs or incorporating diversity into the objective, possibly with a cardinality constraint.
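One way to read "incorporating diversity into the objective, possibly with a cardinality constraint" is as a subset-selection problem. The formulation below is a hedged sketch, not a method taken from the surveyed work; every symbol is an assumption introduced for illustration: \mathcal{C} is a pool of candidate CFEs that already flip the label, q is a per-example quality score (for instance, negative edit distance to the original), \mathrm{div} is a set-level diversity measure (for instance, mean pairwise distance), \lambda \ge 0 trades quality against diversity, and k is the cardinality budget. It assumes the amsmath package is loaded.

```latex
% Assumed notation (not from the source): \mathcal{C}, q, div, \lambda, k.
\begin{equation*}
  S^{*} \in \operatorname*{arg\,max}_{S \subseteq \mathcal{C},\; |S| \le k}
  \;\; \sum_{x' \in S} q(x') \; + \; \lambda \, \mathrm{div}(S)
\end{equation*}
```

With \lambda = 0 this reduces to picking the k highest-quality candidates; larger \lambda favors sets such as {‘good’, ‘excellent’} only when their members differ enough under the chosen diversity measure.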

LLM-assisted CFEs. To better leverage LLMs’ understanding and reasoning capabilities, an in-depth task analysis of the counterfactual problem is a prerequisite, since it helps in designing clear and constructive prompts. This is particularly important when designing prompts for different NLP tasks. Prompts framed in a “What-if” scenario may outperform those framed as optimization problems. On the other hand, LLMs are not without flaws and face several challenges, including bias, fairness issues, hallucinations, and difficulty in retaining long-term context. To mitigate these issues, we should consider integrating debiasing techniques and fairness constraints, developing advanced memory architectures, and integrating external knowledge.

3 Conclusion

In this survey, we systematically review recent advancements in natural language counterfactual generation, including the latest LLM-assisted generation approaches. Based on algorithmic differences, we propose a novel taxonomy that categorizes these methods into four groups, providing an in-depth comparison, discussion, and summary for each group. Additionally, we summarize the commonly used metrics to evaluate the quality of counterfactuals. Lastly, we discuss research challenges and aim to inspire future directions. With the widespread use of LLMs, model explanation, fairness, and robust training have received increasing attention. We believe this survey can serve as an easy-to-follow guideline to motivate future advances that address these problems.

4 Introduction

These instructions are for authors submitting papers to ACL 2023 using LaTeX. They are not self-contained. All authors must follow the general instructions for *ACL proceedings (http://acl-org.github.io/ACLPUB/formatting.html), as well as guidelines set forth in the ACL 2023 call for papers (https://2023.aclweb.org/calls/main_conference/). This document contains additional instructions for the LaTeX style files. The templates include the LaTeX source of this document (acl2023.tex), the LaTeX style file used to format it (acl2023.sty), an ACL bibliography style (acl_natbib.bst), an example bibliography (custom.bib), and the bibliography for the ACL Anthology (anthology.bib).

5 Engines

To produce a PDF file, pdfLaTeX is strongly recommended (over original LaTeX plus dvips+ps2pdf or dvipdf). XeLaTeX also produces PDF files, and is especially suitable for text in non-Latin scripts.

Command   Output    Command   Output
{\"a}     ä         {\c c}    ç
{\^e}     ê         {\u g}    ğ
{\`i}     ì         {\l}      ł
{\.I}     İ         {\~n}     ñ
{\o}      ø         {\H o}    ő
{\'u}     ú         {\v r}    ř
{\aa}     å         {\ss}     ß
Table 1: Example commands for accented characters, to be used in, e.g., BibTeX entries.

6 Preamble

Output              natbib command    Old ACL-style command
(ct1965)            \citep            \cite
ct1965              \citealp          no equivalent
ct1965              \citet            \newcite
(ct1965)            \citeyearpar      \shortcite
ct1965’s (ct1965)   \citeposs         no equivalent
(FFT; ct1965)       \citep[FFT;][]    no equivalent
Table 2: Citation commands supported by the style file. The style is based on the natbib package and supports all natbib citation commands. It also supports commands defined in previous ACL style files for compatibility.

The first line of the file must be

\documentclass[11pt]{article}

To load the style file in the review version:

\usepackage[review]{ACL2023}

For the final version, omit the review option:

\usepackage{ACL2023}

To use Times Roman, put the following in the preamble:

\usepackage{times}

(Alternatives like txfonts or newtx are also acceptable.) Please see the LaTeX source of this document for comments on other packages that may be useful. Set the title and author using \title and \author. Within the author list, format multiple authors using \and, \And, and \AND; please see the LaTeX source for examples. By default, the box containing the title and author names is set to the minimum of 5 cm. If you need more space, include the following in the preamble:

\setlength\titlebox{<dim>}

where <dim> is replaced with a length. Do not set this length smaller than 5 cm.
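Put together, the preamble settings described in this section might look like the sketch below. It is illustrative only: the title text, the author block contents, and the 6cm title box are placeholder values chosen here, not requirements from the style file.

```latex
\documentclass[11pt]{article}

% Review version; drop the [review] option for the final version.
\usepackage[review]{ACL2023}

% Times Roman (txfonts or newtx are acceptable alternatives).
\usepackage{times}

% Optional: enlarge the title box. 6cm is a placeholder value;
% the length must not be smaller than 5cm.
\setlength\titlebox{6cm}

% Placeholder title and authors; \And separates authors on one row.
\title{Paper Title}
\author{First Author \\
  Affiliation / Address line 1 \\
  \texttt{email@domain} \And
  Second Author \\
  Affiliation / Address line 1 \\
  \texttt{email@domain}}

\begin{document}
\maketitle
\begin{abstract}
Abstract text goes here.
\end{abstract}

\section{Introduction}
Body text goes here.
\end{document}
```

Compiling this sketch requires the ACL 2023 style file to be available in the same directory or on the TeX search path.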

7 Document Body

7.1 Footnotes

Footnotes are inserted with the \footnote command, e.g. \footnote{This is a footnote.}

7.2 Tables and figures

See Table 1 for an example of a table and its caption. Do not override the default caption sizes.

7.3 Hyperlinks

Users of older versions of LaTeX may encounter the following error during compilation:

\pdfendlink ended up in different nesting level than \pdfstartlink.

This happens when pdfLaTeX is used and a citation splits across a page boundary. The best way to fix this is to upgrade LaTeX to 2018-12-01 or later.

7.4 Citations

Table 2 shows the syntax supported by the style files. We encourage you to use the natbib styles. You can use the command \citet (cite in text) to get “author (year)” citations, like this citation to a paper by Gusfield:97. You can use the command \citep (cite in parentheses) to get “(author, year)” citations (Gusfield:97). You can use the command \citealp (alternative cite without parentheses) to get “author, year” citations, which is useful for using citations within parentheses (e.g. Gusfield:97).
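The three commands described above can be compared side by side in a short body fragment. This is a sketch that assumes an entry with the key Gusfield:97 exists in the loaded bibliography file; the surrounding sentences are filler written for illustration.

```latex
% \citet: "author (year)", for citations that are part of the sentence.
This approach is discussed by \citet{Gusfield:97}.

% \citep: "(author, year)", for parenthetical citations.
This approach has been discussed before \citep{Gusfield:97}.

% \citealp: "author, year" without parentheses, for use inside
% parentheses that are already open.
This approach has been discussed before (e.g. \citealp{Gusfield:97}).
```

Using \citealp in the last line avoids nested parentheses, which \citep would otherwise produce there.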

7.5 References

The LaTeX and BibTeX style files provided roughly follow the American Psychological Association format. If your own bib file is named custom.bib, then placing the following before any appendices in your LaTeX file will generate the references section for you:

\bibliographystyle{acl_natbib}
\bibliography{custom}

You can obtain the complete ACL Anthology as a BibTeX file from https://aclweb.org/anthology/anthology.bib.gz. To include both the Anthology and your own .bib file, use the following instead of the above.

\bibliographystyle{acl_natbib}
\bibliography{anthology,custom}

Please see Section 8 for information on preparing BibTeX files.

7.6 Appendices

Use \appendix before any appendix section to switch the section numbering over to letters. See Appendix A for an example.
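The ordering of references and appendices described in Sections 7.5 and 7.6 can be sketched as the tail of a document. The bibliography file name custom follows the example in the text; the label name is an assumption made here for illustration.

```latex
% References come before any appendices.
\bibliographystyle{acl_natbib}
\bibliography{custom}

% After \appendix, \section numbering switches to letters (A, B, ...).
\appendix

\section{Example Appendix}
\label{sec:appendix}  % hypothetical label name

This is a section in the appendix.

\end{document}
```

With the label in place, the appendix can be referenced from the main text as Appendix~\ref{sec:appendix}.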

8 BibTeX Files

Unicode cannot be used in BibTeX entries, and some ways of typing special characters can disrupt BibTeX’s alphabetization. The recommended way of typing special characters is shown in Table 1.

Please ensure that BibTeX records contain DOIs or URLs when possible, and for all the ACL materials that you reference. Use the doi field for DOIs and the url field for URLs. If a BibTeX entry has a URL or DOI field, the paper title in the references section will appear as a hyperlink to the paper, using the hyperref package.

Limitations

ACL 2023 requires all submissions to have a section titled “Limitations”, for discussing the limitations of the paper as a complement to the discussion of strengths in the main text. This section should occur after the conclusion, but before the references. It will not count towards the page limit. The discussion of limitations is mandatory. Papers without a limitation section will be desk-rejected without review.

While we are open to different types of limitations, just mentioning that a set of results has been shown for English only probably does not reflect what we expect. Mentioning that the method works mostly for languages with limited morphology, like English, is a much better alternative. In addition, limitations such as low scalability to long text, the requirement of large GPU resources, or other things that inspire crucial further investigation are welcome.

Ethics Statement

Scientific work published at ACL 2023 must comply with the ACL Ethics Policy (https://www.aclweb.org/portal/content/acl-code-ethics). We encourage all authors to include an explicit ethics statement on the broader impact of the work, or other ethical considerations after the conclusion but before the references. The ethics statement will not count toward the page limit (8 pages for long, 4 pages for short papers).

Acknowledgements

This document has been adapted by Jordan Boyd-Graber, Naoaki Okazaki, Anna Rogers from the style files used for earlier ACL, EMNLP and NAACL proceedings, including those for EACL 2023 by Isabelle Augenstein and Andreas Vlachos, EMNLP 2022 by Yue Zhang, Ryan Cotterell and Lea Frermann, ACL 2020 by Steven Bethard, Ryan Cotterell and Rui Yan, ACL 2019 by Douwe Kiela and Ivan Vulić, NAACL 2019 by Stephanie Lukin and Alla Roskovskaya, ACL 2018 by Shay Cohen, Kevin Gimpel, and Wei Lu, NAACL 2018 by Margaret Mitchell and Stephanie Lukin, BibTeX suggestions for (NA)ACL 2017/2018 from Jason Eisner, ACL 2017 by Dan Gildea and Min-Yen Kan, NAACL 2017 by Margaret Mitchell, ACL 2012 by Maggie Li and Michael White, ACL 2010 by Jing-Shin Chang and Philipp Koehn, ACL 2008 by Johanna D. Moore, Simone Teufel, James Allan, and Sadaoki Furui, ACL 2005 by Hwee Tou Ng and Kemal Oflazer, ACL 2002 by Eugene Charniak and Dekang Lin, and earlier ACL and EACL formats written by several people, including John Chen, Henry S. Thompson and Donald Walker. Additional elements were taken from the formatting instructions of the International Joint Conference on Artificial Intelligence and the Conference on Computer Vision and Pattern Recognition.

Appendix A Example Appendix

This is a section in the appendix.