
Showing 1–16 of 16 results for author: Forde, J Z

  1. arXiv:2405.14782  [pdf, other]

    cs.CL

    Lessons from the Trenches on Reproducible Evaluation of Language Models

    Authors: Stella Biderman, Hailey Schoelkopf, Lintang Sutawika, Leo Gao, Jonathan Tow, Baber Abbasi, Alham Fikri Aji, Pawan Sasanka Ammanamanchi, Sidney Black, Jordan Clive, Anthony DiPofi, Julen Etxaniz, Benjamin Fattori, Jessica Zosa Forde, Charles Foster, Jeffrey Hsu, Mimansa Jaiswal, Wilson Y. Lee, Haonan Li, Charles Lovering, Niklas Muennighoff, Ellie Pavlick, Jason Phang, Aviya Skowron, Samson Tan, et al. (5 additional authors not shown)

    Abstract: Effective evaluation of language models remains an open challenge in NLP. Researchers and engineers face methodological issues such as the sensitivity of models to evaluation setup, difficulty of proper comparisons across methods, and the lack of reproducibility and transparency. In this paper we draw on three years of experience in evaluating large language models to provide guidance and lessons…

    Submitted 29 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

  2. arXiv:2306.16900  [pdf, other]

    cs.CL

    Surveying (Dis)Parities and Concerns of Compute Hungry NLP Research

    Authors: Ji-Ung Lee, Haritz Puerto, Betty van Aken, Yuki Arase, Jessica Zosa Forde, Leon Derczynski, Andreas Rücklé, Iryna Gurevych, Roy Schwartz, Emma Strubell, Jesse Dodge

    Abstract: Many recent improvements in NLP stem from the development and use of large pre-trained language models (PLMs) with billions of parameters. Large model sizes make computational cost one of the main limiting factors for training and evaluating such models, and have raised severe concerns about the sustainability, reproducibility, and inclusiveness for researching PLMs. These concerns are often based…

    Submitted 9 November, 2023; v1 submitted 29 June, 2023; originally announced June 2023.

  3. arXiv:2303.13592  [pdf, other]

    cs.CL cs.AI

    Prompting Multilingual Large Language Models to Generate Code-Mixed Texts: The Case of South East Asian Languages

    Authors: Zheng-Xin Yong, Ruochen Zhang, Jessica Zosa Forde, Skyler Wang, Arjun Subramonian, Holy Lovenia, Samuel Cahyawijaya, Genta Indra Winata, Lintang Sutawika, Jan Christian Blaise Cruz, Yin Lin Tan, Long Phan, Rowena Garcia, Thamar Solorio, Alham Fikri Aji

    Abstract: While code-mixing is a common linguistic practice in many parts of the world, collecting high-quality and low-cost code-mixed data remains a challenge for natural language processing (NLP) research. The recent proliferation of Large Language Models (LLMs) compels one to ask: how capable are these systems in generating code-mixed data? In this paper, we explore prompting multilingual LLMs in a zero…

    Submitted 12 September, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

    Comments: Updating Authors

  4. arXiv:2211.14673  [pdf, other]

    cs.AI

    Evaluation Beyond Task Performance: Analyzing Concepts in AlphaZero in Hex

    Authors: Charles Lovering, Jessica Zosa Forde, George Konidaris, Ellie Pavlick, Michael L. Littman

    Abstract: AlphaZero, an approach to reinforcement learning that couples neural networks and Monte Carlo tree search (MCTS), has produced state-of-the-art strategies for traditional board games like chess, Go, shogi, and Hex. While researchers and game commentators have suggested that AlphaZero uses concepts that humans consider important, it is unclear how these concepts are captured in the network. We inve…

    Submitted 26 November, 2022; originally announced November 2022.

    Comments: 10 pages, Neural Information Processing Systems 2022

  5. arXiv:2211.12424  [pdf, other]

    cs.DL cs.LG

    One Venue, Two Conferences: The Separation of Chinese and American Citation Networks

    Authors: Bingchen Zhao, Yuling Gu, Jessica Zosa Forde, Naomi Saphra

    Abstract: At NeurIPS, American and Chinese institutions cite papers from each other's regions substantially less than they cite endogamously. We build a citation graph to quantify this divide, compare it to European connectivity, and discuss the causes and consequences of the separation.

    Submitted 16 November, 2022; originally announced November 2022.

    Comments: Workshop on Cultures of AI and AI for Culture @ NeurIPS 2022

  6. arXiv:2211.05100  [pdf, other]

    cs.CL

    BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

    Authors: BigScience Workshop: Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé, Jonathan Tow, Alexander M. Rush, Stella Biderman, Albert Webson, Pawan Sasanka Ammanamanchi, Thomas Wang, Benoît Sagot, Niklas Muennighoff, Albert Villanova del Moral, Olatunji Ruwase, Rachel Bawden, Stas Bekman, Angelina McMillan-Major, et al. (369 additional authors not shown)

    Abstract: Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access…

    Submitted 27 June, 2023; v1 submitted 9 November, 2022; originally announced November 2022.

  7. arXiv:2211.00683  [pdf, other]

    cs.LG cs.AI

    Reduce, Reuse, Recycle: Improving Training Efficiency with Distillation

    Authors: Cody Blakeney, Jessica Zosa Forde, Jonathan Frankle, Ziliang Zong, Matthew L. Leavitt

    Abstract: Methods for improving the efficiency of deep network training (i.e. the resources required to achieve a given level of model quality) are of immediate benefit to deep learning practitioners. Distillation is typically used to compress models or improve model quality, but it's unclear if distillation actually improves training efficiency. Can the quality improvements of distillation be converted int…

    Submitted 1 November, 2022; originally announced November 2022.

  8. arXiv:2209.00099  [pdf, other]

    cs.CL

    Efficient Methods for Natural Language Processing: A Survey

    Authors: Marcos Treviso, Ji-Ung Lee, Tianchu Ji, Betty van Aken, Qingqing Cao, Manuel R. Ciosici, Michael Hassid, Kenneth Heafield, Sara Hooker, Colin Raffel, Pedro H. Martins, André F. T. Martins, Jessica Zosa Forde, Peter Milder, Edwin Simpson, Noam Slonim, Jesse Dodge, Emma Strubell, Niranjan Balasubramanian, Leon Derczynski, Iryna Gurevych, Roy Schwartz

    Abstract: Recent work in natural language processing (NLP) has yielded appealing results from scaling model parameters and training data; however, using only scale to improve performance means that resource consumption also grows. Such resources include data, time, storage, or energy, all of which are naturally limited and unevenly distributed. This motivates research into efficient methods that require few…

    Submitted 24 March, 2023; v1 submitted 31 August, 2022; originally announced September 2022.

    Comments: Accepted at TACL, pre-publication version

  9. arXiv:2204.08377  [pdf, ps, other]

    cs.AI cs.LG

    Strengthening Subcommunities: Towards Sustainable Growth in AI Research

    Authors: Andi Peng, Jessica Zosa Forde, Yonadav Shavit, Jonathan Frankle

    Abstract: AI's rapid growth has been felt acutely by scholarly venues, leading to growing pains within the peer review process. These challenges largely center on the inability of specific subareas to identify and evaluate work that is appropriate according to criteria relevant to each subcommunity as determined by stakeholders of that subarea. We set forth a proposal that re-focuses efforts within these su…

    Submitted 18 April, 2022; originally announced April 2022.

    Comments: ICLR 2022 ML Evaluation Standards Workshop

  10. A Tool for Organizing Key Characteristics of Virtual, Augmented, and Mixed Reality for Human-Robot Interaction Systems: Synthesizing VAM-HRI Trends and Takeaways

    Authors: Thomas R. Groechel, Michael E. Walker, Christine T. Chang, Eric Rosen, Jessica Zosa Forde

    Abstract: Frameworks have begun to emerge to categorize Virtual, Augmented, and Mixed Reality (VAM) technologies that provide immersive, intuitive interfaces to facilitate Human-Robot Interaction. These frameworks, however, fail to capture key characteristics of the growing subfield of VAM-HRI and can be difficult to consistently apply due to continuous scales. This work builds upon these prior frameworks t…

    Submitted 10 February, 2022; v1 submitted 7 August, 2021; originally announced August 2021.

    Comments: Accepted to Robotics and Automation Magazine Special Issue on Extended Reality in Robotics

  11. arXiv:2104.00606  [pdf, other]

    cs.LG cs.AI cs.CY

    Model Selection's Disparate Impact in Real-World Deep Learning Applications

    Authors: Jessica Zosa Forde, A. Feder Cooper, Kweku Kwegyir-Aggrey, Chris De Sa, Michael Littman

    Abstract: Algorithmic fairness has emphasized the role of biased data in automated decision outcomes. Recently, there has been a shift in attention to sources of bias that implicate fairness in other stages in the ML pipeline. We contend that one source of such bias, human preferences in model selection, remains under-explored in terms of its role in disparate impact across demographic groups. Using a deep…

    Submitted 7 September, 2021; v1 submitted 1 April, 2021; originally announced April 2021.

    Comments: Science and Engineering of Deep Learning Workshop, ICLR 2021

  12. arXiv:2102.03034  [pdf, other]

    cs.LG cs.LO

    Hyperparameter Optimization Is Deceiving Us, and How to Stop It

    Authors: A. Feder Cooper, Yucheng Lu, Jessica Zosa Forde, Christopher De Sa

    Abstract: Recent empirical work shows that inconsistent results based on choice of hyperparameter optimization (HPO) configuration are a widespread problem in ML research. When comparing two algorithms J and K, searching one subspace can yield the conclusion that J outperforms K, whereas searching another can entail the opposite. In short, the way we choose hyperparameters can deceive us. We provide a theore…

    Submitted 25 October, 2021; v1 submitted 5 February, 2021; originally announced February 2021.

    Comments: To appear, NeurIPS 2021

    Journal ref: Advances in Neural Information Processing Systems 34 pre-proceedings (NeurIPS 2021)

  13. arXiv:2007.04091  [pdf, other]

    cs.LG stat.ML

    Bespoke vs. Prêt-à-Porter Lottery Tickets: Exploiting Mask Similarity for Trainable Sub-Network Finding

    Authors: Michela Paganini, Jessica Zosa Forde

    Abstract: The observation of sparse trainable sub-networks within over-parametrized networks - also known as Lottery Tickets (LTs) - has prompted inquiries around their trainability, scaling, uniqueness, and generalization properties. Across 28 combinations of image classification tasks and architectures, we discover differences in the connectivity structure of LTs found through different iterative pruning…

    Submitted 6 July, 2020; originally announced July 2020.

    Comments: arXiv admin note: text overlap with arXiv:2001.05050

  14. arXiv:2006.07484  [pdf, other]

    cs.SE cs.LG

    dagger: A Python Framework for Reproducible Machine Learning Experiment Orchestration

    Authors: Michela Paganini, Jessica Zosa Forde

    Abstract: Many research directions in machine learning, particularly in deep learning, involve complex, multi-stage experiments, commonly involving state-mutating operations acting on models along multiple paths of execution. Although machine learning frameworks provide clean interfaces for defining model architectures and unbranched flows, burden is often placed on the researcher to track experimental prov…

    Submitted 12 June, 2020; originally announced June 2020.

    Comments: 4 pages, 3 code listings, 1 figure

  15. arXiv:1912.03606  [pdf, other]

    cs.CV cs.LG eess.IV stat.ML

    Individual predictions matter: Assessing the effect of data ordering in training fine-tuned CNNs for medical imaging

    Authors: John R. Zech, Jessica Zosa Forde, Michael L. Littman

    Abstract: We reproduced the results of CheXNet with fixed hyperparameters and 50 different random seeds to identify 14 findings in chest radiographs (x-rays). Because CheXNet fine-tunes a pre-trained DenseNet, the random seed affects the ordering of the batches of training data but not the initialized model weights. We found substantial variability in predictions for the same radiograph across model runs (me…

    Submitted 7 December, 2019; originally announced December 2019.

    Comments: J.Z. and J.F. contributed equally to this work

  16. arXiv:1904.10922  [pdf, ps, other]

    cs.LG stat.ML

    The Scientific Method in the Science of Machine Learning

    Authors: Jessica Zosa Forde, Michela Paganini

    Abstract: In the quest to align deep learning with the sciences to address calls for rigor, safety, and interpretability in machine learning systems, this contribution identifies key missing pieces: the stages of hypothesis formulation and testing, as well as statistical and systematic uncertainty estimation -- core tenets of the scientific method. This position paper discusses the ways in which contemporar…

    Submitted 24 April, 2019; originally announced April 2019.

    Comments: 4 pages + 1 appendix. Presented at the ICLR 2019 Debugging Machine Learning Models workshop