Showing 1–2 of 2 results for author: Gillman, N

  1. arXiv:2402.07087

    cs.LG cs.AI cs.CV stat.ML

    Self-Correcting Self-Consuming Loops for Generative Model Training

    Authors: Nate Gillman, Michael Freeman, Daksh Aggarwal, Chia-Hong Hsu, Calvin Luo, Yonglong Tian, Chen Sun

    Abstract: As synthetic data becomes higher quality and proliferates on the internet, machine learning models are increasingly trained on a mix of human- and machine-generated data. Despite the successful stories of using synthetic data for representation learning, using synthetic data for generative model training creates "self-consuming loops" which may lead to training instability or even collapse, unless…

    Submitted 10 June, 2024; v1 submitted 10 February, 2024; originally announced February 2024.

    Comments: Camera ready version (ICML 2024). Code at https://nategillman.com/sc-sc.html

  2. IsoScore: Measuring the Uniformity of Embedding Space Utilization

    Authors: William Rudman, Nate Gillman, Taylor Rayne, Carsten Eickhoff

    Abstract: The recent success of distributed word representations has led to an increased interest in analyzing the properties of their spatial distribution. Several studies have suggested that contextualized word embedding models do not isotropically project tokens into vector space. However, current methods designed to measure isotropy, such as average random cosine similarity and the partition score, have…

    Submitted 18 April, 2022; v1 submitted 16 August, 2021; originally announced August 2021.

    Comments: ACL 2022 camera ready version