Showing 1–13 of 13 results for author: Stephenson, C

Searching in archive cs.
  1. arXiv:2310.16825  [pdf, other]

    cs.CV cs.CY

    CommonCanvas: An Open Diffusion Model Trained with Creative-Commons Images

    Authors: Aaron Gokaslan, A. Feder Cooper, Jasmine Collins, Landan Seguin, Austin Jacobson, Mihir Patel, Jonathan Frankle, Cory Stephenson, Volodymyr Kuleshov

    Abstract: We assemble a dataset of Creative-Commons-licensed (CC) images, which we use to train a set of open diffusion models that are qualitatively competitive with Stable Diffusion 2 (SD2). This task presents two challenges: (1) high-resolution CC images lack the captions necessary to train text-to-image generative models; (2) CC images are relatively scarce. In turn, to address these challenges, we use…

    Submitted 25 October, 2023; originally announced October 2023.

  2. arXiv:2206.00832  [pdf, other]

    cs.LG

    Fast Benchmarking of Accuracy vs. Training Time with Cyclic Learning Rates

    Authors: Jacob Portes, Davis Blalock, Cory Stephenson, Jonathan Frankle

    Abstract: Benchmarking the tradeoff between neural network accuracy and training time is computationally expensive. Here we show how a multiplicative cyclic learning rate schedule can be used to construct a tradeoff curve in a single training run. We generate cyclic tradeoff curves for combinations of training methods such as Blurpool, Channels Last, Label Smoothing and MixUp, and highlight how these cyclic…

    Submitted 10 November, 2022; v1 submitted 1 June, 2022; originally announced June 2022.

    Comments: 9 pages, 5 figures, "Has it Trained Yet?" Workshop at NeurIPS 2022

  3. arXiv:2108.12006  [pdf, other]

    cs.LG

    When and how epochwise double descent happens

    Authors: Cory Stephenson, Tyler Lee

    Abstract: Deep neural networks are known to exhibit a 'double descent' behavior as the number of parameters increases. Recently, it has also been shown that an 'epochwise double descent' effect exists in which the generalization error initially drops, then rises, and finally drops again with increasing training time. This presents a practical problem in that the amount of time required for training is long,…

    Submitted 26 August, 2021; originally announced August 2021.

    Comments: 15 Pages (main 11 pages, supplemental 4 pages), 5 figures

  4. arXiv:2105.14602  [pdf, other]

    cs.LG cond-mat.dis-nn stat.ML

    On the geometry of generalization and memorization in deep neural networks

    Authors: Cory Stephenson, Suchismita Padhy, Abhinav Ganesh, Yue Hui, Hanlin Tang, SueYeon Chung

    Abstract: Understanding how large neural networks avoid memorizing training data is key to explaining their high generalization performance. To examine the structure of when and where memorization occurs in a deep network, we use a recently developed replica-based mean field theoretic geometric analysis method. We find that all layers preferentially learn from examples which share features, and link this be…

    Submitted 30 May, 2021; originally announced May 2021.

    Comments: ICLR 2021

  5. arXiv:2008.07030  [pdf, other]

    eess.IV cs.CV cs.LG

    Training CNN Classifiers for Semantic Segmentation using Partially Annotated Images: with Application on Human Thigh and Calf MRI

    Authors: Chun Kit Wong, Stephanie Marchesseau, Maria Kalimeri, Tiang Siew Yap, Serena S. H. Teo, Lingaraj Krishna, Alfredo Franco-Obregón, Stacey K. H. Tay, Chin Meng Khoo, Philip T. H. Lee, Melvin K. S. Leow, John J. Totman, Mary C. Stephenson

    Abstract: Objective: Medical image datasets with pixel-level labels tend to have a limited number of organ or tissue label classes annotated, even when the images have wide anatomical coverage. With supervised learning, multiple classifiers are usually needed given these partially annotated datasets. In this work, we propose a set of strategies to train one single classifier in segmenting all label classes…

    Submitted 16 August, 2020; originally announced August 2020.

    Comments: Submitted to IEEE Transactions on Medical Imaging (Special Issue on Annotation-Efficient Deep Learning for Medical Imaging)

  6. arXiv:2006.01095  [pdf, other]

    cs.CL cs.NE

    Emergence of Separable Manifolds in Deep Language Representations

    Authors: Jonathan Mamou, Hang Le, Miguel Del Rio, Cory Stephenson, Hanlin Tang, Yoon Kim, SueYeon Chung

    Abstract: Deep neural networks (DNNs) have shown much empirical success in solving perceptual tasks across various cognitive modalities. While they are only loosely inspired by the biological brain, recent studies report considerable similarities between representations extracted from task-optimized DNNs and neural populations in the brain. DNNs have subsequently become a popular model class to infer comput…

    Submitted 8 July, 2020; v1 submitted 1 June, 2020; originally announced June 2020.

    Comments: 9 pages. 10 figures. Accepted to ICML 2020. Included supplemental materials

  7. arXiv:2003.01787  [pdf, other]

    cs.LG cond-mat.dis-nn cs.CL cs.SD eess.AS

    Untangling in Invariant Speech Recognition

    Authors: Cory Stephenson, Jenelle Feather, Suchismita Padhy, Oguz Elibol, Hanlin Tang, Josh McDermott, SueYeon Chung

    Abstract: Encouraged by the success of deep neural networks on a variety of visual tasks, much theoretical and experimental work has been aimed at understanding and interpreting how vision networks operate. Meanwhile, deep neural networks have also achieved impressive performance in audio processing applications, both as sub-components of larger systems and as complete end-to-end systems by themselves. Desp…

    Submitted 3 March, 2020; originally announced March 2020.

    Comments: Advances in Neural Information Processing Systems. 2019

  8. arXiv:1910.00067  [pdf, other]

    stat.ML cs.LG eess.AS

    Semi-supervised voice conversion with amortized variational inference

    Authors: Cory Stephenson, Gokce Keskin, Anil Thomas, Oguz H. Elibol

    Abstract: In this work we introduce a semi-supervised approach to the voice conversion problem, in which speech from a source speaker is converted into speech of a target speaker. The proposed method makes use of both parallel and non-parallel utterances from the source and target simultaneously during training. This approach can be used to extend existing parallel data voice conversion systems such that th…

    Submitted 30 September, 2019; originally announced October 2019.

    Comments: Accepted for publication at Interspeech 2019

    Journal ref: Proc. Interspeech 2019 (2019): 729-733

  9. arXiv:1905.03864  [pdf, other]

    eess.AS cs.LG cs.SD

    Adversarially Trained Autoencoders for Parallel-Data-Free Voice Conversion

    Authors: Orhan Ocal, Oguz H. Elibol, Gokce Keskin, Cory Stephenson, Anil Thomas, Kannan Ramchandran

    Abstract: We present a method for converting the voices between a set of speakers. Our method is based on training multiple autoencoder paths, where there is a single speaker-independent encoder and multiple speaker-dependent decoders. The autoencoders are trained with an addition of an adversarial loss which is provided by an auxiliary classifier in order to guide the output of the encoder to be speaker in…

    Submitted 9 May, 2019; originally announced May 2019.

  10. arXiv:1905.02525  [pdf, other]

    eess.AS cs.CL cs.SD

    Many-to-Many Voice Conversion with Out-of-Dataset Speaker Support

    Authors: Gokce Keskin, Tyler Lee, Cory Stephenson, Oguz H. Elibol

    Abstract: We present a Cycle-GAN based many-to-many voice conversion method that can convert between speakers that are not in the training set. This property is enabled through speaker embeddings generated by a neural network that is jointly trained with the Cycle-GAN. In contrast to prior work in this domain, our method enables conversion between an out-of-dataset speaker and a target speaker in either dir…

    Submitted 30 April, 2019; originally announced May 2019.

    Comments: Submitted to Interspeech 2019

  11. arXiv:1804.10669  [pdf, other]

    cs.SD cs.AI eess.AS

    Deep Speech Denoising with Vector Space Projections

    Authors: Jeff Hetherly, Paul Gamble, Maria Barrios, Cory Stephenson, Karl Ni

    Abstract: We propose an algorithm to denoise speakers from a single microphone in the presence of non-stationary and dynamic noise. Our approach is inspired by the recent success of neural network models separating speakers from other speakers and singers from instrumental accompaniment. Unlike prior art, we leverage embedding spaces produced with source-contrastive estimation, a technique derived from nega…

    Submitted 27 April, 2018; originally announced April 2018.

    Comments: arXiv admin note: text overlap with arXiv:1705.04662

  12. arXiv:1804.05053  [pdf, other]

    cs.SD eess.AS

    Voices Obscured in Complex Environmental Settings (VOICES) corpus

    Authors: Colleen Richey, Maria A. Barrios, Zeb Armstrong, Chris Bartels, Horacio Franco, Martin Graciarena, Aaron Lawson, Mahesh Kumar Nandwana, Allen Stauffer, Julien van Hout, Paul Gamble, Jeff Hetherly, Cory Stephenson, Karl Ni

    Abstract: This paper introduces the Voices Obscured In Complex Environmental Settings (VOICES) corpus, a freely available dataset under Creative Commons BY 4.0. This dataset will promote speech and signal processing research of speech recorded by far-field microphones in noisy room conditions. Publicly available speech corpora are mostly composed of isolated speech at close-range microphony. A typical appro…

    Submitted 15 May, 2018; v1 submitted 13 April, 2018; originally announced April 2018.

    Comments: Submitted to Interspeech 2018

  13. arXiv:1705.04662  [pdf, other]

    cs.SD cs.AI cs.LG stat.ML

    Monaural Audio Speaker Separation with Source Contrastive Estimation

    Authors: Cory Stephenson, Patrick Callier, Abhinav Ganesh, Karl Ni

    Abstract: We propose an algorithm to separate simultaneously speaking persons from each other, the "cocktail party problem", using a single microphone. Our approach involves a deep recurrent neural networks regression to a vector space that is descriptive of independent speakers. Such a vector space can embed empirically determined speaker characteristics and is optimized by distinguishing between speaker m…

    Submitted 12 May, 2017; originally announced May 2017.