Showing 1–8 of 8 results for author: Kenyon-Dean, K

Searching in archive cs.
  1. arXiv:2404.10242

    cs.CV cs.AI cs.LG

    Masked Autoencoders for Microscopy are Scalable Learners of Cellular Biology

    Authors: Oren Kraus, Kian Kenyon-Dean, Saber Saberian, Maryam Fallah, Peter McLean, Jess Leung, Vasudev Sharma, Ayla Khan, Jia Balakrishnan, Safiye Celik, Dominique Beaini, Maciej Sypetkowski, Chi Vicky Cheng, Kristen Morse, Maureen Makes, Ben Mabey, Berton Earnshaw

    Abstract: Featurizing microscopy images for use in biological research remains a significant challenge, especially for large-scale experiments spanning millions of images. This work explores the scaling properties of weakly supervised classifiers and self-supervised masked autoencoders (MAEs) when training with increasingly larger model backbones and microscopy datasets. Our results show that ViT-based MAEs…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: CVPR 2024 Highlight. arXiv admin note: text overlap with arXiv:2309.16064

  2. arXiv:2309.16064

    cs.CV cs.AI cs.LG

    Masked Autoencoders are Scalable Learners of Cellular Morphology

    Authors: Oren Kraus, Kian Kenyon-Dean, Saber Saberian, Maryam Fallah, Peter McLean, Jess Leung, Vasudev Sharma, Ayla Khan, Jia Balakrishnan, Safiye Celik, Maciej Sypetkowski, Chi Vicky Cheng, Kristen Morse, Maureen Makes, Ben Mabey, Berton Earnshaw

    Abstract: Inferring biological relationships from cellular phenotypes in high-content microscopy screens provides significant opportunity and challenge in biological research. Prior results have shown that deep vision models can capture biological signal better than hand-crafted features. This work explores how self-supervised deep learning approaches scale when training larger models on larger microscopy d…

    Submitted 27 November, 2023; v1 submitted 27 September, 2023; originally announced September 2023.

    Comments: Spotlight at NeurIPS 2023 Generative AI and Biology (GenBio) Workshop

  3. arXiv:2011.07013

    cs.CL cs.AI

    Deconstructing word embedding algorithms

    Authors: Kian Kenyon-Dean, Edward Newell, Jackie Chi Kit Cheung

    Abstract: Word embeddings are reliable feature representations of words used to obtain high quality results for various NLP applications. Uncontextualized word embeddings are used in many NLP tasks today, especially in resource-limited settings where high memory capacity and GPUs are not available. Given the historical success of word embeddings in NLP, we propose a retrospective on some of the most well-kn…

    Submitted 12 November, 2020; originally announced November 2020.

    Comments: EMNLP 2020, 6 pages. arXiv admin note: substantial text overlap with arXiv:1911.13280

    MSC Class: 68T50

  4. arXiv:2011.02944

    cs.CL

    Learning Efficient Task-Specific Meta-Embeddings with Word Prisms

    Authors: Jingyi He, KC Tsiolis, Kian Kenyon-Dean, Jackie Chi Kit Cheung

    Abstract: Word embeddings are trained to predict word cooccurrence statistics, which leads them to possess different lexical properties (syntactic, semantic, etc.) depending on the notion of context defined at training time. These properties manifest when querying the embedding space for the most similar vectors, and when used at the input layer of deep neural networks trained to solve downstream NLP proble…

    Submitted 5 November, 2020; originally announced November 2020.

  5. arXiv:1911.13280

    cs.CL cs.LG

    Deconstructing and reconstructing word embedding algorithms

    Authors: Edward Newell, Kian Kenyon-Dean, Jackie Chi Kit Cheung

    Abstract: Uncontextualized word embeddings are reliable feature representations of words used to obtain high quality results for various NLP applications. Given the historical success of word embeddings in NLP, we propose a retrospective on some of the most well-known word embedding algorithms. In this work, we deconstruct Word2vec, GloVe, and others, into a common form, unveiling some of the necessary and…

    Submitted 29 November, 2019; originally announced November 2019.

    Comments: 15 pages

  6. arXiv:1911.02639

    cs.CL cs.LG

    Word Embedding Algorithms as Generalized Low Rank Models and their Canonical Form

    Authors: Kian Kenyon-Dean

    Abstract: Word embedding algorithms produce very reliable feature representations of words that are used by neural network models across a constantly growing multitude of NLP tasks. As such, it is imperative for NLP practitioners to understand how their word representations are produced, and why they are so impactful. The present work presents the Simple Embedder framework, generalizing the state-of-the-a…

    Submitted 6 November, 2019; originally announced November 2019.

    Comments: 82 pages; McGill University master's thesis, 2019

    MSC Class: 68Txx

  7. arXiv:1812.07627

    cs.LG cs.AI stat.ML

    Clustering-Oriented Representation Learning with Attractive-Repulsive Loss

    Authors: Kian Kenyon-Dean, Andre Cianflone, Lucas Page-Caccia, Guillaume Rabusseau, Jackie Chi Kit Cheung, Doina Precup

    Abstract: The standard loss function used to train neural network classifiers, categorical cross-entropy (CCE), seeks to maximize accuracy on the training data; building useful representations is not a necessary byproduct of this objective. In this work, we propose clustering-oriented representation learning (COREL) as an alternative to CCE in the context of a generalized attractive-repulsive loss framework…

    Submitted 18 December, 2018; originally announced December 2018.

    Comments: AAAI 2019 Workshop on Network Interpretability for Deep Learning (9 pages)

    MSC Class: 62H30

  8. arXiv:1805.10985

    cs.CL

    Resolving Event Coreference with Supervised Representation Learning and Clustering-Oriented Regularization

    Authors: Kian Kenyon-Dean, Jackie Chi Kit Cheung, Doina Precup

    Abstract: We present an approach to event coreference resolution by developing a general framework for clustering that uses supervised representation learning. We propose a neural network architecture with novel Clustering-Oriented Regularization (CORE) terms in the objective function. These terms encourage the model to create embeddings of event mentions that are amenable to clustering. We then use agglome…

    Submitted 28 May, 2018; originally announced May 2018.

    Comments: 10 pages, 2 figures; to be published in the Proceedings of the Seventh Joint Conference on Lexical and Computational Semantics (*SEM 2018), June 2018, New Orleans, LA