Showing 1–5 of 5 results for author: Song, Y S

Searching in archive cs.
  1. arXiv:2407.11435  [pdf, other]

    q-bio.GN cs.LG stat.ML

    Genomic Language Models: Opportunities and Challenges

    Authors: Gonzalo Benegas, Chengzhong Ye, Carlos Albors, Jianan Canal Li, Yun S. Song

    Abstract: Large language models (LLMs) are having transformative impacts across a wide range of scientific fields, particularly in the biomedical sciences. Just as the goal of Natural Language Processing is to understand sequences of words, a major objective in biology is to understand biological sequences. Genomic Language Models (gLMs), which are LLMs trained on DNA sequences, have the potential to signif…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: Review article; 25 pages, 3 figures, 1 table

    MSC Class: 92-08; 92B20; 68T50; 68T07

  2. arXiv:2105.10590  [pdf, other]

    stat.ML cs.LG q-bio.BM q-bio.QM

    Parallelizing Contextual Bandits

    Authors: Jeffrey Chan, Aldo Pacchiano, Nilesh Tripuraneni, Yun S. Song, Peter Bartlett, Michael I. Jordan

    Abstract: Standard approaches to decision-making under uncertainty focus on sequential exploration of the space of decisions. However, simultaneously proposing a batch of decisions, which leverages available resources for parallel experimentation, has the potential to rapidly accelerate exploration. We present a family of (parallel) contextual bandit algorithms applicable to problems with bounded e…

    Submitted 5 February, 2023; v1 submitted 21 May, 2021; originally announced May 2021.

  3. arXiv:1906.08230  [pdf, other]

    cs.LG q-bio.BM stat.ML

    Evaluating Protein Transfer Learning with TAPE

    Authors: Roshan Rao, Nicholas Bhattacharya, Neil Thomas, Yan Duan, Xi Chen, John Canny, Pieter Abbeel, Yun S. Song

    Abstract: Protein modeling is an increasingly popular area of machine learning research. Semi-supervised learning has emerged as an important paradigm in protein modeling due to the high cost of acquiring supervised protein labels, but the current literature is fragmented when it comes to datasets and standardized evaluation techniques. To facilitate progress in this field, we introduce the Tasks Assessing…

    Submitted 19 June, 2019; originally announced June 2019.

    Comments: 20 pages, 4 figures

  4. arXiv:1802.06153  [pdf, other]

    cs.LG q-bio.PE stat.ML

    A Likelihood-Free Inference Framework for Population Genetic Data using Exchangeable Neural Networks

    Authors: Jeffrey Chan, Valerio Perrone, Jeffrey P. Spence, Paul A. Jenkins, Sara Mathieson, Yun S. Song

    Abstract: An explosion of high-throughput DNA sequencing in the past decade has led to a surge of interest in population-scale inference with whole-genome data. Recent work in population genetics has centered on designing inference methods for relatively simple model classes, and few scalable general-purpose inference techniques exist for more realistic, complex models. To achieve this, two inferential chal…

    Submitted 5 November, 2018; v1 submitted 16 February, 2018; originally announced February 2018.

    Comments: 9 pages, 8 figures

  5. arXiv:1612.03839  [pdf, other]

    cs.LG stat.ML

    Tensor Decompositions via Two-Mode Higher-Order SVD (HOSVD)

    Authors: Miaoyan Wang, Yun S. Song

    Abstract: Tensor decompositions have rich applications in statistics and machine learning, and developing efficient, accurate algorithms for the problem has received much attention recently. Here, we present a new method built on Kruskal's uniqueness theorem to decompose symmetric, nearly orthogonally decomposable tensors. Unlike the classical higher-order singular value decomposition which unfolds a tensor…

    Submitted 18 April, 2017; v1 submitted 12 December, 2016; originally announced December 2016.

    Comments: 33 pages, 5 figures

    Journal ref: Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS), PMLR, Vol. 54 (2017) 614-622