Showing 1–7 of 7 results for author: Knight, R

Searching in archive q-bio.
  1. arXiv:2107.05397  [pdf]

    q-bio.GN cs.PF q-bio.QM

    Enabling microbiome research on personal devices

    Authors: Igor Sfiligoi, Daniel McDonald, Rob Knight

    Abstract: Microbiome studies have recently transitioned from experimental designs with a few hundred samples to designs spanning tens of thousands of samples. Modern studies such as the Earth Microbiome Project (EMP) afford the statistics crucial for untangling the many factors that influence microbial community composition. Analyzing those data used to require access to a compute cluster, making it both ex…

    Submitted 8 July, 2021; originally announced July 2021.

    Comments: 2 pages, 4 figures, to be published in proceedings of eScience 2021

    Journal ref: 2021 IEEE 17th International Conference on eScience (eScience), 2021, pp. 229-230

  2. arXiv:2104.14005  [pdf]

    q-bio.GN q-bio.PE

    Unlocking capacities of viral genomics for the COVID-19 pandemic response

    Authors: Sergey Knyazev, Karishma Chhugani, Varuni Sarwal, Ram Ayyala, Harman Singh, Smruthi Karthikeyan, Dhrithi Deshpande, Zoia Comarova, Angela Lu, Yuri Porozov, Aiping Wu, Malak Abedalthagafi, Shivashankar Nagaraj, Adam Smith, Pavel Skums, Jason Ladner, Tommy Tsan-Yuk Lam, Nicholas Wu, Alex Zelikovsky, Rob Knight, Keith Crandall, Serghei Mangul

    Abstract: More than any other infectious disease epidemic, the COVID-19 pandemic has been characterized by the generation of large volumes of viral genomic data at an incredible pace due to recent advances in high-throughput sequencing technologies, the rapid global spread of SARS-CoV-2, and its persistent threat to public health. However, distinguishing the most epidemiologically relevant information encod…

    Submitted 4 June, 2021; v1 submitted 28 April, 2021; originally announced April 2021.

  3. Accelerating key bioinformatics tasks 100-fold by improving memory access

    Authors: Igor Sfiligoi, Daniel McDonald, Rob Knight

    Abstract: Most experimental sciences now rely on computing, and biological sciences are no exception. As datasets get bigger, so do the computing costs, making proper optimization of the codes used by scientists increasingly important. Many of the codes developed in recent years are based on the Python-based NumPy, due to its ease of use and good performance characteristics. The composable nature of NumPy,…

    Submitted 19 April, 2021; originally announced April 2021.

    Comments: 6 pages, 3 tables, 7 algorithms, To be published in Proceedings of PEARC21

  4. arXiv:2012.00001  [pdf, other]

    q-bio.QM cs.LG

    Utilizing stability criteria in choosing feature selection methods yields reproducible results in microbiome data

    Authors: Lingjing Jiang, Niina Haiminen, Anna-Paola Carrieri, Shi Huang, Yoshiki Vazquez-Baeza, Laxmi Parida, Ho-Cheol Kim, Austin D. Swafford, Rob Knight, Loki Natarajan

    Abstract: Feature selection is indispensable in microbiome data analysis, but it can be particularly challenging as microbiome data sets are high-dimensional, underdetermined, sparse and compositional. Great efforts have recently been made on developing new methods for feature selection that handle the above data characteristics, but almost all methods were evaluated based on performance of model prediction…

    Submitted 30 November, 2020; originally announced December 2020.

    Report number: https://doi.org/10.1111/biom.13481

  5. Porting and optimizing UniFrac for GPUs

    Authors: Igor Sfiligoi, Daniel McDonald, Rob Knight

    Abstract: UniFrac is a commonly used metric in microbiome research for comparing microbiome profiles to one another ("beta diversity"). The recently implemented Striped UniFrac added the capability to split the problem into many independent subproblems and exhibits near linear scaling. In this paper we describe steps undertaken in porting and optimizing Striped UniFrac to GPUs. We reduced the run time of co…

    Submitted 12 May, 2020; originally announced May 2020.

    Comments: 4 pages, 3 figures, 4 tables

  6. Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species

    Authors: Keith R. Bradnam, Joseph N. Fass, Anton Alexandrov, Paul Baranay, Michael Bechner, İnanç Birol, Sébastien Boisvert, Jarrod A. Chapman, Guillaume Chapuis, Rayan Chikhi, Hamidreza Chitsaz, Wen-Chi Chou, Jacques Corbeil, Cristian Del Fabbro, T. Roderick Docking, Richard Durbin, Dent Earl, Scott Emrich, Pavel Fedotov, Nuno A. Fonseca, Ganeshkumar Ganapathy, Richard A. Gibbs, Sante Gnerre, Élénie Godzaridis, Steve Goldstein, et al. (66 additional authors not shown)

    Abstract: Background - The process of generating raw genome sequence data continues to become cheaper, faster, and more accurate. However, assembly of such data into high-quality, finished genome sequences remains challenging. Many genome assembly tools are available, but they differ greatly in terms of their performance (speed, scalability, hardware requirements, acceptance of newer read technologies) and…

    Submitted 27 June, 2013; v1 submitted 23 January, 2013; originally announced January 2013.

    Comments: Additional files available at http://korflab.ucdavis.edu/Datasets/Assemblathon/Assemblathon2/Additional_files/ Major changes: 1. Accessions for the 3 read data sets have now been included. 2. New file: spreadsheet containing details of all Study, Sample, Run, & Experiment identifiers. 3. Made miscellaneous changes to address reviewers' comments. DOIs added to GigaDB datasets.

    Journal ref: GigaScience 2:10 (2013)

  7. arXiv:0704.3221  [pdf, other]

    math.PR math.CO math.ST q-bio.GN q-bio.QM

    Multiple pattern matching: A Markov chain approach

    Authors: Manuel Lladser, M. D. Betterton, Rob Knight

    Abstract: RNA motifs typically consist of short, modular patterns that include base pairs formed within and between modules. Estimating the abundance of these patterns is of fundamental importance for assessing the statistical significance of matches in genomewide searches, and for predicting whether a given function has evolved many times in different species or arose from a single common ancestor. In th…

    Submitted 24 April, 2007; originally announced April 2007.

    Comments: Final version to appear in the Journal of Mathematical Biology

    MSC Class: 46N60; 05A15; 05A16