Keeping it light: (re)analyzing community-wide datasets without major infrastructure

GigaScience. 2019 Feb 1;8(2):giy159. doi: 10.1093/gigascience/giy159.

Abstract

DNA sequencing technology has revolutionized the field of biology, shifting biology from a data-limited to a data-rich state. Central to the interpretation of sequencing data are the computational tools and approaches that convert raw data into biologically meaningful information. Both the tools and the generation of data are actively evolving, yet re-analyzing previously generated data with new tools is not commonplace. Re-analysis of existing data provides an affordable means of generating new information and will likely become more routine within biology, but it necessitates a new set of considerations for best practices and resource development. Here, we discuss several practices that we believe to be broadly applicable when re-analyzing data, especially when done by small research groups.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods*
  • High-Throughput Nucleotide Sequencing / methods*
  • Reproducibility of Results
  • Sequence Analysis, DNA / methods*