Storing sparse and repeated data in multivariate Markovian models of tuberculosis spread

R A Bielefeld; S M Debanne; D Y Rowland

doi:10.1006/cbmr.1996.0009

Comput Biomed Res. 1996 Apr;29(2):85-92. doi: 10.1006/cbmr.1996.0009.

Authors

R A Bielefeld¹, S M Debanne, D Y Rowland

Affiliation

¹ Department of Epidemiology and Biostatistics, School of Medicine, Case Western Reserve University, Cleveland, Ohio 44106, USA.

PMID: 8785913
DOI: 10.1006/cbmr.1996.0009

Abstract

Through the use of appropriate sparse storage techniques, we were able to reduce memory usage in a multivariate Markovian model for the spread of tuberculosis in the United States through the year 2010. A straightforward software implementation of the model would have required approximately 2.5 × 10^9 bytes of storage for the population of each year being modeled and approximately 1.3 × 10^14 bytes of storage for each year-to-year set of transition probabilities. We were able to reduce memory usage in the model by 96% for cross-sectional population data and over 99.9% for transition probability data. Data structure initialization time for population data was increased by a factor of 16.48 and lookup time for population data was increased by a factor of 11.3 over times required for an array implementation. For transition data the initialization and lookup times were increased by negligible factors. This work was done under contract from the Centers for Disease Control and the Association of Teachers of Preventive Medicine.
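Taken at face value, the reported reductions imply roughly 100 megabytes per modeled year for population data (4% of the dense figure) and under 130 gigabytes per year-to-year transition set (less than 0.1% of the dense figure). The abstract does not describe the data structures used, so the following Python sketch is only a generic illustration of the two ideas named in the title: storing nonzero cells only (sparse data) and storing identical transition rows once (repeated data). The class names, the tuple-valued state index, and the dictionary-based layout are all assumptions made for this example, not the authors' implementation.

```python
from typing import Dict, Tuple

# Hypothetical state index, e.g. (age_group, sex, infection_status).
State = Tuple[int, ...]
# Canonical, hashable form of a sparse transition row.
RowKey = Tuple[Tuple[State, float], ...]


class SparsePopulation:
    """Keeps only the nonzero cells of a multidimensional population table."""

    def __init__(self) -> None:
        self._cells: Dict[State, float] = {}

    def set(self, state: State, count: float) -> None:
        if count == 0.0:
            # Zero cells are implicit, so drop any stored entry.
            self._cells.pop(state, None)
        else:
            self._cells[state] = count

    def get(self, state: State) -> float:
        # A missing key means the cell is empty.
        return self._cells.get(state, 0.0)

    def stored_cells(self) -> int:
        return len(self._cells)


class SharedTransitionRows:
    """Stores each distinct sparse transition row once and shares it."""

    def __init__(self) -> None:
        self._pool: Dict[RowKey, Dict[State, float]] = {}
        self._row_of: Dict[State, Dict[State, float]] = {}

    def set_row(self, source: State, row: Dict[State, float]) -> None:
        # Drop zero probabilities and sort so equal rows get equal keys.
        key: RowKey = tuple(sorted((d, p) for d, p in row.items() if p != 0.0))
        # Reuse the pooled row if an identical one was already stored.
        shared = self._pool.setdefault(key, dict(key))
        self._row_of[source] = shared

    def probability(self, source: State, dest: State) -> float:
        # Unknown sources and unlisted destinations have probability zero.
        return self._row_of.get(source, {}).get(dest, 0.0)

    def distinct_rows(self) -> int:
        return len(self._pool)


if __name__ == "__main__":
    pop = SparsePopulation()
    pop.set((3, 1, 0), 1200.0)
    print(pop.get((3, 1, 0)), pop.get((9, 0, 2)), pop.stored_cells())

    trans = SharedTransitionRows()
    row = {(0, 0, 0): 0.9, (0, 0, 1): 0.1}
    trans.set_row((0, 0, 0), row)
    trans.set_row((1, 0, 0), dict(row))  # identical row, stored only once
    print(trans.probability((1, 0, 0), (0, 0, 1)), trans.distinct_rows())
```

A hash-based layout like this trades constant-time array indexing for hashing on every access, which is consistent in kind with the slower initialization and lookup the abstract reports for population data, although the factors measured there belong to the authors' own implementation.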

Publication types

Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, P.H.S.

MeSH terms

Centers for Disease Control and Prevention, U.S.
Cross-Sectional Studies
Humans
Information Storage and Retrieval*
Markov Chains*
Models, Statistical
Multivariate Analysis
Population Surveillance
Preventive Medicine
Probability
Software
Tuberculosis, Pulmonary / epidemiology*
United States / epidemiology