Integration of large and diverse biological data sets is a daunting problem facing systems biology researchers. Exploring the complex issues of data validation, integration, and representation, we present a systematic approach for the management and analysis of large biological data sets based on data warehouses. Our system has been implemented in the Bioverse, a framework combining diverse protein information from a variety of knowledge areas such as molecular interactions, pathway localization, protein structure, and protein function.