Katta is a scalable, failure tolerant, distributed, data storage for real time access.
Katta serves large, replicated, indices as shards to serve high loads and very large data sets. These indices can be of different type. Currently implementations are available for Lucene and Hadoop mapfiles.
* Makes serving large or high load indices easy
* Serves very large Lucene or Hadoop Mapfile indices as index shards on many servers
* Replicate shards on different servers for performance and fault-tolerance
* Supports pluggable network topologies
* Master fail-over
* Fast, lightweight, easy to integrate
* Plays well with Hadoop clusters
* Apache Version 2 License
N. Ferro, and D. Harman. Multilingual Information Access Evaluation I. Text Retrieval Experiments, volume 6241 of Lecture Notes in Computer Science, Springer, Berlin / Heidelberg, (2010)
D. Hiemstra, and C. Hauff. Multilingual and Multimodal Information Access Evaluation, volume 6360 of Lecture Notes in Computer Science, page 64--69. Berlin, Springer Verlag, (2010)