Project Description

Archive4J is an archive engine, written in Java,
for large document collections: a set of
algorithmic tools and implementations that make it
possible to build a direct index of a document
collection. In particular, for each document some
basic data can be recovered, such as the length of
the document in words, the list of distinct terms
appearing in the document, and the number of
occurrences of each term in the document (the
count). Goals include a very high compression rate
and very fast random access. To achieve these
goals, Archive4J combines techniques typical of
search engines with succinct data structures.
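
To make the idea of a direct index concrete, here is a minimal sketch of the per-document data described above: the length in words, the distinct terms, and the count of each term. The class DirectIndexSketch and all of its methods are hypothetical names chosen for illustration and are not part of the Archive4J API. The sketch also assumes plain whitespace tokenization and uncompressed int arrays, whereas Archive4J itself aims at compressed, succinct representations.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration of a direct index; not the Archive4J API. */
final class DirectIndexSketch {
    /** One record per document: distinct term ids in ascending order, with parallel counts. */
    private static final class DocRecord {
        final int[] termIds;
        final int[] counts;
        final int length; // document length in words

        DocRecord(int[] termIds, int[] counts, int length) {
            this.termIds = termIds;
            this.counts = counts;
            this.length = length;
        }
    }

    private final Map<String, Integer> termToId = new HashMap<>();
    private final List<String> idToTerm = new ArrayList<>();
    private final List<DocRecord> records = new ArrayList<>();

    /** Adds a document (assumption: lowercased, split on whitespace) and returns its number. */
    int add(String text) {
        TreeMap<Integer, Integer> freq = new TreeMap<>(); // keeps term ids sorted
        int length = 0;
        for (String token : text.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (token.isEmpty()) continue; // an empty document yields one empty token
            Integer id = termToId.get(token);
            if (id == null) {
                id = idToTerm.size();
                termToId.put(token, id);
                idToTerm.add(token);
            }
            freq.merge(id, 1, Integer::sum);
            length++;
        }
        int[] ids = new int[freq.size()];
        int[] counts = new int[freq.size()];
        int i = 0;
        for (Map.Entry<Integer, Integer> e : freq.entrySet()) {
            ids[i] = e.getKey();
            counts[i] = e.getValue();
            i++;
        }
        records.add(new DocRecord(ids, counts, length));
        return records.size() - 1;
    }

    /** Length of the document in words. */
    int length(int doc) {
        return records.get(doc).length;
    }

    /** Distinct terms of the document, ordered by term id. */
    List<String> terms(int doc) {
        List<String> result = new ArrayList<>();
        for (int id : records.get(doc).termIds) result.add(idToTerm.get(id));
        return result;
    }

    /** Number of occurrences of a term in the document (0 if absent). */
    int count(int doc, String term) {
        Integer id = termToId.get(term);
        if (id == null) return 0;
        DocRecord r = records.get(doc);
        int pos = Arrays.binarySearch(r.termIds, id); // termIds is sorted ascending
        return pos >= 0 ? r.counts[pos] : 0;
    }

    public static void main(String[] args) {
        DirectIndexSketch index = new DirectIndexSketch();
        int doc = index.add("to be or not to be");
        System.out.println("length = " + index.length(doc));
        System.out.println("terms  = " + index.terms(doc));
        System.out.println("count(\"to\") = " + index.count(doc, "to"));
    }
}
```

Because each document's record is reached directly by its number, lookups do not require scanning other documents, which is the random-access property the description mentions. A real archive would replace the plain arrays with compressed encodings.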
