Amberfish is a general-purpose text/XML retrieval utility. It features indexing of both free text and nested fields, built-in support for XML documents, structured queries allowing generalized field/tag paths, hierarchical result sets, automatic searching across multiple databases, efficient indexing, and relatively low memory requirements.
Solr is an enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g. Word and PDF) handling. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world's largest internet sites. Solr is written in Java and runs as a standalone full-text search server within a servlet container such as Tomcat. Solr uses the Lucene Java search library at its core for full-text indexing and search, and has REST-like HTTP/XML and JSON APIs that make it easy to use from virtually any programming language. Solr's powerful external configuration allows it to be tailored to almost any type of application without Java coding, and it has an extensive plugin architecture when more advanced customization is required.
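As a rough illustration of the HTTP/JSON interface described above, the following Python sketch builds a query URL and reads documents out of a JSON response. The base URL, the core name `articles`, and the field name `title` are assumptions made for the example; the actual path depends on how Solr is deployed in its servlet container. The helper names `build_select_url` and `extract_docs` are hypothetical and not part of Solr. The parameters used (`q`, `rows`, `wt`, `hl`, `hl.fl`, `facet`, `facet.field`) are standard Solr query parameters.

```python
import json
from urllib.parse import urlencode


def build_select_url(base_url, core, query, rows=10,
                     highlight_field=None, facet_field=None):
    """Build a URL for Solr's /select handler (hypothetical helper).

    base_url and core are deployment-specific assumptions, e.g.
    "http://localhost:8983/solr" and "articles".
    """
    params = [("q", query), ("rows", rows), ("wt", "json")]
    if highlight_field:
        # Ask Solr to return highlighted snippets for this field.
        params += [("hl", "true"), ("hl.fl", highlight_field)]
    if facet_field:
        # Ask Solr to return facet counts for this field.
        params += [("facet", "true"), ("facet.field", facet_field)]
    return f"{base_url.rstrip('/')}/{core}/select?{urlencode(params)}"


def extract_docs(response_text):
    """Return the list of matching documents from a Solr JSON response."""
    payload = json.loads(response_text)
    return payload["response"]["docs"]


if __name__ == "__main__":
    url = build_select_url("http://localhost:8983/solr", "articles",
                           "title:lucene", rows=5, highlight_field="title")
    print(url)
    # Fetching the URL (for example with urllib.request.urlopen) needs a
    # running Solr instance, so it is left out of this sketch.
```

Because the interface is plain HTTP, the same request can be issued from any language with an HTTP client; only the URL construction and JSON parsing change.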
Associations Indexing Service (AIS) was originally designed as an extension of human memory: it tags resources, URIs, bookmarks, and memos by storing them under personal keywords and associations, so that the information can be retrieved quickly later using the same keywords or queries, much as with popular search engines. It can also serve as a local search engine and as an automatic indexer of large file hierarchies (e.g. personal archives or file repositories). It is based on Lucene, so searching should remain fast even as the index grows large.
AudioScout is a distributed audio content indexing system. It can index a large collection of audio content for the purpose of later recognition of unknown signals. Because it is robust to noise, different encodings, and other types of distortion, it can be used for a variety of applications, including detecting duplicate files, identifying music, and more sophisticated uses such as enforcing copyrights and ensuring lawful use of content.
Basenji is an indexing and search tool designed for easy and fast indexing of media collections. Once indexed, removable media such as CDs and USB sticks can be browsed and searched for specific files very quickly, without the media actually being connected to the computer. Besides file hierarchies and audio track listings, Basenji also presents extracted metadata (image dimensions, MP3 tags, etc.) and content previews of indexed media in a clean and straightforward user interface.
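The general idea of cataloguing a volume once and searching it later while it is disconnected can be sketched in a few lines of Python. This is not Basenji's implementation; it is a minimal illustration that assumes a SQLite table named `files` and two hypothetical helpers, `index_volume` and `search`, and it records only relative paths and file sizes rather than richer metadata or previews.

```python
import os
import sqlite3


def index_volume(conn, label, root):
    """Record every file under root in the catalog, tagged with a volume label."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files (volume TEXT, path TEXT, size INTEGER)"
    )
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(full)
            except OSError:
                # Unreadable entries (e.g. broken symlinks) are skipped.
                continue
            # Store the path relative to the volume root so the catalog
            # does not depend on where the volume happened to be mounted.
            conn.execute(
                "INSERT INTO files VALUES (?, ?, ?)",
                (label, os.path.relpath(full, root), size),
            )
    conn.commit()


def search(conn, term):
    """Find catalogued files whose path contains term; the volume need not be mounted."""
    cur = conn.execute(
        "SELECT volume, path, size FROM files WHERE path LIKE ? "
        "ORDER BY volume, path",
        (f"%{term}%",),
    )
    return cur.fetchall()
```

A caller would open a catalog with `sqlite3.connect("catalog.db")`, run `index_volume` once per disc or stick while it is mounted, and call `search` at any later time. Since only the catalog is consulted, lookups work after the media has been removed.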