120 projects tagged "Indexing"

Emdros (updated 13 May 2014; no download)

Pop 362.32
Vit 137.96

Emdros is a corpus query system for storing and searching linguistically annotated text. It is very generic, supporting almost any kind of annotation from almost any linguistic theory. All linguistic levels of analysis are supported, including phonology, morphology, the lexical level, syntax, and discourse. The core libraries act as a middleware layer between a client and an underlying SQL database. MySQL, PostgreSQL, and SQLite (2 and 3) are supported.

Recoll (updated 06 May 2014)

Pop 379.03
Vit 78.18

Recoll is a personal full-text desktop search tool based on Xapian. It is feature-rich, easy to use and administer, and comes with a Qt-based GUI. Text, HTML, PDF, PostScript, MS Word, OpenOffice, WordPerfect, KWord, AbiWord, maildir, and mailbox mail folder formats are supported, along with their compressed versions and quite a few others. Powerful query facilities are provided. Multiple character sets are supported, and internal processing and storage use Unicode UTF-8. Stemming is performed at query time, and the stemming language can be switched after indexing.

OpenGrok (updated 10 Apr 2014)

Pop 449.82
Vit 27.19

OpenGrok is a fast and usable source code search and cross-reference engine. It helps you search, cross-reference, and navigate your source tree. It understands various program file formats and the version control histories of systems such as Mercurial, Bazaar, Git, ClearCase, Perforce, SCCS, RCS, CVS, and Subversion. In other words, it lets you grok (profoundly understand) the source.

OpenSearchServer (updated 06 Apr 2014)

Pop 558.82
Vit 40.51

OpenSearchServer is a powerful, enterprise-class search engine. Using its Web user interface, crawlers (Web, file, database, etc.), and REST API, you can integrate advanced full-text search capabilities into your application.

Terrier (updated 04 Apr 2014)

Pop 230.41
Vit 23.57

Terrier is software for the rapid development of Web, intranet, and desktop search engines. More generally, it is a modular platform for building large-scale information retrieval applications, providing indexing and probabilistic retrieval functionalities. It comes with a desktop search application.

DocFetcher (updated 03 Mar 2014; no download)

Pop 346.54
Vit 26.41

DocFetcher is a desktop search application: It allows you to search the contents of documents on your computer. You can think of it as Google for your local files.

HTMLDOC (updated 06 Jan 2014)

Pop 724.34
Vit 33.93

HTMLDOC converts HTML files and Web pages into indexed HTML, PostScript, and PDF files suitable for online viewing and printing. It can be used as a standalone GUI application, in a batch document processing environment, as a Web-based report generation application, or in embedded environments to support printing of HTML content. It runs on all Unix platforms as well as Mac OS X and Windows 2000 and higher.

QuickFind (updated 05 Jan 2014)

Pop 35.83
Vit 2.34

QuickFind is a cross-platform Java application for finding files on your computer. Its sole purpose is to save the user time by locating the desired file almost instantly. It is designed to support all of the major computer platforms. The user can schedule caching or manually cache selected directories at any time. Once caching is done, all you have to do is enter the name of the file you want to find.

GNU libextractor (updated 23 Dec 2013)

Pop 493.76
Vit 44.57

libextractor is a library used to extract metadata from files of arbitrary type. It is designed to use helper libraries to perform the actual extraction, and to be trivially extensible by linking against external extractors for additional file types. The goal is to provide developers of file-sharing networks, file managers, and WWW-indexing bots with a universal library to obtain metadata about files. It includes a shell command and bindings for Java (JNI) and Python.

Yioop! (updated 02 Dec 2013; no download)

Pop 160.72
Vit 10.86

Yioop! is a PHP search engine. It can be configured either as a general-purpose search engine for the whole Web or to provide search results for a set of URLs or domains. Yioop! can crawl pages or directly index archives such as ARC and WARC. It supports indexing several file formats, such as HTML, Atom, PDF, DOC, PPT, RTF, RSS, XML, SVG, PNG, JPG, BMP, GIF, and sitemaps. The Yioop! crawler can be deployed on one or many machines. It supports one or more crawl scheduler processes, as well as multiple fetchers and mirrors. Crawling respects robots.txt, including Crawl-delay. Yioop! crawls are stored in a Web archive format that is easy to move around, so crawling can be done on one machine and the results deployed elsewhere. Yioop! supports mixing of crawls. Yioop! comes with a search front end that can be localized as desired using a GUI. This GUI supports RTL languages and can also be used to manage crawls. Yioop! can be configured in a straightforward manner to make use of file caching or memcache, if available.
