Xidel


Xidel is a command-line tool for downloading web pages and extracting data from them. It can download files over HTTP and HTTPS connections, follow redirects, links, or extracted values, and process local files. Data can be extracted using XPath 2.0, XQuery 1.0, and JSONiq expressions, CSS 3 selectors, and custom pattern-matching templates that resemble an annotated version of the processed page. The extracted values can then be exported as plain text, XML, HTML, or JSON, assigned to variables for use in other extract expressions, or exported to the shell. There is also an online CGI service for testing.
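As a rough illustration of the workflow described above, the following sketch runs Xidel against a small local HTML file. The sample file and its contents are made up for this example, and the `-e` (extract) option and the `css()` function are assumed from Xidel's own documentation rather than from the text here, so check `xidel --help` for the exact options of the installed version.

```shell
#!/bin/sh
# Hypothetical sample page: the directory, file name, and contents are
# invented for illustration only.
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
sample="$dir/sample.html"

cat > "$sample" <<'EOF'
<html><head><title>Sample page</title></head>
<body><a href="a.html">First</a> <a href="b.html">Second</a></body></html>
EOF

if command -v xidel >/dev/null 2>&1; then
  # Extract the page title with an XPath expression
  # (assumes the -e/--extract option).
  xidel "$sample" -e '//title'

  # Extract the link targets with plain XPath.
  xidel "$sample" -e '//a/@href'

  # Select the same links with a CSS 3 selector
  # (assumes Xidel's css() extension function).
  xidel "$sample" -e 'css("a")'
else
  echo "xidel not found; skipping the extraction examples" >&2
fi
```

The same expressions should work against a URL in place of the local file, since Xidel treats both as input sources.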

Recent releases

  •  24 Mar 2014 22:31

    Release Notes: This release improved JSONiq support (updated to JSONiq 1.0.3) and the JSON extensions (more compatible with XQuery, assignable), switched to arbitrary-precision arithmetic for numeric operations where necessary, and added a trivial subset of XPath/XQuery 3 (!, ||, and switch), new functions for resolving URIs or HTML hrefs, some new multipage commands similar to XSLT, and more.

  •  26 Mar 2013 01:36

    Release Notes: This release added JSONiq support with functions/objects/arrays/literals, improved the command syntax by allowing grouped command line options to apply them only to certain pages/links, changed the input/output formats, added support for exporting variables to the shell, fixed several HTML parsing/serialization bugs, changed the syntax of extended strings (from "..$var;.." to x"..{$var}.."), added more HTTP options (different methods, ports, authorization), and made various other minor changes.

  •  06 Nov 2012 18:58

    Release Notes: The XPath interpreter has been extended into a complete XQuery 1 interpreter; in the process, several bugs and design mistakes were found and fixed. Two additional functions were added: a "form" function that encodes HTML forms and makes it easy to follow POST requests (e.g. -f form(//form[1], "username=...&password=...")), and a "match" function to run the pattern-matching templates from within XPath 2 expressions. The Windows command-line interface was improved (e.g. support for single quotes), and the two online services were merged into one.

