Projects / Herold


Herold converts HTML files to DocBook files. It tries to detect the structure of the HTML code by analyzing the heading elements. Herold can suppress table elements and serialize their contents. You can also exclude certain elements via XPath expressions.
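As a rough illustration of the two ideas above (structure detection from headings and exclusion via XPath), here is a minimal Python sketch. It is not Herold's code or API: the function names `remove_matches` and `to_sections`, the sample markup, and the use of the standard library's limited XPath subset are all assumptions made for this example, and the input is assumed to be well-formed XHTML.

```python
import xml.etree.ElementTree as ET

# Map heading tag names to their nesting level (h1 -> 1, ..., h6 -> 6).
HEADING_LEVELS = {f"h{n}": n for n in range(1, 7)}


def remove_matches(root, xpath):
    """Drop every element matched by an ElementTree-style XPath expression.

    Hypothetical helper: ElementTree supports only a subset of XPath, and
    any tail text of a removed element is discarded with it.
    """
    parents = {child: parent for parent in root.iter() for child in parent}
    for element in root.findall(xpath):
        parent = parents.get(element)
        if parent is not None:
            parent.remove(element)


def to_sections(body):
    """Nest the flat children of `body` into DocBook-like <section> elements.

    Only h1-h6 and p children are handled in this sketch; other elements
    are skipped.
    """
    article = ET.Element("article")
    stack = [(0, article)]  # (heading level, currently open container)
    for child in body:
        level = HEADING_LEVELS.get(child.tag)
        if level is None:
            if child.tag == "p":
                ET.SubElement(stack[-1][1], "para").text = child.text
            continue
        # Close every open section at the same or a deeper level.
        while stack[-1][0] >= level:
            stack.pop()
        section = ET.SubElement(stack[-1][1], "section")
        ET.SubElement(section, "title").text = child.text
        stack.append((level, section))
    return article


html = """<body>
<h1>Guide</h1>
<p>Intro</p>
<p class="nav">menu</p>
<h2>Install</h2>
<p>Steps</p>
<h1>Appendix</h1>
</body>"""

body = ET.fromstring(html)
remove_matches(body, ".//p[@class='nav']")  # exclude navigation residue
article = to_sections(body)
print(ET.tostring(article, encoding="unicode"))
```

The stack keeps track of the open sections, so an h2 after an h1 becomes a nested section, while a later h1 closes both and starts a new top-level section.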


Recent releases

  •  27 Mar 2013 09:44

    Release Notes: This release adds section detection via CSS class names. If your HTML marks up headings not with h1-h6 tags but with CSS-formatted p, div, or similar tags, the new section detection can help you create the proper sectioning structure in DocBook. It also adds list detection via CSS class names. Sometimes HTML contains "lists" that are really specially formatted paragraphs; the new list detection can help you reconstruct proper lists in DocBook.

  •  29 Nov 2012 10:21

    Release Notes: The lang attribute of pre elements is now preserved. Command-line arguments were cleaned up. New and improved profiles were provided. Creation of invalid DocBook XML when transforming an element with nested img elements was fixed. Processing of meta elements was added, and minor fixes were made.

  •  06 Nov 2012 19:09

    Release Notes: This release fixes the installation program.

  •  04 Nov 2012 14:47

    Release Notes: This release fixes usage of invalid values for the align attribute and fixes wrong normalization of literal environments.

  •  25 Apr 2012 09:58

    Release Notes: This release is licensed under GPLv3, reorganizes command-line arguments, introduces profile files, ends support for Groovy scripts, and fixes computation of colspans.
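
The section and list detection via CSS class names described in the 27 Mar 2013 release notes can be pictured roughly as follows. This is only a sketch under stated assumptions, not Herold's implementation or configuration format: the class names (`title-1`, `title-2`, `bullet`), the mapping tables, and the functions `classify` and `convert` are hypothetical, and the input is assumed to be well-formed XHTML.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration: which CSS classes mark headings (and at which
# level) and which class marks a paragraph that is really a list item.
SECTION_CLASSES = {"title-1": 1, "title-2": 2}
LIST_ITEM_CLASS = "bullet"


def classify(element):
    """Return ("heading", level), ("listitem", None), or ("para", None)."""
    css = element.get("class", "").split()
    for name in css:
        if name in SECTION_CLASSES:
            return ("heading", SECTION_CLASSES[name])
    if LIST_ITEM_CLASS in css:
        return ("listitem", None)
    return ("para", None)


def convert(body):
    """Build DocBook-like sections and lists from class-styled elements."""
    out = ET.Element("article")
    stack = [(0, out)]   # (section level, currently open container)
    current_list = None  # open <itemizedlist>, if any
    for child in body:
        kind, level = classify(child)
        if kind == "heading":
            current_list = None
            # Close every open section at the same or a deeper level.
            while stack[-1][0] >= level:
                stack.pop()
            section = ET.SubElement(stack[-1][1], "section")
            ET.SubElement(section, "title").text = child.text
            stack.append((level, section))
        elif kind == "listitem":
            # Consecutive list-styled paragraphs share one list.
            if current_list is None:
                current_list = ET.SubElement(stack[-1][1], "itemizedlist")
            item = ET.SubElement(current_list, "listitem")
            ET.SubElement(item, "para").text = child.text
        else:
            current_list = None
            ET.SubElement(stack[-1][1], "para").text = child.text
    return out


html = """<body>
<p class="title-1">Overview</p>
<p class="bullet">First</p>
<p class="bullet">Second</p>
<div class="title-2">Details</div>
<p>Text</p>
</body>"""

docbook = convert(ET.fromstring(html))
print(ET.tostring(docbook, encoding="unicode"))
```

Grouping consecutive list-styled paragraphs into one list is a design choice of this sketch; a heading or ordinary paragraph ends the current list.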

