Jericho HTML Parser for Windows 3.2

Martin Jericho in Development \ Components and Libraries

Jericho HTML Parser is a Java library allowing analysis and manipulation of parts of an HTML document, including server-side tags, while reproducing verbatim any unrecognised or invalid HTML. It also provides high-level HTML form manipulation functions.

It is an open source library released under both the Eclipse Public License (EPL) and GNU Lesser General Public License (LGPL). You are therefore free to use it in commercial applications subject to the terms detailed in either one of these licence documents.

The javadocs provide comprehensive documentation of the entire API, as well as being a very useful reference on aspects of HTML and XML in general.

<b>Features:</b>

The library distinguishes itself from other HTML parsers with the following major features:

* The presence of badly formatted HTML does not interfere with the parsing of the rest of the document, which makes the library ideal for use with "real-world" HTML that chokes other parsers.
* ASP, JSP, PSP, PHP and Mason server tags are explicitly recognised by the parser. This means that normal HTML is still parsed properly even if it contains server tags, which is common, for example, when element attributes are set dynamically.
* A new stream-based parsing option, using the StreamedSource class, allows memory-efficient processing of large files using an event iterator. This is essentially a StAX alternative with the ability to process HTML and non-validating XML, as well as several other features not available in other streaming parsers.
* In its standard form it is neither an event-based nor a tree-based parser, but rather uses a combination of simple text search, efficient tag recognition and a tag position cache. The text of the whole source document is first loaded into memory, and then only the relevant segments are searched for the relevant characters of each search operation.
* Compared to a tree based parser such as DOM, the memory and resource requirements can be far better if only small sections of the document need to be parsed or modified. Incorrect or badly formatted HTML can easily be ignored, unlike tree based parsers which must identify every node in the document from top to bottom.
* Compared to an event based parser such as SAX, the interface is on a much higher level and more intuitive, and a tree representation of the document element hierarchy is easily created if required.
* The begin and end positions in the source document of all parsed segments are accessible, allowing modification of only selected segments of the document without having to reconstruct the entire document from a tree.
* The row and column number of each position in the source document are easily accessible.
* Provides a simple but comprehensive interface for the analysis and manipulation of HTML form controls, including the extraction and population of initial values, and conversion to read-only or data display modes. Analysis of the form controls also allows data received from the form to be stored and presented in an appropriate manner.
* Custom tag types can be easily defined and registered for recognition by the parser.
* Built-in functionality to extract all text from HTML markup, suitable for feeding into a text search engine such as Apache Lucene.
* Built-in functionality to render HTML markup with simple text formatting.
* Built-in functionality to format HTML source code, indenting elements according to their depth in the document element hierarchy.
* Built-in functionality to compact HTML source code by removing all unnecessary white space.
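
The text-search and position-based approach described in the list above can be illustrated with a short plain-Java sketch. This is not the Jericho API: the class SegmentEditSketch and its methods are hypothetical names invented for this illustration, and the tag matching is deliberately naive (it assumes no '>' characters inside attribute values). It only shows the general idea of locating a tag by text search, reporting its row and column, and replacing that one segment while the rest of the source, including invalid markup, is reproduced verbatim.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration only. All names here are hypothetical, NOT the Jericho API.
final class SegmentEditSketch {

    /** Begin (inclusive) and end (exclusive) offsets of a segment in the source text. */
    static final class Span {
        final int begin;
        final int end;
        Span(int begin, int end) { this.begin = begin; this.end = end; }
    }

    /** Finds start tags with the given name by simple text search; no tree is built. */
    static List<Span> findStartTags(String source, String name) {
        List<Span> spans = new ArrayList<>();
        int from = 0;
        while (true) {
            int begin = source.indexOf('<', from);
            if (begin < 0) break;
            from = begin + 1;
            // Case-insensitive comparison of the characters after '<' with the tag name.
            if (!source.regionMatches(true, begin + 1, name, 0, name.length())) continue;
            int afterName = begin + 1 + name.length();
            int close = source.indexOf('>', afterName);
            if (close < 0) break; // unterminated tag: nothing more to find
            char next = source.charAt(afterName);
            // Require a name boundary so that "b" does not match "<body>".
            if (next == '>' || next == '/' || Character.isWhitespace(next)) {
                spans.add(new Span(begin, close + 1));
                from = close + 1;
            }
        }
        return spans;
    }

    /** Returns the 1-based {row, column} of a character offset in the source. */
    static int[] rowColumn(String source, int pos) {
        int row = 1;
        int col = 1;
        for (int i = 0; i < pos; i++) {
            if (source.charAt(i) == '\n') { row++; col = 1; } else { col++; }
        }
        return new int[] {row, col};
    }

    /** Replaces one segment; everything outside it is copied unchanged. */
    static String replace(String source, Span span, String replacement) {
        return source.substring(0, span.begin) + replacement + source.substring(span.end);
    }

    public static void main(String[] args) {
        // The "<bogus <<" fragment is invalid HTML and is left untouched.
        String source = "<p>Hi\n<img src=a.png><bogus <<\n</p>";
        Span img = findStartTags(source, "img").get(0);
        int[] rc = rowColumn(source, img.begin);
        System.out.println("img tag at row " + rc[0] + ", column " + rc[1]);
        System.out.println(replace(source, img, "<img src=\"b.png\" alt=\"\">"));
    }
}
```

Because only offsets are recorded, the edit never has to interpret the malformed markup around the tag, which is the property the feature list attributes to position-based modification.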


Jericho HTML Parser for Windows 3.2 is Components and Libraries software developed by Martin Jericho. It is free to download and fully functional. Always use the genuine version released by the original publisher, Martin Jericho.


Similar Software

Jericho HTML Parser 3.2 jericho.htmlparser.net  Misc. Dev. Tools

Jericho HTML Parser is a Java library allowing analysis and manipulation of parts of an HTML document, including server-side tags, while reproducing verbatim any unrecognised or invalid HTML.

Elerium HTML .NET Parser 1.7 Elerium Software  Components and Libraries

Elerium HTML .NET Parser is a .NET component for parsing and manipulating HTML/XML documents and Cascading Style Sheets (CSS). The HTML Parser can be used in WinForms and ASP.NET (C#, VB.NET) applications. Component is fully independent and requires only .NET Framework. Elerium HTML .NET Parser...

Html Agility Pack 1.4.6 Darth Obiwan  Misc. Dev. Tools

This is an agile HTML parser that builds a read/write DOM and supports plain XPATH or XSLT (you actually don't HAVE to understand XPATH nor XSLT to use it, don't worry...). It is a .NET code library that allows you to parse "out of the web" HTML files. The parser is very tolerant with...

El-Kabong - HTML Parser 0.3.2 ekhtml.sourceforge.net  Misc. Dev. Tools

El-Kabong is a high-speed, forgiving, SAX-style HTML parser. Its aim is to provide consumers with a very fast, clean, lightweight library which parses HTML quickly, while forgiving syntactically incorrect tags.

Java Mozilla Html Parser 0.3.0 mozillaparser.sourceforge.net  Misc. Dev. Tools

MozillaParser is a Java HTML parser based on Mozilla's HTML parser. It acts as a bridge from Java classes to Mozilla's classes and outputs a Java Document object from raw (and dirty) HTML input.

php4-html-dom: Fast HTML Parser for PHP 0.10.0 php4-html-dom.sourceforge.net  Misc. Dev. Tools

Lightweight, fault-tolerant, high-speed, single-pass HTML parser. Builds an HTML DOM that is accessed much like the browser's DOM from JavaScript. Compatible with PHP4 and higher. Send in your feature requests.

BS Parser 1.00.00.02 G-92 Developers Group  HTML Tools

Semi-professional HTML parser. In the standard version, it allows the results of the processing to be edited and saved in RTF (Rich Text Format), which makes it very comfortable for daily use. Basic functions: BS-Parser is a typical multifunctional, modular, multi-stream parser,...

Chilkat .NET HTML-to-XML 9.2.1 Chilkat Software  Misc. Dev. Tools

HTML-to-XML is a .NET component that can help you transform an HTML file into well-formed XML for parsing. In effect, it is designed to be an HTML parser / scraper. Once HTML is converted to XHTML (i.e. well-formed XML), the plethora of existing XML parsing components and libraries can be...

Michael HTML parser 2.0 Michael Kochiashvili  Misc. Dev. Tools

This unit allows you to parse HTML code, extract HTML elements and NAME=Value pairs. Unit contains THtmlParser class and 3 functions

Xls2Html 3.0.0.134 Broennimann Informatik  File & Disk Management

Special Excel to HTML converter for Windows: Preserve your HTML layout. You define with your own HTML template what the resulting HTML page will look like. Minimize your HTML file size. Some programs produce tons of superfluous HTML code. Publish the converted HTML file to your FTP server. Email...


Popular Software of Development - Components and Libraries

aiCharts for Android 1.0.0 ArtfulBits Inc.  Components and Libraries

ArtfulBits Android chart (aiCharts) is a professional solution with a comprehensive feature set. aiCharts is a complete framework that allows developers to enhance applications with slick interactive charts in mere hours (with available technical support, samples and tutorials). aiCharts provides...

Java Bridge to Exchange 1.0 Moyosoft  Components and Libraries

The Java Bridge to Exchange product is an effective solution to access data stored in the Microsoft Exchange server from Java. Exchange items like e-mails, contacts and appointments can be accessed, created, imported or exported with the library. The library can be used to integrate Exchange with...

PrecisionID Code128 Barcode Fonts 4.0 PrecisionID  Components and Libraries

PrecisionID Code 128 includes TrueType, Binary PostScript, and ASCII PostScript Fonts so you can easily print barcodes in Excel, Word, Crystal Reports, and Access! PrecisionID Font Formatting Components(TM) simplify barcode generation with a Crystal Reports UFL, Microsoft VBA module for Excel,...

Luxand FaceCrop Face Detection SDK 1.0 Luxand Development  Components and Libraries

Face detection software provides web developers the perfect solution to greatly optimize and automate the process of creating professional-looking, passport-like photos from original images of any type. Regardless of the quality, size, aspect ratio of the original image, the software can produce...

Luxand FaceSDK 7.0 Luxand Development  Components and Libraries

Add facial recognition and biometric identification features to your applications. FaceSDK is a multi-platform library enabling Microsoft Visual C++, C#, Objective C, VB, Java and Delphi developers to implement fast and precise face recognition and identification in their applications. Working in...

wolfSSL 4.0.0 wolfSSL  Components and Libraries

The wolfSSL embedded SSL/TLS library is a lightweight SSL library written in ANSI standard C and targeted for embedded and RTOS environments - primarily because of its small size, speed, and feature set. It is commonly used in standard operating environments as well because of its royalty free...

DTMF IVR SYSTEM IN VB.NET 9.7.0 DTMF IVR SYSTEM IN VB.NET  Components and Libraries

Ozeki VoIP SIP .NET SDK allows you to develop a DTMF-navigated IVR system written in VB.NET. In this sample program an IVR tree is built up from an .xml file and SIP communication is supported by Ozeki SIP SDK. The extended codec support (G.729 and further codecs) assures excellent sound quality. The...

FlashViewer Engine 1.0 FeatherySoft  Components and Libraries

FlashViewer Engine is a set of components for Delphi, C++ Builder and Lazarus which adds extra features to Adobe Flash Player (ActiveX or Netscape plugin) such as loading from any sources, grab real 32-RGBA frames, real transparency playing. This great solution for playing Flash movies is an...

USB Monitoring Control 2.12.00.2387 HHD Software  Components and Libraries

USB Monitoring Control ActiveX Component (USBMC) - USB Devices Data Sniffing Control COM Library. Monitor USB Device From Your Application. USB Devices Connection Monitoring Component. The library lets you enumerate all installed USB devices, attach a monitor object to receive transferred data...

Dew Lab Studio for Delphi 2019.1 Dew Research  Components and Libraries

Dew Lab Studio includes MtxVec math library and additional signal analysis (DSP Master) and statistical analysis (Stats Master) add-on packages. Features include: - fast object oriented numerical/matrix library - Intel AVX 1/2, AVX-512, SSE4, Open CL and 64bit support enabled. - optional...