WHtmlParser

What is WHtmlParser?

WHtmlParser was created to parse HTML code as fast as possible. It does not enforce the constraints that parsers following the W3C specification impose. For example, you can put any tag in any place, and nothing will be added automatically.

If you want to parse and modify HTML code in an object-oriented way, I recommend the WQuery library, which wraps the parser in an object that exposes user-friendly methods similar to those in jQuery.

Both WQuery and WHtmlParser are part of the Wojdav Bootstrap Mvc library and are subject to the same license conditions.

See the parser limitations before using it.

Parser limitations

The HTML code must be correct; otherwise, an exception is thrown. The parser does not add or close missing tags automatically. For example, we can place a p tag directly inside an html tag, and the parser will not add head or body tags.
Attribute values must be enclosed in double quotation marks (id="Main"); attributes that take no value are written without one (checked).
The parser does not accept attribute values enclosed in single quotation marks or written without quotation marks. Both cases end with an exception: id='1' and id=1.
The parser treats a doctype declaration as a single character string and does not parse its individual attributes.
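
The two attribute rules above can be checked before the input reaches the parser. The sketch below is not part of WHtmlParser: the class name AttributePreCheck and the method FindUnsupported are hypothetical, and it relies only on the standard .NET Regex class. It assumes that no attribute value contains a > character, so treat it as a rough pre-check rather than a validator.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of WHtmlParser.
public static class AttributePreCheck
{
    // Matches an opening tag; "attrs" holds everything after the tag name.
    // Simplification: a '>' inside an attribute value ends the match early.
    private static readonly Regex Tag =
        new Regex(@"<[A-Za-z][^\s/>]*(?<attrs>[^>]*)>");

    // Matches one attribute in any of four forms: name, name="v", name='v', name=v.
    // Double-quoted values are consumed whole, so text inside them is never re-scanned.
    private static readonly Regex Attribute = new Regex(
        @"(?<name>[^\s=<>""'/]+)(?:\s*=\s*(?:""[^""]*""|(?<single>'[^']*')|(?<bare>[^\s""'>]+)))?");

    // Returns every attribute whose value is single-quoted or unquoted.
    public static List<string> FindUnsupported(string html)
    {
        var found = new List<string>();
        foreach (Match tag in Tag.Matches(html))
        {
            foreach (Match attr in Attribute.Matches(tag.Groups["attrs"].Value))
            {
                if (attr.Groups["single"].Success || attr.Groups["bare"].Success)
                {
                    found.Add(attr.Value);
                }
            }
        }
        return found;
    }

    public static void Main()
    {
        string html = "<input id=\"Main\" checked><div id='1'></div><p id=1></p>";
        foreach (string attr in FindUnsupported(html))
        {
            Console.WriteLine(attr); // prints id='1' and then id=1
        }
    }
}
```

An empty result means that no single-quoted or unquoted attribute values were found. It does not guarantee that the parser will accept the input, because missing or unclosed tags are not checked.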

These limitations will be removed gradually as the library is developed.

If you want to parse large pages of unknown origin, I recommend the AngleSharp parser.

Usage

We create a parser object. Parsing HTML code with it returns a list of nodes, on which we can call further methods. In this example, we add a class, an id attribute, and a child element to the parsed element.

                
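    // Parse the markup; the first node in the returned list is the div element.
    var htmlParser = new HtmlParser();
    IList<INode> list = htmlParser.Parse("<div></div>");
    var element = list[0] as IElement;

    // Add a class, an id attribute, and a child p element with a text node.
    element.ClassList.Add("myClass");
    element.Attributes.SetNamedItem("id", "myID");
    element.AppendChild(new Element("p").AppendChild(new TextNode("Welcome")));

    // Serialize the modified element back to HTML.
    Console.WriteLine(element.ToHtml());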
                
            
Using the parser

When we call the ToHtml method on the element, the parsed and modified content is returned as HTML code.

 
                
                    <div class="myClass" id="myID">
                        <p>Welcome</p>
                    </div>
                
            
Result of the ToHtml method