After using HTMLAgilityPack for a long time, I finally switched to Jsoup. Jsoup is better at parsing HTML, and the best thing is that it supports jQuery-like selectors, which make selecting elements easy. I have often noticed HTMLAgilityPack failing to extract the correct data. For example, when extracting the meta keywords and description for a website built with ASP.NET, it usually fails because of extra spaces added by master pages.
How do you bring Jsoup, which is written in Java, to the .NET world?
The solution is very simple. Jsoup can easily be compiled to a .NET library using IKVM.NET. You will need the IKVM.NET VM (its runtime libraries) to run the compiled Jsoup library, but in the last year and a half I haven’t seen any issues because of this. It has handled almost every task for which I was using HTMLAgilityPack. There is a small learning curve when switching to Jsoup, but it is worth it. One more point: Jsoup comes with a good list of examples and selectors to make your life easy.
Because of licensing and distribution issues, I am not posting the .NET-compiled Jsoup here. If you have problems converting it, just drop me a line.
C# Snippet using Jsoup
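// using Jsoup (compiled with IKVM.NET); html is assumed to hold the page markup as a string
var doc = org.jsoup.Jsoup.parse(html);
var name = doc.select("#post").first().select("H2").first().text();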
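For anyone who wants something more complete to start from, here is a rough sketch of the whole round trip. It assumes you have already run ikvmc on the Jsoup jar and referenced the resulting DLL plus the IKVM.NET runtime assemblies in your project. The jar name, the class name JsoupDemo, and the sample HTML are made up for illustration, so adjust them to your setup. It also shows the meta description case mentioned earlier, trimming the stray spaces by hand.

```csharp
// Build step (run once, outside this program). The jar file name is a placeholder:
//   ikvmc -target:library -out:jsoup.dll jsoup.jar
// Then reference jsoup.dll and the IKVM.NET runtime assemblies from the C# project.

using System;
using org.jsoup;        // Java packages show up as .NET namespaces after conversion
using org.jsoup.nodes;

class JsoupDemo
{
    static void Main()
    {
        // Sample markup (hypothetical) with padded meta content, similar to
        // what an ASP.NET master page might emit.
        const string html =
            "<html><head>" +
            "<meta name=\"description\" content=\"  A sample page  \">" +
            "<meta name=\"keywords\" content=\" jsoup, ikvm \">" +
            "</head><body><div id=\"post\"><h2>Hello</h2></div></body></html>";

        Document doc = Jsoup.parse(html);

        // first() returns null when nothing matches, so check before using it.
        Element description = doc.select("meta[name=description]").first();
        if (description != null)
        {
            // attr() returns a Java string, which IKVM maps to System.String.
            Console.WriteLine(description.attr("content").Trim());
        }

        Element keywords = doc.select("meta[name=keywords]").first();
        if (keywords != null)
        {
            Console.WriteLine(keywords.attr("content").Trim());
        }

        // Same lookup as the one-liner above, written as a single descendant selector.
        Element heading = doc.select("#post h2").first();
        if (heading != null)
        {
            Console.WriteLine(heading.text());
        }
    }
}
```

The null checks matter because first() gives back null when the selector matches nothing, and chaining straight through, as in the one-liner, will throw on pages that lack the element.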