Wednesday, July 7, 2010

Screen scraping in C# using HtmlAgilityPack.

In my project, I used the HtmlAgilityPack library to fetch a web page and parse it.

1. First, create an ASP.NET project and add a reference to HtmlAgilityPack.

2. Create an 'HtmlWeb' object.

3. Load the HTML page by calling 'Load' on the 'HtmlWeb' object; it returns an 'HtmlDocument'.

4. You now have the parsed HTML page in the 'HtmlDocument' object.

5. You can display the content of this 'HtmlDocument' object in an ASP.NET 'Literal' control.

The code is below:


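// Requires: using HtmlAgilityPack;
// Download and parse the page.
HtmlWeb hwObject = new HtmlWeb();
HtmlDocument htmldocObject = hwObject.Load("http://www.c-sharpcorner.com");

// Show the page markup in the Literal control.
lLatestPrice.Text = htmldocObject.DocumentNode.InnerHtml;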
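
If only one part of the page is needed, for example a price, an XPath query can pull out that element instead of the whole document. The following is a minimal sketch, not part of the original project: the class 'FragmentExtractor', the method 'ExtractText' and the XPath "//div[@id='price']" are hypothetical names chosen for illustration, and the HTML is parsed from a string with 'LoadHtml' so the example runs without a network call.

```csharp
using System;
using HtmlAgilityPack;

public static class FragmentExtractor
{
    // Hypothetical helper: returns the trimmed text of the first node
    // matching the XPath, or an empty string when nothing matches.
    public static string ExtractText(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectSingleNode returns null when the XPath matches nothing.
        HtmlNode node = doc.DocumentNode.SelectSingleNode(xpath);
        return node == null ? "" : HtmlEntity.DeEntitize(node.InnerText).Trim();
    }
}
```

Assuming the target page really contains such an element, the result could be assigned to the Literal in place of the full markup, for example lLatestPrice.Text = FragmentExtractor.ExtractText(htmldocObject.DocumentNode.OuterHtml, "//div[@id='price']");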


You can find all the links on this page with a loop like this:

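// SelectNodes returns null when nothing matches, so check before looping.
HtmlNodeCollection links = htmldocObject.DocumentNode.SelectNodes("//a[@href]");
if (links != null)
{
    foreach (HtmlNode link in links)
    {
        string text = link.InnerText;                      // link text
        string href = link.GetAttributeValue("href", "");  // link target
    }
}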
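
The link loop can also be wrapped in a reusable method that returns each link's text together with its absolute URL. This is only a sketch under stated assumptions: 'LinkExtractor' and 'ExtractLinks' are hypothetical names that do not come from the original project, the HTML is passed in as a string, and relative links are resolved against a base address supplied by the caller.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class LinkExtractor
{
    // Hypothetical helper: returns (link text, absolute URL) pairs
    // for every <a> element that has an href attribute.
    public static List<KeyValuePair<string, string>> ExtractLinks(string html, Uri baseUri)
    {
        var links = new List<KeyValuePair<string, string>>();
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
        {
            return links;
        }

        foreach (HtmlNode anchor in anchors)
        {
            // Decode entities such as &amp; in both the target and the text.
            string href = HtmlEntity.DeEntitize(anchor.GetAttributeValue("href", ""));
            string text = HtmlEntity.DeEntitize(anchor.InnerText).Trim();

            // Resolve relative links against the base address; skip unparsable ones.
            Uri absolute;
            if (Uri.TryCreate(baseUri, href, out absolute))
            {
                links.Add(new KeyValuePair<string, string>(text, absolute.ToString()));
            }
        }
        return links;
    }
}
```

With the page loaded as in the earlier code, a call might look like LinkExtractor.ExtractLinks(htmldocObject.DocumentNode.OuterHtml, new Uri("http://www.c-sharpcorner.com")).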

That's it.
