Is there any XHTML class in C# that is not for ASP.NET? I need it for WinForms.
I need the class so I can read attribute values from some sites that use XHTML.
19 replies to this topic
#1
Posted 17 July 2011 - 03:38 PM
#2
Posted 17 July 2011 - 08:44 PM
The XML classes should be able to deal with XHTML.
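As a minimal sketch of what that could look like (an illustration, not code from this thread): assuming the page really is well-formed XHTML, LINQ to XML can load it and read attribute values. The class name, method name, and sample markup are made up for the example.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class XhtmlAttributes
{
    // XHTML elements live in this namespace, so element names must be qualified with it.
    static readonly XNamespace Xhtml = "http://www.w3.org/1999/xhtml";

    // Returns the href value of every <a> element in a well-formed XHTML string.
    public static string[] ReadLinkTargets(string xhtml)
    {
        XDocument doc = XDocument.Parse(xhtml);
        return doc.Descendants(Xhtml + "a")
                  .Select(a => (string)a.Attribute("href")) // null when the attribute is absent
                  .Where(href => href != null)
                  .ToArray();
    }

    static void Main()
    {
        // Hypothetical sample page, only for demonstration.
        string page =
            "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body>" +
            "<a href=\"/funds\">Funds</a><a href=\"/about\">About</a>" +
            "</body></html>";

        foreach (string href in ReadLinkTargets(page))
            Console.WriteLine(href);
    }
}
```

Strict XML parsing throws an XmlException on markup that is not well-formed (an unclosed tag, an undeclared entity such as &nbsp;), so this only suits pages that are valid XHTML.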
#3
Posted 22 July 2011 - 06:25 AM
The problem is that not all XHTML pages are compliant.
Try downloading the HtmlAgilityPack; it's amazing. It handles XHTML and even plain HTML, in a DOM that is very similar to the XML DOM. It even lets you run XPath queries against it.
I could provide an example when I get back to my workstation. Anyway, it's available on CodePlex.
Html Agility Pack
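In the meantime, a rough sketch of the idea, assuming the HtmlAgilityPack assembly is referenced in the project. The class name, method name, and sample markup are made up for illustration; only the HtmlDocument and HtmlNode calls come from the library.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

static class AgilityPackSketch
{
    // Returns the href value of every <a> element that has one.
    public static string[] ReadLinkTargets(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // tolerant parser: does not require well-formed markup

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//a[@href]");
        if (links == null)
            return new string[0];

        return links.Select(a => a.GetAttributeValue("href", "")).ToArray();
    }

    static void Main()
    {
        // Deliberately sloppy sample markup: unquoted attribute, unclosed <p> tags.
        string html = "<html><body><p class=intro>Hello<p>" +
                      "<a href='/funds'>Funds</a></body></html>";

        foreach (string href in ReadLinkTargets(html))
            Console.WriteLine(href);
    }
}
```

The null check matters in practice: looping over the result of SelectNodes directly fails with a NullReferenceException when the XPath matches nothing.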
#4
Posted 22 July 2011 - 09:06 AM
Can you give a quick overview of how to use it (to sam_coder)? I want to read some values from HTML code in my program and show them in a DataGridView control. I have never worked with HTML in C#, so any help would be great.
There are some values on hrportfolio.com | FONDOVI | Hrvatski otvoreni investicijski fondovi detaljno - prinosi fondova, grafički prikaz, usporedba fondova, kupnja udjela u fondu, pristup fondu | for funds like KD Victoria. I only need the values from 2 columns, "Vrijednost" and "Promj.%".
Can someone give me an example?
#5
Posted 22 July 2011 - 09:09 AM
I'll see what I can do, tonchi.
What you're trying to do is typically called 'scraping'.
It has some dangers associated with it. For instance, scraping content without permission might be against the site's end-user policies; it's always better to ask permission.
It's also possible that the page could change, rendering your scraping code incompatible. XPath greatly mitigates this risk, mind you, generally keeping code changes to a minimum.
Anyway, that said, I'll see if I can whip up something simple.
#6
Posted 23 July 2011 - 06:31 AM
Hey Tonchi,
Yeah, you will likely have to re-reference the HtmlAgilityPack; I'm not sure whether I packaged it or not.
I can't read your language, so I just called a couple of the columns Data, Data2, etc.
But you get the gist.
ScreenScraping.zip 179.01K
So, anyway, I haven't tried to make it look pretty or anything. I'm kinda busy. =)
What gets complicated about these documents, and this one in particular, is all the embedded tables: tables and tables and tables.
So what I do is look for a spot in the document and sync on that: download the page and ignore everything up to that point. That lets me throw much cleaner XPath queries at it.
The Html Agility Pack is really forgiving, so you can throw any malformed markup at it; I've never seen it fail to make sense of it. =)
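A rough sketch of that sync step, which is not necessarily how the attached project does it: assuming the page has already been downloaded into a string, everything before a known marker is dropped before parsing. SkipTo and the sample markup are made-up names for illustration; the table id tabelaTec1 is the one on the page in question.

```csharp
using System;

static class SyncSketch
{
    // Returns the part of the page that starts at the first occurrence of marker,
    // or the whole page unchanged if the marker is not found.
    public static string SkipTo(string page, string marker)
    {
        int start = page.IndexOf(marker, StringComparison.OrdinalIgnoreCase);
        return start < 0 ? page : page.Substring(start);
    }

    static void Main()
    {
        // Hypothetical stand-in for the downloaded page: a layout table, then the data table.
        string page = "<table><tr><td>layout</td></tr></table>" +
                      "<table id='tabelaTec1'><tr><td>KD Victoria</td></tr></table>";

        Console.WriteLine(SkipTo(page, "<table id='tabelaTec1'"));
    }
}
```

Once the leading markup is gone, the XPath can start near the data table instead of walking through every wrapper table above it.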
enjoy!
#7
Posted 23 July 2011 - 06:32 AM
Oh, and the XML file in the solution: that's just the source of the page you wanted me to hit. You can ignore it; I just had it there to keep my head straight.
#8
Posted 23 July 2011 - 06:52 AM
Thanks a lot, that was my first problem.
But I have 2 more questions. The first is: how did you fill in the column names in the table (Name, Data, Data2, Data3)?
The second is: how can I save the data from that table (as far as I can see, it's not a DataGridView control)? After I save the data, I want to load it from the last save, and when I click the "upload" button, any changed data should go into a new row, so the previous data stays in the table.
#9
Posted 23 July 2011 - 07:12 AM
OK, well, I used a ListView control, and ListViewItem has a constructor that takes a string array.
In the Update Grid, you can see that I'm just appending each column of the table into the array.
A DataGridView would work very similarly. Just make a DataTable with the appropriate columns, and then use an object array (make sure all columns are of string type),
so then you end up with
dt.Rows.Add(new object[] {
    node.SelectSingleNode("td[@class='colFond']/a").InnerText, //scrape the appropriate fields
    node.SelectSingleNode("td[@class='colDatum']").InnerText,
    node.SelectSingleNode("td[@class='colVrijednost']").InnerText,
    node.SelectSingleNode("td[@class='colValuta']").InnerText
});
The DataGridView can be bound to that DataTable using the DataSource property on the grid.
That's it; that would allow the data to be shown.
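To spell out the DataTable side, here is a minimal sketch under a few assumptions: the column names are the placeholder ones from the sample project (Name, Data, Data2, Data3), the row values are invented sample data, and dataGridView1 is a hypothetical grid on your form.

```csharp
using System;
using System.Data;

static class GridTableSketch
{
    // Builds an empty DataTable with four string columns.
    public static DataTable CreateTable()
    {
        var dt = new DataTable("Funds"); // table name is an arbitrary choice
        foreach (string name in new[] { "Name", "Data", "Data2", "Data3" })
            dt.Columns.Add(name, typeof(string));
        return dt;
    }

    static void Main()
    {
        DataTable dt = CreateTable();

        // Invented sample row; in the scraper these would be the InnerText values.
        dt.Rows.Add(new object[] { "KD Victoria", "22.07.2011", "12,34", "HRK" });

        // In a WinForms form, binding is a single assignment (hypothetical control name):
        // dataGridView1.DataSource = dt;

        Console.WriteLine("{0} row(s), {1} column(s)", dt.Rows.Count, dt.Columns.Count);
    }
}
```

Because the grid is bound to the table, rows added to the DataTable later show up in the grid without any extra code.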
#10
Posted 23 July 2011 - 09:48 AM
Is there any property that lets me copy that information so I can paste it into Word?
#11
Posted 23 July 2011 - 10:37 AM
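Is this the way to copy text from a ListBox?
private void listBox1_MouseClick(object sender, MouseEventArgs e)
{
    // Copy only on right-click, and only when an item is selected (SelectedIndex is -1 otherwise)
    if (e.Button == MouseButtons.Right && listBox1.SelectedIndex >= 0)
    {
        Clipboard.SetText(listBox1.Items[listBox1.SelectedIndex].ToString());
    }
}
And can I paste it into Word?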
#12
Posted 23 July 2011 - 01:07 PM
And what's with scrapeResults? Where did you define it? I copied all of the code from your project into mine, and this is the error from VS:
Quote
Error 2 The name 'scrapeResults' does not exist in the current context c:\documents and settings\antonio\my documents\visual studio 2010\Projects\WindowsFormsApplication2\WindowsFormsApplication2\Form1.cs 66 13 WindowsFormsApplication2
It comes from these lines:
...
doc.LoadHtml(content); //Load the content into the structure
scrapeResults.Items.Clear(); //clear the table
foreach (HtmlNode node in doc.DocumentNode.SelectNodes(
    "/div/div/table[@id='tabelaTec1']/tbody/tr"))
...
and
...
{
    //add a new listview item for each item in the table
    scrapeResults.Items.Add(new ListViewItem(new string[] {
        node.SelectSingleNode("td[@class='colFond']/a").InnerText, //scrape the appropriate fields
        node.SelectSingleNode("td[@class='colDatum']").InnerText,
...
And what's with this error? (I can't fix it.)
Quote
Error 1 The type or namespace name 'Form1' could not be found (are you missing a using directive or an assembly reference?) c:\documents and settings\antonio\my documents\visual studio 2010\Projects\WindowsFormsApplication2\WindowsFormsApplication2\Program.cs 18 33 WindowsFormsApplication2
It's from here:
    static void Main()
    {
        Application.EnableVisualStyles();
        Application.SetCompatibleTextRenderingDefault(false);
        Application.Run(new Form1());
    }
}