| Author |
Scraping HTML or CSV to database |
frangonve
Member
Posts: 110
Location: Madrid (Spain) until I can live at the beach
Joined: 18.06.08 |
| Posted on 19-12-2009 12:54 |
|
|
Hello, I'm trying to "Limnor" a program to:
A. Get some data from an HTML page.
B. Import it into a database form.
C. Show the results in a window after some processing.
I'm still stuck on the first step.
Please find enclosed the snapshot of the html page:

and the html source code here:
http://docs.google.com/Doc?docid=0ASY4vrkiyweqZGhkM3ZuNmNfM2Y1cnM0Ymhq&hl=en
I need the program to:
1 Fill text data in some fields.
2 Click Apply button.
3 Copy the data from the resulting table into the database table so that it can be processed.
Can you give me a hint?
Cheers
Francisco
Edited by frangonve on 23-03-2010 00:54 |
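Steps 1 and 2 above (fill in the fields, click Apply) usually come down to sending the form's field values to the server in one HTTP request. Here is a minimal Python sketch of that idea, assuming the form is submitted with POST. The URL and the field names (`currency`, `date`) are hypothetical placeholders and are not taken from the real page. The request is only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

def build_form_request(action_url, fields):
    """Build (but do not send) the POST request a browser would make
    when the form's Apply button is clicked."""
    body = urlencode(fields).encode("ascii")
    return Request(
        action_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )

# Hypothetical URL and field names, for illustration only.
request = build_form_request(
    "http://example.com/history",
    {"currency": "EUR", "date": "2010-03-19"},
)
print(request.get_method(), request.full_url)
print(request.data.decode("ascii"))
```

Passing `request` to `urllib.request.urlopen` would actually send it. The HTML that comes back would then contain the table needed for step 3.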
|
| Author |
RE: HTML table to database |
frangonve
Member
Posts: 110
Location: Madrid (Spain) until I can live at the beach
Joined: 18.06.08 |
| Posted on 19-03-2010 00:06 |
|
|
The web page can show the data in CSV style:

Source code can be found here:
https://docs.google.com/Doc?docid=0ASY4vrkiyweqZGhkM3ZuNmNfNWZtcjRyMmd2&hl=en
Can this CSV data be imported by LimnorStudio to be processed?
Cheers
Francisco |
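For comparison, outside Limnor Studio, loading CSV text into a database table takes only a few lines. This is a rough Python sketch using the standard `csv` and `sqlite3` modules. The column names and values in `sample` are made up, and the table name `rates` is an assumption. The first CSV row is assumed to hold the column names.

```python
import csv
import io
import sqlite3

def import_csv(conn, csv_text, table="rates"):
    """Load CSV text (first row = column names) into a SQLite table."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    columns = ", ".join('"{}"'.format(name.strip()) for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute('CREATE TABLE "{}" ({})'.format(table, columns))
    conn.executemany(
        'INSERT INTO "{}" VALUES ({})'.format(table, placeholders), data
    )
    conn.commit()
    return len(data)

# Made-up sample data; the real page's columns may differ.
sample = "Date,Rate\n2010-03-18,1.3610\n2010-03-19,1.3532\n"
conn = sqlite3.connect(":memory:")
count = import_csv(conn, sample)
print(count, conn.execute('SELECT * FROM rates ORDER BY "Date"').fetchall())
```

All values are stored as text here. Converting them to numbers or dates would be part of the later processing step.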
|
| Author |
RE: moving data between environments. |
vince
Member
Posts: 100
Location:
Joined: 30.01.07 |
| Posted on 21-03-2010 15:09 |
|
|
Frangonve,
This looks difficult in some respects...
You have tried to move information between platforms (different environments and systems) before.
This does become complicated. Web servers use a mechanism called CGI (Common Gateway Interface) to handle information that is submitted from web pages, processed, and sent back to the web page.
You want to do a similar thing, but CGI can become complex indeed.
Try to find out WHY this information needs to be treated in this manner.
Is there a more readily available solution whereby a staff member (an employee) can simply input the data?
Can the original DATA not be collected in the RIGHT MANNER at the very outset?
---------------------------------------------------------------------------
You can use the existing network of the World Wide Web. Using several computers, you can display the table and have just ONE simple SQL query per computer.
The query would be different on each machine, and you would gather the information that way. You can use email as the protocol
and search the HTML for validated entries that interest you using a browser front end (Find in the browser menu).
Pass the results file(s) on via email and gather them together into a NEW database in Limnor Studio. Place the required actions and processing behind that NEW database and execute it.
The deliverables would be the database results, which can then be sent to update the original system (the web page, via RSS or manually by an employee).
yes
Thank you
Vincent |
|
| Author |
RE: Moving data formats between computing environments. |
vince
Member
Posts: 100
Location:
Joined: 30.01.07 |
| Posted on 21-03-2010 15:49 |
|
|
The data already exists.
WHY is it not possible to go to the source of the original information?
Ask for that information in its original state!
Is the "HTML form information" available in its original state?
Can you get help from someone who can supply the raw data
behind that form?
You can open a web account (free web hosting) and create a new
SQL database.
There may be a facility to link a FORM on a web page to an
existing database that you created.
Otherwise you can use other tools for building online web databases.
These tools are free. (email me)
vwhyman@yahoo.co.uk
Thank you
Vincent.
p.s. what about cut and paste? |
|
| Author |
RE: Good opinions on "HTML form to database" |
vince
Member
Posts: 100
Location:
Joined: 30.01.07 |
| Posted on 21-03-2010 19:11 |
|
|
Hello F,
There is a web page with answers,
near the bottom:
http://www.experts-exchange.com/Programming/Misc/Q_21613559.html
The idea is to see whether the web site you are interested in has an existing RSS feed. Then you can use that information instead of the
HTML web page.
OR there is the idea to "copy and paste" and export as a CSV file, etc.
Thank you
Vincent. |
|
| Author |
RE: HTML table to database |
SOE
Member
Posts: 60
Location:
Joined: 24.06.05 |
| Posted on 22-03-2010 05:29 |
|
|
I can see some situations where you don't have access to the original data, only to its HTML form, and would still want to use it.
1. Suppose you wanted to make a quiz program based on the data from a website. Your program is going to manipulate the data in a different way than it is presented on the website, but you want to use the website's data as source information.
2. Suppose the data is being updated constantly, so you don't want old data, but new and current data. You don't control the data, so you don't control what is updated.
If the data coming from the site is dynamic and your program reflects that in some manner, then just copying and pasting it into a CSV or spreadsheet program might not be wanted.
The RSS feed is an "if" type thing. The data may not be presented as a feed. By the way, there are scripts floating around to convert data to a feed. Examples: sql2rss (payware), and free ones like http://kellychronicles.spaces.live.com/blog/cns!A0D71E1614E8DBF8!1361.entry-
If the author's data is not sent out as a feed, you might ask him to do so by suggesting a script to him.
Without the author's cooperation, or if you are just trying to use data from any old website, you would probably need to use JavaScript or try ADOdb.
|
| Author |
RE: HTML table to database |
frangonve
Member
Posts: 110
Location: Madrid (Spain) until I can live at the beach
Joined: 18.06.08 |
| Posted on 23-03-2010 00:23 |
|
|
Hello,
I've been reading a bit and here is what I've found so far:
Perhaps all this is too obvious for experienced coders, but well... you know Limnor attracts people like me: old time power users with "some" programming abilities who get lost in the new jungle of scripting languages, OOPS parlance,... ;)
I think the term is scraping:
http://en.wikipedia.org/wiki/Web_scraping
and if I'm not wrong .NET has the capabilities to do it easily:
http://authors.aspalliance.com/stevesmith/articles/netscrape.asp
According to the article, it's really just incredibly easy with .NET.
In the article linked above, Steven Smith mentions two objects: WebRequest and WebResponse.
As he explains, everything you need to do screen scraping in .NET is in the System.Net namespace. In particular, you will want to become familiar with the WebRequest and WebResponse objects, which perform the task of sending a request over HTTP and returning the response, respectively.
He also notes that you need to import the System.IO namespace to use the StreamReader class.
And he freely offers (he would appreciate a donation or credit, so credit is given) a C# and a VB example, which I've pasted below as given by the author.
Perhaps these objects can be used in Limnor Studio to import the data requested from the Oanda web page... Hints and comments are welcome.
Cheers
Francisco
<%@ Import Namespace="System.Net" %>
<%@ Import Namespace="System.IO" %>
<script language="C#" runat="server">
    void Page_Load(Object Src, EventArgs E) {
        myPage.Text = readHtmlPage("http://aspadvice.com/blogs/ssmith/");
    }

    private String readHtmlPage(string url)
    {
        String result;
        WebResponse objResponse;
        WebRequest objRequest = System.Net.HttpWebRequest.Create(url);
        objResponse = objRequest.GetResponse();
        using (StreamReader sr =
            new StreamReader(objResponse.GetResponseStream()) )
        {
            result = sr.ReadToEnd();
            // Close and clean up the StreamReader
            sr.Close();
        }
        return result;
    }
</script>
<html>
<body>
<b>This content is being populated from a separate HTTP request to
<a href="http://aspadvice.com/blogs/ssmith/">http://aspadvice.com/blogs/ssmith/</a>:</b><hr/>
<asp:literal id="myPage" runat="server"/>
</body>
</html>
/stevesmith/articles/examples/vb/scrape.aspx
<%@ Import Namespace="System.Net" %>
<%@ Import Namespace="System.IO" %>
<script language="VB" runat="server">
    Sub Page_Load(Src As Object, E As EventArgs)
        myPage.Text = readHtmlPage("http://aspadvice.com/blogs/ssmith/")
    End Sub

    Function readHtmlPage(url As String) As String
        Dim objResponse As WebResponse
        Dim objRequest As WebRequest
        Dim result As String
        objRequest = System.Net.HttpWebRequest.Create(url)
        objResponse = objRequest.GetResponse()
        Dim sr As New StreamReader(objResponse.GetResponseStream())
        result = sr.ReadToEnd()
        'clean up StreamReader
        sr.Close()
        Return result
    End Function
</script>
<html>
<body>
<b>This content is being populated from a separate HTTP request to
<a href="http://aspadvice.com/blogs/ssmith/">http://aspadvice.com/blogs/ssmith/</a>:</b><hr/>
<asp:literal id="myPage" runat="server"/>
</body>
</html>
Edited by frangonve on 14-08-2010 08:17 |
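The same request-then-read pattern as `readHtmlPage` above is not specific to .NET. As a rough illustration, here is a Python sketch using the standard `urllib.request` module. The function name `read_html_page` is just a counterpart chosen for this example, and falling back to UTF-8 when the server names no charset is an assumption.

```python
from urllib.request import urlopen

def read_html_page(url):
    """Fetch a URL and return the response body as text."""
    with urlopen(url) as response:
        # Use the charset the server declares, if any (UTF-8 is an assumed fallback).
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")

# A data: URL keeps the example self-contained; an http:// URL is used the same way.
html = read_html_page("data:text/html,%3Cb%3Ehello%3C%2Fb%3E")
print(html)
```

The returned string is the raw page source, so the table data still has to be located and extracted from it afterwards.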
|
| Author |
RE: scraping HTML |
vince
Member
Posts: 100
Location:
Joined: 30.01.07 |
| Posted on 24-03-2010 14:50 |
|
|
F,
That is great.
You can create some re-usable tool....as a condensed outcome of your work.
That way, you can keep that custom work in the tool box.
When you need to do similar computing you can call-up that tool.
thanks
Vince. |
|
| Author |
RE: Scraping HTML or CSV to database |
SOE
Member
Posts: 60
Location:
Joined: 24.06.05 |
| Posted on 25-03-2010 11:20 |
|
|
|
OK, Frangonve and Vince, there is a part missing. So what happened? Found an alternate solution? |
|
| Author |
RE: Scraping HTML or CSV to database |
vince
Member
Posts: 100
Location:
Joined: 30.01.07 |
| Posted on 26-03-2010 08:30 |
|
|
Hello,
I'm not sure what the solution is, Soe.
The programming looks complicated to me.
Many people are capable of programming. The only
common-sense approach is to keep the amount of DATA and the initial
situation in proportion to what is a realistic outcome.
Complicated coding will perhaps only really work alone.
The emphasis on coding is not shared...
It is going to be too complicated...
In my opinion, of course.
Thank you
V. |
|
| Author |
RE: Scraping HTML or CSV to database |
frangonve
Member
Posts: 110
Location: Madrid (Spain) until I can live at the beach
Joined: 18.06.08 |
| Posted on 26-03-2010 11:32 |
|
|
Hello,
Now you can read my post about .NET scraping capabilities again...
It is 4 posts above.
There was an offending character in my post that made the whole post invisible...
Cheers
Francisco |
|
| Author |
RE: Scraping HTML or CSV to database |
vince
Member
Posts: 100
Location:
Joined: 30.01.07 |
| Posted on 27-03-2010 17:08 |
|
|
F, Soe,
It looks very complicated to me, Frangonve.
However, in some respects, your post is helpful to me because
it gives me a clear indication of how "EXTREMELY LOW LEVEL" my programming skills are.
In fact, I do not have any "language programming skills" whatsoever....
Bearing this in mind, my posts are possibly not much help....
Thanks for the link from Wikipedia. (Perhaps that was the offending character?....)
Please feel free to read my post(s) also.
You must hold onto your work if you are the author! The tools can be re-used, right?
Thank you
Vince.
|
|
| Author |
RE: Scraping HTML or CSV to database |
yw
Administrator
Posts: 670
Location: ***
Joined: 23.06.05 |
| Posted on 28-03-2010 17:46 |
|
|
The Document of a WebBrowser control has a property, All, which is a collection of all HTML elements. We may extract data by locating the right HTML elements. A simple sample project can be downloaded from http://www.limnor.com/index.html?Doc=studio_shareApps.html.
The zip file http://www.limnor.com/Studio/ShowHtmlElements.zip contains the project files and an HTML file copied from the URL frangonve gave earlier.
WebBrowser.Document.All is a Collection of HtmlElement. As with an array, you may create an ExecuteForEachItem action for it. When editing this action, a new Method Editor appears. A "value" icon is shown among the other component icons. The "value" is an HtmlElement fetched from the Collection. You may add actions that use the "value". The sample uses the properties of the "value" to form a display text to be added to the listbox. The text is formed as
value.TagName + " Name:" + value.Name + "Id:" + value.Id + " " + value.InnerText
When the ExecuteForEachItem action is executed, it executes all the actions you added to the Method Editor once for each element in the Collection. Each time, the "value" is a different element from the Collection.
In the sample, only one action, ListBox1.Add, is added to the Method Editor. If the Collection has 100 HtmlElements, then this ListBox1.Add will be executed 100 times, each time with a different HtmlElement.
To make sure the sample works, please download the latest version of Limnor Studio from http://www.limnor.com/Studio/LimnorStudioSetup.msi
Edited by yw on 28-03-2010 17:49 |
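For readers who think in code, the for-each-element loop described above can be sketched in Python with the standard `html.parser` module. This is only an analogy, not how Limnor Studio or the WebBrowser control work internally. The class `ElementLister`, the `VOID_TAGS` list and the sample markup are assumptions made for this illustration.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not stay "open".
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}

class ElementLister(HTMLParser):
    """Collect every element with its tag name, name, id and inner text."""

    def __init__(self):
        super().__init__()
        self.elements = []   # all elements, in document order
        self._open = []      # elements whose closing tag has not been seen

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        element = {
            "tag": tag,
            "name": attributes.get("name") or "",
            "id": attributes.get("id") or "",
            "text": "",
        }
        self.elements.append(element)
        if tag not in VOID_TAGS:
            self._open.append(element)

    def handle_endtag(self, tag):
        # Close the innermost open element with this tag, if any.
        for index in range(len(self._open) - 1, -1, -1):
            if self._open[index]["tag"] == tag:
                del self._open[index:]
                break

    def handle_data(self, data):
        # Inner text belongs to every element that is still open.
        for element in self._open:
            element["text"] += data

def describe(element):
    # Same layout as the display text used in the sample project.
    return (element["tag"] + " Name:" + element["name"]
            + "Id:" + element["id"] + " " + element["text"].strip())

lister = ElementLister()
lister.feed('<form id="f1"><input name="amount"><b id="x">Apply</b></form>')
lines = [describe(element) for element in lister.elements]
print("\n".join(lines))
```

The list comprehension at the end plays the role of the ExecuteForEachItem action: `describe` runs once per element, each time with a different "value".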
|
| Author |
RE: Scraping HTML or CSV to database |
yw
Administrator
Posts: 670
Location: ***
Joined: 23.06.05 |
| Posted on 07-04-2010 01:30 |
|
|
Another sample project, Save HTML Table to Database, has been uploaded to http://www.limnor.com/index.html?Doc=studio_shareApps.html.
It shows the tables from any HTML page. For the sample HTML page frangonve gave, it can save the table into a database.
A new version of Limnor Studio is needed to use the sample: http://www.limnor.com/Studio/LimnorStudioSetup.msi.
Edited by yw on 07-04-2010 01:34 |
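The general idea of saving an HTML table to a database can also be sketched outside Limnor Studio. The Python example below is not the sample project's implementation. It assumes well-formed markup in which every row and cell has a closing tag and the first row holds the column names. The table name `scraped` and the sample data are made up.

```python
import sqlite3
from html.parser import HTMLParser

class TableRows(HTMLParser):
    """Collect the cell text of every <tr> as a list of strings."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

def save_table(conn, html, table="scraped"):
    """Store the first row as column names and the rest as records."""
    parser = TableRows()
    parser.feed(html)
    header, records = parser.rows[0], parser.rows[1:]
    columns = ", ".join('"{}"'.format(name) for name in header)
    marks = ", ".join("?" for _ in header)
    conn.execute('CREATE TABLE "{}" ({})'.format(table, columns))
    conn.executemany('INSERT INTO "{}" VALUES ({})'.format(table, marks), records)
    conn.commit()
    return len(records)

# Made-up table; the real page's markup and columns may differ.
page = ("<table><tr><th>Date</th><th>Rate</th></tr>"
        "<tr><td>2010-04-06</td><td>1.3400</td></tr>"
        "<tr><td>2010-04-07</td><td>1.3350</td></tr></table>")
conn = sqlite3.connect(":memory:")
saved = save_table(conn, page)
print(saved, conn.execute('SELECT * FROM scraped ORDER BY "Date"').fetchall())
```

Real pages often contain several tables, some used only for layout, so a practical version would first have to pick out the right one.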
|
| Author |
RE: Scraping HTML or CSV to database |
frangonve
Member
Posts: 110
Location: Madrid (Spain) until I can live at the beach
Joined: 18.06.08 |
| Posted on 09-04-2010 15:27 |
|
|
|
Thanks a lot for your help... |
|
| Author |
RE: Scraping HTML or CSV to database |
frangonve
Member
Posts: 110
Location: Madrid (Spain) until I can live at the beach
Joined: 18.06.08 |
| Posted on 07-08-2010 15:27 |
|
|
Hello,
Now that I have a few days off, I'm trying to download GetHtmlTableDataPrj.zip
But I get an error message.
Cheers
Francisco
|
|
| Author |
RE: Scraping HTML or CSV to database |
yw
Administrator
Posts: 670
Location: ***
Joined: 23.06.05 |
| Posted on 07-08-2010 23:31 |
|
|
|
Sorry, the URL was not right. The file is at http://www.limnor.com/Studio/GetHtmlTableData.zip |
|
| Author |
RE: Scraping HTML or CSV to database |
frangonve
Member
Posts: 110
Location: Madrid (Spain) until I can live at the beach
Joined: 18.06.08 |
| Posted on 08-08-2010 01:49 |
|
|
|
Thanks a lot... I could download it without problems. |
|
| Author |
RE: Scraping HTML or CSV to database |
frangonve
Member
Posts: 110
Location: Madrid (Spain) until I can live at the beach
Joined: 18.06.08 |
| Posted on 08-08-2010 01:54 |
|
|
After opening the project, I think it is too complex for me at the moment. I'll try something easier to learn about
LimnorStudio <=> WebPage interaction, which I'll describe in a new thread.
Cheers
|
|