Web scraping is the process by which an application extracts data or content from a web page. You can do this from your Excel worksheet using VBA. In an earlier article, I explained how to extract data from HTML elements using a macro. In this post, I’ll show you how to fill a web form from your Excel worksheet using VBA.
From your Excel worksheet, you can extract content (data) from any website or web page, for example stock quotes.
The content of a web page is embedded inside HTML elements. That is, the content or data is either written between HTML tags like <p>, <div> etc. or typed into textboxes. Two attributes of an HTML element matter most here: the id and the name. The id, in particular, makes an HTML element unique on the page, so ids are often used to extract data from elements.
Besides extracting data, you can also pass data to a web page dynamically from Excel, for example to fill a form.
Here’s a sample form, a contact form that I have designed especially for my macro to work with. The form has a few textboxes (or input boxes) and a button which, when clicked, saves the form data in a text (.txt) file.
The macro that I am sharing will fill the form and will automatically click the Save Button (on the sample web page).
Note: I am assuming you have Internet Explorer (any version, such as 9, 10 or 11) installed on your computer. Windows 10 also comes with IE installed.
The macro opens the web page in Microsoft’s Internet Explorer and reads it through an HTMLDocument object, so we’ll first add a reference to the library that defines that object. From the top menu of your VBA editor, click Tools -> References…. In the References window, find and select Microsoft HTML Object Library and click OK.
Here’s the macro.
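What follows is a minimal sketch of such a macro, put together from the walkthrough below rather than a definitive listing. The URL and the ids txtName and bt come from this post. The Sub name FillContactForm and the loop that waits for the page to load are my assumptions. The other textboxes and the dropdown are left out because their ids are not given here.

```vba
Option Explicit

Const sSiteName = "https://www.encodedna.com/css-tutorials/form/contact-form.htm"

' FillContactForm is a hypothetical name chosen for this sketch.
Private Sub FillContactForm()
    Dim oIE As Object              ' Internet Explorer, created late bound
    Dim oHDoc As HTMLDocument      ' needs the Microsoft HTML Object Library reference

    Set oIE = CreateObject("InternetExplorer.Application")

    With oIE
        .Visible = True            ' keep the browser visible to watch the form fill
        .navigate sSiteName
    End With

    ' Assumption: wait until the page has finished loading (4 = READYSTATE_COMPLETE).
    Do While oIE.Busy Or oIE.readyState <> 4
        DoEvents
    Loop

    Set oHDoc = oIE.document

    With oHDoc
        .getElementById("txtName").Value = "Arun Banik"   ' first textbox
        ' ... assign the remaining fields the same way, using their own ids ...
        .getElementById("bt").Click                       ' click the Save button
    End With
End Sub
```

Without the wait loop, the macro may try to read elements before the page has loaded, in which case getElementById would likely return Nothing and the assignment would fail.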
How the Macro executes
At the beginning of the code, I have defined a Const that holds the URL of the web page.
Const sSiteName = "https://www.encodedna.com/css-tutorials/form/contact-form.htm"
Next, I have created two objects: oIE for Internet Explorer and oHDoc for the HTMLDocument object.
I have not added a reference to Internet Explorer itself. Instead, I have created the object oIE using the CreateObject() function, which lets me open the IE browser from my worksheet.
Set oIE = CreateObject("InternetExplorer.Application")
The browser is kept visible, so I can see the output (the filling of the form).
The HTMLDocument object gives me access to the HTML elements on the web page and their attributes. Now, see this line here … .getElementById("txtName").Value = "Arun Banik"
The .getElementById() method (it’s an HTML DOM method that JavaScript developers will be familiar with) is commonly used to get an HTML element by its id. The method takes the id as its parameter. Look at this again.
.getElementById("txtName").Value = "Arun Banik"
The id is txtName. It’s the id of the first textbox or input box on the web page, which we are filling. The method is followed by the property Value. That’s how I am assigning values to each textbox.
Note: If the elements have a name attribute, then you can use this …
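Since the point is to fill the form from the worksheet, the hard-coded string can be swapped for a cell value. Here is a small sketch, assuming the name sits in cell A2 of the active sheet and that oHDoc already points to the loaded page. The cell address and the Sub name FillNameFromSheet are hypothetical.

```vba
' Requires the Microsoft HTML Object Library reference for the HTMLDocument type.
' Assumes oHDoc is the document of a page that has finished loading.
Sub FillNameFromSheet(oHDoc As HTMLDocument)
    Dim sName As String

    ' A2 is an assumed cell; change it to wherever the name is stored.
    sName = CStr(ActiveSheet.Range("A2").Value)

    oHDoc.getElementById("txtName").Value = sName
End Sub
```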
oHDoc.getElementsByName
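Unlike .getElementById(), the .getElementsByName() method returns a collection, because several elements can share one name, so you pick an element by its index. Here is a minimal sketch, assuming the first textbox also carries name="txtName" (an assumption, since only its id is confirmed above). The Sub name FillByName is hypothetical.

```vba
' Requires the Microsoft HTML Object Library reference for the HTMLDocument type.
' Assumes oHDoc is the document of a page that has finished loading.
Sub FillByName(oHDoc As HTMLDocument)
    Dim oElems As Object

    ' "txtName" as a name attribute is an assumption for this sketch.
    Set oElems = oHDoc.getElementsByName("txtName")

    ' The collection is zero-based; fill the first match only if one exists.
    If oElems.Length > 0 Then
        oElems.Item(0).Value = "Arun Banik"
    End If
End Sub
```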
In addition, there is a dropdown list (with Country names) on the web page (if you have noticed) and using the same method .getElementById(), I’ll assign a value to it.
Finally, the program will click the button on the web page to save the data.
.getElementById("bt").Click
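Assigning a value to the dropdown might look like the sketch below. The id selCountry and the value India are placeholders of mine, since the real id and option values are not given in this post. Check them with the Inspect steps described further down. The Sub name SelectCountry is hypothetical too.

```vba
' Requires the Microsoft HTML Object Library reference for the HTMLDocument type.
' Assumes oHDoc is the document of a page that has finished loading.
Sub SelectCountry(oHDoc As HTMLDocument)
    ' "selCountry" is a hypothetical id and "India" a hypothetical option value.
    ' The string should match the value of one of the options in the list.
    oHDoc.getElementById("selCountry").Value = "India"
End Sub
```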
How to get the id of the Elements
Now, at this stage, you might be wondering, how I got the ids from that web page. It is simple. Just follow these steps.
1) Click this link to open the page, preferably in the Chrome browser.
2) Click inside the textbox to set focus on it, then right click it and choose the option Inspect (if you are using the Chrome browser).
3) It will open DevTools or the Developer Tools window, highlighting the <input> element. You will see id="txtName". I am using this id in my Macro.
4) Repeat the first three steps to get the ids of the other elements.
You can get the ids and names of elements from any website by following the above steps.
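If you would rather list the ids and names from VBA itself, a sketch along these lines prints them to the Immediate window (Ctrl+G in the VBA editor). It is only an illustration: the Sub name ListInputIds and the wait loop are my assumptions. It covers <input> elements only, so run the loop again with "select" to see the dropdown.

```vba
' Requires the Microsoft HTML Object Library reference for the HTMLDocument type.
' ListInputIds is a hypothetical name chosen for this sketch.
Sub ListInputIds()
    Const sUrl = "https://www.encodedna.com/css-tutorials/form/contact-form.htm"

    Dim oIE As Object
    Dim oHDoc As HTMLDocument
    Dim oEl As Object

    Set oIE = CreateObject("InternetExplorer.Application")
    oIE.Visible = True
    oIE.navigate sUrl

    ' Assumption: wait until the page has finished loading (4 = READYSTATE_COMPLETE).
    Do While oIE.Busy Or oIE.readyState <> 4
        DoEvents
    Loop

    Set oHDoc = oIE.document

    ' Print the tag, id and name of every <input> element on the page.
    For Each oEl In oHDoc.getElementsByTagName("input")
        Debug.Print oEl.tagName, oEl.ID, oEl.Name
    Next oEl
End Sub
```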
Well, that’s it. Let me know if you have any queries regarding Web Scraping.
Thanks for reading. ☺