We already looked at
getElementsByClassName as VBA methods for hooking onto web page elements.
But when elements (HTML tags) on a page don’t have an ID or class, another common approach for accessing them is using
creates an array (group) of all
div elements on a page.
displays number of
p (paragraph) elements on a page in the console.
displays inner HTML of 1st
p element on a page.
displays text content of 5th
p element on a page.
For Each ele In objIE.document.getElementsByTagName("input")
If ele.Value = "Sign In" then ele.Click: Exit For
input element that has a value equal to ‘Sign In’
For Each ele In objIE.document.getElementsByTagName("a")
If InStr(ele.href, "wp-admin") > 0 then objIE.navigate ele.href: Exit For
gets the link element containing ‘wp-admin’ and navigates to it.
Let’s practice. Let’s say every morning we need to create a report which requires going to a website, copying a bunch of data, and pasting it into Excel, and we’re wanting to automate the task…
I’ll just use a list of last names as example data. Doing this manually would look like…
Paste into Excel…
Not too hard, but many of our ‘Excel + web data’ tasks are more involved. And if they are daily, or otherwise repititive, you could save hours per year by automating them. The Sub below performs our task within seconds … and the code is eternally reusable. Assuming this is a 5-minute task, daily, building an automation super macro would save 30 hours a year!
Paste the code below into a VBA code module (how to open a new VBA module), or download below.
Sub GrabLastNames() 'dimension (set aside memory for) our variables Dim objIE As InternetExplorer Dim ele As Object Dim y As Integer 'start a new browser instance Set objIE = New InternetExplorer 'make browser visible objIE.Visible = True 'navigate to page with needed data objIE.navigate "http://names.mongabay.com/most_common_surnames.htm" 'wait for page to load Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop 'we will output data to excel, starting on row 1 y = 1 'look at all the 'tr' elements in the 'table' with id 'myTable', 'and evaluate each, one at a time, using 'ele' variable For Each ele In objIE.document.getElementById("myTable").getElementsByTagName("tr") 'show the text content of 'tr' element being looked at Debug.Print ele.textContent 'each 'tr' (table row) element contains 4 children ('td') elements 'put text of 1st 'td' in col A Sheets("Sheet1").Range("A" & y).Value = ele.Children(0).textContent 'put text of 2nd 'td' in col B Sheets("Sheet1").Range("B" & y).Value = ele.Children(1).textContent 'put text of 3rd 'td' in col C Sheets("Sheet1").Range("C" & y).Value = ele.Children(2).textContent 'put text of 4th 'td' in col D Sheets("Sheet1").Range("D" & y).Value = ele.Children(3).textContent 'increment row counter by 1 y = y + 1 'repeat until last ele has been evaluated Next 'save the Excel workbook ActiveWorkbook.Save End Sub
getElementsByTagName (especially when
getElementsByClassName are not an option) you can scrape data from any web page at the push of a button.
What tasks can you automate?
To download this Excel VBA project: getelementsbytagname.xlsm
Recommended next: reddit Q&A – How to extract data from HTML into Excel