How to auto-open the VBA Editor when Excel starts

To have the VBA Editor automatically open when Excel starts, add the code below to the ThisWorkbook module:

Private Sub Workbook_Open()
 
    Application.VBE.MainWindow.Visible = True
    Application.VBE.Windows("Immediate").Visible = True
    ThisWorkbook.VBProject.VBComponents("Module1").Activate
 
End Sub

Double-click ThisWorkbook and select Workbook in the left drop-down. The Open declaration should initiate and populate the first and last lines of a Workbook_Open procedure for you.

Then simply add the code above.

Add On-Workbook-Open Procedure in Excel
Add On-Workbook-Open Procedure in Excel

This is exceptionally handy when you’re working on a project for several days, or when you want the end user to be able to view the editor (for example, to watch the action happen in the Immediate Window while the bot is running).

Facebooktwittergoogle_plusredditlinkedintumblrFacebooktwittergoogle_plusredditlinkedintumblr

Common VBA Methods & Properties used in web automation

In the previous post we looked at the Intellisense contextual helper menu, which lists the myriad Methods and Properties available to the InternetExplorer control object, as well as HTML objects and other objects in VBA.

Did you feel overwhelmed by all those unknown properties and methods? I sure do! But the good news is we don’t have to learn all those. Out of those thousands of choices, we only use a small handful in most of our web-scraping and VBA web-automation tasks.

Often-used VBA HTML object Properties

Properties – HTML example 1

Consider the following snippet of HTML (from this site’s footer):

<div class="site-info">
    <span class="site-title"><a rel="home" href="http://automatetheweb.net/">Automate the Web</a></span>
    <a href="https://wordpress.org/">Proudly powered by WordPress</a>
</div>

We can have VBA snag the div element like this (ele is just a variable name I’m making up):
Set ele = objIE.document.getElementsByClassName("site-info")(0)

Read it like this: "for all elements on the web page that have a class of ‘site-info’ get me the zero-ith (first) one, and attach it to the object variable ele".

 

Remember, once VBA runs that line of code it finds a div element with the class name site-info… then it tucks that div into the variable ele for later use…

… so then we can look at some of that div’s properties in the debug console like this…

Debug.Print ele.ClassName would return:

site-info

 

Debug.Print ele.TagName

div

 

Debug.Print ele.Id

 

(nothing … this element has no id=”” attached to it)

 

Debug.Print ele.href

ERROR: Object does not support this method or property

(silly rabbit, div’s don’t have href’s!)

 

Debug.Print ele.innerText

Automate the Web
Proudly powered by WordPress

(this can also be written as Debug.Print ele.textContent)

 

Debug.Print ele.innerHTML

<span class="site-title"><a rel="home" href="http://automatetheweb.net/">Automate the Web</a></span>
<a href="https://wordpress.org/">Proudly powered by WordPress</a>

Debug.Print ele.outerHTML

<div class="site-info">
    <span class="site-title"><a rel="home" href="http://automatetheweb.net/">Automate the Web</a></span>
    <a href="https://wordpress.org/">Proudly powered by WordPress</a>
</div>

 

Properties – HTML example 2

Now let’s look at some Properties of a different element. In the upper-right of this page there’s a blue email SUBSCRIBE button. If we right-click it, then click Inspect, we see this HTML:

<input tabindex="923" type="submit" class="button btn btn-primary" name="submit" 
value="subscribe" id="newsletters-2-button">

So let’s tell VBA to scrape that input element using the following line of code (iEle is just a made up variable name) …
Set iEle = objIE.document.getElementId("newsletters-2-button")

Read like this: "for the lone element on the page that has an id of ‘newsletters-2-button’, attach it to the object variable iEle".

 

When VBA runs that line of code it discovers that particular id belongs to an input element, and it holds this element in memory in the variable iEle.

So executing Debug.Print iEle.Id would produce:

newsletters-2-button

 

Debug.Print iEle.Type

submit

 

Debug.Print iEle.Name

submit

 

Debug.Print iEle.Value

subscribe

.Value is an example of a property that we can set as well as get, meaning it’s not a read-only property. We could assign it a value (write to it) as well, using code like the following, which enters a string of text into the input box:
objIE.document.getElementId("newsletters-2-button").Value = "hi@automatetheweb.net"

 

Properties – HTML example 3

<div class="header-image">
    <a href="http://automatetheweb.net/" rel="home">
        <img src="http://automatetheweb.net/wp-content/uploads/login-logo.jpg" width="350" height="40" alt="Automate the Web">
    </a>
</div>

This time let’s just access some elements directly — we won’t use a variable…

Debug.Print objIE.document.getElementsByTagName("a")(7).href
(assumes it’s the 8th <a> tag on the page we’re after… remember 1 is 0 in variable land, 2 is 1, 3 is 2, and so on)

http://automatetheweb.net/
 

Debug.Print objIE.document.getElementsByTagName("img")(3).src
(assumes it’s the 4th <img> tag on the page we’re after)

http://automatetheweb.net/wp-content/uploads/login-logo.jpg

 

Often-used VBA HTML object Methods

In addition to those amazing and exciting object Properties (think attributes, or nouns), VBA objects can have Methods (think actions, or verbs), too:

We use Methods like these to interact withn HTML objects:

objIE.document.getElementsByTagName("button")(0).click
clicks the first button element on the page.

objIE.document.getElementsByTagName("form")(0).submit
submits the first form on the page.

And we use Methods with the InternetExplorer object like these:

Assume we have
Dim objIE As InternetExplorer
Set objIE = New InternetExplorer

Then…

objIE.navigate "http://google.com"
navigates the browswer

objIE.stop
manually stops browser navigation.

objIE.refresh
performs a browser refresh.

objIE.quit
closes the IE browser.

But the real web-scraping fun happens when using the following powerhouse Methods with InternetExplorer to hunt down (sometimes-illusive) web page elements. These VBA/IE Methods are powerful web-scraping allies that don’t just walk the DOM, they stomp all over it! …

getElementById   video with example use
getElementsByName
getElementsByTagName   video with example use
getElementsByClassName   video with example use

And to work alongside those go-getters, we employ these sophisticated, relationship-bound DOM-walkers:

Children
firstElementChild
nextElementChild
lastElementChild
nextElementSibling
lastElementSibling
previousElementSibling
parentElement

These family-oriented methods rely on the predictable, cohesive and hierarchical structure of the DOM to pinpoint and hook onto any element(s) on a web page.

 

Chaining multiple Properties and Methods to walk the DOM

So what is the DOM? Think of the Document Object Model as a complete and perfect snapshot of the entire loaded web page, stored in VBA as a tree-like structure. A page’s DOM is completely known and understood by VBA once a page has been successfully loaded. Study the image below, replacing the word jQuery with VBA

VBA and the Document Object Model of a web page
VBA and the Document Object Model of a web page
image source: this awesome site
 

In order to scrape web page data, we just need to direct VBA how and where within the DOM to access it. This is sometimes easier said then done, and questions with "how do I get to this data…" are the most common ones I hear. But accessing and interacting with a specific element or elements on a web page with VBA is always possible, and becomes increasingly easy to do with practice and learning to read the DOM (see video links above).

The most common technique when using VBA to find and interact with web page elements is using a method I call chaining (well, I probably read it somewhere) where we employ a combination of one or more of the afformentioned getters with one or more of the family-oriented methods.

Chaining VBA Properties and Methods – example 1

Considering the image above, let’s figure out how to get the text within the li element with the class name "fish". It’s important to note there are many ways to get there, just as there are different ways to climb a tree and still get to the top. Some of these ways are more efficient than others.

Here’s the most direct way that I see: objIE.document.getElementsByClassName("fish")(0).innerText.

And that route would work even if all those bottom-row li‘s had the class "fish", since (0) represents the first one.

But what if that li had no class? Then we could use: objIE.document.getElementsByClassName("menu_list")(0).Children(0).innerText

Here you can see we’ve used dots (.) to chain together the innerText property of the Children method of the getElementsByClassName method of the document method of the InternetExplorer object.

Chaining VBA Properties and Methods – example 2

Let’s consider another example, using a previous code sample, of chaining/connecting together multiple InternetExplorer and HTML methods and properties.

<div id="hdimg" class="header-image">
    <a href="http://automatetheweb.net/" rel="home">
        <img src="http://automatetheweb.net/wp-content/uploads/login-logo.jpg" width="350" height="40" alt="Automate the Web">
        <div><span style="font-size: 90%;">a div of text I added, just for giggles</span></div>
    </a>
</div>

Let’s assume
Dim objIE As InternetExplorer
Dim ele as Object
Set objIE = New InternetExplorer

Now let’s come up with a variety of ways to scrape out the line of text with the word "giggles" and print it in the debug console. Each of the following VBA examples prints "a div of text I added, just for giggles".

  • Debug.Print objIE.document.getElementById("hdimg").innerText
  • Debug.Print objIE.document.getElementsByClassName("header-image")(0).innerText
  • Debug.Print objIE.document.getElementsByClassName("header-image")(0).getElementsByTagName("span")(0).innerText
  • Debug.Print objIE.document.getElementsByClassName("header-image")(0).getElementsByTagName("img")(0).nextElementSibling.innerText
  • Debug.Print objIE.document.getElementsByClassName("header-image")(0).getElementsByTagName("img")(0).nextElementSibling.Children(0).innerText
  • Debug.Print objIE.document.getElementsByClassName("header-image")(0).getElementsByTagName("img")(0).nextElementSibling.getElementsByTagName("span").innerText
  • For each ele in objIE.document.getElementsByTagName("span")
       If instr(ele.style, "ize: 9") > 0 then
          Debug.print ele.innerText: Exit For
       End If
    Next

While all of these VBA code examples accomplish the same thing, clearly the first is the most obvious and practical, while the remainder are merely illustrative and increasingly obscure (even a little ridiculous).

Conclusion

The InternetExplorer and HTML Objects reference libraries include hundreds of Properties and Methods, but we really only use a few of them in most of our VBA web-automation projects.

Common Properies include:
innerText, innerHTML, outerHTML, value, href, src, name, id, className, tagName, and a small handful of others.

Common Methods include:
click, submit, navigate, stop, refresh, quit, getElementById, getElementsByName, getElementsByTagName, getElementsByClassName, Children, firstElementChild, nextElementChild, lastElementChild, nextElementSibling, lastElementSibling, previousElementSibling, parentElement, and a small handful of others.

Chaining together InternetExplorer and HTML Properties and Methods lets us walk anywhere along the DOM, interacting with a web page and its elements, allowing us to scrape data from and automate the web.

 

Facebooktwittergoogle_plusredditlinkedintumblrFacebooktwittergoogle_plusredditlinkedintumblr

VBA Intellisense and Early-binding vs Late-binding

Generally speaking, you want to use early binding, rather than late binding.

Early binding an object variable tells VBA specifically what we’ll be doing with it.
Late binding is generically declaring the variable an Object and VBA doesn’t know how we intend to use it.

 
VBA Intellisense helper menu
VBA Intellisense helper menu

Example of Early binding:
   Dim ie As InternetExplorer
   Set ie = New InternetExplorer

Advantages:
1) you get to use Intellisense
2) code runs faster
3) full access to object model
4) better error-checking

 

Example of Late binding:
   Dim ie As Object
   Set ie = CreateObject("InternetExplorer.Application")

Advantages:
1) can use variables for multiple purposes
2) better backwards compatability with previous VBA versions

 

I almost always perfer early-binding, having a clear advantage over late-binding. One exception, however, is that I’ll often declare a reuseable element object generically (Dim ele as Object) so that I can use it for multiple purposes. Like the following example where I use ele to capture an img element and then a div element:

Dim ele as Object unspecified Object variable, not bound to any particular use
Set ele = ie.document.getElementsByTagName("img")(3)
Debug.Print ele.src I just printed the source of an img!
For each ele in ie.document.getElementsByClassName("ytube")
    Debug.print ele.textContent I just printed the text of a div!
Next
 

But aside from mere convenience, there’s really no good reason to do it this way. Hence my advice to pretty much always use early binding.

 
Facebooktwittergoogle_plusredditlinkedintumblrFacebooktwittergoogle_plusredditlinkedintumblr

Excel – Enable macros or ActiveX controls by default

If you get a "Security Warning – Macros have been disabled" message in Excel, like the following…

Excel - 'Security Warning macros have been disabled'
Excel – ‘Security Warning macros have been disabled’

… just click Enable Content and Excel will remember your choice, and the next time you open this file you won’t get the warning.

However, if you rename the file, or if perhaps you have VBA save copies of the file with modified file names, you’ll see the warning again for each new file-name. This could be problematic if you later need VBA to dynamically open these files for editing (the code would get stopped in its tracks!).

In this case you would just need to set Excel to trust all macros for all files, like so…

File > Options > Trust Center > Trust Center Settings > Macro Settings > click Enable all macros

Excel Trust Center - Enable macros by default
Excel Trust Center – Enable macros by default

You would only tick Trust access to the VBA project object model if you’re creating sophisticated VBA code specifically designed to modify the code within your project’s Subroutines and Functions.

Occasionally you’ll come across (or write code for) an Excel workbook where ActiveX objects are used — usually form controls. Similar to above, if you need a global solution, check Enable all controls without restrictions and without prompting.

Excel Trust Center - Enable ActiveX controls by default
Excel Trust Center – Enable ActiveX controls by default
Facebooktwittergoogle_plusredditlinkedintumblrFacebooktwittergoogle_plusredditlinkedintumblr