Excel + VBA + IE = web automation

Learn to write Excel VBA code to automate your web browser.

Excel = Microsoft Excel
VBA = Visual Basic for Applications, the macro programming language within Excel
IE = Internet Explorer

Excel VBA Internet Explorer interaction

Most browser automation tools are complex, programmed by software engineers. But because XL and IE –both Microsoft products– play so well together, normal folks like you and I can build some pretty cool things too.

Web browsing & interaction, sending & replying to emails, social media interaction, blogging/posting/content editing, product searching/comparing/price-checking/buying, all kinds of stuff … you can build a robot to do it for you, and do it faster, and around the clock.

TL;DR »

(Web) bots aren’t just for internet marketers — anyone with data-intensive tasks and workflow can benefit from learning how to create what might be called “super macros” in Excel’s VBA. In these pages I’ll share some tools and methods I’ve used to automate common Windows and web tasks, with extra emphasis on website interaction.

I’m not a programmer by trade. Just a short while back, I knew nothing about it. But with a few VBA beginner courses, help from quick-answer sites like stackoverflow.com, and a willingness to learn, anyone can code. Now VBA is a fun hobby and I’ve helped lots of people automate daily tasks and turn small projects into big ones with bots over the past 6 years.

VBA (Visual Basic for Applications) exists within Word, Access, Outlook, Publisher, and other Microsoft products, too. VBA is almost identical to Visual Basic, a common programming language for creating .exe Windows programs. But we love Excel for its myriad business uses… parsing, sorting, presenting, and storing data… so I find myself programming within Excel/VBA by default.

One thing I’ve learned is there are many ways to say the same thing in code. Like any language, you sometimes trade efficiency for clarity. If you’re advanced and see a better method for anything on this site (more efficient while remaining clear for beginners), feel free to share in the comments.

I’ve used other automation tools besides VBA. Ubot and iMacros are both excellent, and powerful programs (their own programming languages, really). In some respects they’re easier, and for 99% of web automation tasksg, you really can’t go wrong with either. But I got to where I only used VBA because my programming was getting into Windows API’s and command line calls (Visual Basic is tightly integrated with Windows), plus I often found myself using Excel alongside these programs anyway. I discovered there’s almost nothing VBA can’t do with automating Windows and Internet Explorer (even making IE appear as a different browser), and it seemed to me investing time learning Microsoft’s Visual Basic programming language just made more sense.

Let’s start by building a basic web bot to open a search page, submit a search request, evaluate and scrape the results page, and return the results to Excel.

  1. Excel Basics – but first, if you’re new to Excel you should start here.
  2. VBA Basics – it would also help to know some VBA fundamentals (eg. what’s a variable, what’s a subroutine, for-next loops, the VBA Editor, etc) — these courses are good.
  3. HTML and the DOM – and for creating web-bots, you should understand basic HTML structure – how a webpage is laid out – the Document Object Model.

Ok, you’re grounded in the basics. Now let’s jump into building a simple bot to search for a keyword and return the results…

A beginner web automation project: keyword search bot

Let’s create a web bot that goes to my new favorite search engine, enters a keyword, submits the query, then scans the results and prints them to an Excel sheet, highlighting any links from yellowpages.com.

Begin by opening Excel and start a new, blank workbook.

Enter a keyword – let’s say auto parts – in cell A2, and a location – let’s say Houston TX – in cell C1.

Excel workbook with Developer tab selected
Excel workbook with Developer tab selected

Click the Developer tab (not showing?), and click Visual Basic.

Click FileOptionsCustomize RibbonDeveloperOK

Then click ToolsReferences, and add the reference libraries Microsoft HTML Object Library and Microsoft Internet Controls

VBA insert reference library - 1
VBA – default reference libraries
VBA insert reference library - 2
VBA – add these web-related reference libraries

Next, in the Visual Basic Editor (VBE), click ViewProject Explorer.

Then, right-click VBAProject, and Insert › a new Module

VBA insert new module
VBA – inserting a new module
 

Now copy & paste the code below into Module1

'start a new subroutine called SearchBot
Sub SearchBot()
 
    'dimension (declare or set aside memory for) our variables
    Dim objIE As InternetExplorer 'special object variable representing the IE browser
    Dim aEle As HTMLLinkElement 'special object variable for an <a> (link) element
    Dim y As Integer 'integer variable we'll use as a counter
    Dim result as String 'string variable that will hold our result link
 
    'initiating a new instance of Internet Explorer and asigning it to objIE
    Set objIE = New InternetExplorer
 
    'make IE browser visible (False would allow IE to run in the background)
    objIE.Visible = True 
 
    'navigate IE to this web page (a pretty neat search engine really) 
    objIE.navigate "https://duckduckgo.com" 
 
    'wait here a few seconds while the browser is busy
    Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop 
 
    'in the search box put cell "A2" value, the word "in" and cell "C1" value 
    objIE.document.getElementById("search_form_input_homepage").Value = _ 
      Sheets("Sheet1").Range("A2").Value & " in " & Sheets("Sheet1").Range("C1").Value
 
    'click the 'go' button 
    objIE.document.getElementById("search_button_homepage").Click
 
    'wait again for the browser 
    Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop 
 
    'the first search result will go in row 2 
    y = 2 
 
    'for each <a> element in the collection of objects with class of 'result__a'... 
    For Each aEle In objIE.document.getElementsByClassName("result__a") 
 
        '...get the href link and print it to the sheet in col C, row y 
        result = aEle
        Sheets("Sheet1").Range("C" & y).Value = result
 
        '...get the text within the element and print it to the sheet in col D
        Sheets("Sheet1").Range("D" & y).Value = aEle.innerText
        Debug.Print aEle.innerText
 
        'is it a yellowpages link?
        If InStr(result, "yellowpages.com") > 0 Or InStr(result, "yp.com") > 0 Then
            'make the result red
            Sheets("Sheet1").Range("C" & y).Interior.ColorIndex = 3
            'place a 1 to the left
            Sheets("Sheet1").Range("B" & y).Value = 1
        End If
 
        'increment our row counter, so the next result goes below 
        y = y + 1 
 
    'repeat times the # of ele's we have in the collection 
    Next
 
    'add up the yellowpages listings
    Sheets("Sheet1").Range("B1").Value = _
      Application.WorksheetFunction.Sum(Sheets("Sheet1").Range("B2:B100"))
 
    'close the browser
    objIE.Quit
 
'exit our SearchBot subroutine
End Sub
 

Your module should look like the image below. To run this subroutine you’d press the green play button or just press F5. But first, let’s have a closer look. Click the first line, Sub SearchBot(). Then press F8 to begin stepping through the code line-by-line (also called debugging).

VBA code module in VB Editor
VBA code module in VB Editor

Each time you press F8 the line of code highlighted in yellow will execute, and then the next line down of code will turn yellow and wait for your key press. Say you change the code in a line and want to re-run it… you’d move the yellow line back up by dragging the yellow arrow with your mouse, or by clicking the line you want and then pressing CTRL+F9

Notice that after executing the line objIE.Visible = True, Internet Explorer opens? Now position all 3 applications (Excel, VBE, and IE) on the screen so that each is visible, and continue stepping through the code using F8.

What’s happening here…

Let’s look at the code line by line. Watch the video too if you want.

'start a new subroutine called SearchBot

The first thing you’ll notice are lots of comments I added to the code. VBA ignores lines beginning with a single quote ('), highlighting them green, as comments. Adding your own personal comments to your code is an especially good idea for beginners, or for more complex code that requires a memory-jogger when you go back to it.

Sub SearchBot()

This tells VBA we are starting a new subroutine, or Procedure. A module can contain subroutines (SubEnd Sub) and functions (FunctionEnd Function). In the video, we start the subroutine by putting the cursor within the code and pressing F5 (run) or F8 (execute line-by-line). But usually a Sub or Function is called by another Sub or Function, or by a launch button. The name of a Sub or Function can be almost anything you want. CamelCase naming convention, with first letters capitalized, is typical for naming subroutines and functions.

Dim objIE as InternetExplorer

Think of variables (like objIE) as floating references or containers that refer to objects or numbers we want to manipulate. But before we get to use them, we have to declare, or Dim, them, which sets aside some memory for whatever kind of use we have in mind for them. To manipulate Internet Explorer as an object in VBA, we need to Dim the browser — either generically as an Object variable (an example of late binding objects), or more specifically as an InternetExplorer special object variable like we did here (an example of early binding objects).

VBA Intellisense
VBA Intellisense code helper
Right now don’t worry about early vs. late binding… there are pros and cons for each method… but one advantage of early binding is VBA knows exactly what you intend for the variable, and therefore can show you its Intellisense (helper hints) popup menu when typing code after a specified object variable, which is great for helping beginners snoop around and see different things the object could possibly do. Why call it objIE? No special reason, aside from making it easy to remember what it’s for. Names of variables can be almost anything you want, and camelCase with first letter lowercase is the usual naming convention.

Dim aEle As HTMLLinkElement

Here we reserve aEle as a different type of object variable. Specifically, we early bind it for use as a webpage link (the <a> tags on a webpage). The other way would be to declare it more generically as Dim aEle as Object then later put it to use in one or more specific roles as an object — either way works. Once again, I could choose to call it anything, even mySuperHrefThingy, but I like the simplicity of aEle because it’s short and makes me think <a> element.

Dim y as Integer

Let’s use the variable y to refer to the row number on the Excel sheet where we’ll have the bot put our search results. Integer type variables in VBA are round numbers between -32,768 and 32,768. If we needed a variable to hold a value greater than 32,768 we’d Dim y as Long instead. Or if we needed it to have decimals we’d use Dim y as Double.

Dim result as String

For each aEle we find, we’ll copy its href value over to a string (alphanumeric text) variable called result, then print the string to the Excel sheet. Ok, I admit we don’t really need this –we could just print aEle’s value to the sheet directly– but I added it to introduce you to string variables.

Set objIE = New InternetExplorer

Setting an object instantiates or activates a new instance of its object type. So we’re saying let’s start a new instance of the IE browser.

objIE.Visible = True

Awesome – your bot just opened your IE browser! Properties (characteristics, such as making IE Visible) and Methods (actions) are invoked after typing a dot (.) after an object, like you see here. In your VBE, delete this line and start retyping it. You’ll notice after typing objIE. that the Intellisense helper pops up, listing all the properties and methods available for the InternetExplorer object. Go ahead and scroll through the list.

objIE.navigate "https://duckduckgo.com"

The .navigate event tells IE to navigate to whatever webpage is within the quotes. Yay automation time!

Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop

But the code wants to run all at once, in a millisecond. By the time IE actually loads the page of search results, the entire subroutine is long over. We have to slow things down and wait for the browser while its navigating. We could use something like Application.Wait(Now + TimeValue("00:00:05")) to wait 5 seconds, but if the page loads faster we’re wasting time waiting for nothing, and if the page takes longer we’re in the same pickle as before. Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop is an elegant Do...Loop solution consisting of 3 VBA statements we’ve tidied up on one line, using colons (:) to connect them. It’s saying: while IE is busy navigating, just sit here and run in circles. objIE.Busy = True and objIE.readyState <> 4 (<> means not equal) are essentially the same thing –one could probably be ommitted– but just to make sure the page is really done loading we make both of these conditions requirements for getting out of the loop, by using the logical operator Or. The DoEvents part is technically not needed but we add it in the loop to free up processing resources and allow other things to happen within Excel during this time, asynchronous to the running loop. For the over-achievers out there, here are some other ways we could have written this line of code to do the same thing.

Do While objIE.Busy <> False Or objIE.readyState <> 4: DoEvents: Loop
Do Until objIE.Busy = False And objIE.readyState = 4: DoEvents: Loop
Do: DoEvents: Loop While objIE.Busy = True Or objIE.readyState <> 4
Do: DoEvents: Loop Until objIE.Busy = False And objIE.readyState = 4:
While objIE.Busy Or objIE.readyState <> 4: DoEvents: Wend

objIE.document.getElementById("search_form_input_homepage").Value = _
  Sheets("Sheet1").Range("A2").Value & " in " & Sheets("Sheet1").Range("C1").Value

This VBA statement says: Find the search box element on the webpage by its ID, which is search_form_input_homepage, and make its text value what we have in cell A2, plus the string " in ", plus what we have in cell C1. The underscore (_) just lets us put longer pieces of code on multiple lines for better readability. How did I know what the search input box’s ID was, or that it even had an ID? Firebug of course! If you don’t have Firefox get it, and then install the Firebug add-on. Then right-click the search box, click Inspect Element with Firebug, and behold…

Firefox Firebug Inspector
Firefox Firebug Inspector

Firebug Console - highlighted code
Firebug Console – highlighted code

…the line of code is highlighted in the Firebug console (I added the red box around the ID). The image above reminds us a webpage is organized in a nested, tree-like structure, starting with .document as the root object and branching out from there into smaller objects. This is called the Document Object Model. .getElementById("__") is the go-to tool for grabbing one of those objects from the webpage. With it you can scrape data from a page, as well as writing to a webpage as we’re doing here. Here’s a sneak peak at some other ways we could have found the search box and entered data into it (shown in the order I would have tried them, if an ID was not available).

Locate element by its name
objIE.document.all.Item("q").Value = ...

Firebug - element name highlighted
Firebug – element name highlighted
Locate element by its class name
For Each ele In objIE.document.getElementsByClassName("js-search-input")
ele.Value = ...
Exit For
Next

Or
objIE.document.getElementsByClassName("js-search-input")(0).Value = ...

Firebug - element class name highlighted
Firebug – element class name highlighted
Locate an element by its tag name
For Each ele In objIE.document.getElementsByTagName("input")
ele.Value = ...
Exit For
Next

Or
objIE.document.getElementsByTagName("input")(0).Value = ...

Firebug - element tag name highlighted
Firebug – element tag name highlighted

objIE.document.getElementById("search_button_homepage").Click

I used Firebug to find the ID of the search button (magnifying glass) on the page, then used .getElementById again to grab it, and the .Click method to click it. Alternatively, we could have submitted the search box form like this instead objIE.document.getElementById("search_form_homepage").Submit. Actually, there’s a simpler method still that would allow us to skip the previous 3 steps altogether. Can you see it? Hint: take a look at the url we ended up with… https://duckduckgo.com/?q=auto+parts+in+Houston+TX. Got it? If you still don’t see it, and you don’t think you’ll be able to sleep tonight, email me 🙂

Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop

We need to wait for the browser for a second (an eternity in CPU land) while the search is running.

y = 2

Now we have a page full of search results. We’ll put them in column C, but we need to start in row 2 since C1 contains our heading Houston TX.

For Each aEle In objIE.document.getElementsByClassName("result__a")

This is the opening statement in our action-packed For...Next loop, where the procedure seeks out each aEle (link) object that’s found in the collection of all objects on the page that have a class of result__a (again I used Firebug to locate this particular class name to target). For each match found, we temporarily assign it to the aEle variable and do the following with it…

result = aEle
Sheets("Sheet1").Range("C" & y).Value = result

We take the found <a> element (aEle) and assign its value to a string variable, result. Ok, we really didn’t need to do this… we could just work with aEle directly without passing it around… I just wanted to demonstrate what a string variable is! Then we print the string to the sheet in column C, row y.

Sheets("Sheet1").Range("D" & y).Value = aEle.innerText

Let’s say we also grab the anchor text for each link and plop that into the cell to the right. The .innerText property (or .textContent) gets us that text. Side-note: I’ve built many bots with the code If InStr(objIE.document.body.innerText, "some text") > 0 then ... where I needed to know if the web page had some text in it somewhere.

Debug.Print aEle.innerText

We’ll also print this text in the debugging console, the Immediate Window, as well (note: you may have to press CTRL+G to show the Immediate Window, or open it from the View menu).

If InStr(result, "yellowpages.com") > 0 Or InStr(result, "yp.com") > 0 Then
Sheets("Sheet1").Range("C" & y).Interior.ColorIndex = 3
Sheets("Sheet1").Range("B" & y).Value = 1
End If

Let’s say we want to single out Yellow Pages links. InStr looks for one string inside another, using the format InStr(haystack, needle) and returns the numerical starting point of the first occurence of needle, or 0 if not found. If ‘yp’ or ‘yellowpages’ is found in a link we make the interior of the cell red with the .ColorIndex VBA property, and place a ‘1’ in the cell to the left. FYI, short and simple If...Then statements, like this one: If x = y Then Debug.print "Yo, x is the same as y, bro!", can be written on one line with the closing End If ommitted. If...Then statements are the backbone of any great A.I. … Watson, Wall-E, and now SearchBot()!

y = y + 1

Now we increment y by 1, so each new result found will be printed on the next row down.

Next

Continue looping until we run out of aEle‘s to look at.

Sheets("Sheet1").Range("B1").Value = _
  Application.WorksheetFunction.Sum(Sheets("Sheet1").Range("B2:B100"))

Since we put a 1 in row B each time we found a Yellowpages link, we can now just add them up, making cell B1 display the total. Alternatively, we could have used another For...Next loop to look at each cell in column B one at a time, adding up all the 1’s. But using Excel’s powerful built-in WorksheetFunction‘s whenever possible is better.

objIE.Quit
End Sub

So we opened IE, navigated to DuckDuckGo, performed a search, parsed the results on the web page, printed them to Excel, and even did some further analysis on the retrieved data. All that’s left is to close (.Quit) the browser object and end (End Sub) the procedure.

Finally, let’s save the file. We need to save the file with the extension .xlsm, since this is most definitely a macro-enabled workbook.

Excel - Save-as .xlsm
Excel – Save-as .xlsm

Congratulations! You built your first bot, and in the process learned a good deal about the most common VBA commands used to interact with and automate the web.

To download this Excel VBA project: xl-searchbot.xlsm (keyword search & scrape bot)

Recommended next: Web scraping with getElementsByTagName()

Facebooktwittergoogle_plusredditlinkedintumblrFacebooktwittergoogle_plusredditlinkedintumblr