Regular expressions can seem like a foreign language, but once you know how they work you won’t know how you ever used Google Analytics without them. This post will help you understand how each regular expression character functions and includes examples of how they can be used for your account.
What it does: A backslash (\) turns the character following it into plain text.
How it works:
Say you want to create a goal for the url /thankyou?id=123. In Regex, “?” has another meaning, which we’ll get to in a little bit, but we need it to be plain text since it is part of the url string. To do that, we place a backslash before the ? to tell Analytics to treat it as plain text.
/thankyou\?id=123
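Here is a minimal sketch of the difference in Python, assuming its re.search behaves like the partial matching Analytics does for a simple pattern like this one. The second URL is made up for illustration.

```python
import re

# Hypothetical page paths, used only for illustration.
urls = ["/thankyou?id=123", "/thankyouid=123"]

escaped = re.compile(r"/thankyou\?id=123")   # "?" is treated as plain text
unescaped = re.compile(r"/thankyou?id=123")  # "?" makes the "u" optional

escaped_hits = [u for u in urls if escaped.search(u)]
unescaped_hits = [u for u in urls if unescaped.search(u)]

print(escaped_hits)    # ['/thankyou?id=123']
print(unescaped_hits)  # ['/thankyouid=123']
```

Without the backslash, the expression does not match the real goal URL at all, because the "?" is read as an instruction and not as a character.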
What it does: A pipe (|) creates an “or” statement.
a|b will match a or b
How it works:
Let’s say you want to find all visits from branded terms for PPC Hero. You can create a custom filter setting a regular expression for all brand keywords.
ppc hero|ppchero
All terms containing ppchero or ppc hero will be returned.
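To try this outside of Analytics, here is a rough sketch in Python. The keyword list is hypothetical, and I'm assuming re.search stands in for the "contains" style matching a filter does.

```python
import re

brand = re.compile(r"ppc hero|ppchero")

# Hypothetical search terms, used only for illustration.
keywords = ["ppc hero blog", "ppchero adwords tips", "ppc management"]

branded = [k for k in keywords if brand.search(k)]
print(branded)  # ['ppc hero blog', 'ppchero adwords tips']
```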
What it does: A question mark (?) tells Analytics that the previous item is optional.
ab?c will match ac or abc
How it works:
This expression comes in handy when you are filtering for keywords that are commonly misspelled. I want to find all visits to our site that contain the term “heroes,” which is often misspelled as “heros.”
heroe?s
This will catch keywords that contain either “heros” or “heroes.”
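A quick way to check this is a small Python sketch like the one below. The keywords are invented for the example, and re.search is assumed to behave like the partial matching in Analytics.

```python
import re

pattern = re.compile(r"heroe?s")  # the "e" is optional

# Hypothetical keywords, used only for illustration.
keywords = ["ppc heroes", "ppc heros", "ppc hero"]

matched = [k for k in keywords if pattern.search(k)]
print(matched)  # ['ppc heroes', 'ppc heros']
```

Note that "ppc hero" is left out, because the "s" is still required.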
What it does: Parentheses tell other regex characters how to function. They work the same way as in math.
2 + 3 x 5 = 17, but (2 + 3) x 5 = 25
How it works:
You’ll most often see parentheses working in conjunction with pipe bars. I want to see all the searches for Google Display Network. I know people also refer to it as the Google Content Network and I want to include both searches in my results. Without the parentheses, Analytics would return anything containing “Google Display” or “Content Network.”
Google (Display|Content) Network
By including parentheses, this Regex will return anything containing “google content network” or “google display network.”
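The effect of the parentheses can be sketched in Python as follows. The search terms are hypothetical, and I'm assuming case-insensitive matching (re.IGNORECASE) so the capitalized expression matches lowercase searches.

```python
import re

grouped = re.compile(r"Google (Display|Content) Network", re.IGNORECASE)
ungrouped = re.compile(r"Google Display|Content Network", re.IGNORECASE)

# Hypothetical search terms, used only for illustration.
searches = [
    "google display network",
    "google content network",
    "google display ads",
    "content network tips",
]

grouped_hits = [s for s in searches if grouped.search(s)]
ungrouped_hits = [s for s in searches if ungrouped.search(s)]

print(grouped_hits)    # only the first two searches
print(ungrouped_hits)  # all four searches
```

Without the parentheses, the pipe splits the whole expression in two, so unrelated searches slip in.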
What they do: Brackets ([ ]) create a list of items to match to. The regular expression will only match ONE item in this list.
p[aiu]n will match pan, pin, pun but NOT pain
How it’s used:
I’m interested in how many visitors click to the 2nd, 3rd, and 4th pages when they come to our blog. The URL for each page x on the blog is /page/x. To find pages 2, 3 and 4 I would set my expression as follows:
/page/[234]
Before we see what the results look like in Analytics, I want to introduce you to the next regex character which helps in creating lists.
What it does: A dash (-) works with brackets to extend lists.
- [a-z] matches all lower case letters in the alphabet
- [A-Z] matches all upper case letters in the alphabet
- [a-zA-Z0-9] matches lower and upper case letters and digits
How it works:
Let’s use the same example as above. Using a dash in my regular expression, I can quickly include more page numbers for it to match to without having to type them all out.
/page/[2-9]
This will return any page whose URL contains /page/2 through /page/9.
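Here is a small Python sketch comparing the bracket list with the dash range. The paths, including the category one, are made up for illustration, and re.search is assumed to mimic the partial matching in Analytics.

```python
import re

listed = re.compile(r"/page/[234]")  # explicit list of digits
ranged = re.compile(r"/page/[2-9]")  # same idea, written as a range

# Hypothetical page paths, used only for illustration.
paths = ["/page/2", "/page/4", "/page/7", "/page/1", "/category/ppc/page/3"]

listed_hits = [p for p in paths if listed.search(p)]
ranged_hits = [p for p in paths if ranged.search(p)]

print(listed_hits)  # ['/page/2', '/page/4', '/category/ppc/page/3']
print(ranged_hits)  # ['/page/2', '/page/4', '/page/7', '/category/ppc/page/3']
```

Both expressions also pick up the category path, because nothing in them says where the URL has to start.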
At this point you might be wondering about two things: what happens when you want to view pages higher than 9, and how do you keep regex from including the category pages? Those questions will be answered as we continue to get to know the rest of the Regex characters.
What they do: Braces ({ }) tell Analytics to repeat the last piece of information a certain number of times.
Braces can be used with one or two numbers.
- {x,y} – repeat the last item at least x times and no more than y times
- {z} – repeat the last item exactly z times
How it works:
I can use the braces, combined with brackets and dashes, to include page numbers higher than 9 in the example above. I’ll also need to change the starting number from a 2 to a 0, or else the regex will skip the digits 0 and 1 and miss pages such as 10.
/page/[0-9]{1,2}
This will match page/0 through page/99.
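To see what the braces actually grab, here is a short Python sketch. The helper matched_text and the paths are hypothetical, and re.search is assumed to act like the partial matching in Analytics.

```python
import re

def matched_text(pattern, path):
    """Return the part of path that the pattern matched, or None."""
    m = re.search(pattern, path)
    return m.group() if m else None

# Hypothetical page paths, used only for illustration.
paths = ["/page/7", "/page/10", "/page/99", "/page/"]
for p in paths:
    print(p, "->", matched_text(r"/page/[0-9]{1,2}", p))

# Starting the range at 2 misses page 10, because "1" is not in [2-9].
print(matched_text(r"/page/[2-9]{1,2}", "/page/10"))  # None
```

The braces stop after two digits, so on a path like /page/123 only /page/12 is the matched part.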
What it does: A dot (.) matches ANY one character. Characters include letters, numbers and symbols. A dot even matches a whitespace.
a.c will match “abc”, “adc”, “a$c”, “a c”, etc. It won’t match “ac” because there is no character between a and c.
How it works:
Truth be told, I really don’t use just the dot in analytics much. Even so, it’s still important to know how it functions so you set up your regular expressions correctly.
If I want to see all keywords in which someone included “.com” and I don’t use \ to remove the regex function of the dot, Analytics will find anything that has any character before “com.” The results differ when .com is used with and without the \.
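The difference can be sketched in Python like this. The keywords are invented for the example, and re.search is assumed to behave like the partial matching in Analytics.

```python
import re

unescaped = re.compile(r".com")   # any one character, then "com"
escaped = re.compile(r"\.com")    # a literal dot, then "com"

# Hypothetical keywords, used only for illustration.
keywords = ["ppchero.com", "ppc hero community", "ecommerce tips"]

unescaped_hits = [k for k in keywords if unescaped.search(k)]
escaped_hits = [k for k in keywords if escaped.search(k)]

print(unescaped_hits)  # all three keywords
print(escaped_hits)    # ['ppchero.com']
```

The unescaped version matches " com" in "community" and "ecom" in "ecommerce", which is rarely what you want.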
What it does: A plus sign matches one or more of the previous items, and only the previous items.
a+bc will match abc, aabc, aaabc but not bc.
You can also use lists with plus signs to match more than just one previous item.
[abc]+ will match a, ab, abc, acb, c, b, bbbbbbb, etc.
How it works:
Going back to the page number example, we can use + instead of { } to match to pages above 9.
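As a rough sketch in Python, the page number example with a plus sign might look like this. The expression /page/[0-9]+ is my assumption of what that would be, and the helper matched_text and the paths are hypothetical.

```python
import re

def matched_text(pattern, path):
    """Return the part of path that the pattern matched, or None."""
    m = re.search(pattern, path)
    return m.group() if m else None

plus_pattern = r"/page/[0-9]+"  # assumed expression: one or more digits

# Hypothetical page paths, used only for illustration.
for p in ["/page/5", "/page/42", "/page/123", "/page/"]:
    print(p, "->", matched_text(plus_pattern, p))
```

Unlike {1,2}, the plus sign puts no upper limit on the number of digits, but it still needs at least one, so /page/ on its own is not matched.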
What it does: A star (*) matches zero or more of the previous items. Stars are similar to plus signs except that they allow you to match ZERO or more of the previous items (plus signs require at least one match).
a*bc will match abc, aabc, aaabc AND bc.
How it works:
Let’s take the plus sign example above. It would only match page URLs that have some number after them. If I use a star, it will match all URLs that contain page/ with or without a number after it.
What it does: A dot followed by a star (.*) means “get everything.”
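Here is a minimal Python sketch of the star. The expression /page/[0-9]* is an assumed version of the page number example, and the helper matched_text and the paths are hypothetical.

```python
import re

def matched_text(pattern, path):
    """Return the part of path that the pattern matched, or None."""
    m = re.search(pattern, path)
    return m.group() if m else None

star_pattern = r"/page/[0-9]*"  # assumed expression: zero or more digits

# Hypothetical page paths, used only for illustration.
for p in ["/page/", "/page/3", "/pages"]:
    print(p, "->", matched_text(star_pattern, p))

# The same idea with the a*bc example: "bc" matches with zero a's.
print(re.search(r"a*bc", "bc") is not None)  # True
```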
How it works:
If I want to compare visits for the 2nd page of every piece of content on my site I can set up my regular expression as .*/page/2/.* to catch every 2nd page url on my site.
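In Python, a sketch of this expression could look like the following. The paths are made up for illustration, and re.search is assumed to behave like the matching in Analytics.

```python
import re

second_pages = re.compile(r".*/page/2/.*")

# Hypothetical page paths, used only for illustration.
paths = [
    "/blog/adwords-tips/page/2/",
    "/category/ppc/page/2/?utm=x",
    "/blog/adwords-tips/page/3/",
    "/page/2",
]

hits = [p for p in paths if second_pages.search(p)]
print(hits)  # the first two paths
```

The last path is skipped because the expression asks for a slash after the 2.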
What it does: When you use a caret (^) in your RegEx you force the expression to match only strings that start exactly the way your RegEx does.
^abc will match abc and abcd but not zabc
How it works:
Let’s revisit my earlier task of wanting to see all the pages in my main blog feed past page 1. Remember, I was getting category pages and not just main pages. Placing the caret at the beginning of my string can solve this problem.
^/page/[1-9]*
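A short Python sketch shows what the caret changes. The paths, including the category one, are hypothetical, and re.search is assumed to mimic the partial matching in Analytics.

```python
import re

anchored = re.compile(r"^/page/[1-9]*")
unanchored = re.compile(r"/page/[1-9]*")

# Hypothetical page paths, used only for illustration.
paths = ["/page/3", "/category/ppc/page/3", "/page/"]

anchored_hits = [p for p in paths if anchored.search(p)]
unanchored_hits = [p for p in paths if unanchored.search(p)]

print(anchored_hits)    # ['/page/3', '/page/']
print(unanchored_hits)  # all three paths
```

The caret drops the category path. Because of the star, /page/ with no number still matches.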
What it does: A dollar sign ($) indicates the end of the string. It tells Analytics not to match any target string that has any characters beyond where I have placed the dollar sign in my Regular Expression.
abc$ will match abc and zabc but not abcd
How it works:
Finally we have all of the characters necessary to create an expression that only shows the different pages of the main blog.
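As a sketch only, here is one way such an expression could be put together in Python. The pattern ^/page/[0-9]+$ is an assumption built from the characters covered above, not necessarily the exact expression for your site, and the paths are hypothetical.

```python
import re

# Assumed expression: start of string, /page/, one or more digits, end of string.
main_pages = re.compile(r"^/page/[0-9]+$")

# Hypothetical page paths, used only for illustration.
paths = [
    "/page/2",
    "/page/15",
    "/category/ppc/page/2",
    "/page/2/comments",
    "/page/",
]

hits = [p for p in paths if main_pages.search(p)]
print(hits)  # ['/page/2', '/page/15']
```

The caret keeps category pages out, the dollar sign keeps out anything with extra characters after the number, and the plus sign requires at least one digit.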
Understanding how to use these regular expressions will help you quickly find the information you are looking for in your Analytics account. If you have any questions or comments please post below!