Whether you have a strong background in working with data or barely know how to use an Excel spreadsheet, it’s important to understand how powerful it can be to use numbers to prove your assumptions.
But that sounds daunting, doesn’t it? Fortunately, we are here to provide you with lessons and templates to help us get more comfortable with data, and understand how we can use data to provide a compelling story as to what took place with our marketing efforts!
Another word for data is facts, and when collected in an appropriate and statistically significant manner, these facts are undeniable to all but the hungriest of HiPPOS (the Highest Paid Persons Opinion within an organization).
The key is to take these raw facts and shape them into something compelling. This involves turning data into information, and then turning that information into knowledge.
Since knowledge is power, connecting dots and forming a data driven conclusion is a way to speak from a position authority.
But how do we get there?
Getting started with data analysis
For this week’s lesson, we will walk through how we can start to take raw data and turn it into something meaningful. To make things easy on us, we will limit ourselves to only using one tool and one source of data to get started.
All that we will need is an Excel spreadsheet that contains the data that we would like to analyze. While you may not have access to vast amounts of data in your current position, there are several free data sets that we can use to begin out lesson. The largest free data resource is the United States government data.gov website. There you can search through over 90,000 data sets on just about any topic you might find of interest.
Another source that I have been meaning to play with in recent months is the results data from the 2014 Moz Industry Survey. The good folks at Moz published a survey of digital marketing industry professionals in late 2013 and had nearly 4,000 people take the survey. After providing some analysis of the data they collected on their own website, they took things a step further and made their raw data open-source for anyone to download and analyze (as long as you give attribution back to Moz).
It is rare that such a relevant data set would be made available to us for download, so I thought it would be fun to use this data set for our lesson in data before they change their minds!
Downloading the raw data set
The Moz survey data is available for download at the bottom of the Moz Industry Survey website. If you scroll down to the bottom of the post, you should notice a button that looks like this:
If you click on this button, you will be prompted to download a CSV file that contains the raw results of the survey with 3,769 rows and something like 270 columns of answers to questions. That is a lot of data for us to play with!
As soon as the file downloads, open it up in Excel, and here is what you should see in the Raw results.csv tab:
Note: I am using the Mac version of Microsoft Excel for these screenshots. Windows versions will look slightly different, but have the same functionality.
While there is a lot of data here, it’s not exactly meaningful at the moment, is it? This is one of the troubles of working with raw data – it is so raw that you’re not sure what to do with it at first!
Cleaning up and making sense of data
One of the first things that we will want to do with a raw data set is to clean up the data to make it easier to understand. I like to take the data and apply formatting to the table to make it easier to read in Excel. If you are using a current version of Excel, you can use the tables section to choose a color format for your tables. On a Mac, it looks like this:
Apply this format to your data set and it will start to look a little cleaner for you to review:
I also find it easier to read the data when rows are alternating in color.
This data is just showing the first 6 columns of data out of 270, so in this particular case it will take some time to make sense of the data we are reviewing. In future lessons we will walk through how to make sense of all of the data columns in the survey, but for now we will be just talking about applying basic formatting.
Finding patterns in data through pivot tables
Even glancing at the raw data in the screenshot above, we can start to make some conclusions about the data we are seeing. We know that there are both males and females in our data set. We know that they have a broad range of ages and come from more than just the United States. Looking through all 270 columns helps us determine that we know a lot about these people who have filled out the Moz survey.
But if we just glance or “eyeball” this raw data, it’s really tough to quantify whether there are more males than females. You would need to count through nearly 4,000 rows of data and tally up male vs. female.
That is ridiculously inefficient, isn’t it?
There is a better way, and it comes in the form of something called a Pivot Table. Simply put, a pivot table is like having a database query engine (like SQL) right in your Excel spreadsheet. Simply select the columns of data you would like to analyze and you can turn it into a pivot table.
After clicking this button, I then selected which range of data I would like to use for analysis. Since we converted our raw data to a table in the step above, I can simply choose the first two column names to get started.
This will create a grid with nothing in it that looks like this:
Not exactly useful yet, but we’ll get there soon!
Using the pivot table builder (this will surely look different on a PC), you can start to select the table columns that are in your spreadsheet:
Let’s add Gender to the Row Labels and the Values boxes at the bottom:
The table will now be filled in with a count of how many people responded Female or Male to the survey. We also found that some people chose not to say their gender.
It worked! We just saved ourselves hours of work and tons of pain to get an extremely accurate answer using pivot tables! We may be able to do the same thing for the rest of our 270 columns. But before we do that, let’s start to form a strategic approach to analyzing this data, because creating 270 pivot tables is not the most efficient use of our time.
Preparing for strategic data analysis
If we tried to do analysis for each of the 270 columns of data from the Moz survey, we would probably never find time to publish our results. This is because we are creating so many factors that we could potentially analyze that it becomes overwhelming. Instead, I recommend grouping the columns into the specific topic areas that they cover so that we can understand the data we are dealing with and form our own questions for analysis.
In looking at the Moz survey data, here is how I would initially group items together for ease of understanding. We have data that provides facts about:
- Demographics (age, gender, location)
- Work Experience (years on the job, education, salary levels)
- Current job functions (what someone does and how often)
- Skill Acquisition (how we learned to do our job)
- Clients (who we work for)
- Our Companies (how many employees, budget for tools, etc.)
- Industry Demand (where is demand increasing/decreasing)
- Importance of Metrics (SEO, Social, etc.)
- Tools of the Trade (for SEO, Social, etc.)
That is a lot of attributes about marketers that this survey has provided us! Using these buckets of topics, we can start to piece together a list of questions that can be answered by this data.
Here are some questions that I think we can answer using the Moz data:
How is skill acquisition influenced by demographics?
This is an interesting question for me. We have all read articles about the differences in learning styles between generations (Baby Boomers, Generation X, Millenials, etc.), as well as the differences between males and females when it comes to learning. I am wondering if there is a noticeable difference between ages and genders.
Do we use more or less tools as we get deeper into our career?
I have always thought that obsession with tools of the trade is something that happens early in our careers. This draws from both personal experience as well as the thought that people further in their career think less tactically than those just getting started. Tools are often used to speed up tactical excellence.
Are employees spending their time in current job functions that are increasing as industry demand increases?
Several questions in the survey pertain to whether demand for certain services are increasing, decreasing or staying the same. I have often observed that there are rarely enough talented people to fill in-demand roles, so I would be curious to know whether those taking the survey spent more time in these in demand roles or if they spent time in areas that were experiencing neutral or decreasing demand.
Is there a correlation between the size of our company and the size of our clients?
Big clients will only work with big companies, right? Not necessarily. While I was working at a boutique marketing agency for years, we were able to consistently work with companies that had over 100x more employees than we did. While I am not sure if we were the exception or the rule, we can see if the data will allow us to draw conclusions.