Tuesday, July 23, 2013

Categorizing Keywords

For those of you SEOs who manage very large sites and map your keyword categories to sections of your website - you know how difficult it is to categorize your terms and track their performance. Well, I have to say that after searching, asking, and digging around for a tool that does exactly what I am talking about, I finally came up with a solution. It's a bit of a workaround in Excel - but it's the best I can do until someone comes up with a tool that categorizes keywords for SEO.

Know Your Keywords and Categories

Before you get started categorizing the terms that come to your site, you should know what keywords you are targeting, and the combinations of terms as well. I'm going to use a flower shop's website as an example for this particular blog post. Categorizing is something you can do with any website. At the very least, you can categorize terms into "Broad" and "Branded", to get you started.

Most keyword tools can help you establish what categories to target. Google's Keyword Tool or WordTracker are just a couple of the many tools available on the web.

Another way to figure out which terms fit in which categories is by grabbing search data (referring terms in Google Analytics) for your site for the past few months or year. I personally spent some time categorizing keywords in Excel by using filters to show all terms containing "anniversary" for the "anniversary flowers" category. It takes a lot of work and time, but in the long run you will have a more accurate account of the terms you need to run the Lookup against.
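That filter-and-bucket step can be sketched in Python as well. This is just an illustration - the category names and substrings below are made-up examples for the flower shop:

```python
# A rough stand-in for the Excel filter approach: each category gets a list
# of substrings, and any term containing one of them falls into that bucket.
# Category names and substrings are hypothetical examples.
CATEGORY_RULES = {
    "Anniversary": ["anniversary"],
    "Wedding": ["wedding"],
    "Birthday": ["birthday"],
}

def bucket_terms(terms):
    """Group referring terms under the first category whose substring
    they contain; everything else stays uncategorized."""
    buckets = {name: [] for name in CATEGORY_RULES}
    buckets["Uncategorized"] = []
    for term in terms:
        for category, substrings in CATEGORY_RULES.items():
            if any(s in term.lower() for s in substrings):
                buckets[category].append(term)
                break
        else:
            buckets["Uncategorized"].append(term)
    return buckets

print(bucket_terms(["anniversary flowers", "cheap wedding bouquet", "flower shop"]))
```

The substring rules are crude on purpose - just like the Excel filters, they only get you a first pass that you then clean up by hand.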

Setting Up Your Template

Download the Template

Now that you have all the terms possible in all of your categories it's time to start setting up your template. You are going to want to Download the template I have set up in Excel. You can start from a fresh Excel document if you want, but the template has directions (in case you lose this blog post somehow) and the Lookup formula is in there.

Once you have downloaded the template it's time to get it set up to work for your keywords.

In the following steps - I am going to walk you through setting up the template and then categorizing the terms. If you don't have terms that you can use already, I have a zip file you can download and walk through the example with me to get familiar with how this works.

Copy your first set of categorized terms and paste them into the first Tab, marked "Broad". Since every site usually has a "Broad" category of terms, I figure that's probably the best one to get started with. In the case of this example, "flower shop", "online flower shop", and "best flower shop" are the terms that fit under the Broad category.

If you have the .zip folder downloaded, open up the "Terms" Excel doc and you will see the words already categorized for you: "Broad", "Branded", "Birthday", "Anniversary", and "Wedding". Click the drop down next to "Category", click "Select All" (to deselect all), and then click "Broad". You will see the list filter down to just the "Broad" category.

Next, select all of the terms in the "Keyword" list, then copy and paste them into the "Broad" Tab.
We then need to sort the terms in alphabetical order so that the Lookup can work through them in order. If you don't sort them, the Lookup won't work.

Highlight the Column with your keywords
Click "data" > "sort"
Select "My data has headers"
Select under "sort by" the column your keywords are under (should be column A)
Click OK

Double click the Tab and rename it with the one word name of your category.
Highlight all of your keywords in the column (just the cells that have words, not any blank cells).
Type the name of the category (stick to one-word naming) into the Name Box - the field at the upper left, to the left of the formula bar. You have now named your table.

Do this for "Branded" and the other categories as well. You are going to have to create a new tab in the template to fit all the categories.

If you have not downloaded the .zip file and are working off of your own terms, creating new tabs and naming them is probably going to be something you will need to do. But don't worry, the template will still work.

Now that all of your keywords are in your Template's Tabs, named and sorted, it's time to set up your Lookup string.

Setting up Your Lookup

The way the Lookup works in this case is we are going to ask Excel to look at one Keyword (one cell) and match it up to one of the terms in the Tabs we have set up. If it matches one of those terms then we tell Excel to place the word into that Cell. If it doesn't, then we just leave that cell blank.

The string looks like this:
  • B2 is the cell of the keyword we want to look for.
  • the first "Broad" is the Table name we want to look for that keyword in.
  • Broad!A$2:A$9999998 is the Tab and cell range that the Table covers.
  • FALSE tells the Lookup to do an exact match. TRUE would do an approximate (closest) match against the sorted list, which could drop keywords into the wrong category, so in this case it won't work.
  • We leave the ,"", as a blank - but you can put "not categorized" or "misc" to show that it isn't in a category. Though for our purposes here, we keep it blank.
  • ,"Broad" is telling Excel to put the word "Broad" in the cell if the keyword matches one of those in the Broad Table or Tab.
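Put together, the full formula likely resembles `=IF(ISNA(VLOOKUP(B2,Broad!A$2:A$9999998,1,FALSE)),"","Broad")` - that's a reconstruction from the pieces above; the template has the exact string. The same exact-match logic, sketched in Python with the flower-shop terms:

```python
# Emulates the exact-match Lookup: a keyword gets a category label only if
# it appears verbatim in that category's table; otherwise the "cell" is "".
CATEGORY_TABLES = {
    "Broad": ["best flower shop", "flower shop", "online flower shop"],
    "Anniversary": ["anniversary flowers"],  # hypothetical second table
}

def lookup(keyword):
    """Return the first category whose table contains the keyword
    exactly, or "" (the blank cell) when nothing matches."""
    for category, table in CATEGORY_TABLES.items():
        if keyword in table:
            return category
    return ""

print(lookup("online flower shop"))  # Broad
print(lookup("tulips"))              # blank - not categorized
```

Note how "tulips" comes back blank rather than being shoved into the nearest category - that's the behavior FALSE buys you.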

See - it's that easy...

What you are going to do next is replace the word "Broad" or "Cat1" with the name of your table, Tab, and category. This is why we name the Table, the Tab, and the Category the same so that our life is much easier when setting this string up.

Now your template is ready for you to paste some keywords with data and grab some numbers.

Gathering Your Data

Open up your Google Analytics account - if you don't have Google Analytics, pretty much any tracking tool that has a list of referring terms with some sort of data is fine. You can expand and contract the columns to the right of the terms as you wish. The template you will download will have the columns set up just for the purpose of exporting referring terms with visits and such from Google Analytics though.

Log into your Google Analytics account.
Click "Traffic Sources" > "Sources" > "Search" > "Organic"
Select the date range you would like to report on.
Scroll to the bottom of the report and show 5,000 rows.
Scroll back to the top and click "Export", then select "CSV".
After the file has downloaded, open the excel file.
Highlight JUST the cells that include the keywords and your data (ignore the first few rows at the top with the date information, and the summary rows at the bottom).
Copy those cells, and paste into your "Master" Tab.
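If you'd rather trim the export programmatically, the same slicing can be sketched in Python. The "Keyword" header name is an assumption about the CSV layout - adjust it to match your export:

```python
import csv

def read_ga_export(path):
    """Pull just the keyword rows out of a Google Analytics CSV export,
    skipping the date/profile preamble at the top and the summary rows
    at the bottom (anything after the blank line that ends the table)."""
    rows = []
    with open(path, newline="", encoding="utf-8") as f:
        in_table = False
        for row in csv.reader(f):
            if not row:
                in_table = False  # a blank line ends the data table
                continue
            if row[0] == "Keyword":  # assumed header row starts the table
                in_table = True
                continue
            if in_table:
                rows.append({"keyword": row[0], "visits": int(row[1])})
    return rows
```

The result is the same keyword-plus-visits block you would otherwise highlight by hand and paste into the "Master" Tab.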

Note: If you have multiple dates you would like to track, you can export the different date ranges, and then add which keywords go with what date in the Master Tab. This will allow you to see trends of categories.

I added an Excel doc called "Analytics Organic Search Traffic" with some terms and fake data that you can play with. There are three tabs that I added dates for each day's data. Start with just the one day and play with that to get familiar with percentages. From there you can play with all three dates and work on your trends to see what categories are trending up and down.

Completing Your Lookup

Now that you have copied and pasted the keywords into the "Master" Tab it's time to get all of those terms categorized.

Select the top row with your categories and your "All Categories" cell
Copy just those cells in the top row
Highlight the next row (the same cells, just below) and hold down the "shift" key
Scroll down to the last keyword record
Holding down the shift key select the last cell under the "All categories" - this highlights all of those cells for those categories to Lookup the keywords.
Hit "CTRL+V" on your keyboard (this quickly pastes the Lookup formulas for each line)
Be patient, as it may take a while for your Lookup to complete (depending on how many keywords, and records you have)
The "Master" Tab should look something like this:

Playing With Your Data

The most efficient way to gather information from your data is to copy the entire "Master" Tab and paste as values into a new Excel sheet.  This way you won't have to wait for the Lookup to complete each time you sort, pivot, etc.

Click the top left "Arrow" in the "Master" Tab
Right Click and select "Copy"
Open a new Excel Doc
Right Click and select "Paste Special", then choose "Values"

From here you can create pivot tables then sort them into pie charts, graphs, and all sorts of fun reports to see how your keywords are performing.
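Those pie-chart numbers boil down to visits summed per category as a share of the total. A quick sketch - the records below are fake data in the shape of the Master Tab:

```python
from collections import defaultdict

def category_share(records):
    """Sum visits per category and express each total as a percentage of
    all visits - the same rollup a pivot table feeds into a pie chart."""
    totals = defaultdict(int)
    for rec in records:
        totals[rec["category"] or "Uncategorized"] += rec["visits"]
    grand = sum(totals.values())
    return {cat: round(100.0 * v / grand, 1) for cat, v in totals.items()}

records = [
    {"category": "Broad", "visits": 300},
    {"category": "Anniversary", "visits": 150},
    {"category": "", "visits": 50},  # blank cell = not categorized
]
print(category_share(records))  # {'Broad': 60.0, 'Anniversary': 30.0, 'Uncategorized': 10.0}
```

Run the same rollup per date range and you have the category trend lines described in the Note above.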

I personally like to start with a quick pie chart to see what category of terms brings in the most traffic. At times we will have a drop or rise in traffic, and it's good to understand which category of terms is fluctuating. Copying and pasting terms by dates (weeks, months, or even a set of a few days) helps me see which categories are fluctuating on a timeline trend. Knowing which categories bring in the most traffic, I can then make decisions on which parts of the website we need to focus our efforts on to increase traffic.

See how much fun categorizing your terms can be?
Now that I have a template I work off of, when traffic goes up I can quickly categorize the terms and let our executives know if our recent efforts have worked.

Thursday, June 20, 2013

How long does it take for Google to recognize 301s?

Or Better Yet - 

It's been over a year and Google still doesn't have the new URLs in the Index

Just over a year ago, I started working on this website, which had over 900k files at the top level of the domain. We changed the structure of the URLs to a more organized hierarchy. The pages' content changed slightly, but most importantly, instead of all of the site's pages residing directly under the main domain (let's use a computer broad-to-longtail term structure as an example - like domain.com/computer.html, domain.com/laptop-computer.html, and domain.com/500gb-laptop-computer.html), we changed them to a more representative directory-to-file hierarchy (example - domain.com/computer/ to domain.com/computer/laptop-computer/ to domain.com/computer/laptop-computer/500gb.html).
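For the computer example above, the redirects might look something like this if the site ran on Apache with mod_rewrite - a hypothetical sketch; the real rules were generated dynamically from the database:

```apache
# 301 the old flat files to their place in the new hierarchy
RewriteEngine On
RewriteRule ^computer\.html$ /computer/ [R=301,L]
RewriteRule ^laptop-computer\.html$ /computer/laptop-computer/ [R=301,L]
RewriteRule ^500gb-laptop-computer\.html$ /computer/laptop-computer/500gb.html [R=301,L]
```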

Why the URL Hierarchy?

The quick and simple explanation as to why we did this: while URLs are fairly dynamic these days, the bots like to see and understand how a website is organized on a server. Remember your old school folder and file structure back when sites were built in html? The URLs you have today should represent that organized file structure as much as possible. I cover this in my SEO Workshop (slide 23), but I also found a pretty good article that explains the hierarchy relatively simply and quickly.

The process of setting up the 301s

Since the entire site's 27+ million pages were mostly files located directly under the main domain, it was difficult to understand which pages fit under which category so that we could organize them. I went to our keyword analysis, bucketed out each focus term, and then organized the correlating URLs to fit within each bucket. Once that was done, I worked with the Developers to pull the naming from the database (dynamically) into the directory and file structure that fit the buckets. Some of the keywords I knew I eventually wanted to build out with supporting pages, so those got directory levels instead of page levels for future optimization (and to limit more 301 redirecting later on).

I mention a bit about breaking the site up into sections for analytics purposes in my previous post "SEO Issues - is it Penguin? Is it Panda? or is it me?" under "Figuring out what was hit by Penguin". The "video" to the left is a quick (and very raw) animation to help explain exactly what we did. Now that the site is organized, it not only helps the bots understand the structure, but also helps us understand which sections bring in what SEO traffic in Google Analytics.

How Long Does it Take Google to See New URLs via a 301 Redirect?

This whole undertaking was completed over the course of 2-3 months, starting in June 2012 (last year) and finishing with the last of the redesigns and URL changes in August, plus one final directory change (no redesigned pages) in January of this year (2013). The most important old URLs are still showing 550,000 pages in Google's Index (11 months later):
As I Googled to see if others had a solution for speeding up the removal of these old URLs from the index, or if anyone had even had the same problem, I found a lot of questions in various forums (both reliable and unreliable) but no real articles, blog posts, or anything from reputable SEOs. The most common answer in the forums is to just "wait". It's, of course, what I tell others when they ask me: "Be patient, Google will eventually hit those pages again, recognize that they have changed, and correct the index." But after nearly a year and so many pages, this was getting ridiculous.

I spoke with my friend (and SEO mentor) Bruce Clay who came back with the suggestion to add an .xml sitemap and submit it to Google with the old URLs we want removed.

It kind of made sense: because those old URLs were no longer linked to, and there were so many of them, Google wasn't crawling them much anymore. They were just sitting there in the index - not getting "updated".

Unfortunately, getting a sitemap added is not an easy feat. I would have to define the strategy and present it to the powers that be with data to back up the success metrics in order to get the project prioritized. With so many other initiatives needed for SEO, all of which were more important and affected the business in a positive way, it was in my best interest to keep pushing those and not deal with the sitemap.

My workaround, though, was about as black hat as I would get (Matt Cutts, if you are reading this, I apologize and throw myself at your mercy, but it had to be done). One weekend over a month ago, I grabbed one of my many impulsively purchased domains and quickly set up hosting and an old school html site that consisted of one page. I then exported all of the links from the Google "site:" search through a Firefox plugin called SEOquake, which exports the results into a csv file. It's not the prettiest, and there was a lot of work still needed to get down to just the URLs, but it was the best solution I could find (note: if any SEO reading this knows of an easier way to do this - please add to the comments for posterity). I then parsed out the parameters in the URLs in a separate document and used those as the anchor text for each URL. Finally, using Excel, I concatenated the URLs and parameters (now anchor text) into an html href string.
Then, copying and pasting the "string" column into the html code, the page looked like:
The page wasn't the prettiest, and it had thousands of links (the above is just an example) so it was bad all around, but the point was to get those links crawled by Google.
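The export-parse-concatenate steps can be sketched in Python; the anchor-text rule here (last path segment, hyphens to spaces) is a simplification of the parameter parsing described above:

```python
from urllib.parse import urlsplit

def links_page(urls):
    """Build the one-page list of <a href> links from the old URLs,
    deriving anchor text from each URL's last path segment - a rough
    stand-in for the Excel CONCATENATE step."""
    anchors = []
    for url in urls:
        segment = urlsplit(url).path.rstrip("/").rsplit("/", 1)[-1]
        text = segment.replace(".html", "").replace("-", " ")
        anchors.append('<a href="%s">%s</a>' % (url, text))
    return "\n".join(anchors)

print(links_page(["http://domain.com/laptop-computer.html"]))
```

Each line of the returned string is one href, ready to paste into the bare html page.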

Of course every SEO knows that you can't just build a website and expect it to immediately get crawled - right? 

So I set it up in Google Webmaster Tools and submitted the page to the index:
I even got fancier to ensure Google would see the page and crawl all of those old URLs, and +1'd it on Google+.

Did it work?

I checked the URLs this evening to see how many Google is seeing and the number has dropped from 550,000 to now only 175.

I took the domain off of the server, and now have it parked elsewhere (back where it belongs) and removed the webmaster tools account. All traces of it ever existing are now gone, and the small moment of my attempt to get those URLs removed has passed.

Thanks For the Advice Jenn - Now I'm Going to Try This!

If you have come across this post and you need to do something similar - I'm going to put the same disclaimer they do when a very dangerous stunt is performed in commercials. 
Do not attempt this at home - this stunt was performed by a trained professional on a closed course.

So, don't go adding a bunch of links to a random domain just because your attempt from a few weeks ago to 301 pages isn't working yet. The links on the external domain were far too many for the domain and page, and were extremely spammy. In addition, all those links pointed to pages that were redirecting and were supposed to pass value to the new URLs - so those pages now had many spammy links pointing at them from a very spammy domain. If left up too long, or not done correctly, this could actually cause more damage than it ever helps.

If you have any questions, or feel you need to try this same strategy, please don't hesitate to contact me. I'm here to help, and want to ensure that your website has considered all possible options before attempting any such trickery.


Friday, January 18, 2013

SEO Issues - is it Penguin? Is it Panda? or is it me?

The following story has been several months in the making. It's one that I have lived through one too many times as an SEO, and one that I am sure other SEOs have faced. I wrestled with the thought of writing this for fear that someone from the company might read it and get angry that the story is told. But it's something I think people out there can learn from, and it speaks to so many others in this industry, showing them that they are not alone.

It's long, it's a bit technical (I tried to keep it simple), and it has some personal frustrations laid out in words. My only hope is that you get value out of reading this as much as living it has made me a better person (or well, a better SEO).

It Begins

I started working on this website's SEO in May 2012 at which time I was told the site's traffic was declining due to Panda updates. In February of 2012 the traffic from SEO was the best they had ever seen, but soon after that there was a steady decline.
Traffic from February 2012 - May 2012
Before digging into any possible SEO issues, I first checked the Google Trends to ensure that the decline isn't searcher related. Often times a drop in traffic could just mean that users aren't searching for the terms the website is ranking for as they were in the past.

Top Key Terms in Google Trends
Looking at the same time frame as the traffic data, I noticed an increase in searches for the top 3 terms the website ranked for, though there appeared to be a decline from March to April around the same time the traffic was reflecting one. But there was a drop in the website's traffic in April from the 23rd to the 24th, and then a significant one on the 25th. The website I was working on had two SEOs already working on it: an agency and a consultant. Both had already done a great deal of research and some work to get the website on track. Both were stressing that the drop in traffic was due to the Panda updates by Google. I looked at SEOmoz's Google Algorithm Change History and found an update to Google's Panda on April 19th and an update to Penguin on April 24th. Given that the traffic significantly dropped on the 24th, my best guess was that it was possibly Penguin related, but it still needed further exploration.

Figuring Out What Was Hit by Penguin.

The site is/was broken up into sections by keyword focus. I could tell that at one point someone really had a good head on their shoulders for SEO, but the strategy that was used was outdated. Perhaps the site was originally optimized several years before, and it just needed some cleanup to bring it up to 2012's optimization standards. So, understanding Penguin and identifying which part of the site was driving the bulk of the organic traffic was going to be my next step in solving this mystery. Once I understood why and where, I could start to establish what to do to solve the problem.

I broke the site traffic report out by sections as best I could in Google Analytics. There was a bit of a struggle, as all of the pages of the site resided directly under the main domain. Without a hierarchy in place, breaking out the sections had to be accomplished with a custom report and head matches on landing pages. I hadn't had to do this before, so the agency that was already working with the site helped build the first report, and I built out the other reports from there.
Section 1 over 72% of traffic

Just focusing on April and May I created a Dashboard in Google Analytics focusing on organic Traffic and identifying the sections of the site. Looking at the different sections - Section 1 was the bulk of the traffic with over 72% and Section 2 coming in second with just over 15%. Subs of Section 3 and other one-off pages make up the difference.

Both Section 1 and Section 2 dropped off after the April 24th date, so clearly they were the bulk of what was pulling the overall traffic numbers down. Since Section 1 was the majority of the traffic, I presented to the executive responsible for the site that we address any issues with that page first.

Actual screenshot of Section 1 presented
I took all of the research from the agency and consultant and we quickly reworked the pages to represent a hierarchy in the URL structure, and cleaned up any issues from the outdated optimization that was done.

Soon after Section 1 was addressed, we did the same with Section 2, and then worked on Section 3 (and sub pages, rolling them up into a solid section) and then added a few pages to grab any new opportunity.

Not Quite As Easy as it Looks

The projects were launched in increments - first URL hierarchy fix to Section 1 and then the page redesign. Next was a full launch of URL fixes and page redesign to Section 2, and then lastly Section 3 and the new Section 4.
Section 1 - Section 2- Section 3 Launch Dates and Organic Traffic
Soon after Section 1 launched, traffic started declining rapidly. I was asked several times why traffic was getting worse, and I started digging some more. Every time I looked at the Impressions of the new URLs from Section 1, they weren't getting any traction, but the previous URLs still were. I began looking at the history of the website, trying to find out why it had done so well at one point but was not doing well now. One of the things I noticed was that there was a lack of priority linking to these pages, though at some point some of them had been linked to individually from the homepage. Google maps the hierarchy of pages to the directory structure and to how links are presented on a site. This site had every page on the first level and linked to those pages from the homepage, which was telling Google that every page was the most important page. That worked at one time, but as Google rolled out their 2012 updates these pages were getting hit, and those links on the homepage weren't there anymore. Before the launch of Section 2, I had them put links to the main directory for each section on the homepage. The links would tell the search engines that these are important pages of the website, without being so obnoxious as to have a dozen or more links on the homepage discouraging users (avoiding the appearance of spamminess).

But - even after adding the links to the homepage, the traffic to those pages was still declining. Pressure was put on me to figure out what was wrong. In addition, accusations were flying that I had single-handedly ruined the SEO for the site. I spent every waking hour looking at reports and trying to figure out what was going on. I consulted friends in the industry and read every article I could find to figure out which Panda or Penguin updates were affecting these pages.

Then it hit me - just as the links to these sections would help them get recognized as important pages, so would the other pages being linked to from the homepage. In fact, a set of them linked to the website's search results with queries attached to them, mimicking pages but showing search results. On those search results pages there were over 200 links, with multiple (we're talking hundreds - possibly thousands of) combinations of parameters. The bots were coming to the homepage, following the links to the search results pages, and then getting stuck in this vortex of links and parameter-generated URLs - not allowing any crawl time for the pages that once were getting rankings. This also explained why the new URLs weren't showing very many impressions in the Webmaster Tools data: those pages just weren't getting crawled.

There was a project underway that would solve the many links on the search pages, and there was also talk of using ajax to show the results. When that project launched, the bots would go to the URL from the homepage but would then essentially go no further. With the project a few months out, I made the case to add the search page to robots.txt so the bots could recognize the Sections as important pages. After several weeks of attempting to convince the powers that be, the URL was eventually added to the robots.txt file.
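The change itself amounts to a single disallow rule in robots.txt; the /search path here is a hypothetical stand-in for the site's actual search-results URL:

```
User-agent: *
Disallow: /search
```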

Immediately after the search page was added to the robots.txt Google Webmaster tools presented me with a warning:
Warning in Webmaster Tools
In most cases, a warning from Google should never be taken lightly, but in this case it was exactly what I wanted. In fact it proved to me that my theory was correct, and that the site was hopefully headed down the right path.

Panic, Questioning, and a Third Party

As with every up in the SEO world, there must be a down. Soon after the search result page was added to the robots.txt, the organic traffic to the site dropped, and continued to drop. Throughout those grueling three months there were several Google Panda and Penguin updates. I documented each and every one of them in Google Analytics, and continued to answer questions, gather data, and deal with close scrutiny suggesting that the work I was doing was complete BS.
Organic Traffic from September 2012 - November 2012
I sat in numerous meetings, some of which I walked out of crying (I'm not afraid to admit it), being questioned about the road I had taken and why we weren't seeing results. There were people within the company recommending that they roll the pages back to where they were before, or even change the URLs. I fought hard for them not to touch a thing. I sent around an article posted on Search Engine Land by Barry Schwartz citing Google's patent that "tricks" search spammers.

The patent states:

When a spammer tries to positively influence a document’s rank through rank-modifying spamming, the spammer may be perplexed by the rank assigned by a rank transition function consistent with the principles of the invention, such as the ones described above. For example, the initial response to the spammer’s changes may cause the document’s rank to be negatively influenced rather than positively influenced. Unexpected results are bound to elicit a response from a spammer, particularly if their client is upset with the results. In response to negative results, the spammer may remove the changes and, thereby render the long-term impact on the document’s rank zero. Alternatively or additionally, it may take an unknown (possibly variable) amount of time to see positive (or expected) results in response to the spammer’s changes. In response to delayed results, the spammer may perform additional changes in an attempt to positively (or more positively) influence the document’s rank. In either event, these further spammer-initiated changes may assist in identifying signs of rank-modifying spamming.
But the article and my pleas fell on deaf ears...

It had gotten so heated, and there was such fear that nothing was being done while traffic significantly declined, that the company brought in yet another SEO consultant to look at the site objectively.

Just as the consultant was starting his audit, and traffic had hit the lowest I ever thought it could possibly go, traffic went up the next day. The last week in November (roughly 3 months after we blocked the search result page), I saw an increase in traffic in Google Analytics to Section 1:
Section 1 Organic Traffic
I quickly pulled up my report to check the Section's impressions from the Webmaster Tools data, and there was a significant increase as well:
Section 1 Impressions from Webmaster Tools Data
On December 3, 2012 I logged into Webmaster Tools and saw that the warning had gone away:
It was the "hallelujah" moment that every SEO dreams of, and very few get. All the work I had done, the fighting for what I believed in, it all finally paid off.

To this day traffic continues to increase - we can now focus on some of the cleanup still left to do, and then onto projects that will attract new opportunity.
Organic Traffic from November 2012 - January 17, 2013 (day before this post is written)
Quick Note: 
I forgot to mention a post I wrote months ago while going through all of this - SEO - Panda and the Penguins. It helps give a bit of perspective on some of the linking stuff I didn't get into in this post.