Group, count and order top words or phrases in Excel 2007

This is going to be one of those posts that's probably more for me than anyone else but recently it drove me a little loopy trying to sort out a basic task. I was asked to take one of excel error reports one of our websites generates and get a count for the top 20 recurring error messages. Sounds simple enough you might say, well thanks to Excel unless you know where to dig it can be a little frustrating to do!

Here are the steps I went through to do this work (note: I'm using Office 2007 for this):

  1. I downloaded my excel sheet from my site and opened it up in Excel. One of the columns contained the error messages so I copied that and put it into a new worksheet.  
  2. At this stage I have all my error messages but I want to group them and get a total for the number of occurrences of each error message. First I go to 'Sort & Filter' and order the error messages column alphabetically. Then click on the Data tab and select SubTotal. The default values in the subtotal dialog box are usually ok - just make sure the Use Function option is selected to 'Count' and click OK.
  3. The data is now grouped and you will have a mixture of the error messages along with the actual occurrences total. In the row number column you should notice some lines indicating the groupings. By default there should be 3 lines here. Selecting the second line will close all of the duplicate records and display just the error message and the number of times it occurs.

    "Brilliant", you think. "I'm done". Woah there jimmy, not so fast! Next up you need to order these grouped results by highest to lowest.
  4. I find the easiest way to order the number of occurrences is to highlight to two columns (the error message and the number of occurrences) and click on the 'Find & Select' option from the Home tab and select 'Go To Special'. When the dialog appears select 'Visible Cells Only'. This will highlight all of the visible cells (it excludes all the collapsed information which is what we want). Copy this and paste the results into a new worksheet.
  5. Now you can order the number of occurrences from highest to lowest. See, wasn't that easy? :) Maybe someone else out there might know of an easier way to get this end result but I found that this way works best for me. I normally forget step 4 so when I try to paste the collapsed items into a new worksheet it pastes all of the data which is not what I needed or wanted.

Like I said at the start of this post - I don't really expect any of the above to useful to most people but for me, this is going to be one of those posts I go back to and say "Ohhhh ya, that's how I did it!".

blog comments powered by Disqus

Get In Touch

Follow me online at TwitterFacebook or Flickr.

Latest Tweets