Towards that end, each troubleshooting article has the following question to which users answer yes or no: “Did this article solve a problem you had with Firefox?”
Historical background
At first blush, it seems that counting the number of users who answer “yes” to this question for a given article will indicate how many people have a given problem. However, this gives (for this past week) the following statistics as the top 10:
For Internet Explorer Users 833
Installing Firefox On Windows 240
Keyboard Shortcuts 205
Clearing Private Data 167
Clearing Location Bar History 153
How To Make Firefox The Default Browser 133
Options Window 127
Cookies 93
How To Set The Home Page 69
Using Firefox 68
As you can see, it appears that the vast majority of our users just want to read what we have to say about Internet Explorer or installing Firefox on Windows, even though reading through forum posts and Hendrix posts suggests that users are really having trouble with things like the change to the location bar and bookmarks UI, as well as bookmarks not saving. A quick look at the way people make it to these pages explains why the stats are as skewed as they are: the For Internet Explorer Users article is linked directly from the Firefox Help menu, and most of the other articles in the top 10 are linked from the in-product help page. This means that traffic to these pages is much higher than to other KB article pages, and hence they naturally get more votes.
Development of current approach
To compensate for vote skewing based on how many users see a page, we would have to divide the number of votes by the number of pageviews.
Also, users who voted “no, this article didn’t solve a problem” probably had the issue in question, but our support article wasn’t able to address it; so to build a metric of the most common problems, we should include those votes as well. We can build a votes metric that is Yes votes plus No votes.
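As a sketch, this first-pass metric (total votes divided by pageviews) might be computed as follows; the function name and the sample numbers are invented for illustration and are not real SUMO figures:

```python
# First-pass metric: (yes votes + no votes) / pageviews.
# Names and numbers below are hypothetical, for illustration only.

def votes_per_pageview(yes_votes: int, no_votes: int, pageviews: int) -> float:
    """Total votes normalized by how many people saw the page."""
    return (yes_votes + no_votes) / pageviews

# A heavily linked article vs. a rarely seen page with a couple of stray votes:
popular = votes_per_pageview(yes_votes=833, no_votes=200, pageviews=50000)
obscure = votes_per_pageview(yes_votes=1, no_votes=1, pageviews=3)

# The obscure page wins by a huge margin under this metric.
assert obscure > popular
```

This already hints at the weakness: a page with almost no traffic but a stray vote or two dominates the ranking.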
Doing this gives the following list: essentially all pages that were mis-linked from the English-language search results or weren’t properly filed. These results have so few pageviews that their numbers are artificially inflated.
Installer Firefox Sur Windows
Firefox Unter Linux Installieren
Zoom De La Page
Installare Firefox Su Linux
Bloqueador De Popups
Informazioni Su Javascript
Importation
Mistet Bokmerker
Innstillinger
Suggestions De Recherche
We’re going to have to factor the number of pageviews back into the equation somehow. Too many pages have next to no pageviews and that completely skews the numbers. We seem to have come full circle.
However, Omniture is more powerful than this. It can also tell us where people go after they search, or after they visit our front page or in-product front page. We can combine this data into a single statistic which I’ll (for lack of a better term) call significant pageviews. Someone searching for a problem and landing on a KB article is worth 1 point. Someone clicking on a link from the start page is worth 0.2 points (because they find that page when they look for Firefox help), and clicking from the in-product page is worth only 0.1 point (since it’s an easier page to get to, and people may find it even when they don’t have a specific problem). Since basic pageviews are still important (links from outside are included here), we factor those in at 0.01 points. All of these factors can be adjusted, but these numbers serve for illustrative purposes for now.
We can then build our final metric (score):
Article score = (Significant pageviews) * (Total votes) / (Total pageviews)
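The weighting and the score formula above can be sketched like this; the function and parameter names are mine, and the weights are the illustrative values from the text:

```python
# Sketch of the article-score computation. Weights (1.0 search, 0.2 start
# page, 0.1 in-product, 0.01 raw pageviews) are the illustrative values
# from the text; all names are hypothetical.

def significant_pageviews(from_search: int, from_start_page: int,
                          from_inproduct: int, total_pageviews: int) -> float:
    """Weighted sum of the routes by which people reach an article."""
    return (from_search * 1.0
            + from_start_page * 0.2
            + from_inproduct * 0.1
            + total_pageviews * 0.01)

def article_score(from_search: int, from_start_page: int, from_inproduct: int,
                  total_pageviews: int, yes_votes: int, no_votes: int) -> float:
    """(Significant pageviews) * (Total votes) / (Total pageviews)."""
    sig = significant_pageviews(from_search, from_start_page,
                                from_inproduct, total_pageviews)
    return sig * (yes_votes + no_votes) / total_pageviews
```

Because the weights are plain multipliers, tuning them later means changing four constants rather than redoing the analysis.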
Ranking by article score gives the following:
Clearing Private Data 270.8
No Sound In Firefox 249.34
How To Clear Search Bar History 138.54
Clearing Location Bar History 134.8
Hiding Bookmarks In The Smart Location Bar 125.2
Bookmarks 120.85
Bookmarks Not Saved 118.73
Options Window 113.24
For Internet Explorer Users 111.82
Organizing Bookmarks 99.22
This is a lot better (although it’s also apparent that it could use a lot more tweaking, since articles like For Internet Explorer Users are still overrepresented).
Distinguishing yes and no votes
Now, in the above scenarios, we’ve merged yes votes and no votes to figure out which problems users are facing most. What if we wanted to answer the question of which articles are most likely to solve the user’s issue, and which are most likely to be a waste of users’ time? With our new number of “significant pageviews”, we can look at yes votes and no votes separately. Since a yes vote should count against a no vote and vice versa (a large number of yes votes alongside a correspondingly large number of no votes shouldn’t count for more than an article with predominantly yes votes), we build the following metrics:
Most helpful articles: (Significant pageviews) * (Yes votes – No votes * 0.5) / (Total pageviews)
Least helpful articles: (Significant pageviews) * (No votes – Yes votes * 0.5) / (Total pageviews)
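The two formulas can be sketched as a pair of functions (names are hypothetical; the 0.5 penalty is the factor from the formulas above):

```python
def most_helpful_score(sig_pageviews: float, yes_votes: int,
                       no_votes: int, pageviews: int) -> float:
    """(Significant pageviews) * (Yes votes - No votes * 0.5) / (Total pageviews)."""
    return sig_pageviews * (yes_votes - 0.5 * no_votes) / pageviews

def least_helpful_score(sig_pageviews: float, yes_votes: int,
                        no_votes: int, pageviews: int) -> float:
    """(Significant pageviews) * (No votes - Yes votes * 0.5) / (Total pageviews)."""
    return sig_pageviews * (no_votes - 0.5 * yes_votes) / pageviews
```

Note that an article with equal yes and no votes gets the same modest score under both metrics, so controversial pages don’t dominate either list.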
Astute observers may wonder why we don’t just take the ratio of yes votes to no votes. The answer is that doing so removes the significance of the vote count entirely: an article with just one vote in one direction will dominate over one with hundreds in both. The balancing above is a little more informative in terms of providing the desired information without weighting pages with few votes too heavily, and it also has a nice symmetry with the previous method of scoring.
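A toy illustration of that ratio problem, with invented numbers:

```python
# Why a plain yes:no ratio fails (numbers invented for illustration).
def yes_no_ratio(yes_votes: int, no_votes: int) -> float:
    return yes_votes / no_votes if no_votes else float("inf")

one_stray_vote = yes_no_ratio(yes_votes=1, no_votes=0)    # a single vote
well_voted = yes_no_ratio(yes_votes=300, no_votes=100)    # hundreds of votes

# The single-vote article outranks the well-voted one under a pure ratio.
assert one_stray_vote > well_voted
```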
This is the current status of the KB metrics collection.
Possible improvements
The primary improvement that I would like to see is factoring in how people get to a certain page if they’re not coming via searching or a front page. If they’re linking from other knowledge base articles, or the forums, I’d like to score that as high as or higher than search results, whereas if they’re coming from outside the site, I’d score that low. Unfortunately while it is technically possible to distinguish between these possibilities using Omniture data, it is extremely labor intensive as it involves downloading a report for every single knowledge base article. Until there’s a simple way to get a dump of the entire Omniture data set each week, it’ll be very difficult to produce this quality of data.
Another improvement is dividing the large and unwieldy knowledge base articles up such that each one addresses a single issue. This way, we can pinpoint exactly which solutions are useful and which should be dropped. The Bookmarks article may be so highly ranked because people aren’t familiar with bookmarks, or because they have a problem with some aspect of the bookmarks UI, or because they have lost their bookmarks. It’s impossible to tell right now, but if we could divide those up or somehow collect more detailed statistics, our data would be a lot better.
A final improvement is adjusting the factors. Since there is no conclusive independent ranking of knowledge base articles, it’s impossible to know how accurate this dataset is at predicting what issues users will have. The factors may need to be adjusted to make a more useful “significant pageviews” number. The numbers used above were devised from best-guess estimates of the relative number of people who come via each channel with the indicated issue with Firefox. Without hard facts or independent confirmation, that’s as good as we can do at this time. However, if you think the numbers should be significantly different, please say so; we can try all sorts of combinations.
This part could be clarified a bit: Basically, there are three main
reasons why someone would vote No on the question "Did this article
solve the problem you had with Firefox?":
a) The article correctly describes the user's problem, but the suggested
solution(s) didn't solve it.
b) The article does not describe the user's problem -- the user is
reading the wrong article.
c) The user doesn't even have a problem, but somehow decided to vote anyway.
We can be fairly sure that group c is marginal here, leaving us with
groups a and b. We're making the assumption that if a person reads all
the way down to the end of the article, that's an indication that the
article correctly describes the problem, meaning the person belongs in
group a.
However, we can't assume that everyone is in this group, so the factor
for "No" votes should be lower than 1.0, maybe 0.75. The only way to
know for sure would be to ask a follow-up question when someone votes no
asking for the reason why it didn't help. At this point I don't think
this is necessary; instead we should probably focus on providing good
next steps for the user, e.g. relevant articles or a way to the forum or
live chat.
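The suggested discount could be expressed as an adjustable weight on "No" votes (a sketch with hypothetical names; 0.75 is the value floated above, meant to be tuned):

```python
# Total-problem vote count with "No" votes discounted, since some "No"
# voters (group b above) were simply on the wrong article. The 0.75
# default is the guess suggested above, not a measured value.

def weighted_vote_count(yes_votes: int, no_votes: int,
                        no_weight: float = 0.75) -> float:
    return yes_votes + no_weight * no_votes
```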
Great summary of the hard work that has been put into this effort of
getting better metrics from SUMO, Cheng!
First of all, it's great to see someone look into metrics on SUMO.
Could you detail a bit on what questions you're actually trying to
answer with those metrics? I probably just missed the memo.
I have a few technical issues, I have to admit.
- troubleshooting article
What's the definition of that? Like, the "For Internet Explorer Users"
seems to qualify and not in your analysis, in that on the one hand, it
has the Yes/No buttons, but on the other hand you're unhappy if it pops
up in the analysis. Not sure why it shouldn't.
- translated articles
There is one section, where you come up with a bunch of article names
that look like translations. For some metrics, translations and their
en-US original should probably be counted as one. Unless you're trying to
evaluate translation quality for a particular page.
- statistical relevance
You're introducing some complex metric with adjustable weights, and one
of the reasons you cited was to work around the noise of non-popular
pages. There are statistical measures like variance for that. How about
cutting off the data by the variance at some point and using a simple
measure?
- weights in the complex measure
I didn't get those, at all. Like, why would a user that finds the page
he's looking for because we've set up a good navigation path be less
significant than a search result?
- "no" answers
I think that "no" answers are different answers than "yes". There are
various steps from the problem to a troubleshooting article, and if
you're looking at the wrong page, you might say "no". That is more an
answer on search results or other navigation, though, and not that much
of an answer on whether that article gives a good answer to the problem
it's about.
Yeah, sorry, loads of "no"s.
Depending on which question you're trying to answer, it might be a nice
other project to analyse what users are searching for when hitting a
troubleshoot article, or even more, if there are peaks in searches which
we don't have answers for or the like.
Axel
The most basic question: what issues/problems with Firefox are our users facing most commonly? More specifically, we want data that we can then take to the greater mozilla community as a form of feedback as to what issues are most prevalent among USERS (not technical folk who understand bugzilla). There was no memo, it's my fault for not making the purpose of this project clear at the outset.
>
> I have a few technical issues, I have to admit.
>
> - troubleshooting article
>
> What's the definition of that? Like, the "For Internet Explorer Users"
> seems to qualify and not in your analysis, in that on the one hand, it
> has the Yes/No buttons, but on the other hand you're unhappy if it pops
> up in the analysis. Not sure why it shouldn't.
I don't think it should qualify, because it doesn't fix a problem but provides information. In that sense it doesn't actually help us identify issues with Firefox. More importantly, I highly doubt that that many users are coming to support because they have a problem with migrating from IE; it's more likely that they click on "Help for IE users" from the Help menu in Firefox on Windows, see the poll at the bottom, and just vote. Based on support questions in the forums and live chat, almost no one is looking for a feature in IE that they don't know how to get in Firefox.
>
> - translated articles
>
> There is one section, where you come up with a bunch of article names
> that look like translations. For some metrics, translations and their
> en-US original should probably be counted as one. Unless you're trying to
> evaluate translation quality for a particular page.
>
I'm working off article names, and none of my raw numbers track which English article each foreign-language article corresponds to. It's far easier to filter on URL and drop hits on foreign-language pages altogether. This, however, lets some article hits slip through due to various tikiwiki bugs, and thus you see a lot of articles with really few pageviews. More importantly, the point I was making in that section is that articles with pageviews = 1 but more than one vote (basically articles where Omniture screwed up the locale pointing in the URL tracking due to a tikiwiki bug) are going to dominate if we only consider votes per pageview, so we have to adjust for it. Basically, that statistic no longer measures how common certain issues are but rather how _infrequently_ a given page is seen.
> - statistical relevance
>
> You're introducing some complex metric with adjustable weights, and one
> of the reasons you cited was to work around the noise of non-popular
> pages. There are statistical measures like variance for that. How about
> cutting off the data by the variance at some point and using a simple
> measure?
>
I'm not sure how variance accounts for this. On what metric do I calculate variance? This isn't an issue with statistical or random noise; this is a problem with user flow. For example, the article on Options and the article For Internet Explorer Users can be accessed directly from Firefox, without the user having to go via the main support site or having a specific problem in mind. While we make the assumption that users who vote on a page will have attempted the solution on that page, for pages with no explicit problem or explicit solutions, that doesn't necessarily apply. We have to divide out the huge statistical anomalies. As for statistical relevance, this is data mining, and it's always possible to over-interpret data and assign it higher relevance than it may actually have. The most concrete conclusions you can draw are which articles represent problems that users are most likely to face or look for, in a rough, ranked sense. Just because one article is ranked 12 and another 13 doesn't mean that by another metric it shouldn't be the other way around, but the issue addressed in either of those is probably less commonly seen than for the article ranked 2 and more commonly seen than for the article ranked 50.
> - weights in the complex measure
>
> I didn't get those, at all. Like, why would a user that finds the page
> he's looking for because we've set up a good navigation path be less
> significant than a search result?
>
Because users aren't that good. Without doing a search, they're not exposed to the full list of possible articles. Users who just click on one of the ten linked articles from the front page are less likely to have picked the article that best describes their problem from the pool of hundreds. They may have picked the best article that describes their problem from a list of ten but that's not as informative. Users who go to the inproduct help page are less likely to be needing help than users who went to google or the mozilla site, searched for firefox support and got to our main support site, so inproduct users are ranked even lower.
> - "no" answers
>
> I think that "no" answers are different answers than "yes". There are
> various steps from the problem to a troubleshooting article, and if
> you're looking at the wrong page, you might say "no". That is more an
> answer on search results or other navigation, though, and not that much
> of an answer on whether that article gives a good answer to the problem
> it's about.
>
> Yeah, sorry, loads of "no"s.
>
Djst answered this question above. If the article didn't solve the problem but the page is correct, or at least the user has a strong suspicion that the page will be helpful, they probably have the issue in the title or some close variant and should be counted. Our KB is not comprehensive, and often we don't include solutions for specific issues if they're deemed too technical or the likelihood of another solution working is much higher. People who have no better article for their problem and vote "no" _should_ be counted. People for whom there is a better article but who found the wrong one _should not_ be counted. We hope that with redirection inside the article ("if you have XYZ, see this article instead") users are more likely to find and vote on the article that best describes their problem, so "no" votes should carry significant weight (perhaps not as much as a yes vote, but definitely more than half).
> Depending on which question you're trying to answer, it might be a nice
> other project to analyse what users are searching for when hitting a
> troubleshoot article, or even more, if there are peaks in searches which
> we don't have answers for or the like.
I agree, looking at search terms and paths to find things was actually my first idea for collecting metrics. Unfortunately Omniture doesn't track that (it chops off all the parameters that are pushed to a page in the URL) and the Google search term tracking doesn't provide a lot of data, just a list of top ten search terms with no numbers, timeline or page targets. Looking at where users go after they search is currently the best hard statistic we've got, which is why I gave it the strongest rating.
>
> Axel
Thanks for your feedback and as I've said, there's a ton of room for improvement. If you have any suggestions for other data that we can look at and how we can incorporate them, I'd really appreciate hearing about them.
> _______________________________________________
> support-planning mailing list
> support-...@lists.mozilla.org
> https://lists.mozilla.org/listinfo/support-planning
>
This might be a bit off from the information we're trying to gather
currently, but I'd love to see stats like "users who visited x article were
most likely to click yes on y article." This is along the lines of studying
the paths users take. In theory, x should always = y but if lots of people
are ending up on one article and then following it to another we'd want to
look at why and see about getting people to the correct article first.
>
> This might be a bit off from the information we're trying to gather
> currently, but I'd love to see stats like "users who visited x article were
> most likely to click yes on y article." This is along the lines of studying
> the paths users take. In theory, x should always = y but if lots of people
> are ending up on one article and then following it to another we'd want to
> look at why and see about getting people to the correct article first.
This is not doable because the server that tracks votes (tikiwiki) is independent of the server that tracks user paths (Omniture). I am, however, able to get stats like "35% of users who search for X eventually follow links to Y." That would involve, again, downloading page and path data for every single page on SUMO, which is an incredibly daunting manual task (about 12 clicks and 4 minutes for each one). There is no automatic massive data dump that I can use, unlike for SUMO. This is something I _would_ like to do at some point, but it's certainly not something I can do on a weekly basis.
You might be able to use gristmill/windmill to automate that.
Axel
I'm not as familiar with Omniture; does it only track clicks on links (as
opposed to other types of elements)?
Also, if it tracks user paths, what can it compare out of the box? Does it
follow a user from entry to exit, or does it just count entries and clicks
per page?
Thanks for explaining! Seems quite complicated.