Type 1 and Type 2 Errors
Type 1 and Type 2 Errors may be something you vaguely remember from statistics courses but few people really know what the terms mean. If you turn to Wikipedia hoping for some clarification on Type 1 and Type 2 Errors, then you are in for a disappointment. It starts off with a clear warning:
This article may be too technical for most readers to understand.
The terms may be somewhat scary, but behind them are some basic and important ideas. A Type I error (or, an error of the first kind) and a Type II error (or, an error of the second kind) are precise technical terms used in statistics to describe particular flaws in a testing process. The terms are also used in a more general way by social scientists and others to refer to flaws in reasoning.
Separating Sheep And Goats
A simple game will illustrate what we’re talking about here. Suppose you are given a pile of 60 photographs, some of sheep and some of goats. The game is to look at each photograph for one second and then put it in the sheep pile or the goat pile. At the end you may find that the sheep pile mistakenly includes some goats, which should have been rejected. Those could be considered Type 1 errors. On the other hand, your goat pile may include some sheep photographs. These could be considered Type 2 errors.
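For readers who like to see the bookkeeping, here is a minimal Python sketch of the game. It assumes the convention used above: a goat in the sheep pile counts as a Type 1 error and a sheep in the goat pile counts as a Type 2 error. The function name and the six sample photographs are made up for illustration.

```python
from collections import Counter

def count_sorting_errors(true_labels, assigned_piles):
    """Tally correct and mistaken assignments in the sheep/goat game."""
    counts = Counter()
    for truth, pile in zip(true_labels, assigned_piles):
        if pile == "sheep" and truth == "goat":
            counts["type_1"] += 1   # goat wrongly placed in the sheep pile
        elif pile == "goat" and truth == "sheep":
            counts["type_2"] += 1   # sheep wrongly placed in the goat pile
        else:
            counts["correct"] += 1  # photograph landed in the right pile
    return counts

# Hypothetical outcome for six of the photographs
truth = ["sheep", "goat", "sheep", "goat", "sheep", "goat"]
piles = ["sheep", "sheep", "goat", "goat", "sheep", "goat"]
result = count_sorting_errors(truth, piles)
print(result["type_1"], result["type_2"], result["correct"])
```

In this made-up run one goat slipped into the sheep pile and one sheep was left with the goats, so each type of error occurs once.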
Whenever someone has to divide a queue of people into two groups, there is a possibility that some people will be assigned to the wrong group, so that one or other type of error is made. A triage nurse in a hospital Emergency Room tries to decide on the basis of limited information whether visitors should be fast-tracked given their condition or added to the general line-up. Mistakes will occur for a variety of reasons and some who should have been fast-tracked are left in the general line-up. Others who should not have been fast-tracked are mistakenly accepted for the fast-track.
Such decisions separating the ‘sheep’ from the ‘goats’ are more common than you might think. In every case, since decisions are rarely perfect, Type 1 and Type 2 errors will occur. Some sheep will be left with the goats and some goats will find they’ve been given the sheep treatment. Some of these decisions can have very important consequences. One of these is Google’s assessment of the quality of websites and whether they should be included in its index or not. Since Google can bring large volumes of traffic to websites that are included in the index, it matters a great deal whether Google treats your website as a sheep or a goat. A knowledgeable service firm such as Ingenuity Digital can help you ensure that your website is treated as a sheep.
Google’s Type 1 and Type 2 errors – Separating Sheep and Goats
Here we are using a shorthand in defining a sheep in Google’s eyes as a website that conforms to their Webmaster Guidelines. Their explanation of the Guidelines is that these are “Best practices to help Google find, crawl, and index your site.” For simplicity we are describing your website as a goat if Google chooses not to index it.
The guidelines include three sections:
- Design and content guidelines
- Technical guidelines
- Quality guidelines
Google gives a clear indication of what may turn your website into a goat.
Even if you choose not to implement any of these suggestions, we strongly encourage you to pay very close attention to the “Quality Guidelines,” which outline some of the illicit practices that may lead to a site being removed entirely from the Google index or otherwise impacted by an algorithmic or manual spam action. If a site has been affected by a spam action, it may no longer show up in results on Google.com or on any of Google’s partner sites.
These guidelines are complex and matters are not black and white. Undoubtedly there are shades of grey. That’s just the kind of situation where a decision process to separate sheep from goats may end up with mis-assignments and errors of both types.
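A small Python sketch can show why shades of grey produce both kinds of error. It assumes, purely for illustration, that each website gets a single quality score and is accepted as a sheep when the score reaches a cut-off. The scores, the cut-offs and the function name are hypothetical and say nothing about how Google actually scores sites.

```python
def errors_at_threshold(sites, threshold):
    """Count both error types for a given cut-off.

    sites: list of (quality_score, is_really_sheep) pairs.
    A site is accepted as a sheep when quality_score >= threshold.
    """
    # Type 1: a goat is accepted as a sheep
    type_1 = sum(1 for score, is_sheep in sites
                 if score >= threshold and not is_sheep)
    # Type 2: a sheep is rejected as a goat
    type_2 = sum(1 for score, is_sheep in sites
                 if score < threshold and is_sheep)
    return type_1, type_2

# Hypothetical sites: the true sheep and goats overlap in the middle scores
sites = [(0.9, True), (0.7, True), (0.6, False),
         (0.5, True), (0.4, False), (0.2, False)]

for threshold in (0.3, 0.55, 0.8):
    print(threshold, errors_at_threshold(sites, threshold))
```

With these made-up numbers, a lenient cut-off lets goats through and rejects no sheep, a strict cut-off does the opposite, and no cut-off removes both kinds of error because the two groups overlap.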
Webmasters may insist that their website is of high quality and should be clearly accepted as a sheep. Google is the final arbiter and makes the decision. If they say it’s a goat, then it’s a goat. However, Google may well make an erroneous judgment either way. Google makes the sheep/goat assessment principally by a series of algorithms, which they are seeking to improve all the time. The fact that these algorithmic improvements are ongoing shows that Google itself feels their sheep/goat assessment is not yet 100% correct.
Google Panda Updates – Spotting Sheep Clones
A good example of this ongoing improvement process in 2012 is the series of Panda Updates since February. The first of these was described by Barry Schwartz of Search Engine Roundtable who wrote that Google’s Farmer Update Is Live: 12% Of Google’s Results Are Forever Changed. The Farmer name was used by some before the Panda name came into wide usage. These Panda updates aim to downgrade web pages that are almost direct copies of others. Even if the original version was a sheep, producing thousands of clones of it in no way creates web pages of quality. So Google aimed to group these sheep clones with the goats. Most webmasters were entirely in support of this Google initiative. Clearly a better assessment of sheep and goats was in the making.
Google’s Over Optimization Update Named Penguin
Search Engine Roundtable now has information on the latest update called Penguin. This identifies over-optimized web pages that are stuffed with keywords as goats. Most realistic webmasters will find that this update too is an improvement in separating the true sheep from the goats. Congratulations, Google: if this update works correctly, the Type 1 and Type 2 errors are likely to be reduced further. Initial reactions on Internet marketing forums seem to be positive.
The Unnatural Link Update
A recent update that attempted to better separate out the sheep and the goats is proving to be much less satisfactory. This is the Unnatural Link Update, which is well described by Patrick Altoft of Branded3 in his article on The New Google Link Algorithm:
The Unnatural Link update works in direct contrast to previous updates because it penalises websites purely based on the link strategy they have chosen to adopt and it has caused huge problems for the sites that are affected.
Since we blogged about the unnatural link notices that Google has sent to almost 1 million websites in the past few months we’ve had loads of people getting in touch with Branded3 to ask for our advice on dealing with the issue and as a result we have analysed over 50 reputable websites who have received the message and/or a direct penalty causing a reduction in rankings.
One of the unexpected and tragic consequences of this update has been covered by Aaron Wall in an article on GoogleBowling, Negative SEO & Outing:
With all of Google’s warning messages about abnormal links they have built the negative SEO industry in a big way. In some instances those who are not good enough to compete try to harm competitors.
How could Google’s attempt to improve their ability to separate out sheep and goats (and thus reduce their Type 1 and Type 2 errors) turn out so badly? That Wikipedia article we started off with may give a clue.
Google’s Type 3 Error
The Wikipedia article gives the following explanation of the Type 3 error.
In 1957, Allyn W. Kimball, a statistician with the Oak Ridge National Laboratory, proposed a different kind of error to stand beside “the first and second types of error in the theory of testing hypotheses”. Kimball defined this new “error of the third kind” as being “the error committed by giving the right answer to the wrong problem.”
This is really what we are dealing with here. It was Google who suggested in the first place with their PageRank theory that links to a web page could be an indicator of the web page’s authority or relevance. There was no suggestion at that time that too many low value links could have an adverse effect on web page quality and thus whether a web page was a sheep or a goat. Indeed everyone picked up the notion that more links would be better and this seemed to be reflected in the way Google did the maths around the PageRank concept.
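To see why "more links" looked like the winning move, here is a minimal Python sketch of the simplified textbook form of PageRank, computed by repeated iteration. It is an illustration only and not Google's production algorithm. The damping factor of 0.85 is the value commonly quoted for the published formula, and the page names and link graph are hypothetical.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified textbook PageRank.

    links: dict mapping each page to the list of pages it links to.
    Returns a dict mapping each page to its score (scores sum to about 1).
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small baseline score
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score evenly among the pages it links to
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its score over all pages
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank

# Hypothetical small web: "target" has no inbound links at first
before = pagerank({"a": ["b"], "b": ["a"], "target": ["a"]})

# The same web after five low-value pages are created just to link to "target"
spam = {f"spam{i}": ["target"] for i in range(5)}
after = pagerank({"a": ["b"], "b": ["a"], "target": ["a"], **spam})

print(before["target"], after["target"])
```

In this sketch the score of "target" goes up once the five extra pages point at it, even though those pages have no inbound links of their own. Nothing in the simplified formula penalises a link for being low value, which is the incentive described above.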
More Links Are Better
That has certainly been the common understanding up till now. As it happens, while writing this, I received the following e-mail message (as most of us frequently do):
We have a large crew of Professionals who are experts in Quality Link Building, SMM, Search Engine Marketing and Online Business Optimization, which are time consuming duties for you and half of your in-house resource cost.
Our core focus is following industry verticals for:
- Quality Link Building
- Rate Effective Link Building – Google Page rank oriented
- Link Wheeling
- Google Panda Friendly Content Writing
We always adopt the Google friendly ETHICAL LINK building process/White hat technique; also follow the guidelines of Google and major search Engine.
We are looking forward starting a long and healthy business relationship with you. I will really appreciate you please let me know your requirement.
Everyone has been thinking that this was correct and indeed this seems to have created a new industry in India.
More Links Can No Longer Turn Goats Into Sheep
The view that links could be a measure of a web page’s authority or relevance may have been correct when first proposed. However when a dominant search engine company such as Google proclaims the value of links then this causes everyone to create links wherever they can and the whole Internet link graph becomes distorted. The Google PageRank search approach can no longer work in such an online world, even though it is Google who caused this to happen. Google tried to change the world by insisting that links to a web page are part of its quality rating but no one listened.
Google could attempt to clean up this Augean stable of links but the better approach is to acknowledge that there is now very little value in this whole mass of links. By now Google is trying to give the right answer to the wrong problem. By anyone’s definition that is a major Type 3 error on Google’s part.
Credit: image of sheep and goats from 24oranges.nl via Flickr