This line of thinking doesn't make much sense with respect to a company as large as Google. The fact that the Google Alerts team has identified and solved a particular problem doesn't really allow us to draw reliable conclusions about what's going on in other parts of the organization.
So the question is: Are they not taking greater action against spammers because it would hurt their bottom line? And, if so, is that evil?
In particular, leading questions like this strike me as premature.
I'm not at all a Google expert, but it seems unlikely to me that the Alert team has figured out a way to filter crap results and the Search team has not.
I'm not a Google expert either -- whatever that would mean -- but I've worked in large software organizations. My point was that the linked article naively conceptualizes Google as a unitary entity that either "knows" something or doesn't.
As to your point, a software engineering organization releases products that are the results of myriad goals and constraints (requirements, priorities, schedules, bugs, etc.). Products aren't merely reflections of what the engineers can or cannot do. So it's likely that the search team can filter crap results? Is that really all that we need to know before we decide that Google search is intentionally designed to profit from spam?
Whoa, I'm not saying that Google is intentionally profiting from spam. I don't think TFA is even saying that. I think the point is that filtering bad results doesn't seem to be a technical hurdle they can't leap, so we're eliminating one reason Google search results are frequently filled with spammy content farm sites.
Google has to maintain the status of a common carrier (somewhat).
If that filter the Alert team came up with triggers a lot of false positives, that would be hurting their business since legitimate websites could get excluded from search results (imagine the backlash).
Now I'm not defending them; I actually think they aren't doing it because they don't know how to handle the situation (profit loss versus angry users).
But I'm also pretty sure Mr. Obie Fernandez wouldn't like to be dropped off the face of the Internet just because he has ties to Jacksonville (yeah, cheap shot done on purpose: we know how filtering works and how hard it is to get right).
You also gotta remember that Google employees are the kind of people who will stand in front of Mike Arrington's car because he's talking on the phone while driving. I don't think it's very likely that people like that would remain silent about directives to give favorable treatment to spammers.
Google does seem to be getting clogged up with crap results, but I can reliably find what I'm looking for within a few searches. Unfortunately, that's more than I can say for most of the alternative search engines right now. I had major frustration yesterday trying to find Ruby/Rails topics after switching to Blekko as my default search engine (even with the slashtags and everything). I gotta say that the /seo slashtag is awesome though.
During our beta we have only a limited crawl/index, 3 billion pages. Very specific technical queries and queries with a lot of words seem to suffer most.
Would you like to be an editor for the /ruby and /rails slashtags?
Plausible deniability: We don't cover our pages with AdSense. Do we direct you to pages with AdSense? Yes, but only when our algorithm determines that the content on those pages is relevant.
Perhaps it is important for them to maintain an aura of neutrality and cold calculation. Though if we are supposing that Google is intentionally crippling its algorithm, would it be so much to suppose that they would just clandestinely run the spam sites themselves to eliminate the middle-man? Though perhaps the cost of secretly doing it themselves is higher than the AdSense payments made to the sites.
This all assumes that Google's reputation remains untarnished by merely linking to spam rather than filling their own site with spam, but I don't think that's quite the case. Google's reputation is intricately tied up with the quality of the results it gives.
I am not suggesting that Google intentionally cripples its algorithms, just that it doesn't rush to "fix" something you or I might consider a problem.
Google's reputation is intricately tied up with the quality of the results it gives.
I think the problem here is the question of whether an AdSense-adorned page with scraped content is judged "low quality" by the people doing the search.
I'm no expert in other domains, but judging by the "I CAN HAZ CODES" programmers out there, if they type a programming question into Google and they get a page with scraped content from StackOverflow, will they care?
My guess is that if the scraped content has the answer, they're happy. They aren't interested in doing more research, looking up the author's SO reputation, or anything else. They get their answer, and maybe they click an AdSense link if it catches their eye.
You're right, it does depend on people's perception of quality.
I can only recall being frustrated with the scraper sites when they don't have the answer I need, because when I go back to the Google results and try another page I find that they have the exact same scraped content. Out of the first ten or twenty results there are only a few unique pieces of content.
Perhaps if Google eliminated redundancy from their results I'd never even notice whether the content was scraped or not.
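For what it's worth, collapsing redundant results doesn't have to be fancy in principle. Here's a minimal sketch, assuming each result is a (url, text) pair and comparing pages by word-shingle Jaccard similarity. The function names, the shingle size, and the 0.8 threshold are all made up for illustration, and this says nothing about what Google actually does.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = text.lower().split()
    if len(words) < k:
        # Very short texts become a single shingle (or nothing if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def collapse_near_duplicates(results, threshold=0.8):
    """Keep the first result of each group of near-identical pages.

    results: list of (url, text) pairs in ranked order (hypothetical format).
    threshold: similarity at or above which a page counts as a repeat.
    """
    kept = []
    kept_shingles = []
    for url, text in results:
        s = shingles(text)
        # Drop the page if it closely matches anything already kept.
        if any(jaccard(s, t) >= threshold for t in kept_shingles):
            continue
        kept.append((url, text))
        kept_shingles.append(s)
    return kept
```

This compares every page against every kept page, so it only makes sense for a single page of results; anything at index scale would need something like hashing-based fingerprints instead.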
I think this may be a big part of the reason why Google employees seem to coyly suggest 'by our numbers these crap results are making people happy' whenever this issue comes up. For a lot of low-literacy users, these results are fine, and as good as they've ever received from Google.
The only benefit I can see to Google is that a lot of those spam sites rely on AdSense to make their money. If it makes sense for a developer to register 100 domains and game some keywords, then they must be making a little money.
Nit: AdSense is profitable, but AdWords is even more profitable. If I'm reading it correctly, AdWords brought in 67% of their 2010 ad revenue vs. 30% for AdSense:
That is incorrect. AdSense is only one way that Google publishes ads. According to their 2009 earnings reports:
"Google Network Revenues - Google's partner sites generated revenues, through AdSense programs, of $2.04 billion, or 31% of total revenues"
That is hardly nearly all of Google's profits. A significant portion, to be sure, but not all.
Yeah, it sounded more tongue-in-cheek in my head than it comes across in the comment. If this is the reason we see spam results, then the benefit must outweigh the negative effect of potentially pissing off users.
My guess is that Google will only roll out improved search results when the cost to the company is great enough to justify the loss of income.
It is a publicly traded company; I don't think Google can just cut out millions of dollars of revenue from their bottom line because they want to not be evil. The shareholders would probably ask for people's heads on platters.
Of course, only Google knows the extent of its income from spam, but I imagine it is significant; otherwise they would have solved that problem already, as the interest in Duck Duck Go, Blekko and Bing/Yahoo continues to rise, or at least gets discussed more. (I don't know whether the ACTUAL usage suggests that people are doing more than just talking and are moving over to different services full time.)
I have had a hard time trying to find an article that was written 2 years ago about the data centers that Google has built around the globe. The scale of each installation is unbelievable, and there are something like 30 of them right now:
http://royal.pingdom.com/2008/04/11/map-of-all-google-data-c...
More than nefarious under-dealings, I think this situation literally snuck up on Google and by the time the publicly-traded company had algorithms to determine the extent of the shenanigans, they realized it would have a noticeable effect on their bottom line if they simply culled all those results out in one day.
They are either going to roll out changes in stages and slowly increase the quality while keeping an eye on what that does to AdSense income, and really publicize each change so they rebuild trust with all of us, or they will respond heavy-handedly in a year or so with a "new algorithm change" that "online publishers are up in arms about!" again.
My guess is on the slow-and-gradual approach with a big publicity boost so we are shown they care and are working on it Matt Cutts-style :)
While I've noticed the lagging quality in their search, I still use the Big-G... it's fast for me and gives me accurate results. Then again, I mostly search for tech; if I were searching for weight loss, health, sex, appliances or any other topic that is DOMINATED by ads, I would have given up and gone back to using a damn phone book a while ago.
With any luck, Bing (who have no monetary incentive not to block AdSense spam, and a bigger marketing budget and reach than DDG/Blekko) will start piling on the pressure by aggressively pruning their index.
Ultimately I'd guess Google's shareholders will be even more worried about lost search market share (particularly if it's early adopter migration) than AdSense revenue dips...
Actually it's not. I could cite hundreds of examples, but here's one: when I wrote a blog post about getting a huge traffic boost for my startup following on some Mark Cuban in PR Newswire, I titled my post "take the elevator, not the stairs", and Google AdSense served ads for Thyssen elevators. That's how smart it is.
The fact that they don't always have it doing the smart thing you'd want them to do in no way lessens the fact that Google has a lot of compute power that they are throwing at a lot of problems.
Reading blackhat SEO forums is enlightening. A lot of these guys operate large setups where they convert a few dollars of AdWords clicks at a time into a few more dollars in AdSense revenue by funneling the traffic around in various ways. Multiply by a couple hundred or a couple thousand servers, and you're talking about some serious cash.
I think I've heard that they prefer to remove things like that algorithmically, rather than writing hard filters for individual sites. Makes sense, because if you can come up with an algorithm to get rid of known crap, it will probably also get rid of some unknown crap too.