An inside look at Google's news-ranking algorithm

Jaikumar Vijayan | Feb. 22, 2013
Patent application seeks to refine algorithm for third time since 2003

He pointed to metrics like staff size and audience diversity as examples. Even Google's use of story length is a good metric, Sreenivasan said. At first blush, it would appear that Google is emphasizing quantity over quality, he said. But the reality is that many high-quality media organizations now generate more content than they used to. So using story lengths and word counts is valid, he said.

"It reflects today's journalism reality," Sreenivasan said.

In an article from The Atlantic last September, Google News executives said the site "algorithmically" collects stories from more than 50,000 news sources and attracts more than 1 billion unique users each week.

Many in the media industry, especially in Europe, have bristled at what they view as Google's leeching away of readers and advertising dollars with its Google News site. But few have so far blocked their content from being displayed there, even though Google offers a fairly straightforward way to do so.

Google itself has offered minimal insight into the algorithms it uses to discover and rank news stories. All the company will say publicly is that articles are selected and ranked based on metrics such as how often and on what sites a story appears; freshness of content; location; relevance; and diversity. The company has claimed that it constantly fine-tunes its news ranking to ensure high-quality content is shown.

Last year's application appears to be the latest example of that refining process, offering a rare look at some of the key ingredients Google considers. For instance:

-To determine the quality of a news source, Google could either look at the number of original ("non-duplicate") articles produced or count the number of original sentences produced by the source.
-To determine the importance of coverage, it can consider "story size scores" for all original articles produced by an organization over a week, a month or longer. "As an example, ...if D is an article about the crash of the Columbia Shuttle and there were 500 other distinct articles on the subject, then the story size would be 500."
-To calculate a "breaking news score" for a news operation, Google "may measure the ability of the news source to publish a story soon after an important event has occurred...."
-And to measure an operation's ability to produce high-quality, original work, the number of people mentioned in stories, especially if they're not widely cited elsewhere, could be used. "...This may be an indication that the news source is capable of original reporting."
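
To make the first two metrics concrete, here is a minimal Python sketch of how an original-article count and a story size score could be computed. It is an illustration only, not Google's implementation: the `Article` fields, the function names, the rule that "original" means no other source published the same text, and the choice to average story sizes per source are all assumptions introduced for this example. The sketch also assumes articles have already been grouped into stories, a step the application excerpts quoted here do not describe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Article:
    source: str    # publisher name (hypothetical field)
    story_id: str  # label of the story the article covers; assumed given
    text: str      # body text, used here only to spot exact duplicates


def story_size(article, articles):
    """Count the other distinct articles covering the same story."""
    texts = {a.text for a in articles if a.story_id == article.story_id}
    texts.discard(article.text)  # do not count the article itself
    return len(texts)


def original_texts(source, articles):
    """Texts published by `source` and by no other source (assumed rule)."""
    publishers = {}
    for a in articles:
        publishers.setdefault(a.text, set()).add(a.source)
    return {text for text, who in publishers.items() if who == {source}}


def original_article_count(source, articles):
    """Number of non-duplicate articles produced by `source`."""
    return len(original_texts(source, articles))


def mean_story_size(source, articles):
    """Average story size over a source's original articles.

    Averaging is an assumption; the quoted text does not say how
    per-article scores are combined.
    """
    originals = original_texts(source, articles)
    sizes = [
        story_size(a, articles)
        for a in articles
        if a.source == source and a.text in originals
    ]
    return sum(sizes) / len(sizes) if sizes else 0.0
```

Under these assumptions, an article on a story that 500 other distinct articles also cover gets a story size of 500, matching the Columbia Shuttle example in the application. A production system would need near-duplicate detection rather than exact text matching, plus a time window such as the week or month the application mentions.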

Google also monitors links from search engines to individual articles. "Well-known sites, such as CNN, tend to be preferred to less popular sites, such as Unknown Town News, which users may avoid," the patent application said.

 
