Inside Search
The official Google Search blog
Search quality highlights: 65 changes for August and September
October 4, 2012
Our latest installment of search quality highlights is here with 65 changes to report for August and September. As you may recall from our last post, in cases where we don’t have a descriptive name, we are using a unique ID number. August and September were both busy months as we launched new features, expanded the Knowledge Graph globally in English, and worked towards building the search engine of the future.
Here’s the list for August:
#82862.
[project “Page Quality”] This launch helped you find more high-quality content from trusted sources.
#83197.
[project “Autocomplete”] This launch introduced changes in the way we generate query predictions for Autocomplete.
#83818.
[project “Answers”] This change improved display of the movie showtimes feature.
#83819.
[project “Answers”] We improved display of the MLB search feature.
#83820.
[project “Answers”] This change improved display of the finance search feature.
#83384.
[project “Universal Search”] We made improvements to driving directions in Turkish.
#83459.
[project “Alternative Search Methods”] We added support for answers about new stock exchanges for voice queries.
LTS.
[project “Other Ranking Components”] We improved our web ranking to determine what pages are relevant for queries containing locations.
Maru.
[project “SafeSearch”] We updated SafeSearch to improve the handling of adult video content in videos mode for queries that are not looking for adult content.
#83135.
[project “Query Understanding”] This change updated term-proximity scoring.
#83659.
[project “Answers”] We made improvements to display of the local time search feature.
#83105.
[project “Snippets”] We refreshed data used to generate sitelinks.
Imadex.
[project “Freshness”] This change updated handling of stale content and applies a more granular function based on document age.
#83613.
[project “Universal Search”] This change added the ability to show a more appropriately sized video thumbnail on mobile when the user clearly expresses intent for a video.
#83443.
[project “Knowledge Graph”] We added a lists and collections component to the Knowledge Graph.
#83442.
[project “Snippets”] This change improved a signal we use to determine how relevant a possible result title actually is for the page.
#83012.
[project “Knowledge Graph”] The Knowledge Graph displays factual information and refinements related to many types of searches. This launch extended the Knowledge Graph to English-speaking locales beyond the U.S.
#84063.
[project “Answers”] We added better understanding of natural language searches for the calculator feature, focused on currencies and arithmetic.
nearby.
[project “User Context”] We improved the precision and coverage of our system to help you find more relevant local web results. Now we’re better able to identify web results that are local to the user, and rank them appropriately.
essence.
[project “Autocomplete”] This change introduced entity predictions in autocomplete. Now Google will predict not just the string of text you might be looking for, but the actual real-world thing. Clarifying text will appear in the drop-down box to help you disambiguate your search.
#83821.
[project “Answers”] We introduced better natural language parsing for display of the conversions search feature.
#82279.
[project “Other Ranking Components”] We changed to fewer results for some queries to show the most relevant results as quickly as possible.
#82407.
[project “Other Search Features”] For pages that we do not crawl because of robots.txt, we are usually unable to generate a snippet for users to preview what's on the page. This change added a replacement snippet that explains that there's no description available because of robots.txt.
#83709.
[project “Other Ranking Components”] This change was a minor bug fix related to the way links are used in ranking.
#82546.
[project “Indexing”] We made back-end improvements to video indexing to improve the efficiency of our systems.
Palace.
[project “SafeSearch”] This change decreased the amount of adult content that will show up in Image Search mode when SafeSearch is set to strict.
#84010.
[project “Page Quality”] We refreshed data for the "Panda" high-quality sites algorithm.
#84083.
[project “Answers”] This change improved the display of the movie showtimes search feature.
gresshoppe.
[project “Answers”] We updated the display of the flight search feature for searches without a specified destination.
#83670.
[project “Snippets”] We made improvements to surface fewer generic phrases like "comments on" and "logo" in search result titles.
#83777.
[project “Synonyms”] This change made improvements to rely on fewer "low-confidence" synonyms when the user's original query has good results.
#83377.
[project “User Context”] We made improvements to show more relevant local results.
#83484.
[project “Refinements”] This change helped users refine their searches to find information about the right person, particularly when there are many prominent people with the same name.
#82872.
[project “SafeSearch”] In "strict" SafeSearch mode we remove results if they are not very relevant. This behavior previously launched in English, and this change expanded it internationally.
Knowledge Graph Carousel.
[project “Knowledge Graph”] This change expanded the Knowledge Graph carousel feature globally in English.
Sea.
[project “SafeSearch”] This change helped prevent adult content from appearing when SafeSearch is in "strict" mode.
#84259.
[project “Autocomplete”] This change tweaked the display of real-world entities in autocomplete to reduce repetitiveness. With this change, we don't show the entity name (displayed to the right of the dash) when it's fully contained in the query.
TSSPC.
[project “Spelling”] This change used spelling algorithms to improve the relevance of long-tail autocomplete predictions.
#83689.
[project “Page Quality”] This launch helped you find more high-quality content from trusted sources.
#84068.
[project “Answers”] We improved the display of the currency conversion search feature.
#84586.
[project “Other Ranking Components”] This change improved how we rank documents for queries with location terms.
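Change #84259 in the list above describes a simple display rule: leave out the entity name when the query already contains it. The Python sketch below illustrates that rule only. The function name `format_prediction`, the case-insensitive substring test, and the " - " separator are assumptions made for illustration; the post does not say how the check is actually implemented.

```python
def format_prediction(predicted_query: str, entity_name: str) -> str:
    """Return the display text for one autocomplete prediction.

    Hypothetical helper: the entity name is appended after a dash
    unless it is already fully contained in the predicted query.
    """
    # Normalize case and whitespace so "Rio  de Janeiro" matches "rio de janeiro".
    normalized_query = " ".join(predicted_query.lower().split())
    normalized_entity = " ".join(entity_name.lower().split())

    if normalized_entity and normalized_entity in normalized_query:
        # The entity name would only repeat the query, so drop it.
        return predicted_query
    return f"{predicted_query} - {entity_name}"


if __name__ == "__main__":
    print(format_prediction("rio de janeiro", "Rio de Janeiro"))
    print(format_prediction("rio", "2011 film"))
```

A substring test is the simplest reading of "fully contained"; a token-based comparison would be a reasonable alternative if partial-word matches were a concern.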
Here’s the list for September:
Dot.
[project “Autocomplete”] We improved cursor-aware predictions in Chinese, Japanese and Korean languages. Suppose you're searching for "restaurants" and then decide you want "Italian restaurants." With cursor-aware predictions, once you put your cursor back to the beginning of the search box and start typing "I," the prediction system will make predictions for "Italian," not completions of "Irestaurants."
#84288.
[project “Autocomplete”] This change made improvements to show more fresh predictions in autocomplete for Korean.
trafficmaps.
[project “Universal Search”] With this change, we began showing a traffic map for queries like "traffic from A to B" or "traffic between A and B."
#84394.
[project “Page Quality”] This launch helped you find more high-quality content from trusted sources.
#84652.
[project “Snippets”] We currently generate titles for PDFs (and other non-html docs) when converting the documents to HTML. These auto-generated titles are usually good, but this change made them better by looking at other signals.
#83761.
[project “Freshness”] This change helped you find the latest content from a given site when two or more documents from the same domain are relevant for a given search query.
#83406.
[project “Query Understanding”] We improved our ability to show relevant Universal Search results by better understanding when a search has strong image intent, local intent, video intent, etc.
espd.
[project “Autocomplete”] This change provided entities in autocomplete that are more likely to be relevant to the user's country. See blog post for background.
#83391.
[project “Answers”] This change internationalized and improved the precision of the symptoms search feature.
#82876.
[project “Autocomplete”] We updated autocomplete predictions when predicted queries share the same last word.
#83304.
[project “Knowledge Graph”] This change updated signals that determine when to show summaries of topics in the right-hand panel.
#84211.
[project “Snippets”] This launch led to better snippet titles.
#81360.
[project “Translation and Internationalization”] With this launch, we began showing local URLs to users instead of general homepages where applicable (e.g. blogspot.ch instead of blogspot.com for users in Switzerland). That’s relevant, for example, for global companies where the product pages are the same, but the links for finding the nearest store are country-dependent.
#81999.
[project “Translation and Internationalization”] We revamped code for understanding which documents are relevant for particular regions and languages automatically (if not annotated by the webmaster).
Cobra.
[project “SafeSearch”] We updated SafeSearch algorithms to better detect adult content.
#937372.
[project “Other Search Features”] The translate search tool is available through the link "Translated foreign pages" in the sidebar of the search result page. In addition, when we guess that a non-English search query would have better results from English documents, we'll show a feature at the bottom of the search results page to suggest users try the translate search tool. This change improved the relevance of when we show the suggestion.
#84460.
[project “Snippets”] This change helped to better identify important phrases on a given webpage.
#80435.
[project “Autocomplete”] This change improved autocomplete predictions based on the user's Web History (for signed-in users).
#83901.
[project “Synonyms”] This change improved the use of synonyms for search terms to more often return results that are relevant to the user's intention.
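The cursor-aware predictions described in the “Dot” entry above can be illustrated with a short Python sketch. It completes only the fragment immediately before the cursor and keeps the text after the cursor unchanged. The function name, the tiny vocabulary, and the prefix-matching strategy are all hypothetical; this is not a description of the production prediction system.

```python
from typing import List


def cursor_aware_predictions(text: str, cursor: int, vocabulary: List[str]) -> List[str]:
    """Complete the word fragment that ends at the cursor position.

    Hypothetical sketch: `vocabulary` stands in for whatever source of
    candidate completions a real system would use.
    """
    before, after = text[:cursor], text[cursor:]
    # Split off the partial word being typed from any earlier words.
    head, _, fragment = before.rpartition(" ")
    if not fragment:
        return []

    predictions = []
    for word in vocabulary:
        if word.startswith(fragment.lower()):
            # Rebuild the query: earlier words, completed word, text after the cursor.
            parts = [head.strip(), word, after.strip()]
            predictions.append(" ".join(p for p in parts if p))
    return predictions


if __name__ == "__main__":
    words = ["indian", "italian", "japanese"]
    # The user typed "I" at the start of the existing query "restaurants".
    print(cursor_aware_predictions("Irestaurants", 1, words))
    # A cursor-unaware system would try to complete the whole string instead.
    print(cursor_aware_predictions("Irestaurants", len("Irestaurants"), words))
```

Passing the cursor position explicitly is the key design choice: the same text yields different predictions depending on where the user is typing.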
And here are a few other changes we’ve blogged about since last time:
An update to our search algorithms
Voice Search arrives in 13 new languages
Structured Data Testing Tool
Insights into what the world is searching for -- the new Google Trends
Travelers can now access Flight Search from their tablets
Posted by Pandu Nayak, Member of Technical Staff
Search quality highlights: 86 changes for June and July
August 10, 2012
We’re back with the latest in our series of search quality highlights. We have a couple months to make up for, so this list is a doozy with 86 changes. You might notice we’ve made one subtle tweak as compared with prior blog posts. We’re no longer using separate descriptive names and codenames (we’re just listing one or the other). Many times the descriptive names really just repeated the information in the short description. Names are sometimes useful for providing a unique identifier for a given change, so in cases where we don’t have a name, we’re adding an ID number.
Here’s the list for June:
uefa-euro1.
[project codename “Answers”] Addition of a live result showing schedule and scores of the EURO 2012 games (European championship of national soccer teams).
#82293.
[project codename “Answers”] Improved dictionary search feature by adding support for more natural language searches.
Better HTML5 resource caching for mobile.
[project codename “Mobile”] We’ve improved caching of different components of the search results page, dramatically reducing latency in a number of cases.
ng2.
[project codename “Other Ranking Components”] Better ordering of top results using a new and improved ranking function for combining several key ranking features.
Ref-16.
[project codename “Other Ranking Components”] Changes to an "official pages" algorithm to improve internationalization.
Bamse.
[project codename “Page Quality”] This launch helps you find more high-quality content from trusted sources.
Bamse-17L.
[project codename “Page Quality”] This launch helps you find more high-quality content from trusted sources.
GreenLandII.
[project codename “Page Quality”] We've incorporated new data into the Panda algorithm to better detect high-quality sites and pages.
#82353.
[project codename “Page Quality”] This change refreshes data for the Panda high-quality sites algorithm.
SuperQ2.
[project codename “Image”] We've updated a signal for Google Images to help return more on-topic image search results.
#82743.
[project codename “Answers”] Changes to the calculator feature to improve recognition of queries containing "and," such as [4 times 3 and a half].
komodo.
[project codename “Query Understanding”] Data refresh for system used to better understand and search for long-tail queries.
#82580.
[project codename “Answers”] This is an improvement for showing the sunrise and sunset times search feature.
PitCode.
[project codename “Answers”] This launch adds live results for Nascar, MotoGP, and IndyCar. This is in addition to Formula1 results, which were already available.
timeob.
[project codename “Answers”] We've improved natural language detection for the time feature to better understand questions like, "What time is it in India?"
#81933.
[project codename “Synonyms”] This launch improves use of query synonyms in ranking. Now we're less likely to show documents where the synonym has a different meaning than the original search term.
#82496.
[project codename “Answers”] Changes made to the movie showtimes feature on mobile to improve recognition of natural language queries and overall coverage.
#82367.
[project codename “Other Ranking Components”] This launch helps you find more high-quality content from trusted sources.
#82699.
[project codename “Other Search Features”] We've made it easier to quickly compare places. Now you can hover over a local result and see information about that place on the right-hand side.
CapAndGown.
[project codename “Image”] On many webpages, the most important images are closely related to the overall subject matter of the page. This project helps you find these salient images more often.
#82769.
[project codename “Answers”] Improvements to the calculator feature on mobile to improve handling of queries that contain both words and numbers such as [4 times 3 divided by 2].
Vuvuzela.
[project codename “SafeSearch”] We've updated SafeSearch to unify the handling of adult video content in videos mode and in the main search results. Explicit video thumbnails are now filtered more consistently.
#82537.
[project codename “Answers”] We've enabled natural language detection for the currency conversion feature to better understand questions like, "What is $500 in euros?"
#82519.
[project codename “Answers”] We've enabled natural language detection for the flight status feature to better understand questions about flight arrival times and status.
#82879.
[project codename “Answers”] We've improved the triggering for the "when is" feature and understanding of queries like, "When is Mother's Day?"
wobnl0330.
[project codename “Answers”] Improvements to display of the weather search feature.
Lime.
[project codename “Freshness”] This change improves the interaction between various search components to improve search results for searches looking for fresh content.
gas station.
[project codename “Snippets”] This change removes the boilerplate text in sitelinks titles, keeping only the information useful to the user.
#81776.
[project codename “Answers”] We've improved natural language detection for the unit conversion feature to better understand questions like, "What is 5 miles in kilometers?"
#81439.
[project codename “Answers”] Improved display of the finance feature for voice search queries on mobile.
#82666.
[project codename “Page Quality”] This launch helps you find more high-quality content from trusted sources.
#82541.
[project codename “Other Ranking Components”] This is one of multiple projects that we're working on to make our system for clustering web results better and simpler.
gaupe.
[project codename “Universal Search”] Improves display of the flights search feature. Now, this result shows for queries with destinations outside the US, such as [flights from Austin to London].
#82887.
[project codename “Answers”] We've improved natural language processing for the dictionary search feature.
gallium-2.
[project codename “Synonyms”] This change improves synonyms inside concepts.
zinc-4.
[project codename “Synonyms”] This change improves efficiency by not computing synonyms in certain cases.
Manzana2.
[project codename “Snippets”] This launch improves clustering and ranking of links in the expanded sitelinks feature.
#82921.
[project codename “Alternative Search Methods”] We've improved finance results to better understand finance-seeking queries spoken on mobile.
#82936.
[project codename “Answers”] Improved display of the weather search feature, so you can ask [weather in california] or [is it hot in italy].
#82935.
[project codename “Answers”] We've improved natural language detection for the sunrise/sunset feature.
#82460.
[project codename “Snippets”] With this change we're using synonyms to better generate accurate titles for web results.
#82953.
[project codename “Answers”] This change improves detection of queries about weather.
PandaMay.
[project codename “Search Quality”] We launched a data refresh for our Panda high-quality sites algorithm.
ItsyBitsy.
[project codename “Images”] To improve the quality of image results, we filter tiny, unhelpful images at the bottom of our image results pages.
localtimeob.
[project codename “Answers”] We've improved display of the local time search feature.
#82536.
[project codename “Answers”] We've improved natural language detection to better understand queries about baseball and return the latest baseball information about MLB, such as schedules and the latest scores.
Improvements to Images Universal ranking.
[project codename “Universal Search”] We significantly improved our ability to show Images Universal on infrequently searched-for queries.
absum3.
[project codename “Snippets”] This launch helps us select better titles to display in the search results. This is a change to our algorithm that will specifically improve the titles for pages that are in non-Latin based languages.
#83051.
[project codename “Answers”] We've improved display of local business information in certain mobile use cases. In particular, we'll highlight information relevant to the search, including phone numbers, addresses, hours and more.
calc2-random.
[project codename “Answers”] This change improves our understanding of calculator-seeking queries.
#82961.
[project codename “Alternative Search Methods”] When you search for directions to or from a location on your mobile device without specifying the start point, we'll return results starting from your current position.
#82984.
[project codename “Universal Search”] This was previously available for users searching on google.com in English, and now it's available for all users searching in English on any domain.
#82150.
[project codename “Spelling”] Refresh of our algorithms for spelling systems in eight languages.
NoPathsForClustering.
[project codename “Other Ranking Components”] We've made our algorithm for clustering web results from the same site or same path (same URL up until the last slash) more consistent. This is one of multiple projects that we're working on to make our clustering system better and simpler.
Hamel.
[project codename “Page Quality”] This change updates a model we use to help you find high-quality pages with unique content.
#81977.
[project codename “Synonyms”] This change updates our synonyms systems to make it less likely we'll return adult content when users aren't looking for it.
Homeland.
[project codename “Autocomplete”] This is an improvement to autocomplete that will help users to get predicted queries that are more relevant to their local country.
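The NoPathsForClustering entry above defines "same path" as the same URL up until the last slash. The Python sketch below shows one way to compute that key and group URLs by it. The helper names `path_key` and `cluster_by_path` are hypothetical, and the grouping is only an illustration of the definition, not the clustering algorithm the post refers to.

```python
from collections import defaultdict
from typing import Dict, List, Tuple
from urllib.parse import urlsplit


def path_key(url: str) -> Tuple[str, str]:
    """Return (host, path up to and including the last slash)."""
    parts = urlsplit(url)
    # Drop everything after the last slash; an empty path becomes "/".
    directory = parts.path.rsplit("/", 1)[0] + "/"
    return parts.netloc.lower(), directory


def cluster_by_path(urls: List[str]) -> Dict[Tuple[str, str], List[str]]:
    """Group URLs that share a host and a path prefix."""
    clusters: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for url in urls:
        clusters[path_key(url)].append(url)
    return dict(clusters)


if __name__ == "__main__":
    results = [
        "http://example.com/docs/intro.html",
        "http://example.com/docs/setup.html",
        "http://example.com/blog/post-1",
    ]
    for key, members in cluster_by_path(results).items():
        print(key, members)
```

Grouping on the first element of the key alone would correspond to clustering by site rather than by path.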
Here’s the list for July:
#82948.
[project codename “Other Search Features”] We've improved our natural language processing to improve display of our movie showtimes feature.
yoyo.
[project codename “Snippets”] This change leads to more useful text in sitelinks.
popcorn.
[project codename “Snippets”] We've made a minor update to our algorithm that detects if a page is an "article." This change facilitates better snippets.
Golden Eagle.
[project codename “Autocomplete”] When Google Instant is turned off, we'll sometimes show a direct link to a site in the autocomplete predictions. With this change we refreshed the data for those predictions.
#82301.
[project codename “Indexing”] This change improves an aspect of our serving systems to save capacity and improve latency.
#82392.
[project codename “Indexing”] This launch improves the efficiency of the Book Search ranking algorithms, making them more consistent with Web Search.
Challenger.
[project codename “Snippets”] This is another change that will help get rid of generic boilerplate text in Web results' titles, particularly for sitelinks.
#83166.
[project codename “Universal Search”] This change is a major update to Google Maps data for the following regions: CZ, GR, HR, IE, IT, VA, SM, MO, PT, SG, LS. This new data will appear in maps universal results.
#82515.
[project codename “Translation and Internationalization”] This change improves the detection of queries that would benefit from translated results.
bergen.
[project codename “Other Ranking Components”] This is one of multiple projects that we're working on to make our system for clustering web results better and simpler.
Panda JK.
[project codename “Page Quality”] We launched Panda on google.co.jp and google.co.kr to promote more high-quality sites for users in Japan and Korea.
rrfix4.
[project codename “Freshness”] This is a bug fix to a freshness algorithm. This change turns off a freshness algorithm component in certain cases when it should not be affecting the results.
eventhuh4.
[project codename “Knowledge Graph”] We'll show a list of upcoming events in the Knowledge Graph for city-related searches such as [san francisco] and [events in san francisco].
#83483.
[project codename “Universal Search”] This change helps surface navigation directions directly in search results for more queries.
Zivango.
[project codename “Refinements”] This change leads to more diverse search refinements.
#80568.
[project codename “Snippets”] This change improves our algorithm for generating site hierarchies for display in search result snippets.
Labradoodle.
[project codename “SafeSearch”] We've updated SafeSearch algorithms to better detect adult content.
JnBamboo.
[project codename “Page Quality”] We’ve updated data for our Panda high-quality sites algorithm.
#83242.
[project codename “Universal Search”] This change improves news universal display by using entities from the Knowledge Graph.
#75921.
[project codename “Autocomplete”] For some time we've shown personalized predictions in Autocomplete for users who've enabled Web History on google.com in English. With this change, we're internationalizing the feature.
#83301.
[project codename “Answers”] Similar to the live results we provide for sports like baseball or European football, you can now search on Google and find rich, detailed information about the latest schedule, medal counts, events, and record-breaking moments for the world's largest sporting spectacle.
#83432.
[project codename “Autocomplete”] This change helps users find more fresh trending queries in Japanese as part of autocomplete.
And here are some changes we’ve shared elsewhere:
Flight Search for Canada
Updated Hot Searches list
Update to Search by Image
Interactive Weather Visualization Now on Tablet
More detailed maps in parts of Europe, Africa and Asia
Google with Handwrite for Mobile and Tablet Search
Structured Data Dashboard
Posted by Scott Huffman, Engineering Director
An update to our search algorithms
August 10, 2012
We aim to provide a great experience for our users and have developed over 200 signals to ensure our search algorithms deliver the best possible results. Starting next week, we will begin taking into account a new signal in our rankings: the number of valid copyright removal notices we receive for any given site. Sites with high numbers of removal notices may appear lower in our results. This ranking change should help users find legitimate, quality sources of content more easily, whether it’s a song previewed on NPR’s music website, a TV show on Hulu or new music streamed from Spotify.
Since we re-booted our copyright removals over two years ago, we’ve been given much more data by copyright owners about infringing content online. In fact, we’re now receiving and processing more copyright removal notices every day than we did in all of 2009: more than 4.3 million URLs in the last 30 days alone. We will now be using this data as a signal in our search rankings.
Only copyright holders know if something is authorized, and only courts can decide if a copyright has been infringed; Google cannot determine whether a particular webpage does or does not violate copyright law. So while this new signal will influence the ranking of some search results, we won’t be removing any pages from search results unless we receive a valid copyright removal notice from the rights owner. And we’ll continue to provide "counter-notice" tools so that those who believe their content has been wrongly removed can get it reinstated. We’ll also continue to be transparent about copyright removals.
Posted by Amit Singhal, SVP, Engineering
Search quality highlights: 39 changes for May
June 7, 2012
May is often a big month for us in Search, and 2012 has been no exception. This month we had exciting announcements including the Knowledge Graph, better search for users in mainland China, and an updated Search App for iPhone. We also released new sports features, deeper detection of hacked pages, and much more.
Here’s the list for May:
Deeper detection of hacked pages.
[launch codename "GPGB", project codename "Page Quality"] For some time now Google has been detecting defaced content on hacked pages and presenting a notice on search results reading, “This site may be compromised.” In the past, this algorithm has focused exclusively on homepages, but now we’ve noticed hacking incidents are growing more common on deeper pages on particular sites, so we’re expanding to these deeper pages.
Autocomplete predictions used as refinements.
[launch codename "Alaska", project codename “Refinements”] When a user types a search she’ll see a number of predictions beneath the search box. After she hits “Enter”, the results page may also include related searches or "refinements". With this change, we’re beginning to include some especially useful predictions as “Related searches” on the results page.
More predictions for Japanese users.
[project codename "Autocomplete"] Our usability testing suggests that Japanese users prefer more autocomplete predictions than users in other locales. Because of this, we’ve expanded the number of predictions shown in Japan to as many as eight (when Instant is on).
Improvements to autocomplete on Mobile.
[launch codename "Lookahead", project codename "Mobile"] We made an improvement to make predictions work faster on mobile networks through more aggressive caching.
Fewer arbitrary predictions.
[launch codename "Axis5", project codename "Autocomplete"] This launch makes it less likely you’ll see low-quality predictions in autocomplete.
Improved IME in autocomplete.
[launch codename "ime9", project codename "Translation and Internationalization"] This change improves handling of input method editors (IMEs) in autocomplete, including support for caps lock and better handling of inputs based on user language.
New segmenters for Asian languages.
[launch codename "BeautifulMind"] Speech segmentation is about finding the boundaries between words or parts of words. We updated the segmenters for three Asian languages (Chinese, Japanese, and Korean) to better understand the meaning of text in these languages. We’ll continue to update and improve our algorithm for segmentation.
Scoring and infrastructure improvements for Google Books pages in Universal Search.
[launch codename “Utgo”, project codename “Indexing”] This launch transitions the billions of pages of scanned books to a unified serving and scoring infrastructure with web search. This is an efficiency, comprehensiveness and quality change that provides significant savings in CPU usage while improving the quality of search results.
Unified Soccer feature.
[project codename "Answers"] This change unifies the soccer search feature experience across leagues in Spain, England, Germany and Italy, providing scores and scheduling information right on the search result page.
Improvements to NBA search feature.
[project codename "Answers"] This launch makes it so we’ll more often return relevant NBA scores and information right at the top of your search results. Try searching for [nba playoffs] or [heat games].
New Golf search feature.
[project codename "Answers"] This change introduces a new search feature for the Professional Golfers' Association (PGA) and PGA Tour, including information about tour matches and golfers. Try searching for [tiger woods] or [2012 pga schedule].
Improvements to ranking for news results.
[project codename "News"] This change improves signals we use to rank news content in our main search results. In particular, this change helps you discover news content more quickly than before.
Better application of inorganic backlinks signals.
[launch codename "improv-fix", project codename "Page Quality"] We have algorithms in place designed to detect a variety of link schemes, a common spam technique. This change ensures we’re using those signals appropriately in the rest of our ranking.
Improvements to Penguin.
[launch codename "twref2", project codename "Page Quality"] This month we rolled out a couple of minor tweaks to improve signals and refresh the data used by the Penguin algorithm.
Trigger alt title when HTML title is truncated.
[launch codename "tomwaits", project codename "Snippets"] We have algorithms designed to present the best possible result titles. This change will show a more succinct title for results where the current title is so long that it gets truncated. We’ll only do this when the new, shorter title is just as accurate as the old one.
Efficiency improvements in alternative title generation.
[launch codename "TopOfTheRock", project codename "Snippets"] With this change we’ve improved the efficiency of title generation systems, leading to significant savings in CPU usage and a more focused set of titles actually shown in search results.
Better demotion of boilerplate anchors in alternate title generation.
[launch codename "otisredding", project codename "Snippets"] When presenting titles in search results, we want to avoid boilerplate copy that doesn’t describe the page accurately, such as “Go Back.” This change helps improve titles by avoiding these less useful bits of text.
Internationalizing music rich snippets.
[launch codename "the kids are disco dancing", project codename "Snippets"] Music rich snippets enable webmasters to mark up their pages so users can more easily discover pages in the search results where they can listen to or preview songs. The feature launched originally on google.com, but this month we enabled music rich snippets for the rest of the world.
Music rich snippets on mobile.
[project codename "Snippets"] With this change we’ve turned on music rich snippets for mobile devices, making it easier for users to find songs and albums when they’re on the go.
Improvement to SafeSearch goes international.
[launch codename "GentleWorld", project codename "SafeSearch"] This change internationalizes an algorithm designed to handle results on the borderline between adult and general content.
Simplification of term-scoring algorithms.
[launch codename "ROLL", project codename "Query Understanding"] This change simplifies some of our code at a minimal cost in quality. This is part of a larger effort to improve code readability.
Fading results to white for Google Instant.
[project codename "Google Instant"] We made a minor user experience improvement to Google Instant. With this change, we introduced a subtle fade animation when going from a page with results to a page without.
Better detection of major new events.
[project codename "Freshness"] This change helps ensure that Google can return fresh web results in realtime, seconds after a major event occurs.
Smoother ranking functions for freshness.
[launch codename "flsp", project codename "Freshness"] This change replaces a number of thresholds used for identifying fresh documents with more continuous functions.
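To illustrate the difference between threshold-based and continuous scoring, here is a minimal sketch. The cutoffs, the half-life, and both function names are hypothetical and chosen only for illustration; the post does not describe the actual functions used.

```python
def step_freshness_boost(age_hours: float) -> float:
    """Threshold-style scoring: the boost jumps at fixed age cutoffs.

    The 24-hour and one-week cutoffs are illustrative assumptions.
    """
    if age_hours < 24:
        return 1.0
    if age_hours < 168:  # one week
        return 0.5
    return 0.0


def smooth_freshness_boost(age_hours: float, half_life_hours: float = 48.0) -> float:
    """Continuous scoring: the boost halves every `half_life_hours` (assumed value)."""
    return 0.5 ** (age_hours / half_life_hours)


if __name__ == "__main__":
    # Compare documents just before and just after each cutoff.
    for age in (23, 25, 167, 169):
        print(age, step_freshness_boost(age), round(smooth_freshness_boost(age), 3))
```

With the step function, a document that is 23 hours old and one that is 25 hours old get very different boosts. With the continuous function their boosts are nearly the same.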
Better detection of searches looking for fresh content.
[launch codename "Pineapples", project codename "Freshness"] This change introduces a brand new classifier to help detect searches that are likely looking for fresh content.
Freshness algorithm simplifications.
[launch codename "febofu", project codename "Freshness"] This month we rolled out a simplification to our freshness algorithms, which will make it easier to understand bugs and tune signals.
Updates to +Pages in right-hand panel.
[project codename “Social Search”] We improved our signals for identifying relevant +Pages to show in the right-hand panel.
Performance optimizations in our ranking algorithm.
[launch codename "DropSmallCFeature"] This launch significantly improves the efficiency of our scoring infrastructure with minimal impact on the quality of our results.
Simpler logic for serving results from diverse domains.
[launch codename "hc1", project codename "Other Ranking Components"] We have algorithms to help return a diverse set of domains when relevant to the user query. This change simplifies the logic behind those algorithms.
Precise location option on tablet.
[project codename “Mobile”] For a while you've had the option to choose to get personalized search results relevant to your more precise location on mobile. This month we expanded that choice to tablet. You’ll see the link at the bottom of the homepage and a button above local search results.
Improvements to local search on tablet.
[project codename “Mobile”] Similar to the
changes we released
on mobile this month, we also improved local search on tablet. Now you can more easily expand a local result to see more details about the place. After tapping the reviews link in local results, you’ll find details such as a map, reviews, menu links, reservation links, open hours and more.
Internationalization of “recent” search feature on mobile.
[project codename "Mobile"] This month we expanded the
“recent” search feature
on mobile to new languages and regions.
Other changes we’ve blogged about since last time:
The Knowledge Graph
Better search in mainland China
Notification of DNSChanger Malware
Google+ Local
Improvements to mobile local search
Google Maps for mobile 6.7
Updated Search App for iPhone
Posted by Scott Huffman, Engineering Director
Better search in mainland China
May 31, 2012
Over the past couple years, we’ve had a lot of feedback that Google Search from mainland China can be inconsistent and unreliable. It depends on the search query and browser, but users are regularly getting error messages like “This webpage is not available” or “The connection was reset.” And when that happens, people typically cannot use Google again for a minute or more. This video shows what’s happening:
We’ve taken a long, hard look at our systems and have not found any problems. However, after digging into user reports, we’ve noticed that these interruptions are closely correlated with searches for a particular subset of queries.
So starting today we’ll notify users in mainland China when they enter a keyword that may cause connection issues. By prompting people to revise their queries, we hope to reduce these disruptions and improve our user experience from mainland China. Of course, if users want to press ahead with their original queries they can carry on.
In order to figure out which keywords are causing problems, a team of engineers in the U.S. reviewed the 350,000 most popular search queries in China. In their research, they looked at multiple signals to identify the disruptive queries, and from there they identified specific terms at the root of the issue.
We’ve observed that many of the terms triggering error messages are simple everyday Chinese characters, which can have different meanings in different contexts. For example, a search for the single character [江] (Jiāng, a common surname that also means “river”) causes a problem on its own, but 江 is also part of other common searches like [丽江] (Lijiang, the name of a city in Yunnan Province), [锦江之星] (the Jinjiang Star hotel chain), and [江苏移动] (Jiangsu Mobile, a mobile phone service). Likewise, searching for [周] (Zhōu, another common surname that also means “week”) triggers an error message, so including this character in other searches, like [周杰伦] (Jay Chou, the Taiwanese pop star), [周星驰] (Stephen Chow, a popular comedian from Hong Kong), or any publication that includes the word “week”, would also be problematic.
Now, when a user types in a common term like [长江] (Yangtze River) from China, Google highlights the problem term [江] as they type, and when they press “enter” a drop-down menu appears beneath the search box:
Notices will appear matching the user’s language settings.
To learn more, users can click on the “interruption” link, which takes them to this
help center article
. They can continue with their original query (which will likely lead to an error message), or click “Edit search terms,” which will remove the highlighted characters and prompt users to try other search terms:
In order to avoid connection problems, users can refine their searches without the problem keywords. For example, instead of searching for [长江], they could search for [changjiang], which also means Yangtze River but is written using pinyin, the system used to transliterate Chinese characters into Latin script. This won’t cause a timeout, but will still generate search results related to the Yangtze River.
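The highlighting and “Edit search terms” behavior described above can be sketched roughly as follows. This is only an illustration: the flagged set holds just the two example characters mentioned in this post, matching by single character is an assumption, and the function names are hypothetical.

```python
# Only the two example characters from this post; the real list is not given here.
FLAGGED = {"江", "周"}


def flagged_positions(query, flagged=FLAGGED):
    """Return the indexes of characters that would be highlighted in the query."""
    return [i for i, ch in enumerate(query) if ch in flagged]


def strip_flagged(query, flagged=FLAGGED):
    """Remove highlighted characters, as an 'Edit search terms' action might."""
    return "".join(ch for ch in query if ch not in flagged)


if __name__ == "__main__":
    query = "长江"
    print(flagged_positions(query))  # positions to highlight
    print(strip_flagged(query))      # text left for the user to revise
```

A pinyin query such as [changjiang] contains none of the flagged characters, so nothing would be highlighted.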
We’ve said before that we want as many people in the world as possible to have access to our services. Our hope is that these written notifications will help improve the search experience in mainland China. If you’re outside China and are curious to see what the notifications look like, you can visit
this link
to try it out.
Posted by Alan Eustace, Senior Vice President, Knowledge
Note: To read this blog post in Chinese, see
this PDF
.
Search quality highlights: 52 changes for April
May 4, 2012
Update
6 May, 9:50am: We accidentally had one change included twice, "No freshness boost for low-quality content." We've removed the duplicate entry and updated the number of total launches from 53+ to 52+.
- Ed.
We’ve had a
zerg rush
of 52+ launches this month in search. One of the big changes for me was our latest algorithm improvement to help you find
more high-quality sites
. But, that’s not all we’ve been up to. As you may recall, a couple months back we shared
uncut video
discussion of a spelling related change, and now that’s launched as well (see “More spell corrections for long queries”). Other highlights include changes in indexing, spelling, sitelinks, sports scores features and more. We even experimented with a couple more radical features, such as Really Advanced Search and Weather Control, but ultimately decided they were a little too
foolish
.
Here’s the
real
list for April:
Categorize paginated documents.
[launch codename "Xirtam3", project codename "CategorizePaginatedDocuments"] Sometimes, search results can be dominated by
documents from a paginated series
. This change helps surface more diverse results in such cases.
More language-relevant navigational results.
[launch codename "Raquel"] For navigational searches when the user types in a web address, such as [bol.com], we generally try to rank that web address at the top. However, this isn’t always the best answer. For example, bol.com is a Dutch page, but many users are actually searching in Portuguese and are looking for the Brazilian email service, http://www.bol.uol.com.br/. This change takes into account language to help return the most relevant navigational results.
Country identification for webpages.
[launch codename "sudoku"] Location is an important signal we use to surface content more relevant to a particular country. For a while we’ve had systems designed to detect when a website, subdomain, or directory is relevant to a set of countries. This change extends the granularity of those systems to the page level for sites that host user generated content, meaning that some pages on a particular site can be considered relevant to France, while others might be considered relevant to Spain.
Anchors bug fix.
[launch codename "Organochloride", project codename "Anchors"] This change fixed a bug related to our handling of anchors.
More domain diversity.
[launch codename "Horde", project codename "Domain Crowding"] Sometimes search returns too many results from the same domain. This change helps surface content from a more diverse set of domains.
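One simple way to picture domain diversity is a per-domain cap applied to an already ranked list. The sketch below is an assumption-laden illustration, not the algorithm described here: the cap of two, the choice to defer rather than drop extra results, and the use of the URL host as the “domain” are all hypothetical.

```python
from urllib.parse import urlparse


def diversify(urls, max_per_domain=2):
    """Keep ranked order, but move results beyond the per-domain cap to the end.

    The host name stands in for the domain, which is a simplification.
    """
    counts = {}
    kept, deferred = [], []
    for url in urls:
        host = urlparse(url).netloc.lower()
        counts[host] = counts.get(host, 0) + 1
        if counts[host] <= max_per_domain:
            kept.append(url)
        else:
            deferred.append(url)
    return kept + deferred


if __name__ == "__main__":
    ranked = [
        "http://a.example/1",
        "http://a.example/2",
        "http://a.example/3",
        "http://b.example/1",
    ]
    print(diversify(ranked))
```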
More local sites from organizations.
[project codename "ImpOrgMap2"] This change makes it more likely you’ll find an organization website from your country (e.g. mexico.cnn.com for Mexico rather than cnn.com).
Improvements to local navigational searches.
[launch codename "onebar-l"] For searches that include location terms, e.g. [
dunston mint seattle
] or [
Vaso Azzurro Restaurant 94043
], we are more likely to rank the local navigational homepages in the top position, even in cases where the navigational page does not mention the location.
Improvements to how search terms are scored in ranking.
[launch codename "Bi02sw41"] One of the most fundamental signals used in search is whether and how your search terms appear on the pages you’re searching. This change improves the way those terms are scored.
Disable salience in snippets.
[launch codename "DSS", project codename "Snippets"] This change updates our system for generating snippets to keep it consistent with other infrastructure improvements. It also simplifies and increases consistency in the snippet generation process.
More text from the beginning of the page in snippets.
[launch codename "solar", project codename "Snippets"] This change makes it more likely we’ll show text from the beginning of a page in snippets when that text is particularly relevant.
Smoother ranking changes for fresh results.
[launch codename "sep", project codename "Freshness"] We want to help you find the freshest results, particularly for searches with important new web content, such as breaking news topics. We try to promote content that appears to be fresh. This change applies a more granular classifier, leading to more nuanced changes in ranking based on freshness.
Improvement in a freshness signal.
[launch codename "citron", project codename "Freshness"] This change is a minor improvement to one of the freshness signals which helps to better identify fresh documents.
No freshness boost for low-quality content.
[launch codename “NoRot”, project codename “Freshness”] We have modified a classifier we use to promote fresh content to exclude fresh content identified as particularly low-quality.
Tweak to trigger behavior for Instant Previews.
This change narrows the trigger area for
Instant Previews
so that you won’t see a preview until you hover and pause over the icon to the right of each search result. In the past the feature would trigger if you moused into a larger button area.
Sunrise and sunset search feature internationalization.
[project codename "sunrise-i18n"] We’ve internationalized the
sunrise and sunset
search feature to 33 new languages, so now you can more easily plan an evening jog before dusk or set your alarm clock to watch the sunrise with a friend.
Improvements to currency conversion search feature in Turkish.
[launch codename "kur", project codename "kur"] We launched improvements to the currency conversion search feature in Turkish. Try searching for [
dolar kuru
], [
euro ne kadar
], or [
avro kaç para
].
Improvements to news clustering for Serbian.
[launch codename "serbian-5"] For news results, we generally try to cluster articles about the same story into groups. This change improves clustering in Serbian by better grouping articles written in Cyrillic and Latin. We also improved our use of “stemming” -- a technique that relies on the “
stem
” or root of a word.
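As a rough illustration of grouping Cyrillic and Latin text together, the sketch below maps Serbian Cyrillic letters to their Latin equivalents before comparing titles. Grouping by exact normalized title is a deliberate simplification (real clustering groups by story, and stemming is not shown), and the function names are hypothetical.

```python
from collections import defaultdict

# Standard Serbian Cyrillic to Latin letter correspondences (lowercase).
CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ", "е": "e",
    "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k", "л": "l", "љ": "lj",
    "м": "m", "н": "n", "њ": "nj", "о": "o", "п": "p", "р": "r", "с": "s",
    "т": "t", "ћ": "ć", "у": "u", "ф": "f", "х": "h", "ц": "c", "ч": "č",
    "џ": "dž", "ш": "š",
}


def to_latin(text):
    """Lowercase the text and transliterate Serbian Cyrillic letters to Latin."""
    return "".join(CYR_TO_LAT.get(ch, ch) for ch in text.lower())


def group_titles(titles):
    """Group titles that are identical once both scripts are normalized to Latin."""
    groups = defaultdict(list)
    for title in titles:
        groups[to_latin(title)].append(title)
    return dict(groups)


if __name__ == "__main__":
    print(group_titles(["Београд данас", "Beograd danas", "Novi Sad"]))
```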
Better query interpretation.
This launch helps us better interpret the likely intention of your search query as suggested by your last few searches.
News universal results serving improvements.
[launch codename "inhale"] This change streamlines the serving of news results on Google by shifting to a more unified system architecture.
UI improvements for breaking news topics.
[launch codename "Smoothie", project codename "Smoothie"] We’ve improved the user interface for news results when you’re searching for a breaking news topic. You’ll often see a large image thumbnail alongside two fresh news results.
More comprehensive predictions for local queries.
[project codename "Autocomplete"] This change improves the comprehensiveness of autocomplete predictions by expanding coverage for long-tail U.S. local search queries such as addresses or small businesses.
Improvements to triggering of public data search feature.
[launch codename "Plunge_Local", project codename "DIVE"] This launch improves triggering for the
public data search feature
, broadening the range of queries that will return helpful population and unemployment data.
Adding Japanese and Korean to error page classifier.
[launch codename "maniac4jars", project codename "Soft404"] We have signals designed to detect crypto 404 pages (also known as “soft 404s”), pages that return valid text to a browser, but the text only contains error messages, such as “Page not found.” It’s rare that a user will be looking for such a page, so it’s important we be able to detect them. This change extends a particular classifier to Japanese and Korean.
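To make the idea of a soft 404 concrete, here is a minimal heuristic sketch. It is not the classifier described above: the phrase list, the word limit, and the function name are assumptions for illustration, and a real classifier would need language-specific phrases.

```python
# Hypothetical English error phrases; a real system would cover many languages.
ERROR_PHRASES = ("page not found", "404 not found", "no longer available")


def looks_like_soft_404(status_code, body_text, max_words=50):
    """Flag pages that return HTTP 200 but whose short body is just an error message."""
    if status_code != 200:
        return False  # a real error status is a hard error, not a soft 404
    text = body_text.lower()
    has_error_phrase = any(phrase in text for phrase in ERROR_PHRASES)
    is_short = len(text.split()) <= max_words
    return has_error_phrase and is_short


if __name__ == "__main__":
    print(looks_like_soft_404(200, "Sorry, page not found."))
    print(looks_like_soft_404(404, "Page not found"))
```

The word limit is there so that a long article that merely mentions “page not found” is not flagged.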
More efficient generation of alternative titles.
[launch codename "HalfMarathon"] We use a variety of signals to generate titles in search results. This change makes the process more efficient, saving tremendous CPU resources without degrading quality.
More concise and/or informative titles.
[launch codename "kebmo"] We look at a number of factors when deciding what to show for the title of a search result. This change means you’ll find more informative titles and/or more concise titles with the same information.
Fewer bad spell corrections internationally.
[launch codename "Potage", project codename "Spelling"] When you search for [mango tea], we don't want to show spelling predictions like “Did you mean 'mint tea'?” We have algorithms designed to prevent these “bad spell corrections” and this change internationalizes one of those algorithms.
More spelling corrections globally and in more languages.
[launch codename "pita", project codename "Autocomplete"] Sometimes autocomplete will correct your spelling before you’ve finished typing. We’ve been offering advanced spelling corrections in English, and recently we extended the comprehensiveness of this feature to cover more than 60 languages.
More spell corrections for long queries.
[launch codename "caterpillar_new", project codename "Spelling"] We rolled out a change making it more likely that your query will get a spell correction even if it’s longer than ten terms. You can watch
uncut footage
of when we decided to launch this from our past blog post.
More comprehensive triggering of “showing results for” goes international.
[launch codename "ifprdym", project codename "Spelling"] In some cases when you’ve misspelled a search, say [pnumatic], the results you find will actually be results for the corrected query, “pneumatic.” In the past, we haven’t always provided the explicit user interface to say, “Showing results for pneumatic” and the option to “Search instead for pnumatic.” We recently started showing the explicit “Showing results for” interface more often in these cases in English, and now we’re expanding that to new languages.
“Did you mean” suppression goes international.
[launch codename "idymsup", project codename "Spelling"] Sometimes the “Did you mean?” spelling feature predicts spelling corrections that are accurate, but wouldn’t actually be helpful if clicked. For example, the results for the predicted correction of your search may be nearly identical to the results for your original search. In these cases, inviting you to refine your search isn’t helpful. This change first checks a spell prediction to see if it’s useful before presenting it to the user. This algorithm was already rolled out in English, but now we’ve expanded to new languages.
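The usefulness check described above can be pictured as a comparison of the two result sets. In this sketch, Jaccard overlap of result URLs and the 0.8 cutoff are assumptions chosen for illustration; the post does not say how similarity is measured.

```python
def result_overlap(original_results, corrected_results):
    """Jaccard similarity between two lists of result URLs."""
    a, b = set(original_results), set(corrected_results)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def should_show_did_you_mean(original_results, corrected_results, max_overlap=0.8):
    """Show the suggestion only if the corrected query changes the results enough."""
    return result_overlap(original_results, corrected_results) < max_overlap


if __name__ == "__main__":
    same = ["u1", "u2", "u3"]
    print(should_show_did_you_mean(same, same))          # nearly identical results
    print(should_show_did_you_mean(same, ["u7", "u8"]))  # very different results
```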
Spelling model refresh and quality improvements.
We’ve refreshed spelling models and launched quality improvements in 27 languages.
Fewer autocomplete predictions leading to low-quality results.
[launch codename "Queens5", project codename "Autocomplete"] We’ve rolled out a change designed to show fewer autocomplete predictions leading to low-quality results.
Improvements to SafeSearch for videos and images.
[project codename "SafeSearch"] We’ve made improvements to our SafeSearch signals in videos and images mode, making it less likely you’ll see adult content when you aren’t looking for it.
Improved SafeSearch models.
[launch codename "Squeezie", project codename "SafeSearch"] This change improves our classifier used to categorize pages for SafeSearch in 40+ languages.
Improvements to SafeSearch signals in Russian.
[project codename "SafeSearch"] This change makes it less likely that you’ll see adult content in Russian when you aren’t looking for it.
Increase base index size by 15%.
[project codename "Indexing"] The base search index is our main index for serving search results and every query that comes into Google is matched against this index. This change increases the number of documents served by that index by 15%. *Note: We’re constantly tuning the size of our different indexes and changes may not always appear in these blog posts.
New index tier.
[launch codename "cantina", project codename "Indexing"] We keep our index in “tiers” where different documents are indexed at different rates depending on how relevant they are likely to be to users. This month we introduced an additional indexing tier to support continued comprehensiveness in search results.
Backend improvements in serving.
[launch codename "Hedges", project codename "Benson"] We’ve rolled out some improvements to our serving systems making them less computationally expensive and massively simplifying code.
"Sub-sitelinks" in expanded sitelinks.
[launch codename "thanksgiving"] This improvement
digs deeper
into
megasitelinks
by showing sub-sitelinks instead of the normal snippet.
Better ranking of expanded sitelinks.
[project codename "Megasitelinks"] This change improves the ranking of megasitelinks by providing a minimum score for the sitelink based on a score for the same URL used in general ranking.
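A minimum score of this kind can be sketched as a simple floor. The fraction used to derive the floor from the general ranking score, and the function and parameter names, are hypothetical; the post only says that the minimum is based on the general ranking score for the same URL.

```python
def sitelink_score(sitelink_specific_score, general_ranking_score, floor_fraction=0.5):
    """Return the sitelink score, never lower than a floor derived from general ranking.

    `floor_fraction` is an assumed tuning parameter, not a documented value.
    """
    floor = floor_fraction * general_ranking_score
    return max(sitelink_specific_score, floor)


if __name__ == "__main__":
    # A URL that ranks well in general search is not pushed too far down as a sitelink.
    print(sitelink_score(0.1, 0.8))
    # A sitelink that already scores above the floor is unchanged.
    print(sitelink_score(0.9, 0.8))
```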
Sitelinks data refresh.
[launch codename "Saralee-76"] Sitelinks (the links that appear beneath some search results and link deeper into the site) are generated in part by an offline process that analyzes site structure and other data to determine the most relevant links to show users. We’ve recently updated the data through our offline process. These updates happen frequently (on the order of weeks).
Less snippet duplication in expanded sitelinks.
[project codename "Megasitelinks"] We’ve adopted a new technique to reduce duplication in the snippets of expanded sitelinks.
Movie showtimes search feature for mobile in China, Korea and Japan.
We’ve expanded our movie showtimes feature for mobile to China, Korea and Japan.
MLB search feature.
[launch codename "BallFour", project codename "Live Results"] As the MLB season began, we rolled out a new MLB search feature. Try searching for [
sf giants score
] or [
mlb scores
].
Spanish football (La Liga) search feature.
This feature provides scores and information about teams playing in La Liga. Try searching for [
barcelona fc
] or [
la liga
].
Formula 1 racing search feature.
[launch codename "CheckeredFlag"] This month we introduced a new search feature to help you find Formula 1 leaderboards and results. Try searching [
formula 1
] or [
mark webber
].
Tweaks to NHL search feature.
We’ve improved the NHL search feature so it’s more likely to appear when relevant. Try searching for [
nhl scores
] or [
capitals score
].
Keyword stuffing classifier improvement.
[project codename "Spam"] We have classifiers designed to detect when a website is
keyword stuffing
. This change made the keyword stuffing classifier better.
More authoritative results.
We’ve tweaked a signal we use to surface more authoritative content.
Better HTML5 resource caching for mobile.
We’ve improved caching of different components of the search results page, dramatically reducing latency in a number of cases.
And here are some other changes we’ve blogged about since last time:
Updates to rich snippets
Another step to reward high-quality sites
Posted by Matt Cutts, Distinguished Engineer
Another step to reward high-quality sites
April 24, 2012
(Cross-posted on the
Webmaster Central Blog
)
Google has said before that search engine optimization, or SEO, can be
positive and constructive
—and we're
not the only ones
. Effective search engine optimization can make a site more crawlable and make individual pages more accessible and easier to find. Search engine optimization includes things as simple as keyword research to ensure that the right words are on the page, not just industry jargon that normal people will never type.
“White hat” search engine optimizers often improve the usability of a site, help create great content, or make sites faster, which is good for both users and search engines. Good search engine optimization can also mean good marketing: thinking about creative ways to make a site more compelling, which can help with search engines as well as social media. The net result of making a great site is often greater awareness of that site on the web, which can translate into more people linking to or visiting a site.
The opposite of “white hat” SEO is something called “black hat webspam” (we say “webspam” to distinguish it from email spam). In the pursuit of higher rankings or traffic, a few sites use techniques that don’t benefit users, where the intent is to look for shortcuts or loopholes that would rank pages higher than they deserve to be ranked. We see all sorts of webspam techniques every day, from
keyword stuffing
to
link schemes
that attempt to propel sites higher in rankings.
The goal of many of our ranking changes is to help searchers find sites that provide a great user experience and fulfill their information needs. We also want the “good guys” making great sites for users, not just algorithms, to see their effort rewarded. To that end we’ve launched
Panda changes
that successfully
returned higher-quality sites in search results
. And earlier this year we launched a
page layout algorithm
that reduces rankings for sites that don’t make much content available “above the fold.”
In the next few days, we’re launching an important algorithm change targeted at webspam. The change will decrease rankings for sites that we believe are violating Google’s existing
quality guidelines
. We’ve always targeted webspam in our rankings, and this algorithm represents another improvement in our efforts to reduce webspam and promote high quality content. While we can't divulge specific signals because we don't want to give people a way to game our search results and worsen the experience for users, our advice for webmasters is to focus on
creating high quality sites
that create a good user experience and employ white hat SEO methods instead of engaging in aggressive webspam tactics.
Here’s an example of a webspam tactic like keyword stuffing taken from a site that will be affected by this change:
Of course, most sites affected by this change aren’t so blatant. Here’s an example of a site with unusual linking patterns that is also affected by this change. Notice that if you try to read the text aloud you’ll discover that the outgoing links are completely unrelated to the actual content, and in fact the page text has been “spun” beyond recognition:
Sites affected by this change might not be easily recognizable as spamming without deep analysis or expertise, but the common thread is that these sites are doing much more than white hat SEO; we believe they are engaging in webspam tactics to manipulate search engine rankings.
The change will go live for all languages at the same time. For context, the initial Panda change affected about 12% of queries to a significant degree; this algorithm affects about 3.1% of queries in English to a degree that a regular user might notice. The change affects roughly 3% of queries in languages such as German, Chinese, and Arabic, but the impact is higher in more heavily-spammed languages. For example, 5% of Polish queries change to a degree that a regular user might notice.
We want people doing white hat search engine optimization (or even no search engine optimization at all) to be free to focus on creating amazing, compelling web sites. As always, we’ll keep our ears open for feedback on ways to iterate and improve our ranking algorithms toward that goal.
Posted by Matt Cutts, Distinguished Engineer
Search quality highlights: 50 changes for March
April 3, 2012
Here’s our latest installment of search quality highlights, with another 50 changes to report for March. We’re starting to get into a groove with these posts, so we’re getting more and more comprehensive as the months go by. New for this month, we’ve published
uncut video
from our search quality meeting, which gives a great flavor for how these decisions get made.
Here’s the list for March:
Autocomplete with math symbols.
[launch codename "Blackboard", project codename "Suggest"] When we process queries to return predictions in autocomplete, we generally normalize them to match more relevant predictions in our database. This change incorporates several characters that were previously normalized: “+”, “-”, “*”, “/”, “^”, “(”, “)”, and “=”. This should make it easier to search for popular equations, for example [
e = mc2
] or [
y = mx+b
].
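The effect of keeping math symbols during normalization can be sketched as below. The normalization rules shown (lowercasing, replacing other characters with spaces, collapsing whitespace) and the names are assumptions for illustration, not the actual autocomplete pipeline.

```python
import re

# The symbols listed in this change.
MATH_SYMBOLS = "+-*/^()="


def normalize(query, keep=""):
    """Lowercase the query and replace characters that are not letters, digits,
    whitespace, or listed in `keep` with spaces, then collapse runs of spaces."""
    allowed = set(keep)
    cleaned = "".join(
        ch if ch.isalnum() or ch.isspace() or ch in allowed else " "
        for ch in query.lower()
    )
    return re.sub(r"\s+", " ", cleaned).strip()


if __name__ == "__main__":
    print(normalize("y = mx+b"))                     # symbols dropped
    print(normalize("y = mx+b", keep=MATH_SYMBOLS))  # symbols preserved
```

When the symbols are dropped, [y = mx+b] collapses into a plain sequence of letters. Keeping them lets the equation match predictions that contain the same symbols.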
Improvements to handling of symbols for indexing.
[launch codename "Deep Maroon"] We generally ignore punctuation symbols in queries. Based on analysis of our query stream, we’ve now started to index the following heavily used symbols: “%”, “$”, “\”, “.”, “@”, “#”, and “+”. We’ll continue to index more symbols as usage warrants.
Better scoring of news groupings.
[launch codename "avenger_2"] News results on Google are organized into groups that are about the same story. We have scoring systems to determine the ordering of these groups for a given query. This subtle change slightly improves our scoring system, leading to better ranking of news clusters.
Sitelinks data refresh.
[launch codename "Saralee-76"] Sitelinks (the links that appear beneath some search results and link deeper into the respective site) are generated in part by an offline process that analyzes site structure and other data to determine the most relevant links to show users. We’ve recently updated the data through our offline process. These updates happen frequently (on the order of weeks).
Improvements to autocomplete backends, coverage.
[launch codename "sovereign", project codename "Suggest"] We’ve consolidated systems and reduced the number of backend calls required to prepare autocomplete predictions for your query. The result is more efficient CPU usage and more comprehensive predictions.
Better handling of password changes.
Our general approach is that when you change passwords, you’ll be signed out from your account on all machines. This change ensures that changing your password more consistently signs your account out of Search, everywhere.
Better indexing of profile pages.
[launch codename "Prof-2"] This change improves the comprehensiveness of public profile pages in our index from more than two hundred social sites.
UI refresh for News Universal.
[launch codename "Cosmos Newsy", project codename "Cosmos"] We’ve refreshed the design of News Universal results by providing more results from the top cluster, unifying the UI treatment of clusters of different sizes, adding a larger font for the top article, adding larger images (from licensed sources), and adding author information.
Improvements to results for navigational queries.
[launch codename "IceMan5"] A “navigational query” is a search where it looks like the user is looking to navigate to a particular website, such as [New York Times] or [wikipedia.org]. While these searches may seem straightforward, there are still challenges to serving the best results. For example, what if the user doesn’t actually know the right URL? What if the URL they’re searching for seems to be a parked domain (with no content)? This change improves results for this kind of search.
High-quality sites algorithm data update and freshness improvements.
[launch codename “mm”, project codename "Panda"] Like many of the changes we make, aspects of our high-quality sites algorithm depend on processing that’s done offline and pushed on a periodic cycle. In the past month, we’ve pushed updated data for “Panda,” as we mentioned in a
recent tweet
. We’ve also made improvements to keep our database fresher overall.
Live results for UEFA Champions League and KHL.
We’ve added live-updating snippets in our search results for the KHL (Russian Hockey League) and UEFA Champions League, including scores and schedules. Now you can find live results from a variety of sports leagues, including the
NFL
,
NBA
,
NHL
and others.
Tennis search feature.
[launch codename "DoubleFault"] We’ve introduced a new search feature to provide realtime tennis scores at the top of the search results page. Try [
maria sharapova
] or [
sony ericsson open
].
More relevant image search results.
[launch codename "Lice"] This change tunes signals we use related to landing page quality for images. This makes it more likely that you’ll find highly relevant images, even if those images are on pages that are lower quality.
Fresher image predictions in all languages.
[launch codename "imagine2", project codename "Suggest"] We recently rolled out a change to surface more relevant image search predictions in autocomplete in English. This improvement extends the update to all languages.
SafeSearch algorithm tuning.
[launch codenames "Fiorentini", “SuperDyn”; project codename "SafeSearch"] This month we rolled out a couple of changes to our SafeSearch algorithm. We’ve updated our classifier to make it smarter and more precise, and we’ve found new ways to make adult content less likely to appear when a user isn't looking for it.
Tweaks to handling of anchor text.
[launch codename "PC"] This month we turned off a classifier related to anchor text (the visible text appearing in links). Our experimental data suggested that other methods of anchor processing had greater success, so turning off this component made our scoring cleaner and more robust.
Simplification to Images Universal codebase.
[launch codename "Galactic Center"] We’ve made some improvements to simplify our codebase for Images Universal and to better utilize improvements in our general web ranking to also provide better image results.
Better application ranking and UI on mobile.
When you search for apps on your phone, you’ll now see richer results with app icons, star ratings, prices, and download buttons arranged to fit well on smaller screens. You’ll also see more relevant ranking of mobile applications based on your device platform, for example Android or iOS.
Improvements to freshness in Video Universal.
[launch codename "graphite", project codename "Freshness"] We’ve improved the freshness of video results to better detect stale videos and return fresh content.
Fewer undesired synonyms.
[project codename "Synonyms"] When you search on Google, we often identify other search terms that might have the same meaning as what you entered in the box (synonyms) and surface results for those terms as well when it might be helpful. This month we tweaked a classifier to prevent unhelpful synonyms from being introduced as content in the results set.
Better handling of queries with both navigational and local intent.
[launch codename "ShieldsUp"] Some queries have both local intent and are very navigational (directed towards a particular website). This change improves the balance of results we show, and helps ensure you’ll find highly relevant navigational results or local results towards the top of the page as appropriate for your query.
Improvements to freshness.
[launch codename "Abacus", project codename "Freshness"] We launched an improvement to freshness late last year that was very helpful, but it cost significant machine resources. At the time we decided to roll out the change only for news-related traffic. This month we rolled it out for all queries.
Improvements to processing for detection of site quality.
[launch codename "Curlup"] We’ve made some improvements to a longstanding system we have to detect site quality. This improvement allows us to get greater confidence in our classifications.
Better interpretation and use of anchor text.
We’ve improved systems we use to interpret and use anchor text, and determine how relevant a given anchor might be for a given query and website.
Better local results and sources in Google News.
[launch codename "barefoot", project codename "news search"] We’re deprecating a signal we had to help people find content from their local country, and we’re building similar logic into other signals we use. The result is more locally relevant Google News results and higher quality sources.
Deprecating signal related to ranking in a news cluster.
[launch codename "decaffeination", project codename "news search"] We’re deprecating a signal that’s no longer improving relevance in Google News. The signal was originally developed to help people find higher quality articles on Google News. (Note: Despite the launch codename, this project has nothing to do with Caffeine, our update to indexing in 2010.)
Fewer “sibling” synonyms.
[launch codename "Gemini", project codename "Synonyms"] One of the main signals we look at to identify synonyms is context. For example, if the word “cat” often appears next to the terms “pet” and “furry,” and so does the word “kitten”, our algorithms may guess that “cat” and “kitten” have similar meanings. The problem is that sometimes this method will introduce “synonyms” that are actually different entities in the same category. To continue the example, dogs are also “furry pets” -- so sometimes “dog” may be incorrectly introduced as a synonym for “cat”. We’ve been working for some time to appropriately ferret out these “sibling” synonyms, and our latest system is more maintainable, updatable, debuggable, and extensible to other systems.
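To make the “cat”, “kitten” and “dog” example concrete, here is a toy sketch of context-based similarity. It is only an illustration under stated assumptions: the four-sentence corpus, the use of a whole sentence as the context window, and the function names are all invented for this example, and cosine similarity over co-occurrence counts stands in for whatever signals the real system uses.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy corpus; each sentence is treated as one context window.
SENTENCES = [
    "my cat is a furry pet",
    "a kitten is a furry pet",
    "the dog is a furry pet",
    "the car needs new tires",
]


def context_vector(word, sentences):
    """Count the words that appear in the same sentence as `word`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        if word in tokens:
            counts.update(t for t in tokens if t != word)
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def context_similarity(word1, word2, sentences=SENTENCES):
    """Similarity of the contexts in which two words appear."""
    return cosine(context_vector(word1, sentences),
                  context_vector(word2, sentences))
```

In this toy corpus “cat” and “kitten” share most of their context, but so do “cat” and “dog”, which is exactly the “sibling” problem: context alone cannot tell a true synonym from a different entity in the same category.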
Better synonym accuracy and performance.
[project codename "Synonyms"] We’ve made further improvements to our synonyms system by eliminating duplicate logic. We’ve also found ways to more accurately identify appropriate synonyms in cases where there are multiple synonym candidates with different contexts.
Retrieval system tuning.
[launch codename "emonga", project codename "Optionalization"] We’ve improved systems that identify terms in a query which are not necessarily required to retrieve relevant documents. This will make results more faithful to the original query.
Less aggressive synonyms.
[launch codename "zilong", project codename "Synonyms"] We’ve heard feedback from users that sometimes our algorithms are too aggressive at incorporating search results for other terms. The underlying cause is often our synonym system, which will include results for other terms in many cases. This change makes our synonym system less aggressive in the way it incorporates results for other query terms, putting greater weight on the original user query.
Update to systems relying on geographic data.
[launch codenames "Maestro", "Maitre"] We have a number of signals that rely on geographic data (similar to the data we surface in Google Earth and Maps). This change updates some of the geographic data we’re using.
Improvements to name detection.
[launch codename "edge", project codename "NameDetector"] We’ve improved a system for detecting names, particularly for celebrity names.
Updates to personalization signals.
[project codename "PSearch"] This change updates signals used to personalize search results.
Improvements to Image Search relevance.
[launch codename "sib"] We’ve updated signals to better promote reasonably sized images on high-quality landing pages.
Remove deprecated signal from site relevance signals.
[launch codename "Freedom"] We’ve removed a deprecated product-focused signal from a site-understanding algorithm.
More precise detection of old pages.
[launch codename "oldn23", project codename "Freshness"] This change improves detection of stale pages in our index by relying on more relevant signals. As a result, fewer stale pages are shown to users.
Tweaks to language detection in autocomplete.
[launch codename “Dejavu”, project codename "Suggest"] In general, autocomplete relies on the display language to determine what language predictions to show. For most languages, we also try to detect the user query language by analyzing the script, and this change extends that behavior to Chinese (Simplified and Traditional), Japanese and Korean. The net effect is that when users forget to turn off their IMEs, they’ll still get English predictions if they start typing English terms.
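As a rough illustration of detecting the query language from its script, the sketch below labels a query by the Unicode names of its letters and falls back to English predictions when a user with a Chinese, Japanese or Korean display language types Latin characters. The function names, the script labels and the set of display-language codes are assumptions made for this example; the actual autocomplete logic is not described in this post.

```python
import unicodedata

# Assumed display-language codes for the CJK case described above.
CJK_DISPLAY_LANGUAGES = {"zh-CN", "zh-TW", "ja", "ko"}


def dominant_script(text):
    """Return a coarse script label for the letters in `text`."""
    counts = {}
    for ch in text:
        if not ch.isalpha():
            continue
        name = unicodedata.name(ch, "")
        if name.startswith("LATIN"):
            label = "latin"
        elif name.startswith("HANGUL"):
            label = "hangul"
        elif name.startswith(("HIRAGANA", "KATAKANA")):
            label = "kana"
        elif name.startswith("CJK"):
            label = "han"
        else:
            label = "other"
        counts[label] = counts.get(label, 0) + 1
    if not counts:
        return "unknown"
    return max(counts, key=counts.get)


def prediction_language(display_language, partial_query):
    """Pick the language of predictions to show for a partial query."""
    if (display_language in CJK_DISPLAY_LANGUAGES
            and dominant_script(partial_query) == "latin"):
        return "en"
    return display_language
```

For example, `prediction_language("ja", "weath")` returns `"en"`, while a query typed in kana or kanji keeps the display language.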
Improvements in date detection for blog/forum pages.
[launch codename "fibyen", project codename "Dates"] This change improves the algorithm that determines dates for blog and forum pages.
More predictions in autocomplete by live rewriting of query prefixes.
[launch codename "Lombart", project codename "Suggest"] In this change we’re rewriting partial queries on the fly to retrieve more potential matching predictions for the user query. We use synonyms and other features to get the best overall match. Rewritten prefixes can include term re-orderings, term additions, term removals and more.
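A minimal sketch of the idea, assuming a tiny hand-made synonym table and prediction list (both hypothetical), might generate rewritten prefixes by substituting synonyms, re-ordering terms and dropping one term, then match predictions against every variant. It leaves out term additions and any ranking of the matches, and it is not the production algorithm.

```python
from itertools import permutations

# Hypothetical data, for illustration only.
SYNONYMS = {"nyc": "new york"}
PREDICTIONS = [
    "new york hotels cheap",
    "cheap flights to paris",
    "hotels in new york",
]
MAX_TERMS_TO_REORDER = 4  # keeps the number of permutations small


def rewrite_prefix(prefix):
    """Return the set of rewritten variants of a partial query."""
    terms = prefix.split()
    with_synonyms = [SYNONYMS.get(t, t) for t in terms]
    variants = {prefix}
    for base in (terms, with_synonyms):
        # Term re-orderings.
        if len(base) <= MAX_TERMS_TO_REORDER:
            for perm in permutations(base):
                variants.add(" ".join(perm))
        # Term removals: drop one term at a time.
        if len(base) > 1:
            for i in range(len(base)):
                variants.add(" ".join(base[:i] + base[i + 1:]))
    return variants


def predict(prefix, predictions=PREDICTIONS):
    """Return predictions that start with any rewritten variant."""
    variants = rewrite_prefix(prefix)
    return [p for p in predictions
            if any(p.startswith(v) for v in variants)]
```

With this data, `predict("hotels nyc")` matches both “new york hotels cheap” (synonym plus re-ordering) and “hotels in new york” (term removal), neither of which starts with the literal prefix.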
Expanded sitelinks on mobile.
We’ve launched our expanded sitelinks feature for mobile browsers, providing better organization and presentation of sitelinks in search results.
More accurate short answers.
[project codename “Porky Pig”] We’ve updated the sources behind our short answers feature to rely on data from Freebase. This improves accuracy and makes it easier to fix bugs.
Migration of video advanced search backends.
We’ve migrated some backends used in video advanced search to our main search infrastructure.
+1 button in search for more countries and domains.
This month we’ve internationalized the +1 button on the search results page to additional languages and domains. The +1 button in search makes it easy to share recommendations with the world right from your search results. As we said in our initial blog post, the beauty of +1’s is their relevance: you get the right recommendations (because they come from people who matter to you), at the right time (when you are actually looking for information about that topic) and in the right format (your search results).
Local result UI refresh on tablet.
We’ve updated the user interface of local results on tablets to make them more compact and easier to scan.
And here are a few other changes we’ve blogged about since last time:
Flights to worldwide destinations
Redesigned Search App for Windows 7.5 phones
SSL search around the globe
“Recent” feature on mobile
Full-page themes in iGoogle
March Madness NCAA search feature
3D graphing calculator
Posted by Johanna Wright, Director of Product Management
Video! The search quality meeting, uncut (annotated)
March 12, 2012
It took eight video cameras and 16 microphones, but we’ve done something new and special to give you another inside look at how search works. Today we’ve published, for the first time, a video with the uncut discussion of a proposed algorithm change (in this case, an upcoming change to our spell correction system). The language can be technical, so we've included annotations to provide some context for the discussion (and have a little fun!).
The footage was captured on December 1, 2011 at our weekly “Quality Launch Review” meeting. We hold the meeting on Thursdays to discuss possible algorithmic improvements and make decisions about what to launch. As usual, meeting participants gathered in Mountain View and joined on videoconference from remote offices around the globe, including our offices in Moscow, New York, Zurich, Seoul, Haifa and Tokyo. Check out the video for a flavor of the kinds of topics and data the team discusses before making many of the important changes to our system.
A few things you’ll observe:
Even relatively subtle changes get intense scrutiny by our search evaluation and ranking teams. The specific change discussed in this video improves spelling suggestions for searches with more than 10 words and it impacts only 0.1% of our traffic. Still, you can see the scrutiny and thoughtfulness that goes into approving this change.
Every change has a dedicated search quality analyst assigned to study the impact. This analyst is not part of the engineering team building the change, but instead offers a separate opinion on whether the change is good for users.
The search team relies heavily on the results of experimental data to make decisions. During the meeting, we rely on detailed analyst reports including the results of click evaluations and side-by-side experiments. These reports can sometimes be more than 25 pages long.
Launch reports include specific examples to illustrate broader trends in the data. Rather than manually change one example, our engineers look for algorithmic ways to improve millions of queries.
Search algorithm improvements often rely on and impact many different systems, so engineers with expertise in different areas all need to come together to make the best decision for the user, balancing all the tradeoffs involved (relevance, spam, latency, cost, language impact, etc.).
As I said in the video, this is an experiment, and we’re interested to hear what you think. For all the search geeks out there, we hope you enjoy it! For a video summary of our process, I can also recommend the video we posted last August.
Posted by Amit Singhal, Senior VP and Google Fellow
Search quality highlights: 40 changes for February
February 27, 2012
This month we have many improvements to celebrate. With 40 changes reported, that marks a new record for our monthly series on search quality. Most of the updates rolled out earlier this month, and a handful are actually rolling out today and tomorrow. We continue to improve many of our systems, including related searches, sitelinks, autocomplete, UI elements, indexing, synonyms, SafeSearch and more. Each individual change is subtle and important, and over time they add up to a radically improved search engine.
Here’s the list for February:
More coverage for related searches.
[launch codename “Fuzhou”] This launch brings in a new data source to help generate the “Searches related to” section, increasing coverage significantly so the feature will appear for more queries. This section contains search queries that can help you refine what you’re searching for.
Tweak to categorizer for expanded sitelinks.
[launch codename “Snippy”, project codename “Megasitelinks”] This improvement adjusts a signal we use to try and identify duplicate snippets. We were applying a categorizer that wasn’t performing well for our expanded sitelinks, so we’ve stopped applying the categorizer in those cases. The result is more relevant sitelinks.
Less duplication in expanded sitelinks.
[launch codename “thanksgiving”, project codename “Megasitelinks”] We’ve adjusted signals to reduce duplication in the snippets for expanded sitelinks. Now we generate relevant snippets based more on the page content and less on the query.
More consistent thumbnail sizes on results page.
We’ve adjusted the thumbnail size for most image content appearing on the results page, providing a more consistent experience across result types, and also across mobile and tablet. The new sizes apply to rich snippet results for recipes and applications, movie posters, shopping results, book results, news results and more.
More locally relevant predictions in YouTube.
[project codename “Suggest”] We’ve improved the ranking for predictions in YouTube to provide more locally relevant queries. For example, for the query [lady gaga in ] performed on the US version of YouTube, we might predict [lady gaga in times square], but for the same search performed on the Indian version of YouTube, we might predict [lady gaga in India].
More accurate detection of official pages.
[launch codename “WRE”] We’ve made an adjustment to how we detect official pages to make more accurate identifications. The result is that many pages that were previously misidentified as official will no longer be.
Refreshed per-URL country information.
[launch codename “longdew”, project codename “country-id data refresh”] We updated the country associations for URLs to use more recent data.
Expand the size of our images index in Universal Search.
[launch codename “terra”, project codename “Images Universal”] We launched a change to expand the corpus of results for which we show images in Universal Search. This is especially helpful to give more relevant images on a larger set of searches.
Minor tuning of autocomplete policy algorithms.
[project codename “Suggest”] We have a narrow set of policies for autocomplete for offensive and inappropriate terms. This improvement continues to refine the algorithms we use to implement these policies.
“Site:” query update
[launch codename “Semicolon”, project codename “Dice”] This change improves the ranking for queries using the “site:” operator by increasing the diversity of results.
Improved detection for SafeSearch in Image Search.
[launch codename "Michandro", project codename “SafeSearch”] This change improves our signals for detecting adult content in Image Search, aligning the signals more closely with the signals we use for our other search results.
Interval based history tracking for indexing.
[project codename “Intervals”] This improvement changes the signals we use in document tracking algorithms.
Improvements to foreign language synonyms.
[launch codename “floating context synonyms”, project codename “Synonyms”] This change applies an improvement we previously launched for English to all other languages. The net impact is that you’ll more often find relevant pages that include synonyms for your query terms.
Disabling two old fresh query classifiers.
[launch codename “Mango”, project codename “Freshness”] As search evolves and new signals and classifiers are applied to rank search results, sometimes old algorithms get outdated. This improvement disables two old classifiers related to query freshness.
More organized search results for Google Korea.
[launch codename “smoothieking”, project codename “Sokoban4”] This significant improvement to search in Korea better organizes the search results into sections for news, blogs and homepages.
Fresher images.
[launch codename “tumeric”] We’ve adjusted our signals for surfacing fresh images. Now we can more often surface fresh images when they appear on the web.
Update to the Google bar.
[project codename “Kennedy”] We continue to iterate in our efforts to deliver a beautifully simple experience across Google products, and as part of that this month we made further adjustments to the Google bar. The biggest change is that we’ve replaced the drop-down Google menu in the November redesign with a consistent and expanded set of links running across the top of the page.
Adding three new languages to classifier related to error pages.
[launch codename "PNI", project codename "Soft404"] We have signals designed to detect crypto 404 pages (also known as “soft 404s”), pages that return valid text to a browser but the text only contains error messages, such as “Page not found.” It’s rare that a user will be looking for such a page, so it’s important we be able to detect them. This change extends a particular classifier to Portuguese, Dutch and Italian.
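To show what a “soft 404” looks like to a program, here is a deliberately simple phrase-matching heuristic. It is a stand-in, not the classifier mentioned above: the phrase list, the word-count threshold and the function name are assumptions chosen for this sketch, and a real classifier would rely on far richer signals.

```python
# Illustrative error phrases in English, Portuguese, Dutch and Italian.
# The list is an assumption for this sketch, not the classifier's data.
ERROR_PHRASES = (
    "page not found",
    "página não encontrada",
    "pagina niet gevonden",
    "pagina non trovata",
)


def looks_like_soft_404(status_code, body_text, max_words=50):
    """Guess whether a successful response is really an error page.

    A page is flagged when the server answered 200 OK, the visible
    text is short, and that text contains a known error phrase.
    """
    if status_code != 200:
        return False  # a real 404 is not a *soft* 404
    text = body_text.lower()
    if len(text.split()) > max_words:
        return False  # long pages probably have real content
    return any(phrase in text for phrase in ERROR_PHRASES)
```

For instance, `looks_like_soft_404(200, "Pagina non trovata")` returns `True`, while the same text served with a genuine 404 status is not flagged.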
Improvements to travel-related searches.
[launch codename “nesehorn”] We’ve made improvements to triggering for a variety of flight-related search queries. These changes improve the user experience for our Flight Search feature with users getting more accurate flight results.
Data refresh for related searches signal.
[launch codename “Chicago”, project codename “Related Search”] One of the many signals we look at to generate the “Searches related to” section is the queries users type in succession. If users very often search for [apple] right after [banana], that’s a sign the two might be related. This update refreshes the model we use to generate these refinements, leading to more relevant queries to try.
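The “queries typed in succession” signal can be sketched as a simple count of consecutive query pairs. The session format, the `min_count` threshold and the function name below are assumptions for illustration; the real model combines many signals and is not described in detail here.

```python
from collections import Counter, defaultdict


def build_related_queries(sessions, min_count=2):
    """Map each query to the queries that often directly follow it.

    `sessions` is a list of query lists, one list per user session,
    in the order the queries were typed.
    """
    followers = defaultdict(Counter)
    for queries in sessions:
        for first, second in zip(queries, queries[1:]):
            if first != second:
                followers[first][second] += 1
    return {
        query: [q for q, count in counts.most_common() if count >= min_count]
        for query, counts in followers.items()
    }
```

Using the example from the post, if several sessions contain [apple] right after [banana], then `build_related_queries(sessions)["banana"]` will include `"apple"`, while pairs seen only once are filtered out as noise.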
International launch of shopping rich snippets.
[project codename “rich snippets”] Shopping rich snippets help you more quickly identify which sites are likely to have the most relevant product for your needs, highlighting product prices, availability, ratings and review counts. This month we expanded shopping rich snippets globally (they were previously only available in the US, Japan and Germany).
Improvements to Korean spelling.
This launch improves spelling corrections when the user performs a Korean query in the wrong keyboard mode (also known as an "IME", or input method editor). Specifically, this change helps users who mistakenly enter Hangul queries in Latin mode or vice-versa.
Improvements to freshness.
[launch codename “iotfreshweb”, project codename “Freshness”] We’ve applied new signals which help us surface fresh content in our results even more quickly than before.
Web History in 20 new countries.
With Web History, you can browse and search over your search history and webpages you've visited. You will also get personalized search results that are more relevant to you, based on what you’ve searched for and which sites you’ve visited in the past. In order to deliver more relevant and personalized search results, we’ve launched Web History in Malaysia, Pakistan, Philippines, Morocco, Belarus, Kazakhstan, Estonia, Kuwait, Iraq, Sri Lanka, Tunisia, Nigeria, Lebanon, Luxembourg, Bosnia and Herzegovina, Azerbaijan, Jamaica, Trinidad and Tobago, Republic of Moldova, and Ghana. Web History is turned on only for people who have a Google Account and previously enabled Web History.
Improved snippets for video channels.
Some search results are links to channels with many different videos, whether on mtv.com, Hulu or YouTube. We’ve had a feature for a while now that displays snippets for these results including direct links to the videos in the channel, and this improvement increases quality and expands coverage of these rich “decorated” snippets. We’ve also made some improvements to our backends used to generate the snippets.
Improvements to ranking for local search results.
[launch codename “Venice”] This change improves the triggering of Local Universal results by relying more on the ranking of our main search results as a signal.
Improvements to English spell correction.
[launch codename “Kamehameha”] This change improves spelling correction quality in English, especially for rare queries, by making one of our scoring functions more accurate.
Improvements to coverage of News Universal.
[launch codename “final destination”] We’ve fixed a bug that caused News Universal results not to appear in cases when our testing indicates they’d be very useful.
Consolidation of signals for spiking topics.
[launch codename “news deserving score”, project codename “Freshness”] We use a number of signals to detect when a new topic is spiking in popularity. This change consolidates some of the signals so we can rely on signals we can compute in realtime, rather than signals that need to be processed offline. This eliminates redundancy in our systems and helps to ensure we can continue to detect spiking topics as quickly as possible.
Better triggering for Turkish weather search feature.
[launch codename “hava”] We’ve tuned the signals we use to decide when to present Turkish users with the weather search feature. The result is that we’re able to provide our users with the weather forecast right on the results page with more frequency and accuracy.
Visual refresh to account settings page.
We completed a visual refresh of the account settings page, making the page more consistent with the rest of our constantly evolving design.
Panda update.
This launch refreshes data in the Panda system, making it more accurate and more sensitive to recent changes on the web.
Link evaluation.
We often use characteristics of links to help us figure out the topic of a linked page. We have changed the way in which we evaluate links; in particular, we are turning off a method of link analysis that we used for several years. We often rearchitect or turn off parts of our scoring in order to keep our system maintainable, clean and understandable.
SafeSearch update.
We have updated how we deal with adult content, making it more accurate and robust. Now, irrelevant adult content is less likely to show up for many queries.
Spam update.
In the process of investigating some potential spam, we found and fixed some weaknesses in our spam protections.
Improved local results.
We launched a new system to find results from a user’s city more reliably. Now we’re better able to detect when both queries and documents are local to the user.
And here are a few more changes we’ve already blogged about separately:
Flight Search on mobile
Improved health searches
Better related searches for images
Upcoming concert dates
Posted by Amit Singhal, Senior VP and Google Fellow