Google/Bing blogs

  • Fetch as Googlebot Mobile and Claim your Sidewiki comment - added to Webmaster Tools Labs!
  • Sharing the verification love
  • Google's SEO Report Card
  • Is your site hacked? New Message Center notifications for hacking and abuse
  • How did you do on the Webmaster Quiz?
  • Request visitors' permission before installing software
  • Protect your site from spammers with reCAPTCHA
  • Introducing a new Rich Snippets format: Events
  • Google SEO resources for beginners
  • State of the Index 2009
  • Illuminating the path to SEO for Silverlight -

    Microsoft Silverlight is a transformative technology. It enables otherwise basic websites to act as full-blown applications, provides access to state-of-the-art animation and video rich media presentations, and takes full advantage of your development team's existing experience in standard programming languages, such as C#.

    However, there is one little problem. Like competing rich Internet application (RIA) technologies that present images, animations, and videos, all of that non-text-based content is extremely hard for search engines to parse and index. As a result, many website owners who are initially thrilled at the cutting edge presentation shown on their websites are later confounded when their beautiful sites suddenly fail to show up in the search engine results pages (SERPs). The problem is that the very technology used with the intention of blowing away their customers, when not thoughtfully implemented, can literally blow away their page rank as a result. In those cases, the site developer/webmaster failed to account for search engine optimization (SEO) when they implemented Silverlight.

    Silverlight applications are packaged up for deployment into files with the extension .xap. These files represent the instructions to start the application and, in most cases, contain the content of the application. Unfortunately, search engines can't easily read Silverlight XAP files. Yes, the technical parsing and content extraction capabilities used in the world of search are improving every day. But as of today, you'd be wise to cater to what the search engines already do very well: read text. This means you need to add text-based information - metadata and more - to the Silverlight objects you employ on your website.

    You might think it's not worth the effort to do such work. If you don't care about being found in search, you might have a point! But consider this: if some of your intended clients use operating systems or web browsers that are unsupported by Silverlight, what will they see when they go to your Silverlight-enabled site? Will the site still be usable? Will its presentation still make sense? Or will the page be blank? Do you even know?

    If you care about these audiences, then the same backward compatibility work you'd do to help them will also serve you in your SEO efforts. As an overall investment, if you want to use advanced content technologies to improve the user experience for your customers, you might well want to invest in making your site more accessible to all potential users, including search engine crawlers (aka bots).

    Basic SEO

    Right off the bat, there are several things you can do to help the search engine bot learn more about your Silverlight-infused pages. Because bots cannot "read" Silverlight content the way we can, the wise addition of metadata is all the more important in these pages. This information helps the bots interpret the theme of the page and how the content relates to other pages and sites, and it provides keywords to help the search engine understand the page well enough to rank it accurately among other, relevant sites for search users.

    Much of this advice is actually basic SEO, but as Bing so commonly sees RIA-laden pages that possess none of this information, it bears repeating here. All of your pages, not just those using Silverlight, should have these elements in the source code:

    • Descriptive <title> tag. Every page should include a descriptive and unique <title> tag. That information is part of what a bot reads to assess what sort of content is contained on the page. Using a title such as <title>Silverlight application</title> is about as useless as no <title> tag at all. Get specific!
    • Descriptive name="description" <meta> tag. Another important page element that bots use to determine the contents of a page is the text within the "description" <meta> tag. This information is often used to help create the website description snippet used on a SERP. As before, don't go generic here - be specific and unique. There is often so little text-based information on a Silverlight page that every little bit of unique content will be that much more meaningful to the search engine indexer.
    • Descriptive <h1> tags. The first level heading is second only to the <title> tag for being the place to define the thematic contents of a page. As such, stick to only one iteration per page, but make it meaningful and unique.
    • Discoverable navigation. No man is an island, but a web page with no discernible navigation links to other pages might be. And any page built without any discoverable navigation to other pages must not be very important - at least, that's the way bots will see it. Be sure every page on your site is linked to at least one other page, and link out to other pages from every page so the bot doesn't get stuck in a blind alley and abandon crawling your site any further.
    • Descriptive alt text. When you add an <img> tag to your page, be sure to provide that additional meta content. Bots can't read the contents of that image, even if it is merely an image of text, so the alt text you add is critical for helping the bot better understand what it cannot see.
    • Meaningful application name. Just as there is some SEO value to creating human-friendly URLs, where the directory and file names spell out logical words rather than globally unique identifier (GUID)-based gibberish, there is value to naming your Silverlight application in a manner that helps identify its purpose or role in the page. An object in the page code identifying "SilverlightApp1" is meaningless to everyone but the originating developer (and even then, it's questionable!).

    Every one of these elements is an opportunity to develop keywords for your pages. Be sure to use keyword-rich text in every opportunity. But as always, do so wisely. Keep it readable and oriented for human readers, not stuffed for bots. Keyword stuffing will only get your site in trouble.
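
    As a rough sketch of how these elements fit together, a Silverlight-hosting page might be laid out as shown here. The title, description text, navigation URLs, and id value are hypothetical placeholders chosen for illustration; only the .xap path and image file are borrowed from the traffic map sample used elsewhere in this article.

    ```html
    <!DOCTYPE html>
    <html>
    <head>
      <!-- Specific, unique title rather than "Silverlight application" -->
      <title>Live Traffic Map for King County, Washington</title>
      <!-- Description text that a search engine may use for the SERP snippet -->
      <meta name="description" content="Up-to-date traffic conditions on the major roads and highways in King County, Washington." />
    </head>
    <body>
      <!-- Exactly one first-level heading per page -->
      <h1>Traffic Map for King County, Washington</h1>

      <!-- Plain HTML links (hypothetical URLs) so a bot can reach other pages -->
      <ul>
        <li><a href="/">Home</a></li>
        <li><a href="/transit-schedules.html">Transit schedules</a></li>
      </ul>

      <!-- Meaningful id and .xap file name instead of "SilverlightApp1" -->
      <object id="KingCountyTrafficMap" data="data:application/x-silverlight-2," type="application/x-silverlight-2">
        <param name="source" value="ClientBin/KingCountyTrafficMap.xap" />
      </object>

      <!-- Every image carries descriptive alt text -->
      <img src="KingCountyAfternoonTraffic.jpg" alt="Typical King County metro weekday rush-hour traffic at 5:00pm" />
    </body>
    </html>
    ```

    Each comment in the sketch maps to one item in the list above, so it can double as a quick checklist when reviewing a page.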

    Graceful degradation

    OK, so the basic SEO stuff has been knocked out, but what more can you do for a Silverlight page? As it turns out, we've only just begun to optimize.

    The key to success in ensuring that down-level users will not be abandoned when you use an advanced technology like Silverlight is to implement a graceful degradation strategy. That means if a client, for whatever reason, cannot access the advanced primary technology offered (in this case, Silverlight), they still have a means to get something out of the page by means of lesser, secondary technology, be it metadata, substitute text on the page, a static image, or whatever else you can provide, content-wise, to assist those users.

    To provide that graceful degradation experience to your users, modify your Silverlight pages to include one or more of the following solutions.

    1. Present alternate, static page content

    Instead of using the <embed> tag, use the <object> tag to instantiate your Silverlight content in your page. The <object> tag allows the page to provide secondary, down-level content to be presented in case the initial, primary content (such as a Silverlight application) cannot be presented. By using the <object> tag, you can include text descriptions and other relevant content following the instantiation of the application in the code. Write these text descriptions toward the non-Silverlight user, describing the Silverlight application's role on the page, its function, or any other pertinent information that would help down-level users understand what would have been shown if they were able to access Silverlight. Be sure to use your page's targeted keywords as you describe the Silverlight content.

    Below is an example of how you can include contextual, alternative information within your page's Silverlight <object> tag code:

    <object data="data:application/x-silverlight-2," style="display:block" type="application/x-silverlight-2" >
      <param name="minRuntimeVersion" value="3.0.40624.0" />
      <param name="source" value="ClientBin/KingCountyTrafficMap.xap" />
      <div class="down-level">
        <h1>Traffic Map for King County, Washington</h1>

        <!-- Static image from the application -->
        <img src="KingCountyAfternoonTraffic.jpg" alt="Typical King County metro weekday rush-hour traffic at 5:00pm" />

        <p>Silverlight-enabled computers can use this page to see up-to-date traffic conditions on the major roads and highways in King County, Washington.</p>
        <p>It's easy to <a href="http://www.microsoft.com/silverlight/get-started/install/">install Silverlight</a> on your computer. See what you've been missing!</p>
      </div>
    </object>

    As you can see, the alternative content included the important <h1> tag and some informative content identifying the role of the Silverlight application. And by providing a link to installing Silverlight, you might enable another user to step up and see your page in its primary view.

    2. Use multiple <div> sections

    Another strategy for creating a graceful degradation of Silverlight is to use multiple <div> sections on the page: one for the actual Silverlight content and another to be shown on computers that do not have Silverlight installed. Similar to the previous example, this sample demonstrates the presentation of static page content:

    <div id="KingCountyTrafficMap">
      <object data="data:application/x-silverlight-2," style="display:block" type="application/x-silverlight-2" >
        <param name="minRuntimeVersion" value="3.0.40624.0" />
        <param name="source" value="ClientBin/KingCountyTrafficMap.xap" />
      </object>
      <iframe style='visibility:hidden;height:0;width:0;border:0px'></iframe>
    </div>

    <div id="AlternativeContent" style="display: none;">
        <h1>Traffic Map for King County, Washington</h1>

        <!-- Static image from the application -->
        <img src="KingCountyAfternoonTraffic.jpg" alt="Typical King County metro weekday rush-hour traffic at 5:00pm" />

        <p>Silverlight-enabled computers can use this page to see up-to-date traffic conditions on the major roads and highways in King County, Washington.</p>
        <p>It's easy to <a href="http://www.microsoft.com/silverlight/get-started/install/">install Silverlight</a> on your computer. See what you've been missing!</p>
    </div>

    Note that the alternative <div> is created by default as hidden content. Contrary to the generic advice given in the recent page-level web spam article, The pernicious perfidy of page-level web spam, the use of hidden content in this case is recognized by the search engine as contextually related to the graceful degradation strategy for Silverlight. As such, its use in this case will not raise any red flags to the search engine concerning potential web spam. As usual for these types of things, interpreting user intent is key to search engine bots identifying whether or not an ambiguous page element might be malicious.
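
    The sample leaves the alternative <div> hidden, so something has to reveal it for visitors without Silverlight. One possible approach, offered as a sketch rather than as the method the sample was designed around, is a small script that checks for the plug-in and switches the display. The hasSilverlight function is a hypothetical helper, and the ActiveX ProgID and plug-in name it checks are assumptions that should be verified against the Silverlight version you target.

    ```html
    <script type="text/javascript">
      // Hypothetical helper: true when a Silverlight plug-in appears to be installed.
      function hasSilverlight() {
        try {
          // Internet Explorer exposes the plug-in as an ActiveX control (assumed ProgID).
          new ActiveXObject("AgControl.AgControl");
          return true;
        } catch (e) {
          // Not Internet Explorer, or the control is missing: fall through to the plug-in list.
        }
        // Other browsers list installed plug-ins by name (assumed plug-in name).
        return !!(navigator.plugins && navigator.plugins["Silverlight Plug-In"]);
      }

      window.onload = function () {
        if (!hasSilverlight()) {
          // Reveal the static content that the page ships hidden by default.
          document.getElementById("AlternativeContent").style.display = "block";
        }
      };
    </script>
    ```

    Because the alternative content is already present in the page source, a bot can read it whether or not the script ever runs; the script only decides what human visitors see.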

    3. Expose alternate, dynamic content

    What if you are using Silverlight for more than just a single webpage application? What if you have a site-wide, Silverlight application used in an e-commerce scenario? In that case, you'll want to expose your inventory catalog of deep link content to search rather than have it left invisible in Silverlight. For this, you'll need to take a different approach. The alternate content here must describe any and all end point(s) that you want to make available to the search engine bot.

    Instead of doing a deep dive here on this technique (this article is already getting long!), I'll instead refer you to a few useful resources of information on how to expose these end points to the non-Silverlight user and the bot. Both include good code examples and a clear explanation of how the technique is employed:

    4. Use the createObject function in JavaScript

    This is a more developer-oriented SEO strategy that you can employ with Silverlight. This technique uses JavaScript to automatically generate the markup code needed to create the <object> tag and its required parameters.
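
    A minimal sketch of the idea follows, assuming the Silverlight.js helper file that ships with the Silverlight SDK is deployed alongside the page. The TrafficMapHost id is a hypothetical name, and the argument order shown for createObject should be checked against the copy of Silverlight.js you deploy. The static content inside the host <div> stays in the page source for bots and down-level clients, and the script replaces it only when the required runtime is detected.

    ```html
    <!-- Static, crawlable content; the script below replaces it when Silverlight is available -->
    <div id="TrafficMapHost">
      <h1>Traffic Map for King County, Washington</h1>
      <p>Silverlight-enabled computers can use this page to see up-to-date traffic conditions in King County, Washington.</p>
    </div>

    <!-- Assumes Silverlight.js from the Silverlight SDK sits next to this page -->
    <script type="text/javascript" src="Silverlight.js"></script>
    <script type="text/javascript">
      var host = document.getElementById("TrafficMapHost");
      // Generate the <object> markup only when the required runtime is present,
      // so down-level clients keep the static content above.
      if (Silverlight.isInstalled("3.0.40624.0")) {
        Silverlight.createObject(
          "ClientBin/KingCountyTrafficMap.xap",  // source
          host,                                   // parent element that receives the markup
          "KingCountyTrafficMap",                 // id for the generated <object>
          { width: "100%", height: "100%", version: "3.0.40624.0" },  // properties
          { onError: null, onLoad: null },        // event handlers (none in this sketch)
          null,                                   // initParams
          null                                    // user context
        );
      }
    </script>
    ```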

    Again, as no one wants to read a white paper posing as a blog column, I will simply point you to helpful resources for more information:

    Test the new down-level experiences

    Once you've implemented your Silverlight graceful degradation strategy, test it in non-Silverlight-enabled environments. Popular choices among SEOs include text-based, web browser environments such as Lynx browser or SEO-browser. You can also use operating systems currently incompatible with Silverlight, such as Windows 98, Linux, FreeBSD, or SolarisOS, or unsupported web browsers, such as Opera. For details on Silverlight compatibility, see the list of Compatible Operating Systems and Browsers.

    Planning graceful degradation of Silverlight for SEO is identical to planning for those clients that are not Silverlight-enabled. Once your pages present useful, alternative content to non-Silverlight clients using the suggestions above, you can rest assured that search engine bots will also be able to see the results of your effort. And until bots can read RIA-based multimedia content like humans can, that is how you do SEO with Silverlight.

    For additional information on performing SEO on Silverlight-enabled webpages, see the following:

    If you have any questions, comments, or suggestions, feel free to post them in our SEM forum. I'll be back soon!

    -- Rick DeJarnette, Bing Webmaster Center

  • Chasing the long tail with keyword research (SEM 101) -

    When a key opens a lock, it typically provides the key's holder with a clear path to where he or she wants to go. Keywords and key phrases do the same for a website. They help direct searchers to content they wish to see on the Internet. But there is a key difference: whereas a lock key will typically match up with only one lock, keywords can lead a searcher down multiple paths to many matching, relevant websites. It is a filtering process that leads the holder to the destination to which they want to go. (At least that's how it's supposed to work - see my recent article on keyword web spam for times when this is not the case.)

    Search engines are still heavily oriented toward text-based content. Even when other media types are indexed, it is typically done so using text-based descriptions. Search engine users separate the wheat from the chaff on the Internet by searching for words that are relevant to the information they seek. That is their key. Smart webmasters, anticipating those users who will employ search to find content similar to what they've published, can boost their chances of bringing searchers to their websites by using the same words in their content that searchers will in their searches. It's simply matching keys to unlocking (revealing) the content you want.

    Sometimes the keywords and key phrases searchers will use for a given field of interest are obvious, but that's not always good news for webmasters. If these keywords are obvious to you, it's likely that they are obvious to everyone, and if your site falls into one of those fields, all of your competitors' websites will be using those same keywords.

    The long tail of search

    If this is the case for you, there's no need to despair - there is hope. There's an often-overlooked truism in our industry: search has a long tail. Most webmasters only work to identify their sites with the head, so there's typically a lot of untapped value to be had in working on that long tail.

    What do I mean by head and tail? Consider the form of a tadpole. Much of its mass is in the big head, but then its form flows into a long, tapering tail. Graphs of keyword search trends often look like a tadpole with a very long tail. A few primary keywords typically dominate a sizable percentage of the search traffic, but then there are secondary and even tertiary keywords. By themselves, they are clearly not as effective as the primary keywords, as fewer users search on them. But there are people who either search directly on them or use them as a part of longer queries, and those users are just as valuable as conversion opportunities as users of primary keywords. The key distinction here is that most webmasters do not bother to actively compete for those potential customers in the long tail.

    If you are in an industry that has a few heavy-hitter, powerhouse websites as competitors, whose webmasters have worked hard to develop great content and earn authoritative backlinks, it can be as frustrating as chasing your own tail for a smaller upstart to compete with those sites using the same primary keywords. Competing in the long tail can be a great way to mop up some otherwise untapped business and begin to develop a name and reputation for your website. It's always better to compete for a high rank for a few keywords in the tail than to merely settle for a middling or worse rank for the most popular keywords in the head (settling for mediocrity is what most webmasters do, and thus why there are so often good opportunities for the taking).

    And with the time you spend successfully targeting the long tail keyword opportunities, if you make the effort to simultaneously develop quality content and work to earn authoritative inbound links for that content, your site will only increase in stature. At that point, you can start thinking about getting more competitive for those primary keywords in the head as well.

    Make it so

    So all of that sounds fine in concept. But how do you execute on such a plan? You have to know what keywords are being used in your field. You need to know what keywords you need to use on your website. You need to make your website a legitimate target for searchers who use those keywords. To get such keyword intelligence, you need a great keyword tool. One that is easy to use, draws from strong industry data sources, and offers a variety of views of that data. Frankly, I suggest you take a look at Microsoft Advertising Intelligence.

    Microsoft Advertising Intelligence is the successor to the 2009 beta tool called adCenter Excel Add-in Keyword Research Tool. As you might have inferred by its previous moniker, it installs as an add-in to Microsoft Office Excel 2007 (it won't work with any previous versions of Excel, however). You'll need an account with adCenter to gain access to the keyword data, but that's easily enough done, and there's no cost for setting up the account. Note that the tool was designed for users of search marketing (aka Pay Per Click [PPC] ads). However, the research needed to develop strong-performing keywords for PPC ads parallels that of keywords for search engine optimization (SEO), and thus the tool is easily repurposed for those efforts.

    Once installed, Microsoft Advertising Intelligence is presented as a tab on the Excel ribbon named Ad Intelligence. Click that tab, and from there, you have access to a series of helpful tools that can help you perform the following tasks:

    • Extract current keywords from an existing site
    • Create new keywords by starting with an existing list, a webpage, or by selecting a vertical
    • Expand current list of keywords by examining advertiser bidding selections and analysis of search query data
    • Analyze keyword performance by query, time, demographics, geo-location, and more
    • Identify the categories using that keyword and drill down to common queries
    • Identify key performance indicators (KPIs) for keywords and compare yours against industry averages
    • Look up typical PPC keyword pricing for particular keywords
    • Learn the click-through-rate (CTR) and the cost per click (CPC) around your chosen match-type position
    • Learn about industry KPIs and learn more about your own particular vertical, including the average CTR and CPC, and then compare your performance against your vertical's average

    I recommend that, immediately after installation, you first configure the tool to work with your adCenter account. In the Options & Help section of the ribbon, click Options, and then fill in the User name and Password fields with your adCenter credentials. Click Test Connection to confirm everything is ready to go. Once you get a message box confirming the connection was good, click OK to close the open dialog boxes.

    There are nine tool buttons on the ribbon, some containing multiple, related tools. Instead of me trying to explain all of the cool stuff that Microsoft Advertising Intelligence can do, I'll simply refer you to the tool's website for technical documentation, its active community forum, and the numerous video tutorials.

    Identify the long tail

    Once you've installed the tool, you can use it to pull a list of the current keywords used on your website today. Here's how:

    1. In Excel's Ad Intelligence tab, click the Keyword Wizards tool, select the option Extract from website, and then click Next.
    2. Type the URL you want to use, and then click Next.
    3. You can first review the keywords extracted by clicking Review, and then click Next to continue.
    4. Select the option Queries That Contain Your Keyword to see other keywords based on those extracted from your site, and then click Next.
    5. You can either change the setting Maximum suggested keywords or use the default. Click Next to continue.
    6. Click Review to see the updated list, and then click Next.
    7. To see historical data on the usage of the keywords in your list, click Monthly traffic, and then click Next.
    8. You can then modify the range of dates for historical usage performance data retrieved as well as for forward prediction usage or keep the defaults.
    9. Click Finish to get your report.

    In the resulting report, you can change the sort order of any of the columns of data to see which keywords and key phrases had the highest CTR in any particular month or in aggregate.

    If you want to be very specific in conducting your research and customizing your reports, you can skip the keyword wizard and instead use the other tools in Microsoft Advertising Intelligence to narrow down keywords for specific verticals, demographics (including age, gender, and location), and more. You'll see which words are the highest performers, and how those words have performed recently.

    This is powerful information, and you'll learn which words are being used in your field and at what frequency. Check your site's keywords against those of the movers and shakers in your field, and you may discover some under-utilized keywords in the long tail of search that may be a golden opportunity for your site.

    Once you do, implement them wisely on your site, and then monitor your site's progress over the coming weeks and months. For advice on implementing keywords wisely, check out our earlier blog articles on using keywords, including Put your keywords where the emphasis is (SEM 101) and The key to picking the right keywords (SEM 101). Whatever you do, don't follow the examples of keyword abuse documented in the blog article The pernicious perfidy of page-level web spam (SEM 101). Remember that SEO is not an overnight quick fix. Time is needed for crawling and reindexing changed content from the search engine side and then for searchers to find you. Patience, along with hard, smart work, will pay off. (And don't ignore other aspects of a thoughtful SEO plan that can improve ranking as well, such as creating great, unique content and earning authoritative, high-quality inbound links!)

    So stop chasing your own tail. Instead, invest in chasing the long tail of search by using a keyword intelligence tool like Microsoft Advertising Intelligence. That is the key for unlocking success in search.

    If you have any questions, comments, or suggestions, feel free to post them in our SEM forum. See you again soon...

    -- Rick DeJarnette, Bing Webmaster Center

  • The liability of loathsome, link-level web spam (SEM 101) -

    When I was a kid in high school, I used to go to the public library and do initial research in the Encyclopedia Britannica (yes, the bound book editions. I also remember black & white television with vacuum tubes and rotary telephones! Sheesh, I'm getting old!). I would pick up the index volume that contained the keyword I wanted to look up to identify which of the main volumes had the content I sought.

    But imagine this: when I opened up the referenced main volume to the page specified, I always found the content I wanted. I never once went to the content page referenced in the index and found a page full of advertisements, come-ons for dubious physical enhancement pharmaceuticals, or any irrelevant, unwanted garbage like that. That's how Internet search is supposed to work, too.

    Search engine as master index

    However, unlike the Encyclopedia Britannica, which maintained sole control over the information it published (thus making its index a really good bet for finding the content you want), the fast and loose world of the Internet is open to all comers, for better or for worse. The good of that trait allows for information of all types, from highly important to trivial (and in all ranges of value, from well-researched reports to skewed opinions to deceptive trash), to be found, but you must know where to look. This is where a search engine's role as master indexer comes into play. Services like Bing use their own resources to scan the Web for content and organize their findings into a useful index of the available content for users.

    But since no one entity has control over the content placed on the Web, the useful and informative website is joined by the unscrupulous huckster, who spends a huge amount of effort to deceive the search engine index in order to bring unsuspecting web searchers to their irrelevant website. This deception is the core of web spam.

    Bing and other search engine service providers work diligently to detect web spam-tainted results and keep them out of our search engine results pages (SERPs). It's a tough battle, and it requires a great deal of work to keep our SERPs useful and legitimate for search customers.

    We've already discussed the basic definition of web spam and one of the two major implementations, page-level web spam, in previous blog articles. We'll wrap up this web spam series with a discussion of the other major type, link-level web spam. And finally, we'll discuss what a webmaster can do to restore their website's listing (removing penalties) with Bing once the detected web spam has been removed.

    Definition of link-level web spam

    Link-level web spam uses web link deceptions in an attempt to artificially inflate the page rank of a specific page or site. Savvy webmasters know that earning high quality, relevant inbound links from authoritative sites can have a very positive influence on the search engine's rank of the linked site - we recently published a blog post on this subject titled, Link building for smart webmasters (no dummies here). This is good search engine optimization (SEO). Some less savvy and/or more unscrupulous folks believe they can simply substitute high quantity for the "high quality, relevant" part of the equation, swap "authoritative sites" for either junk or irrelevant sites, and achieve the same goal. Sadly for them, this is not the case.

    The intent of the link-level web spammer is to create huge numbers of inbound links (typically from unrelated, low quality sites) to attain illegitimate page rank for a site to fool web searchers into visiting their sites. Luckily, Bing and the other search engines can assess the quality and authority of a particular website.

    Sites employing link-level techniques also often employ page-level web spam techniques to make their sites appear to be relevant to a commonly searched keyword when they are not. The use of link-level web spam techniques will cause a search engine to examine your site more deeply, and if it's determined to be using web spam techniques, your site could be penalized.

    As we stated earlier with page-level web spam, some of these techniques can have valid uses at their core, but the intention behind their use is the distinguishing factor. When we detect deceptive intent as we crawl the Web, we identify those pages as web spam and penalize them as appropriate, ranging from neutralization (which levels the playing field for other sites offering content on the same subject) to expulsion from the index. As you can imagine, for an online-based business, these are serious consequences, so it pays to know what NOT to do when you optimize your site for search (or hire a consultant to do the same).

    Post web spam

    Definition: Post web spam consists of outbound links placed in user-generated content (UGC) on other websites, such as guest book pages, forums, blog comments, message boards, and referrer logs.

    Problem: The destination links in post web spam are usually unrelated, topic-wise, to the page containing the UGC outbound link. Often these posts include multiple links. In sites that rely on post web spam for inbound links, it is not unusual for a sizable percentage of all of their inbound links to be from post web spam.

    What we look for: Several techniques for implementing post web spam are commonly used, including:

    • Add backlinks to all UGC content. When users go onto websites that allow UGC to be created, those who use post web spam include backlink URLs to their sites, even if they don't have anything to do with the comment or, more significantly, the theme of the UGC-sponsoring site.
    • Automation. Spammers often use automated techniques to repeatedly submit the same UGC post containing short, generic text and a clickable URL to their sites in every UGC-sponsoring page possible.
    • Keyword stuffing. Post web spam text is often keyword stuffed. Check out our page-level web spam article titled The pernicious perfidy of page-level web spam for more information on this.
    • Massive repetition. Lots of non-relevant, poor quality, inbound links come from such pages as online guest books, forums, and blog comments.

    What post web spammers don't often realize is that many UGC-oriented pages automatically append the attribute rel="nofollow" to any links created in UGC content. As such, no inbound link credit is derived when search engines crawl and index these pages.
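
    For illustration, a comment link as a UGC platform might render it could look like the fragment below. The markup structure, URL, and link text are placeholders rather than the output of any particular platform.

    ```html
    <!-- A visitor's comment as a blog platform might render it -->
    <div class="comment">
      <!-- rel="nofollow" tells search engines not to pass link credit to the target -->
      <p>Great post! Visit <a href="http://www.example.com/" rel="nofollow">my site</a>.</p>
    </div>
    ```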

    From a webmaster point of view, however, we encourage active, regular cleaning up (or better yet, preventing) of UGC-based web spam content. If there is too much junk or web spam content on a page, it could reflect poorly on the overall quality of your page, even if you are applying rel="nofollow" to links. For that matter, it is very important for any site that allows UGC content to actively monitor its security. Hosting malware can also get a site penalized, and you don't want that! For more information on malware, see our blog article series called The merciless malignancy of malware, Part 1, Part 2, Part 3, and Part 4.

    Link farming

    Definition: A link farm is a large collection of websites that exist for the sole purpose of providing massive numbers of links to targeted websites, ostensibly to improve the appearance of their organic, online popularity.

    Problem: Link farming is often employed to promote one website using many other websites, or it can be a commercial enterprise in which the link farm sells its unscrupulous (and worthless) outbound linking services to less-SEO-savvy webmasters.

    What we look for: Link farming is often implemented using the following techniques:

    • Large, sudden surge of new inbound links. When dozens or hundreds of inbound links suddenly appear for a new or a previously small website, such a big change can indicate link farm web spam activity. The relevance of the outbound linking sites will be a key factor in whether or not such a sudden change warrants further investigation.
    • Consistent similarities between outbound linking sites. If a large number of the inbound links for a site come from sites that are very similar in design, structure, and other key characteristics, this can lead to deeper scrutiny of a website for web spam.
    • Poor linking standards. A link farm will often have a large number of unrelated links on the page, or will have related links to many sites that employ other spam methods. The pages themselves are designed to maximize the number of links on them, favoring outbound links rather than original content on the page.
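
    To make the first signal above concrete, here is a rough sketch of how a webmaster might spot a sudden surge in their own inbound link reports. It is not a description of how Bing detects link farms; the function name flag_link_surges and the factor and min_links thresholds are assumptions chosen purely for illustration, and the relevance of the linking sites is not modeled at all.

```python
from statistics import median


def flag_link_surges(weekly_new_links, factor=10, min_links=50):
    """Return the indexes of weeks whose new inbound link count dwarfs
    the median of all preceding weeks.

    factor and min_links are arbitrary illustrative thresholds.
    """
    flagged = []
    for week, count in enumerate(weekly_new_links):
        history = weekly_new_links[:week]
        if not history:
            continue  # nothing to compare the first week against
        baseline = max(median(history), 1)
        if count >= min_links and count >= factor * baseline:
            flagged.append(week)
    return flagged


# A small site that normally gains a handful of links per week.
print(flag_link_surges([2, 3, 1, 4, 300, 5]))
```

    A flagged week is only a prompt for a closer look. As noted above, whether the new links come from relevant sites is what decides if the surge is a problem.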

    When link farms are identified, those sites are penalized, which negates the value of the links they contain. In addition, the pages they link to are more likely to be heavily scrutinized for other forms of web spam.

    See our earlier blog articles for more information on what makes a good link versus a bad link.

    Link exchanges

    Definition: Unlike link farms that target a few selected sites, link exchanges are organized groups of websites that participate in providing reciprocal inbound and outbound links, ostensibly to benefit all websites in the exchange.

    Problem: Web spam-oriented link exchanges typically involve unrelated web sites reciprocally exchanging links en masse for the purposes of rank inflation. As such, they offer no value to human visitors, and thus they are candidates for being considered web spam.

    While earning inbound links is a part of legitimate SEO activities, as we've stated before, Bing values quality links over quantity of links. Inbound links from sites unrelated to the theme of your site, typical with most link exchanges, will be of little to no value to you for improving your page rank.

    What we look for: Link exchanges usually include the following activities:

    • Starts out as email spam. Link exchanges often start out as spam emails sent from webmasters of unrelated sites asking other webmasters if they would like to improve their ranking by exchanging links.
    • Excessive links. Link exchanges (reciprocal links) between unrelated sites, especially when done to excess, can be indicators of web spam, and a participating website might be more heavily scrutinized for other web spam problems.

    Note that reciprocal linking is not an automatic red flag. Some websites within a particular niche will link to others when it provides a relevant value to their customers. For example, think of a bed and breakfast that links out to local wineries and a winery that links out to local bed and breakfasts - these are interrelated activities within a region that are naturally relevant for site visitors.

    But as usual, too much of a good thing can be bad. And when there is no relevance between linked sites, the value of link exchanges can quickly degrade down to the level of web spam (especially when the number of unrelated links is deemed excessive).
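
    The distinction drawn above, between reciprocal links as such and reciprocal links between unrelated sites, can be sketched in a few lines. This is an illustration only, not Bing's method: the outbound link map, the hand-assigned topic labels, and both function names are hypothetical.

```python
def reciprocal_pairs(outbound):
    """outbound maps each site to the set of sites it links to.
    Return every pair of sites that link to each other."""
    pairs = set()
    for site, targets in outbound.items():
        for target in targets:
            if target != site and site in outbound.get(target, set()):
                pairs.add(tuple(sorted((site, target))))
    return pairs


def unrelated_reciprocal_pairs(outbound, topics):
    """Keep only the reciprocal pairs whose sites share no topic label."""
    return {
        (a, b)
        for a, b in reciprocal_pairs(outbound)
        if not (topics.get(a, set()) & topics.get(b, set()))
    }


# Hypothetical data modeled on the bed and breakfast example above.
outbound = {
    "bnb.example": {"winery.example", "pills.example"},
    "winery.example": {"bnb.example"},
    "pills.example": {"bnb.example"},
}
topics = {
    "bnb.example": {"regional-travel"},
    "winery.example": {"regional-travel"},
    "pills.example": {"pharmacy"},
}
print(unrelated_reciprocal_pairs(outbound, topics))
```

    Here the bed and breakfast and the winery link to each other but share a theme, so only the exchange with the unrelated pharmacy site is reported.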

    Penalties and restitution

    Mistakes happen. An entrepreneurial do-it-yourselfer optimizes a website based on bad (spammy) advice from the Web. A Mom-and-Pop-shop website owner naively hires an unscrupulous website consultant. Heck, it's even possible that a search engine might mistakenly label an innocent site as web spam. So what do you do?

    If you made a mistake on your site and your rank has been neutralized, the solution is easy. Web spam neutralization is handled automatically with Bing. If you are using web spam techniques on your website and you want to remove the site's web spam neutralization penalty, eliminate the web spam violations and then republish your website. Once the Bing crawler, MSNBot, recrawls your site, if the web spam violations have been removed, the neutralization will be automatically resolved in the index.

    But what if your site has been purged from the Bing index? That requires some manual intervention.

    Request reconsideration for your site

    If you search for your site in the Bing index using the advanced search keyword phrase site:www.myURL.com (using your URL, of course!) and nothing turns up, your site is not in the index. If this is a sudden change and you know you've used some unscrupulous web spam techniques, you'll need help to get back into the index.

    First of all, fix all of the web spam violations on your site. Not just one or two, but all of them. Then, once you've republished a corrected version of your website, contact Bing support to request reconsideration of your website's penalty. Here's how:

    1. Go to Bing E-mail Support and fill out the form completely.
    2. Select Content Inclusion Request from the drop-down list. A new drop-down will appear underneath.
    3. From the new drop-down list, select Reinclusion request.
    4. Write a clear and detailed explanation of what you have done to resolve the problem in the next text box. (You can prepare this in advance, and then copy and paste the text into the form.)
    5. Type the security code from the presented image into the text box below.
    6. Once the form is completed, click submit.

    A member of the Bing support team will quickly review your request and schedule your site to be recrawled. If the crawler determines that all of the violations have indeed been resolved, then your site is eligible to be added back into the index. But be patient - this process doesn't happen overnight (which is why it's a wise idea to avoid such web spam penalties in the first place).

    For more information on Bing penalties and restitution, see the blog article Getting out of the penalty box.

    If you have any questions, comments, or suggestions, feel free to post them in our Ranking Feedback and Discussion forum. Later...

    -- Rick DeJarnette, Bing Webmaster Center


    P.S. It was suggested to me that I list the other articles in this web spam series for those who might be interested in reading the entire set, so here goes (in order of publication):

    Enjoy!

    Rick

  • The pernicious perfidy of page-level web spam (SEM 101) -

    In the exciting world of today's Internet, where the world's information is literally at your fingertips, where you can endlessly communicate, shop, research, and be entertained, spam is a big downer. The unwanted email spam that fills our inboxes also consumes huge portions of the available bandwidth of our routers and trunk lines. But email is not the only spam game in town.

    Web spam is the bane (well, one of the banes) of the search engine and web searcher communities. Search engines want to provide search users with a great experience, helping them find what they want as quickly and as easily as possible. Search users want to use search engines to get the right information they seek as quickly as possible. And webmasters want search users to find their websites, but also to get those search user visitors to become conversions instead of bounces.

    Web spam, those unwanted garbage pages that use overtly deceptive search engine optimization (SEO) techniques and contain no valuable content, is a frustration to search engines and search users alike, and ultimately works against the best interests of conversion-seeking webmasters (severely annoying a potential customer is rarely a great sales technique!).

    In the previous article that defined web spam and discussed how it is different from junk content, we mentioned that there are two types of web spam. In this article, we're going to delve into the details of the first type: page-level web spam.

    Definition of page-level web spam

    Page-level web spam uses on-page SEO trickery (not to be confused with link-level web spam, which we'll discuss in an upcoming article). Webmasters and optimizers for these sites do this because they believe they can fool the search engines into giving their webpages a higher-than-deserved ranking based on their content relevancy, often for subject areas that are completely unrelated to the site's actual content. This is done in an effort to deceive searchers into visiting their spammy sites for a multitude of reasons, none of which usually benefit the end user.

    The use of the following questionable SEO techniques will cause Bing to examine your site more deeply for page-level web spam. If your site is determined to be using web spam techniques, your site could be penalized as a result.

    Note that Bing recognizes that the core concepts behind many of these techniques can have valid uses. No one is saying that their use always and automatically denotes web spam. The issue of intent behind their use is the distinguishing factor for determining whether or not web spam is present and any site penalties are needed. Please understand that, from a search engine perspective, the web spam effort consistently provides very little to no value whatsoever to end users. The entire effort is directed to fraudulently affect search engine rankings. As Martha Stewart might say, that's not a good thing.

    Keyword URL and link stuffing

    Definition: This is the use of heavily repeated keywords and phrases with the goal of attaining a more favorable ranking for those words in a search engine index.

    Problem: Keywords can be repeated to excess, so much so that they render any text in which they appear unintelligible from a natural language point of view. Those excessive repetitions can also be added in places that are not seen by the end user (meaning outside of displayed page text). Some web spam pages even use repeated keywords that are unrelated to the theme of the page. If any of these conditions are detected, these techniques will draw the attention of Bing as likely web spam.

    What we look for: The purveyors of web spam use a variety of methods for keyword stuffing, including:

    • Excessive repetitions of keywords. The number of repetitions relative to the amount of content on the page is a key indicator of web spam. For example, a very long page of text dedicated to a single topic may naturally repeat its primary theme keyword several times, but a page with less content using the same number of repetitions of the same word may be indicative of keyword stuffing.
    • Stuffing words unrelated to the page or site theme. Stuffing the page with words that are known to be heavily searched on the Web when they are irrelevant to the theme of a site can be an indicator of web spam. Relevance is an important factor for evaluating whether keywords are indicators of web spam.
    • Stuffing on-page text. Littering the text of a page with repeated keywords that render the text meaningless and unreadable to humans is a clear problem. When such content on the page is not useful to people, the content is often suspect as web spam.
    • Stuffing in less visible areas of the page. Placing repeated keywords in less visible areas of a page, such as at the bottom of the page, in links, in Alt text, and in the title tag, can be indicative of web spam.
    • Hiding stuffed keywords in the code of a page. Putting keywords in the code of a page that the search engine crawler (aka a bot) will see, but configuring the page so that a web browser will not show them to a human reader, is highly suspicious. Such methods as formatting text fonts the same color as the background, using extremely small fonts, and hiding stuffed keywords using tag attributes such as style="display: none" and class="hide" (both of which prevent the tagged contents from being shown to the user) will draw the attention of a search engine for closer scrutiny.
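
    Webmasters who inherit a site and want to audit it for the hidden-text technique in the last bullet could start with something like the sketch below. It uses Python's standard html.parser module and only looks for the two attributes named above; HiddenTextFinder and find_hidden_text are hypothetical names, the markup is assumed to be well-formed, and class="hide" only hides content if the site's stylesheet defines it that way.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class HiddenTextFinder(HTMLParser):
    """Collect text inside elements marked style="display: none"
    or class="hide" (including text nested inside such elements)."""

    def __init__(self):
        super().__init__()
        self.stack = []        # one flag per open element: is it hidden?
        self.hidden_text = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attr = dict(attrs)
        style = (attr.get("style") or "").replace(" ", "").lower()
        classes = (attr.get("class") or "").lower().split()
        hidden = "display:none" in style or "hide" in classes
        parent_hidden = bool(self.stack) and self.stack[-1]
        self.stack.append(hidden or parent_hidden)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        if self.stack and self.stack[-1] and data.strip():
            self.hidden_text.append(data.strip())


def find_hidden_text(html):
    finder = HiddenTextFinder()
    finder.feed(html)
    finder.close()
    return finder.hidden_text
```

    Hidden elements have many legitimate uses (menus, dialogs, and so on), so anything this reports is a candidate for manual review rather than proof of stuffing.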

    Note that stuffing the keywords <meta> tag alone is not a reason to be judged as web spam. But <meta> tag stuffing could be an indicator that other web spam techniques may be employed and could draw a search engine to take a closer look at such a site.

    It is important that webmasters not overreact to this information. A small amount of relevant keyword repetition is considered common and is not considered web spam as long as it is used naturally within the page content language and the page provides useful, relevant content. The key message is always the same: develop your pages for human readers, not for search engine bots, for the best results. For more information on creating and using keywords wisely, see the blog articles The key to picking the right keywords and Put your keywords where the emphasis is.
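
    The point about repetition being judged relative to the amount of content can be illustrated with a simple ratio. The sketch below is an assumption-laden toy, not a Bing formula: keyword_density is a hypothetical helper, it counts single words only, and no particular ratio is claimed to be a threshold for web spam.

```python
import re


def keyword_density(text, keyword):
    """Fraction of the words in text that are exactly the given keyword."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)


# The same three repetitions on a very short page and on a longer one.
short_page = "cheap flights cheap flights cheap deals"
long_page = "cheap " * 3 + "travel " * 97

print(keyword_density(short_page, "cheap"))
print(keyword_density(long_page, "cheap"))
```

    Three uses of the keyword make up half of the short page but only a small fraction of the longer one, which is why the same repetition count can read naturally in one place and look stuffed in another.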

    Misspelling and computer generated words

    Definition: Pages populated with many various spellings of targeted keywords, especially those unrelated to the theme of the page or the site, can indicate that the keyword lists are computer generated.

    Problem: Aggressive inclusion of large numbers of misspelled or rare word lists and phrases can be considered web spam when used to excess. The relevance of those words to the theme of the page or the site is the key distinguishing factor here.

    What we look for: The Bing team commonly sees the following techniques on web spam sites:

    • Excessive use of misspelled keywords. Huge lists containing all possible iterations of a misspelled word can be so excessive that the page will be worthy of closer inspection for web spam.
    • Large numbers of misspelled words unrelated to the theme of the site. Long lists of word spelling variations whose core definitions are unrelated to the theme of the page or the site can indicate the site is web spam.
    • Common misspellings of popular site URLs in domain names. Sites whose domain names are common misspellings of popular URLs, like sites filled with other computer-generated content, are usually considered web spam.
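
    One way to think about the last bullet is in terms of edit distance: how many single-character changes separate a domain name from a well-known one. The sketch below uses the standard Levenshtein distance as an illustration; it is an assumption on our part that this measure is a reasonable stand-in, and edit_distance, near_misses, and the max_distance cutoff are all hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions that turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = cur
    return prev[-1]


def near_misses(domain, popular_domains, max_distance=1):
    """Popular domains that the given domain almost, but not exactly, matches."""
    return [
        p for p in popular_domains
        if 0 < edit_distance(domain, p) <= max_distance
    ]


print(near_misses("exmple.com", ["example.com", "weather.example"]))
```

    A near miss is not proof of anything by itself, since plenty of legitimate domain names differ by a letter, but combined with computer-generated content it matches the pattern described above.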

    Redirecting and cloaking

    Definition: When a web client visits a website, certain traits can be used to identify the user and redirect them to a different page. These include, but are not limited to, redirects based on the referral code, the user agent (bot or human), and IP address.

    Problem: Redirecting can be a legitimate technique in some cases, such as when a web client is limited in what it can display on a mobile device web browser, or when a web server uses the client's IP address to determine the language in which to present the content (aka geo-targeting). However, problems arise when sites filter their content based on whether the user agent belongs to an end user web browser versus a search engine bot. This type of filtering can run the gamut from showing the bot a keyword-stuffed page to showing it an entirely different set of content, all of which is an attempt to deceive. When used with this intent, this is web spam.

    What the webmasters who implement these techniques don't understand is that search engines can detect this attempted deception. We do see when the content presented depends on the user agent, and when the differences between the content variations are not of the same kind as those between mobile and desktop browsers.
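
    A webmaster who has hired a consultant and wants to check their own site for this kind of user-agent filtering could compare what the server returns to a bot-like user agent against what it returns to a browser. The sketch below shows the comparison step only and says nothing about how search engines actually do it: the fetch callable, the user agent strings, the 0.5 threshold, and the demo server are all hypothetical.

```python
from difflib import SequenceMatcher


def content_similarity(page_a, page_b):
    """Word-level similarity between two page texts, from 0.0 to 1.0."""
    return SequenceMatcher(None, page_a.split(), page_b.split()).ratio()


def looks_cloaked(fetch, url, threshold=0.5):
    """fetch(url, user_agent) must return the page text served to that
    user agent. The threshold is an arbitrary illustrative cutoff."""
    as_bot = fetch(url, "ExampleBot/1.0")
    as_browser = fetch(url, "Mozilla/5.0")
    return content_similarity(as_bot, as_browser) < threshold


# Stand-in for a real HTTP client: a fake server that filters on user agent.
def cloaking_server(url, user_agent):
    if "bot" in user_agent.lower():
        return "cheap pills cheap pills buy cheap pills now"
    return "Welcome to our gardening blog about growing roses"


def honest_server(url, user_agent):
    return "Welcome to our gardening blog about growing roses"


print(looks_cloaked(cloaking_server, "http://www.example.com/"))
print(looks_cloaked(honest_server, "http://www.example.com/"))
```

    Legitimate variations, such as a mobile layout of the same article, would still share most of their words, which is the difference in kind that the paragraph above describes.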

    What we look for: Some webmasters design their websites to use the following deceptive techniques when the detected user agent is a search engine bot:

    • Script-based redirects. The use of JavaScript or <meta> tag refreshes to automatically change which page is displayed is often suspicious in nature and will get more scrutiny from Bing. This is because some sites use JavaScript to redirect all visiting user agents to a new page, and that page may contain web spam. However, since search engine bots don't execute JavaScript natively, they won't execute the redirect and thus are supposed to index the contents of the original page (although the search engine bots can still detect this behavior).
    • Referral redirects. Some websites consider the referrer when they show a page. When the referrer is a SERP and the target website shows a different page than the one shown when the user directly navigates to the URL, this behavior is considered web spam.
    • Redirect search engine bot to a target page. Some sites detect the user agent specified and send search engine bots to alternate, text-based pages modified with other web spam techniques such as keyword stuffing (but the site provides its normal web content pages to end user web browser user agents). When redirects are filtered on search engine user agents for the purpose of deceiving them, this is a web spam version of cloaking. Bots can detect when they are redirected to special pages. So when this is encountered, it is usually indicative of web spam and will be investigated further.
    • Redirect end users to a target page. Sometimes webmasters use cloaking to work the opposite way than described immediately above. They may serve highly optimized content pages on Topic A to search engine bot user agents, but when a web browser visits the site, the page displayed has content for a completely different subject (typically an illicit one, such as a page promoting porn, casino or online gambling, illicit pharmaceuticals, and the like). The effort here is to rank well for a commonly searched topic of interest in a search engine results page (SERP). Then supposedly when searchers find that link in their SERPs, they click the blue link in the SERP and are unwittingly redirected to the web spam page.
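
    Of the techniques listed above, the <meta> tag refresh is the easiest to audit for in your own pages, because it is plain markup. The following sketch, again using Python's standard html.parser module, lists any refresh redirects it finds; MetaRefreshFinder and find_meta_refreshes are hypothetical names, and JavaScript-based redirects are not covered.

```python
from html.parser import HTMLParser


class MetaRefreshFinder(HTMLParser):
    """Collect (delay, target URL) pairs from <meta http-equiv="refresh"> tags."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("http-equiv") or "").lower() != "refresh":
            return
        # The content attribute looks like "0; url=http://www.example.com/".
        delay, _, rest = (attr.get("content") or "").partition(";")
        rest = rest.strip()
        if rest.lower().startswith("url="):
            url = rest[4:].strip().strip("'\"")
            self.targets.append((delay.strip(), url))


def find_meta_refreshes(html):
    finder = MetaRefreshFinder()
    finder.feed(html)
    finder.close()
    return finder.targets
```

    A refresh with no URL simply reloads the same page and is ignored here. As the article notes, redirects can be legitimate, so each reported target still needs a human judgment about intent.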

    The problem for webmasters practicing these techniques is that their technical deceptions are not very effective. Search engines use a number of techniques to uncover such fraudulent practices as redirect and cloaking web spam. When they are revealed, the websites of the perpetrators are penalized, sometimes severely. Well-meaning webmasters or online business owners who hire unscrupulous consultants or carelessly take black hat SEO advice from indiscriminate sources on the Web are setting themselves up for trouble. Reviewing the issues identified in this article, as well as the official webmaster guidelines for Bing, Yahoo, and Google, will go a long way to keeping a website on the right track for search.

    In the next article on web spam, we'll discuss link-level web spam in detail. We'll also include some information on what to do if your site was pegged as web spam and, after the problems have been resolved, how to request reinstatement into the Bing index as a normal website. Stay tuned!

    If you have any questions, comments, or suggestions, feel free to post them in our Ranking Feedback and Discussion forum. Until next time...

    -- Rick DeJarnette, Bing Webmaster Center

  • Webmaster Center FAQ updated -

    Just a quick note for today.

    While we've been busy working on the new series of web spam blog articles, we were also busy working on updating the recently compiled Webmaster Center FAQ (as described in the blog announcement New Webmaster Center FAQ available) with even more content. While some might call it new and improved, I just claim that it's bigger with more information (hmmm -- I guess it is new and improved!).

    The Webmaster Center FAQ document now contains 82 detailed questions and answers, organized into 12 categories, all navigable through a linked table of contents. As before, it's available as a downloadable PDF through the Microsoft Download Center. Check it out!

    And be sure to keep coming back to the Webmaster Center blog on a regular basis. We have a lot of new and exciting content coming your way soon. You won't want to miss it. Heck, you might even want to subscribe to our blog's RSS feed.

    Thanks for tuning in. Now back to our regularly scheduled programming.

    If you have any questions, comments, or suggestions, feel free to post them in our SEM forum. Later...

    -- Rick DeJarnette, Bing Webmaster Center




