<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>New Webmasters &#187; Search Engines</title>
	<atom:link href="http://newwebmasters.net/category/search-engines/feed/" rel="self" type="application/rss+xml" />
	<link>http://newwebmasters.net</link>
	<description>Build a Better Website</description>
	<lastBuildDate>Sun, 03 Jan 2010 23:12:18 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=</generator>
		<item>
		<title>How Good Are Search Engine 404 Error Pages?</title>
		<link>http://newwebmasters.net/search-engines/search-engine-404-error-page/</link>
		<comments>http://newwebmasters.net/search-engines/search-engine-404-error-page/#comments</comments>
		<pubDate>Tue, 22 Sep 2009 20:27:07 +0000</pubDate>
		<dc:creator>corbyboy</dc:creator>
				<category><![CDATA[Search Engines]]></category>
		<category><![CDATA[useability]]></category>

		<guid isPermaLink="false">http://newwebmasters.net/?p=779</guid>
		<description><![CDATA[Search engines are keen to provide us with lots of advice about building good 404 error pages. Let's see how well they follow their own advice.]]></description>
			<content:encoded><![CDATA[<p>Google is pretty good about giving information to webmasters about how to design good pages that get listed in Google. They tell us to use descriptive title tags, how to prevent duplicate content problems and how to check if your site has been hacked.</p>
<div class="captionfull"><img src="http://newwebmasters.net/wp-content/uploads/2009/09/google-404-error.png" alt="Google 404 Error Page" title="Google 404 Error Page" width="640" height="308" />
<p>Google 404 Error Page</p>
</div>
<div class="captionfull"><img src="http://newwebmasters.net/wp-content/uploads/2009/09/yahoo-404-error.png" alt="Yahoo 404 Error Page" title="Yahoo 404 Error Page" width="640" height="308" />
<p>Yahoo 404 Error Page</p>
</div>
<div class="captionfull"><img src="http://newwebmasters.net/wp-content/uploads/2009/09/bing-404-error.png" alt="Bing 404 Error Page" title="Bing 404 Error Page" width="640" height="308" />
<p>Bing 404 Error Page</p>
</div>
<p>They also provide information on how to produce <a href="http://www.google.com/support/webmasters/bin/answer.py?hl=en&#038;answer=93641">good 404 error pages</a>. At the bottom of that article they provide a list of items to include in your 404 error page (<a href="http://newwebmasters.net/reference/http-4xx-errors/">read more about HTTP status codes</a>). These are supposed to help the user find what they are looking for and to ensure they stay on your website.</p>
<p>In this article we are going to look at the 404 error pages on Google, Yahoo and Bing to see how many of these guidelines they follow themselves. The sites are given a mark out of 10 for how well they meet each guideline.</p>
<p>We are going to access the following URLs:</p>
<ul>
<li>http://www.google.com/fake-page</li>
<li>http://www.yahoo.com/fake-page</li>
<li>http://www.bing.com/fake-page</li>
</ul>
<h2>Tell visitors clearly that the page they&#8217;re looking for can&#8217;t be found. Use language that is friendly and inviting</h2>
<h3>Google</h3>
<p><strong>The page &#8211; www.google.com/fake-page &#8211; does not exist.</strong></p>
<p>The message that the page cannot be found is pretty clear although the error text is the same size as the rest of the page.</p>
<p>Mark: <strong>4</strong>/10</p>
<h3>Yahoo</h3>
<p><strong>Sorry, the page you requested was not found.</strong></p>
<p>This text is larger than the rest of the page, is in bold and is centered. It is very clear that the page cannot be found.</p>
<p>Mark: <strong>8</strong>/10</p>
<h3>Bing.com</h3>
<p><strong>Let&#8217;s try that again</strong></p>
<p>Although there is some explanatory text below, this is a terrible header to use for an error page. It is very vague and really doesn&#8217;t help the user.</p>
<p>The text &#8220;That web page doesn&#8217;t exist. Let&#8217;s see if we can help you find what you are looking for&#8221; is much more useful. If they used this as the header they would definitely improve the page.</p>
<p>However, it is the only error page that actually uses an &lt;h1&gt; tag for the page header.</p>
<p>Mark: <strong>4</strong>/10</p>
<h2>Make sure your 404 page uses the same look and feel (including navigation) as the rest of your site.</h2>
<h3>Google</h3>
<p>The page features the standard Google search box and logo at the top and the familiar links to popular Google sites such as Videos, Maps and News.</p>
<p>Mark: <strong>8</strong>/10</p>
<h3>Yahoo</h3>
<p>Features the Yahoo logo and the links back to the homepage and help sections. Doesn&#8217;t feature the standard links to Yahoo Mail that seem to feature on a lot of Yahoo pages. However, each section of Yahoo seems to have its own style and design so this would be a difficult rule for Yahoo to follow.</p>
<p>Mark: <strong>5</strong>/10</p>
<h3>Bing</h3>
<p>Has the standard site header, logo and search box at the top of the page. It is even customised for your location and whether you are signed in or not.</p>
<p>Mark: <strong>10</strong>/10</p>
<h2>Consider adding links to your most popular articles or posts, as well as a link to your site&#8217;s home page.</h2>
<p>The way I read this is that Google is recommending that as well as linking to your homepage, you should direct users to some popular sections of your site that they might be looking for.</p>
<h3>Google</h3>
<p>The only link to the homepage is via the Google logo at the top. Aside from that, the only even remotely useful link is to the help center. And the help topics it links to are not even specific for the error we have encountered. Not very helpful at all.</p>
<p>Mark: <strong>2</strong>/10</p>
<h3>Yahoo</h3>
<p>Yahoo features a link to the homepage via the logo at the top and a text link in the main body. There is also a link to a list of Yahoo&#8217;s services and Yahoo Help Central. Like Google, this help center doesn&#8217;t link to a specific page that is relevant to our current problem.</p>
<p>Mark: <strong>3</strong>/10</p>
<h3>Bing</h3>
<p>Bing features their logo at the top which links to the homepage and a text link to the help pages. As with the other two, not a link to help us with our current problem.</p>
<p>Mark: <strong>3</strong>/10</p>
<h2>Make sure that your webserver returns an actual 404 HTTP status code when a missing page is requested.</h2>
<p>All the pages returned proper 404 headers. It would have been terrible news if they hadn&#8217;t.</p>
<p>Mark: All sites <strong>10</strong>/10</p>
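<p>If you want to run the same check against your own site, a minimal sketch in Python might look like the following. The function names are made up for this illustration, and it assumes the server answers HEAD requests the same way it answers normal page requests.</p>

```python
import urllib.error
import urllib.request


def fetch_status(url, opener=urllib.request.urlopen):
    """Return the HTTP status code the server sends for `url`."""
    # HEAD asks for the headers only; assumed to mirror a normal GET.
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx/5xx responses; the code is on the exception.
        return error.code


def is_real_404(url, opener=urllib.request.urlopen):
    """True when a missing page is reported with a genuine 404 status."""
    return fetch_status(url, opener) == 404
```

<p>Calling <code>is_real_404</code> with a URL you know does not exist should return True. If it returns False, the server is probably sending the error page with a 200 status.</p>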
<h2>Use the Enhance 404 widget to embed a search box on your custom 404 page and provide users with useful information to help them find the information they need.</h2>
<p>None of the search engines actually uses this widget, but what it is trying to achieve is very useful.</p>
<p>It provides a search box and tries to extract query data from the URL to put in this box. This then allows you to search the current site for what you are looking for.</p>
<p>It also tries to list some close match URLs (<a href="http://newwebmasters.net/wordpress--twitterbot/">see an example</a>).</p>
<p>Let&#8217;s see if the search engines have any of these features in their sites.</p>
<h3>Google</h3>
<p>Google has a search box, but it doesn&#8217;t search only Google&#8217;s own pages; it is just the standard Google web search.</p>
<p>There is no list of closest match URLs either. Say I accidentally typed in this URL:</p>
<pre>http://www.google.com/analatics</pre>
<p>Google <em>should</em> be clever enough to know that I am really looking for Google analytics. But it doesn&#8217;t, and that&#8217;s very disappointing.</p>
<p>Mark: <strong>0</strong>/10</p>
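<p>To show that a closest match list is not hard to produce, here is a rough sketch using Python&#8217;s standard difflib module. The list of known paths is invented for the example, and this is not how the Enhance 404 widget or any search engine actually does it.</p>

```python
import difflib
from urllib.parse import urlparse

# Hypothetical list of real top-level paths on the site.
KNOWN_PATHS = ["analytics", "maps", "news", "videos", "webmasters"]


def suggest_paths(requested_url, known_paths=KNOWN_PATHS, limit=3):
    """Return known paths that look similar to the mistyped one."""
    path = urlparse(requested_url).path.strip("/").lower()
    # cutoff=0.6 is difflib's default similarity threshold.
    return difflib.get_close_matches(path, known_paths, n=limit, cutoff=0.6)


print(suggest_paths("http://www.google.com/analatics"))  # prints ['analytics']
```

<p>A 404 page could list these suggestions as links underneath the error message.</p>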
<h3>Yahoo</h3>
<p>Yahoo suffers from the same problem as Google: no list of closest match pages, and the search box doesn&#8217;t search only Yahoo&#8217;s own pages.</p>
<p>Mark: <strong>0</strong>/10</p>
<h3>Bing</h3>
<p>It seems this is standard practice on the big three search engines. No list of closest matches and no search box for just bing.com.</p>
<p>Mark: <strong>0</strong>/10</p>
<h2>Total Marks</h2>
<h3>Google</h3>
<p>24/50 = 48%</p>
<h3>Yahoo</h3>
<p>26/50 = 52%</p>
<h3>Bing</h3>
<p>27/50 = 54%</p>
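<p>Each total is simply the sum of the marks awarded under the individual guidelines. As a quick check, the marks can be added up like this (the variable names are only for illustration):</p>

```python
# Marks awarded above, in the order the guidelines were discussed.
MARKS = {
    "Google": [4, 8, 2, 10, 0],
    "Yahoo": [8, 5, 3, 10, 0],
    "Bing": [4, 10, 3, 10, 0],
}


def total_marks(marks=MARKS):
    """Sum the per-guideline marks for each search engine."""
    return {site: sum(scores) for site, scores in marks.items()}


print(total_marks())  # prints {'Google': 24, 'Yahoo': 26, 'Bing': 27}
```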
<p>You can see that each search engine has similar marks, and that they are fair, but could definitely be improved.</p>
<p>It seems that search engines could do with a dose of their own medicine before telling us how to design our own 404 pages.</p>
]]></content:encoded>
			<wfw:commentRss>http://newwebmasters.net/search-engines/search-engine-404-error-page/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Why You Should Try To NOT Use The New Canonical Tag</title>
		<link>http://newwebmasters.net/search-engines/why-you-should-try-to-not-use-the-new-canonical-tag/</link>
		<comments>http://newwebmasters.net/search-engines/why-you-should-try-to-not-use-the-new-canonical-tag/#comments</comments>
		<pubDate>Tue, 17 Feb 2009 21:22:51 +0000</pubDate>
		<dc:creator>corbyboy</dc:creator>
				<category><![CDATA[Search Engines]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[live]]></category>
		<category><![CDATA[tools]]></category>
		<category><![CDATA[useability]]></category>
		<category><![CDATA[yahoo]]></category>

		<guid isPermaLink="false">http://newwebmasters.net/?p=548</guid>
		<description><![CDATA[Google, Yahoo and Live have all announced support for the new "canonical" tag, which specifies the preferred URL for a page. We look at why you shouldn't use it on your website.]]></description>
			<content:encoded><![CDATA[<div class="captionleft"><img src="http://newwebmasters.net/wp-content/themes/tma/images/latest/google-cutoff.jpg" alt="Google Logo" title="Google Logo" width="470" height="175" />
<p>Google supports the new tag</p>
</div>
<p>Last week, <a href="http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html">Google</a>, <a href="http://ysearchblog.com/2009/02/12/fighting-duplication-adding-more-arrows-to-your-quiver/">Yahoo</a> and <a href="http://blogs.msdn.com/webmaster/archive/2009/02/12/partnering-to-help-solve-duplicate-content-issues.aspx">Live</a> all announced their support for a new tag, placed in the page header, called the &#8220;canonical&#8221; tag. This is designed to let you tell the search engines the preferred URL to index for a page.</p>
<p>It is designed to remove duplicate content issues from your web pages by letting you specify the correct version of the page. It is like a 301 redirect, but is a &#8220;strong hint&#8221; rather than an order.</p>
<p>A good example might be if you use information from the query string to sort data on a page. Your website might generate the following pages:</p>
<ul>
<li class="rel_post">http://example.com/cat1.php?sort=size&#038;dir=desc</li>
<li class="rel_post">http://example.com/cat1.php?sort=price</li>
<li class="rel_post">http://example.com/cat1.php?sort=price&#038;dir=asc</li>
<li class="rel_post">http://example.com/cat1.php?sort=avail&#038;dir=desc</li>
</ul>
<p>Search engines would regard each of these as individual pages, even though they are essentially the same page. The information is just in a different order.</p>
<p>Allowing URLs like this could be detrimental to your SEO efforts as your PageRank would be split between the pages, and people could be linking to the different versions of each page.</p>
<p>You could remove this problem by adding the tag to each of these pages, pointing at the one version you want indexed. There is a more technical explanation of how to use the tag over at the <a href="http://www.seomoz.org/blog/canonical-url-tag-the-most-important-advancement-in-seo-practices-since-sitemaps">SEOmoz blog</a>.</p>
<pre>&lt;link rel="canonical" href="http://example.com/cat1.php" /&gt;</pre>
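<p>For the sorted listing pages above, the tag could be generated rather than typed by hand. The sketch below is only an illustration: the function name is invented, and it assumes that every query string parameter on the page affects nothing but the ordering, so the whole query string can be dropped.</p>

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link_tag(url):
    """Build a canonical <link> tag that drops the query string and fragment."""
    parts = urlsplit(url)
    # Assumption: the query string only changes sort order, never content.
    canonical = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return '<link rel="canonical" href="%s" />' % escape(canonical, quote=True)


print(canonical_link_tag("http://example.com/cat1.php?sort=price&dir=asc"))
# prints <link rel="canonical" href="http://example.com/cat1.php" />
```

<p>If some parameters do change the content (a page number, for instance), they would need to be kept in the canonical URL.</p>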
<h2>Lesson: DON&#8217;T Use This Tag</h2>
<p>The examples <a href="http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html">given by Google</a> underline something very important about this tag; you should be striving to never <strong>need</strong> to use it.</p>
<p>Here are some situations that you should never need to use the tag to resolve:</p>
<h3>WWW vs Non-WWW URLs</h3>
<p>In order to prevent duplicate content, a lot of websites redirect the non-www version of a URL to the www version. For example, accessing http://example.com will redirect you to http://www.example.com.</p>
<p><strong>Lesson: use a 301 redirect</strong></p>
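<p>This kind of redirect is normally set up in the web server configuration, but the logic is simple enough to sketch in a few lines of Python. The function name is invented, and the sketch assumes the site is served from a single hostname with no other subdomains.</p>

```python
from urllib.parse import urlsplit, urlunsplit


def www_redirect(url):
    """Return (301, target) if `url` lacks the www prefix, else None."""
    parts = urlsplit(url)
    host = parts.netloc
    if host.lower().startswith("www."):
        return None  # Already on the preferred hostname.
    target = urlunsplit(
        (parts.scheme, "www." + host, parts.path, parts.query, parts.fragment)
    )
    return 301, target


print(www_redirect("http://example.com/page?x=1"))
# prints (301, 'http://www.example.com/page?x=1')
```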
<h3>Fixing Sloppy Links Generated By Your CMS</h3>
<p>The links generated by the CMS or online store in Google&#8217;s examples are the result of sloppy coding.</p>
<p>While accessing <em>http://example.com/prod.php?p=fish</em> and <em>http://example.com/prod.php?p=fish&#038;cat=animals</em> will result in the same page, these two links should NEVER both be generated by a CMS. While using the canonical tag as a backup might be OK, you should never rely on it.</p>
<p><strong>Lesson: Fix your coding errors first.</strong></p>
<h3>Remove Tracking And Affiliate Links From URLs</h3>
<p>Search results are full of URLs with tracking IDs, such as those generated by Google Analytics URL Builder. Like this:</p>
<ul>
<li><em>http://newwebmasters.net/?utm_source=email&#038;utm_medium=gmail&#038;utm_term=web&#038;utm_campaign=test1</em></li>
</ul>
<p>But these tracking parameters should be logged and stripped from the URL, and the user redirected to the clean address, preferably via a 301 redirect.</p>
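<p>A rough sketch of the log, strip and redirect idea is shown below. The function name is invented, and it assumes that every tracking parameter starts with <code>utm_</code>, as the Google Analytics ones in the example do.</p>

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_tracking(url, prefix="utm_"):
    """Return (clean_url, tracking) with the tracking parameters separated out."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    # Parameters to log before they are removed from the URL.
    tracking = {key: value for key, value in pairs if key.startswith(prefix)}
    kept = [(key, value) for key, value in pairs if not key.startswith(prefix)]
    clean = urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )
    return clean, tracking


clean, tracking = strip_tracking(
    "http://newwebmasters.net/?utm_source=email&utm_medium=gmail"
    "&utm_term=web&utm_campaign=test1"
)
print(clean)  # prints http://newwebmasters.net/
```

<p>The <code>tracking</code> dictionary is what you would write to your logs. If it is not empty, the visitor would then be sent to <code>clean</code> with a 301 redirect.</p>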
<h3>Remove Session Identifiers</h3>
<p>We are all familiar with session IDs in URLs. They look something like this:</p>
<ul>
<li><em>http://www.example.com/index.php?PHPSESSID=a2sd433345gftr</em></li>
</ul>
<p>These should never be used on publicly accessible URLs. If you use PHP, always choose the default cookie option for recording session identifiers. Using identifiers in the URL is less reliable and less secure.</p>
<p>Google also claims that they are able to detect session identifiers in URLs, so in theory this shouldn&#8217;t be a problem.</p>
<p><strong>Lesson: Never use session identifiers in your publicly accessible URLs. Never.</strong></p>
<p>As an exception, one thing this tag will be great for combatting is the number of printable versions of documents I come across via search engines. There is very rarely a link to access the full version of the page directly from the printable version. If all these versions of pages were eradicated from the search results, I would be very happy. People should really be tagging these as noindex, but if they are featuring in the search results, I guess this won&#8217;t happen.</p>
<h2>In Summary</h2>
<p>As usual, most of these ideas are just my personal thoughts. While the new tag <em>shouldn&#8217;t</em> be necessary on your site, it is another useful tool to have handy.</p>
<p>But before you use it, think about what is happening on your website in the first place to make you need to use it.</p>
]]></content:encoded>
			<wfw:commentRss>http://newwebmasters.net/search-engines/why-you-should-try-to-not-use-the-new-canonical-tag/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>All You Ever Wanted To Know About PageRank</title>
		<link>http://newwebmasters.net/link-building/all-you-ever-wanted-to-know-about-pagerank/</link>
		<comments>http://newwebmasters.net/link-building/all-you-ever-wanted-to-know-about-pagerank/#comments</comments>
		<pubDate>Tue, 06 Jan 2009 01:03:54 +0000</pubDate>
		<dc:creator>corbyboy</dc:creator>
				<category><![CDATA[Link Building]]></category>
		<category><![CDATA[Search Engines]]></category>
		<category><![CDATA[google]]></category>

		<guid isPermaLink="false">http://newwebmasters.net/?p=372</guid>
		<description><![CDATA[PageRank has always been a popular topic of conversation between webmasters. Learn all you could ever need to know about the topic in this article.]]></description>
			<content:encoded><![CDATA[<p>PageRank (PR) is one of those terms that everybody who has a website seems to talk about. Everybody who &#8220;does <abbr title="Search Engine Optimisation">SEO</abbr>&#8221; knows what PR is and how to boost it. But what exactly <em>is</em> PageRank? What does it mean to increase it? Why would you want to? We will take a look at all of these issues in this article.</p>
<h2>What Exactly is PageRank?</h2>
<p>PageRank is an algorithm invented by the founders of Google that assigns an &#8220;importance&#8221; rating to every page that it indexes. It is on a logarithmic scale from 0-10. This means that each of the 11 ratings is 10 times higher than the previous one. For example, PR 4 is ten times higher than PR 3. PR 8 is 100 times higher than PR 6.</p>
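<p>Taking the factor of 10 per step at face value, the gap between any two ratings is easy to work out. This is only a sketch of the arithmetic described above, with an invented function name:</p>

```python
def importance_ratio(pr_high, pr_low, base=10):
    """How many times 'higher' one rating is than another on a log scale."""
    # base=10 is the figure used in this article.
    return base ** (pr_high - pr_low)


print(importance_ratio(4, 3))  # prints 10
print(importance_ratio(8, 6))  # prints 100
```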
<p>Essentially, pages with a higher PR are more important, or more specifically, more popular, than pages with a lower PR.</p>
<h2>What Determines PageRank?</h2>
<p>When Google crawls a hyperlink, it counts it as a &#8220;vote&#8221; for the destination page from the source page. By linking to that page, the source page is saying &#8220;I endorse this website. It is good.&#8221; Importantly, however, each vote is weighted, so all votes are not equal. What makes a vote worth more? A higher PR of the source page of course. So pages with a higher PR have more influence in boosting the PR of other pages with a lower PR. So it is not just the quantity of links that influences PR, it is the quality of those links too.</p>
<p>Is all this confusing? It can be if you are unfamiliar with the concept. Let&#8217;s look at it with an illustration.</p>
<div class="captionfull">
<img src="http://newwebmasters.net/wp-content/uploads/2009/01/pr-image.png" alt="" title="Example PageRank illustration" width="500" height="333" />
<p>Example PageRank illustration</p>
</div>
<p>In this illustration the coloured circles represent pages on different websites. The numbers inside represent the PageRank of that page. The arrow represents a hyperlink. You can see that site A has 6 inbound links and a PR of 3, while site B has only 4 inbound links and a PR of 4. This is because the sites that link to site B have higher PRs themselves. They are more authoritative in the eyes of Google, therefore their links are more important.</p>
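<p>The weighted voting idea can be sketched in code. What follows is a simplified version of the PageRank formula published by Google&#8217;s founders, using the damping factor of 0.85 suggested in their paper. It produces raw scores that add up to 1, not the 0-10 figure, and it is certainly not what Google runs in production. The tiny link graph is made up for the example.</p>

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate PageRank; `links` maps each page to the pages it links to."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote equally between its outbound links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its vote over all pages.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical graph: a and b both link to c, and c links back to a.
example = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
```

<p>In this example, page c ends up with the highest score because it receives two votes. Page a comes second: it receives only one vote, but that vote comes from the strongest page. Page b, with no inbound links at all, is left with just the baseline.</p>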
<h2>Why is There Such A Buzz About PageRank?</h2>
<blockquote><p>We use more than 200 signals, including our patented PageRank</p>
]]></content:encoded>
			<wfw:commentRss>http://newwebmasters.net/link-building/all-you-ever-wanted-to-know-about-pagerank/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Live Webmaster Tools Now Detects Malware</title>
		<link>http://newwebmasters.net/search-engines/live-webmaster-tools-now-detects-malware/</link>
		<comments>http://newwebmasters.net/search-engines/live-webmaster-tools-now-detects-malware/#comments</comments>
		<pubDate>Sat, 29 Nov 2008 13:10:25 +0000</pubDate>
		<dc:creator>corbyboy</dc:creator>
				<category><![CDATA[Search Engines]]></category>

		<guid isPermaLink="false">http://newwebmasters.net/?p=412</guid>
		<description><![CDATA[Read all about the latest update to MSN/Live Webmaster Tools]]></description>
			<content:encoded><![CDATA[<div class="captionleft"><img src="http://blogs.msdn.com/blogfiles/webmaster/WindowsLiveWriter/LiveSearchWebmasterCenterFallUpdate_AE22/clip_image004_thumb.jpg" alt="MSN's Webmaster Tools Now detects malware" />
<p>MSN&#8217;s Webmaster Tools Now detects malware</p>
</div>
<p>The latest update to MSN/Live&#8217;s <a href="http://webmaster.live.com">Webmaster Tools</a> notifies webmasters of any pages on their site that contain malware. The service is still playing catch-up to <a href="http://google.com/webmaster">Google Webmaster Central</a>, but it is making progress.</p>
<p>The tool will also notify you if you are linking to any pages that contain malware, so you can remove the link and prevent your site being associated with them.</p>
]]></content:encoded>
			<wfw:commentRss>http://newwebmasters.net/search-engines/live-webmaster-tools-now-detects-malware/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Google Guide To Search Engine Optimization</title>
		<link>http://newwebmasters.net/link-building/google-guide-to-seo/</link>
		<comments>http://newwebmasters.net/link-building/google-guide-to-seo/#comments</comments>
		<pubDate>Thu, 13 Nov 2008 13:10:30 +0000</pubDate>
		<dc:creator>corbyboy</dc:creator>
				<category><![CDATA[Link Building]]></category>
		<category><![CDATA[Search Engines]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[seo]]></category>

		<guid isPermaLink="false">http://newwebmasters.net/?p=380</guid>
		<description><![CDATA[Google has released an excellent beginner's search engine optimization guide. Read about it and find out why it is useful for everybody.]]></description>
			<content:encoded><![CDATA[<p>Google has released what it calls &#8220;<a href="http://www.google.com/webmasters/docs/search-engine-optimization-starter-guide.pdf">Google&#8217;s Search Engine Optimization Starter Guide</a>&#8221; (pdf) over on the <a href="http://googlewebmastercentral.blogspot.com/2008/11/googles-seo-starter-guide.html">Official Google Webmaster Central blog</a>.</p>
<p>After reading through it I have to say that it is an excellent introductory guide to the basics of SEO. It would very firmly go on my new webmaster reading list. It is clearly designed for the beginner, but as an experienced web developer even I appreciate a run through of the basics every now and again to refresh me. It runs through all of the advice using a fictional baseball card collector&#8217;s website as an example.</p>
<p>Some of the key advice given in the document includes:</p>
<ul>
<li class="rel_post">Create unique, accurate page titles</li>
<li class="rel_post">Make use of the &#8220;description&#8221; meta tag</li>
<li class="rel_post">Improve the structure of your URLs</li>
<li class="rel_post">Make your site easier to navigate</li>
<li class="rel_post">Offer quality content and services</li>
<li class="rel_post">Write better anchor text</li>
<li class="rel_post">Use heading tags appropriately</li>
<li class="rel_post">Optimize your use of images</li>
<li class="rel_post">Make effective use of robots.txt</li>
<li class="rel_post">Be aware of rel=&#8221;nofollow&#8221; for links</li>
<li class="rel_post">Promote your website in the right ways</li>
<li class="rel_post">Make use of free webmaster tools</li>
<li class="rel_post">Take advantage of web analytics services</li>
</ul>
<p>All of these topics are wrapped up nicely into a 22-page PDF file.</p>
<h2>Reading More Into Google&#8217;s Advice</h2>
<p>Despite being aimed at beginners, there are a few little nuggets of information that the more experienced developer might be interested in.</p>
<p>The first that caught my eye was Google&#8217;s advice to &#8220;take advantage of&#8221; an image&#8217;s filename. I have never really thought about how Google indexes images, but it seems from this advice that the filename plays at least some part in this.</p>
<p>There is also confirmation of things we already suspected about Google, but weren&#8217;t completely certain of, such as Google&#8217;s love affair with the blog (&#8220;Blog about new content or services&#8221;) and how important it is to have your site registered for local search (&#8220;Add your business to Google&#8217;s Local Business Center&#8221;).</p>
<p>So the guide is not just useful for the beginner, it is a recommended read for everybody.</p>
]]></content:encoded>
			<wfw:commentRss>http://newwebmasters.net/link-building/google-guide-to-seo/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Fix Your Broken Incoming Links With Google Webmaster Tools</title>
		<link>http://newwebmasters.net/link-building/fix-broken-links-with-google-webmaster-tools/</link>
		<comments>http://newwebmasters.net/link-building/fix-broken-links-with-google-webmaster-tools/#comments</comments>
		<pubDate>Tue, 14 Oct 2008 15:38:30 +0000</pubDate>
		<dc:creator>corbyboy</dc:creator>
				<category><![CDATA[Link Building]]></category>
		<category><![CDATA[Search Engines]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[tools]]></category>
		<category><![CDATA[webmaster tools]]></category>

		<guid isPermaLink="false">http://newwebmasters.net/?p=358</guid>
		<description><![CDATA[Google has released a great new tool to help increase the number of inbound links to your website. Learn all about it in this article.]]></description>
			<content:encoded><![CDATA[<p><a href="https://www.google.com/webmasters/tools">Google Webmaster Tools</a> has just released a great new feature that allows you to find out which sites are linking to non-existant (i.e. 404) pages on your website.</p>
<p>Take a look at the example below for this website:</p>
<div class="captionfull">
<img src="http://newwebmasters.net/wp-content/uploads/2008/10/404tools.jpg" alt="Google Webmaster Tools Error Details" title="Google Webmaster Tools Error Details" width="750" height="227" />
<p>Google Webmaster Tools Error Details</p>
</div>
<p>You can find the information in Webmaster Tools by clicking on <em>Diagnostics</em> &gt; <em>Web Crawl</em> &gt; <em>Not Found</em>. This screen will tell you how many people link to each non-existent page. If you click the figure, it will list all the URLs that link to each page. As with a lot of the data available in Webmaster Tools, you can also download it.</p>
<p>Redirecting from these non-existent URLs to ones that do exist is a great way of boosting the number of inbound links to your website. Links that point to non-existent pages will not help you rank in the search engines. Using a 301 redirect to point these links to a real page <em>will</em> help your rankings. <a href="http://www.mattcutts.com/blog/free-direct-text-links/">Matt Cutts</a> describes these as &#8220;free links to your site.&#8221;</p>
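<p>How you set up the redirects depends on your server, but the idea boils down to a lookup table from dead URLs to live ones. The sketch below is only an illustration. The paths in the table are invented, and you would fill it in from the list that Webmaster Tools gives you.</p>

```python
# Hypothetical mapping from dead paths (as reported under "Not Found")
# to the live pages that should receive their inbound links.
REDIRECTS = {
    "/old-page/": "/new-page/",
    "/tpyo-in-link/": "/typo-in-link/",
}


def resolve(path, redirects=REDIRECTS):
    """Return (status, target) for a path that has no page of its own."""
    target = redirects.get(path)
    if target is not None:
        return 301, target  # Permanent redirect passes the link on.
    return 404, None  # Anything unmapped is still a real 404.


print(resolve("/old-page/"))  # prints (301, '/new-page/')
```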
<p>Previously, it was very difficult to find the sites that linked to non-existent pages. This is a great addition to the webmaster tool collection and will become a very valuable resource for webmasters to increase the number of links to their site.</p>
]]></content:encoded>
			<wfw:commentRss>http://newwebmasters.net/link-building/fix-broken-links-with-google-webmaster-tools/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>The Ultimate Guide To Yahoo Search Operators</title>
		<link>http://newwebmasters.net/optimise/the-ultimate-guide-to-yahoo-search-operators/</link>
		<comments>http://newwebmasters.net/optimise/the-ultimate-guide-to-yahoo-search-operators/#comments</comments>
		<pubDate>Fri, 05 Sep 2008 23:24:07 +0000</pubDate>
		<dc:creator>corbyboy</dc:creator>
				<category><![CDATA[Link Building]]></category>
		<category><![CDATA[Optimise]]></category>
		<category><![CDATA[Search Engines]]></category>
		<category><![CDATA[keywords]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[yahoo]]></category>

		<guid isPermaLink="false">http://newwebmasters.net/?p=279</guid>
		<description><![CDATA[Since search engine databases hold such vast quantities of information, they put in place special features that allow you to refine what you search for. These "operators" are very useful when optimising your website for the search engines. Learn how to use them with this article.]]></description>
			<content:encoded><![CDATA[<div class="captionright"><img src="http://newwebmasters.net/wp-content/themes/tma/images/latest/yahooadv.jpg" alt="Advanced Yahoo Search Operators" title="Advanced Yahoo Search Operators" width="470" height="175" />
<p>Advanced Yahoo Search Operators</p>
</div>
<p>Search engines are incredibly advanced pieces of software. The basic interface that we see and the speed at which they work often make us forget that. Since search engine databases hold such vast quantities of information, they put in place special features that allow you to refine what you search for. These &#8220;operators&#8221; are very useful when optimising your website for the search engines.</p>
<p>Yahoo is probably the most popular place to get accurate information on which websites link where. The information on Google is very unreliable, as is that on MSN/Live. Yahoo&#8217;s Site Explorer also gives us a goldmine of useful linking information.</p>
<p>This article will help you to use the advanced search operators in Yahoo Search to optimise and analyse the links to and from your website as well as those to and from your competitors&#8217; websites.</p>
<h2>Why Do I Need To Analyse My Competitors&#8217; Links?</h2>
<p>If you find a useful, relevant and popular website that links to one of your competitors, you need to ask yourself &#8220;<b>why is this site not linking to me?</b>&#8221;</p>
<p>It is also useful to find out which &#8220;power&#8221; sites are linking to you. Sites such as .edu and .gov sites are widely regarded as more important to search engines and it&#8217;s easy to find where these sites link to. (Note: it is not the actual .edu and .gov extensions that make a site more authoritative. Sites like these simply tend to be among the more established and linked-to sites on the Web.) Perhaps you want to find out which .info sites are linking to you with spammy keywords.</p>
<p><b>Important note.</b> When Yahoo returns results in its regular listings and in Site Explorer it lists nofollowed links along with followed links and doesn&#8217;t differentiate between them. You will have to check the links manually or use a special tool to check if the links are followed or not.</p>
<h2>What Operators Can I Use?</h2>
<p>Let&#8217;s get started on the actual operators themselves.</p>
<h3>Link:</h3>
<p>Use link: to find documents that link to a particular URL. Link: requires you to use a full URL, including http://.</p>
<p><b>Example:</b> link:http://search.example.com/</p>
<p>This operator will find any page that links to the specified page. The example above will match http://search.example.com only. It will not match links to http://www.example.com, http://search.example.com/blog or http://help.search.example.com.</p>
<p>When you use the link: operator you need to be careful when choosing to use www or not in the domain. If you look for links to http://example.com, a page that links to http://www.example.com will not be returned unless the webmaster has set up a redirect, which many of them do.</p>
<h3>Linkdomain:</h3>
<p>Use linkdomain: to find pages that link to any URL on that domain and on any subdomains.</p>
<p><b>Example:</b> linkdomain:example.com</p>
<p>This will match pages that link to http://example.com, http://www.example.com, http://search.example.com and http://search.example.com/blog.</p>
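<p>The difference between link: and linkdomain: can be summed up in a few lines of Python. This is only a model of the matching rules as described in this article, not Yahoo&#8217;s actual code. The function names are invented, and treating a trailing slash as insignificant is an assumption.</p>

```python
from urllib.parse import urlsplit


def _normalise(url):
    """Reduce a URL to the parts compared by the link: sketch below."""
    parts = urlsplit(url)
    # Assumption: a trailing slash makes no difference to the match.
    return (parts.scheme.lower(), parts.netloc.lower(),
            parts.path.rstrip("/"), parts.query)


def link_matches(target_url, linked_url):
    """link: matches only links that point at exactly the target URL."""
    return _normalise(target_url) == _normalise(linked_url)


def linkdomain_matches(domain, linked_url):
    """linkdomain: matches links to the domain or any of its subdomains."""
    host = urlsplit(linked_url).netloc.lower()
    domain = domain.lower()
    return host == domain or host.endswith("." + domain)


print(link_matches("http://search.example.com/", "http://search.example.com/blog"))
# prints False
print(linkdomain_matches("example.com", "http://search.example.com/blog"))
# prints True
```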
<p><b>Important note.</b> When you do a search using link: or linkdomain: you will be redirected to Yahoo Site Explorer. Although this is a great tool, often you don&#8217;t want to be redirected. To avoid this happening, simply add one of the other search operators or some search text to your query.</p>
<h3>Site:</h3>
<p>Use site: to restrict your search to a particular domain and all its subdomains. Whereas link: and linkdomain: were looking for links to a site, the site: operator is conducting a search within a particular domain. For this reason, you need to add additional operators or some search text to your query. If you just use site:example.com by itself you will be redirected to Yahoo Site Explorer.</p>
<p><b>Example:</b> site:example.com horse.</p>
<p>This will search the domain example.com for the phrase &#8220;horse.&#8221; It will also match results on search.example.com and help.search.example.com.</p>
<p><b>Example:</b> site: search.example.com horse.</p>
<p>This will match results on search.example.com and help.search.example.com but NOT example.com.</p>
<h3>Hostname:</h3>
<p>Use hostname: to find all documents from a particular host. It won&#8217;t search the subdomains of this host.</p>
<p><b>Example:</b> hostname:example.com horse</p>
<p>This will find results for the query &#8220;horse&#8221; from example.com. It won&#8217;t search blog.example.com.</p>
<p><b>Example:</b> hostname:search.example.com horse</p>
<p>This will only find results from that particular subdomain. It won&#8217;t find results from example.com or help.search.example.com.</p>
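<p>The difference between site: and hostname: comes down to how the host of each result is compared with the domain you typed. The Python sketch below only models the matching rules described above. The function names are made up for illustration, and it makes no claim about how Yahoo implements the operators.</p>

```python
def site_matches(host, domain):
    """site: matches the domain itself and any subdomain of it."""
    host, domain = host.lower(), domain.lower()
    return host == domain or host.endswith("." + domain)


def hostname_matches(host, domain):
    """hostname: matches only the exact host, never its subdomains."""
    return host.lower() == domain.lower()
```

<p>With these rules, help.search.example.com passes the site: test for example.com but fails the hostname: test, which mirrors the examples above.</p>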
<h3>Url:</h3>
<p>Use url: to find a specific document in Yahoo&#8217;s search index. This operator does not search a domain, it simply locates the page and provides a link to it. It is useful to determine if a specific page is indexed.</p>
<p><b>Example:</b> url:http://www.example.com</p>
<p>The full URL is required. In the above example url:www.example.com would not match any results. With this operator you need to be careful about whether you include www or not. This is one of the reasons why webmasters are encouraged to make a site with or without www resolve to the same page.</p>
<h3>Inurl:</h3>
<p>Use inurl: to search the actual URLs in the search index for the search terms.</p>
<p><b>Example:</b> inurl:horse</p>
<p>The example above will return any page with the term &#8220;horse&#8221; in the URL. The search term will be highlighted in the results. You can use inurl: in isolation without any other operators or search terms.</p>
<h3>Intitle:</h3>
<p>This operator will search the page titles for the search terms.</p>
<p><b>Example:</b> intitle:horse</p>
<p>The example above will return any page with the term &#8220;horse&#8221; in the &lt;title&gt; section. The search term will be highlighted in the results. You can use intitle: in isolation without any other operators or search terms.</p>
<h2>Advanced Queries: Combining Them Together</h2>
<p>The following example will show you how to build an advanced search query. It uses most of the operators mentioned above to show you how they can all fit together. The example is fictional but there are explanations to show how you can use them on your own site.</p>
<p>1. We want to find a site that links to two of our competitors but doesn&#8217;t link to us. We are MSN and we want to find any pages that currently link to Google and Yahoo but not to MSN:</p>
<p><b>linkdomain:google.com linkdomain:yahoo.com -linkdomain:msn.com</b></p>
<p>2. If we wanted to restrict this search to pages that contain the exact phrase &#8220;search engine&#8221; we can change it to:</p>
<p><b>linkdomain:google.com linkdomain:yahoo.com -linkdomain:msn.com &#8220;search engine&#8221;</b></p>
<p>The query could be modified to remove the quotation marks so we are not searching for the exact phrase.</p>
<p>3. Say we then wanted to restrict this search to pages with &#8220;seo&#8221; in their URL:</p>
<p><b>linkdomain:google.com linkdomain:yahoo.com -linkdomain:msn.com inurl:seo &#8220;search engine&#8221;</b></p>
<p>4. We then decide we don&#8217;t want to include pages that have the term &#8220;seo&#8221; in their title:</p>
<p><b>linkdomain:google.com linkdomain:yahoo.com -linkdomain:msn.com inurl:seo -intitle:seo &#8220;search engine&#8221;</b></p>
<p>5. Next we decide that, having excluded the term &#8220;seo&#8221; from the title, we only want pages that include the British spelling &#8220;Search Engine Optimisation&#8221; in the title. We only want that exact phrase, with the words in that exact order:</p>
<p><b>linkdomain:google.com linkdomain:yahoo.com -linkdomain:msn.com inurl:seo -intitle:seo intitle:&#8221;search engine optimisation&#8221; &#8220;search engine&#8221;</b></p>
<p>6. We then go back and decide that we only want to include pages that link to the Google and Yahoo homepage. We want to ignore all the links to Google Reader and other subdomains. Our query then becomes:</p>
<p><b>link:http://www.google.com link:http://www.yahoo.com -linkdomain:msn.com inurl:seo -intitle:seo intitle:&#8221;search engine optimisation&#8221; &#8220;search engine&#8221;</b></p>
<p>Note that we have to add http:// when using the link: command. It only looks for links to that exact URL, so using link:http://yahoo.com would not have worked.</p>
<p>7. Let&#8217;s be really fussy and add one final condition: we only want sites that are .co.uk. We can do this with the site: command. Remember that it also searches subdomains of whatever we give it, so adding site:.co.uk will match example.co.uk and sub.example.co.uk.</p>
<div class="captionfull"><img src="http://newwebmasters.net/images/articles/yahooadv.png" alt="Our final results in Yahoo" title="Our final results in Yahoo" width="600" height="394" />
<p>Our final results in Yahoo</p>
</div>
<p>Our final query is therefore:</p>
<p><b>link:http://www.google.com link:http://www.yahoo.com -linkdomain:msn.com inurl:seo -intitle:seo intitle:&#8221;search engine optimisation&#8221; &#8220;search engine&#8221; site:.co.uk</b></p>
<p>You can see from this final example that you can make your query very advanced indeed. Knowing these more advanced operators is very useful for your link building efforts.</p>
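<p>If you build queries like this often, it can help to assemble them in code rather than by hand. The following Python sketch is one possible way to do it. The build_query function and its parameters are hypothetical helpers, not part of any Yahoo tool, and the sketch assumes the order of the terms does not matter.</p>

```python
def build_query(include=(), exclude=(), text=()):
    """Join operator terms into one query string.

    include: operator terms that must match, e.g. "inurl:seo"
    exclude: operator terms to negate with a leading minus sign
    text:    plain search text; multi-word entries are quoted as exact phrases
    """
    parts = list(include)
    parts += ["-" + term for term in exclude]
    for phrase in text:
        parts.append('"%s"' % phrase if " " in phrase else phrase)
    return " ".join(parts)


# Rebuild the final query from step 7 (the terms come out in a different order).
query = build_query(
    include=[
        "link:http://www.google.com",
        "link:http://www.yahoo.com",
        "inurl:seo",
        'intitle:"search engine optimisation"',
        "site:.co.uk",
    ],
    exclude=["linkdomain:msn.com", "intitle:seo"],
    text=["search engine"],
)
print(query)
```

<p>Keeping the included and excluded terms in separate lists makes it easy to add or drop one condition at a time, just as the numbered steps above do.</p>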
]]></content:encoded>
			<wfw:commentRss>http://newwebmasters.net/optimise/the-ultimate-guide-to-yahoo-search-operators/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>A Look Back at Google&#8217;s Acquisitions</title>
		<link>http://newwebmasters.net/history/a-look-back-at-googles-acquisitions/</link>
		<comments>http://newwebmasters.net/history/a-look-back-at-googles-acquisitions/#comments</comments>
		<pubDate>Tue, 19 Aug 2008 09:19:50 +0000</pubDate>
		<dc:creator>corbyboy</dc:creator>
				<category><![CDATA[History]]></category>
		<category><![CDATA[Search Engines]]></category>
		<category><![CDATA[google]]></category>

		<guid isPermaLink="false">http://newwebmasters.net/?p=43</guid>
		<description><![CDATA[Over the years Google has acquired many companies. This article outlines most of them to give you an idea of just how much technology Google actually owns.]]></description>
			<content:encoded><![CDATA[<p>Over the years Google has acquired many companies. Some of the most innovative technology products of the last few years have been snapped up by Google. Many of these they improve or incorporate into an existing product. Others, however, are left to stagnate or are eventually wound up.</p>
<p>According to <a href="http://en.wikipedia.org/wiki/List_of_Google_acquisitions">the article on Wikipedia</a>, Google has acquired 51 companies as of March 2008. The list below outlines most of them to give you an idea of just how much technology Google actually owns. The list is not comprehensive, but it lists all the big ones.</p>
<h2>February 2001 &#8211; Deja</h2>
<p>Deja was a Usenet discussion service that Google bought to develop into Google Groups. At the time Deja&#8217;s archive dated back to 1995. After the acquisition, the archives were extended back to 1981.</p>
<h2>February 2003 &#8211; Pyra Labs Blogger Software</h2>
<p>Blogger was first made available to the public in August 1999. Initially the product was offered for use free of charge but the company ran into financial trouble. After the loss of many employees, the company received some funding and released new ad-supported versions of Blogger. The company was acquired by Google in 2003, upon which the &#8220;pro&#8221; features were made available to everybody free of charge. Opening up features for free became a hallmark of Google&#8217;s acquisition policy.</p>
<h2>April 2003 &#8211; Applied Semantics Online Advertising</h2>
<p>The technology behind Google AdSense and AdWords is based on WordNet, a system that groups synonyms into sets based on their semantic relationships. A company called Oingo, a small search engine company founded in 1998, used this technology. Oingo changed its name to Applied Semantics in 2001. Google paid $102 million to acquire the company in 2003.</p>
<h2>September 2003 &#8211; Kaltix Corp Search Engine</h2>
<p>In the press release confirming the acquisition, Google said that it and Kaltix Corp &#8220;share a common commitment to developing innovative search technologies.&#8221; Google acquired that technology on 30 September 2003 and used it to develop iGoogle.</p>
<h2>June 2004 &#8211; Baidu Chinese Search Engine &#8211; 2.6% Share</h2>
<p>Estimates for the cost of the deal vary from $5 million to $10 million, but on 23 June 2004, Google bought a 2.6% share of Baidu, a Chinese language search engine. The money was used to upgrade their technology and to stabilise the company before it was floated on the stock market. In December 2007 Baidu became the first Chinese company to be included in the NASDAQ-100 index.</p>
<h2>July 2004 &#8211; Picasa Image Management Software</h2>
<div class="captionleft">
<img src="http://newwebmasters.net/wp-content/uploads/2008/08/80px-picasasvg.png" alt="Picasa logo" title="Picasa logo" width="80" height="86" />
<p>Picasa logo</p>
</div>
<p>
Picasa became another service offered for free after Google&#8217;s acquisition from Idealab on 13 July 2004. Picasa has since become heavily integrated with Blogger.</p>
<h2>September and October 2004 &#8211; Google Maps Enhancements</h2>
<p>Between September and October 2004, Google purchased three companies: ZipDash, Where2 and Keyhole, Inc. The technology was used to enhance Google Maps and Google Earth, including Ride Finder.</p>
<h2>March 2005 &#8211; Urchin Software Corporation</h2>
<p>Google announced that they were purchasing Urchin on 28 March 2005. The company was founded in December 1995 and was originally called Web Depot. After a few years of serious investment they eventually became Urchin Software Corporation. Google used their Urchin web analytics software to develop Google Analytics.</p>
<h2>August 2005 &#8211; Android Mobile Phone Software</h2>
<p>Google acquired Android Inc. in August 2005 and later, together with the Open Handset Alliance, developed the Android software platform for mobile devices. It is based on the Linux operating system and lets developers write applications in a Java-like language.</p>
<p>The platform was announced on 5 November 2007 by the Open Handset Alliance, which consists of 34 different companies.</p>
<h2>December 2005 &#8211; AOL (5% Share)</h2>
<div class="captionright">
<img src="http://newwebmasters.net/wp-content/uploads/2008/08/200px-aol_logo.png" alt="AOL Logo" title="AOL Logo" width="200" height="67" />
<p>AOL Logo</p>
</div>
<p>On 20 December 2005 Google announced that they were paying $1 billion for a 5% stake in AOL. The investment allowed the two companies to co-operate on search and advertising. Prior to the acquisition the two companies had worked together on other projects.</p>
<h2>January 2006 &#8211; dMarc Broadcasting</h2>
<p>dMarc Broadcasting, Inc. is an advertising company that specialises in radio advertising. Upon purchase, Google planned to integrate the technology into the AdWords platform. Google paid an initial $102 million for the company. They also announced that the full purchase price could rise to $1.136 billion over three years based on performance and revenue.</p>
<h2>February 2006 &#8211; MeasureMap</h2>
<p>MeasureMap was a small company that developed statistics and analytics software specifically for blogs. Google acquired them on 14 February 2006. The software was incorporated into Google Analytics.</p>
<h2>March 2006 &#8211; Upstartle</h2>
<p>Upstartle&#8217;s online word processor Writely formed the basis for Google Docs. The acquisition was announced on 9 March 2006. Registrations were closed for a few months as the service was integrated. Google Docs now includes spreadsheet and presentation functions.</p>
<h2>June 2006 &#8211; 2Web Technologies</h2>
<p>Most of the Google spreadsheet software was developed by Google itself. However, on 1 June 2006 it quietly bought a company called 2Web Technologies and integrated their flagship software XL2Web into the program. As with other software, the features became free to use once it was in Google&#8217;s hands.</p>
<h2>October 2006 &#8211; YouTube</h2>
<div class="captionright">
<img src="http://newwebmasters.net/wp-content/uploads/2008/08/150px-youtube_logosvg.png" alt="YouTube Logo" title="YouTube Logo" width="150" height="76" />
<p>YouTube logo</p>
</div>
<p>In what was probably its most famous acquisition, Google purchased video sharing site YouTube for $1.65 billion in stock. The two brands remain distinct and Google&#8217;s own &#8220;Google Video&#8221; remains in place. The statistics for YouTube are simply amazing. Although not released very often, the company did state that in 2006, 50,000 videos were being added every day.</p>
<h2>October 2006 &#8211; JotSpot</h2>
<p>Google acquired JotSpot on 31 October 2006. It was rebranded as Google Sites and the features became free to use. It is currently still in beta.</p>
<h2>February 2007 &#8211; AdScape</h2>
<p>Google acquired AdScape on 16 February 2007. It is a small company that specialises in in-game advertising. Google stated about the deal &#8220;In-game advertising is an area where we believe Google could add a lot of value to users.&#8221; Google reportedly paid $23 million for the company.</p>
<h2>April 2007 &#8211; DoubleClick</h2>
<div class="captionleft">
<img src="http://newwebmasters.net/wp-content/uploads/2008/08/180px-doubleclick_logosvg.png" alt="DoubleClick Logo" title="DoubleClick Logo" width="180" height="99" />
<p>DoubleClick Logo</p>
</div>
<p>In its most expensive acquisition, Google purchased DoubleClick, a huge player in the online advertising industry, for $3.1 billion in cash. It took until 11 March 2008 for the deal to be officially closed, following an antitrust hearing. Soon after the acquisition, Google announced that they would be cutting 300 jobs from the company.</p>
<h2>June 2007 &#8211; Feedburner</h2>
<p>Google paid a rumoured $100 million for Feedburner on 3 June 2007. The next month the PRO features were made available for free. The company provides services for web feeds.</p>
<h2>October 2007 &#8211; Jaiku</h2>
<div class="captionright">
<img src="http://newwebmasters.net/wp-content/uploads/2008/08/jaiku_green_logo.png" alt="Jaiku Logo" title="Jaiku Logo" width="89" height="73" />
<p>Jaiku Logo</p>
</div>
<p>Jaiku is a micro-blogging software company founded in Helsinki, Finland. It is like Twitter, but for your mobile phone. It is also more fully-featured than Twitter. New registrations remain closed and news on Google&#8217;s future plans for the platform is quite scarce.</p>
]]></content:encoded>
			<wfw:commentRss>http://newwebmasters.net/history/a-look-back-at-googles-acquisitions/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Exploring MSN&#8217;s New Webmaster Tools</title>
		<link>http://newwebmasters.net/link-building/exploring-msns-new-webmaster-tools/</link>
		<comments>http://newwebmasters.net/link-building/exploring-msns-new-webmaster-tools/#comments</comments>
		<pubDate>Tue, 12 Aug 2008 10:38:44 +0000</pubDate>
		<dc:creator>corbyboy</dc:creator>
				<category><![CDATA[Link Building]]></category>
		<category><![CDATA[Search Engines]]></category>
		<category><![CDATA[live]]></category>
		<category><![CDATA[msn]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[tools]]></category>

		<guid isPermaLink="false">http://newwebmasters.net/?p=203</guid>
		<description><![CDATA[MSN has recently upgraded their Webmaster Center to provide additional tools. We will look at them in this article.]]></description>
			<content:encoded><![CDATA[<div class="captionright">
<img src="http://newwebmasters.net/wp-content/themes/tma/images/latest/msn.png" alt="Live Search keeps improving all the time" title="Live Search keeps improving all the time" />
<p>Live Search keeps improving all the time</p>
</div>
<p>MSN has recently upgraded their <a href="http://webmaster.live.com">Webmaster Center</a> to provide additional tools. Previously they provided basically no information but now they provide some excellent tools and some great nuggets of information. We will look at them in this article.</p>
<h2>Websites Overview</h2>
<div class="captionfull"><img src="http://newwebmasters.net/images/articles/msn/overview.png" alt="Overview of Live Webmaster Center" />
<p>Overview of Live Webmaster Center</p>
</div>
<p>The first page you arrive at will list all the websites that you have previously registered. You can see the authentication code that must be present to use your site at Live Webmaster Center. You can also see whether you have verified the site by XML tag or by meta tag.</p>
<p>From this page you can also add a new website by clicking &#8220;Add a site.&#8221; Click the URL of the site you want to access.</p>
<h2>Website Summary</h2>
<div class="captionfull"><img src="http://newwebmasters.net/images/articles/msn/summary.png" alt="Summary page for your chosen website" />
<p>Summary page for your chosen website</p>
</div>
<p>The next page gives the summary of all the information that Live has about your website. Some excellent statistics are listed here. You can see when the Live crawler last accessed your website. You can also see how many pages are indexed and whether your website is blocked or not.</p>
<p>Live now lists an interesting metric called &#8220;Domain score.&#8221; This is how authoritative Live regards your website to be. It is rated from 0 to 5.</p>
<p>Underneath Site Status you can see the top five pages on your website. Presumably these are ordered by Page score. You can see the last time Live accessed each of these pages and whether they are blocked.</p>
<h2>Website Profile</h2>
<div class="captionfull"><img src="http://newwebmasters.net/images/articles/msn/profile.png" alt="Website profile page" />
<p>Website profile page</p>
</div>
<p>On this page you can give Live the address of your website sitemap. You can change how your website is verified. You can also add an email address so that you can be contacted if there are any crawl issues with your website.</p>
<h2>Crawl Issues</h2>
<div class="captionfull"><img src="http://newwebmasters.net/images/articles/msn/profile.png" alt="View crawl issues with your website" />
<p>View crawl issues with your website</p>
</div>
<p>On this page you can see any issues that were encountered when Live crawled your website. One of the most useful parts of this section is that you can filter the errors by subdomain or folder. According to Live&#8217;s help files, filters can only be applied when 1,000 or more errors are found. In practice, however, this doesn&#8217;t seem to be the case.</p>
<p>You can list pages that were not found (404 errors), pages that were blocked by robots.txt, pages that have long dynamic URLs and unsupported content types. The first 20 results will be listed here but you can download the first 1,000 in CSV format.</p>
<h2>Backlinks</h2>
<div class="captionfull"><img src="http://newwebmasters.net/images/articles/msn/backlinks.png" alt="Website backlink data" />
<p>Website backlink data</p>
</div>
<p>The backlink feature lists all the incoming links to your website. Only the first 20 results are displayed, but you can download the first 1,000.</p>
<h3>Filtering to get Useful Data</h3>
<p>The filtering tools in this section are very useful. First of all you can exclude all internal links. Type your website domain into the filter box. Check the filter radio button and choose the &#8220;Exclude&#8221; option. This will now list only the external links to your website.</p>
<p>You can also filter the results by TLD. For example, you can list all the .edu domains that link to your website. Check the filter radio button and type &#8220;.edu&#8221; into the filter box. Choose to &#8220;include&#8221; this extension.</p>
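<p>The same two filters can be applied to the downloaded data offline. The Python sketch below assumes you have already reduced the download to a plain list of linking URLs. The layout of Live&#8217;s file is not described here, and filter_links is a hypothetical helper rather than anything Live provides.</p>

```python
from urllib.parse import urlparse


def filter_links(urls, pattern, mode="include"):
    """Keep or drop links whose host is the pattern or ends with it.

    pattern: a domain such as "example.com" or an extension such as ".edu"
    mode:    "include" keeps matching links, "exclude" drops them
    """
    suffix = pattern.lower().lstrip(".")
    kept = []
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        matched = host == suffix or host.endswith("." + suffix)
        if matched == (mode == "include"):
            kept.append(url)
    return kept
```

<p>Calling filter_links(urls, "example.com", mode="exclude") leaves only the external links, while filter_links(urls, ".edu") keeps only links from .edu hosts.</p>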
<h2>Outbound Links</h2>
<div class="captionfull"><img src="http://newwebmasters.net/images/articles/msn/outbound.png" alt="View outbound link data" />
<p>View outbound link data</p>
</div>
<p>This page lists all the websites that your site links to. As before only 20 pages are listed but you can download 1,000. These results can be filtered in the same way that the backlink data can.</p>
<h2>Keywords Tool</h2>
<div class="captionfull"><img src="http://newwebmasters.net/images/articles/msn/keywords.png" alt="View keyword performance" />
<p>View keyword performance</p>
</div>
<p>This is a mysterious tool that shows in which order pages are returned for a certain search term. The results do not give any indication at which number the pages appear in the search results, just in which order pages from your site are returned. The page score and last crawled date are also returned for each page. For the time being the real usefulness of this page is questionable.</p>
<p>After the latest update, Live Webmaster Tools certainly is much more useful. It needs some improvement, particularly in the keywords search section. The algorithm to determine page score also needs tweaking. Too many pages are being returned with page score five. Overall though, an excellent improvement.</p>
]]></content:encoded>
			<wfw:commentRss>http://newwebmasters.net/link-building/exploring-msns-new-webmaster-tools/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How to Prevent Pages Being Indexed by Search Engines</title>
		<link>http://newwebmasters.net/link-building/how-to-prevent-pages-being-indexed/</link>
		<comments>http://newwebmasters.net/link-building/how-to-prevent-pages-being-indexed/#comments</comments>
		<pubDate>Sat, 09 Aug 2008 23:01:46 +0000</pubDate>
		<dc:creator>corbyboy</dc:creator>
				<category><![CDATA[Link Building]]></category>
		<category><![CDATA[Search Engines]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[live]]></category>
		<category><![CDATA[webmaster tools]]></category>
		<category><![CDATA[yahoo]]></category>

		<guid isPermaLink="false">http://newwebmasters.net/?p=196</guid>
		<description><![CDATA[There are often times when you want a page or section of your website to be ignored. This article lists the available methods to keep the spiders away.]]></description>
			<content:encoded><![CDATA[<div class="captionright">
<img src="http://newwebmasters.net/wp-content/themes/tma/images/latest/robots.jpg" alt="Keep those robots at bay" title="Keep those robots at bay" />
<p>Keep those robots at bay</p>
</div>
<p>As much as webmasters are desperate for search engine spiders to crawl their pages, there are often times when you want a page or section to be ignored. They might be non-content pages such as terms and conditions or a privacy policy. They could be test pages or just private pages that you don&#8217;t want to share.</p>
<p>This article will show you all the different ways to keep search engines away from your content. The most appropriate method to use for your own situation is up to you.</p>
<h2>Robots.txt</h2>
<p>The standard way of preventing search engines accessing a page is to block it using a robots.txt file. It goes in the base directory of your website.</p>
<p>You use the following syntax:</p>
<pre>
User-agent: *
Disallow: /sessions/
Disallow: /cgi-bin/
</pre>
<p>The * under User-agent means that the rules apply to all spiders. The Disallow rule lists the directories that should be excluded.</p>
<p>You can do other things with robots.txt:</p>
<pre>
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
</pre>
<p>The rules in this example mean that Googlebot is allowed to access the entire site (note how there is nothing after Disallow). Every other user-agent is blocked. There are more advanced examples of how to build a robots.txt file at <a href="http://www.robotstxt.org/robotstxt.html">robotstxt.org</a>.</p>
<p>There are two important things to remember about using robots.txt. The first is that a spider can ignore it. This is particularly true for robots harvesting email addresses or scraping your content. The second is that anybody can read the file. Don&#8217;t use robots.txt to hide secret information, as a quick glance at the file will tell you where all those secret directories are.</p>
<p>You should also be aware that even though search engines won&#8217;t spider the site, the URL can still appear in search engine result lists, especially if the page is linked to from another site.</p>
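<p>Before uploading a robots.txt file it is worth checking that the rules do what you expect. One way, sketched below, is Python&#8217;s standard urllib.robotparser module. The make_parser helper, the &#8220;SomeOtherBot&#8221; name and the example.com URLs are placeholders for illustration, and real spiders may interpret edge cases differently from this parser.</p>

```python
from urllib.robotparser import RobotFileParser


def make_parser(robots_txt):
    """Build a parser from the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# The second example above: Googlebot may crawl everything, others nothing.
rules = make_parser(
    "User-agent: Googlebot\n"
    "Disallow:\n"
    "\n"
    "User-agent: *\n"
    "Disallow: /\n"
)

print(rules.can_fetch("Googlebot", "http://example.com/page.html"))     # True
print(rules.can_fetch("SomeOtherBot", "http://example.com/page.html"))  # False
```

<p>The check only tells you what a well-behaved spider should do. As noted above, a robot is free to ignore the file.</p>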
<h2>META Tags</h2>
<p>You can use meta tags to control how spiders index specific pages. They look something like this:</p>
<pre>
&lt;meta name="robots" content="noindex,nofollow" /&gt;
</pre>
<p>The options that you have concerning robots are:</p>
<ul>
<li>&#8220;<strong>index</strong>&#8221; &#8211; which tells the robot to index that page</li>
<li>&#8220;<strong>noindex</strong>&#8221; &#8211; tells the robot to not index that page</li>
<li>&#8220;<strong>follow</strong>&#8221; &#8211; the robot should follow all the links on that page</li>
<li>&#8220;<strong>nofollow</strong>&#8221; &#8211; the robot should not spider any links from that page</li>
</ul>
<p>The default is &#8220;index,follow&#8221; so if you want your page spidered and the links followed then you can omit the tag.</p>
<p>When using a meta tag to &#8220;noindex&#8221; a page you will not have the problem of the URL appearing in the search engine results. It will be completely omitted. Remember that a page that is not indexed will still be crawled and unless you have also specified &#8220;nofollow&#8221; in your tag, the links on that page will be spidered too.</p>
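<p>To confirm which directives a page is actually sending, you can read the tag back out of the HTML. The Python sketch below uses the standard html.parser module. The class and function names are invented for this example, and it only looks at the generic &#8220;robots&#8221; meta tag, applying the &#8220;index,follow&#8221; default described above when the tag is missing.</p>

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from a page's robots meta tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            for item in content.split(","):
                if item.strip():
                    self.directives.add(item.strip().lower())


def robots_directives(html):
    """Return (index, follow) booleans, defaulting to index,follow."""
    parser = RobotsMetaParser()
    parser.feed(html)
    index = "noindex" not in parser.directives
    follow = "nofollow" not in parser.directives
    return index, follow
```

<p>Passing the page source to robots_directives returns a pair of booleans, so a page carrying the tag shown above would come back as not indexable and not followable.</p>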
<h2>Nofollow Links</h2>
<p>You can use &#8220;nofollow&#8221; on a specific hyperlink to stop search engine spiders from following that link and crawling the page. It is supported by Google, Yahoo! and MSN/Live. You use the following syntax:</p>
<pre>
&lt;a href="http://example.com/page.html" rel="nofollow"&gt;Example.com&lt;/a&gt;
</pre>
<p>This syntax was originally created due to the problem of comment spam on guestbooks and blogs. The original idea was that a site could be linked to without passing any PageRank or link influence.</p>
<p>While this method is useful to prevent a spider following a link to a page, remember that it only affects the specific hyperlink it is attached to. If another link without nofollow points to the same page then that link will be followed and the page will still be indexed.</p>
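<p>Because one followed link is enough to undo the others, it can be useful to list every link on a page together with its nofollow status. The Python sketch below does this with the standard html.parser module. LinkAudit and audit_links are hypothetical names chosen for this example.</p>

```python
from html.parser import HTMLParser


class LinkAudit(HTMLParser):
    """Record every hyperlink and whether it carries rel="nofollow"."""

    def __init__(self):
        super().__init__()
        self.links = []  # list of (href, is_nofollow) tuples

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if href is None:
            return
        # rel may hold several space-separated values, e.g. "external nofollow".
        rel_values = (attrs.get("rel") or "").lower().split()
        self.links.append((href, "nofollow" in rel_values))


def audit_links(html):
    """Return a list of (href, is_nofollow) pairs found in the HTML."""
    parser = LinkAudit()
    parser.feed(html)
    return parser.links
```

<p>Running audit_links over each page that links to your private URL shows at a glance whether any link to it is still followed.</p>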
<h2>Password protected pages</h2>
<p>A common way to keep private pages secure is to protect them with a password. Since the page cannot be accessed without the proper credentials, search engine spiders cannot index the page. As with some of the other examples, the URL may still appear in search engine results if a link to the page is found.</p>
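<p>The following code shows how to prompt for a username and password using PHP and basic HTTP authentication. As written it accepts any credentials and echoes them back, so before relying on it you need to compare the submitted values against your own stored username and password:</p>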
<pre>
&lt;?php
if (!isset($_SERVER['PHP_AUTH_USER'])) {
    header('WWW-Authenticate: Basic realm="My Realm"');
    header('HTTP/1.0 401 Unauthorized');
    echo 'Text to send if user hits Cancel button';
    exit;
} else {
    echo "Hello {$_SERVER['PHP_AUTH_USER']}.";
    echo "You entered {$_SERVER['PHP_AUTH_PW']} as your password.";
}
?&gt;
</pre>
<p>To read more about authentication in PHP visit <a href="http://uk2.php.net/manual/en/features.http-auth.php">php.net</a></p>
<h2>Offline Pages</h2>
<p>It may seem simple but keeping your web pages offline is the perfect way to prevent them being indexed. Maybe they can be put on an intranet rather than fully online. If collaboration or viewing over the Internet is essential then this method isn&#8217;t suitable, but on many occasions it is.</p>
<h2>Remove URLs From Search Index</h2>
<p>Google and Yahoo! both have tools that allow you to remove specific URLs from their search index relatively quickly.</p>
<h3>Google Webmaster Tools</h3>
<p><img src="http://newwebmasters.net/wp-content/uploads/2008/08/googletools-300x175.png" alt="" title="Google Webmaster Tools" width="300" height="175" class="right" />If you have registered with <a href="http://google.com/sitemaps">Google Webmaster Tools</a> then you can use this to remove a URL.</p>
<ol>
<li>From the dashboard click the website that contains the URL.</li>
<li>Then click &#8220;Tools&#8221;, followed by &#8220;Remove URLs.&#8221; Google will give you some tips about getting your URL removed and keeping it like that.</li>
<li>The URL you want to remove should be blocked from further spidering. If you don&#8217;t ensure this then the page will simply be indexed again at some point in the future. Once you are certain this is the case, click &#8220;New Removal Request.&#8221;</li>
<li>You then choose what you want to remove. You can remove an individual URL, an entire directory, a whole site or the Google cached copy of a page. Make your selection and click &#8220;Next.&#8221;</li>
<li>Enter the URL to be removed and click &#8220;Submit Removal Request.&#8221;</li>
</ol>
<h3>Yahoo! Site Explorer</h3>
<p>To use <a href="https://siteexplorer.search.yahoo.com/">Yahoo! Site Explorer</a> you need to verify your site.</p>
<ol>
<li>From the Site Explorer homepage either search for the URL you want to remove or click &#8220;Explore&#8221; and locate it manually.</li>
<li>Click [Delete URL/Path] next to the URL you wish to remove. Any URLs below that particular folder will also be deleted.</li>
<li>You will be presented with a confirmation page which will list all pages to be removed. You can edit this list to keep certain URLs. Click &#8220;Update&#8221; to create a new list.</li>
<li>Click &#8220;Yes&#8221; to confirm your deletions.</li>
</ol>
<p><strong>A note on removing URLs with webmaster tools.</strong> Using Google and Yahoo&#8217;s webmaster tools requires you to verify your site. The deletions will only stay in effect as long as your site is verified. If at any point your site becomes unverified, your deleted URLs will return to the index.</p>
<p><strong>Another important note.</strong> When you remove a URL, make sure it is blocked from further crawling via one of the other methods listed above. Otherwise your deleted URLs will be reindexed soon after they are removed.</p>
]]></content:encoded>
			<wfw:commentRss>http://newwebmasters.net/link-building/how-to-prevent-pages-being-indexed/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
	</channel>
</rss>

