Google is really good at finding, crawling, and indexing websites – that’s their core business. The trick is, sometimes you don’t want them to.
If you’re redesigning your site, your designer may set up a “development” or “dev” site at a temporary location. That’ll let you see their design and make changes before it goes “live.” The domain might look something like dev.nerdpress.net, or if it’s on their own domain, perhaps nerdpress.webdesigncompany.com.
Or, you might maintain a clone of your site for ongoing testing — to check plugins for compatibility before upgrading, for example.
It’s really important that you keep Google (and other search engines) from indexing development, testing, or staging sites – otherwise, you may end up with pages from that domain in the search results. That can cause duplicate content issues, since you’ll probably have many posts & pages that were copied over from your real site.
In a worst-case scenario, if Google decides that the dev site is the authoritative version of your site, it will start showing it in the search results instead of your main site.
This is a disaster. Traffic will start going to the dev site, and then once you take down the dev site, all those links will be broken, leaving you with nothing — and your main site might never recover on its own. I ran into this on a client’s site eight months ago, and Google still has a handful of the development site URLs in the index, even with the correct fix in place.
Block Search Engines on a Test Site
There are a few ways you can prevent this from happening. The easiest and fastest is to use a built-in WordPress setting to block search engines. In the test site’s dashboard, go to Settings > Reading and check the box next to “Search Engine Visibility.”
This tells WordPress to add a “noindex” tag in your website’s <head> section, which in turn tells Google not to index the content. It also modifies your site’s robots.txt file to tell bots they’re not welcome here.
But this comes with one huge caveat: If you ever turn the secondary site into your main site (as is often the case with a redesign), you must remember to un-check that box!! If you don’t, you could end up accidentally killing all of your real site’s positioning in Google.
Because that is such an easy mistake to make, and comes with extreme consequences, I do not recommend using the above technique for a development site that will be copied over to become the live site. It’s just too easy to screw that up.
Block Search Engines on a Development Site
So, for a development site that may be copied over at some point in the future, a better solution is to password protect the entire site. Not only will this keep Google from ever seeing the site – it’ll also keep real people (other than those you want, of course) from seeing it, too.
The ideal way to password protect a development site is at the server level. Each web hosting company has a slightly different way of setting this up, but if your host uses cPanel, this short video will walk you through it. (Or you could just open a support ticket and ask your host to do it for you.)
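If your host runs Apache and lets you edit the site’s .htaccess file, server-level protection looks roughly like the sketch below. It is a minimal example under assumptions, not necessarily what cPanel generates for you: the realm name, the username, and the password file path are placeholders, and the password file has to be created separately.

```apache
# Minimal sketch of HTTP Basic authentication in .htaccess.
# Assumptions: Apache with .htaccess overrides for auth enabled,
# and a password file already created, for example with:
#   htpasswd -c /home/exampleuser/.htpasswd devuser
# Both the path and the username "devuser" are hypothetical.
AuthType Basic
AuthName "Development site"
AuthUserFile /home/exampleuser/.htpasswd
Require valid-user
```

Keep the password file outside the publicly served folder if you can, so it can’t be downloaded through the browser.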
Alternatively, I recommend the Password Protected plugin as a quick and easy method to password protect your development site. It has a settings page to allow you to set a password, and once you enable it, visitors will see a nice-looking password entry form.
Of course, once you copy over the development site, you’ll need to deactivate and remove that plugin — but it’s a lot easier to remember to do that.
And if you didn’t do all of the above, here’s how to fix it if Google already indexed your development site.
Unfortunately, the password plugin does not stop Google from indexing the site. Your first method is an OK start; however, to be sure your site won’t get indexed, you should combine it with an .htaccess-based password to restrict access to the site.
Your first method will stop most crawlers; however, search engines are not required to honor the tag.
Hi! Strictly speaking, you’re right – Google could index a password-protected (via plugin) homepage, but since that’s going to be such “thin content” it won’t really matter… it likely won’t be showing up in search results for anything (just because it’s in the index doesn’t mean it shows up in the SERPs in any meaningful way), and it certainly won’t compete with your production site. But that’s also why I included the plugin method as a last option. 🙂
I tried using the following in htaccess:
deny from all
where xxx is my tcp addr
2 questions / issues:
1. I’m getting a “forbidden” error even when I try from my current tcp addr.
2. what happens if I have a dynamic IP? I can only think that I would have to change htaccess every day…
If your site is behind a proxy (such as Cloudflare, Cloudproxy, or even nginx or Varnish cache on the server itself), you may have trouble implementing blocks by IP address. And yes, you raise a good point that if your IP address changes, you will need to adjust the .htaccess file each time.
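For reference, an IP allowlist in .htaccess can be sketched like this. It assumes Apache 2.4 or later and uses a placeholder address from the documentation range, so swap in your own. If Apache only ever sees the proxy’s address, this rule will lock you out too, which may be what is happening with the “forbidden” error.

```apache
# Minimal sketch, assuming Apache 2.4+ (mod_authz_host).
# 203.0.113.10 is a placeholder; replace it with your public IP address.
# Requests from any other address are refused with 403 Forbidden.
Require ip 203.0.113.10
```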
You could still implement .htaccess password protection, though, as I mentioned above. This would cause the browser to show a request for username/password before displaying the page…though this is a little bit more complicated. I linked to a video in the post, for setting it up with cPanel. This article has some more info on setting up the .htaccess password protection without cPanel.
Really, for the purposes of keeping Google from indexing your dev site, simply having a “Noindex” is sufficient… so you can enable the setting in WordPress to discourage search engines from indexing your site, and you’ll be all set.
Is it possible with robots.txt or .htaccess file to remove these URLs from Google? Because I have a staging website and logins are not given by the client. If possible please reply ASAP.
So when I remove the tag “”, how long will it take Google to reindex?
Hi, my dev site is already indexed. We are still redesigning our site and it is planned to go live soon. Can I still use one of those solutions you gave to stop Google from indexing my dev site? I got confused because at the end of this tutorial you mentioned a link to follow saying “And if you didn’t do all of the above, here’s how to fix it if Google already indexed your development site.” Thank you for your help!
Yes, if Google has already indexed your dev site, you should follow the instructions in my other post:
Google has changed the Search Console a bit since I published that, but the basic idea is still the same: rather than blocking the dev site from Google, you want to redirect all of its URLs to your live site. Eventually Google will catch up and update the index.
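As a rough illustration of that redirect, here is a minimal .htaccess sketch. It assumes Apache with mod_rewrite enabled, and both hostnames are placeholders (dev.example.com for the dev site, www.example.com for the live one). Your host may offer a simpler way to do the same thing.

```apache
# Minimal sketch: permanently redirect every dev-site URL to the
# same path on the live site. Hostnames below are hypothetical.
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{HTTP_HOST} ^dev\.example\.com$ [NC]
RewriteRule ^(.*)$ https://www.example.com/$1 [R=301,L]
</IfModule>
```

On a WordPress site, these lines would need to sit above the standard WordPress rewrite block so they run first.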
Hi, out of curiosity, if I am building a new site, straight onto a new URL, and adding new content to it gradually every day, would it not be best to make it accessible to Google? The idea being that if Google goes back fairly often, seeing you doing updates, it would be a good thing in Google’s eye. Just a thought.
If you use WordPress, just install “Coming Soon Page & Maintenance Mode”. Visitors will see a maintenance-mode page, so Google cannot index the posts on your dev site, but the site displays normally when you are signed in to wp-admin.
This is very helpful, thanks!
I have a dev site going so I can a/b test some large changes to see if they’re worth it from a speed perspective. So far I’ve just created the dev as an exact copy of my live site & fixed a couple of minor things like images being hard coded (so calling from the live site) etc. No plugin changes yet.
However, the speed of the dev is dramatically slower than the live site. Could this be because robots are being “discouraged” when I marked the page to discourage indexing?
Difficult to run a/b speed tests when the dev isn’t testing at the same speed, prior to any changes happening at all.
thanks for any help!