What to do if Google indexed your development site

In my last post, I talked about how to keep Google from indexing your development or staging site. But if it’s already happened, here’s how to fix it.

Let’s say, for example, your development site is http://dev.nerdpress.net, and your correct site is https://www.nerdpress.net. If Google finds and indexes your dev site, the search results could end up looking something like this:

[Screenshot: the dev site showing up in search results]

That’s not great, but it’ll be even worse once you remove the dev site, and the link goes to a 404-not-found error page. (And then Google would eventually catch up, and remove the link altogether.)

Heads up: The following fix requires that you basically kill your dev site. If you still want to have a dev or staging site, pick a different subdomain and put it there. Then follow the instructions here to make sure this doesn’t happen again.

First, add both domains to Google Search Console.

Go to Google Search Console (formerly Webmaster Tools) and add and verify both properties (if they’re not already added). From the main page of Search Console, hit the red “Add a Property” button and follow the steps.

For the dev site, I recommend using the “Domain name provider” verification method (it may be under the “Alternate Methods” tab). The other verification methods may stop working once you set up the redirects in the next step. Follow the instructions that Google provides for your DNS (Domain Name System) provider.

Your DNS settings are probably managed by your web hosting company. However, they could also be managed by your domain registrar (such as GoDaddy or Namecheap). If you’re using Cloudflare, you’ll need to change the settings there.

Second, tell Google the content is gone, and redirect visitors from the dev site to the real site.

Update, February 2020: My new recommendation is to set up redirects for real visitors, and also set up your dev site to respond to Google (and other search engines) with a 410 “Gone” response. This tells Google that you’ve deliberately removed the content, which is a strong indicator that they should remove the URL from the index. Google knows a 410 response is very deliberate and likely to be permanent, so they’ll update the index quickly. This will help get things cleared up in a matter of weeks, instead of months or years!

So by setting up 410 responses for Google, and 301 redirects for real visitors who click on links to the dev site, you get the best of both worlds: Google removes the dev URLs faster, and in the meantime people still land on the content they were looking for on your live site.

To set this up, you’ll need to modify your .htaccess file. This is a simple text file that sits in the top folder of your server (usually something like /public_html/). (Note: if your server is running Nginx instead of Apache, it’s best to ask your host to set this up for you, because Nginx doesn’t use the .htaccess file.)

It actually just takes a few lines, and they should go at the very top of the .htaccess file:

# Issue a 410 "Gone" for Googlebot and other crawlers
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} Googlebot|Baidu|Bingbot|DuckDuckBot|Slurp|Yandex [NC]
RewriteRule (.*) - [R=410,L]

# Redirect other visitors to the live site
# (RewriteEngine is already on above; the dots are escaped so they only match a literal ".")
RewriteCond %{HTTP_HOST} ^dev\.yourdomain\.com [NC]
RewriteRule ^(.*)$ https://www.yourdomain.com/$1 [NC,R=301,L]

The first section checks the user agent, and if it’s Googlebot (or one of the others), it’ll issue the 410 response and that’s that.

If the request gets to the second section (by virtue of not coming from a bot), we then check to see if the request is for the dev site. If it is, we redirect the visitor to the same path on the correct domain.

(In case you’re wondering about the bits in [brackets]: the NC makes the match case-insensitive (“No Case”), the R=301 says it’s a redirect with the status of 301, and the L means it’s the Last rule, so don’t do anything after this.)
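If it helps to see the decision order spelled out, here’s a rough Python sketch of what the two sections do. It’s only a model for reasoning about the rules, not something you install on the server. The respond function and the constant names are made up for illustration, and the sketch ignores query strings (which mod_rewrite carries over to the redirect target by default).

```python
import re

# Same crawler pattern as the first RewriteCond; re.IGNORECASE plays the role of [NC].
BOT_PATTERN = re.compile(r"Googlebot|Baidu|Bingbot|DuckDuckBot|Slurp|Yandex", re.IGNORECASE)

# Placeholder hosts, matching the yourdomain.com placeholders used in the rules.
DEV_HOST = "dev.yourdomain.com"
LIVE_ORIGIN = "https://www.yourdomain.com"


def respond(host, user_agent, path):
    """Return (status, location) the way the two sections would decide it."""
    # Section 1: any crawler in the list gets a 410, and processing stops there.
    if BOT_PATTERN.search(user_agent):
        return 410, None
    # Section 2: everyone else asking for the dev host is sent to the same path on the live site.
    if host.lower().startswith(DEV_HOST):
        return 301, LIVE_ORIGIN + path
    # Neither section matched, so the request is served as usual.
    return None, None
```

Note that the order matters: because the crawler check comes first, Googlebot never reaches the redirect.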

Of course, you’ll need to change yourdomain.com to your actual domain to get this to work.

Once it’s in place, test it to make sure everything redirects correctly (including posts/pages, not just the homepage).
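One way to run that test without clicking around is a small script that requests a few URLs twice, once pretending to be Googlebot and once as a regular browser, and reports anything unexpected. This is a minimal sketch, not an official tool: the function names, the user-agent strings, and the sample paths are my own placeholders, and it assumes the dev site answers over plain HTTP (swap in http.client.HTTPSConnection if yours uses HTTPS).

```python
import http.client

# Example user-agent strings; any string containing "Googlebot" triggers the first rule.
BOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/120.0"


def fetch(host, path, user_agent):
    """Make one GET request without following redirects; return (status, Location header)."""
    conn = http.client.HTTPConnection(host, timeout=10)
    try:
        conn.request("GET", path, headers={"User-Agent": user_agent})
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def check(dev_host, live_origin, paths, fetch=fetch):
    """Return a list of problems found; an empty list means every path behaved as expected."""
    problems = []
    for path in paths:
        # Crawlers should be told the page is gone.
        status, _ = fetch(dev_host, path, BOT_UA)
        if status != 410:
            problems.append(f"{path}: crawler got {status}, expected 410")
        # Real visitors should be redirected to the same path on the live site.
        status, location = fetch(dev_host, path, BROWSER_UA)
        if status != 301 or location != live_origin + path:
            problems.append(f"{path}: visitor got {status} -> {location}")
    return problems
```

To use it, call something like check("dev.yourdomain.com", "https://www.yourdomain.com", ["/", "/sample-post/"]) with your own domains and a handful of real post and page URLs, then print whatever comes back.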

You can use FTP to download, edit, and re-upload the .htaccess file. Or, if your account has cPanel, you can use the File Manager there. Be sure to change the setting to “show hidden files.”

If you’re scared to edit your .htaccess file, I recommend asking your host to take care of this for you; they should be happy to help.

Third, submit a change of address to Google.

In the top right corner of the search console, select your dev site. Then click the gear icon and select “Change of Address.”

[Screenshot: initiating a change of address]

That will walk you through a few simple steps:

[Screenshot: the Change of Address steps]

Finally, patience.

You’re going to need to be patient. It can take weeks or months to untangle this mess. Google has to crawl the site, follow the redirects, learn that the final URL is actually the correct page, and then update the index accordingly.

To keep tabs on how the fix is going, search Google for site:dev.yourdomain.com and note the number of results returned. Over time, it should go down and eventually reach zero.

Comments

  1. Hi Andrew, great article!

    Now let me describe something really stupid I did, and I would really appreciate your opinion..

    I have a client that I was working on a replacement website. I was working on one of my own domains (not a subdomain of their domain), nothing to do with the client’s name. For a number of reasons the process took too long, and as a result, my own domain is indexed and appears in google results.

    Now trying to “connect” the two domains with redirections I don’t think it would be of any help. I brought the dev site down (powered off the vps I was working on) and most probably I will move the dev to a completely different place.

    My domain has no webmaster tools connection, would it be wise to add it and then ask google to de-index it?

    Luckily the client’s website is still above mine in the search results, but still my domain appears on the same page.

    I know that was stupid of me I have no idea how to fix it.

    Thanks!

    1. Hi George,

      Since this was on a subdomain of your own domain, you don’t want to use the “Change of Address” tool in search console (step three in my post). But the 410/301 directives (step two) should still work well. For that you’ll need to get hosting set back up somewhere for the staging domain, so you can modify the .htaccess file with the code.

      You’ll need to tweak the RewriteCond line a bit to match the setup, but the basic idea is the same: Issue a 410 (content removed) response to Googlebot, but issue a 301 (permanent redirect) response to real visitors.
