What to do if Google indexed your development site

In my last post, I talked about how to keep Google from indexing your development or staging site. But if it’s already happened, here’s how to fix it.

Let’s say, for example, your development site is http://dev.nerdpress.net, and your correct site is https://www.nerdpress.net. If Google finds and indexes your dev site, the search results could end up looking something like this:

[Screenshot: dev site showing up in search results]

That’s not great, but it’ll be even worse once you remove the dev site, and the link goes to a 404-not-found error page. (And then Google would eventually catch up, and remove the link altogether.)

Heads up: The following fix requires that you basically kill your dev site. If you still want to have a dev or staging site, pick a different subdomain and put it there. Then follow the instructions here to make sure this doesn’t happen again.

First, add both domains to Google Search Console.

Go to Google Search Console (formerly Webmaster Tools) and add and verify both properties (if they’re not already added). From the main page of Search Console, hit the red “Add a Property” button and follow the steps.

For the dev site, I recommend using the “Domain name provider” verification method (it may be under the “Alternate Methods” tab). The other verification methods may stop working after you set up the redirects in the next step. Follow the instructions that Google provides for your DNS (Domain Name System) provider.

Your DNS settings are probably managed by your web hosting company. However, they could also be managed by your domain registrar (such as GoDaddy or Namecheap). If you’re using Cloudflare, you’ll need to change the settings there.

Second, tell Google the content is gone, and redirect visitors from the dev site to the real site.

Update February, 2020: My new recommendation is to set up redirects for real visitors, and also set up your dev site to respond to Google (and other search engines) with a 410 “Gone” response. This tells Google that you’ve deliberately removed the content, which is a strong signal that they should remove the URL from the index. Google knows a 410 response is very deliberate and likely to be permanent, so they’ll update the index quickly. This will help get things cleared up in a matter of weeks, instead of months or years!

So by setting up 410 responses for Google — and 301 redirects for real visitors who click on the links to the dev site in the meantime — you get the best of both worlds. Google will remove the content faster, and in the meantime, people will find the content they’re looking for, by being redirected to your live site.

To set this up, you’ll need to modify your .htaccess file. This is a simple text file that sits in the top folder of your server (usually something like /public_html/). (Note: if your server is running Nginx instead of Apache, it’s best to ask your host to set this up for you – Nginx doesn’t use the .htaccess file.)

Update April, 2022: I’ve updated the code below so that it will not issue a 410 or 301 for a sitemap XML file. This way you can still submit your sitemap to Google and encourage it to crawl that faster. You can use Google’s “Sitemap Ping” tool or submit it in Google Search Console (for the staging site).

It actually just takes a few lines, and they should go at the very top of the .htaccess file:

# Issue a 410 "Gone" for Googlebot and other crawlers
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} Googlebot|Baidu|Bingbot|DuckDuckBot|Slurp|Yandex [NC]
RewriteCond %{REQUEST_FILENAME} !\.xml$
RewriteRule (.*) - [R=410,L]

# Redirect other visitors to the live site
RewriteEngine on
RewriteCond %{HTTP_HOST} ^dev.yourdomain.com [NC]
RewriteCond %{REQUEST_FILENAME} !\.xml$
RewriteRule ^(.*)$ https://www.yourdomain.com/$1 [NC,R=301,L]

Be sure to change dev.yourdomain.com and yourdomain.com to your actual domains to get this to work.

The first section checks the user agent, and if it’s Googlebot (or one of the others), and not your Sitemap XML file, it’ll issue the 410 response and that’s that.

If the request gets to the second section (by virtue of not coming from a bot), we then check to see if the request is for the dev site. If it is, we redirect the visitor to the same path on the correct domain.

(In case you’re wondering about the bits in [brackets]: the NC makes it case-insensitive (“No Case”), the R=301 says it’s a redirect with the status of 301, and the L means it’s the Last rule, so don’t do anything after this.)
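
One thing to watch for: the 410 section above has no check on the domain name, so it assumes this .htaccess file only serves the dev site. If your dev and live sites happen to share the same folder (and therefore the same .htaccess file), Googlebot would get the 410 on the live site too. Here’s a rough, untested sketch of a more defensive version, filled in with the example domains from the top of this post (dev.nerdpress.net and www.nerdpress.net are stand-ins for your own). The only changes are the extra HTTP_HOST check in the first section and the escaped dots in the domain name:

```apache
# Sketch only: dev.nerdpress.net and www.nerdpress.net are example domains.
RewriteEngine On

# Issue a 410 "Gone" for crawlers, but only on the dev host,
# and never for .xml files (so the sitemap stays reachable)
RewriteCond %{HTTP_HOST} ^dev\.nerdpress\.net [NC]
RewriteCond %{HTTP_USER_AGENT} Googlebot|Baidu|Bingbot|DuckDuckBot|Slurp|Yandex [NC]
RewriteCond %{REQUEST_FILENAME} !\.xml$
RewriteRule (.*) - [R=410,L]

# Redirect everyone else on the dev host to the same path on the live site
RewriteCond %{HTTP_HOST} ^dev\.nerdpress\.net [NC]
RewriteCond %{REQUEST_FILENAME} !\.xml$
RewriteRule ^(.*)$ https://www.nerdpress.net/$1 [R=301,L]
```

Escaping the dots (\.) makes the pattern match a literal dot instead of “any character.”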

Once it’s in place, test that it redirects everything correctly (including posts and pages, not just the homepage).

You can use FTP to download, edit, and re-upload the .htaccess file. Or, if your account has cPanel, you can use the File Manager there. Be sure to change the setting to “show hidden files.”

If you’re scared to edit your .htaccess file, I recommend asking your host to take care of this for you; they should be happy to help.

Third, submit a change of address to Google.

In the top right corner of the search console, select your dev site. Then click the gear icon and select “Change of Address.”

[Screenshot: initiating a change of address]

That will walk you through a few simple steps:

[Screenshot: Change of Address steps]

Finally, patience.

You’re going to need to be patient. It can take weeks or months to untangle this mess. Google has to crawl the site, follow the redirects, learn that the final URL is actually the correct page, and then update the index accordingly.

To keep tabs on how the fix is going, you can search site:dev.yoursite.com and check out the search results. Note the number of results returned. Over time, it should go down and eventually reach zero.

Comments

  1. Hi! Thank you for your post, but I still have one question. If the redirect is ON, how can I access the dev website to check that this staging environment is working as expected before switching to live?
    Thanks

    1. Hi Francisco – Good question! The idea is to set up the redirect after going “live” with the changes, and it would only be necessary if Google had already indexed your dev site.

      However, if you do want to have the redirect in place and continue working on the dev site, you could change the redirect code so that it does not redirect for your own IP address. See sample code below — I haven’t tested this, but I think this would likely work. The third line says “Don’t do the redirect for this particular IP address.” (You can find your current IP address by simply Googling “What is my IP address”)

      RewriteEngine on
      RewriteCond %{HTTP_HOST} ^dev.yourdomain.com [NC]
      RewriteCond %{REMOTE_ADDR} !=123.45.67.89
      RewriteRule ^(.*)$ http://www.yourdomain.com/$1 [NC,R=301,L]

      Hope that helps!

    1. Hi Evgen – That can be a useful tool, but I don’t typically recommend it… If your dev site is showing up in the SERPs instead of your live site, I’d much rather just set up redirects — that way any real traffic will actually make it to your live site. If you just have Google remove the dev site URLs, there may be a significant lag time before your live site starts showing up in the SERPs – which could lead to a drop in traffic.

      I think it all depends on just how “indexed” the dev site is — if it’s just a few URLs, doing the manual removal may be a good choice. If it’s all the URLs on your site, I’d go with the redirect method.

  2. Andrew, I dove in and followed your guidance, so we’ll see what happens! I completed the change of address today, made note of my current findings regarding the indexing of my site, said a little prayer, and now I’m in patience mode. Hopefully it will rebound sooner rather than later, and I’ll report back as to the relative success of the process!

  3. Hey Andrew,
    My Google Webmaster Tools hasn’t shown the correct visitor numbers for the last month. My Google search clicks are 150-200 a day, but Search Console shows only 3-4 clicks for this period.
    Please Help Me

      1. Hi Andrew,
        Yes, I have treated all four variations differently in the search console.
        Some time ago I deleted the property without https, but a month ago I added it again, and I think I did not add it correctly, so please help.

  4. Andrew,

    This recently occurred to one of my sites. At the same time of discovery my organic search results started falling. I’m assuming the issues are related. Is this common and would it look like duplicate content across sites to Google thus the ‘penalty’?

    Thanks,
    Rick

  5. So Andrew, once the site is verified, why not use the URL removal tool instead, and then set up some kind of server authentication (like a login) for anyone viewing the pages from outside our IP address?

  6. Hi,
    Nice blog, sir. I have a question: my dev site got indexed along with all its tags and categories. I don’t have a website I can redirect them to, as my client refuses to purchase the website. If I delete the website, my indexed pages still won’t be removed. What should I do? Please suggest something; I’m waiting for a perfect solution to this issue.

  7. This is great and thanks for the tips. What do you do if you are trying to submit a name change on the DEV site and you get this error: Restricted to root level domains and subdomains only.

    I have tested that the 301 redirect is working, is that enough to let google know that the site is no longer the main site?

  8. Hi Andrew,
    I’ve asked my hosting co to put a redirect in place, and that’s what they put into the config file:

    RewriteCond %{HTTP_HOST} ^straessle\.lamp9\.cloudsites\.net\.au$ [OR]
    RewriteCond %{HTTP_HOST} ^www\.straessle\.lamp9\.cloudsites\.net\.au$ [NC]
    RewriteRule ^/?$ "https\:\/\/museproject\.com\.au\/" [R=301,L]

    It does the job for the homepage, but it does not redirect sub pages. Do you know what to change to the above lines of code to also redirect sub pages from staging to live?

    1. Hi Marc. I haven’t tested this, but I’m guessing this would do what you are asking. The first two lines are the same as what you shared; I’ve modified the RewriteRule on the third line.

      RewriteCond %{HTTP_HOST} ^straessle\.lamp9\.cloudsites\.net\.au$ [OR]
      RewriteCond %{HTTP_HOST} ^www\.straessle\.lamp9\.cloudsites\.net\.au$ [NC]
      RewriteRule ^(.*)$ https\:\/\/museproject\.com\.au\/$1 [R=301,L]

    1. That’s a regex (regular expression) pattern match and backreference. What that means is that it searches for the part of the URL (after the domain name) and then moves it over to the new domain.

      ^ means “match from the very start of the path”
      .* means “any character, matched zero or more times” (so, basically, anything)
      $ means “match all the way to the end of the path”

      And then whatever it finds in ^(.*)$ will be added where the $1 is.

      So https://olddomain.com/test.html would become https://newdomain.com/test.html
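
      To make that concrete, here’s a small side-by-side sketch (newdomain.com is a placeholder, and I haven’t tested these exact lines). The first rule only matches the homepage, which is why sub pages don’t get redirected by it; the second captures whatever path was requested and reuses it as $1:

      ```apache
      RewriteEngine On

      # Before: matches only an empty path (the homepage), so test.html is left alone
      RewriteRule ^/?$ https://newdomain.com/ [R=301,L]

      # After: matches any path and captures it, so test.html is carried over as $1
      RewriteRule ^(.*)$ https://newdomain.com/$1 [R=301,L]
      ```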

      This is a good primer on using Regex:
      https://httpd.apache.org/docs/2.4/rewrite/intro.html#regex

  9. The easiest solution to this, of course, is to not use a dev site at all, and to simply push updates directly to your live site LIKE A BOSS!

    Okay, just kidding. 😉

  10. My test website already got indexed and website is removed from server as well. In this case I won’t be able to redirect it. What is best solution to overcome this issue.

    1. Hi Phan –
      I wouldn’t necessarily say that “http” is “incorrect,” but I do agree it’s better to use https with sample code. I just updated the post. 🙂
      Thanks!

  11. Hi Andrew
    I developed a website on, for example, “domain.com”, and I made a mistake in the robots.txt that was supposed to keep Google from indexing my domain (I developed the website on “domain.com”, not “dev.domain.com”). After one month I realized Google had indexed 94 URLs from my domain. These URLs have test names and test content, and I don’t want to keep them. What should I do?
    I need to remove all of the site’s URLs and content and make a fresh start with “domain.com” from scratch.

    Thanks in advance

    1. If Google has indexed the dev site, then odds are good that it’s competing with your production site in the search results. I’ve also seen cases where Google has decided that the dev site is the canonical domain, and has essentially booted the entire production site/URLs out of the index. So if you suddenly password protected the dev URLs, you’d block legit visitor traffic from searches too.

      HTTP auth passwords are a great way to prevent this problem in the first place – I didn’t include that in my other post on this, though, since it’s more complicated for most people to set up.
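
      For anyone who wants to try it on a fresh dev site (one that hasn’t been indexed yet), a minimal sketch of HTTP basic auth in the dev site’s .htaccess looks roughly like this. The realm name and the path to the password file are made up for illustration; you’d create the password file separately (for example with Apache’s htpasswd utility) and point AuthUserFile at wherever it actually lives:

      ```apache
      # Ask for a username and password before serving anything on the dev site.
      # "Dev site" and /home/youruser/.htpasswd are hypothetical placeholders.
      AuthType Basic
      AuthName "Dev site"
      AuthUserFile /home/youruser/.htpasswd
      Require valid-user
      ```

      Crawlers can’t log in, so they get a 401 response instead of your content.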

  12. Hello, is this still the best process in 2020? We have a development website that has been indexed for a few months, and our new website went live recently, creating lots of duplicate pages and content out there.
    I’ve seen some posts that say do a 301, while others say allow for a 404 error and let Google find the correct pages on their own.

    1. Hi Nick,
      I’m glad you asked — I’ve actually been working on a better solution here, but haven’t updated the post just yet since I haven’t tested it fully.
      The better way to go would be to issue a 410 (“Content Gone”) response, but only to Googlebot. And for any other visitors, still do the 301 redirect.

      That’s the best of both worlds. When Google sees a 410, they’ll remove a URL from the index faster than with a 404, since 404s are often mistakes; a 410 sends a very clear “we’ve deliberately removed this!” message. And in the meantime, any real humans who click the link in the search results will actually get to the content they’re looking for, so it’s better for users, too.

      To implement this, you’ll need to add code in the .htaccess file to check for Googlebot (before your 301 redirect code) and then issue the 410. It’ll be something along these lines (this isn’t all the code you need, but hopefully it gets you going in the right direction):

      RewriteEngine On
      RewriteCond %{HTTP_USER_AGENT} Googlebot [NC]
      RewriteRule (.*) - [R=410,L]

      Hope that helps!

    2. Hey Nick –
      I just updated the post with a slightly better version of the code (including a few other popular search engine crawlers for the 410 status code). 🙂
      Hope that helps!

  13. Hey Andrew, thanks for this helpful tutorial. We ran into this issue and followed your steps exactly to correct. However, after putting the 410 status code in along with the 301 I’m running into an error when trying to submit the change of address.

    Search console keeps saying they can’t fetch the staging site, I’m assuming because of the 410 status code, therefore, it is not allowing me to submit the change of address.

    Any ideas here?

    1. Hi Paula,
      Hmm… I’m guessing you’ll need to disable the 410 response temporarily, so that you can submit the change of address successfully. Once that’s done in GSC, re-enable the 410 code and I think you should be good.
      Please let us know if/how that works, and I’ll adjust the instructions accordingly!

  14. Replying to my earlier comment, but I don’t see it here. After removing the 410 we were able to submit the change of address in GSC with no problem.

    I did get the message from GSC that the change of address has “started” which made me wonder if I should leave off that 410 for now, so they have time to crawl each page and see that each one has changed address and see the 301 directives to each page on the live site. What are your thoughts? Thanks for taking the time to help!

    1. I just re-read the details of the “change of address” tool, here:
      https://support.google.com/webmasters/answer/9370220

      It does sound like if you remove the 301’s (for Google) then it might stop updating the address — but they’re also not factoring in returning 410 responses, either.

      I suppose the question now is: Are the staging URLs that are indexed in Google actually generating significant traffic and click-throughs? If so, you may actually want to keep the 301s and wait for the Change of Address tool to do its thing.

      If they’re not actually generating any significant traffic, you’re probably better off just re-enabling the 410 responses for Google, to get the URLs dropped out of the search results as fast as possible.

  15. Hey Andrew, our developer forgot to no-index our dev site, and some pages got indexed and are now ranking. I’m on WordPress, so I went in and checked the box to discourage search engines from indexing. A few ranking pages are still out there. Does all of this information still apply in February 2021? Can I just redirect my dev link to the proper live page, or should I follow all the steps?

    1. Hi Garrett – Yep, this all still applies. Redirecting the dev site to the live site is a bare minimum (and if you do that, discouraging engines from indexing actually doesn’t really apply, since the redirect will happen first anyway).
      Really, it’s best to do the other steps to help speed the process.
      Good luck!

  16. Hi Andrew, great article!

    Now let me describe something really stupid I did, and I would really appreciate your opinion..

    I have a client that I was working on a replacement website. I was working on one of my own domains (not a subdomain of their domain), nothing to do with the client’s name. For a number of reasons the process took too long, and as a result, my own domain is indexed and appears in google results.

    Now trying to “connect” the two domains with redirections I don’t think it would be of any help. I brought the dev site down (powered off the vps I was working on) and most probably I will move the dev to a completely different place.

    My domain has no webmaster tools connection, would it be wise to add it and then ask google to de-index it?

    Luckily the client’s website is still above mine in the search results, but still my domain appears on the same page.

    I know that was stupid of me, and I have no idea how to fix it.

    Thanks!

    1. Hi George,

      Since this was on a subdomain of your own domain, you don’t want to use the “Change of Address” tool in search console (step three in my post). But the 410/301 directives (step two) should still work well. For that you’ll need to get hosting set back up somewhere for the staging domain, so you can modify the .htaccess file with the code.

      You’ll need to tweak the RewriteCond line a bit to match the setup, but the basic idea is the same: Issue a 410 (content removed) response to Googlebot, but issue a 301 (permanent redirect) response to real visitors.

  17. When I get to step #3, the Google Search Console tool tells me that it needs a 301 redirect instead of a 410. Would you recommend this?

    This is the message from the tool:

    Required
    301-redirect from homepage

    Recommended
    301-redirect from sample pages

    1. Hi Jeff,

      I think the best way to solve this probably depends on the specific situation you find yourself in.

      Are the staging URLs that are indexed in Google actually generating significant traffic and click-throughs? If so, you may actually want to keep the 301s and wait for the Change of Address tool to do its thing (and not issue the 410’s at all).

      If they’re not actually generating any significant traffic, you’re probably better off just re-enabling the 410 responses for Google to get the URLs dropped out of the search results as fast as possible. You may also be able to use the “Removal Tool” to get those URLs out of the index quickly — of course just be extra careful not to remove any URLs that you do want to keep.

      Let us know how it goes!
