Thursday, March 12, 2009

Handling 301 Redirects with PHP

What To Do With Your 404 Errors

Part 2 of the SEO for PHP Series

Blast! I’m late for my Nunchucks class and I can’t find my keys! I was pretty sure I left them right here by my laptop. Ah ha, there they are—glad I finally found them. Good thing I did, ’cause I need the practice. Check me out at last week’s class.

Don’t you hate it when stuff gets lost? I sure do. You’re sure that something is there and then, all of a sudden, 404! It’d be helpful to have a little 404 post-it note show up on my shelf when my wallet is not in its place. Or, when sitting down to dinner, I’d see a big foam 404 cutout where my son should be when he is still outside playing.

Well, as much as it bugs you, it’s probably just as bad for your site’s visitors to find a link somewhere in a search engine or blog, and come to your site just to get zinged by a big 404 message (the system is down, yo!).

Even cooler than the 404 post-it notes would be little 302 messages, with detailed information pointing me to where the object of my search hides. I can see it now, “302: Keys temporarily moved between the couch cushions.” Of course, you never really see 302 messages on the web, but it would be nice in real life.

So why not equip your site with a clever little something to take care of any misplaced pages? You may even go as far as to write a little script (or borrow the one I’ll include here) to transform those 404s into 301s and get your users back on the road again. Here’s the process I’d follow:

  1. Get yourself a Google account and add your site to Google Webmaster Tools if you haven’t already.
  2. Make a list of 404 errors that your users have come across on your site.
    • Find out what 404 errors Google has found (look through Webmaster Tools).
    • Look in your server logs for other 404 errors.
  3. Search through search engine results for problem pages.
  4. Create a PHP “map” page (essentially an array of missing pages and their new locations).
  5. Add a small piece of code before all other code processed on your site to handle redirects.
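
For the server log part of step 2, a short script can do the sifting. This is only a sketch: it assumes your server writes the Apache common or combined log format, and the function name collect_404_paths and the sample log lines are made up for illustration.

```php
<?php
/**
 * Count 404 responses per requested path.
 * Assumes Apache common/combined log format, where the request appears as
 * "METHOD /path HTTP/x.x" followed by the status code.
 */
function collect_404_paths(array $log_lines): array
{
    $counts = array();
    foreach ($log_lines as $line) {
        // Capture the requested path and the three-digit status code.
        if (preg_match('/"[A-Z]+ (\S+) HTTP\/[\d.]+" (\d{3}) /', $line, $m) && $m[2] === '404') {
            $path = $m[1];
            $counts[$path] = isset($counts[$path]) ? $counts[$path] + 1 : 1;
        }
    }
    arsort($counts); // most frequently missed pages first
    return $counts;
}

// Hypothetical sample lines; in practice you would read your own access log.
$sample = array(
    '1.2.3.4 - - [12/Mar/2009:10:00:00 -0600] "GET /productsconnect.htm HTTP/1.1" 404 512',
    '1.2.3.4 - - [12/Mar/2009:10:00:05 -0600] "GET /products HTTP/1.1" 200 2048',
    '5.6.7.8 - - [12/Mar/2009:10:01:00 -0600] "GET /productsconnect.htm HTTP/1.1" 404 512',
    '5.6.7.8 - - [12/Mar/2009:10:02:00 -0600] "GET /current-news/news-main.htm HTTP/1.1" 404 512',
);
$missing = collect_404_paths($sample);
print_r($missing);
```

The pages that show up most often are the ones most worth adding to the redirect map.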

*A quick side note: right in the middle of typing this entry, I found Nelson’s enormous water jug here in my office—must have been left here sometime today. I just sent it back to him with a friendly 302 post-it note attached “302 Redirect: The water jug you’ve been looking for was temporarily moved to Peter’s office”. Wow, these notes are sure helpful! I need to get a patent attorney on the phone quick! Alright, back to the post…

Remember that not all the 404 messages that are dished out on your server need to be redirected. Some of them are legitimate status codes for pages people have requested that you really don’t have on your site. So sift through and create your list of only the pages you want people to still be able to find. Once you’ve identified these pages, organize them into an array map list like so (I’ve created a file called redir_map.php):

    $redir_map_arr = array(
        'productsconnect.htm' => 'products',
        'press-releases/press-main.htm' => 'news/press-releases',
        'current-news/news-main.htm' => 'news'
    );

OK, I’ve only got 3 pages to redirect here in this example, but it should still be pretty fun. Save the redir_map.php file somewhere that makes sense on your site. Now on to the implementation.

Depending on the type of structure you have on your site for delivering pages, you’ll have to find a way to include this little redirection map before ANYTHING else on your site is processed. This is very important because we’re going to send some HTTP status codes to the agents requesting pages on our site. If you’ve already sent headers, you’re likely going to run into problems.

I personally like to build sites to redirect all incoming traffic to the index.php page (done using an .htaccess file and explained in a future blog entry). This keeps my sites DRY: I don’t have to put includes on every page for a header and footer and whatnot. So I put the following PHP code in my index page before just about everything else (you’ll have to determine where to put this script on your own site; just ensure it is the very first piece of code on your site that sends headers):

    require "redir_map.php";

    foreach ($redir_map_arr as $old_url => $map_to) {
        // strpos() returns 0 for a match at position 0, so compare against false explicitly
        if (strpos($_SERVER['REQUEST_URI'], $old_url) !== false) {
            header("HTTP/1.1 301 Moved Permanently");
            $host = $_SERVER['HTTP_HOST'];
            $uri = rtrim(dirname($_SERVER['PHP_SELF']), '/\\');
            // Carry the original query string over, if there is one
            $query = !empty($_SERVER['REDIRECT_QUERY_STRING']) ? '?' . $_SERVER['REDIRECT_QUERY_STRING'] : '';
            header("Location: http://$host$uri/$map_to/$query");
            exit; // stop here so the rest of the page is not processed
        }
    }

Alright, as you can see, I’m requiring the redir_map.php file we created previously. I then loop through each name-value pair in the array (the key as $old_url and the value as $map_to). Then I look at the page being requested by the incoming agent and see if it contains one of the old URLs in my list. You could write a regular expression for more sophisticated matching, but this works for simple stuff.
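
If you do want regular expressions, here is one way it might look. Treat it as a sketch rather than part of the script above: the $redir_pattern_arr map, its patterns, and the match_redirect function are all hypothetical, and the "$1" placeholder convention is just one I picked for the example.

```php
<?php
// Hypothetical pattern map: each key is a regular expression, each value the
// new location. "$1" in a value is filled with the first captured group.
$redir_pattern_arr = array(
    '#^/press-releases/(\d{4})/.*\.htm$#' => 'news/press-releases/$1',
    '#^/current-news/#'                    => 'news',
);

/**
 * Return the new location for a request path, or null when nothing matches.
 */
function match_redirect(string $request_path, array $patterns): ?string
{
    foreach ($patterns as $pattern => $map_to) {
        if (preg_match($pattern, $request_path, $m)) {
            // Fill "$1", "$2", ... in the target with the captured groups.
            return preg_replace_callback('/\$(\d)/', function ($ref) use ($m) {
                $index = (int) $ref[1];
                return isset($m[$index]) ? $m[$index] : '';
            }, $map_to);
        }
    }
    return null;
}

var_dump(match_redirect('/press-releases/2008/launch.htm', $redir_pattern_arr));
var_dump(match_redirect('/about', $redir_pattern_arr));
```

Returning null for no match keeps the legitimate 404s as 404s, so only the pages in the map get redirected.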

Then comes the meat of this operation. The first thing I’m going to do is send a header telling the agent that the file it’s requesting has been moved permanently. This way, Google, Yahoo et al. will make sure to index the correct page instead of the old URL.

The $host, $uri and $query variables should be obvious (see exception below). After setting those, I simply redirect the request to the new location using the PHP header function again.
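
To see how those pieces fit together, here is a small sketch that builds the same Location URL as a plain function, so you can try it without sending any headers. The function name build_redirect_url, its parameters, and the www.example.com host are assumptions made for the example.

```php
<?php
/**
 * Assemble the Location URL from the same pieces the redirect code uses.
 * The function name and parameters are hypothetical.
 */
function build_redirect_url(string $host, string $php_self, string $map_to, string $query_string = ''): string
{
    // Directory of the running script, without a trailing slash or backslash.
    $uri = rtrim(dirname($php_self), '/\\');
    // Only add "?" when there is a query string to carry over.
    $query = ($query_string !== '') ? '?' . $query_string : '';
    return "http://$host$uri/$map_to/$query";
}

echo build_redirect_url('www.example.com', '/index.php', 'news', 'page=2'), "\n";
echo build_redirect_url('www.example.com', '/shop/index.php', 'products'), "\n";
```

With index.php at the site root, the first call gives http://www.example.com/news/?page=2. When the script lives in a subdirectory such as /shop, that directory is kept in front of the new location.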

Last, I exit out of PHP execution. You’ll want to do this just to make sure your script doesn’t continue on with whatever it would have done had you not just redirected your visitor to the new place.

That’s it! I’m going to leave comments open on this to see what other ideas others may have, or see what others have done for this same problem. Thanks everyone—and keep an eye on those keys and water-jugs!
