I’ve started hosting a few simple Rails applications on Heroku and so far, I’m really pleased with their hosting service. This post isn’t as much about Heroku as it is how to serve an XML sitemap for your application. Heroku apps don’t give you file system access from within your application, so you’re forced to host your sitemap on an external service, like Amazon S3. There’s a great plugin called sitemap_generator that lets you generate a sitemap and upload it to your Amazon S3 account using carrierwave and Fog.
Even though sitemap_generater will ping all of the major search engines when you build your sitemap (which you should rebuild regularly with a rake task), you will want to configure the sitemap in Google Webmaster Tools. Unfortunately, Webmaster Tools will only let you set a sitemap to come from your domain, not another host. What can we do to fix that?
Well, the easiest solution I came up with was to create a controller to handle your sitemap, but redirect it to the location of your sitemap on S3 (via CloudFront obviously). So, lets get to the code. Create a file called sitemap_controller.rb and paste this in:
class SitemapController < ApplicationController def index redirect_to SITEMAP_PATH end end
This will redirect a call to the index action of this controller to the value of SITEMAP_PATH. But what is SITEMAP_PATH? Well, in my case, my application relies heavily on a custom Rails engine where all of my controllers and models are defined. So I figured it would be nice to configure the location of the sitemap on a per application basis. So in my actual rails application, I created an initializer and set the value of SITEMAP_PATH. Put this in sitemap.rb in config/initializers:
That's the actual location of your sitemap on S3 (again, most likely via CloudFront). Now all that's left is to wire up a Rails route to actually respond to a request for sitemap.xml. That's done easily enough with the following:
match "/sitemap.xml", :controller => "sitemap", :action => "index"
That's it! Simply restart your app if its already running so the initializer will load and access your sitemap.