Google, Sitemaps, Multi-Culture, and ASP.NET
A while ago, by watching my web server logs and some statistics in Google Analytics, I discovered that writing content in English and hosting it on a server based in the US makes my site visible mostly from within the US. It is also a bit visible on the localized Google sites, but the page rank there is quite low... So I tried to improve my page rank.
Google Webmaster Tools is an interesting tool that shows rather precisely how Google scans a website. It can alert you when it encounters server errors (500), broken links (404) or any other unusual error on links that point to your website. That tool helped me fix a URL rewriting bug a while ago.
There's also a section called Sitemap, which allows you to explicitly provide a list of URLs from your site. This is more of a hint than an order, but it helps get the content indexed a bit more quickly.
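To give an idea of what such a list looks like, here is a minimal sketch that builds a sitemap file in the sitemaps.org XML format. It is written in Python purely as an illustration, not the ASP.NET code behind this site, and the `build_sitemap` helper and the example.com URLs are made up for the example.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML document (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # <loc> is the only required child of <url>; special characters
        # such as "&" are escaped automatically on serialization.
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical URLs, for illustration only.
sitemap_xml = build_sitemap([
    "http://www.example.com/default.aspx",
    "http://www.example.com/about.aspx",
])
print(sitemap_xml)
```

The resulting file is what gets submitted in the Sitemap section; the crawler remains free to ignore it.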
To have my content displayed in multiple languages, namely English, French and Spanish, I tried several techniques. The first was the automatic culture selection of ASP.NET. This is fine when a "real" user is browsing the site, but it is not effective for indexing: the automatic culture is based on the Accept-Language HTTP header, which is not set by the Google crawler... Google then only indexes the default language, which is English for me.
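The sketch below shows why a crawler ends up on the default language with this approach. It is a simplified, header-based selection written in Python for illustration and is not how ASP.NET implements it internally; the `pick_culture` function and the `en-US` and `es-ES` culture names are assumptions made for the example.

```python
SUPPORTED = ["en-US", "fr-FR", "es-ES"]  # assumed culture names
DEFAULT = "en-US"                        # assumed default culture


def pick_culture(accept_language):
    """Choose a supported culture from an Accept-Language header value."""
    if not accept_language:
        # No header at all (the crawler case): fall back to the default.
        return DEFAULT

    candidates = []
    for part in accept_language.split(","):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0  # a tag without an explicit q-value has weight 1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # ignore tags with a malformed weight
        candidates.append((quality, tag.strip().lower()))

    # Highest weight first; the sort is stable, so header order breaks ties.
    for quality, tag in sorted(candidates, key=lambda c: -c[0]):
        if quality <= 0:
            continue
        for culture in SUPPORTED:
            # Match on the language part only, so "fr-CA" still gets French.
            if culture.lower().split("-")[0] == tag.split("-")[0]:
                return culture
    return DEFAULT


print(pick_culture("fr-FR,fr;q=0.9,en;q=0.5"))  # a French browser
print(pick_culture(None))                       # a request without the header
```

A request without the header always lands on the default culture, so only that version of each page is ever seen.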
Then I tried HTTP query parameters, by adding a &lang=fr-FR at the end of each page URL. Well, that was not good enough either, since it seems that Google does not index dynamic content very well.
Finally, I tried adding the culture just before the aspx extension (default.fr-FR.aspx), which "fools" Google into thinking this is static content. This is an extension of the URL rewriting technique I already use to have the titles of my blog posts in my URLs.
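The rewriting step boils down to splitting the requested path into the real page and the culture. Here is a minimal sketch of that idea in Python; it is an illustration and not the rewriting code this site runs, and the `split_culture` function, the pattern and the `en-US` default are assumptions.

```python
import re

# Matches a path ending in "<name>.<culture>.aspx", e.g. "/default.fr-FR.aspx".
CULTURE_URL = re.compile(r"^(?P<base>.*)\.(?P<culture>[a-z]{2}-[A-Z]{2})\.aspx$")


def split_culture(path, default_culture="en-US"):
    """Return (real_path, culture) for a requested path."""
    match = CULTURE_URL.match(path)
    if match is None:
        # No culture in the URL: serve the page as-is in the default culture.
        return path, default_culture
    # Drop the culture segment to get back the page that really exists.
    return match.group("base") + ".aspx", match.group("culture")


print(split_culture("/default.fr-FR.aspx"))
print(split_culture("/default.aspx"))
```

The server would then process the real page with the extracted culture, while the crawler only ever sees distinct, static-looking URLs.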
This time it worked! My remote control page is now listed on the first page of Google France and Google Spain for the "Bluetooth Remote Control" query. To make sure that there are links to each and every available language without explicitly placing them in the sitemap, I also placed "language" flag image links at the top of each page. This is not very useful for real users, since the culture will probably be selected correctly in the first place, but it helps indexing.
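Generating those flag links is mostly a matter of rebuilding the current URL once per culture. The sketch below shows the idea in Python as an illustration only; the `language_links` function, the list of cultures and the `/flags/` image path are all hypothetical.

```python
import re
from html import escape

CULTURES = ["en-US", "fr-FR", "es-ES"]  # assumed culture names
# A culture segment right before the extension, as in "default.fr-FR.aspx".
CULTURE_SUFFIX = re.compile(r"\.[a-z]{2}-[A-Z]{2}\.aspx$")


def language_links(path):
    """Return one flag link (as an HTML string) per culture for a page."""
    # Remove any culture already present so links are built from the base page.
    base = CULTURE_SUFFIX.sub(".aspx", path)
    if not base.endswith(".aspx"):
        return []  # not a page this scheme applies to
    stem = base[:-len(".aspx")]
    links = []
    for culture in CULTURES:
        href = f"{stem}.{culture}.aspx"
        links.append(
            f'<a href="{escape(href)}">'
            f'<img src="/flags/{culture}.png" alt="{culture}" /></a>'
        )
    return links


for link in language_links("/default.fr-FR.aspx"):
    print(link)
```

Because the links are plain anchors, a crawler that lands on any one version of a page can reach the others.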
I also noticed that using HTML heading tags (h1, h2 and so on) improves indexing. Search engine rank optimization is a strange and obscure science... :)
PS: Sorry for the bad Spanish, it's a raw Google Translate cut and paste. I'm in the process of having the text checked by a Spanish speaker :)