Astro has an official Sitemap integration that works well for basic websites, but it lacks a couple of features, such as setting lastmod on a per-page basis. And while the integration supports some basic i18n, it is not suited for projects where not only the content but also the slug of a page is localised. Luckily, Astro’s endpoints functionality makes it easy to generate a custom sitemap specific to the needs of a project. It may not be as simple as adding an integration, but the added flexibility and greater control over the outcome are well worth the little additional effort.
Here’s how I went about it in a recent project.
Edit: I’ve updated the code below to reflect the newest requirements of Astro’s static endpoints, namely uppercase method names and returning a Response object.
Setting the Stage
Over the last couple of months, I have been working on implementing a multi-language website for a client in Astro. As part of my finishing touches, I wanted to automatically generate a sitemap.xml file containing an accurate lastmod timestamp and, more importantly, a set of xhtml:link elements linking to the page in other languages.
As mentioned in the introduction, this wasn’t possible with Astro’s sitemap integration, so instead I set out to generate the XML myself. The following code is an adaptation of what I used in the project, which you should be able to use in order to replicate a similar endpoint adapted to the needs of your own project.
Prerequisites
To follow along, you’ll need an Astro project with the xml package installed from npm. Ideally, you should also have some methods in place that will return the URLs of your pages and the relationship between them, although that’s not a requirement if you’re only interested in how to get a per-page lastmod instead of a site-wide one.
In this example, the pages are stored in /content/pages/, and each page is represented as a single JSON file covering all languages. The script should be easy to adapt to pages stored as individual Markdown files for every language as well.
An example of an About page:
{
  "name": {
    "de": "Über uns",
    "en": "About us"
  },
  "slug": {
    "de": "ueber-uns",
    "en": "about-us"
  },
  "meta": {
    "noindex": false,
    "languages": ["de", "en"],
    "lastMod": "2023-05-27"
  },
  "content": {
    "de": "…",
    "en": "…"
  }
}
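If your pages live in one Markdown file per language instead, the per-language entries can be grouped into the same shape before the sitemap is built. The following is only a sketch under assumptions: the helper name groupByPage and the frontmatter fields id, lang, name, slug, noindex and lastMod are made up for illustration, and every entry is assumed to carry a lastMod in YYYY-MM-DD format.

```javascript
// Hypothetical helper: merges one entry per language into one object per page,
// mirroring the JSON structure shown above. All field names are assumptions.
function groupByPage(entries) {
  const pages = new Map();
  for (const entry of entries) {
    if (!pages.has(entry.id)) {
      pages.set(entry.id, {
        name: {},
        slug: {},
        meta: { noindex: false, languages: [], lastMod: entry.lastMod },
      });
    }
    const page = pages.get(entry.id);
    page.name[entry.lang] = entry.name;
    page.slug[entry.lang] = entry.slug;
    page.meta.languages.push(entry.lang);
    // assumption: a page is excluded as soon as one language version is flagged
    page.meta.noindex = page.meta.noindex || Boolean(entry.noindex);
    // keep the most recent date (YYYY-MM-DD strings sort chronologically)
    if (entry.lastMod > page.meta.lastMod) page.meta.lastMod = entry.lastMod;
  }
  return [...pages.values()];
}

// Sample data standing in for the frontmatter of two Markdown files
const groupedPages = groupByPage([
  { id: 'about', lang: 'de', name: 'Über uns', slug: 'ueber-uns', lastMod: '2023-05-20' },
  { id: 'about', lang: 'en', name: 'About us', slug: 'about-us', lastMod: '2023-05-27' },
]);
```

The result has the same name, slug and meta fields as the JSON file, so the rest of the script would not need to know where the data came from.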
How it Works
Astro lets you create routes, called endpoints, that generate and return any kind of data, be it images, PDFs, or XML files. All you have to do is create a file named something.xml.js in your project’s pages directory (src/pages/ by default) that exports a GET() function. This GET function is called during the build process and is expected to return a Response object, which should contain the data in its body.
In this example, we’ll create a sitemap.xml.js file in the pages directory. The body of the Response object returned by its GET() function will contain our generated XML sitemap, which is then served as /sitemap.xml.
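Stripped of all sitemap logic, such an endpoint boils down to very little. Here is a minimal sketch, not taken from the project: the helper xmlResponse and the XML body are made up for illustration, and it assumes a runtime with a global Response (Node 18 or newer).

```javascript
// Hypothetical helper: wraps an XML string in a Response with an XML Content-Type.
function xmlResponse(xmlString) {
  return new Response(xmlString, {
    headers: { 'Content-Type': 'application/xml' },
  });
}

// Astro calls the GET function of an endpoint file and serves the returned Response.
// In a real endpoint file this function carries the `export` keyword;
// it is left out here so the snippet also runs as a plain script.
async function GET() {
  return xmlResponse('<?xml version="1.0" encoding="UTF-8"?><hello>world</hello>');
}
```

Everything else in this post is about building the string that ends up in that body.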
Putting Theory to Practice
In order to generate a proper XML-sitemap, we need to do the following:
- Get all the pages
- Prepare an array of routes containing the information of our sitemap items, including the lastmod date and alternate versions of that page in other languages
- Transform the routes array into a format the xml package can properly transform into XML
- Add any other elements necessary for an XML sitemap
- Transform the JS objects into valid XML
- Create a Response object with the XML and the correct Content-Type and return it in GET()
Here’s the code that sets the right properties and generates the sitemap, annotated with comments to explain what’s going on:
import xml from 'xml';

export async function GET(context) {
  // grab all the pages from wherever they're stored
  // and filter out the ones that shouldn't be indexed
  const pages = Object.values(
    import.meta.glob('/content/pages/**/*.json', { eager: true, import: 'default' }),
  ).filter((page) => !page?.meta?.noindex);

  // define a default language for unprefixed URLs
  const defaultLang = 'en';

  // prepare a space to store all the routes
  const routes = [];

  // iterate over all pages and add their URLs, language, alternate versions and last modification date to the routes
  pages.forEach((page) => {
    // create a place to store versions of this page in different languages
    // (the object is shared by all routes of this page, so it is complete once the inner loop has finished)
    const alternateVersions = {};

    // iterate over all languages this page is available in
    // and get its url for that language
    page.meta.languages.forEach((lang) => {
      let url;
      if (lang === defaultLang) url = `/${page.slug[lang]}/`;
      else url = `/${lang}/${page.slug[lang]}/`;

      alternateVersions[lang] = url;

      routes.push({
        alternateVersions,
        url,
        lang,
        // fall back to the build date if the page has no lastMod
        lastMod: page.meta.lastMod ? new Date(page.meta.lastMod) : new Date(),
      });
    });
  });

  // generate the items of the sitemap from the routes
  // (context.site is the `site` value from your Astro config)
  const sitemapItems = routes.reduce((acc, route) => {
    const url = [
      { loc: `${context.site}/${route.url}`.replace(/(?<!:)\/{2,}/g, '/') }, // replace double slashes except after the protocol, i.e. https://
      { lastmod: route.lastMod.toISOString().split('T')[0] },
    ];

    // only link alternate versions if the page exists in more than one language
    if (Object.values(route.alternateVersions).length > 1) {
      // Learn more about the _attr-property here: https://www.npmjs.com/package/xml
      Object.entries(route.alternateVersions).forEach(([lang, localUrl]) => {
        url.push({
          'xhtml:link': {
            _attr: {
              rel: 'alternate',
              hreflang: lang,
              href: `${context.site}/${localUrl}`.replace(/(?<!:)\/{2,}/g, '/'),
            },
          },
        });
      });
    }

    acc.push({
      url,
    });

    return acc;
  }, []);

  // prepare the sitemap as a JS-object that can be converted to XML
  const sitemapObject = {
    urlset: [
      {
        _attr: {
          xmlns: 'http://www.sitemaps.org/schemas/sitemap/0.9',
          'xmlns:news': 'http://www.google.com/schemas/sitemap-news/0.9',
          'xmlns:xhtml': 'http://www.w3.org/1999/xhtml',
          'xmlns:image': 'http://www.google.com/schemas/sitemap-image/1.1',
          'xmlns:video': 'http://www.google.com/schemas/sitemap-video/1.1',
        },
      },
      ...sitemapItems,
    ],
  };

  // return a valid XML-string with our converted sitemapObject
  // the stylesheet is optional
  return new Response(
    `<?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet type="text/xsl" href="/sitemap.xsl"?>${xml(sitemapObject)}`,
    { headers: { 'Content-Type': 'application/xml' } },
  );
}
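As an alternative to the regular expression used for loc and href above, the standard URL constructor can resolve a path against the site URL. This is a sketch rather than what the project uses; the helper name absoluteUrl and the example.com address are made up for illustration.

```javascript
// Hypothetical helper, not part of the endpoint above:
// resolves a root-relative path against the site URL.
// `site` may be a string or a URL object.
function absoluteUrl(path, site) {
  return new URL(path, site).href;
}

const aboutUrl = absoluteUrl('/about-us/', 'https://example.com');
const ueberUnsUrl = absoluteUrl('/de/ueber-uns/', new URL('https://example.com/'));
```

One difference worth knowing: a path starting with a slash is resolved from the root of the host, so a site URL that includes a sub-path would lose that sub-path. In that case the regular expression from the original code is the safer choice.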
Notice the <?xml-stylesheet type="text/xsl" href="/sitemap.xsl"?> in the returned XML string? That is completely optional, but since browsers render XML documents that include the xhtml namespace as HTML, you might want to include one if you’d like to inspect the resulting sitemap in the browser. You can find a good starting point for an XML stylesheet here. Make sure to place that stylesheet in the /public/ folder of your project if you’d like to include it.
A Custom Sitemap Fresh from the Oven
And there you have it: simply visit /sitemap.xml in your browser while the dev server is running, or after you’ve built your project, and you should see your brand new sitemap in action and ready for submission to various search consoles.
Of course, this is a very basic example of what you can do with Astro endpoints, but I believe it is a useful one nonetheless, especially since it gives you full control over how your sitemap is generated. As always, feel free to let me know what you think about this approach to generating sitemaps in Astro on Mastodon, and if you have any questions, please don’t hesitate to ask them. I’ll be back with another post next month!