Sitemap and robots.txt

Blogatto can generate a sitemap XML file and a robots.txt file for search engine optimization.

Sitemap

The sitemap includes all static routes and blog post URLs.

Basic setup

import blogatto/config
import blogatto/config/sitemap

let sitemap_config =
  sitemap.new("/sitemap.xml")

let cfg =
  config.new("https://example.com")
  |> config.sitemap(sitemap_config)

This generates dist/sitemap.xml with entries for every static route and blog post.

SitemapConfig fields

| Field | Type | Description |
| --- | --- | --- |
| path | String | Output path relative to output_dir |
| filter | Option(fn(String) -> Bool) | Include/exclude routes by URL |
| serialize | Option(fn(String) -> SitemapEntry) | Custom entry serialization |

Filtering routes

Exclude specific routes from the sitemap by passing a predicate that returns True for each URL to keep:

import blogatto/config/sitemap
import gleam/string

let sitemap_config =
  sitemap.new("/sitemap.xml")
  |> sitemap.filter(fn(url) {
    // Exclude draft pages
    !string.contains(url, "/draft")
  })
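
When several sections need to be excluded, the predicate can be pulled out into a named function driven by a list. The following is a minimal sketch: the fragments in `excluded_fragments` and the function name `include_in_sitemap` are hypothetical, and it assumes the filter receives the URL as a string, as in the example above.

```gleam
import blogatto/config/sitemap
import gleam/list
import gleam/string

// Hypothetical URL fragments to keep out of the sitemap.
const excluded_fragments = ["/draft", "/admin/", "/preview/"]

// Returns True when the URL contains none of the excluded fragments,
// so it follows the same "True means keep" convention as the filter above.
pub fn include_in_sitemap(url: String) -> Bool {
  !list.any(excluded_fragments, fn(fragment) { string.contains(url, fragment) })
}

pub fn sitemap_config() {
  sitemap.new("/sitemap.xml")
  |> sitemap.filter(include_in_sitemap)
}
```

Keeping the predicate as a standalone function also makes it easy to unit-test without building the site.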

Custom serialization

Control the priority, change frequency, and last-modified date for each entry:

import blogatto/config/sitemap.{Monthly, Weekly}
import gleam/option.{None, Some}
import gleam/string

let sitemap_config =
  sitemap.new("/sitemap.xml")
  |> sitemap.serialize(fn(url) {
    let #(priority, freq) = case string.contains(url, "/blog/") {
      True -> #(0.7, Some(Weekly))
      False -> #(1.0, Some(Monthly))
    }
    sitemap.SitemapEntry(
      url: url,
      priority: Some(priority),
      last_modified: None,
      change_frequency: freq,
    )
  })

SitemapEntry fields

| Field | Type | Description |
| --- | --- | --- |
| url | String | The full URL for this entry |
| priority | Option(Float) | Priority hint (0.0 to 1.0) |
| last_modified | Option(Timestamp) | Last modification date |
| change_frequency | Option(ChangeFrequency) | How often the page changes |

ChangeFrequency values

| Value | Description |
| --- | --- |
| Always | Changes on every access |
| Hourly | Changes approximately every hour |
| Daily | Changes approximately every day |
| Weekly | Changes approximately every week |
| Monthly | Changes approximately every month |
| Yearly | Changes approximately every year |
| Never | Archived, will not change |

Robots.txt

The robots.txt file tells search engine crawlers which paths they may and may not crawl.

Basic setup

import blogatto/config
import blogatto/config/robots

let robots_config =
  robots.new("https://example.com/sitemap.xml")
  |> robots.robot(robots.Robot(
    user_agent: "*",
    allowed_routes: ["/"],
    disallowed_routes: [],
  ))

let cfg =
  config.new("https://example.com")
  |> config.robots(robots_config)

This generates dist/robots.txt:

Sitemap: https://example.com/sitemap.xml

User-agent: *
Allow: /

Multiple user agents

Add different policies for different crawlers:

import blogatto/config/robots

let robots_config =
  robots.new("https://example.com/sitemap.xml")
  |> robots.robot(robots.Robot(
    user_agent: "*",
    allowed_routes: ["/"],
    disallowed_routes: ["/admin/"],
  ))
  |> robots.robot(robots.Robot(
    user_agent: "Googlebot",
    allowed_routes: ["/"],
    disallowed_routes: [],
  ))

RobotsConfig fields

| Field | Type | Description |
| --- | --- | --- |
| sitemap_url | String | Full URL to the sitemap |
| robots | List(Robot) | Crawl policies per user agent |

Robot fields

| Field | Type | Description |
| --- | --- | --- |
| user_agent | String | Crawler name ("*" for all) |
| allowed_routes | List(String) | Paths the crawler may access |
| disallowed_routes | List(String) | Paths the crawler must not access |
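
To make the mapping from these fields to robots.txt directives concrete, here is an illustrative sketch. It is not Blogatto's implementation: the `Robot` type below is a local stand-in that mirrors the documented fields, `render_robot` is a hypothetical helper, and the order in which Blogatto actually emits Allow and Disallow lines may differ.

```gleam
import gleam/list
import gleam/string

// Local stand-in mirroring the fields documented above; the real type
// lives in blogatto/config/robots.
pub type Robot {
  Robot(
    user_agent: String,
    allowed_routes: List(String),
    disallowed_routes: List(String),
  )
}

// Renders one policy group: a User-agent line followed by one Allow
// line per allowed route and one Disallow line per disallowed route.
pub fn render_robot(robot: Robot) -> String {
  let allows = list.map(robot.allowed_routes, fn(route) { "Allow: " <> route })
  let disallows =
    list.map(robot.disallowed_routes, fn(route) { "Disallow: " <> route })

  ["User-agent: " <> robot.user_agent]
  |> list.append(allows)
  |> list.append(disallows)
  |> string.join("\n")
}
```

Under this sketch, a robot with user agent "*", allowed route "/" and disallowed route "/admin/" renders as a three-line group: the User-agent line, then `Allow: /`, then `Disallow: /admin/`.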

Combining sitemap and robots.txt

A typical SEO setup uses both together, with the robots.txt pointing to the sitemap:

import blogatto/config
import blogatto/config/robots
import blogatto/config/sitemap

let site_url = "https://example.com"

let sitemap_config =
  sitemap.new("/sitemap.xml")

let robots_config =
  robots.new(site_url <> "/sitemap.xml")
  |> robots.robot(robots.Robot(
    user_agent: "*",
    allowed_routes: ["/"],
    disallowed_routes: [],
  ))

let cfg =
  config.new(site_url)
  |> config.sitemap(sitemap_config)
  |> config.robots(robots_config)
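
When some paths are disallowed for crawlers, it usually makes sense to keep them out of the sitemap as well. One way to keep the two in sync is to drive both from a single list. This is a sketch under assumptions: `private_routes`, `is_public`, and `build_config` are hypothetical names, and the filter is assumed to receive each URL as a string, as in the filtering example.

```gleam
import blogatto/config
import blogatto/config/robots
import blogatto/config/sitemap
import gleam/list
import gleam/string

// Hypothetical paths that crawlers should skip.
const private_routes = ["/admin/", "/draft/"]

// True when the URL contains none of the private paths.
fn is_public(url: String) -> Bool {
  !list.any(private_routes, fn(route) { string.contains(url, route) })
}

pub fn build_config() {
  let site_url = "https://example.com"

  // Private pages are dropped from the sitemap...
  let sitemap_config =
    sitemap.new("/sitemap.xml")
    |> sitemap.filter(is_public)

  // ...and the same list is disallowed in robots.txt.
  let robots_config =
    robots.new(site_url <> "/sitemap.xml")
    |> robots.robot(robots.Robot(
      user_agent: "*",
      allowed_routes: ["/"],
      disallowed_routes: private_routes,
    ))

  config.new(site_url)
  |> config.sitemap(sitemap_config)
  |> config.robots(robots_config)
}
```

With a single source of truth, adding a path to `private_routes` updates both generated files on the next build.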