Sitemap and robots.txt
Blogatto can generate a sitemap XML file and a robots.txt file for search engine optimization.
Sitemap
The sitemap includes all static routes and blog post URLs.
Basic setup
import blogatto/config
import blogatto/config/sitemap
import gleam/option.{None}
let sitemap_config =
  sitemap.new("/sitemap.xml")
let cfg =
  config.new("https://example.com")
  |> config.sitemap(sitemap_config)
This generates dist/sitemap.xml with entries for every static route and blog post.
SitemapConfig fields
| Field | Type | Description |
|---|---|---|
| path | String | Output path relative to output_dir |
| filter | Option(fn(String) -> Bool) | Include/exclude routes by URL |
| serialize | Option(fn(String) -> SitemapEntry) | Custom entry serialization |
Filtering routes
Exclude specific routes from the sitemap by returning False from the filter function; routes for which it returns True are kept:
import gleam/string
let sitemap_config =
  sitemap.new("/sitemap.xml")
  |> sitemap.filter(fn(url) {
    // Exclude draft pages
    !string.contains(url, "/draft")
  })
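When several kinds of route need excluding, the predicate can live in a named function instead of an inline closure. The following is a minimal sketch using only the Gleam standard library; `is_public` and the excluded fragments are hypothetical names and values chosen for illustration, not part of Blogatto.

```gleam
import gleam/list
import gleam/string

/// Hypothetical helper: returns True when the URL should appear in the sitemap.
pub fn is_public(url: String) -> Bool {
  // Illustrative path fragments to keep out of the sitemap.
  let excluded = ["/draft", "/admin/", "/preview/"]
  // Keep the URL only if it contains none of the excluded fragments.
  !list.any(excluded, fn(fragment) { string.contains(url, fragment) })
}
```

Because it has the type fn(String) -> Bool, it can be passed directly as `sitemap.filter(is_public)`.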
Custom serialization
Control the priority, change frequency, and last-modified date for each entry:
import blogatto/config/sitemap.{Monthly, Weekly}
import gleam/option.{None, Some}
import gleam/string
let sitemap_config =
  sitemap.new("/sitemap.xml")
  |> sitemap.serialize(fn(url) {
    let #(priority, freq) = case string.contains(url, "/blog/") {
      True -> #(0.7, Some(Weekly))
      False -> #(1.0, Some(Monthly))
    }
    sitemap.SitemapEntry(
      url: url,
      priority: Some(priority),
      last_modified: None,
      change_frequency: freq,
    )
  })
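When more than two kinds of route need distinct priorities, the case expression in the serializer can move into a named helper. This is a rough sketch under stated assumptions: `priority_for` is a hypothetical function, the /tags/ section is invented for illustration, and the priority values are arbitrary.

```gleam
import gleam/string

/// Hypothetical helper: priority hint for a URL, between 0.0 and 1.0.
pub fn priority_for(url: String) -> Float {
  case string.contains(url, "/blog/"), string.contains(url, "/tags/") {
    // Blog posts rank below top-level pages.
    True, _ -> 0.7
    // Tag listings (an assumed section) rank lowest.
    _, True -> 0.3
    // Everything else is treated as a top-level page.
    _, _ -> 1.0
  }
}
```

Inside the serializer, the entry would then be built with priority: Some(priority_for(url)).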
SitemapEntry fields
| Field | Type | Description |
|---|---|---|
| url | String | The full URL for this entry |
| priority | Option(Float) | Priority hint (0.0 to 1.0) |
| last_modified | Option(Timestamp) | Last modification date |
| change_frequency | Option(ChangeFrequency) | How often the page changes |
ChangeFrequency values
| Value | Description |
|---|---|
| Always | Changes every access |
| Hourly | Changes approximately every hour |
| Daily | Changes approximately every day |
| Weekly | Changes approximately every week |
| Monthly | Changes approximately every month |
| Yearly | Changes approximately every year |
| Never | Archived, will not change |
Robots.txt
The robots.txt file tells search engine crawlers which paths they may and may not crawl.
Basic setup
import blogatto/config
import blogatto/config/robots
let robots_config =
  robots.new("https://example.com/sitemap.xml")
  |> robots.robot(robots.Robot(
    user_agent: "*",
    allowed_routes: ["/"],
    disallowed_routes: [],
  ))
let cfg =
  config.new("https://example.com")
  |> config.robots(robots_config)
This generates dist/robots.txt:
Sitemap: https://example.com/sitemap.xml
User-agent: *
Allow: /
Multiple user agents
Add different policies for different crawlers:
let robots_config =
  robots.new("https://example.com/sitemap.xml")
  |> robots.robot(robots.Robot(
    user_agent: "*",
    allowed_routes: ["/"],
    disallowed_routes: ["/admin/"],
  ))
  |> robots.robot(robots.Robot(
    user_agent: "Googlebot",
    allowed_routes: ["/"],
    disallowed_routes: [],
  ))
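To make the mapping from policy fields to robots.txt directives concrete, here is an illustrative sketch of how one policy could be rendered as a User-agent group. It is not Blogatto's implementation: `Rule` is a local stand-in type, `render_rule` is a hypothetical function, and the exact ordering and spacing that Blogatto emits may differ.

```gleam
import gleam/list
import gleam/string

/// Local stand-in type for illustration; not imported from Blogatto.
pub type Rule {
  Rule(user_agent: String, allowed: List(String), disallowed: List(String))
}

/// Render one rule as a robots.txt group: a User-agent line followed by
/// one Allow line per allowed path and one Disallow line per disallowed path.
pub fn render_rule(rule: Rule) -> String {
  let allow = list.map(rule.allowed, fn(path) { "Allow: " <> path })
  let disallow = list.map(rule.disallowed, fn(path) { "Disallow: " <> path })
  ["User-agent: " <> rule.user_agent]
  |> list.append(allow)
  |> list.append(disallow)
  |> string.join("\n")
}
```

For example, render_rule(Rule("*", ["/"], ["/admin/"])) yields three lines: User-agent: *, Allow: /, and Disallow: /admin/.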
RobotsConfig fields
| Field | Type | Description |
|---|---|---|
| sitemap_url | String | Full URL to the sitemap |
| robots | List(Robot) | Crawl policies per user agent |
Robot fields
| Field | Type | Description |
|---|---|---|
| user_agent | String | Crawler name ("*" for all) |
| allowed_routes | List(String) | Paths the crawler may access |
| disallowed_routes | List(String) | Paths the crawler must not access |
Combining sitemap and robots.txt
A typical SEO setup uses both together, with the robots.txt pointing to the sitemap:
import blogatto/config
import blogatto/config/robots
import blogatto/config/sitemap
import gleam/option.{None}
let site_url = "https://example.com"
let sitemap_config =
  sitemap.new("/sitemap.xml")
let robots_config =
  robots.new(site_url <> "/sitemap.xml")
  |> robots.robot(robots.Robot(
    user_agent: "*",
    allowed_routes: ["/"],
    disallowed_routes: [],
  ))
let cfg =
  config.new(site_url)
  |> config.sitemap(sitemap_config)
  |> config.robots(robots_config)
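Since the sitemap path appears twice in this setup, one option is to keep it in a single constant so the path and the URL advertised in robots.txt cannot drift apart. A minimal sketch, assuming hypothetical names `sitemap_path` and `sitemap_url`; it also tolerates a trailing slash on the site URL.

```gleam
import gleam/string

/// Single source of truth for the sitemap location (hypothetical name).
pub const sitemap_path = "/sitemap.xml"

/// Build the absolute sitemap URL from the site URL.
pub fn sitemap_url(site_url: String) -> String {
  // Drop one trailing slash so the result never contains "//sitemap.xml".
  let base = case string.ends_with(site_url, "/") {
    True -> string.slice(site_url, 0, string.length(site_url) - 1)
    False -> site_url
  }
  base <> sitemap_path
}
```

With this in place, sitemap_path would be passed to sitemap.new and sitemap_url(site_url) to robots.new.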