Now that I have a blog, I have been waiting eagerly for the google crawler. Some call me impatient. So I thought it would be good to have a secondary sitemap for google to speed things up a bit. I wrote a small PHP script for this and thought maybe other people might need it too. Simply create the file as sitemap_blog.php in the root directory of the website, change the parameters as needed, and then add it as an additional sitemap to Google Search Console.
Disclaimer 1: Attention, this is my own development and has nothing to do with Elixir. There is no support if you have problems to use it. You should have basic PHP knowledge and take care to change the variables in the beginning of the file to your situation.
Disclaimer 2: I’m actually not a PHP developer, hope the code is still reasonably clean.
<?php
header('Content-type: application/xml');
// Output format as defined here: https://www.google.com/sitemaps/protocol.html
// Create this file into your root folder, e.g. https://www.example.com/sitemap_blog.php
// It will not work if you put it somewhere else!
// Change as needed:
// If your blog is located in example.com/blog this is what you need
// if it is example.com/myblog change it to "require_once "./myblog/files/spyc.php";"
// Don't remove the dot in the beginning of the path
require_once "./blog/files/spyc.php";
// Again, change it to the real url you use for your blog:
$url_prefix = 'https://www.example.de/blog/';
// This is the "Posts folder" as defined it in alloy, change it to the name you used there!
$blog_files = '/blog_files';
// Change frequency of a single blog entry (usually rarely changed after publishing)
// Don't change it, if you don't know what this is about
$change_frequency = 'monthly';
// That's it, this part should need no changes by you
// Read all files
$files = array_diff(scandir(__DIR__ . $blog_files), array('.', '..'));
// Scan for all non draft or future posts
foreach($files as $file) {
$fileExt = pathinfo($file, PATHINFO_EXTENSION);
if ($fileExt = "md") {
// Get date of post
$splitFilename = (explode("_",$file));
$originalDate = $splitFilename[0];
$todaysDate = date("Y-m-d");
if (strtotime($originalDate) <= strtotime($todaysDate)) {
// Blog entry has no future date
// Read content now
$fileContents = file_get_contents($file);
$fileTime = filemtime(__DIR__ . $blog_files . '/' . $file);
$parts = preg_split('/[\n]*[-]{3}[\n]/', $fileContents, 3);
$postID = (explode(".md",$splitFilename[1]));
// Parse YAML part
$frontMatter = spyc_load_file($parts[1]);
if ($frontMatter['draft'] != true) {
// No draft, no future post, ready for output
$sitemap_posts[$i]['url'] = $url_prefix . '?id=' .$postID[0];
$sitemap_posts[$i]['last_modified'] = date ("Y-m-d", $fileTime);
$i++;
}
}
}
};
$output = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
$output .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
echo $output;
foreach($sitemap_posts as $sitemap_out) {
echo '<url>' . "\n";
echo '<loc>' . $sitemap_out['url'] . '</loc>' . "\n";
echo '<lastmod>' . $sitemap_out['last_modified'] . '</lastmod>' . "\n";
echo '<changefreq>' . $change_frequency . '</changefreq>' . "\n";
echo '</url>' . "\n";
}
echo '</urlset>';
?>