Simple Static Site Generator

Last update: 2024-08-18

A Script to Generate .html Files from .md Files

I asked ChatGPT a few questions and, after modifying and testing the code it gave me, I now have a working script that generates .html files from .md files (with math typesetting, bibliographic references, and an RSS feed).¹

#!/bin/bash

# Source and destination directories
SOURCE_DIR="/path/to/source/directory"
INPUT_DIR="/path/to/source/directory/input"
OUTPUT_DIR="/path/to/source/directory/output"

# Number of latest files to copy (you can change this number)
NUM_FILES=11

# Check that the source, input and output directories exist
if [ ! -d "$SOURCE_DIR" ]; then
    echo "Source directory does not exist: $SOURCE_DIR"
    exit 1
fi

if [ ! -d "$INPUT_DIR" ]; then
    echo "Input directory does not exist: $INPUT_DIR"
    exit 1
fi

if [ ! -d "$OUTPUT_DIR" ]; then
    echo "Output directory does not exist: $OUTPUT_DIR"
    exit 1
fi

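# Find and copy the latest markdown files.
# The input and output directories sit inside SOURCE_DIR, so they are pruned
# to keep earlier copies from being picked up again.
# Note: "stat -f" is the BSD/macOS syntax; with GNU stat, use: stat -c "%Y %n"
find "$SOURCE_DIR" \( -path "$INPUT_DIR" -o -path "$OUTPUT_DIR" \) -prune -o \
    -type f -name "*.md" -exec stat -f "%m %N" {} + | \
    sort -n | \
    tail -n "$NUM_FILES" | \
    cut -d' ' -f2- | \
    while IFS= read -r file; do
        # -p preserves the modification time, so unchanged files are not rebuilt later
        cp -p "$file" "$INPUT_DIR"
    done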

echo "Done! Copied the latest $NUM_FILES markdown files."

# Generate HTML files
# Loop through the Markdown files in INPUT_DIR and write the HTML to OUTPUT_DIR.
# refs.bib and apa.csl are given as ./ paths, so run the script from the
# directory that contains them.
for file in "$INPUT_DIR"/*.md; do
    # Skip the unexpanded pattern when no Markdown files are present
    [ -e "$file" ] || continue
    html_file="${OUTPUT_DIR}/$(basename "${file%.md}").html"
    # Convert only if the Markdown file is newer than its HTML file (or the HTML file is missing)
    if [ "$file" -nt "$html_file" ]; then
        pandoc --standalone --mathjax --template template.html --bibliography ./refs.bib --citeproc --csl ./apa.csl "$file" -o "$html_file"
    fi
done

echo "HTML files generated successfully in $OUTPUT_DIR"

# generate RSS feed
# Define the output file for the RSS feed
OUTPUT_FILE="/path/to/source/directory/output/feed.rss"

# Set the variables
TITLE="title"
DESCRIPTION="description"
LINK="https://your_url"
LANGUAGE="en-ca"
DATE=$(date +"%a, %d %b %Y %T %z") # Current date and time in RFC-822 format
NOTE="Thank you for keeping RSS alive."
GENERATOR="Custom RSS Generator"

# Find the latest NUM_FILES HTML files in the output directory
HTML_FILES=$(ls -1t "$OUTPUT_DIR"/*.html 2>/dev/null | head -n "$NUM_FILES")

# Start writing the RSS feed  
echo "<?xml version=\"1.0\" encoding=\"UTF-8\" ?>
<rss version=\"2.0\">
<channel>
<title>$TITLE</title>
<description>$DESCRIPTION</description>
<link>$LINK</link>
<language>$LANGUAGE</language>
<lastBuildDate>$DATE</lastBuildDate>
<note>$NOTE</note>
<generator>$GENERATOR</generator>" > "$OUTPUT_FILE"

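# Loop through each HTML file and extract the required fields
# (read line by line so file names containing spaces are handled)
while IFS= read -r FILE; do
    [ -n "$FILE" ] || continue
    MOD_TIME=$(date -R -r "$FILE")
    # Item-specific names keep the channel-level TITLE and DESCRIPTION intact
    ITEM_TITLE=$(grep -o '<title>[^<]*' "$FILE" | head -n 1 | sed 's/<title>//')
    ITEM_DESCRIPTION=$(grep -o '<meta name="description" content="[^"]*' "$FILE" | head -n 1 | sed 's/<meta name="description" content="//')
    ITEM_DESCRIPTION=${ITEM_DESCRIPTION:-"description"}
    # LINK has no trailing slash, so add one before the file name
    URL="$LINK/$(basename "$FILE")"
    guid="$URL"
    # Write item to RSS feed
    echo "<item>
    <pubDate>$MOD_TIME</pubDate>
    <title>$ITEM_TITLE</title>
    <link>$URL</link>
    <description>$ITEM_DESCRIPTION</description>
    <guid>$guid</guid>
    </item>" >> "$OUTPUT_FILE"
done <<< "$HTML_FILES"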

# Close the RSS feed
echo "</channel>
</rss>" >> "$OUTPUT_FILE"

echo "Generated RSS feed $OUTPUT_FILE successfully."
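
One thing the script does not guard against is a channel title or description that contains characters such as & or <, which would make the feed invalid XML. A minimal sketch of an escaping helper follows. The function name xml_escape and the variables RAW_TITLE and SAFE_TITLE are hypothetical and not part of the script above. Values scraped from the generated HTML are normally already entity-escaped, so this would mainly apply to the hand-written TITLE and DESCRIPTION variables.

```shell
#!/bin/bash

# Hypothetical helper (not part of the original script): escape the
# characters that are special in XML text content.
xml_escape() {
    # Replace & first so the entities produced afterwards are not escaped twice.
    printf '%s' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Example with an assumed title containing an ampersand and angle brackets.
RAW_TITLE='Notes on R&D <draft>'
SAFE_TITLE=$(xml_escape "$RAW_TITLE")
echo "<title>$SAFE_TITLE</title>"
```

In the script, the same call could wrap the channel values before they are written, for example TITLE=$(xml_escape "$TITLE").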

Footnotes

¹ Based on code from ChatGPT.