March 2024
In this article I want to present a tiny utility, mdbooker.
It allows me to convert my project’s README.md into a beautiful documentation site, makesure.dev.
The utility works in conjunction with the amazing mdBook tool.
mdBook generates a documentation site (a “book”) from a set of markdown files based on a SUMMARY.md.
Therefore, mdbooker splits your README.md into a set of markdown files (based on header titles) and generates the SUMMARY.md.
REPO=username/reponame \
BOOK=path/to/book_folder \
awk -f mdbooker.awk README.md
where
username/reponame - the GitHub repository where the README resides. This is needed to correctly rewrite relative links.
path/to/book_folder (optional, default ./book) - the output folder for the generated markdown files.
Then build the book:
mdbook build
The deployment of final html/js/css (as generated by the previous step) to the public web is out of scope of this article.
The project is implemented as a single-file AWK script (Wait… What?!).
This part can be of interest to those who would like to sharpen their AWK-fu a bit.
Before explaining some bits of my implementation, let’s come up with the requirements.
- ## titles in README.md.
- ## Section Title by a href="#section-title". This requires special handling.
- ## section is immediately followed by ### sub-section (So we want this instead of this).
Now let’s take a look at how these requirements are implemented in code.
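To make the anchor requirement more concrete, here is a minimal sketch of how a header title might be turned into a GitHub-style #anchor. The function name github_anchor is hypothetical, and the rules (lowercase, drop punctuation, spaces become hyphens) are only an approximation of what GitHub does, limited to ASCII titles. It is not taken from mdbooker itself.

```shell
# Hypothetical helper: approximate the anchor GitHub generates for a header title.
# Assumption: ASCII titles only; non-ASCII letters would be dropped by this sketch.
github_anchor() {
  awk '{
    s = tolower($0)               # anchors are lowercase
    gsub(/[^a-z0-9 _-]/, "", s)   # drop punctuation such as @ or .
    gsub(/ /, "-", s)             # each space becomes a hyphen
    print "#" s
  }'
}
```

For example, `echo 'Section Title' | github_anchor` prints `#section-title`. Once the README is split into pages, such an anchor no longer points anywhere, so it has to be mapped to the page generated for that section.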
This implementation requires parsing (traversing) the input file README.md two times.
END { ... } block. Here it explicitly traverses our file again (FILENAME is a variable equal to the file passed to the awk command, README.md in our case). This pass:
[title](link) and image links ![title](link) require different handling.
A couple more AWK tricks.
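The second pass itself is a small trick worth showing. Below is a minimal sketch, assuming a hypothetical wrapper function two_pass: the first pass is ordinary pattern-action processing that counts headers, and the END block re-opens the same file through FILENAME and reads it again with getline. What each pass actually does in mdbooker is different; only the two-pass mechanics are illustrated here.

```shell
# Hypothetical demo of reading the input file twice in one awk run.
two_pass() {
  awk '
    # Pass 1: regular processing, count all header lines.
    /^#+ / { headers++ }
    END {
      # Pass 2: FILENAME still holds the input file name, so read it again.
      while ((getline line < FILENAME) > 0) {
        if (line ~ /^#+ /)
          printf "header %d of %d: %s\n", ++n, headers, line
      }
      close(FILENAME)
    }
  ' "$1"
}
```

The total number of headers is only known after the first pass, which is why the second pass can print it next to every header.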
Let me remind you how in markdown the hierarchy of headers is defined:
md | html |
---|---|
# Title | <h1>Title</h1> |
## Title | <h2>Title</h2> |
### Title | <h3>Title</h3> |
etc. | |
So to parse nesting you need to count the number of # characters. How do you do it with AWK?
It turns out that AWK’s match() function can match a regex in a string, and it sets RSTART and RLENGTH for you:
$ awk 'BEGIN { match("#####",/^#+/); print "RSTART="RSTART", RLENGTH="RLENGTH }'
RSTART=1, RLENGTH=5
This is how the script determines the nesting level of a header.
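Applied to whole lines, the same idea might look like the sketch below. The function name header_level is hypothetical, and returning 0 for non-header lines is my choice for the example, not necessarily what the script does.

```shell
# Hypothetical helper: print the header level of each input line (0 if not a header).
header_level() {
  awk '{
    # match() returns 0 when nothing matches; otherwise RLENGTH is the number of # characters.
    print (match($0, /^#+/) ? RLENGTH : 0)
  }'
}
```

So `echo '### Sub-section' | header_level` prints 3, and a plain text line gives 0.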
With awk you can easily produce the needed indentation using printf.
This usage is common:
$ awk 'BEGIN { printf "%10s\n", "hello" }'
     hello
This is not that obvious:
$ awk 'BEGIN { N=10; printf "%" N "s\n", "hello" }'
     hello
But this gives you:
$ awk 'BEGIN { for (N=5;N<10;N++) printf "%" N "s\n", "hello" }'
hello
 hello
  hello
   hello
    hello
This explains the indentation logic in the script: the width passed to printf follows the header nesting level, which is what produces the nested structure.
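Putting match() and the dynamic printf width together, a nested list could be produced roughly like this. This is a minimal sketch under my own assumptions: the function name summary_entries is hypothetical, entries are plain titles rather than links, and each nesting level adds two spaces. The real SUMMARY.md generated by mdbooker may differ in these details.

```shell
# Hypothetical sketch: turn ## / ### headers into an indented list.
summary_entries() {
  awk '
    match($0, /^##+ /) {
      level  = RLENGTH - 1              # number of # characters (2, 3, ...)
      title  = substr($0, RLENGTH + 1)  # text after "## "
      indent = (level - 2) * 2          # assumption: two spaces per nesting level
      fmt = "%" (indent + 1) "s %s\n"   # right-align "-" in a field of indent+1 chars
      printf fmt, "-", title
    }
  '
}
```

Fed with the lines `## Installation`, `### Update`, `## Usage`, it prints `- Installation`, then `  - Update` indented by two spaces, then `- Usage`. The top-level `# Title` is skipped because the regex requires at least two # characters.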
I hope this information is enough to explain the logic of my script.
Here is an example of the tool’s output:
will use BOOK=book
fix image link: coverage.svg -> https://github.com/xonixx/makesure/raw/main/coverage.svg
generating: makesure.md...
fix #link: #installation -> Installation.md
fix #link: #os -> Prerequisites-OS.md
fix #link: #reached_if -> Directives-@reached_if.md
fix relative link: Makesurefile -> https://github.com/xonixx/makesure/blob/main/Makesurefile
generating: Features.md...
generating: Usage.md...
generating: Installation.md...
generating: Installation-Update.md...
generating: Prerequisites.md...
generating: Prerequisites-OS.md...
fix #link: #directives -> Directives.md
fix #link: #goal -> Directives-@goal.md
fix #link: #depends_on -> Directives-@depends_on.md
fix #link: #reached_if -> Directives-@reached_if.md
generating: Concepts.md...
generating: Directives.md...
generating: Directives-@options.md...
fix relative link: tests/24_define_everywhere.sh -> https://github.com/xonixx/makesure/blob/main/tests/24_define_everywhere.sh
generating: Directives-@define.md...
generating: Directives-@shell.md...
generating: Directives-@goal.md...
generating: Directives-@goal-Simple_goal.md...
fix #link: #naming-rules -> Directives-@goal-Naming_rules.md
generating: Directives-@goal-Glob_goal.md...
fix relative link: docs/parameterized_goals.md -> https://github.com/xonixx/makesure/blob/main/docs/parameterized_goals.md
generating: Directives-@goal-Parameterized_goal.md...
generating: Directives-@goal-Naming_rules.md...
generating: Directives-@doc.md...
generating: Directives-@depends_on.md...
generating: Directives-@reached_if.md...
generating: Directives-@lib.md...
generating: Directives-@use_lib.md...
generating: Bash_completion.md...
generating: Design_principles.md...
generating: Omitted_features.md...
fix relative link: docs/DEVELOPER.md -> https://github.com/xonixx/makesure/blob/main/docs/DEVELOPER.md
generating: Developer_notes.md...
generating: Developer_notes-AWK.md...
generating: Articles.md...
generating: Similar_tools.md...
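The "fix relative link" and "fix image link" lines above suggest a simple URL pattern: regular links go to .../blob/main/..., image links go to .../raw/main/... Here is a minimal sketch of that rewrite. The function name fix_link and its arguments are hypothetical, the branch name main is taken from the log, and leaving absolute URLs and #anchors untouched is my assumption for the example.

```shell
# Hypothetical sketch of the relative-link rewrite seen in the log above.
# $1 = link target, $2 = "image" for ![title](link), anything else for [title](link)
fix_link() {
  REPO="${REPO:-username/reponame}" awk -v link="$1" -v kind="$2" 'BEGIN {
    # Assumption: absolute URLs and in-page anchors are not handled here.
    if (link ~ /^(https?:|#)/) { print link; exit }
    print "https://github.com/" ENVIRON["REPO"] "/" (kind == "image" ? "raw" : "blob") "/main/" link
  }'
}
```

With REPO=xonixx/makesure exported, `fix_link coverage.svg image` prints https://github.com/xonixx/makesure/raw/main/coverage.svg, which matches the first line of the log.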
The only alternative I’ve found is Docsify, a small JS lib that renders your README.md as a single-page website. I didn’t like it, though, because such single-page sites usually give a poor user experience.