NAME
sblg - static blog utility
SYNOPSIS
sblg [-acjlLrV] [-C file] [-o file] [-s sort] [-t template] file ...
DESCRIPTION
The sblg
utility merges XML articles and
templates in a number of ways.
- Standalone mode (-c) merges a single article's content and metadata into a template. For example, "sblg -o- -c foo.xml" merges foo.xml into the template article-template.xml.
- Blog mode (the default) merges multiple articles' content and metadata into a template. For example, "sblg -o- bar.xml baz.xml" merges bar.xml and baz.xml into the template blog-template.xml.
- Combined mode (-C) links multiple articles' content and metadata in standalone style. For example, "sblg -o- -C bar.xml baz.xml" will show content for only bar.xml, but metadata for both inputs. The similar -L flag runs the process for each input file without reparsing.
- Atom mode (-a) merges multiple articles into an Atom feed template.
- JSON mode (-j) merges all articles into a JSON object.
By default, sblg
operates in blog mode
with template blog-template.xml. Its arguments are
as follows:
-a
- Creates an Atom feed from its input files.
-c
- Create standalone articles instead of merging articles together.
-l
- Instead of emitting any output files, simply process the input and report a table of tags. This table consists of the input file name, a tab, then the tag. (This is also known as article-major order.) Escaped white-space in the tag is printed unescaped. You can also use -r for tag-major order and -j for JSON output. Specify -l twice to show matches (tags for article-major, articles for tag-major) all on one tab-separated line, instead of one per line.
-r
- Print the -l tag listing in “tag-major” order wherein the first column is the tag and the second column is the article. If the -j flag is specified, this is JSON formatted.
-j
- JSON instead of XML output mode. This behaves as in blog mode, but outputs JSON instead of XML. If -l is specified, the tag listing will be displayed in JSON instead. See JSON Schema for details.
-C file
- Like -c, but creating a blog from the article in file with the remaining files being articles used for navigation.
-L
- Like -C, but acting on all input files, translating the input to output files such as in -c without -o. If there are multiple articles in an output file, the output is recreated for each (so only the last will remain). So running with “article0.xml article1.xml” will produce “article0.html article1.html” as if -C were separately specified for both. This avoids needing to parse all inputs for each input.
-o file
- Output file. If unspecified, standalone articles have .html appended to the input file name, unless the input file extension is .xml, in which case the .xml is replaced by .html. If multiple input files are specified, -o is ignored. If unspecified for the blog, blog.html is used by default. If unspecified for the Atom feed or JSON, atom.xml or blog.json, respectively, is used by default. Use -o - for standard output.
-s sort
- Change how articles are sorted before being written into navigation or article entries. The default is date, which sorts newest to oldest by date. You can also specify filename, which sorts in increasing A–Z case-sensitive order of the source filename; cmdline for the command-line order; ititle for the case-insensitive document title; or title for the case-sensitive document title. Each sort may be prefixed with "r" (e.g., rcmdline) to reverse the sort.
-t template
- Template for all modes. If unspecified, defaults to article-template.xml for -c, atom-template.xml for -a, and blog-template.xml otherwise.
-V
- Emits the version as sblg-xx.yy.zz and exits.
file ...
- Input files. In standalone mode with -c, input XML files are merged with a template into an output file. Otherwise, multiple input files are merged into a single blog.
All input must be well-formed XML. Element names and attributes are case-sensitive.
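The default output name used in standalone mode when -o is not given can be sketched as follows. This is only an illustration of the rule stated for -o, not code from sblg; the function name default_output_name is hypothetical, and treating the .xml suffix as case-sensitive is an assumption.

```python
def default_output_name(input_name: str) -> str:
    """Sketch of the standalone naming rule (hypothetical helper)."""
    if input_name.endswith(".xml"):
        # A trailing .xml is replaced by .html.
        return input_name[: -len(".xml")] + ".html"
    # Any other input name simply has .html appended.
    return input_name + ".html"


if __name__ == "__main__":
    for name in ("article1.xml", "notes.txt", "README"):
        print(name, "->", default_output_name(name))
```

Under these assumptions, article1.xml maps to article1.html while notes.txt maps to notes.txt.html.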
Article Input
Article input files consist of the following within the document:
<article data-sblg-article="1">
  <header>
    <h1>Article Name</h1>
    <address>Author Name</address>
    <time datetime="2013-06-29">29 June, 2013</time>
  </header>
  <aside>
    This is used as the feed <b>abstract</b>.
  </aside>
  <p>
    Some text in the <b>content</b>.
    <img src="foo.jpg" alt="An image for the feed" />
  </p>
</article>
All content outside of the element with the data-sblg-article="1" attribute, usually an <article>, is discarded. Then the article is scanned for the following:
- the article title (both as text data only and inclusive of markup) is extracted from the first <hn> (header 1–4);
- the article publication date is extracted from the datetime attribute of the first <time> (which must be a date, YYYY-MM-DD, or time, YYYY-MM-DDTHH:MM:SSZ) interpreted in UTC;
- the author (both as text data only and inclusive of markup) from the first <address>;
- the first <aside> is used for the feed abstract; and
- the first <img> is associated as the article's image.
These are all set once: subsequent occurrences will not override a prior setting. See data-sblg-aside, data-sblg-author, data-sblg-datetime, data-sblg-img, and data-sblg-title for explicitly setting or overriding these values.
If unspecified, the default article title text (and mark-up) is "Untitled article", the default author text (and mark-up) is "Unknown author", the publication time is set to the document's file-system creation time, the abstract is left empty, and the image is empty.
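The two accepted datetime forms, both interpreted in UTC, might be parsed as in the sketch below. It is not the parser sblg itself uses; the name parse_sblg_datetime is hypothetical, and raising ValueError on any other input is an assumption made for the example.

```python
from datetime import datetime, timezone


def parse_sblg_datetime(value: str) -> datetime:
    """Parse YYYY-MM-DD or YYYY-MM-DDTHH:MM:SSZ as a UTC time (sketch)."""
    for fmt in ("%Y-%m-%dT%H:%M:%SZ", "%Y-%m-%d"):
        try:
            # Both forms are taken to be UTC, so attach that zone explicitly.
            return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    # Assumption: anything else is rejected outright.
    raise ValueError(f"not a recognised date or date-time: {value!r}")


if __name__ == "__main__":
    print(parse_sblg_datetime("2013-06-29").isoformat())
    print(parse_sblg_datetime("2013-06-29T12:30:00Z").isoformat())
```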
There are a number of special attributes that are recognised in the input file.
data-sblg-aside=string
- Sets the aside material as otherwise would be set from the first <aside> element. It overrides the previously set aside. The alternative data-sblg-const-aside only sets the aside if it has not yet been set.
data-sblg-author=string
- Sets the author as otherwise would be set from the first <address> element. It overrides the previously set author. The alternative data-sblg-const-author only sets the author if it has not yet been set.
data-sblg-datetime=datetime
- Overrides the first <time> element. This must be YYYY-MM-DD or YYYY-MM-DDTHH:MM:SSZ. It overrides the previously set date. The alternative data-sblg-const-datetime only sets the date if it has not yet been set.
data-sblg-img=url
- Set the image associated with the article. It overrides any previously set image. The alternative data-sblg-const-img only sets the image if it has not yet been set.
data-sblg-lang=string
- May only be set on the <article> and specifies one or more space-separated languages for the document. You can escape spaces with a backslash (“\”) if you have spaces in the tag name, e.g., “foo\ bar”. These languages are removed in the “stripping” operations for the Tag Symbols.
data-sblg-set-xxx=string
- This allows arbitrary values to be attached to the article. For example, specifying data-sblg-set-foo="bar" sets the foo keyword to bar. If specified multiple times for the same key, only the last value is used. These may be retrieved with ${sblg-get} or queried with ${sblg-has} of the Tag Symbols.
data-sblg-sort=first|last
- May only be set on the <article> element and overrides the article's position relative to other articles. This can be either first or last. If multiple articles have the same sort override, they are ordered in the natural way.
data-sblg-source=file
- Set the source filename associated with the article. It overrides the implicit value set from the actual file.
data-sblg-tags=string
- This tag may be specified on any element within the article and consists of space-separated tag names. You can escape spaces with a backslash (“\”) if you have spaces in the tag name, e.g., “foo\ bar”. These tags are extracted for navigation tag operation. It may not contain any tabs.
data-sblg-title=string
- Sets the title as otherwise would be set in a <hN> element. It overrides the previously set title. The alternative data-sblg-const-title only sets the title if it has not yet been set.
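The space-separated tag syntax with backslash-escaped spaces can be illustrated with a small splitter. This is a sketch of the syntax as described above, not sblg's own parser; split_tags is a hypothetical name, and keeping any backslash-escaped character literally (not just a space) is an assumption.

```python
def split_tags(value: str) -> list[str]:
    """Split space-separated tags, honouring backslash-escaped spaces."""
    tags, current, escaped = [], [], False
    for ch in value:
        if escaped:
            # Assumption: any escaped character is kept as-is.
            current.append(ch)
            escaped = False
        elif ch == "\\":
            escaped = True
        elif ch == " ":
            # An unescaped space ends the current tag, if there is one.
            if current:
                tags.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tags.append("".join(current))
    return tags


if __name__ == "__main__":
    print(split_tags("foo\\ bar baz"))
```

With this reading, the attribute value “foo\ bar baz” yields two tags, “foo bar” and “baz”.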
Standalone Template
The standalone template file replaces the first element with the
data-sblg-article="1"
attribute, usually
an <article>
, with the article contents.
<body>
  <header>This consists of a single blog entry.</header>
  <article>This is kept.</article>
  <article data-sblg-article="1">This is removed.</article>
  <footer>Something.</footer>
</body>
Article templates may contain the following attributes:
data-sblg-article=boolean
- If set to true, the contents are replaced with the input article. This only happens once: subsequent elements are ignored.
data-sblg-ign-once=boolean
- If an element with the data-sblg-article="1" attribute has this set to true, the element is not processed as an article and the data-sblg-ign-once attribute is removed.
See Tag Symbols for a list of symbols that will be replaced if found in attribute value or textual contexts. These may occur anywhere in the template document.
Blog Template
The blog template replaces elements with the
data-sblg-article="1"
attribute, usually
<article>
, with ordered (by default, newest to
oldest) article contents. If there aren't enough articles, the element is
removed.
Elements with a
data-sblg-nav="1"
attribute, usually
<nav>
, are replaced by the same list of
articles within an unordered list.
If an element has both attributes, only the first is recognised.
Usually, the article elements are used for displaying full articles, while the navigation elements are used for displaying navigation to articles, such as just their titles, dates, and links.
<body>
  <header>This consists of two blog entries.</header>
  <nav data-sblg-nav="1" />
  <article data-sblg-article="1" />
  <article data-sblg-article="1" />
  <footer>Something.</footer>
</body>
Article templates may contain several attributes.
data-sblg-article=boolean
- If set to true, the contents (including the element itself) are replaced with the input article.
data-sblg-articletag=string
- If an element with the data-sblg-article="1" attribute contains this, limit displayed articles to those matching the space-separated tags or ${sblg-get|xxx} when in -L or -C mode. This scans for tags from the current article in the list of articles.
data-sblg-ign-once=boolean
- If an element with the data-sblg-article="1" attribute has this set to true, the element is not processed as an article and the data-sblg-ign-once attribute is removed.
data-sblg-permlink=boolean
- If an element with the data-sblg-article="1" attribute has this set to true, a permanent link to the article's input filename is emitted within a <div data-sblg-permlink="1"> element after the element with the data-sblg-article="1" attribute.
The navigation element may contain several attributes.
data-sblg-navcontent=boolean
- Deprecated alias for respective content and element styles list-keep and keep if true, list-summarise and keep if false.
data-sblg-navstyle-content=string
- Style for formatting articles into the content of the navigation element. May be keep, to output the content per-article and perform Tag Symbols substitution; summarise or summarize, to discard content and output the article time followed by a link to the article; list-keep, same as keep except surrounding each article with <li> and all articles with <ul>; or list-summarise or list-summarize, same as summarise except surrounding each article with <li> and all articles with <ul>. If not given or unknown, defaults to list-summarise.
data-sblg-navstyle-element=string
- Style for the navigation element. May be keep, to output the element as-is once around all articles; keep-strip, to output the element without attributes once around all articles; repeat-strip, to output the element without attributes around each article (if the content styles are list-keep or list-summarise, the element is output within the <li>); or discard to suppress output. If not given or unknown, defaults to keep.
data-sblg-navsort=string
- Overrides the global sort order given with -s. Uses the same names. If the sort name is not recognised, the attribute is silently ignored and the global sort order used.
data-sblg-navstart=number
- How many articles will skip being displayed (so if you have tags, it will only account for articles that would meet those tags) before showing the first navigation entry. Starts at one (a value of zero is the same as a value of one).
data-sblg-navsz=number
- If the <nav> element contains this attribute with a positive integer, it is used to limit the number of navigation entries.
data-sblg-navtag=string
- Only articles with matching tags are shown. You can specify multiple space-separated tags, for instance, data-sblg-navtag="foo bar" will search for foo or bar. Tags to be matched against are extracted from the space-separated data-sblg-tags attribute of each article's topmost element. Escape spaces with a backslash (“\”) if you have spaces in the tag name, e.g., “foo\ bar”. Use ${sblg-get|xxx} or (for multi-word values) ${sblg-get-escaped|xxx} when in -C or -L mode to use the current article's set data as part of a string, e.g., location-${sblg-get|location}.
data-sblg-navxml=boolean
- Deprecated alias for respective content and element styles keep and discard if true, list-summarise and keep if false.
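How the navigation's tag filter (data-sblg-navtag), start position, and size limit might combine can be sketched as follows. This is one reading of the descriptions above and not sblg's implementation: the function select_nav_entries and its parameters are hypothetical, articles are assumed to be already sorted, and the start value is read as a one-based position applied after tag filtering.

```python
def select_nav_entries(articles, wanted_tags=None, start=1, size=None):
    """Pick the article names a navigation element would list (sketch).

    articles is a list of (name, set_of_tags) pairs in display order.
    """
    if wanted_tags:
        # Any one matching tag is enough: "foo bar" matches foo or bar.
        articles = [a for a in articles if a[1] & set(wanted_tags)]
    # The start position counts from one; zero behaves like one.
    first = max(start, 1) - 1
    selected = articles[first:]
    if size is not None and size > 0:
        # Only a positive size limits the number of entries.
        selected = selected[:size]
    return [name for name, _tags in selected]


if __name__ == "__main__":
    arts = [("a.xml", {"foo"}), ("b.xml", {"bar"}),
            ("c.xml", {"baz"}), ("d.xml", {"foo", "bar"})]
    print(select_nav_entries(arts, ["foo", "bar"], start=2, size=1))
```

The point of filtering before skipping is that the start position only accounts for articles that meet the tags, as the description of the start attribute states.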
Combined Template
This is identical to the Blog
Template except that a single article is noted with
-C
, and this is the only article displayed in the
article stub. Furthermore, like in standalone mode,
Tag Symbols may be used anywhere in
the document template and refer to the current article unless within a
navigation element, in which case the symbol resolves to the
currently-printed article. In the given example,
<body>
  <header>This consists of two blog entries.</header>
  <nav data-sblg-nav="1" />
  <article data-sblg-article="1" />
  <article data-sblg-article="1" />
  <footer>Something.</footer>
</body>
the navigation would be populated by all articles, but only the first article stub would be filled in with the specified article. The second would be removed.
This follows the usual rules of
data-sblg-articletag
, so if the article you specify
with -C
doesn't have the correct tag, it won't
inline the article.
Atom Template
The Atom template file must be a well-formed XML file where each
<entry>
element with a Boolean
data-sblg-entry
attribute is replaced by ordered
(newest to oldest) article information. If there aren't enough articles, the
element is removed. The template may contain pre-existing entries.
The following is a minimal template: anything less will not conform to the Atom specification:
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <link href="http://example.org" />
  <title>A Title Here</title>
  <updated />
  <id />
  <entry data-sblg-entry="1" data-sblg-forall="1" />
</feed>
The recognised elements are as follows. Un-recognised elements are printed verbatim.
<entry data-sblg-entry="1">
- Filled-in article entry. If the attribute is not specified, the entry is retained verbatim. Otherwise it is filled in with an article's information.
<id>
- If this is empty, it is filled in with the URL in <link [rel="alternate"]>, which must exist. Otherwise, the value is copied and used for subsequent feed entries.
<link [rel="alternate"]>
- Unless an <id> is provided, the href attribute must be a full URL, e.g., <link href="https://kristaps.bsd.lv/">. Otherwise, it may be a relative path. This element must be first.
<updated>
- This is filled in with the date of the most recent article. Its contents are discarded.
There are a number of special attributes that may be given to the above elements.
data-sblg-altlink=boolean
- If an <entry data-sblg-entry="1"> element contains this set to true, the alternate <link> is printed.
data-sblg-altlink-fmt=string
- If both data-sblg-entry and data-sblg-altlink are true for an <entry>, the value is used as the link address. Accepts Tag Symbols, most commonly being ${sblg-base}.
data-sblg-atomcontent=boolean
- If <entry data-sblg-entry="1"> contains this set to true, the contents are printed directly and the Tag Symbols are processed. This overrides data-sblg-altlink and data-sblg-content.
data-sblg-content=boolean
- If <entry data-sblg-entry="1"> contains this set to true, the article's contents (everything within the element having the data-sblg-article="1" attribute) are inlined within the <content> element with type html. Tag Symbols are processed.
data-sblg-entry=boolean
- Each <entry> element with this is filled in with article content.
data-sblg-forall=boolean
- If an <entry data-sblg-entry="1"> element contains this set to true, it is used for all remaining articles. Any <entry data-sblg-entry="1"> following this are discarded.
If not using data-sblg-atomcontent, entries are filled in with a <title>, <id>, <author>, HTML <content> (specified in the article as an <aside>), and alternate <link>. The <id> is constructed by appending the source filename, hash print, and date following the feed's <id> or <link> element.
When filling in HTML content, sblg
will
strip away HTML attributes that do not fit into a white-list. This
white-list is defined by the W3C's Feed Validator.
JSON Schema
sblg
can produce JSON with the
-j
flag. The structure of the JSON file is
consumable either with a JSON schema (noted in the
FILES section) or using the typings that may
be downloaded with npm(1):
npm install sblg
If -l
is specified, the output schema is
simply an array as follows. Let source1.xml and
source2.xml be input files with a variety of
tags.
[
  {"src": "source1.xml", "tags": ["tag1","tag2"]},
  {"src": "source2.xml", "tags": ["tag1"]}
]
If, however, -r
is also specified, the
reverse format is used:
[
  {"tag": "tag1", "srcs": ["source1.xml","source2.xml"]},
  {"tag": "tag2", "srcs": ["source1.xml"]}
]
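The relationship between the two listings can be shown by converting the article-major form into the tag-major one. This is a sketch for consumers of the JSON output, not something sblg provides; to_tag_major is a hypothetical name, and ordering tags by first appearance is an assumption.

```python
import json


def to_tag_major(article_major):
    """Turn [{"src": ..., "tags": [...]}] into [{"tag": ..., "srcs": [...]}]."""
    by_tag = {}
    for entry in article_major:
        for tag in entry["tags"]:
            # Dictionaries keep insertion order, so tags appear as first seen.
            by_tag.setdefault(tag, []).append(entry["src"])
    return [{"tag": tag, "srcs": srcs} for tag, srcs in by_tag.items()]


listing = json.loads("""
[ {"src": "source1.xml", "tags": ["tag1", "tag2"]},
  {"src": "source2.xml", "tags": ["tag1"]} ]
""")

if __name__ == "__main__":
    print(json.dumps(to_tag_major(listing), indent=2))
```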
Tag Symbols
Within the template for -c
or
-C
, or in any article contents written (either into
an article or navigation entry), the following special strings are replaced.
These symbols concern the current article being processed: in a navigation
entry, or as article contents. In the event of the positional
“next” and “prev” symbols, these refer to the
article's position within the input articles. Obviously,
-c
has only a single article.
In general, these must be written exactly as shown, e.g., ${sblg-aside} and not ${ sblg-aside }. Some symbols accept optional arguments, which have the format ${sblg-tags[|argument]}. Here, |argument may be omitted.
Be careful in using tag symbols: the contents are copied directly, so if specifying a value within an HTML attribute that has a double-quote, the attribute will be prematurely closed.
To prevent regular text with ${...} from being processed, escape one or more characters, such as &#36;{...}.
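The strictness of the symbol syntax can be illustrated with the ${sblg-get|key} and ${sblg-has|key} symbols described later in this section. The sketch below is not how sblg is implemented: the regular expression, the function substitute, and the values dictionary (standing in for data-sblg-set-xxx attributes) are all hypothetical.

```python
import re

# Only the exact forms ${sblg-get|key} and ${sblg-has|key} are matched;
# a variant with extra spaces such as "${ sblg-get|key }" is left alone.
SYMBOL = re.compile(r"\$\{sblg-(get|has)\|([^}|]+)\}")


def substitute(text: str, values: dict[str, str]) -> str:
    """Replace get/has symbols using the given key-value pairs (sketch)."""
    def replace(match: re.Match) -> str:
        kind, key = match.group(1), match.group(2)
        if kind == "get":
            # A missing key is omitted from the output.
            return values.get(key, "")
        # "has" prints sblg-has-KEY only when the key exists.
        return f"sblg-has-{key}" if key in values else ""
    return SYMBOL.sub(replace, text)


if __name__ == "__main__":
    template = '<p class="${sblg-has|foo}">${sblg-get|foo}</p>'
    print(substitute(template, {"foo": "bar"}))
```

Note that the value is copied directly, so a value containing a double-quote would close the surrounding attribute early, as cautioned above.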
${sblg-abscount}
- The total number of articles. This is only valid in <nav data-sblg-nav="1">, otherwise it always prints 1. See also ${sblg-count} and ${sblg-setcount}.
${sblg-abspos}
- The position (from 1) of the article in the list of all articles. This is only valid in a <nav data-sblg-nav="1"> context, otherwise it always prints 1. See also ${sblg-pos}.
${sblg-aside}
- The article's first aside with markup.
${sblg-asidetext}
- The article's first aside, textual parts only.
${sblg-author}
- The article's author with markup.
${sblg-authortext}
- The article's author, textual parts only.
${sblg-realbase}
- Like ${sblg-base}, and having the same sub-types, except deriving from ${sblg-real}.
${sblg-base}
- Same as ${sblg-source} but with the last suffix part chopped off. For example, foo/bar.xml becomes foo/bar. The ${sblg-stripbase} variant will strip off the directory part and any suffix. For example, foo/bar.xml becomes bar. The ${sblg-striplangbase} variant will also strip the language. For example, if the “en” language was specified on the article, foo/bar.en.xml becomes bar.
${sblg-count}
- The total number of articles that will be shown, i.e., taking into consideration the navigation length and offset. In standalone mode, this is always 1. In <nav data-sblg-nav="1">, it's the total number within the navigation. See also ${sblg-abscount} and ${sblg-setcount}.
${sblg-date}
- The publication date as YYYY-MM-DD (UTC).
${sblg-datetime}
- The publication date and time as YYYY-MM-DDTHH:MM:SSZ (UTC).
${sblg-datetime-fmt[|fmt]}
- A human-readable representation of the date and, if specified, time in local time. This accepts an optional format string passed to strftime(3). If the format string is empty or “auto”, a human-readable date (with %x) or date-time (%c) is printed.
${sblg-img}
- The article's associated image. This will be an empty string if no image was specified.
${sblg-first-base}
- The first (newest) base name in the list of articles. There are also ${sblg-first-stripbase} and ${sblg-first-striplangbase} variants. See ${sblg-base}.
${sblg-last-base}
- The last (oldest) base name in the list of articles. There are also ${sblg-last-stripbase} and ${sblg-last-striplangbase} variants. See ${sblg-base}.
${sblg-next-base}
- The next base name when chronologically ordered from newest to oldest, wrapping back to the beginning for the last. There are also ${sblg-next-stripbase} and ${sblg-next-striplangbase} variants. See ${sblg-base}.
${sblg-next-has}
- Prints sblg-next-has if there exists a next article in the ordered set, otherwise prints nothing.
${sblg-pos}
- The position (from 1) of the articles actually shown. This always starts at 1 and increments by one, regardless of the tag filtering or starting position. In standalone mode, it always prints 1. In blog mode (outside of a <nav> context), it shows the position in the input files. Within a <nav> context, it shows the position within the navigation.
${sblg-pos-frac}
- The fractional (0–1) value of ${sblg-pos}/${sblg-count}.
${sblg-pos-pct}
- The percentage (0–100, not including the percent sign) form of ${sblg-pos-frac}.
${sblg-prev-base}
- The previous base name when chronologically ordered from newest to oldest, wrapping around to the end for the first. There are also ${sblg-prev-stripbase} and ${sblg-prev-striplangbase} variants. See ${sblg-base}.
${sblg-prev-has}
- Prints sblg-prev-has if there exists a previous article in the ordered set, otherwise prints nothing.
${sblg-get[|key]}
- Print the value of key assigned in data-sblg-set-key. If unspecified or the key was not found, this is ignored and omitted from output. The lookup is case sensitive.
${sblg-get-escaped[|key]}
- Like ${sblg-get[|key]}, but escapes the value of the key so that it may be used for data-sblg-navtag or data-sblg-articletag attribute values for multi-word tags.
${sblg-has[|key]}
- Like ${sblg-get[|key]}, but queries whether the key exists. If it is specified and it does exist, then the string sblg-has-key is printed. This is useful in class attributes to test whether a given key has been specified.
${sblg-setcount}
- Like ${sblg-count}, but counts only the articles matching the requested tags. See also ${sblg-count} and ${sblg-abscount}.
${sblg-real}
- The article's actual source file. See ${sblg-source} for an overridable source indicator.
${sblg-source}
- The source file associated with the article.
${sblg-tags[|tagspec]}
- List of unique tags in the article, optionally filtered by those having the prefix tagspec. If the prefix is not specified, all tags are listed. Each tag (e.g., TAG) is listed as <span class="sblg-tag">TAG</span>. If no tags were found, a single <span class="sblg-tags-notfound"></span> is emitted.
${sblg-title}
- The article title with markup.
${sblg-titletext}
- The article title, textual parts only.
${sblg-url}
- The output filename, which is empty for standard output.
${sblg-version}
- The current sblg version as xx.yy.zz.
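The ${sblg-base}, ${sblg-stripbase}, and ${sblg-striplangbase} transformations can be sketched from the examples given for them. These functions are illustrations only and their names are hypothetical. Two points are assumptions: that stripbase removes just the last suffix (so foo/bar.en.xml would give bar.en), and that the language is a final dot-separated component matched against the article's data-sblg-lang values.

```python
import posixpath


def base(source: str) -> str:
    """Like ${sblg-base}: drop the last suffix, keep the directory."""
    root, _ext = posixpath.splitext(source)
    return root


def stripbase(source: str) -> str:
    """Like ${sblg-stripbase}: drop the directory part and the suffix."""
    return base(posixpath.basename(source))


def striplangbase(source: str, langs: list[str]) -> str:
    """Like ${sblg-striplangbase}: also drop a trailing language part."""
    name = stripbase(source)
    root, ext = posixpath.splitext(name)
    # Only strip the component if it is one of the article's languages.
    if ext and ext[1:] in langs:
        return root
    return name


if __name__ == "__main__":
    print(base("foo/bar.xml"))
    print(stripbase("foo/bar.xml"))
    print(striplangbase("foo/bar.en.xml", ["en"]))
```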
FILES
The following files are installed in /usr/local/share/sblg.
schema.json
- JSON schema for output generated with -j.
EXIT STATUS
The sblg
utility exits 0 on
success, and >0 if an error occurs.
EXAMPLES
First, create standalone HTML5 files (filled-in
<article data-sblg-article="1">
)
from article fragments. An article-template.xml file
is assumed to exist. This will create article1.html
and article2.html from the re-write rule for the XML
suffix.
% sblg -c article1.xml article2.xml
Next, merge formatted files into a front page. A blog-template.xml file is assumed to exist.
% sblg -o index.html article1.html article2.html
This will create index.html with filled-in
<article data-sblg-article="1">
and
<nav data-sblg-nav="1">
elements.
Combining the above two examples, we can specify a single article to be displayed along with a full navigation as follows:
% sblg -o article1.html -C article1.xml article1.xml article2.xml
This will fill the contents of article1.xml into the <article data-sblg-article="1"> but use both (along with any others) in the <nav data-sblg-nav="1">.
If we want to make an output article as in the above example for each element of the input, we could either run -C for each input element, or use -L to avoid re-running sblg for each input article, which can be costly for many articles.
% sblg -L article1.xml article2.xml
This re-writes the suffixes and fills in the <article data-sblg-article="1"> for article1.xml in article1.html, and so on. For each of these, it will fill in <nav data-sblg-nav="1">.
STANDARDS
Input files and templates must be well-formed XML files. Output files are guaranteed to be XML as well. The Atom file template must be well-formed; output is guaranteed to satisfy the Atom 1.0 and Tag ID standards.
AUTHORS
The sblg
utility was written by
Kristaps Dzonsons,
kristaps@bsd.lv.
CAVEATS
Boolean XML attributes must have a value specified. In other words, <foo bar="1"> is valid, while <foo bar> is not.
HTML entity names with attributes, e.g. <a
title="foo…">
, are not properly passed to
output.