ORT(5) | File Formats Manual | ORT(5) |
ort - syntax for ort configuration
An ort configuration is a human-readable data model format. It defines an application's data types, modifiers (creation, modification, deletion), queries, and access control.
Configurations have one or more structures, zero or more user-defined types (enumerations, bitfields), and zero or more access control roles.
config :== [ enum | bitfield | struct ]+ [ roles ]?
roles :== "roles" "{"
    [ "role" roledata ";" ]+
  "};"
struct :== "struct" structname "{"
    [ "comment" string_literal ";" ]?
    [ "count" searchdata ";" ]*
    [ "delete" deletedata ";" ]*
    [ "field" fielddata ";" ]+
    [ "insert" ";" ]?
    [ "iterate" searchdata ";" ]*
    [ "list" searchdata ";" ]*
    [ "roles" roledata ";" ]*
    [ "search" searchdata ";" ]*
    [ "unique" uniquedata ";" ]*
    [ "update" updatedata ";" ]*
  "};"
enum :== "enum" enumname "{"
    [ "comment" string_literal ";" ]?
    [ "item" enumdata ";" ]+
    [ "isnull" label ";" ]?
  "};"
bitfield :== "bits" bitsname "{"
    [ "comment" string_literal ";" ]?
    [ "item" bitsdata ";" ]+
    [ "isunset" label ";" ]?
    [ "isnull" label ";" ]?
  "};"
Structures describe a class of data, such as a user, animal, product, etc. They consist of data and actions on that data: querying, creating, modifying, deleting. The data is specified in fields defining type, validation constraints, relations to other structures' data, and so on.
Data types may be native (integers, strings, references), user-defined (enumerations or bit-fields), or meta (currently only sub-structures). Enumerations define fixed constants used in data field definitions. Bit-fields are similar, except that they describe bits set within a single value.
Roles define access control on data content and operations.
Structures, user-defined types, and roles are collectively called a configuration's "objects".
In ort, white-space separates tokens: it is discarded except as found within quoted literals. Thus, the following are identical except for the name:
struct foo {
  field id int rowid;
};
struct bar{field id int rowid;};
Except for the content of string literals, an ort configuration only recognises ASCII characters.
Objects are generally named by an identifier. An identifier is always a case-insensitive, non-empty alphanumeric string beginning with a letter. There are many disallowed or reserved identifiers. There are also unique name constraints to consider (e.g., no two structures can have the same name).
A conforming and non-conforming identifier:
enum foobar { ... }; # ok
enum foo_bar { ... }; # no
Although identifiers may appear in any case, they are internally converted to lowercase (this includes, for example, role names, query names, field names, language labels, etc.).
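The identifier rule can be sketched as a small check. This is only an illustration in Python, not ort's parser: the function names are hypothetical, and the reserved-word and uniqueness constraints mentioned above are not modelled.

```python
import re

# Hypothetical helper names. This models only the lexical rule:
# a non-empty ASCII alphanumeric string beginning with a letter.
_IDENT = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def is_identifier(token: str) -> bool:
    """Return True if token has the shape of an identifier."""
    return _IDENT.fullmatch(token) is not None


def normalise(token: str) -> str:
    """Lowercase an identifier, mirroring the internal conversion."""
    if not is_identifier(token):
        raise ValueError(f"not an identifier: {token!r}")
    return token.lower()
```

Under this sketch, foobar is accepted, foo_bar is rejected because of the underscore, and FooBar normalises to foobar.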
Another common syntactic element is the string literal: a double-quoted string where internal double quotes may be escaped by a single preceding backslash.
struct ident { field id int comment "\"Literal\"."; };
Document comments are begun by the hash mark (“#”) and extend to the end of the line.
# This is my structure.
struct ident {
  field id int comment "\"Literal\"."; # End of line.
};
They are always discarded and not considered part of the parsed configuration file.
Both decimal and integral numbers are recognised. Integral numbers are signed, limited to 64 bits, and formatted as [-]?[0-9]+. Decimal numbers are similarly formatted as [-]?[0-9]+.[0-9]*. The difference between the two is the presence of the decimal point.
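A rough way to tell the two number formats apart is sketched below. It assumes that the "." in the decimal pattern is a literal decimal point, and that an integer outside the signed 64-bit range is simply rejected; how ort itself reports such values is not stated here. The function name is hypothetical.

```python
import re

INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1

# Patterns from the text; the "." is assumed to be a literal decimal point.
_INTEGER = re.compile(r"-?[0-9]+")
_DECIMAL = re.compile(r"-?[0-9]+\.[0-9]*")


def classify_number(token):
    """Return "integer", "decimal", or None if token is neither."""
    if _INTEGER.fullmatch(token):
        # Integers are signed and limited to 64 bits (rejection is assumed).
        if INT64_MIN <= int(token) <= INT64_MAX:
            return "integer"
        return None
    if _DECIMAL.fullmatch(token):
        return "decimal"
    return None
```

Note that "3." counts as a decimal under the pattern, since the fractional digits are optional, while ".5" does not, since at least one leading digit is required.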
There are no ordering constraints on objects: all linkage between components (e.g., referenced fields, roles, types, etc.) occurs after parsing the document.
A structure consists of data definitions, operations, and access.
It begins with struct, then the unique identifier of the structure, then elements within the curly braces.
"struct" structname "{"
    [ "comment" string_literal ";" ]?
    [ "count" searchdata ";" ]*
    [ "delete" deletedata ";" ]*
    [ "field" fielddata ";" ]+
    [ "insert" ";" ]?
    [ "iterate" searchdata ";" ]*
    [ "list" searchdata ";" ]*
    [ "roles" roledata ";" ]*
    [ "search" searchdata ";" ]*
    [ "unique" uniquedata ";" ]*
    [ "update" updatedata ";" ]*
  "};"
The elements may consist of one or more field statements describing data fields; optionally a comment describing the structure; zero or more update or delete statements and an optional insert statement that define data modification; zero or more unique statements that create unique constraints on multiple fields; zero or more count, list, iterate, or search statements for querying data; and zero or more roles statements enumerating role-based access control.
Column definitions. Each field consists of the field keyword followed by an identifier and, optionally, a type with additional information.
"field" name[":" target]? [type [typeinfo]*]? ";"
The name may either be a standalone identifier or a "foreign key" referencing a field in another structure by the structure and field name. In this case, the referenced field must be a rowid or unique and have the same type.
The type, if specified, may be one of the following.

bit
    [...] Non-zero values must be merged into a bit-field by setting 1LLU << (value - 1) (using C notation) prior to storage. For entire bitfields, see bits.
bitfield name
    [...] bits.
bits name
    [...] bit, non-zero values must be merged into a bit-field by setting 1LLU << (value - 1) (using C notation) prior to storage.
blob
email
enum name
int
password
real
epoch
    [...] The date alias is also available, which is the same but using a date (ISO 8601) sequence input validator.
struct field
    [...] struct. This meta type is not represented by real data: it only structures the output code. In the C API, for example, this is represented by a struct name of the referent structure. The field may be marked with null, but this involves a not-inconsiderable performance hit when querying (directly or indirectly) on the structure. Sub-structures may not be recursive: a field may not reference another struct that eventually references the origin.
text
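The merging step quoted for the bit and bits types can be sketched as follows. This is an illustration of the expression 1LLU << (value - 1) only: the function name is hypothetical, and the accepted range of 0 to 64 for a bit value is an assumption of this sketch.

```python
MASK64 = (1 << 64) - 1


def merge_bit(bitfield: int, value: int) -> int:
    """Merge a bit-type value into a 64-bit bit-field.

    A value of zero sets nothing; a non-zero value n sets bit n - 1,
    mirroring the C expression 1LLU << (value - 1).
    """
    if not 0 <= value <= 64:
        raise ValueError("bit value out of range (assumed 0..64)")
    if value == 0:
        return bitfield & MASK64
    return (bitfield | (1 << (value - 1))) & MASK64
```

For instance, merging the value 3 into a field that already holds 1 yields 5, because bit index 2 is set.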
The typeinfo provides further information (or operations) regarding the field, and may consist of the following:

actdel action
    Like actup but on deletion of the field in the database.
actup action
    [...] null or has a default value. The nullify value may only be used if the field is marked null.
comment string_literal
default integer|decimal|date|string_literal|enum
limit limit_op limit_val
    [...] enum, bits, or struct. If there are multiple statements, all of them must validate. The limit_op argument consists of an operator the limit_val is checked against. Available operators are ge, le, gt, lt, and eq. Respectively, these mean the field should be greater than or equal to, less than or equal to, greater than, less than, or equal to the given value. If the field type is text, email, password, or blob, this refers to the string (or binary) length in bytes. For numeric types, it's the value itself. The given value must match the field type: an integer (which may be signed) for integers, integer or real-valued for real, or a positive integer for lengths. Duplicate limit operator-value pairs are not permitted. Limits are not checked for sanity (for example, non-overlapping ranges), but this behaviour is expected to change.
noexport
    [...] Fields of type password are never exported by default.
null
    [...] A rowid field may not also be null.
rowid
    [...] int type and may only appear for one field in a given structure.
unique
    [...] rowid.

A field declaration may consist of any number of typeinfo statements.
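The way several limit statements combine can be sketched as below. This is not the validation code that ort generates: the function name is hypothetical, and measuring text length as UTF-8 bytes is an assumption of the sketch.

```python
import operator

# The five limit operators named in the text.
_OPS = {
    "ge": operator.ge,
    "le": operator.le,
    "gt": operator.gt,
    "lt": operator.lt,
    "eq": operator.eq,
}

# Types whose limits apply to the length in bytes rather than the value.
_LENGTH_TYPES = {"text", "email", "password", "blob"}


def passes_limits(ftype, value, limits):
    """Return True if value satisfies every (limit_op, limit_val) pair."""
    if ftype in _LENGTH_TYPES:
        # Assumption: strings are measured as UTF-8 encoded bytes.
        raw = value.encode("utf-8") if isinstance(value, str) else value
        measured = len(raw)
    else:
        measured = value
    return all(_OPS[op](measured, val) for op, val in limits)
```

With the pairs ("gt", 0) and ("lt", 128) on a text field, an empty string and a 128-byte string both fail, since every limit must validate.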
A typical set of fields for a web application user in a database may consist of the following. In this example, the email is unique, name must be of non-zero length, cookie is an internal value never exported (using the default keyword implies this was added later in development, such that old records have a value of zero while new records are non-zero), and id is the unique identifier. The user references a parent by its id. If the parent is deleted, the reference is nullified.
struct user {
  field parentid:user.id int null actdel nullify
    comment "Parent or null if there is no parent.";
  field name text limit gt 0 limit lt 128
    comment "User's full name.";
  field cookie int noexport default 0 limit lt 0
    comment "A secret cookie (if zero, added after secret cookie functionality).";
  field password password;
  field email email unique
    comment "User's unique e-mail address.";
  field ctime epoch
    comment "When the user was added to the database.";
  field id int rowid noexport
    comment "Internal unique identifier.";
};
A comment is a string literal describing almost any component. Comments are part of the document structure and are usually passed to output formatters to describe a component. For example, a structure may be described as follows:
struct foo { field name text; comment "A foo widget."; };
There's currently no structure imposed on comments: they are interpreted as opaque text and passed into the frontend. The only exception is that CRLF are normalised as LF, so sequences of \r\n are converted to simply \n.
Components may only have a single comment statement. An empty comment is still considered to be a valid comment.
Query data with the search keyword to return an individual row (i.e., on a unique column or with a limit of one), count for the number of returned rows, list for retrieving multiple results in an array, or iterate for iterating over each result as it's returned.
Queries usually specify fields and may be followed by parameters:
"struct" name "{" [ query [term ["," term]*]? [":" [parms]* ]? ";" ]* "};"
The term consists of the possibly-nested field names to search for and an optional operator. (Queries of type search require at least one field.) Nested fields are in dotted-notation:
[structure "."]*field [operator]?
This would produce functions searching the field "field" within the struct structures as listed. The following operators may be used:

and, or
    [...] bit, bits, and int types.
eq, neq, streq, strneq
    [...] The eq operator is the default. The streq and strneq variants operate the same except on passwords, where they compare directly to the password hash instead of the password value.
lt, gt
le, ge
like
    [...] text and email fields.
isnull, notnull

Comparisons against a database NULL value always fail. If NULL is passed as a pointer value, comparison always fails as well.
The password field does not accept any operators but isnull, notnull, eq, neq, streq, and strneq. If the query is a count, it further does not accept eq or neq.
The search parameters are a series of key-value pairs. In each of these, terms are all in dotted-notation and may represent nested columns.
comment string_literal
distinct ["." | term]
    [...] null sub-structures. It is also not possible to test eq or neq for password types in these queries. Use grouprow for individual columns: the distinct keyword works for an entire row.
grouprow field ["." field]*
    [...] maxrow or minrow. It may not be a null column, or a password or struct type.
limit limitval ["," offsetval]?
    [...] search singleton result statement as a way to limit non-unique results to a single result. If followed by a comma, the next term is used to offset the query. This is usually used to page through results.
maxrow | minrow field ["." field]*
    [...] grouprow, identify how rows are collapsed with either the maximum or minimum value, respectively, of the given column in the set of grouped rows. This calculation is lexicographic for strings or blobs, and numeric for numbers. The column may not be the same as the grouping column. It also may not be a null column, or a struct or password type.
name searchname
order term [type]? ["," term [type]?]*
    [...] asc for ascending (the default) and desc for descending. Result ordering is applied from left-to-right.

If you're searching (in any way) on a password field, the field is omitted from the initial search, then hash-verified after being extracted from the database. Thus, this doesn't have the same performance as a normal search.
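The two-step behaviour for password searches can be pictured roughly as follows. Everything in this sketch is a stand-in: PBKDF2 is used only because it ships with Python and is not claimed to be the hash ort uses, the row layout (email, salt, hash) is hypothetical, and a list of dictionaries takes the place of the database.

```python
import hashlib
import hmac


def hash_password(password: str, salt: bytes) -> bytes:
    """Stand-in password hash (PBKDF2-HMAC-SHA256), for illustration only."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 10_000)


def search_creds(rows, email, password):
    """Search on email first, then verify the password hash row by row."""
    # Step 1: the initial search omits the password term entirely.
    candidates = [row for row in rows if row["email"] == email]
    # Step 2: each extracted row is hash-verified after the fact.
    for row in candidates:
        attempt = hash_password(password, row["salt"])
        if hmac.compare_digest(attempt, row["hash"]):
            return row
    return None
```

The extra hashing per candidate row is what makes such a query slower than one that the database can answer alone.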
The following are simple web application queries:
struct user {
  field email email unique;
  field password password;
  field mtime epoch null comment "Null if not logged in.";
  field id int rowid;
  search email, password: name creds;
  iterate mtime notnull: name recent order mtime desc limit 20
    comment "Last 20 logins.";
};
The advanced grouping is appropriate when selecting as follows. It assumes a user structure such as the one defined in the above example.
struct perm {
  field userid:user.id;
  field ctime epoch;
  iterate: grouprow userid maxrow ctime name newest
    comment "Newest permission for each user.";
};
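What grouprow with maxrow or minrow selects can be sketched over in-memory rows. This is an illustration of the collapsing rule only, not of the SQL that ort emits; the function name is hypothetical, and requiring exactly one of maxrow or minrow is an assumption of the sketch.

```python
def group_rows(rows, grouprow, maxrow=None, minrow=None):
    """Collapse rows sharing a grouprow value, keeping one row per group."""
    if (maxrow is None) == (minrow is None):
        raise ValueError("give exactly one of maxrow or minrow")
    column = maxrow if maxrow is not None else minrow
    if column == grouprow:
        raise ValueError("the column may not be the grouping column")
    keep_larger = maxrow is not None
    kept = {}
    for row in rows:
        key = row[grouprow]
        if key not in kept:
            kept[key] = row
            continue
        current = kept[key][column]
        # Numbers compare numerically, strings lexicographically.
        if (row[column] > current) if keep_larger else (row[column] < current):
            kept[key] = row
    return list(kept.values())
```

Applied to permission rows keyed by userid, grouping on userid with maxrow ctime keeps the newest permission for each user.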
Limit role access with the roles keyword as follows:
"struct" name "{"
    [ "roles" role ["," role]* "{" roletype [name]? "};" ]*
  "};"
The role is a list of roles as defined in the top-level block, or one of the reserved roles but for none, which can never be assigned. The roletype may be one of the following types:

all
delete name
insert
iterate name
list name
noexport [name]
search name
update name

To refer to an operation, use its name. The only way to refer to un-named operations is to use all, which refers to all operations except noexport.
roles {
  role loggedin;
};

struct user {
  field secret int;
  field id int rowid;
  insert;
  search id: name ident;
  roles all { search ident; };
  roles default { noexport secret; };
  roles loggedin { insert; };
};
The example permits logged-in operators to insert new rows, and both the default and logged-in roles to search for them. However, the secret variable is only exported to logged-in users.
If, during run-time, the current role is not a subtype (inclusive) of the given role for an operation, the application is immediately terminated.
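The inclusive subtype test can be sketched with a parent map. The role tree below is hypothetical, the reserved roles are not modelled, and the sketch returns a boolean where a generated application would terminate instead.

```python
# Hypothetical role tree: each role maps to its parent (None at top level).
PARENTS = {"user": None, "admin": "user", "guest": None}


def is_subrole(current, required, parents=PARENTS):
    """Return True if current is required or one of its descendants."""
    role = current
    while role is not None:
        if role == required:
            return True
        role = parents.get(role)
    return False


def check_access(current, allowed_roles, parents=PARENTS):
    """Return True if current may run an operation granted to allowed_roles."""
    return any(is_subrole(current, role, parents) for role in allowed_roles)
```

Here admin passes a check for user because it descends from it, while user does not pass a check for admin.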
Data modifiers. These begin with the update, delete, or insert keyword. By default, there are no update, delete, or insert operations defined. The syntax is as follows:
"struct" name "{"
    [ "update" [mflds]* [":" [cflds]* [":" [parms]* ]? ]? ";" ]*
    [ "delete" [cflds]* [":" [parms]* ]? ";" ]*
    [ "insert" ";" ]?
  "};"
Both mflds and cflds are sequences of comma-separated non-meta fields in the current structure followed by operators. The former refers to the fields that will be modified; the latter refers to fields that act as constraints on which data is modified.
The delete statement does not accept fields to modify. If update does not have fields to modify, all fields will be modified using the default modifier. Lastly, insert accepts no fields at all: all fields (except for row identifiers) are included in the insert operations.
Fields have the following operators:
mflds :== mfld [modify_operator]?
cflds :== cfld [constraint_operator]?

The fields in mflds accept an optional modifier operation:

concat
dec
inc
set, strset
    [...] strset sets to the raw value instead of hashing beforehand.

The fields in cflds accept an optional operator type as described in Queries. Fields of type password are limited to the streq and strneq operators.
The parms are an optional series of key-value pairs consisting of the following:
"comment" string_literal
"name" name
The name sets a unique name for the generated function, while comment is used for the API comments.
While individual fields may be marked unique on a per-column basis, multiple-column unique constraints may be specified with the unique structure-level keyword. The syntax is as follows:
"unique" field ["," field]+ ";"
Each field must be in the local structure, and must be non-meta types. There must be at least two fields in the statement. There can be only one unique statement per combination of fields (in any order).
For example, consider a request for something involving two parties, where the pair requesting must be unique.
struct request {
  field userid:user.id int;
  field ownerid:user.id int;
  unique userid, ownerid;
};
This stipulates that adding the same pair will result in a constraint failure.
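The rule of one unique statement per combination of fields, in any order, can be checked roughly as below. The function name is hypothetical, and the sketch covers only the field-count and combination rules, not whether the fields exist or are non-meta.

```python
def check_unique_statements(statements):
    """Raise ValueError on a unique statement that the rules disallow."""
    seen = set()
    for fields in statements:
        # Identifiers are case insensitive, so compare in lowercase.
        key = frozenset(name.lower() for name in fields)
        if len(key) < 2:
            raise ValueError("unique needs at least two fields")
        # A frozenset ignores order, so (a, b) and (b, a) collide.
        if key in seen:
            raise ValueError("duplicate unique statement: " + ", ".join(sorted(key)))
        seen.add(key)
```

Under this sketch, "unique userid, ownerid" followed by "unique ownerid, userid" is rejected as the same combination.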
To provide stronger typing for data, ort provides enumerations and bit-field types. These are used only for validating data input.
Enumerations constrain an int field type to a specific set of constant values. They are defined as follows:
"enum" enumname "{"
    [ "comment" string_literal ";" ]?
    [ "item" name [value]? [parms]* ";" ]+
    [ "isnull" label ";" ]?
  "};"
For example,
enum enumname {
  item "val1" 1 jslabel "Value one";
  isnull jslabel "Not given";
};
The enumeration name must be unique among all enumerations, bitfields, and structures.
Items define enumeration item names, their constant values (if set), and documentation. Each item's name must be unique within an enumeration. The value is the named constant's value expressed as an integer. It must also be unique within the enumeration object. It may not be the maximum or minimum 32-bit signed integer. If not specified, it is assigned as one more than the maximum of the assigned values or zero, whichever is larger. Automatic assignment is linear and in the order specified in the configuration. Assigned values may also not be the maximum or minimum 32-bit signed integer. Parameters may be any of the following:
"comment" string_literal
label
The item's comment is used to document the field, while its label (see Labels) is used only for formatting output. The isnull label is used for labelling fields evaluating to null.
The above enumeration would be used in an example field definition as follows:
field foo enum enumname;
This would constrain validation routines to only recognise values defined for the enumeration.
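The automatic assignment of item values described above can be read as follows. This sketch takes one interpretation, namely that all explicitly given values are collected first and the remaining items are then numbered in configuration order; the function name is hypothetical, and the 32-bit boundary checks are left out.

```python
def assign_enum_values(items):
    """Fill in missing values for (name, value-or-None) pairs, in order."""
    explicit = [value for _, value in items if value is not None]
    # One more than the largest explicit value, or zero, whichever is larger.
    next_value = max(max(explicit) + 1, 0) if explicit else 0
    assigned = []
    for name, value in items:
        if value is None:
            value = next_value
            next_value += 1
        assigned.append((name, value))
    return assigned
```

With one item fixed at 3 and two later items left unspecified, the latter receive 4 and 5 under this reading.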
Like enumerations, bitfields constrain an int field type to a bit-wise mask of constant values. They are defined as follows:
"bits" bitsname "{"
    [ "comment" string_literal ";" ]?
    [ "item" name bitidx [parms]* ";" ]+
    [ "isunset" label ";" ]?
    [ "isnull" label ";" ]?
  "};"
For example,
bits bitsname {
  item "bit1" 0 jslabel "Bit one";
  isunset jslabel "No bits";
  isnull jslabel "Not given";
};
The name must be unique among all enumerations, structures, and other bitfields. The term "bitfield" may be used instead of bits, for example,
bitfield bitsname { item "bit1" 0; };
Items define individual bits, their values, and documentation. Each item's name must be unique within a bitfield. The value is the named constant's bit index from zero, so a value of zero refers to the first bit, one to the second bit, and so on. It must fall within 0–63 inclusive. Each must be unique within the bitfield. Parameters may be any of the following:
"comment" string_literal
label
The item's comment is used to document the field, while its label (see Labels) is used only for formatting output.
The above bitfield would be used in an example field definition as follows:
field foo bits bitsname;
The bitfield's comment is passed into the output media, the isunset statement serves to provide a label (see Labels) for when no bits are set (i.e., the field evaluates to zero), and isnull is the same except for when no data is given, i.e., the field is null.
Labels specify how bits and enum types and their items may be described by a front-end formatter such as ort-javascript(1). That is, while the string value of a struct item describes itself, an enum maps to a numeric value that needs to be translated into a meaningful format. Labels export string representations of the internal numeric value to the front-end formatters.
The syntax is as follows:
"jslabel" ["." lang]? quoted_string
The lang token is usually an ISO 639-1 code, but may be any identifier. It is case insensitive. If the lang is not specified, the label is considered to be the default label.
If a label is not specified for a given language, it inherits the default label. If the default label is not provided, it is an empty string. There is no restriction to labels except that they are non-empty.
Only one label may be specified per language, or one default label, per component.
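The fallback between language labels and the default label can be sketched as a lookup. The function name and the use of None as the key for the default label are conventions of this sketch, not of ort.

```python
def label_for(labels, lang=None):
    """Look up a label by language, falling back to the default label."""
    # Language tokens are case insensitive; None stands for the default.
    table = {(key.lower() if key else None): text for key, text in labels.items()}
    wanted = lang.lower() if lang else None
    if wanted in table:
        return table[wanted]
    # No label for this language: inherit the default, or the empty string.
    return table.get(None, "")
```

A request for a language without its own label returns the default label, and returns an empty string when no default was given either.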
Full role-based access control is available in ort when a top-level roles block is defined.
"roles" "{"
    [ "role" name [parms] ["{" "role" name... ";" "}"]* ";" ]*
  "};"
This nested structure defines the role tree. Roles descending from other roles are called sub-roles. Role names are case insensitive and must be unique.
By defining roles, even if left empty, the system will switch into default-deny access control mode, and each function in Structures must be associated with one or more roles to be used.
There are three reserved roles: default, none, and all. These may not be specified in the roles statement. The first may be used for the initial state of the system (before a role has been explicitly assigned), the second refers to the empty role that can do nothing, and the third contains all explicitly-defined roles.
Each role may be associated with parameters limited to:
"role" name ["comment" quoted_string]?
The comment field is only produced for role documentation.
A trivial example is as follows:
struct user {
  field name text;
  field id int rowid;
  comment "A regular user.";
};

struct session {
  field user struct userid;
  field userid:user.id comment "Associated user.";
  field token int comment "Random cookie.";
  field ctime epoch comment "Creation time.";
  field id int rowid;
  comment "Authenticated session.";
};
This generates two C structures, user and session, consisting of the given fields. The session structure contains a struct user as well; thus, there is a declarative order that ort(1) enforces when writing out structures.
The SQL interface, when fetching a struct session, will employ an INNER JOIN over the user identifier and session userid field.
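The kind of join meant here can be tried out with an in-memory SQLite database. The schema and query below are hand-written assumptions that loosely follow the example; they are not the SQL that ort generates, and the function name is hypothetical.

```python
import sqlite3

# Hypothetical schema loosely following the example; not ort's generated SQL.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE session (
        id INTEGER PRIMARY KEY,
        userid INTEGER NOT NULL REFERENCES user(id),
        token INTEGER NOT NULL,
        ctime INTEGER NOT NULL
    );
    INSERT INTO user (id, name) VALUES (1, 'alice');
    INSERT INTO session (id, userid, token, ctime) VALUES (7, 1, 1234, 0);
""")


def fetch_session(conn, session_id):
    """Fetch a session together with its user in a single query."""
    return conn.execute(
        "SELECT session.id, session.token, user.id, user.name "
        "FROM session INNER JOIN user ON user.id = session.userid "
        "WHERE session.id = ?",
        (session_id,),
    ).fetchone()
```

Fetching session 7 returns the session columns followed by the joined user columns in one row, which is how a nested struct user could be filled without a second query.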
October 25, 2021 | OpenBSD 6.7 |