NAME
ort - syntax for ort configuration
DESCRIPTION
An ort configuration is a human-readable data model format. It defines an application's data types, modifiers (creation, modification, deletion), queries, and access control.
Configurations have one or more structures, zero or more user-defined types (enumerations, bitfields), and zero or more access control roles.
    config :== [ enum | bitfield | struct ]+ [ roles ]?
    roles :== "roles" "{"
      [ "role" roledata ";" ]+
    "};"
    struct :== "struct" structname "{"
      [ "comment" string_literal ";" ]?
      [ "count" searchdata ";" ]*
      [ "delete" deletedata ";" ]*
      [ "field" fielddata ";" ]+
      [ "insert" ";" ]?
      [ "iterate" searchdata ";" ]*
      [ "list" searchdata ";" ]*
      [ "roles" roledata ";" ]*
      [ "search" searchdata ";" ]*
      [ "unique" uniquedata ";" ]*
      [ "update" updatedata ";" ]*
    "};"
    enum :== "enum" enumname "{"
      [ "comment" string_literal ";" ]?
      [ "item" enumdata ";" ]+
      [ "isnull" label ";" ]?
    "};"
    bitfield :== "bits" bitsname "{"
      [ "comment" string_literal ";" ]?
      [ "item" bitsdata ";" ]+
      [ "isunset" label ";" ]?
      [ "isnull" label ";" ]?
    "};"
Structures describe a class of data, such as a user, animal, product, etc. They consist of data and actions on that data: querying, creating, modifying, deleting. The data is specified in fields defining type, validation constraints, relations to other structures' data, and so on.
Data types may be native (integers, strings, references), user-defined (enumerations or bit-fields), or meta (currently only sub-structures). Enumerations define fixed constants used in data field definitions. Bit-fields are similar, except that they describe bits set within a single value.
Roles define access control on data content and operations.
Structures, user-defined types, and roles are collectively called a configuration's "objects".
SYNTAX
In ort, white-space separates tokens: it is discarded except as found within quoted literals. Thus, the following are identical except for the name:
    struct foo {
      field id int rowid;
    };
    struct bar{field id int rowid;};
Except for the content of string literals, an ort configuration only recognises ASCII characters.
Identifiers
Objects are generally named by an identifier. An identifier is always a case-insensitive, alphanumeric, non-empty string beginning with a letter. There are many disallowed or reserved identifiers. There are also unique name constraints to consider (e.g., no two structures can have the same name).
A conforming and non-conforming identifier:
    enum foobar { ... };   # ok
    enum foo_bar { ... };  # no
Although identifiers may appear in any case, they are internally converted to lowercase (this includes, for example, role names, query names, field names, language labels, etc.).
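The identifier rules above can be sketched as a small check. This is only an illustration, not ort's parser: the function name normalise_identifier is hypothetical, "alphanumeric" is assumed to mean ASCII letters and digits, and the reserved-identifier and uniqueness checks are not modelled.

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits, since a
# configuration only recognises ASCII outside of string literals.
_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def normalise_identifier(token):
    """Return the lowercased identifier, or None if the token does not conform.

    Reserved words and name-uniqueness constraints are deliberately not checked.
    """
    if _IDENTIFIER.fullmatch(token) is None:
        return None
    return token.lower()
```

Under these assumptions, "FooBar" and "foobar" name the same object, while "foo_bar" is rejected because of the underscore.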
String literals
Another common syntactic element is the string literal: a double-quoted string where internal double quotes may be escaped by a single preceding backslash.
    struct ident {
      field id int comment "\"Literal\".";
    };
Document Comments
Document comments are begun by the hash mark (“#”) and extend to the end of the line.
    # This is my structure.
    struct ident {
      field id int comment "\"Literal\"."; # End of line.
    };
They are always discarded and not considered part of the parsed configuration file.
Numbers
Both decimal and integral numbers are recognised. Integral numbers are signed and limited to 64 bits and formatted as [-]?[0-9]+. Decimal numbers are similarly formatted as [-]?[0-9]+.[0-9]*. The difference between the two is the existence of the decimal point.
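As a rough sketch of the two formats, the following classifies a token as integral or decimal. It assumes the "." in the decimal pattern is a literal period and that out-of-range integers are rejected; classify_number is a hypothetical name, not part of ort.

```python
import re

_INTEGRAL = re.compile(r"-?[0-9]+")
# Assumption: the "." in the documented pattern is a literal decimal point.
_DECIMAL = re.compile(r"-?[0-9]+\.[0-9]*")

INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1


def classify_number(token):
    """Return "integral", "decimal", or None for a non-conforming token."""
    if _DECIMAL.fullmatch(token):
        return "decimal"
    if _INTEGRAL.fullmatch(token):
        # Integral numbers are signed and limited to 64 bits.
        if INT64_MIN <= int(token) <= INT64_MAX:
            return "integral"
    return None
```

Note that, read this way, "3." is a decimal (digits after the point are optional) while ".5" is not a number at all, because at least one digit must precede the point.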
Ordering
There are no ordering constraints on objects: all linkage between components (e.g., referenced fields, roles, types, etc.) occurs after parsing the document.
STRUCTURES
A structure consists of data definitions, operations, and access.
It begins with struct, then the unique identifier of the structure, then elements within the curly braces.
    "struct" structname "{"
      [ "comment" string_literal ";" ]?
      [ "count" searchdata ";" ]*
      [ "delete" deletedata ";" ]*
      [ "field" fielddata ";" ]+
      [ "insert" ";" ]?
      [ "iterate" searchdata ";" ]*
      [ "list" searchdata ";" ]*
      [ "roles" roledata ";" ]*
      [ "search" searchdata ";" ]*
      [ "unique" uniquedata ";" ]*
      [ "update" updatedata ";" ]*
    "};"
The elements may consist of one or more field statements describing data fields; optionally a comment describing the structure; zero or more update, delete, or insert statements that define data modification; zero or more unique statements that create unique constraints on multiple fields; zero or more count, list, iterate, or search statements for querying data; and zero or more roles statements enumerating role-based access control.
Fields
Column definitions. Each field consists of the field keyword followed by an identifier and, optionally, a type with additional information.
"field" name[":" target]? [type [typeinfo]*]? ";"
The name may either be a standalone identifier or a "foreign key" referencing a field in another structure by the structure and field name. In this case, the referenced field must be a rowid or unique and have the same type.
The type, if specified, may be one of the following.
bit
- Integer constrained to 64-bit bit index (that is, 0–64). The bit indices start from 1 in order to represent a zero value (no bits to set). Non-zero values must be merged into a bit-field by setting 1LLU << (value - 1) (using C notation) prior to storage. For entire bitfields, see bits.
bitfield name
- Alias for bits.
bits name
- Integer constrained to the given name bitfield's bits. As with bit, non-zero values must be merged into a bit-field by setting 1LLU << (value - 1) (using C notation) prior to storage.
blob
- A fixed-size binary buffer.
email
- Text constrained to e-mail address format.
enum name
- Integer constrained to valid enumeration values of name.
int
- A 64-bit signed integer.
password
- Text. This field is special in that it converts an input password into a hash before insertion into the database. It also can properly search for password hashes by running the hash verification after extraction. Thus, there is a difference between a password field that is being inserted or updated (as a password, which is hashed) and extracted using a search (as a hash).
real
- A double-precision float.
epoch
- Integer constrained to valid time_t values and similarly represented in the C API. The date alias is also available, which is the same but using a date (ISO 8601) sequence input validator.
struct field
- A substructure referenced by the field target struct. This meta type is not represented by real data: it only structures the output code. In the C API, for example, this is represented by a struct name of the referent structure. The field may be marked with null, but this involves a not-inconsiderable performance hit when querying (directly or indirectly) on the structure. Sub-structures may not be recursive: a field may not reference another struct that eventually references the origin.
text
- Text, usually encoded in ASCII or UTF-8.
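The merging step described for the bit type can be sketched as follows. This mirrors the C expression 1LLU << (value - 1) in Python purely for illustration; merge_bit is a hypothetical helper, and the masking to 64 bits stands in for C's fixed-width unsigned arithmetic.

```python
MASK_64 = 0xFFFFFFFFFFFFFFFF


def merge_bit(mask, value):
    """Merge a `bit` value into a 64-bit mask.

    A value of 0 means "no bit to set"; values 1..64 select a bit, with
    1 being the lowest bit, i.e. 1 << (value - 1).
    """
    if not 0 <= value <= 64:
        raise ValueError("bit value must be within 0..64")
    if value == 0:
        return mask
    return (mask | (1 << (value - 1))) & MASK_64
```

For example, merging the values 1 and 3 into an empty mask yields 5 (binary 101).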
The typeinfo provides further information (or operations) regarding the field, and may consist of the following:
actdel action
- Like actup but on deletion of the field in the database.
actup action
- SQL actions taken when the field is updated. May be one of none (do nothing), restrict (disallow the reference from deleting if referrers exist), nullify (set referrers to null), cascade (propagate new value to referrers), or default (set referrers to their default values). This is only available on foreign key references. The default value may only be used if the field is marked null or has a default value. The nullify value may only be used if the field is marked null.
comment string_literal
- Documents the field using the quoted string.
default integer|decimal|date|string_literal|enum
- Set a default value for the column that's used only when adding columns to the SQL schema via ort-sqldiff(1). It's only valid for numeric, date, enumeration, or string literal (email, text) field types. Dates must be in yyyy-mm-dd format. Defaults are not currently checked against type limits (i.e., e-mail form or string length).
limit limit_op limit_val
- Used when generating validation functions. Not available for enum, bits, or struct. If there are multiple statements, all of them must validate. The limit_op argument consists of an operator the limit_val is checked against. Available operators are ge, le, gt, lt, and eq. Respectively, these mean the field should be greater than or equal to, less than or equal to, greater than, less than, or equal to the given value. If the field type is text, email, password, or blob, this refers to the string (or binary) length in bytes. For numeric types, it's the value itself. The given value must match the field type: an integer (which may be signed) for integers, integer or real-valued for real, or a positive integer for lengths. Duplicate limit operator-value pairs are not permitted. Limits are not checked for sanity (for example, non-overlapping ranges), but this behaviour is expected to change.
noexport
- Never exported using the JSON interface. This is useful for sensitive internal information. Fields with type password are never exported by default.
null
- Accepts null SQL values. A rowid field may not also be null.
rowid
- The field is an SQL primary key. This is only available for the int type and may only appear for one field in a given structure.
unique
- Has a unique SQL column value. It's redundant (but harmless) to specify this alongside rowid.
A field declaration may consist of any number of typeinfo statements.
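The limit semantics (every statement must validate; lengths are measured in bytes for textual and binary types) can be sketched like this. It is not the generated validation code: check_limits is a hypothetical name, and text is assumed to be UTF-8 encoded when measuring its length.

```python
import operator

# The five documented limit operators.
_LIMIT_OPS = {
    "ge": operator.ge,
    "le": operator.le,
    "gt": operator.gt,
    "lt": operator.lt,
    "eq": operator.eq,
}


def check_limits(value, limits):
    """Return True only if the value satisfies every (limit_op, limit_val) pair."""
    if isinstance(value, str):
        # Assumption: text length is counted in bytes of its UTF-8 encoding.
        measured = len(value.encode("utf-8"))
    elif isinstance(value, (bytes, bytearray)):
        measured = len(value)
    else:
        measured = value
    return all(_LIMIT_OPS[op](measured, bound) for op, bound in limits)
```

With the pair of limits "limit gt 0 limit lt 128", a three-character name passes while an empty string or a 128-byte string fails.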
A typical set of fields for a web application user in a database may consist of the following. In this example, the email is unique, name must be of non-zero length, cookie is an internal value never exported (using the default keyword implies this was added later in development, such that old records have a value of zero while new records are non-zero), and id is the unique identifier. The user references a parent by its id. If the parent is deleted, the reference is nullified.
    struct user {
      field parentid:user.id int null actdel nullify
        comment "Parent or null if there is no parent.";
      field name text limit gt 0 limit lt 128
        comment "User's full name.";
      field cookie int noexport default 0 limit lt 0
        comment "A secret cookie (if zero, added after secret cookie functionality).";
      field password password;
      field email email unique
        comment "User's unique e-mail address.";
      field ctime epoch
        comment "When the user was added to the database.";
      field id int rowid noexport
        comment "Internal unique identifier.";
    };
Comments
A comment is a string literal describing almost any component. Comments are part of the document structure and are usually passed to output formatters to describe a component. For example, a structure may be described as follows:
    struct foo {
      field name text;
      comment "A foo widget.";
    };
There's currently no structure imposed on comments: they are interpreted as opaque text and passed into the frontend. The only exception is that CRLF are normalised as LF, so sequences of \r\n are converted to simply \n.
Components may only have a single comment statement. An empty comment is still considered to be a valid comment.
Queries
Query data with the search keyword to return an individual row (i.e., on a unique column or with a limit of one), count for the number of returned rows, list for retrieving multiple results in an array, or iterate for iterating over each result as it's returned.
Queries usually specify fields and may be followed by parameters:
    "struct" name "{"
      [ query [term ["," term]*]? [":" [parms]* ]? ";" ]*
    "};"
The term consists of the possibly-nested field names to search for and an optional operator. (Searchers of type search require at least one field.) Nested fields are in dotted-notation:
    [structure "."]*field [operator]?
This would produce functions searching the field "field" within the struct structures as listed. The following operators may be used:
and, or
- Bitwise AND (&) and bitwise OR (|), respectively. Only available for bit, bits, and int types.
eq, neq, streq, strneq
- Equality or non-equality binary operator. The eq operator is the default. The streq and strneq variants operate the same except for on passwords, where they compare directly to the password hash instead of the password value.
lt, gt
- Less than or greater than binary operators. For text, the comparison is lexical; otherwise, it is by value.
le, ge
- Less than/equality or greater than/equality binary operators. For text, the comparison is lexical; otherwise, it is by value.
like
- The LIKE SQL operator. This only applies to text and email fields.
isnull, notnull
- Unary operator to check whether the field is null or not null.
Comparisons against a database NULL value always fail. If NULL is passed as a pointer value, comparison always fails as well.
The password field does not accept any operators but isnull, notnull, eq, neq, streq, and strneq. If the query is a count, it further does not accept eq or neq.
The search parameters are a series of key-value pairs. In each of these, terms are all in dotted-notation and may represent nested columns.
comment string_literal
- Documents the query using the quoted string.
distinct ["." | term]
- Return only distinct rows of the sub-structure indicated by term, or if only a period (“.”), the current structure. This does not work with null sub-structures. It is also not possible to test eq or neq for password types in these queries. Use grouprow for individual columns: the distinct keyword works for an entire row.
grouprow field ["." field]*
- Groups results by the given column. This collapses all rows with the same value for the given column into a single row with the choice of row being determined by maxrow or minrow. It may not be a null column, or a password or struct type.
limit limitval ["," offsetval]?
- A value >0 that limits the number of returned results. By default, there is no limit. This can be used in a search singleton result statement as a way to limit non-unique results to a single result. If followed by a comma, the next term is used to offset the query. This is usually used to page through results.
maxrow | minrow field ["." field]*
- When grouping rows with grouprow, identify how rows are collapsed with either the maximum or minimum value, respectively, of the given column in the set of grouped rows. This calculation is lexicographic for strings or blobs, and numeric for numbers. The column may not be the same as the grouping column. It also may not be a null column, or a struct or password type.
name searchname
- An identifier used in the C API for the search function. This must be unique within a structure.
order term [type]? ["," term [type]?]*
- Result ordering. Each term may be followed by an order direction: asc for ascending (the default) and desc for descending. Result ordering is applied from left-to-right.
If you're searching (in any way) on a password field, the field is omitted from the initial search, then hash-verified after being extracted from the database. Thus, this doesn't have the same performance as a normal search.
The following are simple web application queries:
    struct user {
      field email email unique;
      field password password;
      field mtime epoch null
        comment "Null if not logged in.";
      field id int rowid;
      search email, password: name creds;
      iterate mtime notnull: name recent order mtime desc
        limit 20 comment "Last 20 logins.";
    };
The advanced grouping is appropriate when selecting as follows. It assumes a user structure such as that defined in the above example.
    struct perm {
      field userid:user.id;
      field ctime epoch;
      iterate: grouprow userid maxrow ctime name newest
        comment "Newest permission for each user.";
    };
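To make the grouprow and maxrow (or minrow) semantics concrete, here is an in-memory sketch of the collapsing step. It only models the documented behaviour on plain dictionaries and says nothing about the SQL actually generated; group_rows and the sample rows are hypothetical, and ties keep the first row seen, which is an assumption.

```python
def group_rows(rows, group_col, pick_col, use_max=True):
    """Collapse rows sharing group_col into one row per value.

    The surviving row is the one with the maximum (or minimum) pick_col.
    Assumption: on a tie, the first row encountered is kept.
    """
    chosen = {}
    for row in rows:
        key = row[group_col]
        best = chosen.get(key)
        if best is None:
            chosen[key] = row
        elif use_max and row[pick_col] > best[pick_col]:
            chosen[key] = row
        elif not use_max and row[pick_col] < best[pick_col]:
            chosen[key] = row
    return list(chosen.values())


# Hypothetical perm rows: user 1 has two permissions, user 2 has one.
PERMS = [
    {"userid": 1, "ctime": 100},
    {"userid": 2, "ctime": 50},
    {"userid": 1, "ctime": 300},
]
NEWEST = group_rows(PERMS, "userid", "ctime")
```

Here NEWEST holds one row per user: the ctime 300 row for user 1 and the only row for user 2.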
Roles
Limit role access with the roles keyword as follows:
    "struct" name "{"
      [ "roles" role ["," role]* "{" roletype [name]? "};" ]*
    "};"
The role is a list of roles as defined in the top-level block, or one of the reserved roles but for none, which can never be assigned. The roletype may be one of the following types:
all
- A special type referring to all function types.
delete name
- The named delete operation.
insert
- The insert operation.
iterate name
- The named iterate operation.
list name
- The named list operation.
noexport [name]
- Do not export the field name via the JSON export routines. If no name is given, don't export any fields.
search name
- The named search operation.
update name
- The named update operation.
To refer to an operation, use its name. The only way to refer to un-named operations is to use all, which refers to all operations except noexport.
    roles {
      role loggedin;
    };
    struct user {
      field secret int;
      field id int rowid;
      insert;
      search id: name ident;
      roles all { search id; };
      roles default { noexport secret; };
      roles loggedin { insert; };
    };
The example permits logged-in operators to insert new rows, and both the default and logged-in roles to search for them. However, the secret variable is only exported to logged-in users.
If, during run-time, the current role is not a subtype (inclusive) of the given role for an operation, the application is immediately terminated.
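The inclusive sub-role test can be sketched as a walk up the role tree. This is an illustration only: the parent map, the admin sub-role, and is_permitted are hypothetical, the reserved all role is not modelled, and the sketch returns a boolean where a real application would terminate.

```python
def is_permitted(current, allowed, parents):
    """Return True if current is one of the allowed roles or descends from one.

    parents maps each role name to its parent role name (None at the top).
    """
    role = current
    while role is not None:
        if role in allowed:
            return True
        role = parents.get(role)
    return False


# Hypothetical tree: "admin" is a sub-role of "loggedin".
PARENTS = {"admin": "loggedin", "loggedin": None, "default": None}
INSERT_ROLES = {"loggedin"}  # as in "roles loggedin { insert; };"
```

Under this tree, both loggedin and its sub-role admin may insert, while default may not.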
Updates
Data modifiers. These begin with the update, delete, or insert keyword. By default, there are no update, delete, or insert operations defined. The syntax is as follows:
    "struct" name "{"
      [ "update" [mflds]* [":" [cflds]* [":" [parms]* ]? ]? ";" ]*
      [ "delete" [cflds]* [":" [parms]* ]? ";" ]*
      [ "insert" ";" ]?
    "};"
Both mflds and cflds are sequences of comma-separated non-meta fields in the current structure followed by operators. The former refers to the fields that will be modified; the latter refers to fields that act as constraints selecting which data is modified.
The delete statement does not accept fields to modify. If update does not have fields to modify, all fields will be modified using the default modifier. Lastly, insert accepts no fields at all: all fields (except for row identifiers) are included in the insert operations.
Fields have the following operators:
    mflds :== mfld [modify_operator]?
    cflds :== cfld [constraint_operator]?
The fields in mflds accept an optional modifier operation:
concat
- String concatenate the current field by a given value (x = x || ?).
dec
- Decrement the current field by a given value (x = x - ?).
inc
- Increment the current field by a given value (x = x + ?).
set, strset
- Default behaviour of setting to a value (x = ?). If the field is a password, strset sets to the raw value instead of hashing beforehand.
The fields in cflds accept an optional operator type as described in Queries. Fields of type password are limited to the streq and strneq operators.
The parms are an optional series of key-value pairs consisting of the following:
    "comment" string_literal
    "name" name
The name sets a unique name for the generated function, while comment is used for the API comments.
Uniques
While individual fields may be marked unique on a per-column basis, multiple-column unique constraints may be specified with the unique structure-level keyword. The syntax is as follows:
    "unique" field ["," field]+ ";"
Each field must be in the local structure and must be of a non-meta type. There must be at least two fields in the statement. There can be only one unique statement per combination of fields (in any order).
For example, consider a request for something involving two parties, where the pair requesting must be unique.
    struct request {
      field userid:user.id int;
      field ownerid:user.id int;
      unique userid, ownerid;
    };
This stipulates that adding the same pair will result in a constraint failure.
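The "one statement per combination of fields, in any order" rule amounts to comparing field sets rather than field sequences. A minimal sketch follows, assuming field names have already been validated; find_duplicate_unique is a hypothetical helper, not part of ort.

```python
def find_duplicate_unique(statements):
    """Return the first unique statement repeating an earlier field combination.

    Each statement is a sequence of field names. Order is ignored, and names
    are lowercased because identifiers are case-insensitive.
    """
    seen = set()
    for fields in statements:
        key = frozenset(name.lower() for name in fields)
        if key in seen:
            return tuple(fields)
        seen.add(key)
    return None
```

So "unique userid, ownerid;" followed by "unique ownerid, userid;" would be flagged, while two statements sharing only one field would not.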
TYPES
To provide stronger typing for data, ort provides enumerations and bit-field types. These are used only for validating data input.
Enumerations
Enumerations constrain an int field type to a specific set of constant values. They are defined as follows:
    "enum" enumname "{"
      [ "comment" string_literal ";" ]?
      [ "item" name [value]? [parms]* ";" ]+
      [ "isnull" label ";" ]?
    "};"
For example,
    enum enumname {
      item "val1" 1 jslabel "Value one";
      isnull jslabel "Not given";
    };
The enumeration name must be unique among all enumerations, bitfields, and structures.
Items define enumeration item names, their constant values (if set), and documentation. Each item's name must be unique within an enumeration. The value is the named constant's value expressed as an integer. It must also be unique within the enumeration object. It may not be the maximum or minimum 32-bit signed integer. If not specified, it is assigned as one more than the maximum of the assigned values or zero, whichever is larger. Automatic assignment is linear and in the order specified in the configuration. Assigned values may also not be the maximum or minimum 32-bit signed integer. Parameters may be any of the following:
    "comment" string_literal
    label
The item's comment is used to document the field, while its label (see Labels) is used only for formatting output. The isnull label is used for labelling fields evaluating to null.
The above enumeration would be used in an example field definition as follows:
field foo enum enumname;
This would constrain validation routines to only recognise values defined for the enumeration.
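The automatic assignment of unspecified item values admits more than one reading. The sketch below takes one of them as an assumption: the starting point is the larger of zero and one more than the greatest explicitly given value, and unspecified items then receive consecutive values in configuration order. The name assign_enum_values is hypothetical, and the 32-bit boundary checks are omitted.

```python
def assign_enum_values(items):
    """Assign values to enumeration items lacking one.

    items is a list of (name, value) pairs in configuration order, where
    value is None when not specified. Assumption: numbering starts at
    max(greatest explicit value + 1, 0) and proceeds linearly.
    """
    explicit = [value for _, value in items if value is not None]
    next_value = max(max(explicit) + 1, 0) if explicit else 0
    assigned = {}
    for name, value in items:
        if value is None:
            value = next_value
            next_value += 1
        assigned[name] = value
    return assigned
```

Under this reading, items a, b (explicitly 5), and c receive 6, 5, and 7 respectively, and an enumeration whose only explicit value is negative starts automatic numbering at zero.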
Bitfields
Like enumerations, bitfields constrain an int field type to a bit-wise mask of constant values. They are defined as follows:
    "bits" bitsname "{"
      [ "comment" string_literal ";" ]?
      [ "item" name bitidx [parms]* ";" ]+
      [ "isunset" label ";" ]?
      [ "isnull" label ";" ]?
    "};"
For example,
    bits bitsname {
      item "bit1" 0 jslabel "Bit one";
      isunset jslabel "No bits";
      isnull jslabel "Not given";
    };
The name must be unique among all enumerations, structures, and other bitfields. The term "bitfield" may be used instead of bits, for example,
    bitfield bitsname {
      item "bit1" 0;
    };
Items define individual bits, their values, and documentation. Each item's name must be unique within a bitfield. The value is the named constant's bit index from zero, so a value of zero refers to the first bit, one to the second bit, and so on. It must fall within 0–63 inclusive. Each must be unique within the bitfield. Parameters may be any of the following:
    "comment" string_literal
    label
The item's comment is used to document the field, while its label (see Labels) is used only for formatting output.
The above bitfield would be used in an example field definition as follows:
field foo bits bitsname;
The bitfield's comment is passed into the output media, the isunset statement serves to provide a label (see Labels) for when no bits are set (i.e., the field evaluates to zero), and isnull is the same except for when no data is given, i.e., the field is null.
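How a front-end might turn a stored bitfield value into labels can be sketched as below. This is not the output of any ort tool: describe_bits and the sample item table are hypothetical, and the sketch only follows the documented rules (zero-based item indices, isunset for a zero value, isnull for a null value).

```python
def describe_bits(value, items, isunset="", isnull=""):
    """Return the labels describing a stored bitfield value.

    items maps each item's zero-based bit index (0..63) to its label.
    None stands for a null field; 0 stands for "no bits set".
    """
    if value is None:
        return [isnull]
    if value == 0:
        return [isunset]
    return [label for index, label in sorted(items.items())
            if value & (1 << index)]


# Hypothetical items: bit index 0 and bit index 3.
ITEMS = {0: "Bit one", 3: "Bit four"}
```

Note the difference from the bit field type, whose values count from one: a bitfield item with index 0 corresponds to the mask value 1.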
Labels
Labels specify how bits and enum types and their items may be described by a front-end formatter such as ort-javascript(1). That is, while the string value of a struct item describes itself, an enum maps to a numeric value that needs to be translated into a meaningful format. Labels export string representations of the internal numeric value to the front-end formatters.
The syntax is as follows:
"jslabel" ["." lang]? quoted_string
The lang token is usually an ISO 639-1 code, but may be any identifier. It is case insensitive. If the lang is not specified, the label is considered to be the default label.
If a label is not specified for a given language, it inherits the default label. If the default label is not provided, it is an empty string. There is no restriction on labels except that they are non-empty.
Only one label may be specified per language, or one default label, per component.
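The label inheritance rule can be sketched as a lookup with two fallbacks. The representation here (a dictionary keyed by lowercased language, with None for the default label) and the name lookup_label are assumptions made for illustration.

```python
def lookup_label(labels, lang=None):
    """Return the label for lang, falling back to the default, then to "".

    labels maps a lowercased language identifier to its text; the key None
    holds the default label. Language names are case insensitive.
    """
    key = lang.lower() if lang is not None else None
    if key in labels:
        return labels[key]
    return labels.get(None, "")


# Hypothetical labels, as from: jslabel "Value one" jslabel.fr "Valeur un"
LABELS = {None: "Value one", "fr": "Valeur un"}
```

A request for "FR" finds the French label, a request for "de" inherits the default, and a component with no labels at all yields the empty string.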
ROLES
Full role-based access control is available in ort when a top-level roles block is defined.
    "roles" "{"
      [ "role" name [parms] ["{" "role" name... ";" "}"]* ";" ]*
    "};"
This nested structure defines the role tree. Roles descending from other roles are called sub-roles. Role names are case insensitive and must be unique.
By defining roles, even if left empty, the system will switch into default-deny access control mode, and each function in Structures must be associated with one or more roles to be used.
There are three reserved roles: default, none, and all. These may not be specified in the roles statement. The first may be used for the initial state of the system (before a role has been explicitly assigned), the second refers to the empty role that can do nothing, and the third contains all explicitly-defined roles.
Each role may be associated with parameters limited to:
"role" name ["comment" quoted_string]?
The comment field is only produced for role documentation.
EXAMPLES
A trivial example is as follows:
    struct user {
      field name text;
      field id int rowid;
      comment "A regular user.";
    };

    struct session {
      field user struct userid;
      field userid:user.id comment "Associated user.";
      field token int comment "Random cookie.";
      field ctime epoch comment "Creation time.";
      field id int rowid;
      comment "Authenticated session.";
    };
This generates two C structures, user and session, consisting of the given fields. The session structure contains a struct user as well; thus, there is a declarative order that ort(1) enforces when writing out structures.
The SQL interface, when fetching a struct session, will employ an INNER JOIN over the user identifier and session userid field.