ORT(5)

DESCRIPTION

An ort configuration is a human-readable data model format. It defines an application's data types, modifiers (creation, modification, deletion), queries, and access control.

Configurations have one or more structures, zero or more user-defined types (enumerations, bitfields), and zero or more access control roles.

config :== [ enum | bitfield | struct ]+ [ roles ]?
roles :== "roles" "{"
  [ "role" roledata ";" ]+
"};"
struct :== "struct" structname "{"
  [ "comment" string_literal ";" ]?
  [ "count" searchdata ";" ]*
  [ "delete" deletedata ";" ]*
  [ "field" fielddata ";" ]+
  [ "insert" ";" ]?
  [ "iterate" searchdata ";" ]*
  [ "list" searchdata ";" ]*
  [ "roles" roledata ";" ]*
  [ "search" searchdata ";" ]*
  [ "unique" uniquedata ";" ]*
  [ "update" updatedata ";" ]*
"};"
enum :== "enum" enumname "{"
  [ "comment" string_literal ";" ]?
  [ "item" enumdata ";" ]+
  [ "isnull" label ";" ]?
"};"
bitfield :== "bits" bitsname "{"
  [ "comment" string_literal ";" ]?
  [ "item" bitsdata ";" ]+
  [ "isunset" label ";" ]?
  [ "isnull" label ";" ]?
"};"

Structures describe a class of data, such as a user, animal, product, etc. They consist of data and actions on that data: querying, creating, modifying, deleting. The data is specified in fields defining type, validation constraints, relations to other structures' data, and so on.

Data types may be native (integers, strings, references), user-defined (enumerations or bit-fields), or meta (currently only sub-structures). Enumerations define fixed constants used in data field definitions. Bit-fields are similar, except that they describe bits set within a single value.

Roles define access control on data content and operations.

Structures, user-defined types, and roles are collectively called a configuration's "objects".

SYNTAX

In ort, white-space separates tokens: it is discarded except as found within quoted literals. Thus, the following are identical except for the name:

struct foo {
  field id int rowid;
};
struct bar{field id int rowid;};

Except for the content of string literals, an ort configuration only recognises ASCII characters.

Identifiers

Objects are generally named by an identifier. These are always case-insensitive, alphanumeric, non-empty strings beginning with a letter. There are many disallowed or reserved identifiers. There are also unique name constraints to consider (e.g., no two structures can have the same name).

A conforming and non-conforming identifier:

enum foobar { ... };  # ok
enum foo_bar { ... }; # no

Although identifiers may appear in any case, they are internally converted to lowercase (this includes, for example, role names, query names, field names, language labels, etc.).

String literals

Another common syntactic element is the string literal: a double-quoted string where internal double quotes may be escaped by a single preceding backslash.

struct ident { field id int comment "\"Literal\"."; };

Document Comments

Document comments are begun by the hash mark (“#”) and extend to the end of the line.

# This is my structure.
struct ident {
  field id int comment "\"Literal\"."; # End of line.
};

They are always discarded and not considered part of the parsed configuration file.

Numbers

Both decimal and integral numbers are recognised. Integral numbers are signed, limited to 64 bits, and formatted as [-]?[0-9]+. Decimal numbers are similarly formatted as [-]?[0-9]+.[0-9]*. The difference between the two is the presence of the decimal point.
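
As a minimal sketch, the following hypothetical structure (the names sample, weight, and delta are illustrative only) uses both kinds of numbers as limit values. The values on weight contain a decimal point and so are decimal, while those on delta are integral.

struct sample {
  field weight real limit ge 0.5 limit le 100.0;
  field delta int limit ge -10 limit le 10;
  field id int rowid;
};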

Ordering

There are no ordering constraints on objects: all linkage between components (e.g., referenced fields, roles, types, etc.) occurs after parsing the document.
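
For instance, in the following sketch (with illustrative names), the pet structure refers to the owner structure before it is defined. Because linkage happens after parsing, this should be equivalent to declaring the two in the opposite order.

struct pet {
  field ownerid:owner.id int;
  field id int rowid;
};
struct owner {
  field name text;
  field id int rowid;
};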

STRUCTURES

A structure consists of data definitions, operations, and access. It begins with struct, then the unique identifier of the structure, then elements within the curly braces.

"struct" structname "{"
  [ "comment" string_literal ";" ]?
  [ "count" searchdata ";" ]*
  [ "delete" deletedata ";" ]*
  [ "field" fielddata ";" ]+
  [ "insert" ";" ]?
  [ "iterate" searchdata ";" ]*
  [ "list" searchdata ";" ]*
  [ "roles" roledata ";" ]*
  [ "search" searchdata ";" ]*
  [ "unique" uniquedata ";" ]*
  [ "update" updatedata ";" ]*
"};"

The elements may consist of one or more field statements describing data fields; optionally a comment describing the structure; zero or more update, delete, or insert statements that define data modification; zero or more unique statements that create unique constraints on multiple fields; zero or more count, list, iterate, or search statements for querying data; and zero or more roles statements enumerating role-based access control.

Fields

Column definitions. Each field consists of the field keyword followed by an identifier and, optionally, a type with additional information.

"field" name[":" target]? [type [typeinfo]*]? ";"

The name may either be a standalone identifier or a "foreign key" referencing a field in another structure by the structure and field name. In this case, the referenced field must be a rowid or unique and have the same type.

The type, if specified, may be one of the following.

bit: Integer constrained to 64-bit bit index (that is, 0–64). The bit indices start from 1 in order to represent a zero value (no bits to set). Non-zero values must be merged into a bit-field by setting 1LLU << (value - 1) (using C notation) prior to storage. For entire bitfields, see bits.
bitfield name: Alias for bits.
bits name: Integer constrained to the given name bitfield's bits. As with bit, non-zero values must be merged into a bit-field by setting 1LLU << (value - 1) (using C notation) prior to storage.
blob: A fixed-size binary buffer.
email: Text constrained to e-mail address format.
enum name: Integer constrained to valid enumeration values of name.
int: A 64-bit signed integer.
password: Text. This field is special in that it converts an input password into a hash before insertion into the database. It also can properly search for password hashes by running the hash verification after extraction. Thus, there is a difference between a password field that is being inserted or updated (as a password, which is hashed) and extracted using a search (as a hash).
real: A double-precision float.
epoch: Integer constrained to valid time_t values and similarly represented in the C API. The date alias is also available, which is the same but using a date (ISO 8601) sequence input validator.
struct field: A substructure referenced by the field target struct. This meta type is not represented by real data: it only structures the output code. In the C API, for example, this is represented by a struct name of the referent structure. The field may be marked with null, but this involves a not-inconsiderable performance hit when querying (directly or indirectly) on the structure. Sub-structures may not be recursive: a field may not reference another struct that eventually references the origin.
text: Text, usually encoded in ASCII or UTF-8.

The typeinfo provides further information (or operations) regarding the field, and may consist of the following:

actdel action: Like actup but on deletion of the field in the database.
actup action: SQL actions taken when the field is updated. May be one of none (do nothing), restrict (disallow the reference from deleting if referrers exist), nullify (set referrers to null), cascade (propagate new value to referrers), or default (set referrers to their default values). This is only available on foreign key references. The default value may only be used if the field is marked null or has a default value. The nullify value may only be used if the field is marked null.
comment string_literal: Documents the field using the quoted string.
default integer|decimal|date|string_literal|enum: Set a default value for the column that's used only when adding columns to the SQL schema via ort-sqldiff(1). It's only valid for numeric, date, enumeration, or string literal (email, text) field types. Dates must be in yyyy-mm-dd format. Defaults are not currently checked against type limits (i.e., e-mail form or string length).
limit limit_op limit_val: Used when generating validation functions. Not available for enum, bits, or struct. If there are multiple statements, all of them must validate. The limit_op argument consists of an operator the limit_val is checked against. Available operators are ge, le, gt, lt, and eq. Respectively, these mean the field should be greater than or equal to, less than or equal to, greater than, less than, or equal to the given value. If the field type is text, email, password, or blob, this refers to the string (or binary) length in bytes. For numeric types, it's the value itself. The given value must match the field type: an integer (which may be signed) for integers, integer or real-valued for real, or a positive integer for lengths. Duplicate limit operator-value pairs are not permitted. Limits are not checked for sanity (for example, non-overlapping ranges), but this behaviour is expected to change.
noexport: Never exported using the JSON interface. This is useful for sensitive internal information. Fields with type password are never exported by default.
null: Accepts null SQL values. A rowid field may not also be null.
rowid: The field is an SQL primary key. This is only available for the int type and may only appear for one field in a given structure.
unique: Has a unique SQL column value. It's redundant (but harmless) to specify this alongside rowid.

A field declaration may consist of any number of typeinfo statements.

A typical set of fields for a web application user in a database may consist of the following. In this example, the email is unique, name must be of non-zero length, cookie is an internal value never exported (using the default keyword implies this was added later in development, such that old records have a value of zero while new records are non-zero), and id is the unique identifier. The user references a parent by its id. If the parent is deleted, the reference is nullified.

struct user {
  field parentid:user.id int null actdel nullify
    comment "Parent or null if there is no parent.";
  field name text limit gt 0 limit lt 128
    comment "User's full name.";
  field cookie int noexport default 0 limit lt 0
    comment "A secret cookie (if zero, added
             after secret cookie functionality).";
  field password password;
  field email email unique
    comment "User's unique e-mail address.";
  field ctime epoch
    comment "When the user was added to the database.";
  field id int rowid noexport
    comment "Internal unique identifier.";
};
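
As a further sketch of default and limit (all names are illustrative, and the accepted default forms are assumed from the descriptions above), the following declares a date with a yyyy-mm-dd default, a text field with a string literal default and a length window, and a real constrained to a closed range.

struct event {
  field day date default 2020-01-01
    comment "Default applies when ort-sqldiff(1) adds the column.";
  field title text default "Untitled" limit gt 0 limit le 64
    comment "Between 1 and 64 bytes long.";
  field score real default 0.0 limit ge 0.0 limit le 1.0
    comment "Both limits must validate.";
  field id int rowid;
};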

Comments

A comment is a string literal describing almost any component. Comments are part of the document structure and are usually passed to output formatters to describe a component. For example, a structure may be described as follows:

struct foo {
  field name text;
  comment "A foo widget.";
};

There's currently no structure imposed on comments: they are interpreted as opaque text and passed into the frontend. The only exception is that CRLF are normalised as LF, so sequences of \r\n are converted to simply \n.

Components may only have a single comment statement. An empty comment is still considered to be a valid comment.

Queries

Query data with the search keyword to return an individual row (i.e., on a unique column or with a limit of one), count for the number of returned rows, list for retrieving multiple results in an array, or iterate for iterating over each result as it's returned.

Queries usually specify fields and may be followed by parameters:

"struct" name "{"
  [ query [term ["," term]*]? [":" [parms]* ]? ";" ]*
"};"

The term consists of the possibly-nested field names to search for and an optional operator. (Queries of type search require at least one field.) Nested fields are in dotted-notation:

[structure "."]*field [operator]?

This would produce functions searching the field "field" within the struct structures as listed. The following operators may be used:

and, or: Bitwise AND (&) and bitwise OR (|), respectively. Only available for bit, bits, and int types.
eq, neq, streq, strneq: Equality or non-equality binary operator. The eq operator is the default. The streq and strneq variants operate the same except on passwords, where they compare directly to the password hash instead of the password value.
lt, gt: Less than or greater than binary operators. For text, the comparison is lexical; otherwise, it is by value.
le, ge: Less than/equality or greater than/equality binary operators. For text, the comparison is lexical; otherwise, it is by value.
like: The LIKE SQL operator. This only applies to text and email fields.
isnull, notnull: Unary operator to check whether the field is null or not null.

Comparisons against a database NULL value always fail. If NULL is passed as a pointer value, comparison always fails as well.

The password field does not accept any operators but isnull, notnull, eq, neq, streq, and strneq. If the query is a count, it further does not accept eq or neq.

The search parameters are a series of key-value pairs. In each of these, terms are all in dotted-notation and may represent nested columns.

comment string_literal: Documents the query using the quoted string.
distinct [["." | term]]: Return only distinct rows of the sub-structure indicated by term, or if only a period (“.”), the current structure. This does not work with null sub-structures. It is also not possible to test eq or neq for password types in these queries. Use grouprow for individual columns: the distinct keyword works for an entire row.
grouprow field ["." field]*: Groups results by the given column. This collapses all rows with the same value for the given column into a single row with the choice of row being determined by maxrow or minrow. It may not be a null column, or a password or struct type.
limit limitval ["," offsetval]?: A value >0 that limits the number of returned results. By default, there is no limit. This can be used in a search singleton result statement as a way to limit non-unique results to a single result. If followed by a comma, the next term is used to offset the query. This is usually used to page through results.
maxrow | minrow field ["." field]*: When grouping rows with grouprow, identify how rows are collapsed with either the maximum or minimum value, respectively, of the given column in the set of grouped rows. This calculation is lexicographic for strings or blobs, and numeric for numbers. The column may not be the same as the grouping column. It also may not be a null column, or a struct or password type.
name searchname: An identifier used in the C API for the search function. This must be unique within a structure.
order term [type]? ["," term [type]?]*: Result ordering. Each term may be followed by an order direction: asc for ascending (the default) and desc for descending. Result ordering is applied from left-to-right.

If you're searching (in any way) on a password field, the field is omitted from the initial search, then hash-verified after being extracted from the database. Thus, this doesn't have the same performance as a normal search.

The following are simple web application queries:

struct user {
  field email email unique;
  field password password;
  field mtime epoch null
    comment "Null if not logged in.";
  field id int rowid;
  search email, password: name creds;
  iterate mtime notnull: name recent order mtime desc limit 20
    comment "Last 20 logins.";
};

The advanced grouping is appropriate when selecting as follows. It assumes a user structure such as that defined in the above example.

struct perm {
  field userid:user.id;
  field ctime epoch;
  iterate: grouprow userid maxrow ctime name newest
    comment "Newest permission for each user.";
};
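
The operators and parameters above may be combined as in the following sketch, where the structure, field, and query names are illustrative. The list query pages through name matches using limit with an offset, ordering by name and then by newest first; the count query applies the and operator to an integer field.

struct account {
  field name text;
  field flags int;
  field ctime epoch;
  field id int rowid;
  list name like: name byname order name asc, ctime desc limit 50, 100
    comment "Up to 50 matches, skipping the first 100.";
  count flags and: name withflags
    comment "Number of rows matching the flags operand.";
  search id: name byid;
};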

Roles

Limit role access with the roles keyword as follows:

"struct" name "{"
  [ "roles" role ["," role]* "{" roletype [name]? "};" ]*
"};"

The role is a list of roles as defined in the top-level block, or one of the reserved roles except none, which can never be assigned. The roletype may be one of the following:

all: A special type referring to all function types.
delete name: The named delete operation.
insert: The insert operation.
iterate name: The named iterate operation.
list name: The named list operation.
noexport [name]: Do not export the field name via the JSON export routines. If no name is given, don't export any fields.
search name: The named search operation.
update name: The named update operation.

To refer to an operation, use its name. The only way to refer to un-named operations is to use all, which refers to all operations except noexport.

roles { role loggedin; };
struct user {
  field secret int;
  field id int rowid;
  insert;
  search id: name ident;
  roles all { search ident; };
  roles default { noexport secret; };
  roles loggedin { insert; };
};

The example permits logged-in operators to insert new rows, and both the default and logged-in roles to search for them. However, the secret variable is only exported to logged-in users.

If, during run-time, the current role is not a subtype (inclusive) of the given role for an operation, the application is immediately terminated.

Updates

Data modifiers. These begin with the update, delete, or insert keyword. By default, there are no update, delete, or insert operations defined. The syntax is as follows:

"struct" name "{"
  [ "update" [mflds]* [":" [cflds]* [":" [parms]* ]? ]? ";" ]*
  [ "delete" [cflds]* [":" [parms]* ]? ";" ]*
  [ "insert" ";" ]?
"};"

Both mflds and cflds are sequences of comma-separated non-meta fields in the current structure followed by operators. The former refers to the fields that will be modified; the latter refers to fields that constrain which data is modified.

The delete statement does not accept fields to modify. If update does not have fields to modify, all fields will be modified using the default modifier. Lastly, insert accepts no fields at all: all fields (except for row identifiers) are included in the insert operations.

Fields have the following operators:

mflds :== mfld [modify_operator]?
cflds :== cfld [constraint_operator]?

The fields in mflds accept an optional modifier operation:

concat: String concatenate the current field by a given value (x = x || ?).
dec: Decrement the current field by a given value (x = x - ?).
inc: Increment the current field by a given value (x = x + ?).
set, strset: Default behaviour of setting to a value (x = ?). If the field is a password, strset sets to the raw value instead of hashing beforehand.

The fields in cflds accept an optional operator type as described in Queries. Fields of type password are limited to the streq and strneq operators.

The parms are an optional series of key-value pairs consisting of the following:

"comment" string_literal
"name" name

The name sets a unique name for the generated function, while comment is used for the API comments.
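
Since the statements above are given only as syntax, the following sketch shows how they might be written. The structure, field, and operation names are illustrative.

struct user {
  field email email unique;
  field password password;
  field logins int;
  field id int rowid;
  insert;
  # Set the password (hashed beforehand) for the row with the given id.
  update password: id: name setpass comment "Change a password.";
  # Add the given amount to logins where email equals the given value.
  update logins inc: email eq: name addlogins;
  # No fields to modify are listed, so all are set, constrained by id.
  update: id: name setall;
  # Delete the row with the given id.
  delete id: name byid;
};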

Uniques

While individual fields may be marked unique on a per-column basis, multiple-column unique constraints may be specified with the unique structure-level keyword. The syntax is as follows:

"unique" field ["," field]+ ";"

Each field must be in the local structure, and must be non-meta types. There must be at least two fields in the statement. There can be only one unique statement per combination of fields (in any order).

For example, consider a request for something involving two parties, where the pair requesting must be unique.

struct request {
  field userid:user.id int;
  field ownerid:user.id int;
  unique userid, ownerid;
};

This stipulates that adding the same pair will result in a constraint failure.

TYPES

To provide stronger typing for data, ort provides enumerations and bit-field types. These are used only for validating data input.

Enumerations

Enumerations constrain an int field type to a specific set of constant values. They are defined as follows:

"enum" enumname "{"
  [ "comment" string_literal ";" ]?
  [ "item" name [value]? [parms]* ";" ]+
  [ "isnull" label ";" ]?
"};"

For example,

enum enumname {
  item val1 1 jslabel "Value one";
  isnull jslabel "Not given";
};

The enumeration name must be unique among all enumerations, bitfields, and structures.

Items define enumeration item names, their constant values (if set), and documentation. Each item's name must be unique within an enumeration. The value is the named constant's value expressed as an integer. It must also be unique within the enumeration object. It may not be the maximum or minimum 32-bit signed integer. If not specified, it is assigned as one more than the maximum of the assigned values or zero, whichever is larger. Automatic assignment is linear and in the order specified in the configuration. Assigned values may also not be the maximum or minimum 32-bit signed integer. Parameters may be any of the following:

"comment" string_literal
label

The item's comment is used to document the field, while its label (see Labels) is used only for formatting output. The isnull label is used for labelling fields evaluating to null.

The above enumeration would be used in an example field definition as follows:

field foo enum enumname;

This would constrain validation routines to only recognise values defined for the enumeration.
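
To illustrate automatic assignment, consider the following sketch (names are illustrative, and item names are written as bare identifiers). Following the rule above, extra and more would be expected to receive 6 and 7, continuing upward from the largest explicitly assigned value. In an enumeration with no explicit values at all, assignment would be expected to begin at zero.

enum level {
  item low 1;
  item high 5;
  item extra;   # no value given: expected to become 6
  item more;    # no value given: expected to become 7
};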

Bitfields

Like enumerations, bitfields constrain an int field type to a bit-wise mask of constant values. They are defined as follows:

"bits" bitsname "{"
  [ "comment" string_literal ";" ]?
  [ "item" name bitidx [parms]* ";" ]+
  [ "isunset" label ";" ]?
  [ "isnull" label ";" ]?
"};"

For example,

bits bitsname {
  item bit1 0 jslabel "Bit one";
  isunset jslabel "No bits";
  isnull jslabel "Not given";
};

The name must be unique among all enumerations, structures, and other bitfields. The term "bitfield" may be used instead of bits, for example,

bitfield bitsname { item bit1 0; };

Items define individual bits, their values, and documentation. Each item's name must be unique within a bitfield. The value is the named constant's bit index from zero, so a value of zero refers to the first bit, one to the second bit, and so on. It must fall within 0–63 inclusive. Each must be unique within the bitfield. Parameters may be any of the following:

"comment" string_literal
label

The item's comment is used to document the field, while its label (see Labels) is used only for formatting output.

The above bitfield would be used in an example field definition as follows:

field foo bits bitsname;

The bitfield's comment is passed into the output media, the isunset statement serves to provide a label (see Labels) for when no bits are set (i.e., the field evaluates to zero), and isnull is the same except for when no data is given, i.e., the field is null.

Labels

Labels specify how bits and enum types and their items may be described by a front-end formatter such as ort-javascript(1). That is, while the string value of a struct item describes itself, an enum maps to a numeric value that needs to be translated into a meaningful format. Labels export string representations of the internal numeric value to the front-end formatters.

The syntax is as follows:

"jslabel" ["." lang]? quoted_string

The lang token is usually an ISO 639-1 code, but may be any identifier. It is case insensitive. If the lang is not specified, the label is considered to be the default label.

If a label is not specified for a given language, it inherits the default label. If the default label is not provided, it is an empty string. There is no restriction on labels except that they are non-empty.

Only one label may be specified per language, or one default label, per component.
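
As a sketch of the inheritance rules (the enumeration, its items, and the language codes are illustrative, and item names are written as bare identifiers), red has a default label and a French label, so any language other than fr would see "Red". By contrast, green has no default label, so a language other than en or fr would see an empty string.

enum colour {
  item red 1 jslabel "Red" jslabel.fr "Rouge";
  item green 2 jslabel.en "Green" jslabel.fr "Vert";
  isnull jslabel "Unknown";
};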

ROLES

Full role-based access control is available in ort when a top-level roles block is defined.

"roles" "{"
   [ "role" name [parms] ["{" "role" name... ";" "}"]* ";" ]*
"};"

This nested structure defines the role tree. Roles descendent of roles are called sub-roles. Role names are case insensitive and must be unique.

By defining roles, even if left empty, the system will switch into default-deny access control mode, and each function in Structures must be associated with one or more roles to be used.

There are three reserved roles: default, none, and all. These may not be specified in the roles statement. The first may be used for the initial state of the system (before a role has been explicitly assigned), the second refers to the empty role that can do nothing, and the third contains all explicitly-defined roles.

Each role may be associated with parameters limited to:

"role" name ["comment" quoted_string]?

The comment field is only produced for role documentation.
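
A minimal sketch of a role tree follows, using illustrative role names. Here admin is a sub-role of user, while guest stands alone; the comments document each role.

roles {
  role user comment "Any authenticated user." {
    role admin comment "May administer the system.";
  };
  role guest comment "Unauthenticated visitor.";
};

A structure could then grant an operation to user, for example with a roles user { insert; }; statement, which by the subtype rule described in Roles would also cover admin.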

EXAMPLES

A trivial example is as follows:

struct user {
  field name text;
  field id int rowid;
  comment "A regular user.";
};

struct session {
  field user struct userid;
  field userid:user.id comment "Associated user.";
  field token int comment "Random cookie.";
  field ctime epoch comment "Creation time.";
  field id int rowid;
  comment "Authenticated session.";
};

This generates two C structures, user and session, consisting of the given fields. The session structure contains a struct user as well; thus, there is a declarative order that ort(1) enforces when writing out structures.

The SQL interface, when fetching a struct session, will employ an INNER JOIN over the user identifier and session userid field.
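
Queries may reach into the nested structure using dotted notation. As a sketch (the list statement and its name are illustrative additions, not part of the example above), the session structure could look up sessions by the associated user's name:

struct user {
  field name text;
  field id int rowid;
};
struct session {
  field user struct userid;
  field userid:user.id;
  field token int;
  field id int rowid;
  list user.name: name byusername
    comment "Sessions whose user has the given name.";
};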

NAME