LMNL syntax

From LMNLWiki

This document describes the LMNL syntax, one syntax for representing LMNL data models. It is not the only syntax that could be used with LMNL data models, but it is fairly simple and it can represent every possible LMNL data model, which gives it an advantage over XML syntax, for example. See also Alternative Syntaxes.

This page contains a brief guide. For full details, see Detailed LMNL syntax.

Text

LMNL documents can contain any text you like as native Unicode characters. Characters that you don't want to include natively can be written as character references, which are a special kind of atom. For example:

{{#xA0}}

indicates a non-breaking space.

Certain characters are significant in LMNL. These can be escaped with character references, or more mnemonically thus:

character   atom
[           {{#lsqb}}
{           {{#lcub}}

Other characters that are used in LMNL syntax but don't have to be escaped also have standard atoms, as follows:

character   atom
]           {{#rsqb}}
}           {{#rcub}}
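
The escaping rule above can be sketched in Python. This is only an illustration: the function name escape_lmnl_text is hypothetical, and it assumes that the two mnemonic atoms listed for [ and { are all that is needed to make plain text safe to embed.

```python
# Hypothetical helper, not part of any LMNL tool.
# Only '[' and '{' must be escaped; ']' and '}' are left as they are.
_ESCAPES = str.maketrans({"[": "{{#lsqb}}", "{": "{{#lcub}}"})


def escape_lmnl_text(text: str) -> str:
    """Replace '[' and '{' with their mnemonic atoms in a single pass."""
    # str.translate looks at each input character once, so the braces
    # inside an inserted atom are never escaped a second time.
    return text.translate(_ESCAPES)
```

A single-pass translation is used rather than two chained replace calls, because replacing [ first and { second would corrupt the atoms that the first step inserted.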

Ranges

LMNL documents contain ranges, whose start and end are indicated by start and end tags. A range usually looks like:

...[foo}...{foo]...

where [foo} is the start tag and {foo] is the end tag. The name of the range is a qualified name, resolved using the namespaces for the document.

To handle cases where ranges with the same name overlap, you can supply an ID for a range. The ID is a non-colonized name. IDs must be unique within a document: no two start tags can have the same ID, and no two end tags can have the same ID.

...[foo=a}...[foo=b}...{foo=a]...{foo=b]...
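
As a rough sketch of how start and end tags with IDs pair up, the following Python separates the text from the ranges laid over it. It is not a full LMNL parser: the names TAG and extract_ranges are hypothetical, and it assumes the input contains only simple named tags (no annotations, atoms, anonymous ranges, or namespace declarations). Pairing same-named tags without IDs innermost-first is also an assumption made here for illustration.

```python
import re

# Assumed shape of a name or ID: any run of characters that is not
# whitespace, a bracket, a brace, or '='.
NAME = r"[^\s\[\]{}=]+"
TAG = re.compile(
    r"\[(?P<sname>" + NAME + r")(?:=(?P<sid>" + NAME + r"))?\}"
    r"|\{(?P<ename>" + NAME + r")(?:=(?P<eid>" + NAME + r"))?\]"
)


def extract_ranges(source):
    """Return (text, ranges); each range is (name, id, start, end)."""
    text_parts, ranges = [], []
    open_starts = {}  # (name, id) -> stack of start offsets
    seen_ids = {"start": set(), "end": set()}
    offset = pos = 0  # offset into the tag-free text, position in source
    for m in TAG.finditer(source):
        chunk = source[pos:m.start()]
        text_parts.append(chunk)
        offset += len(chunk)
        pos = m.end()
        kind = "start" if m.group("sname") else "end"
        name = m.group("sname") or m.group("ename")
        rid = m.group("sid") if kind == "start" else m.group("eid")
        if rid is not None:
            # No two start tags, and no two end tags, may share an ID.
            if rid in seen_ids[kind]:
                raise ValueError(f"duplicate ID on {kind} tag: {rid}")
            seen_ids[kind].add(rid)
        key = (name, rid)
        if kind == "start":
            open_starts.setdefault(key, []).append(offset)
        elif open_starts.get(key):
            ranges.append((name, rid, open_starts[key].pop(), offset))
        else:
            raise ValueError(f"end tag without matching start tag: {m.group(0)}")
    text_parts.append(source[pos:])
    if any(open_starts.values()):
        raise ValueError("start tag without matching end tag")
    return "".join(text_parts), ranges
```

Applied to x[foo=a}ab[foo=b}cd{foo=a]ef{foo=b]y, this yields the text xabcdefy with two overlapping foo ranges, one covering offsets 1 to 5 and the other 3 to 7.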

You can have anonymous ranges, which don't have a name. These will usually have annotations on them.

Annotations

Ranges can have annotations, which live in the start and end tags of the range. The start and end of an annotation are indicated by start and end tags, just as for a range. Here, for example, the [foo] range has a [bar] annotation.

[foo [bar}...{bar]}...{foo]

You can use the shorthand {] for the end of an annotation. The above is equivalent to

[foo [bar}...{]}...{foo]

Annotations can have annotations themselves; these live in the start and end tags of the annotation. For example, here the [bar] annotation on the [foo] range has a [baz] annotation.

[foo [bar [baz}...{]}...{]}...{foo]

Annotations can also contain ranges. Here, the [bar] annotation on the [foo] range contains a [baz] range.

[foo [bar}...[baz}...{baz]...{]}...{foo]

You can have anonymous annotations. Annotations cannot overlap.

Atoms

The content of a document may contain media other than characters, such as images, audio, video and so on. These are called atoms and have the syntax

{{foo}}

Atoms do not have any content, but can have annotations. For example:

{{img [src}example.gif{]}}

You can have anonymous atoms.

An atom of the form {{#xhexnum}} is equivalent to a character whose Unicode codepoint is hexnum.
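
The equivalence between character-reference atoms and characters can be sketched as a small Python function. The names CHAR_REF and expand_char_refs are hypothetical, and the sketch assumes that hexnum consists only of hexadecimal digits.

```python
import re

# Matches atoms of the form {{#xHEX}}, for example {{#xA0}}.
CHAR_REF = re.compile(r"\{\{#x([0-9A-Fa-f]+)\}\}")


def expand_char_refs(text: str) -> str:
    """Replace each {{#xHEX}} atom with the character at that codepoint."""
    # chr() raises ValueError if the number is not a valid Unicode codepoint.
    return CHAR_REF.sub(lambda m: chr(int(m.group(1), 16)), text)
```

Mnemonic atoms such as {{#lsqb}} are not hexadecimal references, so this function leaves them untouched.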

Namespaces

All LMNL names may contain a single colon separating the prefix from the localpart. If there is no such colon, the prefix is said to be empty. Names must not contain colons otherwise.

Prefixes do not appear in the data model, where names are represented as a pair of (namespace name, local part), where namespace name is a string with the syntax of an IRI. (This may turn into XRIs when and if these are defined properly; XRIs allow arbitrary unescaped ASCII characters, whereas IRIs only allow what URIs allow plus all non-ASCII characters.)

The mapping between prefixes and namespace names is arranged using namespace declarations, which can appear essentially anywhere and take one of these forms:

  [!ns prefix="IRI"]
  [!ns "IRI"]

to map a specific prefix and the empty prefix, respectively, to the specified IRI. Documents must not map the same prefix to more than one namespace name, nor the same namespace name to more than one prefix.
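
To illustrate the two declaration forms, here is a minimal Python sketch that collects the declarations found in a document. The names NS_DECL and find_ns_declarations are hypothetical. The sketch assumes that declarations are written exactly as shown above, with no whitespace around the = sign and no double quote inside the IRI; see Detailed LMNL syntax for the actual rules.

```python
import re

# [!ns prefix="IRI"] or [!ns "IRI"]; the prefix group is optional.
NS_DECL = re.compile(r'\[!ns\s+(?:([^\s="\]]+)=)?"([^"]*)"\]')


def find_ns_declarations(source: str):
    """Return a list of (prefix, iri) pairs; the empty prefix is ''."""
    return [(m.group(1) or "", m.group(2)) for m in NS_DECL.finditer(source)]
```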

This paragraph is still controversial (see the talk page):

Prefixes may be used without a prior definition, in which case the empty prefix is mapped to the namespace name http://lmnl.net/prefixes/ and every other prefix p is mapped to a namespace name of the form http://lmnl.net/prefixes/p. As a result, LMNL does not have the XML concept of names that are not in a namespace. Mapping a prefix to some other namespace after using it without a definition is equivalent to mapping it twice, and therefore not allowed. Mapping these namespace names to any but the corresponding prefixes is not allowed.

The purpose of the restrictions on mappings is to permit names within a single document to be compared for equality by looking only at the prefix and the localpart.

Note that the prefix lmnl is reserved to the Ad Hoc LMNL Committee. It must not be mapped to any namespace name except http://lmnl.net/prefixes/lmnl. Document authors who use names in this namespace other than ones defined by the LMNL cabal do so at their own risk!
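
The mapping rules in this section can be sketched as a small Python class. This is an illustration under stated assumptions rather than a reference implementation: the class NamespaceMap and its methods are hypothetical, the default mapping follows the paragraph marked as controversial above, and any IRI beginning with http://lmnl.net/prefixes/ is treated as one of the reserved default namespace names.

```python
DEFAULT_BASE = "http://lmnl.net/prefixes/"


class NamespaceMap:
    """Hypothetical prefix-to-namespace mapping for one document."""

    def __init__(self):
        self._iri_for = {}     # prefix -> namespace name
        self._prefix_for = {}  # namespace name -> prefix

    def declare(self, prefix, iri):
        # Default namespace names may only be mapped to their own prefix
        # (assumption: anything under DEFAULT_BASE counts as one of them).
        if iri.startswith(DEFAULT_BASE) and iri != DEFAULT_BASE + prefix:
            raise ValueError(f"{iri} is reserved for another prefix")
        # The lmnl prefix is reserved.
        if prefix == "lmnl" and iri != DEFAULT_BASE + "lmnl":
            raise ValueError("the lmnl prefix cannot be remapped")
        # One prefix per namespace name, and one namespace name per prefix.
        if self._iri_for.get(prefix, iri) != iri:
            raise ValueError(f"prefix {prefix!r} is already mapped")
        if self._prefix_for.get(iri, prefix) != prefix:
            raise ValueError(f"{iri} is already mapped to another prefix")
        self._iri_for[prefix] = iri
        self._prefix_for[iri] = prefix

    def resolve(self, name):
        """Return (namespace name, local part) for a possibly prefixed name."""
        prefix, _, local = name.rpartition(":")
        if ":" in prefix:
            raise ValueError(f"more than one colon in name: {name}")
        if prefix not in self._iri_for:
            # Using a prefix without a declaration fixes its default mapping.
            self.declare(prefix, DEFAULT_BASE + prefix)
        return (self._iri_for[prefix], local)
```

Because an undeclared prefix is recorded on first use, a later attempt to declare it with a different namespace name fails in the same way as mapping it twice.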