the mantikhor datamodel

Mantikhor (whose name, an extract from semantikhoron, echoes the mythical monster called the manticore) is a practical datamodel created to address certain kinds of operations and problem spaces. In particular, it's intended for two cases:

when it's necessary to express semantics pervasively, and perhaps idiosyncratically;
when a wide variety of recordtypes, very changeable (or labile), possibly unknown or unanticipated at design time, is expected.
note on the term "labile"
The term "labile" (which, um, is not to be confused with the admittedly typo-equivalent "liable") is particularly good for this purpose: it connotes a lack of predictability, a lack of stability, and the likelihood that the onset of change could be very rapid. It also suggests the possibility of unexpected combinations of characteristics in the newly emergent state.

Although many other situations may suggest the use of Mantikhor, these two — pervasive semantics and labile recordtypes — are the conceptual and purposive focus of Mantikhor's design.

An instance of a Mantikhor space is called a khoron. That term is used analogously to the way "database" means an instance of an RDBMS space, or "directory" means an instance of an LDAP space.

current state of Mantikhor development

Mantikhor is still at a very early stage. The specification itself is not complete, and much of what has been written down has not yet been published here. Much of the content has not been sufficiently tested, vetted, and reviewed, and should be regarded as provisional. Further, a lot of the explanations can be much improved, and will be. The first reference implementations will not be ready for several months.

So, the caveat for the time being is that the level of content churn — in both the substantive and the expressive aspects — will remain high for some time to come. The DRAFT marker in the left margin serves as an additional reminder of all of this.

Mantikhor is a runtime recordtype model

It's a pretty good generalization to describe Mantikhor as a runtime recordtype model:

Mantikhor is factored for optimal management of records.
note on datatypes vs recordtypes
The distinction between the datatype and recordtype aspects of type definition tends to be obscured or ignored in most discussions. It's important in Mantikhor, though.
- A datatype is a definition that is based primarily on the nature of the information expressed in instances of the type.
- A recordtype is a definition whose instances are intended to fulfill some kind of purpose.
- There's not really a bright line separating the two: most typedefs have some of each. Hence the use of the word "aspects": datatyping and recordtyping are aspects of type definitions in Mantikhor. In many cases, one aspect or the other will be negligible; but in other cases, it'll be nothing but shades of gray as far as the eye can see.
Mantikhor's optimization for recordtypes is largely implemented through making the structure and purpose of a record easily accessible and analyzable. In general, records are handled as associative composites of well-understood subparts, with pervasive semantic attachments.
Mantikhor is built to delegate a lot of reasoning and analysis to runtime, rather than design time, processes.
design time
Definitions of design time can vary depending on the context and scope of the definition. In Mantikhor discussions, we introduce an ad-hoc usage:

Design time refers to any point in the process of software creation (requirements definition, architecture, design, construction, or revision) at which meaning, purpose, or behavior is defined and/or injected into the software.

Design time does not include operation of the software itself.

Design time is always dominated by the actions and decisions of human beings — analysts and programmers primarily.

The problem with design-time specification of behavior is simply that it requires a change in the software to effect a change in that behavior.

This is not exactly an original insight: most enterprise-level software architectures take advantage of "frameworks" whose capabilities include extrinsic specifications of behavior, specifications which are consulted at runtime. These extrinsic specifications are often central to the design intent of the framework. Usually, the system is still dependent on specific behavior built into the code at design time. Still, the activity of the system at large is, at least to some degree, controlled by declarative configuration or command files.

The benefits of displacing high-level operational directives outside the body of code are numerous. The most significant is probably the happy circumstance that changing the behavior of the software application can be done without breaking anything.
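As a minimal sketch of the idea, the following Python shows behavior selected by a declarative specification that is consulted at runtime. The JSON layout, the handler names, and `run_pipeline` are all hypothetical illustrations; none of them come from Mantikhor or from any particular framework.

```python
import json

# Behaviors built into the code at design time (hypothetical examples).
HANDLERS = {
    "upper": str.upper,
    "reverse": lambda text: text[::-1],
    "strip": str.strip,
}


def run_pipeline(config_text: str, value: str) -> str:
    """Apply the steps named in an external, declarative configuration."""
    config = json.loads(config_text)
    for step in config["steps"]:
        value = HANDLERS[step](value)  # the config is consulted at runtime
    return value


# Two external specifications drive the same, unmodified code differently.
config_a = '{"steps": ["strip", "upper"]}'
config_b = '{"steps": ["strip", "reverse"]}'

result_a = run_pipeline(config_a, "  khoron ")
result_b = run_pipeline(config_b, "  khoron ")
```

Here the set of available handlers is still fixed at design time, but the choice and order of steps lives outside the code, so it can change without a rewrite-compile-redeploy cycle.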

external config files can't break the software

An extreme example of the "external config files can't break the software" principle is the ordinary Web Browser. It's simply a piece of software whose state and operation are defined by an external config file called a Web Page.

If you write an exceptionally stupid Web Page (warning: seriously, don't click that link if you are subject to motion sickness or attacks of psychomotor epilepsy), you can get exceptionally stupid behavior; but you don't need to modify the browser code, recompile, and reinstall the browser to achieve that apotheosis of stupidity.

Nor, of course, will you break the browser: viewing the Stupid Page will not alter the browser's ability to subsequently display non-stupid pages. (Succumbing to malware embedded in some visited page is not part of the browser's intended functionality, so it's not considered when making this assertion.)

Beyond that: even if changes don't break anything, the cost in resources and time is significant for every trip through the rewrite-compile-regression-test-redeploy cycle.

Mantikhor's "runtime recordtype model" paradigm is in large part derived from the extrinsic specification of behavior. It is thus not wholly unprecedented, but it is a very different approach from the design practice of traditional datamodels.

how it's done traditionally

Traditional datamodels are ordinarily Object-Oriented ("OO"), or at least strongly influenced by the OO paradigm. In OO design, runtime data is an attribute of an owning object, and operations on that data are performed by methods provided at the object definition level (the class of the object). This much is familiar conceptual territory. The benefits of object-level encapsulation of data, and of class-level encapsulation of operation, are well understood.

However, this serves neither of our main purposes well:

It obscures the semantics; and
It precludes the runtime handling of labile types.

semantics

Traditional datamodels seldom even identify semantics with much clarity. Semantic characterizations are considered to be intuitively understood attributes of class definitions. Schemes of semantic classification or identification, and any reasoning about how to handle entities according to those schemes, are worked out implicitly. Usually this occurs during software application design time as follows:

1. The basic semantics of the problem space are developed during Business Requirements Analysis.
2. Analysis results are incorporated in an authoritative Requirements Document, describing the problem space characteristics and the requisite system behaviors. Semantics are implicitly embedded in that document; they can be (at least to some extent) understood by a careful reading of the Document's natural language statements, in light of its general context.
3. Developers study the authoritative expression, thereby transferring its contents to mental models in their individual assessments of the problem.
4. As the developers collaborate on design and coding, their individual mental models come into contact, seldom remaining unaltered. Those models, as they interact, converge on a rough consensus.
5. The consensus understanding of the semantics is modified as the developers refine their designs, receiving clarifications from Subject Matter Experts as they go.
6. The software is constructed. The behavior of the software is now, implicitly, the authoritative expression of the semantics of the problem space.
7. As long as the software is under active development, loop back to Step 3.

This is a remarkably indirect process. Lots of software has been successfully developed and maintained using this approach. However, this clearly doesn't provide a solid, low-cost way of working with semantics as such.

labile recordtypes

Traditional OO datamodels and conceptual approaches are no kinder to the handling of labile recordtypes than they are to pervasive semantics. The essence of this problem is twofold:

The encapsulation of composited class types;
The concentration of functionality into methods, defined at the class level.

These two aspects of OO design push the datamodel in the direction of beautifully factored, well-defined characteristics for objects. They support robust, rugged software implementations, and they provide almost unlimited capability to define specific behaviors for classes of objects.

...which are set in concrete until the next release.

The OO paradigm is strongly predisposed to modeling only those objects about which designers have fairly exact prior knowledge.

how Mantikhor is different

Mantikhor's runtime recordtype model design overcomes the traditional model's deficiencies by two design measures:

Explicit representation of semantics as runtime entities, subject to inspection and some degree of manipulation by the operations defined for the model itself;
Object-Disoriented Design ("ODD"). OK, there's considerable silliness in the term. But it's salient and mnemonic, so it's in.

What it means is: Discarding the OO paradigm for Mantikhor representations of data and types. This is a radical and somewhat unnerving step, but it pays off pretty well. Specifically, Mantikhor adopts the following OD design characteristics:

Details:

explicit semantics

Mantikhor treats semantic qualities as discrete items. These items are called endpoints, because they represent a piece of human-significant information. This means that they cannot be "opened up" and analyzed by machine logic -- it takes human thought to work out the internals of semantic concepts.

From Mantikhor's point of view, then, these semantic endpoints are atomic, i.e. indivisible. They exist only to be referenced, and cannot be proved nor disproved by anything Mantikhor can do. They are very much like axioms in formal logic.

Semantic endpoints may be referenced from any non-semantic information item in the datamodel, thereby asserting that the meaning defined in the endpoint applies to the item from which the reference emanates.

The term semantic attachment is used as a verb to describe the act of associating a semantic endpoint (and hence its meaning) with any data item. It is also used as a noun to describe the resulting association.
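A minimal Python sketch of these ideas follows. It assumes a hypothetical `SemanticEndpoint` class identified only by a URI, a hypothetical `DataItem` stand-in for any non-semantic item, and an `attach` helper; the specification text here does not prescribe any of these names or structures.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SemanticEndpoint:
    """An atomic unit of meaning: it can be referenced, never opened up."""
    uri: str  # hypothetical identifier; the only thing machine logic sees


@dataclass
class DataItem:
    """Any non-semantic information item (hypothetical stand-in)."""
    name: str
    attachments: set = field(default_factory=set)


def attach(item: DataItem, endpoint: SemanticEndpoint) -> None:
    """Semantic attachment (verb): assert that the endpoint's meaning applies."""
    item.attachments.add(endpoint)


# The example URI and field names are invented for illustration.
postal_code = SemanticEndpoint("http://example.org/meaning/postal-code")
zip_field = DataItem("zip")
plz_field = DataItem("plz")

attach(zip_field, postal_code)
attach(plz_field, postal_code)

# Machine logic can only compare references; it cannot inspect the meaning.
shared = zip_field.attachments & plz_field.attachments
```

Two differently named items end up referencing the same endpoint, which is all a program can know about their meaning: that it is asserted to be the same.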

Object-Disoriented design

To be clear: nothing in Mantikhor renders it incompatible with Object-Oriented programming. Implementations of Mantikhor are expected to be traditionally written, in OO languages, using traditional OO design practices. However, from Mantikhor's point of view, that's all under the hood. Mantikhor, at the datamodel level, abandons the normal OO design approach — and admittedly forfeits a lot of expressive power by doing so.

Mantikhor's ODD consists of several significant design items:

A unitary or flattened scope model for definitions and instance data. Every item within a khoron is always in scope. No part of an instance of a Mantikhor datamodel is "encapsulated", "hidden", or "out of scope", ever. Access to various data items may be restricted according to policy or business requirements, but this is a very different thing from the programming concept of scope as an enclosing context.
A property-based, as opposed to object-based, type system. In the property-based model, the property type definitions are central. Recordtypes are defined simply by enumerating the property types that a record may have. Since
- the property definitions are always in scope, and
- they are well-understood,
this has the effect of making all record types as transparent, as granularly inspectable, as desired. By analogy, it's like building something out of old-style Lego® Blocks: you know everything you need to know about the few individual sorts of block, and then you can understand how anything, no matter how complex, is assembled.
A set of inherent operations for analysis of represented information. These are mainly comparative operations, which identify and/or describe the kinds of relationships that two items can have to one another. These operations are formally and precisely defined for all kinds of constituent items in the model.

Here OD diverges very widely from OO. Instead of method definitions of functionality at the class level, applicable to objects of other classes only via the mechanisms of inheritance and encapsulation, Mantikhor defines model-wide behaviors. And instead of encapsulated implementation (requiring individual documentation, e.g. Javadocs, for each method defined for an object), Mantikhor does not even acknowledge such type-specific actions.

Of course, this is a severe limitation. It renders Mantikhor wildly unsuitable for many kinds of representations in many different kinds of architectures. However, for the specific needs of runtime recordtype modeling, it's considerably more powerful.
A set of logical, structural, and operational primitives provided as built-ins in Mantikhor.
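The property-based type system and the idea of model-wide comparative operations can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `RecordType` class, the `conforms` check, and the relationship names returned by `compare` are hypothetical, since the actual inherent operations are not defined in this text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordType:
    """A recordtype defined only by the property types a record may have."""
    name: str
    properties: frozenset  # names of permitted property types


def conforms(record: dict, recordtype: RecordType) -> bool:
    """A record conforms if it uses only permitted property types."""
    return set(record) <= recordtype.properties


def compare(a: RecordType, b: RecordType) -> str:
    """One comparison defined once for all recordtypes (hypothetical names)."""
    if a.properties == b.properties:
        return "equivalent"
    if a.properties < b.properties:
        return "narrower"
    if a.properties > b.properties:
        return "broader"
    if a.properties & b.properties:
        return "overlapping"
    return "disjoint"


# Example recordtypes, invented for illustration.
person = RecordType("person", frozenset({"name", "email"}))
employee = RecordType("employee", frozenset({"name", "email", "badge"}))
device = RecordType("device", frozenset({"serial"}))
```

Note that `compare` is not a method of any recordtype: it works on any pair, including recordtypes that did not exist when it was written, because it only inspects the property enumerations that are always in scope.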

arc-node representation

Mantikhor's fundamental representation is graphical. In particular, Mantikhor information is represented as a labeled directed-edge graph of nodes, depicted on a two-dimensional plane. Following accepted custom, we call this an arc-node graph, and we draw it as shapes (nodes) connected by arrows (arcs).

The shapes shown in this example are simple ellipses. The ellipse is only one kind of node representation in the model; Mantikhor defines a number of specific shapes to graphically represent specific kinds of nodes. Mantikhor does not use any graphical conventions to distinguish between arcs: an arrow is an arrow.

Mantikhor does introduce one terminological idiosyncrasy when discussing nodes from the viewpoint of an arc: the node from which an arc issues is called the arc's origin or originating node; and the node to which the arc points is called its target. This does not appear to be accepted nomenclature in any of the mathematical literature (which tends to use tail and head respectively). Origin and target have been adopted because they very clearly represent the relationships.
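To make the origin and target vocabulary concrete, here is a small Python sketch of labeled arcs. The `Arc` class, the node names, and the two lookup helpers are assumptions made for illustration, not part of the Mantikhor specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arc:
    origin: str  # the node the arc issues from
    label: str   # identifies the arc's type
    target: str  # the node the arc points to


# A tiny example graph; node and label names are invented.
arcs = [
    Arc("book", "hasAuthor", "author"),
    Arc("book", "hasTitle", "title"),
    Arc("author", "hasName", "name"),
]


def arcs_from(node: str) -> list:
    """Arcs for which the given node is the origin."""
    return [arc for arc in arcs if arc.origin == node]


def arcs_to(node: str) -> list:
    """Arcs for which the given node is the target."""
    return [arc for arc in arcs if arc.target == node]
```

For example, `arcs_from("book")` returns the two arcs labeled `hasAuthor` and `hasTitle`, while `arcs_to("name")` returns the single arc whose origin is `author`.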

For an overview of the nature and properties of labeled directed-edge graphs, Wikipedia's article on Graph Mathematics is a very good place to start.

Also from Wikipedia, see the very useful glossary of Graph Theory terms, including some discussion of the unsettled nature of graph terminology (due in large part to the comparative newness of Graph Theory as a branch of mathematics).

This is, of course, similar to the W3C's Resource Description Framework, normally known as RDF. The similarity is not accidental: Mantikhor is bound to RDF at a fairly low level, and as a data representation it can be considered an RDF Application. However, Mantikhor and RDF are not, in general, interchangeable idioms: Mantikhor is a more constrained and structured representation, and not as general as RDF; on the other hand, RDF does not recognize Mantikhor's inherent operations. RDF and Mantikhor are best characterized as "highly overlapping" technologies, and the binding between the two exists in that overlapping region of common content.

the khoron

An instance of fully valid (i.e., not fragmentary) Mantikhor data is called a khoron, from the Greek word choron, "space" (which is the root of part of the "Mantikhor" name).

A khoron is reified as a single ur-node emplaced on a two-dimensional plane of infinite extent. The khoron's ur-node is identified by a URI; this identification is taken to extend to the entire khoron, and all content contained therein.

The khoron has two kinds of properties:

Metadata that describe the khoron in terms of its creation, purpose, ownership, modification history, etc;
The namenodes of graphs contained in the khoron.

Khoron metadata is easily intuitively understood. Namenodes, however, require more explanation and a precise definition.

namenodes: linking graphs to the khoron

Data exists in Mantikhor as nodes connected by arcs. The presence of the arcs provides structure: the only structure, in fact, that can be expressed in Mantikhor. The root-level organizing principle imposed on this structure is that the information contained in a khoron is associated into discrete graphs.

Every graph has exactly one namenode. The graph consists of all arcs and nodes (A,N) such that node N is the target of arc A, and there exists a path of arcs and nodes from the namenode that ends with (A,N).

In other words: from the namenode, you can get to any node in the graph by following arcs according to their direction, from node to node.
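The definition above amounts to reachability along arc direction, which can be sketched as a breadth-first traversal in Python. The triple layout, the node names, and `graph_of` are hypothetical. As a simplification, nodes are identified by plain strings here, which glosses over the distinction between equivalent and identical literal nodes.

```python
from collections import deque

# Arcs as (origin, label, target) triples; all names are hypothetical.
ARCS = [
    ("order-1", "customer", "cust-7"),
    ("order-1", "total", "99.50"),
    ("cust-7", "name", "Ada"),
    ("order-2", "total", "12.00"),
]


def graph_of(namenode, arcs):
    """Return the (arc, node) pairs reachable from the namenode."""
    members = []
    seen = {namenode}
    queue = deque([namenode])
    while queue:
        node = queue.popleft()
        for arc in arcs:
            origin, _label, target = arc
            if origin != node:
                continue
            members.append((arc, target))  # target is the node the arc points to
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return members


nodes_in_order_1 = [node for _arc, node in graph_of("order-1", ARCS)]
```

Starting from `order-1`, the traversal collects the customer, the total, and the customer's name, and never touches anything that belongs only to `order-2`.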

basic components

item	shape	label	description	comment
named resource node	ellipse	identifies the node itself	This is the basic resource representation. Resources provide a means of constructing associative property relationships. If the resource is labeled, it is called a named resource.	same as the shapes used in RDF graphs
anonymous resource node	circle	none	Anonymous resources usually serve as collection points for properties. Anonymous resources may have types and semantic attachments.
literal value node	rectangle	value	This is the simplest representation of a literal value (such as a string, number, date etc). Literal nodes are always anonymous: they do not have a URI. A literal node with a value of "5" is considered to be equivalent (but not identical) to the occurrence of a literal node with the value of "5" elsewhere. Mantikhor actually uses a more complex value space representation of literals in instance data. However, this representation is hidden inside a value space, which is significant when converting a Mantikhor graph to RDF and vice versa.
arc	arrow	identifies the arc's type only	An arc emanates from an origin node, and points to a target node.
semantic resource node	rectangle with pointed ends	identifies the node itself	A semantic resource is also known as a semantic endpoint, because it exists at the edge of computational space. It represents a concept, or an item of meaning, that is opaque to algorithmic logic: it is meaningful to human intelligence only. Algorithms can reason about semantic resources, given logical assertions provided by semantic designers and standards bodies. But neither the semantic endpoints, nor the axiomatic assertion of relationships between them, can be "understood" in any meaningful sense by any inherent operations in Mantikhor.
typedef resource node	rectangle with clipped corners	identifies the node itself	Typedefs describe instantiation constraints on graphs or properties in Mantikhor. Their characteristics include: (1) annotations (explanatory data helpful for human beings); (2) precise specifications of type characteristics, namely permitted properties for graph types and value space specifications for literal types; (3) semantic attachments to portions of the definition, or to the definition itself.
namenode of a graph	resource node drawn with double border	identifies the node itself, if present	A namenode is the identifying node of a represented graph of nodes. Each namenode is a direct property of the khoron itself. This is how namenodes are identified, and distinguished from other kinds of nodes. Every graph in the khoron has one, and exactly one, namenode. (References within the graph to other graphs - i.e., the use of a namenode and its associated graph as the value of a property - are required to be terminating values. The foreign namenode therefore appears in a non-namenode role, which avoids contention with the graph's actual namenode.)
property	an arc-node pair, in which the node is the target of the arc		The property is the fundamental aspect of graph structure in Mantikhor. Every graph can be described as: its namenode; the properties of its namenode (i.e., the arc-node pairs emanating from the namenode); and, recursively, the properties of any property value that is itself a resource node (literals cannot have properties, except for semantic attachments). A property's two parts are named: the arc is the property definition, and the target node is the property value.

semantikhoron.org
