Data Model Schema
Preamble
A data model describes what a valid stored document will look like.
- A data model is required for all GenboreeKB collections.
- A GenboreeKB can have multiple document collections, and each will have its own model.
- The model defines key things like:
  - What are the valid property names? (case sensitive!)
  - What is the domain or "type" information for the value of a property?
  - Does a given property have any sub-properties? If so, define them.
  - Or does a given property have a homogeneous list/array/set of sub-properties instead?
  - Do you need to index the property to enable faster searching?
- For a given property, the data model defines things like:
  - Is the property required to be present?
  - Is the property the record identifier? (exactly 1 property must be tagged as the record identifier)
  - Should the value be unique? (no other documents in the collection shall have the same value for this property)
  - Is the property's value fixed/static? (i.e. can't be altered; good for fixed "categories" and "headings")
The Property Definition
Each property will have a definition that describes valid use of the property in a document. A model thus contains a set of property definitions.
Summary of Key Fields in a Property Definition
Illustrated below are some of the key fields that can be present in a property definition.
- The second column indicates the DEFAULT value for the field if you don't provide it. Often, the default is appropriate!

{
  // ---- Commonly used fields ----
  "name"        : NO_DEFAULT, // [Required] Contains the name of the property.
  "domain"      : "string",   // Keyword indicating domain/type-information for the property's value.
  "identifier"  : false,      // Is this property the "document identifier"? Exactly 1 property must have "identifier"==true.
  "required"    : false,      // Is this property required in a valid document, or can it be left out?
  // ---- Slightly more advanced fields ----
  "category"    : false,      // Are you using this property more as a category or header, for organizing documents nicely?
  "fixed"       : false,      // Is the value fixed/static? i.e. defined in the model? (Often used with categories)
  "default"     : NO_DEFAULT, // If there are any default values to be used for that property, define them here.
  "unique"      : false,      // If the property value should be unique in the entire collection, specify it here.
  "description" : "",         // Provide a useful description about the property. This description will be shown as a tool tip in the UI.
  "index"       : false,      // Would you like to index this property, so searches can be faster?
  // ---- Mutually exclusive fields regarding sub-properties [if any] ---
  "properties"  : null,       // If the property has its own sub-properties, define them here.
  "items"       : null        // If the property has an open list of 0+ sub-properties, define them here.
                              // Property definitions in "items" must be SINGLY-ROOTED, but of course can be nested.
}
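The defaults in the summary can be applied mechanically. As a minimal sketch (this is not GenboreeKB's own code; `PROPERTY_DEFAULTS`, `with_defaults`, and the property name `geneSymbol` are hypothetical names used only for illustration), the following Python fills in the documented default for every field that a property definition leaves out:

```python
# Defaults copied from the summary of key fields. "name" and "default" have
# no default value (NO_DEFAULT), so they are deliberately absent here.
PROPERTY_DEFAULTS = {
    "domain": "string",
    "identifier": False,
    "required": False,
    "category": False,
    "fixed": False,
    "unique": False,
    "description": "",
    "index": False,
    "properties": None,
    "items": None,
}


def with_defaults(prop_def):
    """Return a copy of prop_def with every omitted field set to its default."""
    if "name" not in prop_def:
        raise ValueError('a property definition must provide "name"')
    filled = dict(PROPERTY_DEFAULTS)
    filled.update(prop_def)  # explicitly provided fields win over defaults
    return filled


filled = with_defaults({"name": "geneSymbol", "index": True})
print(filled["domain"], filled["index"], filled["required"])
```

Only `"name"` and `"index"` are given in the usage line, so every other field falls back to the value shown in the second column of the summary.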
Detailed Descriptions of Model-Related Property Metadata
Key Fields
(core/key fields you should be aware of and use/consider often)
"name"
- [String] The property name. Can have spaces, special chars (if escaped where needed). Case sensitive.
  - Predefined in the data model (!!)
  - New properties/attributes cannot be provided at data entry/submission time. All properties must be in the model.
"domain"
- [Keyword string] The domain or type-information for the property's value.
  - Must be one of the known domain specifier Strings (see Domain Specifier Keywords).
  - You often want to provide this. If not provided, the default domain "string" will be used.
"default"
- [(various)] The default value, if any.
  - Must be either a value from the value domain or the null value (not the text "null") for no default.
  - Documents won't be accepted as valid if they fail to provide values for properties with null defaults. A null default just means no suggested value is provided.
"identifier"
- [Boolean] Is the property the document identifier?
  - ALL MODELS MUST HAVE EXACTLY 1 doc identifier PROPERTY i.e. the unique name by which you refer to the document.
  - Implies "unique"=true. There is no need to provide "unique" for the property; providing "unique"=false for an identifier property is an error.
  - Implies "required"=true. There is no need to provide "required" for the property; providing "required"=false for an identifier property is an error.
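The two "implies" rules for `"identifier"` can be checked on a single property definition. The sketch below is illustrative only: `identifier_errors` and the property name `docId` are hypothetical and not part of GenboreeKB, and a definition is assumed to be a plain Python dict.

```python
def identifier_errors(prop_def):
    """List the ways prop_def contradicts what "identifier": true implies."""
    errors = []
    if prop_def.get("identifier", False):
        # "unique" and "required" may be omitted on an identifier property,
        # but explicitly setting either one to false is an error.
        for field in ("unique", "required"):
            if prop_def.get(field, True) is False:
                errors.append(f'"{field}" must not be false on an identifier property')
    return errors


print(identifier_errors({"name": "docId", "identifier": True}))
print(identifier_errors({"name": "docId", "identifier": True, "unique": False}))
```

The first call reports no problems because both fields are simply left out; the second reports the contradictory `"unique"` setting.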
"required"
- [Boolean] Whether the property is required to be present.
  - i.e. must be filled in or provided when data is submitted.
  - Note that you can have required sub-properties under a non-required property. Just means that "IF this property is provided, it MUST ALSO have certain sub-properties".
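The "required sub-property under a non-required property" rule is easiest to see in a small check. This is a sketch under stated assumptions: the `present` argument is a hypothetical stand-in for a document (a dict mapping each property that is present to a dict of its sub-properties), not GenboreeKB's actual stored document format, and `missing_required`, `publication`, `pmid`, and `notes` are made-up names.

```python
def missing_required(prop_defs, present, path=""):
    """Return the paths of required properties that are absent.

    prop_defs: list of property definitions (as in a "properties" array).
    present:   hypothetical document stand-in, mapping each present
               property name to a dict of its present sub-properties.
    """
    missing = []
    for prop_def in prop_defs:
        name = prop_def["name"]
        full_path = f"{path}.{name}" if path else name
        if name not in present:
            # Absent: only a problem if this property itself is required.
            # Its sub-properties are not checked at all.
            if prop_def.get("required", False):
                missing.append(full_path)
            continue
        # Present: its required sub-properties must now be present too.
        sub_defs = prop_def.get("properties") or []
        missing.extend(missing_required(sub_defs, present[name], full_path))
    return missing


model = [
    {
        "name": "publication",  # not required
        "properties": [
            {"name": "pmid", "required": True},
            {"name": "notes"},
        ],
    }
]

print(missing_required(model, {}))                              # → []
print(missing_required(model, {"publication": {"notes": {}}}))  # → ['publication.pmid']
```

Leaving `publication` out entirely is fine, but once it is provided, its required `pmid` sub-property must be there as well.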
"properties"
- [Array] An Array of sub-property definitions for this property, if any.
  - i.e. the properties-of-this-property. This is where you define the content-related property metadata for this property.
  - Mutually-exclusive [currently] with "items".
"items"
- [Array] For properties which have a list of subordinate properties, this is an Array of sub-property definitions for the items stored in the list.
  - All items (i.e. properties; i.e. ~sub-documents) in the items list MUST BE SINGLY ROOTED PROPERTY DEFINITIONS. i.e. there is one top level property, and it may have any number of sub-properties as usual (or none).
  - The list is homogeneous, and here is where you define the kind of "sub-document" or kind of "property" the list contains.
  - Mutually-exclusive [currently] with "properties".
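A rough sketch of the two structural rules for `"properties"` and `"items"` follows. It assumes one particular reading of "singly rooted", namely that the `"items"` array holds exactly one top-level definition which may nest further sub-properties; that reading, the function name `sub_property_errors`, and the example names `authors`, `author`, and `affiliation` are assumptions for illustration, not something the model specification confirms.

```python
def sub_property_errors(prop_def):
    """Check the "properties"/"items" rules on one property definition."""
    errors = []
    properties = prop_def.get("properties")
    items = prop_def.get("items")
    if properties is not None and items is not None:
        errors.append('"properties" and "items" are mutually exclusive')
    # Assumed reading of "singly rooted": exactly one root definition
    # inside "items"; that root may have its own sub-properties.
    if items is not None and len(items) != 1:
        errors.append('"items" must contain exactly one root property definition')
    return errors


authors = {
    "name": "authors",
    "items": [
        {
            "name": "author",  # the single root of each list item
            "unique": True,    # acts as the item "identifier" in the list
            "properties": [{"name": "affiliation"}],
        }
    ],
}

print(sub_property_errors(authors))  # → []
```

The root item definition uses `"properties"` for its own nesting, which is allowed because the mutual exclusion applies per property definition.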
Ancillary Fields
(more specialized fields supporting specific cases or providing more advanced information)
"description"
- [String] Description of the property.
  - For documentation, best practices, informative models, reminders to self (8 months later), communication with others, etc.
  - Strongly recommended best-practice.
  - But completely optional.
  - Note: may be used for tooltips/popups in UIs.
"unique"
- [Boolean] Whether the value for this property is unique.
  - For "unique" properties not in an "items" list, no other document in the collection can have the same value for this property.
  - For "unique" properties which are within an "items" list, the scope is restricted to the list; i.e. no other sub-document in the list can have the same value for the property. Very useful for properties which act as item "identifiers" in the list!
  - Search and get-by-unique property functionality of UIs, APIs, etc. can leverage unique fields (otherwise, saving and using formal Queries will be needed, with much more overhead).
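Whichever scope applies to a `"unique"` property (the whole collection, or a single `"items"` list), the check itself has the same shape: gather the property's values within that scope and look for repeats. A minimal sketch, with `duplicated_values` and the sample values as hypothetical names:

```python
from collections import Counter


def duplicated_values(values):
    """Return the values that occur more than once, in first-seen order."""
    counts = Counter(values)
    return [value for value, count in counts.items() if count > 1]


# Values of one "unique" property, gathered from the items of a single list.
print(duplicated_values(["exon1", "exon2", "exon1"]))  # → ['exon1']
```

For a property outside any `"items"` list, the same function would be fed that property's value from every document in the collection instead.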
"category"
- [Boolean] Whether the property is the special case of a category.
  - Category properties are a specific case that is expected to be handled differently than normal in other representations.
    - Mundane example: Category property names & values may be rendered in bold when presented in UIs and as HTML, etc.
    - Information example: Category properties are actually tag/categorization attributes for the properties immediately underneath the category. The "tag" analogy is good here. Some data exports (e.g. RDF) may not keep the category as a hierarchical "header" but rather as subordinate statements about the property(ies) within the category.
  - Most commonly used together with "fixed"=true, but not strictly required to be.
"fixed"
- [Boolean] Is the property's value fixed/static/unmodifiable? i.e. defined and fixed by the model?
  - Most commonly used together with "category", but not strictly required to be.
"index"
- [Boolean] If appropriate, should an index be built on this property to speed up common (!!) searches?
  - This is actually a "hint" for the infrastructure and may or may not result in an actual index in the underlying storage engine.
  - Best impact will be when used together with "unique"=true. Not needed when "identifier"=true, because a property definition with that field (and associated value) is indexed by default.
  - Don't over-index! If you do, storage space will be consumed very quickly and insert-times will become very very long. Judiciousness is best.
A Data Model Is a Singly Rooted Nested Collection of Property Definitions
We've seen property definitions on their own above. A document will likely have many properties & sub-properties, all needing definitions.
- The full data model document will contain all of these various property definitions.
- Each document model has a single root property, which is the document identifier.
Illustrative Example/Template
// DOCUMENT MODEL
// * At the top level, it's an array/list of the top-level properties
[
  // Definition of the top-level document identifier property
  {
    "name" : "nameOfIdentifierProperty",
    "domain" : "string",
    "identifier" : true,
    "properties" : [
      { "name" : "subPropName1 - an optional String" },
      { "name" : "subPropName2 - an optional Boolean", "domain" : "boolean" }
      // ...
    ]
  }
]
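To make the "singly rooted, nested" shape concrete, the template can be written as a Python list of dicts and walked recursively. This is only a sketch: `property_paths` and `root_errors` are hypothetical helpers, and the "/" path separator is an arbitrary choice for display, not a GenboreeKB convention.

```python
def property_paths(prop_defs, parent=""):
    """Yield the path of every property definition, depth first."""
    for prop_def in prop_defs:
        path = f"{parent}/{prop_def['name']}" if parent else prop_def["name"]
        yield path
        # A definition has either "properties" or "items" (or neither).
        children = prop_def.get("properties") or prop_def.get("items") or []
        yield from property_paths(children, path)


def root_errors(model):
    """Check the single-root rule for a model given as a top-level list."""
    if len(model) != 1:
        return ["a model must have exactly one root property"]
    if not model[0].get("identifier", False):
        return ['the root property must have "identifier": true']
    return []


# The illustrative template, as Python data.
model = [
    {
        "name": "nameOfIdentifierProperty",
        "domain": "string",
        "identifier": True,
        "properties": [
            {"name": "subPropName1 - an optional String"},
            {"name": "subPropName2 - an optional Boolean", "domain": "boolean"},
        ],
    }
]

print(root_errors(model))
for path in property_paths(model):
    print(path)
```

The walk visits the identifier property first and then each sub-property beneath it, which mirrors how every definition in a model hangs off the single root.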