Midgard2

Generic Content Repository for Web and Desktop applications

Midgard2 is an Open Source Content Repository. It provides an object-oriented and replicated environment for building data-intensive applications for both web and the desktop.

With Midgard2 you have a generic way to define your own storage objects, which can then be queried and managed using multiple programming languages and applications. This enables writing your CMS or project management tool using a repository-oriented architecture where the Midgard2 storage system acts as the central point of integration between various tools.

Midgard2 is built on the GNOME stack of libraries like GLib and libgda, and has language bindings for C, Python, PHP, and other languages via GObject Introspection. Communication between applications written in different languages happens over D-Bus.

12.09 Gjallarhorn is the latest stable version of Midgard2. Please refer to the release notes or the introductory blog post.

Downloads

Source download

Ubuntu

On Ubuntu Lucid (10.04 LTS), run the following command:

$ sudo add-apt-repository "deb http://download.opensuse.org/repositories/home:/midgardproject:/ratatoskr/xUbuntu_10.04/ ./"

Then run apt-get update and apt-get install libmidgard2-2010

To use Midgard2 with GObject Introspection in any language, install the library with apt-get install gir1.0-midgard2.

For PHP, install the extension via apt-get install php5-midgard2.

On Ubuntu Natty (11.04) and onwards Midgard2 is available in the repositories. Simply run:

$ sudo apt-get install libmidgard2

To use Midgard2 with GObject Introspection in any language, install the library with apt-get install gir1.2-midgard2.

For PHP, install the extension via apt-get install php5-midgard2.

Mac OS X

On MacPorts, install Midgard2 with:

$ sudo port install midgard2-core

To use Midgard2 with PHP, you also need to:

$ sudo port install php5-midgard2

On HomeBrew, install Midgard2 with:

$ brew install midgard2

To use Midgard2 with PHP, you also need to:

$ brew install midgard2-php

Debian

Midgard2 is available straight from the Debian repositories. To install it, run:

$ sudo apt-get install libmidgard2

To use Midgard2 with GObject Introspection in any language, install the library with apt-get install gir1.0-midgard2. For PHP, install the extension via apt-get install php5-midgard2.

What is a Content Repository?

A Content Repository is a service that sits between an application and a data store. It provides several advantages:

  • Common rules for data access mean that multiple applications can work with the same content without breaking the consistency of the data
  • Signals about changes let applications know when another application using the repository modifies something, enabling collaborative data management between apps
  • Objects instead of SQL mean that developers can deal with data using APIs more compatible with the rest of their desktop programming environment, and without having to fear issues like SQL injection
  • Data model is scriptable when you use a content repository, meaning that users can easily write Python or PHP scripts to perform batch operations on their data without having to learn your storage format
  • Synchronization and sharing features can be implemented on the content repository level meaning that you gain these features without having to worry about them

The basic idea is that with a Content Repository, developers can use standard interfaces for data storage instead of coming up with their own file formats or database schemas.

Workspaces

Workspaces are a new feature in Midgard2 10.05.5 that adds the capability of organizing your whole content into different containers and accessing it by workspace. This adds a capability to your content repository similar to layers in image editing software like Photoshop.

Workspace tree

Workspaces form a tree hierarchy. This makes it easy to build display rules or workflows into a Midgard2-powered application. Here are some examples of tree structures you may use:

  • /published/draft: content in the published workspace is displayed to everybody. New content is added in the draft workspace and only moved to published when ready
  • /en/fi: For English language users the content from workspace en is shown. Workspace fi extends that and allows translating content targeted at Finnish language audience
  • /public/private/group1: Content in the public workspace is shown to all users. Authorized users also see content from the private workspace, and members of a particular group see the content of the group1 workspace layered on top of the other two

Using workspaces

Workspaces support the whole Midgard2 content model transparently. This means that in order to use workspaces you only need to set the workspace tree you want to work with on the Midgard2 connection, and all your content input and output will obey that.

There are two ways to use workspaces:

  • Workspace context displays content from the selected workspace, and all workspaces above it in the tree
  • Workspace displays content only from a particular workspace
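The difference between the two modes can be sketched with a small in-memory model. This is not the Midgard2 API: the `STORE` dictionary and the helper names are hypothetical, and the sketch assumes that content in the nearest workspace wins over content in the workspaces above it.

```python
# Hypothetical in-memory model of layered workspaces (not the Midgard2 API).
STORE = {
    "/published": {"front_page": "Welcome", "about": "About us"},
    "/published/draft": {"front_page": "Welcome (new draft)"},
}

def ancestors(path):
    """Return the workspace path and every workspace above it, nearest first."""
    parts = path.strip("/").split("/")
    return ["/" + "/".join(parts[:i]) for i in range(len(parts), 0, -1)]

def get_in_workspace(path, key):
    """Workspace mode: look only in the named workspace."""
    return STORE.get(path, {}).get(key)

def get_in_context(path, key):
    """Workspace context mode: fall back to the workspaces above in the tree."""
    for workspace in ancestors(path):
        if key in STORE.get(workspace, {}):
            return STORE[workspace][key]
    return None

print(get_in_context("/published/draft", "about"))
print(get_in_workspace("/published/draft", "about"))
```

In this model the draft workspace only overrides the front page, so a lookup for the "about" entry succeeds in context mode and finds nothing in plain workspace mode.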

Content schemas

Midgard2 objects are defined using MgdSchema XML configuration files. Their classes are automatically registered for usage in applications and are described using MgdSchema file attributes and properties.

Naming Conventions

Due to language binding limitations, type names should be in lowercase and use underscores as word separators. You should follow this convention if you want to define schema types for midgard-java and midgard-php applications.

Temporary files with '.' or '#' prefixes, or with a '#' suffix will be ignored and warning messages will be printed to the log file or directly to the terminal window.
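As an illustration of the rule above, the following sketch mirrors the described check in Python. The function name is hypothetical and the real check lives inside midgard-core; only the prefix and suffix rules come from the text.

```python
def is_ignored_schema_file(filename):
    """Return True for temporary files: '.' or '#' prefix, or '#' suffix."""
    return filename.startswith((".", "#")) or filename.endswith("#")

candidates = ["article.xml", ".article.xml.swp", "#article.xml#", "article.xml#"]
loaded = [name for name in candidates if not is_ignored_schema_file(name)]
print(loaded)
```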

Schema Structure

Here's a simple example:

<?xml version="1.0" encoding="UTF-8"?>
<Schema xmlns="http://www.midgard-project.org/repligard/1.4">
  <type name="example_article" table="example_article">
    <property name="id" type="unsigned integer" primaryfield="id" />
    <property name="title" type="string" />
    <property name="content" type="text" />
  </type>
</Schema>
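To make the structure concrete, here is a minimal sketch that reads the example above with Python's standard library and lists each type with its properties. It only inspects the XML; it does not use Midgard2 itself, and the `describe` helper is a name made up for this illustration.

```python
import xml.etree.ElementTree as ET

SCHEMA = """<?xml version="1.0" encoding="UTF-8"?>
<Schema xmlns="http://www.midgard-project.org/repligard/1.4">
  <type name="example_article" table="example_article">
    <property name="id" type="unsigned integer" primaryfield="id" />
    <property name="title" type="string" />
    <property name="content" type="text" />
  </type>
</Schema>
""".encode("utf-8")

# The schema elements live in the repligard namespace declared on <Schema>.
NS = {"mgd": "http://www.midgard-project.org/repligard/1.4"}

def describe(schema_xml):
    """Map each type name to a dict of property name -> declared type."""
    root = ET.fromstring(schema_xml)
    types = {}
    for type_el in root.findall("mgd:type", NS):
        props = {}
        for prop in type_el.findall("mgd:property", NS):
            # Properties without a type attribute default to string.
            props[prop.get("name")] = prop.get("type", "string")
        types[type_el.get("name")] = props
    return types

print(describe(SCHEMA))
```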

Loading Schema Files

When an application starts up, Midgard2 parses the main schema file MidgardObjects.xml which defines all the built-in types like midgard_person and midgard_attachment.

To load additional MgdSchema files, place them into your /usr/share/midgard2/schema directory (this may be different if you chose another prefix during midgard-core compilation).

If you want to use a custom schema loading path, you can set it using the ShareDir setting of your Midgard configuration file.

Writing Midgard Schemas

For every newly-defined type, mandatory attributes have to be set. To define a new MySuperClass type, use the type element:

<type name="MySuperClass">
</type>

This is sufficient to create a new type and initialize the corresponding new class when the schema is loaded. Such a type is accessible on the bindings level as the MySuperClass class.

Defining Storage Locations

Empty classes may be constructed without storage (i.e. a corresponding database table) defined. To set the table name for a particular type, the XML property table should be used:

<type name="MySuperClass" table="my_table">
</type>

Defining Properties

To define class members (object properties) for a new type, add property elements as children of the type element:

<type name="MySuperClass" table="my_table">
  <property name="title" type="string" />
</type>

When using Midgard2 with your programming language this means there will be a class MySuperClass available with a property title.

Setting the Property Type

By default all properties are of type string, unless another type is explicitly set. Available data types are:

  • string
  • integer
  • unsigned integer
  • text
  • float
  • bool
  • datetime
  • guid

For every XML property data type there is a corresponding database data type:

  • string is equal to varchar(255)
  • integer is equal to int(11) (values can range from -2,147,483,648 to 2,147,483,647)
  • unsigned integer is equal to int(11) (values can range from 0 to 4,294,967,295)
  • text is equal to text (or longtext)
  • float is equal to float
  • bool is equal to boolean
  • guid is equal to varchar(80)

Specifying the Database Data Type

The additional XML property dbtype may be used when a specific database type is needed. For example, a property with the string type could use the varchar(80) type instead of varchar(255).

<property name="title" type="string" dbtype="varchar(80)"/>

or

<property name="info" type="string" dbtype="set('auth')"/>
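The interplay between type and dbtype can be summarised in a short sketch. The mapping table is copied from the list above (datetime is left out because no database type is given for it there); the `column_type` helper is hypothetical and only illustrates that an explicit dbtype overrides the default.

```python
# Default database types, as listed in this section.
DEFAULT_DB_TYPES = {
    "string": "varchar(255)",
    "integer": "int(11)",
    "unsigned integer": "int(11)",
    "text": "text",
    "float": "float",
    "bool": "boolean",
    "guid": "varchar(80)",
}

def column_type(prop_type="string", dbtype=None):
    """Return the column type: an explicit dbtype wins over the default."""
    if dbtype is not None:
        return dbtype
    return DEFAULT_DB_TYPES[prop_type]

print(column_type())
print(column_type("string", dbtype="varchar(80)"))
```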

Resolving Property Name and Table Field Conflicts

The XML property field can be used if you want the object's property name to be different from the table's column name.

<property name="title" type="string" field="otherfield"/>

This declares that the object's title property is stored in the otherfield column. It is equivalent to SQL's SELECT table.otherfield AS title.
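The SQL analogy can be tried out directly. The sketch below uses Python's built-in sqlite3 module purely as a stand-in database (Midgard2 itself talks to the database through libgda), with the table name my_table borrowed from the earlier examples.

```python
import sqlite3

# In-memory stand-in database; the column is named otherfield.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute("CREATE TABLE my_table (otherfield TEXT)")
conn.execute("INSERT INTO my_table (otherfield) VALUES (?)", ("Hello",))

# The alias exposes the otherfield column under the property name title.
row = conn.execute(
    "SELECT my_table.otherfield AS title FROM my_table"
).fetchone()
print(row["title"])
```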

Setting the Primary Key

When a type has many properties defined and one of those should be used as (or better, point to) the table's primary key, the XML property primaryfield can be used.

<property name="primary" type="unsigned integer" primaryfield="id"/>

Setting a Database Index

If a property's field has a special use (e.g. it holds a reference to another record), create an index for it. Indexes are not created for every property by default. To request one, use the reserved attribute 'index'.

<property name="count" type="integer" index="yes"/>

Note that indexes are created by default for the following properties and fields:

  • up property
  • parent property
  • property which is a link
  • property which is a link target (is linked)

Do not create indexes for boolean properties, as it won't improve performance.

Defining Tree Hierarchies

MgdSchema is able to create tree hierarchies. Objects of the same type may be managed in a tree structure by using the XML property upfield. To define a type as a 'child' object of another type, use the XML properties parentfield and parent.

  • upfield is set as an attribute of a property element
  • parentfield is set as an attribute of a property element
  • parent is set as an attribute of the type element

Properties which are parentfield or upfield must be of type unsigned integer, guid or string.

For example, a MyClass type that sets parent="MySuperClass" on its type element is a node in the MySuperClass tree; with an upfield property it may also have child nodes of the same (MyClass) type. Useful methods for such types are documented in the MgdSchema object API (list_childs, is_in_tree, get_by_path).
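The schema definition itself is not shown in this section. As a rough sketch of what such a type could look like, the snippet below builds the XML from Python so that the placement of each attribute is explicit. The table name my_class and the property names id, up and super are assumptions made for this illustration, not names taken from Midgard2.

```python
import xml.etree.ElementTree as ET

# parent goes on the type element (table and property names are hypothetical).
type_el = ET.Element(
    "type", {"name": "MyClass", "table": "my_class", "parent": "MySuperClass"}
)
ET.SubElement(
    type_el, "property",
    {"name": "id", "type": "unsigned integer", "primaryfield": "id"},
)
# upfield goes on a property element: points to a MyClass object one level up.
ET.SubElement(
    type_el, "property",
    {"name": "up", "type": "unsigned integer", "upfield": "up"},
)
# parentfield goes on a property element: points to the owning MySuperClass.
ET.SubElement(
    type_el, "property",
    {"name": "super", "type": "unsigned integer", "parentfield": "super"},
)

schema_fragment = ET.tostring(type_el, encoding="unicode")
print(schema_fragment)
```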

References to Other Types

The link attribute should be used when the object's property (and the column's record value) holds a pointer to another type. The property's value should hold the primary property value of the referenced type.

<property name="creator" type="unsigned integer" link="midgard_person:id"/>

The special : separator describes which property of the referenced type should be used as the property's value. If this separator is not defined, the guid property of the referenced type is used by default.

Properties which are links must be of type unsigned integer, guid or string.
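The separator rule is easy to express in code. This hypothetical helper splits a link value into the referenced type and property, falling back to guid when no separator is present, as described above.

```python
def parse_link(link):
    """Split 'type:property'; without ':' the guid property is used."""
    type_name, separator, prop = link.partition(":")
    return type_name, (prop if separator else "guid")

print(parse_link("midgard_person:id"))
print(parse_link("midgard_person"))
```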

Asynchronous operations

The 'Gjallarhorn' generation introduces asynchronous operations, which are designed to provide non-blocking access to your storage.

The basic idea

In general, any asynchronous operation separates content from the task that acts on it. To perform any read/write operation, three types of objects are required:

  • content, which holds the data and may be any GObject-derived object; its properties are the simplest, volatile storage
  • job, which represents a single operation and is responsible for mapping the content data to the underlying storage
  • pool, which executes operations

The default implementation provides a thread-based execution pool. Once a job has content assigned, it can be added to such a pool, which executes the job in a separate thread. As soon as the operation succeeds, the corresponding signal is emitted and all callbacks connected to that signal are invoked.
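The division of labour between the three roles can be sketched without Midgard2 at all. The classes below are hypothetical stand-ins written with Python's standard library: a dictionary plays the storage, ThreadPoolExecutor plays the pool, and plain callables play the signal handlers. In this sketch the callbacks run in the worker thread, which may differ from how the real signal is delivered.

```python
from concurrent.futures import ThreadPoolExecutor

class Content:
    """Plain object whose attributes hold the data (volatile storage)."""
    def __init__(self):
        self.title = None

class LoadJob:
    """A single operation: copies one record from storage onto the content."""
    def __init__(self, storage, content, record_id):
        self.storage = storage
        self.content = content
        self.record_id = record_id
        self.callbacks = []

    def connect(self, callback):
        """Register a callable to be invoked when the job has finished."""
        self.callbacks.append(callback)

    def execute(self):
        self.content.title = self.storage[self.record_id]["title"]
        for callback in self.callbacks:
            callback(self)

storage = {123: {"title": "Hello"}}   # stand-in for the database
content = Content()
job = LoadJob(storage, content, 123)

finished = []
job.connect(lambda done_job: finished.append(done_job.record_id))

# The pool runs the job in a separate thread; result() waits for it here
# only so that the script can print the loaded value afterwards.
with ThreadPoolExecutor(max_workers=2) as pool:
    pool.submit(job.execute).result()

print(content.title)
```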

Basic example

from gi.repository import GLib, Midgard

# cnc is assumed to be an already opened Midgard.Connection

def executionCallback(job, arg):
    # Invoked when the job emits "execution-end"
    print("ASYNC OPERATION IS DONE")

obj = Midgard.Object.factory(cnc, "TheType", None)
ref = Midgard.ObjectReference(id = 123, name = "id")
job = Midgard.SqlContentManagerJobLoad(connection = cnc, contentobject = obj, reference = ref)

pool = Midgard.ExecutionPool(max_n_threads = 2)
job.connect("execution-end", executionCallback, None)
pool.push(job)

loop = GLib.MainLoop()
loop.run()

API reference

PHP5 extension

The PHP5 extension provides language bindings for the Midgard2 content repository. All core classes and MgdSchema classes are available as normal PHP classes.

Registering classes

In Midgard core, all types are registered when a connection is opened for a specific configuration (either created at the application level or read from an existing configuration file). This does not apply to PHP: a connection may be opened at the application level, but classes cannot be registered that way.

Because of how PHP works, all Midgard classes are registered during the module loading phase. Classes cannot be registered after the module has been loaded and a request (either an httpd request or a command-line run) has started. The advantage of this behavior is that classes persist for the whole lifetime of the module: there is no need to load files and register classes on every request. Once the module has been loaded, every class is available until the main process (the httpd server or the command-line application) terminates.

This also means that every MgdSchema class, once registered, is available to every application running in the main process. On an httpd server such as Apache, types are shared among requests and virtual hosts, so registered types cannot be isolated.

Setting up the schemas

There are a couple of ways to set up the directory that contains the schema files with class definitions.

Core's prefix

By default, all schemas are read from the share directory (e.g. /usr/share/midgard2/schema or /usr/local/share/midgard2/schema). To register types from there, no configuration changes are needed.

Setting the share dir with an environment variable

Sometimes classes need to be registered from a custom directory, so that different types are registered before the application actually starts. To do this, set an environment variable:

export MIDGARD_ENV_GLOBAL_SHAREDIR="/full/path/to/share/directory"
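When the application is launched from a script rather than from a shell, the same variable can be passed to the child process. The sketch below uses a trivial Python child as a stand-in for a Midgard2 application and merely shows that the variable is visible before that process starts doing any work; the path is the placeholder from the line above.

```python
import os
import subprocess
import sys

# Copy the current environment and add the share dir for the child process.
env = dict(os.environ, MIDGARD_ENV_GLOBAL_SHAREDIR="/full/path/to/share/directory")

# Stand-in for the real application: it just prints the variable it sees.
child = [
    sys.executable, "-c",
    "import os; print(os.environ['MIDGARD_ENV_GLOBAL_SHAREDIR'])",
]
result = subprocess.run(child, env=env, capture_output=True, text=True, check=True)
seen = result.stdout.strip()
print(seen)
```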

Setting the sharedir in the configuration file for multiple virtual hosts

For multiple virtual hosts, it is recommended to name the configuration file in each virtual host. First, add these configuration keys to php.ini or to the midgard2.ini configuration file:

midgard.http = On
midgard.engine = On

In virtual host configuration:

php_admin_value midgard.configuration 'midgard'

The configuration file 'midgard' must exist in the system configuration directory (/etc/midgard2/conf.d).

In an Apache httpd environment, all files in /etc/midgard2/conf.d are read at module startup, all classes are registered, and a new connection is implicitly established for every configuration. Classes are registered "globally", which means every class is available to every virtual host. A connection becomes available to a virtual host through the php_admin_value set there; because it is identified by name, it is shared among all virtual hosts that specify the same configuration name.

This approach is also recommended for security reasons. All files in the /etc/midgard2/conf.d directory can be made readable only by the root user. When the Apache httpd server starts, it reads every configuration with root privileges; as soon as all its modules (including PHP) are loaded, the privileges are dropped, so code executed within virtual hosts cannot read those files.

Setting the sharedir in the configuration file for one host or one application

For a single virtual host, configured with lighttpd for example, add these configuration keys to php.ini (or midgard2.ini):

midgard.http = Off
midgard.engine = On
midgard.configuration_file = "/absolute/path/to/configuration"

The midgard.configuration_file directive takes precedence: if it is configured for a virtual host, it is used to establish the connection for that host.

Setting the midgard.http directive to Off disables the implicit opening of multiple connections when the PHP module is loaded. Instead, the connection for the given configuration file is established during the first request.