S3XML
S3XML is a data exchange format for Sahana Eden.
S3XML is a meta-format and does not specify any particular data elements. The interface introspects the underlying data model, so the constraints defined in the data model also apply to S3XML documents.
Conventions
Name Space
In the current implementation of S3XML, no name space identifier shall be used. Where a name space identifier for the native S3XML format is needed (e.g. when embedding S3XML in other XML), it shall be:
xmlns:s3xml="http://eden.sahanafoundation.org/wiki/S3XML"
Character Encoding
Generally, XML documents can specify their character encoding in the XML header:
<?xml version="1.0" encoding="utf-8"?>
Sources in non-XML formats (JSON, CSV) used with S3XML on-the-fly conversion/transformation are expected to be UTF-8 encoded.
All exported data are always UTF-8 encoded.
Import Sources
There are 3 different ways to specify or submit data sources for import:
Files on the Server
A source file in the server file system can be specified using the filename URL variable:
PUT http://<server>/<controller>/<resource>.xml?filename=<path>
Multiple files can be specified as a comma-separated list of pathnames:
PUT http://<server>/<controller>/<resource>.xml?filename=<path>,<path>
Remote Files
A source file can be specified by its URL using the fetchurl URL variable:
PUT http://<server>/<controller>/<resource>.xml?fetchurl=<url>
Multiple files can be specified as a comma-separated list of URLs:
PUT http://<server>/<controller>/<resource>.xml?fetchurl=<url>,<url>
Supported protocols are http, ftp and file, where file is interpreted in the server file system context. URLs of different protocols can be mixed.
The specified URLs must be accessible either without authentication, or (if you specify credentials in the URLs) they must support unsolicited HTTP basic authentication - HTTP 403 retries are not handled by the interface.
The URLs must be properly quoted (see http://www.w3schools.com/tags/ref_urlencode.asp for more details), and must not contain commas.
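The quoting rule above can be sketched in Python. This is a minimal illustration, not part of Sahana Eden: the helper name build_import_url is hypothetical, and it assumes that "properly quoted" means percent-encoding every reserved character of each source URL.

```python
from urllib.parse import quote


def build_import_url(resource_url, source_urls):
    """Return resource_url with a comma-separated fetchurl list appended.

    Hypothetical helper: each source URL is percent-encoded in full
    (an assumption about what "properly quoted" requires).
    """
    for url in source_urls:
        if "," in url:
            # The commas are the list separator, so they cannot appear in a URL.
            raise ValueError("source URLs must not contain commas: %r" % url)
    quoted = ",".join(quote(url, safe="") for url in source_urls)
    return "%s?fetchurl=%s" % (resource_url, quoted)


import_url = build_import_url(
    "http://localhost:8000/eden/pr/person.xml",
    ["http://example.com/a.xml", "ftp://example.com/b.xml"],
)
print(import_url)
```

Different protocols are mixed in the example on purpose, since the interface accepts http, ftp and file URLs in the same list.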
Request Attachments
Source files can also be attached to a multipart-request. In this case the file extension of the source file must match the request URL file extension. Multiple files can be attached.
Multiple Sources
Where multiple sources are specified or attached, they are first converted and transformed one-by-one and then combined into a single element tree before import.
Duplicate Resolution
The S3XML Importer does not handle duplicates within the same source. As the order of elements in the resulting element tree is not defined, and the last update time attribute is not mandatory in source elements, there is no predictable rule of precedence.
Records in the source must not be split across several elements; each record must be submitted in a single element. The Importer does not merge fragments of a record, and which fragment would finally be imported is not predictable.
Source elements using unique keys are automatically matched with existing records. Where the match is ambiguous (e.g. a set of keys matching multiple existing records), the import element will be rejected as invalid. For certain resources, the server may have additional duplicate finders and resolvers configured. How these resolvers handle duplicates can differ from resource to resource.
The duplicate resolution strategy in standard import mode is to update the existing record with the values from the source record. In synchronization mode the default strategy is to accept/keep the newest data (the last update time attribute is mandatory in this case).
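The two default strategies can be illustrated with a small sketch. It is not the Importer's code: records are modelled as plain dictionaries, the function name resolve_duplicate is hypothetical, and field names other than modified_on are made up for the example.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # UTC timestamps as used by S3XML


def resolve_duplicate(existing, incoming, sync=False):
    """Return the record contents to keep for a matched duplicate."""
    if not sync:
        # Standard import mode: update the existing record from the source.
        merged = dict(existing)
        merged.update(incoming)
        return merged
    if "modified_on" not in incoming:
        raise ValueError("modified_on is mandatory in synchronization mode")
    old = datetime.strptime(existing["modified_on"], TIMESTAMP_FORMAT)
    new = datetime.strptime(incoming["modified_on"], TIMESTAMP_FORMAT)
    if new > old:
        # Synchronization mode: the newest data win.
        merged = dict(existing)
        merged.update(incoming)
        return merged
    return dict(existing)


existing = {"first_name": "Ann", "modified_on": "2011-05-01T10:00:00Z"}
older = {"first_name": "Anna", "modified_on": "2011-04-01T10:00:00Z"}
newer = {"first_name": "Anna", "modified_on": "2011-06-01T10:00:00Z"}
```

With these sample records, the older source overwrites the existing record in standard mode but is ignored in synchronization mode.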
XML Format
Document Types and Structure
S3XML defines 3 types of documents:
Document Type | Description |
Schema Documents | describe the data schema for a resource |
Field Options Documents | describe the currently acceptable options for fields in a record |
Data Documents | provide the current contents (data) of resources |
Schema Documents
Schema documents describe the data schema for a resource. Clients can use these documents e.g. for automatic generation of forms.
Schema documents can be retrieved from Sahana Eden by sending an empty GET request (i.e. without source) to the create.xml method of a resource, e.g.:
GET http://localhost:8000/eden/pr/person/create.xml
Document Tree:
<s3xml>
  <resource>
    <field>
    ...
    <resource>
      <field>
      ...
    </resource>
  </resource>
</s3xml>
or (if requested from the fields.xml method):
<fields resource="name">
  <field/>
  <field/>
  <field/>
  ...
</fields>
Note:
- These documents can only be requested (GET), but not submitted for import
- Schema documents support on-the-fly transformation (see chapter Web Services)
- the URL query parameter ?options=true adds a list of field options to those fields where options are defined, and combined with the parameter &reference=true, even options for foreign key references will be included
- the URL query parameter ?meta=true will include the meta fields (as <meta> elements). In data documents, the meta fields appear as attributes of the <resource> element
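As an illustration of the query parameters listed above, the following sketch assembles a schema request URL. The helper name schema_url and its keyword arguments are hypothetical; only the parameter names options, reference and meta come from this page.

```python
from urllib.parse import urlencode


def schema_url(base, controller, resource,
               options=False, reference=False, meta=False):
    """Build the URL of a schema document request (create.xml method)."""
    params = []
    if options:
        params.append(("options", "true"))
    if reference:
        # Only meaningful in combination with options=true.
        params.append(("reference", "true"))
    if meta:
        params.append(("meta", "true"))
    url = "%s/%s/%s/create.xml" % (base, controller, resource)
    if params:
        url += "?" + urlencode(params)
    return url


plain = schema_url("http://localhost:8000/eden", "pr", "person")
with_options = schema_url("http://localhost:8000/eden", "pr", "person",
                          options=True, reference=True)
```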
Field Options Documents
Field options documents describe the currently acceptable options for fields in a record. Clients can use these documents e.g. for automatic generation and/or client-side validation of forms.
Field options documents can be requested from Sahana Eden by sending a GET request to the options.xml method of a resource, e.g.:
GET http://localhost:8000/eden/pr/person/options.xml
Document Tree:
<options>
  <select>
    <option>
    <option>
    <option>
    ...
  </select>
  <select>
    ...
  </select>
  ...
</options>
Note:
- the field URL variable can be used to specify a particular field in the resource, the enclosing <options> element would then be omitted (i.e. <select> becomes root element)
- on-the-fly transformation of field options documents is not supported
- Field options documents can only be requested (GET), but not submitted for import
Data Documents
Data documents provide the current contents (data) of resources.
Data documents can be requested from Sahana Eden by sending a GET request to the URL of the resource, e.g.:
GET http://localhost:8000/eden/pr/person.xml
Data documents can be submitted to Sahana Eden by sending PUT requests to the URL of the resource, e.g.:
PUT http://localhost:8000/eden/pr/person.xml
Note that sending data with POST starts an interactive review of the source data before they are imported, so POST cannot be used by purely non-interactive clients.
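A non-interactive client could prepare such a PUT request with the Python standard library, as sketched below. The helper name make_import_request is hypothetical, and the Content-Type header is an assumption rather than a documented requirement. The request is only constructed here, not sent.

```python
import urllib.request


def make_import_request(resource_url, xml_bytes):
    """Prepare (but do not send) a PUT request carrying a data document."""
    return urllib.request.Request(
        resource_url,
        data=xml_bytes,
        method="PUT",  # POST would start the interactive review instead
        headers={"Content-Type": "application/xml"},  # assumed, not documented
    )


request = make_import_request(
    "http://localhost:8000/eden/pr/person.xml",
    b"<s3xml></s3xml>",
)
print(request.get_method(), request.full_url)
```

Passing the prepared object to urllib.request.urlopen would submit it to the server.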
Document Tree:
<s3xml>
  <resource> <!-- primary resource element -->
    <data> <!-- field data -->
    <data>
    ...
    <resource> <!-- component resource inside the primary resource -->
      <data>
      <data>
      <reference/> <!-- reference -->
      ...
    </resource>
    <reference/> <!-- reference -->
    <reference> <!-- reference with embedded resource element -->
      <resource>
        <data>
        ...
      </resource>
    </reference>
  </resource>
</s3xml>
Components
Components of resources are <resource> elements nested inside the primary <resource> element. Component records are imported automatically and the required key references are added (i.e. no explicit <reference> element is required).
Foreign key references of component records to their primary record will not be exported, and where they appear in import sources, they will be ignored.
Components of components are not allowed (maximum depth 1), and where they appear in import sources, they will be ignored.
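A data document with one component record can be assembled with ElementTree, as in the sketch below. The table names pr_person and pr_contact and the field names are illustrative assumptions, not taken from this page. The helper component_depth is likewise hypothetical and only checks the depth rule stated above.

```python
import xml.etree.ElementTree as ET

root = ET.Element("s3xml")

# Primary record (table and field names are assumptions for illustration).
person = ET.SubElement(root, "resource", name="pr_person")
ET.SubElement(person, "data", field="first_name").text = "Jane"

# Component record: nested directly in the primary record, without an
# explicit <reference> back to it.
contact = ET.SubElement(person, "resource", name="pr_contact")
ET.SubElement(contact, "data", field="value").text = "jane@example.com"


def component_depth(resource):
    """Count levels of directly nested <resource> elements below resource."""
    children = resource.findall("resource")
    if not children:
        return 0
    return 1 + max(component_depth(child) for child in children)


document = ET.tostring(root, encoding="unicode")
print(document)
```

A depth greater than 1 would indicate components of components, which the Importer ignores.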
References
Foreign key references (except those linking components to their primary record) are represented by <reference> elements.
Foreign keys can be importable UIDs (uuid attribute, which will be both imported and used to find and/or link to existing records in the DB) or temporary UIDs (tuid attribute, which will not be imported but only used to find records within the current tree). If a <resource> element with a matching UID key attribute is found in the same tree, it will be automatically imported.
References inside referenced elements will be resolved (unlimited depth) and also be imported. Circular references will be detected and properly resolved.
Multi-references (list:reference type in web2py) use a list of UID keys separated by vertical bars, like uuid=|uid1|uid2|uid3|. The leading and trailing vertical bars must be present.
If a <resource> element is nested inside the <reference>, either or both of the UID keys can be omitted. Where both keys are used, however, they must match. Multiple embedded <resource> elements are allowed for multi-references.
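The multi-reference key syntax can be handled with two small helpers. This is a sketch with hypothetical function names; it only covers the delimiter convention described above.

```python
def encode_multi_reference(uids):
    """Join UID keys into the |uid1|uid2|...| form."""
    for uid in uids:
        if "|" in uid:
            raise ValueError("UID keys must not contain '|': %r" % uid)
    return "|" + "|".join(uids) + "|"


def decode_multi_reference(value):
    """Split a |uid1|uid2|...| attribute value into a list of UID keys."""
    if len(value) < 2 or not (value.startswith("|") and value.endswith("|")):
        raise ValueError("leading and trailing '|' are required: %r" % value)
    inner = value[1:-1]
    return inner.split("|") if inner else []


encoded = encode_multi_reference(["uid1", "uid2", "uid3"])
decoded = decode_multi_reference(encoded)
```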
Element Descriptions
<s3xml>
This is the root element (in schema and data documents).
<s3xml success="true" results="2" domain="mycomputer" url="http://127.0.0.1:8000/eden"
       latmin="-90.0" latmax="90.0" lonmin="-180.0" lonmax="180.0">
  ...
</s3xml>
Parent elements: | none (root element) |
Child elements: | <resource> |
Contents: | empty |
Attributes:
Name | Type | Description | mandatory? |
domain | string | the domain name of the data repository | no |
url | string | the URL of the data repository | no |
success | boolean | true if the page contains any records, otherwise false | no |
results | integer | the total number of records matching the request | no |
start | integer | the index of the first record returned (in paginated requests) | no |
limit | integer | the maximum number of records returned (in paginated requests) | no |
latmin, latmax, lonmin, lonmax | float | geo-location boundary box of the results | no |
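A client might read these root attributes as follows. This is a minimal sketch with a hypothetical helper name; since all attributes are optional, missing ones are returned as None.

```python
import xml.etree.ElementTree as ET


def read_root_attributes(xml_text):
    """Parse the optional attributes of the <s3xml> root element."""
    root = ET.fromstring(xml_text)
    if root.tag != "s3xml":
        raise ValueError("not an S3XML document: root is <%s>" % root.tag)

    def optional(name, cast):
        raw = root.get(name)
        return None if raw is None else cast(raw)

    bbox = tuple(optional(name, float)
                 for name in ("latmin", "latmax", "lonmin", "lonmax"))
    return {
        "domain": root.get("domain"),
        "url": root.get("url"),
        "success": optional("success", lambda v: v == "true"),
        "results": optional("results", int),
        "start": optional("start", int),
        "limit": optional("limit", int),
        # Report the bounding box only if all four corners are present.
        "bbox": bbox if None not in bbox else None,
    }


sample = ('<s3xml success="true" results="2" domain="mycomputer" '
          'url="http://127.0.0.1:8000/eden" latmin="-90.0" latmax="90.0" '
          'lonmin="-180.0" lonmax="180.0"/>')
info = read_root_attributes(sample)
```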
<resource>
This element represents a record (in data documents) or a database table (in schema documents).
<s3xml>
  <resource name="xxx_yyy">
    ...
  </resource>
</s3xml>
Parent elements: | <s3xml>, <resource>, <reference> |
Child elements: | <resource>, <data>, <field> |
Contents: | empty |
Attributes:
Name | Type | Description | mandatory? |
name | string | the name of the database table | yes |
uuid | string | a unique identifier for the record | no* |
tuid | string | a temporary unique identifier for the record | no* |
created_on | datetime | date and time when the record was created | no** |
modified_on | datetime | date and time when the record was last updated | no, default: time of the request** *** |
created_by | string | email-address of the user who created the record | no |
modified_by | string | email-address of the user who last updated the record | no |
owned_by_user | string | email-address of the user who owns the record***** | no |
owned_by_role | string | name of the user group who collectively own the record***** | no |
mci | integer | master-copy-index | no, default: 2*** **** |
- (*) Records will be identified within the input file by their uuid, or, if no uuid is specified, by their tuid.
- (**) as YYYY-MM-DDTHH:mm:ssZ, always UTC
- (***) the last update date/time and mci are required in synchronization
- (****) the master copy index specifies how often a record has been copied across sites, see below
- (*****) record ownership will be retained if the record owners can be matched against existing users/user groups
The uuid will be stored in the database together with the record. If uuid is present and matches an existing record in the database, then this record will be updated. If there's no match or no uuid specified in the resource element, then the importer will create a new record in the database (and automatically generate a uuid if required).
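The update-or-create rule can be sketched as follows. This is not the Importer's implementation: the database table is stood in for by a dictionary keyed by uuid, and the function name import_record is hypothetical.

```python
import uuid


def import_record(table, record):
    """Update the record matching record['uuid'], else create a new one.

    table is a dict mapping uuid -> record dict (a stand-in for the
    database table). Returns a tuple (action, uuid).
    """
    uid = record.get("uuid")
    if uid is not None and uid in table:
        table[uid].update(record)
        return "updated", uid
    if uid is None:
        # No uuid in the source element: generate one, here in URN form.
        uid = uuid.uuid4().urn
    table[uid] = dict(record, uuid=uid)
    return "created", uid


table = {}
first = import_record(table, {"uuid": "urn:uuid:e4bcb9fd-d890-4f2f-b221-1d75fff79e2d",
                              "first_name": "Jane"})
second = import_record(table, {"uuid": "urn:uuid:e4bcb9fd-d890-4f2f-b221-1d75fff79e2d",
                               "first_name": "Janet"})
third = import_record(table, {"first_name": "John"})
```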
The mci - master-copy-index - indicates how often this record has been copied across sites:
- when importing a new record the mci value is always *imported* as-is from the source
- when updating a record, the mci of the database record remains unchanged
- the mci of a record is *exported* as its current database value + 1.
- the repository first creating a record sets mci=0 in the database record, which appears as mci=1 in the exported XML.
- a copying site then imports mci=1 into its database, which appears as mci=2 in its export XML, and so forth...
The mci can be used to filter records by whether or not they originated at a given repository. If there's a fixed set of synchronization paths between a number of Sahana Eden instances, the mci can be used for conflict resolution. If the mci is not specified, it defaults to 2.
MCI handling is optional for non-synchronizing peers.
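The mci rules above amount to a small amount of arithmetic, sketched here with hypothetical function names. The origin test assumes that a stored mci of 0 always means the record was created locally.

```python
LOCAL_ORIGIN_MCI = 0  # stored by the repository that first creates a record
DEFAULT_MCI = 2       # used when the source element has no mci attribute


def mci_on_import(existing_mci, source_mci=None):
    """Return the mci to store; existing_mci is None for a new record."""
    if existing_mci is not None:
        # Update: the mci of the database record remains unchanged.
        return existing_mci
    # New record: the source value is imported as-is.
    return DEFAULT_MCI if source_mci is None else source_mci


def mci_on_export(db_mci):
    """The exported value is the current database value + 1."""
    return db_mci + 1


def originated_here(db_mci):
    """True if the record was first created at this repository."""
    return db_mci == LOCAL_ORIGIN_MCI


# Site A creates a record, site B copies it and exports it again.
exported_by_a = mci_on_export(LOCAL_ORIGIN_MCI)
stored_at_b = mci_on_import(None, exported_by_a)
exported_by_b = mci_on_export(stored_at_b)
```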
<data>
This element represents the value of a single field in the record.
<s3xml>
  <resource>
    <data field="fieldname" value="value">...</data>
  </resource>
</s3xml>
Parent elements: | <resource> |
Child elements: | none (leaf element) |
Contents: | Text |
Attributes:
Name | Type | Description | mandatory? |
field | string | the field name in the record | yes |
value | JSON | the native field value | no |
url | URL | the URL to download the contents from* | no |
filename | filename | the filename of the attached contents* | no |
(*) If the field is for file upload, a url attribute should be provided to specify the location of the file. The importer will try to download and store the file (file transfer) from that URL (pull). It is also possible to send the file together with the HTTP request - in this case the filename must be specified instead of the url (push). The push variant for uploads is meant for peers which do not support pulling for some reason (e.g. mobile phones). Normal servers would always provide a URL for download in order to allow the consuming site to decide which files to download and when (saves bandwidth).
The text node in the data element provides a human-readable representation of the field value.
The value attribute contains a JSON representation of the field value, retaining the original data type (i.e. strings must be double-quoted), except for date, time and datetime values, which are represented as simple strings in the respective standard format (no double quotes). The standard format for datetime values is YYYY-MM-DDTHH:mm:ssZ (ISO 8601 format, UTC); dates are represented as YYYY-MM-DD, and times as HH:mm:ss.
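The encoding rule for the value attribute can be sketched in Python. The function name encode_value is hypothetical, and the sketch assumes that datetime values passed in are already in UTC.

```python
import json
from datetime import date, datetime, time


def encode_value(value):
    """Return the string to place in the value attribute of <data>."""
    # datetime must be tested before date, since it is a subclass of date.
    if isinstance(value, datetime):
        return value.strftime("%Y-%m-%dT%H:%M:%SZ")  # assumes UTC input
    if isinstance(value, date):
        return value.strftime("%Y-%m-%d")
    if isinstance(value, time):
        return value.strftime("%H:%M:%S")
    # Everything else keeps its type through JSON (strings get double quotes).
    return json.dumps(value)


examples = [
    encode_value("Jane"),
    encode_value(42),
    encode_value(date(2011, 5, 1)),
    encode_value(datetime(2011, 5, 1, 10, 30, 0)),
    encode_value(time(10, 30, 0)),
]
```

When the result is written into an XML document, the serializer still has to escape the double quotes of JSON strings inside the attribute.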
data elements representing passwords can contain the clear text password in the value attribute, or the encrypted password in the text node. Where a clear text password is given as value attribute, it will be stored encrypted, otherwise the password will be stored as-is. Note that clear-text representation of passwords will be accepted by the interface, but never be exported.
<reference>
Represents a foreign key reference.
<s3xml>
  <resource name="xxx_yyy">
    <reference field="xy" resource="aaabbb" uuid="urn:uuid:e4bcb9fd-d890-4f2f-b221-1d75fff79e2d"/>
  </resource>
</s3xml>
Parent elements: | <resource> |
Child elements: | <resource> |
Contents: | Text |
Attributes:
Name | Type | Description | mandatory? |
field | string | the field name in the record | yes |
resource | string | the name of the referenced database table | yes |
uuid | string | the unique identifier of the referenced record (foreign key)* | (yes)** |
tuid | string | a temporary identifier for a referenced record (foreign key)* | (yes)** |
(**) Records will be identified within the input file by their uuid, or, if no uuid is specified, by their tuid.
If the referenced record is enclosed in the reference element, then uuid and tuid can be omitted:
<s3xml>
  <resource name="xxxyyy">
    <!-- content of the record goes here -->
    <reference field="xy" resource="aaabbb">
      <resource name="aaabbb">
        <!-- content of the referenced record goes here -->
      </resource>
    </reference>
  </resource>
</s3xml>
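How a reference might be resolved within one element tree is sketched below. This illustrates the lookup order described on this page (embedded record first, then uuid, then tuid) and is not the Importer's actual code. The function name resolve_reference is hypothetical, and the sample reuses the placeholder names from the examples above.

```python
import xml.etree.ElementTree as ET


def resolve_reference(root, reference):
    """Return the <resource> a <reference> points to within the tree, or None."""
    embedded = reference.find("resource")
    if embedded is not None:
        # The referenced record is enclosed; uuid and tuid may be omitted.
        return embedded
    # Identify by uuid, or by tuid if no uuid is specified.
    key = "uuid" if reference.get("uuid") is not None else "tuid"
    uid = reference.get(key)
    if uid is None:
        return None
    table = reference.get("resource")
    for candidate in root.iter("resource"):
        if candidate.get("name") == table and candidate.get(key) == uid:
            return candidate
    return None  # not in this tree; a uuid may still match a database record


DOCUMENT = """
<s3xml>
  <resource name="xxx_yyy">
    <reference field="xy" resource="aaabbb" tuid="t1"/>
  </resource>
  <resource name="aaabbb" tuid="t1"/>
</s3xml>
""".strip()

root = ET.fromstring(DOCUMENT)
target = resolve_reference(root, root.find("resource/reference"))
```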