CMIS (Content Management Interoperability Services) is the OASIS specification for content management interoperability. It allows clients and servers to communicate over HTTP (REST with JSON, or AtomPub) using a unified domain model. The latest published version is CMIS 1.1.
Nuxeo supports CMIS through the following modules:
- The Apache Chemistry OpenCMIS library (an Apache project to which Nuxeo is a contributor), which is a general-purpose Java library allowing developers to easily write CMIS clients and servers,
- Specific Nuxeo OpenCMIS connector bundles, allowing the Nuxeo Platform to be used as a CMIS server with the help of OpenCMIS. The CMIS connector is included in the Nuxeo Platform by default.
Usage
The following documentation uses http://localhost:8080/nuxeo as the URL of the Nuxeo server, but you can replace it with http://NUXEO_SERVER/nuxeo if you have another instance available.
You can access the different services from the following URLs:
- Browser Binding root URL:
http://localhost:8080/nuxeo/json/cmis
- AtomPub service document:
http://localhost:8080/nuxeo/atom/cmis
JSON
The Browser Binding (JSON) endpoint is recommended, as it is faster and has more features than the other bindings.
You can use a CMIS 1.1 Browser Binding (JSON) client and point it at http://localhost:8080/nuxeo/json/cmis.
To check the JSON returned from the command line, you can use curl or wget:
curl -u Administrator:Administrator http://localhost:8080/nuxeo/json/cmis | json_pp
This will give you the description of the default repository:
{
  "default" : {
    "cmisVersionSupported" : "1.1",
    "productName" : "Nuxeo OpenCMIS Connector",
    "productVersion" : "8.10",
    "vendorName" : "Nuxeo",
    "repositoryName" : "Nuxeo Repository default",
    "repositoryDescription" : "Nuxeo Repository default",
    "repositoryUrl" : "http://localhost:8080/nuxeo/json/cmis/default/",
    "thinClientURI" : "http://localhost:8080/nuxeo/",
    "repositoryId" : "default",
    "rootFolderId" : "fe7944e0-3d44-4abc-90d4-64e0e07c63c7",
    "rootFolderUrl" : "http://localhost:8080/nuxeo/json/cmis/default/root",
    "latestChangeLogToken" : "42",
    "capabilities" : {
      "capabilityChanges" : "objectidsonly",
      "capabilityVersionSpecificFiling" : false,
      "capabilityMultifiling" : false,
      "capabilityContentStreamUpdatability" : "pwconly",
      "capabilityQuery" : "bothcombined",
      "capabilityACL" : "manage",
      "capabilityRenditions" : "read",
      "capabilityPWCSearchable" : true,
      "capabilityOrderBy" : null,
      "capabilityPWCUpdatable" : true,
      "capabilityGetDescendants" : true,
      "capabilityUnfiling" : false,
      "capabilityAllVersionsSearchable" : true,
      "capabilityGetFolderTree" : true,
      "capabilityJoin" : "none",
      "capabilityNewTypeSettableAttributes" : { ... },
      "capabilityCreatablePropertyTypes" : {
        "canCreate" : []
      }
    },
    "aclCapabilities" : {
      "propagation" : "propagate",
      "supportedPermissions" : "repository",
      "permissionMapping" : [ ... ],
      "permissions" : [ ... ]
    },
    "changesIncomplete" : false,
    "changesOnType" : [
      "cmis:document",
      "cmis:folder"
    ],
    "principalIdAnyone" : "Everyone",
    "principalIdAnonymous" : "Guest",
    "extendedFeatures" : [
      {
        "description" : "Adds an additional DateTime format for the Browser Binding.",
        "commonName" : "Browser Binding DateTime Format",
        "versionLabel" : "1.0",
        "id" : "http://docs.oasis-open.org/ns/cmis/extension/datetimeformat",
        "url" : "https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=cmis"
      }
    ]
  }
}
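The same check can be scripted. The following is a minimal Python sketch, using only the standard library, that requests the repository description and keeps a few of its fields. It assumes the local URL and the Administrator:Administrator credentials used on this page; the helper names (`basic_auth_header`, `fetch_repositories`, `summarize`) are illustrative and not part of any Nuxeo or OpenCMIS API.

```python
import base64
import json
import urllib.request

# URL used throughout this page; adjust it for another instance.
CMIS_JSON_ROOT = "http://localhost:8080/nuxeo/json/cmis"


def basic_auth_header(user, password):
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def fetch_repositories(url=CMIS_JSON_ROOT, user="Administrator", password="Administrator"):
    """GET the Browser Binding root URL and decode the JSON (needs a running server)."""
    request = urllib.request.Request(
        url, headers={"Authorization": basic_auth_header(user, password)}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def summarize(repositories):
    """Map each repository id to a few fields of its description."""
    return {
        repo_id: {
            "cmisVersionSupported": info.get("cmisVersionSupported"),
            "rootFolderUrl": info.get("rootFolderUrl"),
            "capabilityJoin": info.get("capabilities", {}).get("capabilityJoin"),
        }
        for repo_id, info in repositories.items()
    }
```

With a running server, `summarize(fetch_repositories())` would return one entry per repository, keyed by repository id.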
To run a query:
curl -u Administrator:Administrator "http://localhost:8080/nuxeo/json/cmis/default?cmisselector=query&succinct=true&q=SELECT+cmis:objectId,+dc:title+FROM+cmis:folder+WHERE+dc:title+=+'Workspaces'" | json_pp
This returns:
{
"numItems" : 1,
"hasMoreItems" : false,
"results" : [
{
"succinctProperties" : {
"cmis:objectTypeId" : "WorkspaceRoot",
"cmis:objectId" : "96e9e7b9-75be-4123-888d-ca89af7c8da3",
"dc:title" : "Workspaces"
}
}
]
}
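When the query is built programmatically, letting a library handle the URL encoding avoids hand-escaping spaces and quotes. Below is a minimal Python sketch, assuming the same repository URL as above; `build_query_url` and `succinct_rows` are hypothetical helper names introduced for this illustration only.

```python
import json
from urllib.parse import urlencode

# Repository URL of the "default" repository, as used in the curl example.
REPOSITORY_URL = "http://localhost:8080/nuxeo/json/cmis/default"


def build_query_url(statement, repository_url=REPOSITORY_URL, succinct=True):
    """Return the Browser Binding URL that runs a CMISQL statement."""
    params = {"cmisselector": "query", "q": statement}
    if succinct:
        params["succinct"] = "true"
    # urlencode percent-encodes reserved characters and turns spaces into '+'.
    return repository_url + "?" + urlencode(params)


def succinct_rows(response_text):
    """Extract the succinctProperties dictionary of each result row."""
    payload = json.loads(response_text)
    return [row["succinctProperties"] for row in payload.get("results", [])]
```

The URL returned by `build_query_url` can be passed to curl or to any HTTP client together with the credentials, and `succinct_rows` applied to the response body yields one dictionary per matching object.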
AtomPub
You can use a CMIS 1.1 AtomPub client and point it at http://localhost:8080/nuxeo/atom/cmis.
(Since Nuxeo 7.10, a legacy CMIS 1.0 AtomPub endpoint is also available at http://localhost:8080/nuxeo/atom/cmis10. This is provided for old clients that cannot be upgraded.)
To check the AtomPub XML returned from the command line, you can use curl or wget:
curl -u Administrator:Administrator http://localhost:8080/nuxeo/atom/cmis
To run a query:
curl -u Administrator:Administrator "http://localhost:8080/nuxeo/atom/cmis/default/query?q=SELECT+cmis:objectId,+dc:title+FROM+cmis:folder+WHERE+dc:title+=+'Workspaces'&searchAllVersions=true"
For readable output, pipe the result through tidy:
... | tidy -q -xml -indent -wrap 999
Notes
- The searchAllVersions=true parameter is mandatory if you want results equivalent to what you see in Nuxeo (which often contains mostly private working copies).
- To fetch custom metadata, you must restrict the selection to document types that contain the metadata. For example, if you have a metadata property "custom" in a document type "mytype", your query would be something like:
curl -u Administrator:Administrator "http://localhost:8080/nuxeo/atom/cmis/default/query?q=SELECT+cmis:objectId,+mytype:custom+FROM+mytype&searchAllVersions=true"
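As an alternative to reading the XML by eye, the properties of each result entry can be extracted with a small script. The sketch below uses Python's standard `xml.etree.ElementTree` and the namespace URIs defined by the CMIS specification. It assumes each entry carries its properties under `cmisra:object/cmis:properties`, and that each property element exposes a `queryName` or `propertyDefinitionId` attribute; `entry_properties` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace URIs from the Atom and CMIS specifications.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "cmis": "http://docs.oasis-open.org/ns/cmis/core/200908/",
    "cmisra": "http://docs.oasis-open.org/ns/cmis/restatom/200908/",
}


def entry_properties(feed_xml):
    """Return one {property name: value} dictionary per atom:entry of a feed."""
    root = ET.fromstring(feed_xml)
    rows = []
    for entry in root.findall("atom:entry", NS):
        row = {}
        for prop in entry.findall("cmisra:object/cmis:properties/*", NS):
            # Prefer the name used in the query, fall back to the definition id.
            name = prop.get("queryName") or prop.get("propertyDefinitionId")
            if name is None:
                continue
            values = [value.text for value in prop.findall("cmis:value", NS)]
            # Single-valued properties are unwrapped; others stay as lists.
            row[name] = values[0] if len(values) == 1 else values
        rows.append(row)
    return rows
```

Applied to the output of the query above, this should give one dictionary per document, with keys such as `cmis:objectId` and `mytype:custom`.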
SOAP
The SOAP endpoints are no longer available as of Nuxeo 10.3.
CMIS Clients
Several free clients for CMIS 1.1 are available. The most complete is the CMIS Workbench, part of OpenCMIS.
Developers can use the Chemistry libraries to build their own client (Java, Python, PHP, .NET). Documentation, sample code and how-to guides for the OpenCMIS libraries can be found on the OpenCMIS developer wiki.
From Java Code Within a Nuxeo Component
To create, delete or modify documents, folders and relations, use the regular CoreSession API of Nuxeo. To perform CMISQL queries (for instance to perform JOINs, which are not supported by the default NXQL query language), see the page Using CMISQL from Java.
Capabilities
The Nuxeo OpenCMIS connector implements the following capabilities from the specification:
Capability | Value |
---|---|
Navigation Capabilities | |
Get descendants supported | Yes |
Get folder tree supported | Yes |
Order By supported | Custom |
Object Capabilities | |
Content stream updates | PWC only |
Changes | Object IDs only |
Renditions | Read |
Filing Capabilities | |
Multifiling supported | _No_ |
Unfiling supported | _No_ |
Version-specific filing supported | _No_ |
Versioning Capabilities | |
PWC updatable | Yes |
PWC searchable | Yes |
All versions searchable | Yes |
Query Capabilities | |
Query | Both combined |
Joins | None (Inner and outer if org.nuxeo.cmis.joins=true) |
Type Capabilities | |
Create property types | _No_ |
New type settable attributes | None |
ACL Capabilities | |
ACLs | Manage |
ACLs propagation | Propagate |
Supported permissions | Repository |
Model Mapping
The following describes how Nuxeo documents are mapped to CMIS objects and vice versa.
- Only Nuxeo documents including the "dublincore" schema are visible in CMIS.
- Complex properties are not visible in CMIS by default, as this notion does not exist in CMIS. However, if the server is configured to do so, they can be exposed as JSON-encoded strings (since Nuxeo 7.1, see NXP-14474).
- Dynamic facets are visible as CMIS 1.1 secondary types (since Nuxeo 7.1, see NXP-15070).
- Proxy documents are visible in CMIS if the system property org.nuxeo.cmis.proxies is set to true (since Nuxeo 8.3 / Nuxeo 7.10-HF08; the default is true since Nuxeo 9.1 and false in previous versions; see NXP-17313 and NXP-21828).
- Secondary content streams are not visible as renditions. Only the Nuxeo thumbnail and renditions explicitly made available through the Nuxeo RenditionService are visible.
- Documents in the Nuxeo trash (those whose nuxeo:isTrashed is true) are not visible in CMIS, unless an explicit query using the nuxeo:isTrashed property is done.
This mapping may change to be more comprehensive in future Nuxeo Platform versions.
Nuxeo-Specific System Properties
In addition to the system properties defined in the CMIS specification under the cmis: prefix, the Nuxeo Platform adds some additional properties under the nuxeo: prefix:
- nuxeo:isTrashed: To access the trashed state of a document. By default, only non-trashed documents are returned in CMISQL queries, unless an explicit nuxeo:isTrashed predicate is specified in the WHERE clause of the query.
- nuxeo:isVersion: To distinguish between archived (read-only revision) and live documents (that can be edited).
- nuxeo:lifecycleState: To access the lifecycle state of a document.
- nuxeo:secondaryObjectTypeIds: Makes it possible to access the facets of a document. Those facets can be static (as defined in the type definitions) or dynamic (each document instance can have declared facets).
- nuxeo:contentStreamDigest: The low-level MD5 or SHA1 digest of blobs stored in the repository. The algorithm used to compute the digest depends on the configuration of the BinaryManager component of the Nuxeo repository.
- nuxeo:isCheckedIn: For live documents, distinguishes between the checked-in and checked-out state.
- nuxeo:parentId: Like cmis:parentId but also available on Document objects (which is possible because the Nuxeo Platform does not have direct multi-filing).
- nuxeo:pathSegment: The last path segment of the document (ecm:name in NXQL).
- nuxeo:pos: The position of an object in its containing folder, if that folder is ordered, or null otherwise.
All these properties can be used as regular CMIS properties and in a CMISQL query (in a SELECT, WHERE or ORDER BY clause where relevant), except for nuxeo:contentStreamDigest, which can only be read in query results or by introspecting the properties of the ObjectData representation of a document.
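As an illustration of these properties in a query, the Python sketch below composes a CMISQL statement that selects some nuxeo: properties and filters explicitly on nuxeo:isTrashed, then builds the matching Browser Binding URL. The function names are hypothetical, the use of a lowercase boolean literal for nuxeo:isTrashed is an assumption, and the repository URL is the local one used on this page.

```python
from urllib.parse import urlencode


def nuxeo_properties_statement(type_id="cmis:document", trashed=True):
    """Compose a CMISQL statement using Nuxeo-specific system properties."""
    # Assumption: nuxeo:isTrashed is compared against a boolean literal.
    flag = "true" if trashed else "false"
    return " ".join([
        "SELECT cmis:objectId, dc:title, nuxeo:lifecycleState, nuxeo:pathSegment",
        f"FROM {type_id}",
        f"WHERE nuxeo:isTrashed = {flag}",
    ])


def browser_binding_query_url(
    statement, repository_url="http://localhost:8080/nuxeo/json/cmis/default"
):
    """Return the Browser Binding URL that runs the statement."""
    params = {"cmisselector": "query", "succinct": "true", "q": statement}
    return repository_url + "?" + urlencode(params)
```

Because the WHERE clause names nuxeo:isTrashed explicitly, such a query is expected to return trashed documents, which are otherwise filtered out by default.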
Use Cases
Document Capture Integration with Ephesoft
Ephesoft is an advanced document capture and data extraction solution to help businesses run more efficiently. It automatically classifies and extracts data from any type of document. Ephesoft has a CMIS interface, which eases the integration with Nuxeo.
Ephesoft has a CMIS import plugin and a CMIS export plugin so that it can ingest documents stored in Nuxeo to extract information and send back the extraction results to Nuxeo.
CMIS Import
Ephesoft monitors a specified folder for new files (as a hot folder) using a cron job, and processes any new document in an Ephesoft batch. Ephesoft uses a "technical" Nuxeo property to tag the document as processed, so that the same document is not processed twice (for example, a custom property called invoice:status passes from To process to Processed).
In the picture above:
Parameter | Value | Description |
---|---|---|
Server URL | http://localhost:8080/nuxeo/atom/cmis | Nuxeo CMIS URL |
Username | ephesoft | Username of the technical account used to create the connection between Nuxeo and Ephesoft. This user needs WRITE permission on the documents |
Password | mySecretPassword | Password of the technical account |
Repository Id | default | Generally default. You can read it from the file downloaded when you enter the CMIS Server URL in a web browser |
File Extension | pdf;tif | Cannot be changed |
Folder | default/domain/workspaces/folder1 | Folder path of the hot folder. The initial / should not be written |
Property | invoice:status | Property used by Ephesoft to check which documents have been processed |
Value | To process | Each document with invoice:status=To process will be sent to Ephesoft |
New Value | Processed | When Ephesoft processes a document, it updates invoice:status to Processed |
CMIS Version | 1.1 | Version of the CMIS implementation |
Enabled | true | To activate the CMIS Import |
Don't forget to activate the CMIS import by uncommenting the <import resource="classpath:/META-INF/applicationContext-dcma-mail-import.xml" /> line of the applicationContext.xml file, as the CMIS import is disabled by default.
CMIS Export
When a document is processed in Ephesoft, the platform has classified it and extracted its information. Instead of exporting the binary files to a filesystem folder along with their XML document (corresponding to the field properties), you can export them into Nuxeo with the CMIS export addon. The configuration is straightforward:
- Activate the CMIS Export in your Ephesoft batch class modules
- Map the Ephesoft property fields to the Nuxeo property fields in the DLF-attributes-mapping.properties file:
invoice=Invoice
invoice.number=invoice:number
invoice.details=dc:description
...
- Configure the CMIS export properties:
Here is the list of the most important properties:
Parameter | Value | Description |
---|---|---|
Cmis Root Folder Name | default-domain/workspaces/invoices_processed | Nuxeo folder where Ephesoft exports the documents and their properties. The initial / should not be written |
Cmis Upload File Extension | tif | It can be either tif or pdf |
Cmis Server URL | http://localhost:8080/nuxeo/atom/cmis | Nuxeo CMIS URL |
Cmis Server User Name | ephesoft | Username of the technical account used to create the connection between Nuxeo and Ephesoft. This user needs WRITE permission on the Cmis Root Folder Name to upload documents |
Cmis Server Switch ON/OFF | ON | If set to OFF, the CMIS export is disabled |
CMIS Export File Name | $BATCH_IDENTIFIER & _ & $DOCUMENT_ID | dc:title value of the exported document. Any extracted information from the document can be reused with the $ character |
Custom CMIS Integration
Watch this 15 min video presenting a custom case-processing application started from scratch, leveraging CMIS.
Resources
Source Code
- The Nuxeo OpenCMIS connector source code on GitHub: https://github.com/nuxeo/nuxeo-chemistry.
- The Apache Chemistry OpenCMIS source code on Apache's Subversion server: https://svn.apache.org/repos/asf/chemistry/opencmis/trunk.
Documentation
- CMIS 1.1 (HTML),
- CMIS 1.1 (PDF) (1.3 MB).
Slide Decks
- CMIS and Apache Chemistry, ApacheCon 2010 presentation on SlideShare