The Nuxeo CSV add-on enables users to bulk import documents into the Nuxeo Platform using a CSV file. With this add-on, you can create documents with their metadata filled in, import files with their main attachment, and create a tree structure.
Installing this add-on adds an "Import a CSV file" button for all users who have at least the Write right on any document in which it is possible to import a file. By default, this means workspaces and folders. If you configured other document types in which it is possible to import files, the "Import a CSV file" button can also be available on them (see the page How to Enable CSV Import on a Custom Document Type).
Installation
This add-on requires no specific installation steps. It can be installed like any other package from the Marketplace or from the Admin tab.
After the package is installed, users have an Import a CSV file button available in workspaces, folders and in any document where they can import files.
Configuration
The Nuxeo CSV add-on enables users to create file documents and upload their main attachment at the same time. This requires configuring where the server will look for the attachments. To do so, add the parameter nuxeo.csv.blobs.folder to the server's nuxeo.conf file and set its value to the path of a folder that the server can access.
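As an illustration, the corresponding nuxeo.conf entry might look like the following. The path is a hypothetical placeholder, not a default value; replace it with a folder that actually exists on your server and that the server process can read.

```properties
# Folder from which Nuxeo CSV reads attachments.
# Hypothetical path: adjust it to your own server.
nuxeo.csv.blobs.folder=/opt/nuxeo/csv-blobs
```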
CSV File Definition
The CSV file used to import documents in the Nuxeo Platform must respect the following rules:
- 1st line defines the properties that will be filled in,
- other lines define the documents to be imported,
- use a comma to separate properties,
- values must be between quotes,
- dates must use the format MM/dd/yyyy,
- for multi-valued metadata, such as contributors, use a pipe character (|) to separate the different values,
- for vocabulary values, use their id,
- lines defining the documents to import must define all properties specified on the 1st line, even empty ones (by using empty values).
Here is a simple example of the structure of a CSV file:
"name","type","dc:title","dc:description"
"my-file","File","My file","This is my file's description"
In the example above:
- name is the id of the document (used in the URL),
- type is the id of the document type (see the page How to Override Existing Document Types for some default types properties),
- dc:title and dc:description are the title and description fields of the document from the Dublin Core (dc) schema. They follow the schema:field formatting.
To have new lines in a field value (like dc:description), just write them as in the following CSV file example:
"name","type","dc:title","dc:description"
"a-file","File","A File","description with
some new
lines"
"another-file","File","Another File","description without new line"
Nuxeo CSV doesn't support complex properties, such as blob definition.
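If you generate the CSV file from a script rather than by hand, the rules above can be sketched in Python as follows. This is a minimal illustration, not part of the add-on: the helper names are made up for this example, dc:issued and dc:contributors are assumed property names, and alice and bob are hypothetical user names.

```python
import csv
import io
from datetime import date


def format_date(value):
    """Format a date as MM/dd/yyyy, the format the import expects."""
    return value.strftime("%m/%d/%Y")


def format_multi(values):
    """Join the values of a multi-valued property with a pipe character."""
    return "|".join(values)


def build_csv(header, rows):
    """Return CSV text with comma separators and every value quoted."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(header)
    for row in rows:
        # Each document line must give a value (possibly empty) for every property.
        if len(row) != len(header):
            raise ValueError(f"expected {len(header)} values, got {len(row)}: {row!r}")
        writer.writerow(row)
    return buffer.getvalue()


# dc:issued and dc:contributors are assumed property names used for illustration;
# "alice" and "bob" are hypothetical user names.
header = ["name", "type", "dc:title", "dc:description", "dc:issued", "dc:contributors"]
rows = [
    ["a-file", "File", "A File", "description with\nsome new\nlines",
     format_date(date(2024, 1, 31)), format_multi(["alice", "bob"])],
    ["another-file", "File", "Another File", "description without new line", "", ""],
]
print(build_csv(header, rows))
```

Quoting every value keeps embedded new lines and commas inside a single field, and the length check catches document lines that forget an empty value.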
Using Nuxeo CSV
Importing documents using Nuxeo CSV always follows the same few steps. Some specific use cases are explained below.
To import documents using Nuxeo CSV:
- Prepare the CSV file that defines the documents to import, following the rules explained in the CSV file definition section.
- In the Nuxeo Platform, go to the workspace or folder you want to import documents into.
- Click on the Import a CSV file button.
- Browse and select your CSV file.
- Optionally, check the box Send me the import report by email to receive an email showing how the import went once it is done. This is useful for long imports.
- Click on the Process button. The import starts. You can either:
  - wait for the import to be completed: a report of the import is then displayed;
  - start a new import;
  - browse the application.
If you checked the box Send me the import report by email, you receive an email once the import is completed.
Importing a Document Tree Structure
It is possible to import a hierarchy of documents using Nuxeo CSV. To do that, the name property is used to determine where the document should be created in the hierarchy of documents you are importing: its value is composed of the names of the document's parents followed by its own name, separated by /, forming a path.
Since the importer creates the documents in the order they are listed in the CSV file, make sure to declare each workspace or folder before the documents it will hold.
Here is an example of a CSV import that creates documents at the root of the workspace from which the import is started and in a child folder:
"name","type","dc:title","dc:description"
"folder","Folder","Folder in the workspace","The description of the folder created by CSV import"
"folder/doc-created-in-folder","File","Document created in a folder","The description of a file imported in a folder created by the import"
"doc1","File","Document 0","A document created directly in the workspace in which the import is started"
"doc2","File","Doc 1","A file document description, created at the same location as doc1"
"doc3","Note","Doc 2","A note document, created at the same location as doc1 and doc2"
You can use the attached file to test Nuxeo CSV to import a tree structure.
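Because the order of the lines matters, it can help to check it before importing. The following Python sketch is only an illustration with made-up helper names: it reorders rows so that shallower paths come first and reports the rows whose parent was not declared earlier in the file. It assumes every parent is created by the same CSV file, so a parent folder that already exists in the target workspace would be reported as missing.

```python
def parents_first(rows):
    """Order rows so that shallower paths come first.

    sorted() is stable, so rows at the same depth keep their relative order.
    """
    return sorted(rows, key=lambda row: row[0].count("/"))


def rows_missing_parent(rows):
    """Return the names whose parent path was not declared on an earlier row."""
    declared = set()
    missing = []
    for row in rows:
        name = row[0]
        parent = name.rpartition("/")[0]  # empty string for root-level documents
        if parent and parent not in declared:
            missing.append(name)
        declared.add(name)
    return missing


rows = [
    ["folder/doc-created-in-folder", "File", "Document created in a folder"],
    ["doc1", "File", "Document 0"],
    ["folder", "Folder", "Folder in the workspace"],
]
print(rows_missing_parent(rows))                 # ['folder/doc-created-in-folder']
print(rows_missing_parent(parents_first(rows)))  # []
```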
Importing Files
It is possible to create documents of type File and to upload their main attachment using Nuxeo CSV. This requires that your administrator has enabled it in the server configuration and that the binary files are placed in a folder that can be accessed by the server.
In your CSV file, use the file:content property in the 1st line and the name of your file on the document definition line.
"name","type","dc:title","dc:description","file:content"
"my-file","File","My file with uploaded attachment","This is a file with its attachment, created using Nuxeo CSV","my-file.doc"
You can use the attached ZIP sample to test the import of files.
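Before starting an import, you may want to check that every file named in the file:content column is actually present in the attachments folder. The Python sketch below is one possible way to do this, under two assumptions: the script runs on a machine that sees the same folder as the server, and file names are given relative to that folder. The function name and the folder path are hypothetical.

```python
import csv
import io
from pathlib import Path


def missing_attachments(csv_text, blobs_folder):
    """Return the file:content values that do not match a file in blobs_folder."""
    folder = Path(blobs_folder)
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = []
    for row in reader:
        file_name = row.get("file:content") or ""
        # An empty value means the document has no attachment to upload.
        if file_name and not (folder / file_name).is_file():
            missing.append(file_name)
    return missing


sample = (
    '"name","type","dc:title","file:content"\n'
    '"my-file","File","My file with uploaded attachment","my-file.doc"\n'
)
# Hypothetical folder: use the value configured for your server.
print(missing_attachments(sample, "/opt/nuxeo/csv-blobs"))
```

An empty list means every referenced attachment was found in the folder.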
Setting Life Cycle State When Creating Documents
It is possible to set the life cycle state when the document is created through Nuxeo CSV, using the ecm:currentLifeCycleState property. This property is ignored when updating documents.
"name","type","dc:description","dc:title","ecm:currentLifeCycleState"
"myfile","File","a simple file","My File","obsolete"