The Nuxeo AI package integrates machine learning services into the Nuxeo Platform. It can be used for several tasks, such as data enrichment.
See the GitHub README for a developer-oriented description of the project.
Concept
The Nuxeo AI package is a core system of streams that lets the Nuxeo Platform interact with AI services, whether they come from external suppliers or from Nuxeo itself. These services can be used in many ways within the platform.
The core of the system is a sequence of processors connected by streams. At the head of the process, a filtering system selects the documents to be processed. The next step calls the AI service to apply a classification to the data. The final step handles the data returned by the AI service and transforms it into the form needed by the Nuxeo Platform.
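The three steps described above can be sketched as chained generators. This is only an illustration of the filter, classify, and handle sequence, not the package's actual implementation: `Document`, `filter_documents`, `call_ai_service`, `save_results`, and `fake_classifier` are all hypothetical names, and the classifier is a placeholder for a real AI service call.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Iterator, List, Tuple


@dataclass
class Document:
    """Hypothetical stand-in for a Nuxeo document."""
    doc_id: str
    doc_type: str
    tags: List[str] = field(default_factory=list)


Classifier = Callable[[Document], List[str]]


def filter_documents(docs: Iterable[Document], doc_type: str) -> Iterator[Document]:
    """Head of the pipeline: keep only documents of the requested type."""
    return (doc for doc in docs if doc.doc_type == doc_type)


def call_ai_service(docs: Iterable[Document],
                    classify: Classifier) -> Iterator[Tuple[Document, List[str]]]:
    """Middle step: pair each document with the labels the service returns."""
    for doc in docs:
        yield doc, classify(doc)


def save_results(results: Iterable[Tuple[Document, List[str]]]) -> Iterator[Document]:
    """Final step: store the returned labels on the document as tags."""
    for doc, labels in results:
        doc.tags.extend(labels)
        yield doc


def fake_classifier(doc: Document) -> List[str]:
    """Placeholder for a real AI service; returns one fixed label per document."""
    return ["label-for-" + doc.doc_id]


def run_pipeline(docs: Iterable[Document], doc_type: str,
                 classify: Classifier) -> List[Document]:
    """Connect the three steps and drain the pipeline."""
    filtered = filter_documents(docs, doc_type)
    return list(save_results(call_ai_service(filtered, classify)))
```

Calling `run_pipeline(docs, "Picture", fake_classifier)` on a mixed list would tag only the documents whose type is `Picture` and leave the others untouched.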
The first use of the Core AI streams system is to enrich data in new or existing documents. We filter on the event raised for new documents of a specific document type, call a classification system, and use the results to enrich the document via tags or a specific facet. This makes the data easier to search.
The system is composed of a core package, called nuxeo-ai-core, and extension packages that implement extensions for using external services.
Core AI
Installation
This addon requires no specific installation steps. It can be installed like any other package, either with the nuxeoctl command line or from the Update Center.
Nuxeo Configuration
You can set these parameters in your nuxeo.conf:
Parameter | Description | Default value | Since |
---|---|---|---|
nuxeo.ai.images.enabled | Create a stream for creation/modification of images. | false | 1.0 |
nuxeo.ai.video.enabled | Create a stream for creation/modification of video files. | false | 1.0 |
nuxeo.ai.audio.enabled | Create a stream for creation/modification of audio files. | false | 1.0 |
nuxeo.ai.text.enabled | Create a stream for text extracted from blobs. | false | 1.0 |
nuxeo.ai.stream.config.name | The name of the stream log config. | pipes | 1.0 |
nuxeo.enrichment.source.stream | The name of the stream that receives enrichment data. | enrichment.in | 1.0 |
nuxeo.enrichment.save.tags | Should enrichment labels be saved as standard Nuxeo tags? | false | 1.0 |
nuxeo.enrichment.save.facets | Should enrichment data be saved as a document facet? | true | 1.0 |
nuxeo.enrichment.raiseEvent | Should an enrichmentMetadataCreated event be raised when new enrichment data is added to the stream? | true | 1.0 |
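To see how the defaults in the table above combine with explicit settings, here is a minimal sketch that reads `key=value` lines and reports which default streams would be switched on. It is not how Nuxeo itself parses nuxeo.conf; `effective_settings` and `enabled_streams` are hypothetical helpers, and only the boolean parameters from the table are modeled.

```python
from typing import Dict, List

# Defaults copied from the parameter table above (boolean parameters only).
DEFAULTS: Dict[str, str] = {
    "nuxeo.ai.images.enabled": "false",
    "nuxeo.ai.video.enabled": "false",
    "nuxeo.ai.audio.enabled": "false",
    "nuxeo.ai.text.enabled": "false",
    "nuxeo.enrichment.save.tags": "false",
    "nuxeo.enrichment.save.facets": "true",
    "nuxeo.enrichment.raiseEvent": "true",
}


def effective_settings(conf_text: str) -> Dict[str, str]:
    """Overlay key=value lines from a conf file on top of the defaults."""
    settings = dict(DEFAULTS)
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() in settings:
            settings[key.strip()] = value.strip()
    return settings


def enabled_streams(settings: Dict[str, str]) -> List[str]:
    """List the stream names (images, video, audio, text) set to true."""
    return [
        key.split(".")[2]
        for key, value in settings.items()
        if key.endswith(".enabled") and value.lower() == "true"
    ]
```

For example, a file containing only `nuxeo.ai.images.enabled=true` yields one enabled stream, `images`, while `nuxeo.enrichment.save.facets` keeps its default of `true`.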
Core AI Streams
Core AI allows you to customize a series of streams and processors. It provides four default document streams, which can be activated with the configuration parameters shown above.
- images - When an image is added to a document.
- videos - When a video is added to a document.
- audio - When an audio file is added to a document.
- text - When binary text is extracted from a document.
These allow you to start your processing chain quickly.
Extensions
Core AI exposes multiple extension points for its processors.
The initial release includes:
- nuxeo-ai-aws: a package that connects to the machine learning services supplied by Amazon.
- nuxeo-ai-image-quality: a package that uses Sightengine.
AWS
As part of the initial release, we have a set of extensions for Amazon Web Services.
These include Rekognition, Comprehend and Translate.
See the GitHub Readme for more technical details and all the services that are currently available with this extension.
Before You Start
You should be familiar with Amazon Web Services and be in possession of your credentials.
Big Picture
Specifying Your Amazon Credentials and Region
Credentials are discovered using nuxeo-runtime-aws.
The chain searches for credentials in order: Nuxeo's AWSConfigurationService, environment variables, system properties, profile credentials, EC2Container credentials.
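The "first source that has credentials wins" behavior of such a chain can be illustrated with a short sketch. This is not the nuxeo-runtime-aws code: `from_mapping` and `resolve` are hypothetical helpers, only two of the listed sources are modeled, and plain dictionaries stand in for nuxeo.conf and the process environment so the example stays deterministic.

```python
from typing import Callable, List, Mapping, Optional, Sequence, Tuple

Credentials = Tuple[str, str]  # (access key id, secret key)
Provider = Callable[[], Optional[Credentials]]


def from_mapping(values: Mapping[str, str], key_name: str, secret_name: str) -> Provider:
    """Build a provider that reads both values from a mapping, or returns None."""
    def provider() -> Optional[Credentials]:
        key, secret = values.get(key_name), values.get(secret_name)
        return (key, secret) if key and secret else None
    return provider


def resolve(providers: Sequence[Tuple[str, Provider]]) -> Tuple[str, Credentials]:
    """Return the name and credentials of the first provider that has them."""
    for name, provider in providers:
        credentials = provider()
        if credentials is not None:
            return name, credentials
    raise LookupError("no credentials found in any provider")


# Stand-ins for the real sources (assumption: nothing is set in nuxeo.conf).
nuxeo_conf: Mapping[str, str] = {}
environment: Mapping[str, str] = {
    "AWS_ACCESS_KEY_ID": "key-from-env",
    "AWS_SECRET_ACCESS_KEY": "secret-from-env",
}

chain: List[Tuple[str, Provider]] = [
    ("nuxeo configuration",
     from_mapping(nuxeo_conf, "nuxeo.aws.accessKeyId", "nuxeo.aws.secretKey")),
    ("environment variables",
     from_mapping(environment, "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")),
]

source, credentials = resolve(chain)
```

Because the configuration mapping is empty here, the lookup falls through to the environment variables; filling in both `nuxeo.aws.*` values would make the first provider win instead.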
In nuxeo.conf, add the following lines:
nuxeo.aws.accessKeyId=your_AWS_ACCESS_KEY_ID
nuxeo.aws.secretKey=your_AWS_SECRET_ACCESS_KEY
nuxeo.aws.sessionToken=your_AWS_SESSION_TOKEN
nuxeo.aws.region=your_AWS_REGION
The region code can be found in the S3 Region Documentation. The default is us-east-1. At the time this documentation was written, the list was:
- us-east-1: US East (N. Virginia) (default)
- us-east-2: US East (Ohio)
- us-west-1: US West (N. California)
- us-west-2: US West (Oregon)
- eu-west-1: EU (Ireland)
- eu-west-2: EU (London)
- eu-west-3: EU (Paris)
- eu-central-1: EU (Frankfurt)
- ap-south-1: Asia Pacific (Mumbai)
- ap-southeast-1: Asia Pacific (Singapore)
- ap-southeast-2: Asia Pacific (Sydney)
- ap-northeast-1: Asia Pacific (Tokyo)
- ap-northeast-2: Asia Pacific (Seoul)
- ap-northeast-3: Asia Pacific (Osaka-Local)
- sa-east-1: South America (São Paulo)
- ca-central-1: Canada (Central)
- cn-north-1: China (Beijing)
- cn-northwest-1: China (Ningxia)
If you are only processing images and an S3 BinaryManager is already in use, the existing image data is reused: a reference to the stored binary is passed instead of uploading it again.
Installation
This addon requires no specific installation steps. It can be installed like any other package, either with the nuxeoctl command line or from the Update Center.
Quick Start
- Install the nuxeo-ai-aws package.
  ./bin/nuxeoctl mp-install nuxeo-ai-aws
- Add the following parameters to nuxeo.conf.
  nuxeo.ai.images.enabled=true
  nuxeo.ai.text.enabled=true
  nuxeo.enrichment.aws.images=true
  nuxeo.enrichment.aws.text=true
  nuxeo.enrichment.save.tags=true
  nuxeo.enrichment.save.facets=true
  nuxeo.enrichment.raiseEvent=true
- Set your AWS credentials, as described in Specifying Your Amazon Credentials and Region.
- Start Nuxeo and upload an image.
- Wait 10 seconds, then look at the document tags and the enrichment:items facet in the document JSON.
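For the last step, once you have the document JSON in hand, a small helper can pull out the enrichment data. The sketch below assumes the JSON nests properties under a top-level `properties` object keyed by `enrichment:items`; the contents of each item are not specified here, so the sample payload is made up and `enrichment_items` is a hypothetical helper name.

```python
import json
from typing import Any, List


def enrichment_items(document_json: str) -> List[Any]:
    """Return the enrichment:items property of a document, or an empty list."""
    document = json.loads(document_json)
    items = document.get("properties", {}).get("enrichment:items")
    return items if isinstance(items, list) else []


# Made-up payload: only the "properties" / "enrichment:items" nesting matters.
sample = json.dumps({
    "uid": "hypothetical-id",
    "properties": {
        "enrichment:items": [{"placeholder": "item contents are not specified here"}],
    },
})

items = enrichment_items(sample)
```

An empty list means either that enrichment has not run yet or that the property was not included in the JSON you fetched.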
Nuxeo Configuration
You can set these parameters in your nuxeo.conf. They are used in combination with the other configuration parameters for nuxeo-ai-core shown above.
Parameter | Description | Default value | Since |
---|---|---|---|
nuxeo.enrichment.aws.images | Run AWS enrichment services on images. | false | 1.0 |
nuxeo.enrichment.aws.text | Run AWS enrichment services on text. | false | 1.0 |
Image Quality
An implementation of an enrichment service that uses Sightengine.
See the GitHub Readme for additional technical details.
Before You Start
Register with Sightengine to obtain your apiKey and apiSecret.
Big Picture
Installation
This addon requires no specific installation steps. It can be installed like any other package, either with the nuxeoctl command line or from the Update Center.
Quick Start
- Install the nuxeo-ai-image-quality package.
  ./bin/nuxeoctl mp-install nuxeo-ai-image-quality
- Add the following parameters to nuxeo.conf:
  nuxeo.ai.images.enabled=true
  nuxeo.enrichment.save.tags=true
  nuxeo.enrichment.save.facets=true
  nuxeo.enrichment.raiseEvent=true
  nuxeo.ai.sightengine.apiKey=YOUR_API_KEY
  nuxeo.ai.sightengine.apiSecret=YOUR_API_SECRET
- Start Nuxeo and upload an image.
- Wait 10 seconds, then look at the document tags and the enrichment:items facet in the document JSON.
Nuxeo Configuration
You can set these parameters in your nuxeo.conf. They are used in combination with the other configuration parameters for nuxeo-ai-core shown above.
Parameter | Description | Default value | Since |
---|---|---|---|
nuxeo.ai.sightengine.apiKey | The API key for Sightengine. | | 1.0 |
nuxeo.ai.sightengine.apiSecret | The API secret for Sightengine. | | 1.0 |
nuxeo.ai.sightengine.all | Configure an enrichment service to process the images stream and call all Sightengine models. | true | 1.0 |