Nuxeo AI

Updated: December 4, 2024

The Nuxeo AI package integrates machine learning services into the Nuxeo Platform. It can be used for several tasks, such as data enrichment.

See the GitHub README for the developer-oriented project description.

Concept

The Nuxeo AI package is a core system of streams that allows the Nuxeo Platform to interact with AI services, whether external (from third-party suppliers) or internal (from Nuxeo). These services can be used in many ways within the platform.

The core of the system is a sequence of processors connected by streams. At the head of the process, a filtering system selects the documents to be processed. The next step calls the AI service to classify the data. The final step handles the data returned by the AI service and transforms it into the form needed by the Nuxeo Platform.

The first use of the Core AI streams system is to enrich data in existing or new documents. We filter on new-document events for a specific document type, call a classification service, and use the results to enrich the document via tags or a specific facet. This makes the data easier to search.
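The three stages described above (filter, call the AI service, handle the result) can be sketched in a few lines. This is only an illustration of the flow, not the Nuxeo API: the `Document` class, the stage functions and `fake_classifier` are hypothetical names, and the real system connects its processors with streams rather than in-memory generators.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Iterator


@dataclass
class Document:
    """Hypothetical stand-in for a Nuxeo document; not the real Nuxeo API."""
    doc_id: str
    doc_type: str
    tags: set[str] = field(default_factory=set)


def select(events: Iterable[Document], wanted_type: str) -> Iterator[Document]:
    """Stage 1: keep only documents of the requested document type."""
    return (doc for doc in events if doc.doc_type == wanted_type)


def enrich(docs: Iterable[Document],
           classify: Callable[[Document], list[str]]) -> Iterator[tuple[Document, list[str]]]:
    """Stage 2: call the AI service (external or internal) for each document."""
    for doc in docs:
        yield doc, classify(doc)


def save_as_tags(results: Iterable[tuple[Document, list[str]]]) -> list[Document]:
    """Stage 3: turn the returned labels into tags on the document."""
    saved = []
    for doc, labels in results:
        doc.tags.update(label.lower() for label in labels)
        saved.append(doc)
    return saved


def fake_classifier(doc: Document) -> list[str]:
    # Placeholder for a real AI service call; the labels are made up.
    return ["Landscape", "Mountain"]


events = [Document("1", "Picture"), Document("2", "File")]
enriched = save_as_tags(enrich(select(events, "Picture"), fake_classifier))
print([(d.doc_id, sorted(d.tags)) for d in enriched])
```

Only the "Picture" document passes the filter, so it is the only one that reaches the classifier and receives tags.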

The system is composed of a core package, called nuxeo-ai-core, and extension packages that integrate external services.

Core AI

Installation

This addon requires no specific installation steps. It can be installed like any other package, either with the nuxeoctl command line or from the Update Center.

Nuxeo Configuration

You can set these in your nuxeo.conf:

Parameter | Description | Default value | Since
nuxeo.ai.images.enabled | Create a stream for creation/modification of images. | false | 1.0
nuxeo.ai.video.enabled | Create a stream for creation/modification of video files. | false | 1.0
nuxeo.ai.audio.enabled | Create a stream for creation/modification of audio files. | false | 1.0
nuxeo.ai.text.enabled | Create a stream for text extracted from blobs. | false | 1.0
nuxeo.ai.stream.config.name | The name of the stream log config. | pipes | 1.0
nuxeo.enrichment.source.stream | The name of the stream that receives enrichment data. | enrichment.in | 1.0
nuxeo.enrichment.save.tags | Should enrichment labels be saved as standard Nuxeo tags? | false | 1.0
nuxeo.enrichment.save.facets | Should enrichment data be saved as a document facet? | true | 1.0
nuxeo.enrichment.raiseEvent | Should an enrichmentMetadataCreated event be raised when new enrichment data is added to the stream? | true | 1.0
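To see how the defaults combine with the values you set, here is a minimal sketch that overlays a nuxeo.conf-style fragment on the documented defaults and reports which document streams end up enabled. The parser is a plain key=value reader written for this example, not the one Nuxeo uses, and the function names are hypothetical.

```python
# Defaults copied from the parameter table above.
DEFAULTS = {
    "nuxeo.ai.images.enabled": "false",
    "nuxeo.ai.video.enabled": "false",
    "nuxeo.ai.audio.enabled": "false",
    "nuxeo.ai.text.enabled": "false",
    "nuxeo.ai.stream.config.name": "pipes",
    "nuxeo.enrichment.source.stream": "enrichment.in",
    "nuxeo.enrichment.save.tags": "false",
    "nuxeo.enrichment.save.facets": "true",
    "nuxeo.enrichment.raiseEvent": "true",
}


def parse_conf(text: str) -> dict[str, str]:
    """Parse key=value lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def effective_settings(conf_text: str) -> dict[str, str]:
    """Overlay the values found in the file on top of the documented defaults."""
    settings = dict(DEFAULTS)
    overrides = parse_conf(conf_text)
    settings.update({k: v for k, v in overrides.items() if k in DEFAULTS})
    return settings


sample = """
# Enable only the image stream and save labels as tags
nuxeo.ai.images.enabled=true
nuxeo.enrichment.save.tags=true
"""
settings = effective_settings(sample)
enabled_streams = sorted(
    key.split(".")[2]
    for key, value in settings.items()
    if key.startswith("nuxeo.ai.") and key.endswith(".enabled") and value == "true"
)
print(enabled_streams)
```

With this sample, only the image stream is enabled and every parameter that is not set keeps its default value.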

Core AI Streams

Core AI allows you to customize a series of streams and processors. By default, it provides four document streams that can be activated with the configuration parameters shown above.

  • images - When an image is added to a document.
  • videos - When a video is added to a document.
  • audio - When an audio file is added to a document.
  • text - When binary text is extracted from a document.

These allow you to start your processing chain quickly.

Extensions

Core AI provides multiple extension points for its processors.
The initial release includes the following extensions:

AWS

As part of the initial release, we have a set of extensions for Amazon Web Services.
These include Rekognition, Comprehend and Translate.
See the GitHub Readme for more technical details and all the services that are currently available with this extension.

Before You Start

You should be familiar with Amazon Web Services and be in possession of your credentials.

Big Picture

Specifying Your Amazon Credentials and Region

Credentials are discovered using nuxeo-runtime-aws.
The chain searches for credentials in order: Nuxeo's AWSConfigurationService, environment variables, system properties, profile credentials, EC2Container credentials.
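The lookup order described above follows a common "first provider that answers wins" pattern. The sketch below illustrates that pattern with only the first two sources in the chain. It is not the nuxeo-runtime-aws implementation: the provider functions and the shape of the returned dictionary are hypothetical, while `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are the standard AWS environment variable names.

```python
from typing import Callable, Mapping, Optional

Credentials = dict  # hypothetical shape, for illustration only
Provider = Callable[[], Optional[Credentials]]


def nuxeo_conf_provider(conf: Mapping[str, str]) -> Provider:
    """Credentials set explicitly through the nuxeo.aws.* properties."""
    def provider() -> Optional[Credentials]:
        key = conf.get("nuxeo.aws.accessKeyId")
        secret = conf.get("nuxeo.aws.secretKey")
        if key and secret:
            return {"source": "nuxeo.conf", "access_key_id": key, "secret_key": secret}
        return None
    return provider


def environment_provider(env: Mapping[str, str]) -> Provider:
    """Credentials taken from the standard AWS environment variables."""
    def provider() -> Optional[Credentials]:
        key = env.get("AWS_ACCESS_KEY_ID")
        secret = env.get("AWS_SECRET_ACCESS_KEY")
        if key and secret:
            return {"source": "environment", "access_key_id": key, "secret_key": secret}
        return None
    return provider


def resolve_credentials(providers: list[Provider]) -> Optional[Credentials]:
    """Ask each provider in order and return the first non-empty answer."""
    for provider in providers:
        credentials = provider()
        if credentials is not None:
            return credentials
    return None


# nuxeo.conf has no AWS keys here, so the chain falls through to the environment.
conf = {}
env = {"AWS_ACCESS_KEY_ID": "example-id", "AWS_SECRET_ACCESS_KEY": "example-secret"}
found = resolve_credentials([nuxeo_conf_provider(conf), environment_provider(env)])
print(found["source"])
```

Because the providers are tried in order, a value set in nuxeo.conf would take precedence over the environment variables in this sketch.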

In nuxeo.conf, add the following lines:

nuxeo.aws.accessKeyId=your_AWS_ACCESS_KEY_ID
nuxeo.aws.secretKey=your_AWS_SECRET_ACCESS_KEY
nuxeo.aws.sessionToken=your_AWS_SESSION_TOKEN
nuxeo.aws.region=your_AWS_REGION

If your Nuxeo instance runs on Amazon EC2 or Amazon ECS, you can also transparently use IAM instance roles, in which case you do not need to specify the AWS ID and secret (the credentials will be fetched automatically from the instance metadata). The same applies to the region.

Region codes are listed in the S3 Region Documentation. The default is us-east-1. At the time this documentation was written, the available regions were:

  • us-east-1: US East (N. Virginia) (default)
  • us-east-2: US East (Ohio)
  • us-west-1: US West (N. California)
  • us-west-2: US West (Oregon)
  • eu-west-1: EU (Ireland)
  • eu-west-2: EU (London)
  • eu-west-3: EU (Paris)
  • eu-central-1: EU (Frankfurt)
  • ap-south-1: Asia Pacific (Mumbai)
  • ap-southeast-1: Asia Pacific (Singapore)
  • ap-southeast-2: Asia Pacific (Sydney)
  • ap-northeast-1: Asia Pacific (Tokyo)
  • ap-northeast-2: Asia Pacific (Seoul)
  • ap-northeast-3: Asia Pacific (Osaka-Local)
  • sa-east-1: South America (São Paulo)
  • ca-central-1: Canada (Central)
  • cn-north-1: China (Beijing)
  • cn-northwest-1: China (Ningxia)

If you only process images and an S3 BinaryManager is already configured, the extension reuses the image stored in S3 and passes a reference to it instead of uploading the binary again.
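The Amazon Rekognition API accepts an image either as inline bytes or as a reference to an S3 object, which is what makes this reuse possible. The sketch below builds the `Image` argument in the shape that Rekognition clients such as boto3's `detect_labels` expect. How the extension actually decides between the two forms is an assumption here, and `rekognition_image` is a hypothetical helper.

```python
from typing import Optional


def rekognition_image(blob_bytes: bytes,
                      s3_bucket: Optional[str] = None,
                      s3_key: Optional[str] = None) -> dict:
    """Build the Image argument for a Rekognition call.

    If the binary is already stored in S3, pass a reference to it;
    otherwise fall back to sending the raw bytes with the request.
    """
    if s3_bucket and s3_key:
        return {"S3Object": {"Bucket": s3_bucket, "Name": s3_key}}
    return {"Bytes": blob_bytes}


# Binary already held by an S3 binary manager: only a reference is sent.
by_reference = rekognition_image(b"", s3_bucket="example-bucket", s3_key="example-digest")
# Binary stored elsewhere: the bytes themselves are sent.
inline = rekognition_image(b"\x89PNG...")
print(by_reference)
print(sorted(inline))
```

Passing a reference avoids transferring the image a second time, which matters most for large files.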

Installation

This addon requires no specific installation steps. It can be installed like any other package, either with the nuxeoctl command line or from the Update Center.

Quick Start
  1. Install the nuxeo-ai-aws package.

    ./bin/nuxeoctl mp-install nuxeo-ai-aws
    
  2. Add the following parameters to nuxeo.conf.

    nuxeo.ai.images.enabled=true
    nuxeo.ai.text.enabled=true
    nuxeo.enrichment.aws.images=true
    nuxeo.enrichment.aws.text=true
    nuxeo.enrichment.save.tags=true
    nuxeo.enrichment.save.facets=true
    nuxeo.enrichment.raiseEvent=true
    
  3. Set your AWS credentials.
  4. Start Nuxeo and upload an image.
  5. Wait about 10 seconds, then check the document tags and the enrichment:items facet in the document JSON.
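Rather than waiting a fixed time in the last step, you can poll the document until the enrichment data shows up. The helper below is a sketch under a few assumptions: `fetch_document` is a hypothetical callable that returns the document JSON as a dictionary (for example, from a REST call you write yourself), the enrichment data is expected under `properties["enrichment:items"]`, and the item content used in the demo is a placeholder, not the real facet structure.

```python
import time
from typing import Callable, Optional


def wait_for_enrichment(fetch_document: Callable[[], dict],
                        timeout: float = 30.0,
                        interval: float = 2.0,
                        sleep: Callable[[float], None] = time.sleep) -> Optional[list]:
    """Poll until the document JSON has a non-empty enrichment:items property.

    Returns the items, or None if nothing appeared before the timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        properties = fetch_document().get("properties", {})
        items = properties.get("enrichment:items")
        if items:
            return items
        if time.monotonic() >= deadline:
            return None
        sleep(interval)


# Demo with canned responses: the first fetch has no enrichment yet,
# the second one does. The item content is a placeholder.
responses = iter([
    {"properties": {}},
    {"properties": {"enrichment:items": [{"placeholder": "label data"}]}},
])
items = wait_for_enrichment(lambda: next(responses), sleep=lambda _: None)
print(items)
```

The `sleep` parameter is injectable only so the demo runs instantly; in real use the default `time.sleep` applies.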

Nuxeo Configuration

You can set these in your nuxeo.conf. They are used in combination with the other configuration parameters for nuxeo-ai-core shown above.

Parameter | Description | Default value | Since
nuxeo.enrichment.aws.images | Run AWS enrichment services on images. | false | 1.0
nuxeo.enrichment.aws.text | Run AWS enrichment services on text. | false | 1.0

Image Quality

This extension implements an enrichment service that uses Sightengine.
See the GitHub Readme for additional technical details.

Before You Start

Register with Sightengine to obtain your apiKey and apiSecret.

Big Picture

Installation

This addon requires no specific installation steps. It can be installed like any other package, either with the nuxeoctl command line or from the Update Center.

Quick Start
  1. Install the nuxeo-ai-image-quality package.

    ./bin/nuxeoctl mp-install nuxeo-ai-image-quality
    
  2. Add the following parameters to nuxeo.conf:

    nuxeo.ai.images.enabled=true
    nuxeo.enrichment.save.tags=true
    nuxeo.enrichment.save.facets=true
    nuxeo.enrichment.raiseEvent=true
    nuxeo.ai.sightengine.apiKey=YOUR_API_KEY
    nuxeo.ai.sightengine.apiSecret=YOUR_API_SECRET
    
  3. Start Nuxeo and upload an image.
  4. Wait about 10 seconds, then check the document tags and the enrichment:items facet in the document JSON.

Nuxeo Configuration

You can set these in your nuxeo.conf. They are used in combination with the other configuration parameters for nuxeo-ai-core shown above.

Parameter | Description | Default value | Since
nuxeo.ai.sightengine.apiKey | The API key for Sightengine. | (none) | 1.0
nuxeo.ai.sightengine.apiSecret | The API secret for Sightengine. | (none) | 1.0
nuxeo.ai.sightengine.all | Configure an enrichment service to process the images stream and call all Sightengine models. | true | 1.0