Since Nuxeo EP 5.2.1, Nuxeo can be clustered across several nodes (machines) with the appropriate configuration when the VCS backend is used. In most cases, an HTTP load balancer with session affinity should also be placed in front of the instances.

VCS configuration

Nuxeo 5.4

To set up clustering, set the repository.clustering.enabled, repository.clustering.delay and repository.binary.store parameters in nuxeo.conf. On all Nuxeo instances, repository.binary.store should point to a shared filesystem.
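As a quick sanity check before starting a node, the three parameters named above can be verified with a short script. This is only a sketch: the property names come from this page, while the sample values (true, 1000, the /mnt/shared/... path) and the helper names are assumptions made for illustration.

```python
# Property names taken from this page; everything else is illustrative.
REQUIRED_KEYS = (
    "repository.clustering.enabled",
    "repository.clustering.delay",
    "repository.binary.store",
)


def parse_conf(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def missing_cluster_keys(text):
    """Return the clustering keys that are absent or empty in the given text."""
    settings = parse_conf(text)
    return [key for key in REQUIRED_KEYS if not settings.get(key)]


# Hypothetical nuxeo.conf extract; the values are assumptions.
SAMPLE = """\
# clustering
repository.clustering.enabled=true
repository.clustering.delay=1000
repository.binary.store=/mnt/shared/nuxeo/binaries
"""

print(missing_cluster_keys(SAMPLE))
```

Running the same check against the nuxeo.conf of every node is a cheap way to spot a node that was left out of the cluster configuration.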

Earlier Nuxeo versions

For earlier Nuxeo versions without nuxeo.conf configuration, enable clustering by configuring two identical Nuxeo instances whose datasources point to the same database server and which share the same filesystem for the NXRuntime/binaries folder.

On Nuxeo 5.2.1, this is achieved by sharing the complete JBoss data folder using the JVM system property definition -Djboss.server.data.dir=/some/shared/folder in the run.conf or run.bat JVM options (the folder can be on a SAN, an NFS mount, sshfs or a Windows/Samba share, for instance). On Nuxeo 5.3 and later, it is recommended to share only the binaries folder, as explained below.

On all nodes, add the clustering configuration to the nuxeo.ear/config/default-repository-config.xml file (it may have a different but similar name):

The delay is expressed in milliseconds, and specifies a delay during which invalidations don't need to be processed. This is an important optimization as otherwise every single transaction, even a read-only one, would have to hit the database to check invalidations.
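The effect of the delay can be illustrated with a small throttle. This is a sketch of the idea only, not Nuxeo's actual implementation: the class name, the use of a caller-supplied clock and the choice of >= for the comparison are all assumptions.

```python
class InvalidationThrottle:
    """Decide whether a transaction needs to check the database for invalidations."""

    def __init__(self, delay_ms):
        self.delay_ms = delay_ms
        self.last_check_ms = None  # no check has happened yet

    def should_check(self, now_ms):
        """Return True only when at least delay_ms elapsed since the last check."""
        if self.last_check_ms is None or now_ms - self.last_check_ms >= self.delay_ms:
            self.last_check_ms = now_ms
            return True
        return False


throttle = InvalidationThrottle(delay_ms=1000)
# Transactions arriving at 0 ms, 500 ms, 1000 ms and 1999 ms.
print([throttle.should_check(t) for t in (0, 500, 1000, 1999)])
```

With a delay of 1000, the transactions at 500 ms and 1999 ms skip the database round trip, which is the saving described above. The trade-off is that a node may see changes made on another node up to one delay later.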

Alternatively, if you are running Nuxeo EP version >= 5.3.0 (nuxeo-core version >= 1.6.0), you can configure the repository to use a shared folder for the binaries store without having to relocate the whole JBoss data directory:

Under Windows, the binaryStore path value can be a UNC path, e.g.:

There is a dedicated page detailing all the VCS configuration options.

Testing the SQL table initialization

Start the SQL server, then the Nuxeo nodes (the first one alone and the others afterwards, to avoid concurrent initialization of the SQL tables), then the load balancer. Log in to the HTTP user interface on each cluster node, then check in the database that the cluster_nodes table is initialized with one line per node:
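The row count can also be checked from a script. The sketch below works with any Python DB-API connection; it is demonstrated against an in-memory SQLite database purely so that it runs on its own. The table name cluster_nodes comes from this page, but the column names and the sample rows in the demo are assumptions.

```python
import sqlite3


def count_cluster_nodes(conn):
    """Return the number of rows in cluster_nodes for a DB-API connection."""
    cur = conn.cursor()
    cur.execute("SELECT COUNT(*) FROM cluster_nodes")
    return cur.fetchone()[0]


def demo_connection():
    """Build a stand-in database; column names and rows are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cluster_nodes (nodeid INTEGER, created TEXT)")
    conn.executemany(
        "INSERT INTO cluster_nodes VALUES (?, ?)",
        [(1, "2011-01-01 10:00:00"), (2, "2011-01-01 10:05:00")],
    )
    return conn


expected_nodes = 2  # number of Nuxeo nodes you started
actual = count_cluster_nodes(demo_connection())
print("cluster_nodes rows:", actual, "expected:", expected_nodes)
```

In a real check, replace demo_connection() with a connection to the database your datasources point to.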

Testing VCS cache invalidation

Create a document and browse it from two different nodes. Edit the title from one node, then navigate back to the document from the second node to check that the change is visible. You can also monitor the cluster_invals table to see cache invalidation information.

Quartz scheduler cluster (since 5.4.2)

The Quartz scheduler should be configured to run in a cluster. First, populate the database with the tables needed by Quartz; the DDL scripts are available in the Quartz distribution (download catalog: http://www.quartz-scheduler.org/download/download-catalog.html). Then configure two new datasources for Quartz in the container, named nxquartz and nxquartz_no_tx. Finally, tell Quartz about the cluster environment by adding the Quartz cluster configuration to the config folder of Nuxeo. Note that you should select the delegate class according to the database you are using.

Since Nuxeo 5.4.2, the default templates for Tomcat include a configuration suitable for PostgreSQL, Oracle and SQL Server. Depending on the database you have selected, add the matching database-quartz-cluster template after the database template you are using.

For example, if you are using PostgreSQL, your nuxeo.conf should contain: nuxeo.templates=postgresql, postgresql-quartz-cluster
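The naming rule is simply the database template name followed by -quartz-cluster. A minimal sketch, assuming the database template name is passed in as a string and that the entries are joined with a bare comma (the helper names are hypothetical):

```python
def templates_with_quartz_cluster(db_template):
    """Return the database template followed by its quartz-cluster template."""
    return [db_template, db_template + "-quartz-cluster"]


def nuxeo_templates_line(db_template):
    """Build the nuxeo.templates line for nuxeo.conf."""
    return "nuxeo.templates=" + ",".join(templates_with_quartz_cluster(db_template))


print(nuxeo_templates_line("postgresql"))
```

The order matters: the quartz-cluster template comes after the database template it extends.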

The first time you run the server, Quartz will report an error because the Quartz tables are missing from the database. To fix this, manually execute the script located in 'bin/create-quartz-table.sql' on the database you are connecting to.

In cluster mode the schedule contributions MUST be the same on all nodes.

HTTP load balancer

Set up an HTTP or AJP load balancer such as Apache with mod_proxy / mod_proxy_ajp or Pound, and configure it to keep session affinity by tracking the value of the "JSESSIONID" cookie and the ";jsessionid" URL parameter.

If you use a stateless load balancer, such as the Apache modules mod_jk and mod_proxy_balancer, you need to make JBoss generate JSESSIONID cookies with values that end with ".route_n", where route_n is a string suffix specific to each node.
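To make the suffix convention concrete, here is how a balancer-like component could recover the route from a session id. This is an illustration of the convention only, not the code of mod_jk or mod_proxy_balancer; the function name and the sample session ids are made up.

```python
def route_of(session_id):
    """Return the suffix after the last '.', or None if there is no route."""
    head, sep, route = session_id.rpartition(".")
    if sep and head and route:
        return route
    return None


# Hypothetical JSESSIONID values.
print(route_of("0A1B2C3D4E5F.node1"))
print(route_of("0A1B2C3D4E5F"))
```

A session id without a suffix gives the balancer nothing to route on, which is why every node needs its own route configured.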

To do so, edit the server.xml file of each node's JBoss instance (e.g. in server/default/deploy/jboss-web.deployer/ or server/default/deploy/jbossweb-tomcat55.sar/) and add the connectionTimeout and jvmRoute attributes to the Connector and Engine tags as follows:

Then configure your stateless balancer to follow those routes. For instance, here is the relevant configuration fragment when using mod_proxy_balancer:

Troubleshooting session affinity problems

To test that the load balancer forwards the HTTP requests of a given session to the same JBoss node, you can add a new file on each node (after JBoss has started), $JBOSS_DIR/server/default/nuxeo.ear/nuxeo.war/clusterinfo.html, with content:

on the first node, and:

on the other one.

Using a browser with an active session, go to http://load-balancer/nuxeo/clusterinfo.html and check that you always land on the same node when hitting the browser's refresh button.
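The same check can be scripted. The sketch below keeps cookies between requests, fetches the page several times and reports whether every response was identical. It assumes that each node serves a different clusterinfo.html, and that requesting the optional warmup_url first makes the server issue a JSESSIONID cookie; both the helper names and that warm-up step are assumptions.

```python
import http.cookiejar
import urllib.request


def fetch_bodies(url, attempts=5, warmup_url=None):
    """Fetch url several times with one cookie jar and return the bodies."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    if warmup_url is not None:
        # Assumed to create the session, so later requests carry JSESSIONID.
        with opener.open(warmup_url, timeout=10) as resp:
            resp.read()
    bodies = []
    for _ in range(attempts):
        with opener.open(url, timeout=10) as resp:
            bodies.append(resp.read().decode("utf-8", "replace").strip())
    return bodies


def is_sticky(bodies):
    """True when at least one body was fetched and all bodies are identical."""
    return len(bodies) > 0 and len(set(bodies)) == 1
```

For example, is_sticky(fetch_bodies("http://load-balancer/nuxeo/clusterinfo.html", warmup_url="http://load-balancer/nuxeo/")) should be True when affinity works. A False result means that requests of one session reached more than one node.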

FAQ

Here is an extract from the ecm mailing list (November 24th, 2008), in which Thierry Delprat answers various questions about Nuxeo and clustering.

Do we have a "push button" cluster configuration?

The quick answer is no.

Do we have high availability Nuxeo installations in production?

We currently have some Nuxeo installations in production with high availability using multi-JVM packaging, Linux clusters and/or PostgreSQL replication.

Deployment configuration heavily depends on the project / client requirements:

  • depending on hardware available
  • depending on hosting skills
  • depending on the storage used (Core Repo storage and SQL DB)
  • depending on critical data management in the application

Will there be a "push button" cluster configuration?

We are not convinced at all that a JBoss Cluster configuration is the answer in all cases.

For example, there is no reason to cluster a Nuxeo stateless packaging: simple NLB with session affinity is far better for performance and far easier to set up and administer.

JBoss Cluster configuration could be an option for stateful packaging, but we have never had this requirement: we have other solutions to handle availability, and performance for Core on a single server is not an issue.

When will there be a JBoss Cluster config?

Probably only when we have this requirement from one of our clients.
