Nuxeo Server

PostgreSQL

Updated: September 22, 2017 Page Information Edit on GitHub

Nuxeo supports the following PostgreSQL version:

PostgreSQL 9.6

The database needs to be configured to work properly with Nuxeo. Some settings must be changed in order for Nuxeo to work. Other settings should be changed in order for Nuxeo to have good performance.

This FAQ will give you some hints to configure your database, but please refer to your DBA or the PostgreSQL documentation for more information (http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server).

Most settings have to be changed in the postgresql.conf file. Some SQL commands may have to be executed directly at the PostgreSQL console (psql).

Mandatory Changes

Two-Phase Commit

You shouldn't need to change this, because two-phase commit (XA) isn't used in basic configurations.

However, if you use XA datasources (see the JDBC Datasource Configuration documentation for more on this), you will need to update the default max_prepared_transactions settings. You can use the same value as max_connections.

max_prepared_transactions = 100

Create the Role and Database for Nuxeo

For instance (please change the password and the nuxeo.conf file of your instance accordingly):

$ createuser -U postgres -W -P nuxeo
$ createdb -U postgres -W -O nuxeo -E UTF8 nuxeo

Or from the psql command prompt:

CREATE ROLE nuxeo WITH PASSWORD 'nuxeo' LOGIN;
CREATE DATABASE nuxeo ENCODING 'UTF8' OWNER nuxeo;

Note that using the UTF8 encoding for your database is important.

Secure the Authentication Method

Edit the file pg_hba.conf, make sure that the nuxeo user (or all users, if the nuxeo user is not individually listed), have an authentication METHOD of md5, rather than ident. Otherwise you will have some FATAL: Ident authentication failed for user "nuxeo" errors in the logs.

# TYPE  DATABASE    USER        CIDR-ADDRESS          METHOD

# "local" is for Unix domain socket connections only
local   all         all                               md5
# IPv4 local connections:
host    all         all         127.0.0.1/32          md5

Performance Tuning

Shared Buffers and System Cache

One of the most important thing for PostgreSQL is to have lots of shared buffers along with free memory that can be used by the system cache.

Refer to the section Adapt Your Configuration to Your Hardware to get the correct value.

If you plan to use 1 GB of shared buffers, update the following property in your postgresql.conf file:

shared_buffers = 1GB

Then restart PostgreSQL.

If there is no enough shared memory you will have an explicit error message and you should try with a bigger kernel.shmmax value.

Once PostgreSQL starts properly with the chosen kernel.shmmax value, it should be registered in the /etc/sysctl.conf file by adding the following line.

kernel.shmmax = <SHMMAX_VALUE>

Using effective_cache_size PostgreSQL is informed of how much memory the system will use for disk caching. This is used as a hint when executing queries, note that this memory is not allocated by PostgreSQL itself.

To set effective_cache_size value, you need to run your application once and check how much memory is used by system cache. This can be done using the free command and using the free value for -/+ buffers/cache. Then you can set this value in the configuration:

effective_cache_size = 1536MB

Memory for Workers

Increasing the work_mem parameter allows PostgreSQL to do larger in-memory sorts which is much faster than disk sorts. Keep in mind that work_mem size will be taken up by each connection (a pool of 20 connections will take up to 20 * work_mem).

work_mem = 12MB

Increasing the maintenance_work_mem will speed up the vacuum procedure.

maintenance_work_mem = 512MB

Buffering Writes

The default wal_buffers can be increase to improve write access time. Increasing the checkpoint segments and completion target helps to spread out the writes.

wal_buffers = 16MB
checkpoint_segments = 32
checkpoint_completion_target=0.8

Index vs Table Scan

The random_page_cost parameter influences this query planner's choice. The value to use depends on your disk IO, here are some advices:

# random_page_cost = 4 # Slow disk AWS EBS
# random_page_cost = 2 # Recent HD
# random_page_cost = 1 # SSD

Updating the Planner Statistics

PostgreSQL computes statistics on table content in order to plan for the best performance when executing queries with joins and complex filters. The default configuration is default_statistics_target set to the value 100 which can lead to not accurate enough estimates. We recommend a higher value like 500:

default_statistics_target = 500

If the database is already populated you need to execute ANALYZE to update the statistics.

Vacuuming

The autovacuum is enabled by default in PostgreSQL.

Exceptionally, a full vacuum can be done at downtime to recover disk space, it should be followed with a reindexdb command.

Monitoring

We recommend the following setting to have a meaningful log in production

log_line_prefix = '%t [%p]: [%l-1] user=%u,db=%d '
log_min_duration_statement = 400
log_checkpoints=on
log_lock_waits=on
log_temp_files=0
log_autovacuum_min_duration=0
log_statement = ddl
track_functions=pl

Also to have an effective monitoring you should install the following extensions: pg_stat_statements, pg_buffercache

  1. Install postgresql-contrib package.

    sudo apt-get install postgresql-contrib
    
  2. Login to your database as postgres user and create the extensions.

    sudo su postgres -c'psql -U postgres -d nuxeo -c"CREATE EXTENSION pg_buffercache;"'
    sudo su postgres -c'psql -U postgres -d nuxeo -c"CREATE EXTENSION pg_stat_statements;"'
    
  3. Update the configuration.
    shared_preload_libraries = 'pg_stat_statements, auto_explain'
    pg_stat_statements.max = 10000
    pg_stat_statements.track = top
    auto_explain.log_min_duration = -1
    auto_explain.log_analyze = 'false'
    
  4. Restart the database.

    sudo /etc/init.d/postgresql restart
    

See the PostgreSQL section of the Monitoring and Maintenance page.

Adapt Your Configuration to Your Hardware

Here are some values that can be used as a starting point for a dedicated server depending on the amount of memory.

Amount of RAM4 GB8 GB16 GB32 GB
shared_buffers1GB2GB4GB8GB
effective_cache_size1536MB4GB8GB16GB
work_mem12MB12MB16MB20MB
maintenance_work_mem512MB1GB1GB1GB
max_connections63103153203

Specific Configuration

If you want accent-insensitive full-text search, you'll need to install the unaccent contribution, create a new text search configuration, and specify its use in Nuxeo.

Unaccent is described here https://www.postgresql.org/docs/current/static/unaccent.html.

Install it by running unaccent.sql script. For Ubuntu users, this file is located at /usr/share/postgresql/9.6/contrib/unaccent.sql.

Connect to your database and run the following instructions:

CREATE TEXT SEARCH CONFIGURATION fr ( COPY = french );
ALTER TEXT SEARCH CONFIGURATION fr ALTER MAPPING FOR asciihword, asciiword, hword_asciipart, hword, hword_part, word WITH unaccent, french_stem;

Then replace in your default-repository-config.xml file the french analyzer by the one you just defined (fr in this example).

Mass Import Specific Tuning

When doing mass import you can disable the full-text trigger and full-text index. They must be dropped after a successful login on a running Nuxeo DM because DDL SQL commands are executed on the first access.

ALTER TABLE fulltext DISABLE TRIGGER nx_trig_ft_update;
DROP INDEX IF EXISTS fulltext_fulltext_idx;
DROP INDEX IF EXISTS fulltext_fulltext_description_idx;
DROP INDEX IF EXISTS fulltext_fulltext_title_idx;

After the import you can update the full-text column like this:

ALTER TABLE fulltext ENABLE TRIGGER nx_trig_ft_update;
-- Let the trigger update the fulltext column
UPDATE fulltext SET fulltext = ''::TSVECTOR WHERE length(fulltext) is NULL;
CREATE INDEX fulltext_fulltext_title_idx ON fulltext USING gin (nx_to_tsvector(fulltext_title::character varying));
CREATE INDEX fulltext_fulltext_description_idx ON fulltext USING gin (nx_to_tsvector(fulltext_description::character varying));
CREATE INDEX fulltext_fulltext_idx ON fulltext USING gin (nx_to_tsvector(fulltext::character varying));

Changing temporarily the PostgreSQL configuration during the import provides performance benefits:

full_page_writes = off
fsync = off
synchronous_commit = off

Do not do this if there is already any data you care about in any database on your PostgreSQL cluster, Please refer to the PostgreSQL reference manual.

Using uuid idType

If you want to use the PostgreSQL uuid type instead of the default varchar(36) (this is the case when you set nuxeo.vcs.idtype=uuid in the nuxeo.conf file) you need to create a new operator to support GIN index on uuid[] type.

CREATE OPERATOR CLASS _uuid_ops DEFAULT
   FOR TYPE _uuid USING gin AS
   OPERATOR 1  &&(anyarray, anyarray),
   OPERATOR 2  @>(anyarray, anyarray),
   OPERATOR 3  <@(anyarray, anyarray),
   OPERATOR 4  =(anyarray, anyarray),
   FUNCTION 1  uuid_cmp(uuid, uuid),
   FUNCTION 2  ginarrayextract(anyarray, internal, internal),
   FUNCTION 3  ginqueryarrayextract(anyarray, internal, smallint, internal, internal, internal, internal),
   FUNCTION 4  ginarrayconsistent(internal, smallint, anyarray, integer, internal, internal, internal, internal),
   STORAGE uuid;

Possible error if you don't create the operator described above is:

ERROR: data type uuid[] has no default operator class for access method "gin"

Limitations

PostgreSQL is very good, very fast, and is our reference database. To be completely honest it still has some limitations that other databases don't have:

  • Its full-text engine doesn't know how to do full-text phrase search, which means we have to use a slower and not completely equivalent workaround when phrase search is required (NXP-6720).

Reporting Problems

If you have a database configuration problem and you want to fill a JIRA ticket, there are some information to report:

  • The PostgreSQL server state: is it dedicated or shared, which OS, how many CPU, RAM, is it a virtual machine...

  • How much memory is available on the database server (free -m output).

  • Amount of Nuxeo documents and PostgreSQL configuration. Using the following commands:

    1. Log in on your database with the postgres user.

      sudo su - postgres
      
    2. Get the Nuxeo SQL script to dump your configuration.

      wget --no-check-certificate https://gist.github.com/bdelbosc/5507796/raw/dump-nuxeo-postgres-config.sql
      
    3. Execute the SQL script with psql against the Nuxeo DB (not the default database named postgres).

      psql nuxeo -f dump-nuxeo-postgres-config.sql
      

      Note that this SQL script can generate some errors logs when optional tables are not present or depending on the PostgreSQL version. The result file is still relevant.

  • Attach the output file located in /tmp/pgconf.txt into the JIRA ticket. An example of such a result file is here, so that you can check that yours has the correct format.

  • If you are monitoring the slowest queries (See monitoring section) you can zip and attach the postgresql log file to the JIRA ticket.

2 days ago Solen Guitter Don't show excerpt 'upgrade-9.2-drop-change-token' and add comment
a year ago Solen Guitter 109
a year ago Florent Guillaume 108 | remove mention of single datasource
a year ago Solen Guitter 107
a year ago Manon Lumeau 106
2 years ago Solen Guitter 105
2 years ago Solen Guitter 104
2 years ago Solen Guitter 103
2 years ago Solen Guitter 102
2 years ago Solen Guitter 101
2 years ago Florent Guillaume 100 | details
2 years ago Alain Escaffre 99
2 years ago Solen Guitter 98
2 years ago Benoit Delbosc 97
2 years ago Thibaud Arguillere 96
2 years ago Solen Guitter 95
3 years ago Julien Carsique 94
3 years ago Maxime Hilaire 93
3 years ago Florent Guillaume 92 | max_prepared_transactions clarification
3 years ago Solen Guitter 91
3 years ago Solen Guitter 90
3 years ago Solen Guitter 89
3 years ago Joshua Fletcher 88 | PG doesn't support "m" or "g" for these values, you need to use "MB" or "GB". This part of the doc was a little confusing as a new user.
3 years ago Solen Guitter 87 | Updated supported versions for 6.0
3 years ago Solen Guitter 86 | Moved limitations from independent page into postgreSQL page
3 years ago Solen Guitter 85
3 years ago Florent Guillaume 84
3 years ago Julien Carsique 83
3 years ago Julien Carsique 82
4 years ago Solen Guitter 81
4 years ago Benoit Delbosc 80
4 years ago Benoit Delbosc 79
4 years ago Benoit Delbosc 78
4 years ago Benoit Delbosc 77
4 years ago Benoit Delbosc 76
4 years ago Benoit Delbosc 75
4 years ago Benoit Delbosc 74 | update recommendations
4 years ago Benoit Delbosc 73 | Update recommendation
4 years ago Benoit Delbosc 72
4 years ago Florent Guillaume 71
4 years ago Solen Guitter 70 | Added PostgreSQL version 9.3.
4 years ago Solen Guitter 69
4 years ago Benoit Delbosc 68
4 years ago Florent Guillaume 66
4 years ago Florent Guillaume 67 | Migration of unmigrated content due to installation of a new plugin
4 years ago Florent Guillaume 65
4 years ago Solen Guitter 64
4 years ago Benoit Delbosc 63
4 years ago Benoit Delbosc 62 | Update the reporting problems procedure
4 years ago Benoit Delbosc 61
4 years ago Benoit Delbosc 60
5 years ago Florent Guillaume 58
5 years ago Florent Guillaume 59 | Migration of unmigrated content due to installation of a new plugin
5 years ago Benoit Delbosc 56 | more reporting stats
5 years ago Benoit Delbosc 57 | Migration of unmigrated content due to installation of a new plugin
5 years ago Benoit Delbosc 53
5 years ago Benoit Delbosc 54 | Migrated to Confluence 4.0
5 years ago Benoit Delbosc 55 | Migration of unmigrated content due to installation of a new plugin
5 years ago Benoit Delbosc 52
5 years ago Benoit Delbosc 51 | update mass import index creation for 5.5
5 years ago Benoit Delbosc 50
5 years ago Solen Guitter 49
5 years ago Solen Guitter 48 | Added related pages
6 years ago Mariana Cedica 47
6 years ago Mariana Cedica 46
6 years ago stan 45
6 years ago Benoit Delbosc 44 | Adding postgresql 8.4 unaccent contrib compilation
6 years ago Benoit Delbosc 43 | adding more sql for reporting pb
6 years ago Florent Guillaume 42
6 years ago Solen Guitter 41
6 years ago Solen Guitter 40
6 years ago Benoit Delbosc 39 | explain how to log slow queries
6 years ago Benoit Delbosc 38 | Adding index information when reporting problem.
6 years ago Benoit Delbosc 37
6 years ago Julien Carsique 36
6 years ago Florent Guillaume 35
6 years ago Laurent Doguin 34 | add accent unsensitive search configuration details
6 years ago Solen Guitter 33
7 years ago Wojciech Sulejman 32
7 years ago Solen Guitter 31
7 years ago Benoit Delbosc 30 | Add SQL queries to get user count, index ratio and cpu speed
7 years ago Florent Guillaume 29
7 years ago Olivier Grisel 28
7 years ago Julien Carsique 27 | moved driver information to NXDOC/Configuring+Nuxeo+EP
7 years ago Julien Carsique 26 | Add jdbc driver information
7 years ago Florent Guillaume 25
7 years ago Florent Guillaume 24
7 years ago Florent Guillaume 23 | put Two-phase commit first as it's the most common error
7 years ago Florent Guillaume 22
7 years ago Florent Guillaume 21
7 years ago Stéfane Fermigier 20
7 years ago Stéfane Fermigier 19
7 years ago Stéfane Fermigier 18
7 years ago Stéfane Fermigier 17
7 years ago Stéfane Fermigier 16
7 years ago Stéfane Fermigier 15
7 years ago Stéfane Fermigier 14
7 years ago Stéfane Fermigier 13
7 years ago Stéfane Fermigier 12
7 years ago Stéfane Fermigier 11
7 years ago Stéfane Fermigier 10
7 years ago Stéfane Fermigier 9
7 years ago Stéfane Fermigier 8
7 years ago Stéfane Fermigier 7
7 years ago Stéfane Fermigier 6
7 years ago Stéfane Fermigier 5
7 years ago Stéfane Fermigier 4
7 years ago Stéfane Fermigier 3
7 years ago Stéfane Fermigier 2
7 years ago Stéfane Fermigier 1
History: Created by Stéfane Fermigier