Reporting Problems

Updated: July 17, 2023

The procedures below extract information from a running Nuxeo instance. The support team may request this information. Always compress files before uploading them to your JIRA ticket.
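As a minimal sketch of the compression step, assuming the collected files sit under /tmp, a helper like the following bundles them into one gzip-compressed archive. The function name bundle_report and the archive name are hypothetical, not part of Nuxeo.

```shell
# Hypothetical helper: bundle collected diagnostic files into one
# gzip-compressed tar archive.
# Usage: bundle_report <archive.tar.gz> <file>...
bundle_report() (
  archive=$1
  shift
  # Rebuild the argument list with only the files that exist, so that a
  # missing dump does not abort the whole archive.
  for f in "$@"; do
    shift
    if [ -e "$f" ]; then
      set -- "$@" "$f"
    fi
  done
  if [ "$#" -eq 0 ]; then
    echo "no files to archive" >&2
    exit 1
  fi
  tar -czf "$archive" "$@"
)

# Example:
#   bundle_report /tmp/nuxeo-report.tar.gz /tmp/metrics.txt /tmp/nuxeo.tdump
```

The body runs in a subshell, so the variables and the rewritten argument list do not leak into the calling shell.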

 

Nuxeo Configuration and Status

To dump your server configuration and status, run:

./bin/nuxeoctl showconf
./bin/nuxeoctl status
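To attach the result to a ticket, the output of both commands can be redirected to files. The sketch below assumes a hypothetical helper name (dump_nuxeo_state) and hypothetical output file names; only the two nuxeoctl commands come from this page.

```shell
# Hypothetical helper: save the output of showconf and status to files.
# Usage: dump_nuxeo_state <nuxeo_home> <output_dir>
dump_nuxeo_state() (
  nuxeo_home=$1
  out_dir=$2
  ctl="$nuxeo_home/bin/nuxeoctl"
  if [ ! -x "$ctl" ]; then
    echo "nuxeoctl not found in $nuxeo_home/bin" >&2
    exit 1
  fi
  "$ctl" showconf > "$out_dir/nuxeo-showconf.txt" 2>&1
  # status may exit non-zero (for example when the server is stopped);
  # the output is still worth keeping, so the failure is ignored.
  "$ctl" status > "$out_dir/nuxeo-status.txt" 2>&1 || true
)

# Example:
#   dump_nuxeo_state /opt/nuxeo /tmp
```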

JMX Monitoring

When JMX is enabled (uncomment the JMX-related lines in nuxeo.conf), the Nuxeo Platform exposes many metrics in the "metrics" domain.

You can use GUI tools like Java Mission Control or VisualVM to inspect these metrics, but if you want to dump all of them to report a problem you can use jmxterm (run it with the same JVM and user as your Nuxeo instance):

# download jmxterm
wget http://sourceforge.net/projects/cyclops-group/files/jmxterm/1.0-alpha-4/jmxterm-1.0-alpha-4-uber.jar/download -O /tmp/jmxterm-1.0-alpha-4-uber.jar
# list metrics beans and create a script
echo -e "domain metrics\nbeans" | java -jar /tmp/jmxterm-1.0-alpha-4-uber.jar -l localhost:1089 -n | sed -e "s,^,get -b ,g" -e "s,$, \*,g" > /tmp/metrics-script.txt
# get metrics info
(date +'%Y-%m-%d %H:%M:%S:%3N'; java -jar /tmp/jmxterm-1.0-alpha-4-uber.jar -l localhost:1089 -n -i /tmp/metrics-script.txt)  > /tmp/metrics.txt 2>&1

Then attach the metrics.txt file to your JIRA ticket.

JVM Garbage Collector

The garbage collector attempts to reclaim memory used by objects that are no longer in use by the application.

Since Nuxeo 6.0 the garbage collector is monitored by default. The log file is located at ${nuxeo.log.dir}/gc.log.

If a problem occurs, remember to save this file before restarting, because it is overwritten on start. If you see many full GCs in the file, try running a JVM heap histo.
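A possible way to save the log and get a rough count of full collections is sketched below. The helper name save_gc_log and the output naming are hypothetical, and the two patterns are assumptions about the log wording ("Full GC" in JDK 8 style logs, "Pause Full" in JDK 9+ unified logging).

```shell
# Hypothetical helper: copy gc.log under a timestamped name and count the
# lines that report a full collection.
# Usage: save_gc_log <path/to/gc.log> <output_dir>
save_gc_log() (
  gc_log=$1
  out_dir=$2
  copy="$out_dir/gc-$(date +%Y%m%d-%H%M%S).log"
  cp "$gc_log" "$copy" || exit 1
  # Assumed wording: "Full GC" (JDK 8 style) or "Pause Full" (unified logging).
  # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
  full_gcs=$(grep -c -E 'Full GC|Pause Full' "$copy" || true)
  echo "$copy: $full_gcs full GC line(s)"
)

# Example, where /var/log/nuxeo stands for your ${nuxeo.log.dir}:
#   save_gc_log /var/log/nuxeo/gc.log /tmp
```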

JVM Heap Histo

To see what objects are present in the heap:

jcmd Bootstrap GC.class_histogram > /tmp/heap-histo.txt

JVM Thread Dump

A thread dump shows what code is running at a given time. It is always better to take 2 or 3 thread dumps a few seconds apart, which makes it possible to pinpoint stuck code. You should also capture the thread activity.

First log in as the same user as the Nuxeo JVM, then use either jcmd:

jcmd Bootstrap Thread.print > /tmp/nuxeo.tdump

Or jstack:

  1. Get the PID of the Nuxeo JVM: list the running Java processes (for example with jps) and look for the Bootstrap process ID.
  2. Then run

    jstack <PID> > /tmp/nuxeo.tdump
    
  3. If you get errors, try again with the force option: jstack -F <PID>.
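Taking several dumps a few seconds apart can be scripted. The sketch below reuses the jcmd command shown above and the top command from the CPU Thread Activity section. The helper name take_thread_dumps, its defaults and the numbered file names are hypothetical.

```shell
# Hypothetical helper: take several thread dumps a few seconds apart, each
# with a snapshot of per-thread CPU usage.
# Usage: take_thread_dumps [count] [pause_seconds] [output_dir]
take_thread_dumps() (
  count=${1:-3}
  pause=${2:-5}
  out_dir=${3:-/tmp}
  i=1
  while [ "$i" -le "$count" ]; do
    # One file per dump so that the dumps can be compared afterwards.
    jcmd Bootstrap Thread.print > "$out_dir/nuxeo-$i.tdump" 2>&1
    top -bcH -n1 -w512 > "$out_dir/top-thread-$i.txt" 2>&1
    # Pause between dumps, but not after the last one.
    if [ "$i" -lt "$count" ]; then
      sleep "$pause"
    fi
    i=$((i + 1))
  done
)

# Example: 3 dumps, 5 seconds apart, written to /tmp
#   take_thread_dumps 3 5 /tmp
```

A thread that shows the same stack in every dump is a good candidate for stuck code.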

CPU Thread Activity

It is also useful to correlate the code path given by a thread dump with the CPU activity:

top -bcH -n1 -w512 > /tmp/top-thread.txt
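To match a busy thread from the top output with its stack, one common approach (an assumption here, not stated on this page) relies on HotSpot thread dumps printing each native thread ID as a hexadecimal nid=0x... field, while top -H shows the same ID in decimal in its PID column. The helper names below are hypothetical.

```shell
# Hypothetical helper: convert a decimal thread ID taken from top -H into the
# nid=0x... form assumed to appear in the thread dump.
tid_to_nid() {
  printf 'nid=0x%x\n' "$1"
}

# Hypothetical helper: print the header line of that thread from a dump file.
# Usage: find_thread <decimal_tid> <dump_file>
find_thread() (
  tid=$1
  dump=$2
  # -w avoids matching a longer nid that merely starts with the same digits.
  grep -w -- "$(tid_to_nid "$tid")" "$dump"
)

# Example:
#   find_thread 6699 /tmp/nuxeo.tdump
```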

Oracle JVM Flight Recording

If you use the Oracle JVM, you can activate this option in nuxeo.conf:

JAVA_OPTS=$JAVA_OPTS -Dcom.sun.management.jmxremote.autodiscovery=true -Dcom.sun.management.jdp.name=Nuxeo -XX:+UnlockCommercialFeatures -XX:+FlightRecorder

Then to record JVM activity for 1 minute use the following command:

jcmd Bootstrap JFR.start duration=60s filename=/tmp/record-01.jfr

JVM Core Dump

When the JVM is stuck, a core dump taken before restarting can give more context than a thread dump alone.

If you have gdb installed, you can generate a core dump without killing the application:

sudo gdb --pid=<PID> --batch -ex generate-core-file -ex detach

PostgreSQL

Follow the Nuxeo recommendations and perform the problem-reporting procedure for PostgreSQL. pgBadger and EXPLAIN are your friends.

Elasticsearch

If the problem is related to Elasticsearch access (initialization or bad health status), please list:

  • the non-default elasticsearch.* options in nuxeo.conf
  • the non-default Elasticsearch configuration options (especially the discovery settings)

And report the output of the following commands, assuming that Elasticsearch is on localhost and that the HTTP protocol is open on port 9200:

curl "localhost:9200"
curl "localhost:9200/_cat/health?v"
curl "localhost:9200/_cat/nodes?v"
curl "localhost:9200/_cat/indices?v"
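The four requests can be collected into a single file for the ticket. This is only a sketch: the helper name dump_es_state and the default output path are hypothetical, and the base URL defaults to the localhost:9200 assumption made above.

```shell
# Hypothetical helper: save the output of the four requests to one file.
# Usage: dump_es_state [base_url] [output_file]
dump_es_state() (
  base=${1:-localhost:9200}
  out=${2:-/tmp/es-state.txt}
  : > "$out"
  for path in "" "/_cat/health?v" "/_cat/nodes?v" "/_cat/indices?v"; do
    echo "### GET $base$path" >> "$out"
    # -sS hides the progress meter but still prints errors.
    curl -sS "$base$path" >> "$out" 2>&1 || echo "request failed" >> "$out"
    echo >> "$out"
  done
)

# Example:
#   dump_es_state localhost:9200 /tmp/es-state.txt
```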

In addition, if the problem is related to unexpected search results or errors, follow the procedure described in Reporting Settings and Mapping.

Network

Measure the round trip between Nuxeo and the database:

ping -s 8192 <database IP>

Use mtr to discover what sits between the Nuxeo server and the database, and report any firewall or known hardware.

Look at the number of errors reported by netstat -s, as a large number of errors may indicate a network problem.
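The netstat -s output is long, so a filter can help spot the non-zero error counters. The sketch below is a rough heuristic: the helper name netstat_errors and the keyword list are assumptions, and the exact counter wording varies between systems.

```shell
# Hypothetical filter: read netstat -s output on stdin and keep only the
# counters whose description hints at trouble and whose value is not zero.
netstat_errors() {
  grep -i -E 'error|failed|dropped|retransmit|overflow' | grep -v -E '^ *0 '
}

# Example:
#   netstat -s | netstat_errors > /tmp/netstat-errors.txt
```

Attach the full netstat -s output as well, since the filter may miss counters that matter.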

A network capture can be helpful at some point:

# Capture all eth0 traffic
sudo tcpdump -i eth0 -w /tmp/out.tcpdump
# Capture HTTP traffic on localhost port 8080
sudo tcpdump -i lo -w /tmp/out.tcpdump host localhost and tcp port 8080

OS

You can report a Linux configuration using the aspersa summary script:

wget http://aspersa.googlecode.com/svn/trunk/summary && bash summary

The sysstat utilities are a collection of performance monitoring tools for Linux that are easy to set up.

You can monitor the system activity like this:

sar -d -o /tmp/sysstat-sar.log 5 720 >/dev/null 2>&1 &

This records the activity every 5 seconds for 1 hour (720 samples).
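The sample count is the recording duration divided by the interval. As a small sketch, a hypothetical helper (sar_count) can compute it for other durations; the commented lines show how it would plug into the command above and how the recorded file can be read back with sar -f.

```shell
# Hypothetical helper: number of samples needed to cover a duration.
# Usage: sar_count <interval_seconds> <duration_seconds>
sar_count() {
  echo $(( $2 / $1 ))
}

# 5-second interval for 1 hour (3600 seconds):
#   sar -d -o /tmp/sysstat-sar.log 5 "$(sar_count 5 3600)" >/dev/null 2>&1 &
# Read the recorded disk activity back later:
#   sar -d -f /tmp/sysstat-sar.log
```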

Process-level monitoring is also very useful. It can be done with atop running as root:

atop -w /tmp/atop.log 5 720 >/dev/null 2>&1 &

Security

If you think you've found a security issue, please report it privately to [email protected].