Nuxeo cluster shared nuxeo.tmp.dir causing problems due to nuxeo-launcher jar naming contention
According to this answer, it is a best practice for nodes in a Nuxeo cluster to share their nuxeo.tmp.dir. When doing so, must each node in the cluster have its own tmpdir on the binary store filesystem? I am encountering nuxeo-launcher jar file naming collisions that cause NFS stale file handle errors when multiple servers in a cluster share their tmpdir and I simultaneously invoke nuxeoctl operations (using Ansible) on all nodes in the cluster.
In cluster mode it's not recommended at all to share nuxeo.tmp.dir; there are many libraries we don't control which could have a problem with it. This means in turn that you can't leverage the NXP-9361 no-copy optimizations…
On the other hand if the only problems you have are due to nuxeo-launcher jar file naming then we could fix this on our end and allow tmp sharing. Please open a JIRA ticket.
Edit: the simplest and surest way is probably to have a shared filesystem but make each node point its nuxeo.tmp.dir to a different subdirectory in it.
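To illustrate the per-node subdirectory layout suggested in the edit above, here is a minimal sketch in Python. It is not part of Nuxeo or nuxeoctl: the helper names `node_tmp_dir` and `nuxeo_conf_line` and the shared mount path in the usage note are hypothetical, and using the hostname as the subdirectory name is only one possible assumption for making the path node-unique.

```python
import os
import socket
from typing import Optional


def node_tmp_dir(shared_root: str, node_name: Optional[str] = None) -> str:
    """Create and return a node-specific tmp directory under shared_root.

    node_name defaults to this machine's hostname (an assumption; any
    identifier that is unique per cluster node would work).
    """
    name = node_name or socket.gethostname()
    path = os.path.join(shared_root, name)
    os.makedirs(path, exist_ok=True)  # safe to call again on later runs
    return path


def nuxeo_conf_line(tmp_dir: str) -> str:
    """Render the nuxeo.tmp.dir property line for that node's nuxeo.conf."""
    return f"nuxeo.tmp.dir={tmp_dir}"
```

For example, `nuxeo_conf_line(node_tmp_dir("/mnt/shared/tmp"))` (with `/mnt/shared/tmp` standing in for wherever the shared filesystem is mounted) yields a different line on each node, so nodes keep their temporary files on the shared filesystem without writing into the same directory.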
In cluster mode, do you recommend nuxeo.tmp.dir be set to a cluster-node-unique directory on the shared file system in order to take advantage of NXP-9361? By default, java.io.tmpdir = nuxeo.tmp.dir, right?
Or in cluster mode, should java.io.tmpdir and nuxeo.tmp.dir be set independently? NXP-9361 says java.io.tmpdir should be on the shared file system. Should it be set to a cluster-node-unique directory there and nuxeo.tmp.dir be local?
In nuxeoctl, the launcher in the tmp dir (which is there to allow the launcher to update itself) should be named nuxeo-launcher-$RANDOM.jar, where $RANDOM is randomly generated by bash and should be collision-free (although mktemp would be better). Is that not the case for you? Please open a ticket if you have enough info for us to track this down.
nuxeo.tmp.dir should be ok.
Since this configuration has the temp directory on the shared file system, I would expect the NXP-9361 optimization to be fully functional, do you agree? In general, this seems like a safer configuration than trying to share a common nuxeo.tmp.dir across all nodes. What are your thoughts?
Having nuxeo.tmp.dir point to different parts of a shared filesystem depending on the node is a good way to solve the issue.
Even if nuxeo-launcher naming collisions were fixed, it seems risky for multiple nodes to share nuxeo.tmp.dir.
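On the remark in the comments that mktemp would be better than $RANDOM for naming the launcher jar: the sketch below, in Python rather than bash, shows the difference under stated assumptions. Bash's $RANDOM yields an integer from 0 to 32767, so two nodes writing to the same directory can occasionally pick the same name; `tempfile.mkstemp` (Python's analogue of mktemp) instead creates the file exclusively and retries with another name if one is taken. The helper names are hypothetical and this is not how nuxeoctl itself is implemented.

```python
import os
import tempfile


def collision_probability(nodes: int, space: int = 32768) -> float:
    """Chance that at least two of `nodes` independent draws from
    `space` equally likely values coincide (birthday-style estimate)."""
    p_unique = 1.0
    for i in range(nodes):
        p_unique *= (space - i) / space
    return 1.0 - p_unique


def reserve_launcher_jar(tmp_dir: str) -> str:
    """Atomically create a uniquely named, empty launcher jar placeholder
    in tmp_dir and return its path."""
    fd, path = tempfile.mkstemp(prefix="nuxeo-launcher-", suffix=".jar", dir=tmp_dir)
    os.close(fd)  # the caller would copy the real jar over this path
    return path
```

With only a few nodes the probability from `collision_probability` is small but not zero, and it grows with every simultaneous nuxeoctl invocation, whereas names reserved through `reserve_launcher_jar` cannot clash within one directory. Giving each node its own nuxeo.tmp.dir avoids the question entirely.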