Batch uploading and attaching a document fails in a clustered configuration -- how can it be fixed?

Hello,

I have been following the “Blob Upload for Batch Processing” directions with some success in development, but I have run into a very severe problem when attempting to use it in a production configuration in a two-node cluster.

I believe the root of the problem is described in that documentation. “The files attached to the batch are stored on a temporary disk storage (inside java.io.tmp) until the batch is executed or dropped.” – this means that batch-uploaded files are only available on the original node. So given the following configuration:

  • nuxeo-lb.example.com load balancer
  • nuxeo-a.example.com node A
  • nuxeo-b.example.com node B

If one uses nuxeo-lb.example.com to access the batch processing and document modification API as documented, then one is quite likely to upload the file to nuxeo-a.example.com and then attempt to use it from nuxeo-b.example.com, which results in an error because nuxeo-b.example.com does not have access to that file.

As a temporary workaround we can use the nuxeo-a.example.com address directly for everything, but this is certainly not an ideal solution. What else can we do? We are using S3 binary storage on an EC2 instance, so sharing a directory isn't quite as straightforward as a shared NFS mount, and it's not even clear that the temporary directory (java.io.tmpdir) should be shared anyway.
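
For reference, a slightly less rigid version of that workaround would be to pick one node per batch on the client side and send every request for that batch to it. This is only a minimal sketch: the host names are the ones listed above, while the `BatchSession` class, the `https` scheme, and the paths passed to `url()` are assumptions for illustration, not Nuxeo API names.

```python
import random
import uuid

# Node host names from the cluster described above (the load balancer is bypassed).
NODES = ["nuxeo-a.example.com", "nuxeo-b.example.com"]


class BatchSession:
    """Hypothetical helper that pins one batch to a single node."""

    def __init__(self, nodes=NODES, scheme="https"):
        # Choose the node once; every request for this batch reuses it.
        self.host = random.choice(nodes)
        self.scheme = scheme  # assumption: adjust to the actual deployment
        # Client-generated identifier used to group the uploads of this batch.
        self.batch_id = uuid.uuid4().hex

    def url(self, path):
        # `path` is whichever upload or document-update endpoint is being called.
        return f"{self.scheme}://{self.host}/{path.lstrip('/')}"
```

The upload and the later document update would both build their URLs through the same `BatchSession`, so they always reach the node that holds the temporary file. The obvious drawback is that the client has to know the node addresses.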

ANSWER

Have you set up session affinity at the load balancer level? If you're not using a session cookie for these uploads, maybe you can add a custom header and do affinity on that?
12/04/2014
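
A rough sketch of the custom-header idea, assuming the client can add the same header to the upload request and to the later request that uses the batch. The header name `X-Batch-Affinity` is made up for this example, and `route()` only models what a load balancer rule that hashes on that header would do; a real load balancer has its own configuration syntax for this.

```python
import hashlib

# Node host names from the question.
NODES = ["nuxeo-a.example.com", "nuxeo-b.example.com"]

# Hypothetical header name; anything the load balancer can hash on would do.
AFFINITY_HEADER = "X-Batch-Affinity"


def batch_headers(batch_id, extra=None):
    """Return request headers carrying the batch ID as the affinity key."""
    headers = dict(extra or {})
    headers[AFFINITY_HEADER] = batch_id
    return headers


def route(headers, nodes=NODES):
    """Stand-in for the load balancer: hash the header value to pick a node."""
    key = headers[AFFINITY_HEADER].encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]
```

Because the node is derived only from the batch ID, every request that carries the same ID lands on the same node, while different batches still spread across the cluster.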

We have session affinity set up on the load balancer for user logins, but this is a sessionless direct call to the REST API, so there is no session cookie to key on. We are considering just using IP affinity for the time being unless there is a better way.
12/04/2014