How can I configure Nuxeo to store binary files in an SQL database?

So far I know that Nuxeo, by default, stores its binary files under the path “var/lib/nuxeo/data”. My first issue was that every time I rebuilt my Nuxeo docker image, the contents of that directory disappeared. The first solution I opted for was to mount “var/lib/nuxeo/data” to a docker volume on the machine where my Nuxeo instance resides. This solved the “persistence” issue: I can now rebuild my Nuxeo docker image and still access my existing binary files afterwards.

The problem now is that whenever I try to access a document containing a file from a machine with a different file system, the document cannot be accessed. From a machine with the same file system, it works. My question: is there a solution that makes my binary store persistent across rebuilds and, at the same time, available to machines with different file systems?

The title mentions a specific solution, but I cannot find any configuration or instructions for it, so I am turning to the community.

0 votes

3 answers

I had to do something like this for development Vagrant images a couple of weeks ago. I simply followed the backup/restore instructions from Nuxeo at https://doc.nuxeo.com/display/NXDOC/Backup+and+Restore.

My scripts look like this:

dump_nuxeo:

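#!/bin/sh
# Dump the Nuxeo PostgreSQL database and archive the binary store.

PGUSER=nuxeo
export PGUSER

# Read the database password from nuxeo.conf (-f 2- keeps any "=" inside the password).
PGPASSWORD=$(grep '^nuxeo\.db\.password=' /etc/nuxeo/nuxeo.conf | cut -d = -f 2-)
export PGPASSWORD

# Dump the "nuxeo" database, named explicitly instead of relying on the PGUSER default.
pg_dump -h localhost -p 5433 nuxeo > nuxeo_backup.sql

# Archive the filesystem binary store.
tar czf nuxeo_backup.tar.gz -C /var/lib/nuxeo/data/ .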

load_nuxeo:

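#!/bin/sh
# Restore the binary store and the Nuxeo PostgreSQL database from the backup files.

PGUSER=nuxeo
export PGUSER

# Read the database password from nuxeo.conf (-f 2- keeps any "=" inside the password).
PGPASSWORD=$(grep '^nuxeo\.db\.password=' /etc/nuxeo/nuxeo.conf | cut -d = -f 2-)
export PGPASSWORD

# Replace the binary store with the archived copy.
rm -rf /var/lib/nuxeo/data
mkdir -p /var/lib/nuxeo/data
tar xzpf nuxeo_backup.tar.gz -C /var/lib/nuxeo/data

# Recreate an empty "nuxeo" database, then load the dump into it.
dropdb -h localhost -p 5433 nuxeo
sudo -u postgres createdb -p 5433 -U postgres -T template0 nuxeo

psql -h localhost -p 5433 nuxeo < nuxeo_backup.sql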

And this lets me pass around a known repository full of binaries from developer to developer, which I am guessing is what you are after.
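
The password lookup that both scripts perform could be factored into a small helper. The following is only a sketch: the function name conf_value is hypothetical, and it assumes nuxeo.conf holds plain key=value lines with no spaces around the "=".

```sh
# conf_value KEY FILE
# Print the value of the first line in FILE that starts with "KEY=".
# Hypothetical helper; assumes "key=value" lines without spaces around "=".
conf_value() {
    key=$1
    file=$2
    # index(...) == 1 means the line begins with the literal text "KEY=",
    # so commented-out lines ("#KEY=...") and longer keys are skipped.
    # substr(...) returns everything after the first "=", including later "=".
    awk -v k="$key" 'index($0, k "=") == 1 { print substr($0, length(k) + 2); exit }' "$file"
}
```

In the scripts above it would be used as PGPASSWORD=$(conf_value nuxeo.db.password /etc/nuxeo/nuxeo.conf).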

0 votes



This looks promising. Can you share what your nuxeo.conf looks like? Also, from the way you explained it, it seems that you are using PostgreSQL to store the binaries. Am I right? If so, can you share how you configured Nuxeo to store its binaries in the PostgreSQL database?
10/26/2016

No, I'm just using the standard /var/lib/nuxeo/data filesystem store for the binaries. I didn't bother putting them in the PostgreSQL database. The binaries are archived with tar czf nuxeo_backup.tar.gz -C /var/lib/nuxeo/data/ . (note the trailing dot) and unarchived on another system with tar xzpf nuxeo_backup.tar.gz -C /var/lib/nuxeo/data. The load_nuxeo script runs as part of our Vagrant provisioning step.
10/26/2016


You can solve your problem by mounting the filesystem that contains the binary data (var/lib/nuxeo/data) on the other machine, for instance over NFS. Check your OS documentation on how to do that. This is the solution we recommend over SQL storage.
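
As a rough illustration of the NFS approach, the helper below prints an /etc/fstab entry that mounts a shared export at the Nuxeo data directory. The function name, the server name files.example.internal and the export path /export/nuxeo-data are placeholders, and the mount options are only a common starting point; your OS documentation remains the reference for the options that suit your setup.

```sh
# nfs_fstab_entry SERVER EXPORT [MOUNT_POINT]
# Print one /etc/fstab line for an NFS mount; this does not modify the system.
# Hypothetical helper; MOUNT_POINT defaults to the data directory from the question.
nfs_fstab_entry() {
    server=$1
    export_path=$2
    mount_point=${3:-/var/lib/nuxeo/data}
    # Fields: device, mount point, filesystem type, options, dump flag, fsck pass.
    # _netdev marks the mount as network-dependent so it waits for the network.
    printf '%s:%s %s nfs defaults,_netdev 0 0\n' "$server" "$export_path" "$mount_point"
}
```

For example, nfs_fstab_entry files.example.internal /export/nuxeo-data prints a line that could be appended to /etc/fstab on the Nuxeo server, after which mount /var/lib/nuxeo/data (run as root) would attach the share.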

0 votes



Is this solution not the same as the one I mentioned above, where I said that my first solution was "to mount “var/lib/nuxeo/data” to a docker volume on the machine…"? To be specific, the machine where I mounted the docker volume was acting as the server. The problem I encountered with this solution is that not all the machines accessing the Nuxeo instance have the same file system, so they cannot access documents with attached files.

EDIT: I misunderstood your comment; I got it after the third read. Are you saying that we have to mount the directory containing the binaries on all the machines that access the binary store? Wouldn't that be a hassle for users?

10/25/2016

Users access Nuxeo as a server through HTTP; they don't need access to the filesystem. Only the Nuxeo server itself accesses its filesystem. I don't understand what architecture you have in mind.
10/25/2016

Basically, we are using Nuxeo on a global scale (sites on different continents). We have one central global database for all the sites that use Nuxeo on their respective platforms. This database contains all document-related data so that it is readily accessible to all users globally. Each site's platform runs its own Docker image of our configured Nuxeo, so that every site runs with exactly the same configuration. Here is where the problem comes in: because each site has its own platform running our pre-configured Nuxeo image, each site ends up with its own binary store, since we can't mount all the platforms' binary stores in a single docker volume.

What I'm trying to create is something like a central database for the binaries, so that they are readily available globally. I've searched for different solutions and came across S3 buckets as a central binary store. I actually have a question about that specific solution in this community, here: https://answers.nuxeo.com/general/q/d102486c5c914ec9917d15d640ebd17d/Is-it-possible-to-use-S3-Buckets-deployed-in-a-Eucalyptus-cloud-as-binary-stores

Anyway, I hope I explained it well. I understood your comment that it should not matter, since Nuxeo is the one accessing its own file system. The problem is that I keep trying to have one Nuxeo instance access another Nuxeo instance's file system to achieve that "globally accessible" state.

10/27/2016

Nuxeo is not designed to access the files of an unrelated Nuxeo instance. Only if the two instances are in cluster (therefore share the same database and binary files) are things going to work correctly.
10/28/2016
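
To make the cluster case concrete, here is a sketch of the nuxeo.conf settings that every node of one cluster would have in common. The property names repository.clustering.enabled and repository.binary.store should be checked against the Nuxeo clustering documentation for your version; the function name and the /mnt/nuxeo-binaries path are hypothetical.

```sh
# append_cluster_settings CONF_FILE SHARED_BINARY_DIR
# Append the clustering-related properties to CONF_FILE.
# Hypothetical helper; the property names are an assumption to verify for your Nuxeo version.
append_cluster_settings() {
    conf_file=$1
    shared_dir=$2
    {
        # Tell this node that it is part of a cluster.
        echo "repository.clustering.enabled=true"
        # Point the binary store at storage that all nodes can reach.
        echo "repository.binary.store=$shared_dir"
    } >> "$conf_file"
}
```

Running append_cluster_settings /etc/nuxeo/nuxeo.conf /mnt/nuxeo-binaries on each node would cover the shared binary store; each node would also need the same nuxeo.db.* settings so that all of them use one database.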


Hi,

See https://doc.nuxeo.com/display/NXDOC/VCS#VCS-BlobStorage

You can also choose to store blobs in the database but we don't recommend it. In our experience storing blobs in the database leads to a lot of problems:

  • Performance really drops,
  • Some JDBC drivers load blobs in the JVM memory,
  • Database backup/restore is very slow,
  • Database sync (Master/Slave) is very very slow too.

If, despite those recommendations, you still want to take this architectural option, you can use the dedicated plugin available at https://github.com/nuxeo/nuxeo-core-binarymanager-sql/.

There is no Nuxeo Package for that plugin at the moment, so you will have to manually deploy the JAR downloaded from https://maven.nuxeo.org/nexus/#nexus-search;gav~~nuxeo-core-binarymanager-sql~~~. You can use the “custom” template for that purpose.

0 votes