How can I import (index) documents into Nuxeo without moving or uploading them?


This might be a newbie question, but I cannot seem to find the solution by searching the docs, communities or questions. I also cannot work out how to do this on my own with the Nuxeo admin interface.

I have an absolutely massive document archive, at least by my own frame of reference, standing at 3000 gigabytes. I would like to see how Nuxeo can handle this mishmash of zip files, images, PDFs, HTML files, MS Office documents and LibreOffice files.

The structure of the file hierarchy and the physical location of the files cannot change, as they are used by other systems. Therefore I would like to index the files right where they are located.

Is this possible in Nuxeo? (I would think this is quite a common case for a lot of people.)

All the best – TJAF

0 votes

2 answers



AFAIK this is not possible with Nuxeo out of the box. Nuxeo expects to ingest the binary, drop it into its own file structure, and do the necessary metadata capture and indexing. You could override this default functionality with a custom component.

1 votes

Migrating a huge amount of data is something that needs care.

Nuxeo provides several tools for that:

  • nuxeo-platform-importer
  • Automation
  • Nuxeo shell

Clearly, doing that from a browser is not good practice. I think the best way is to do it from the server side. In your case, a direct SQL script may be the best option, with document indexing enabled after the import.

You can find documentation about that here.

0 votes

Sorry, I misread your question. You mean you want to use Nuxeo like Exalead or a Google Search Appliance. Bruce is right: this is not a standard way to use Nuxeo. If you really need that, you can look at Solr/Lucene, but you will need to build the UI yourself. Or pay a lot of money for one of the solutions above.
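To give an idea of the crawling part only, here is a minimal Python sketch of indexing files where they are: it walks the tree read-only and collects one metadata record per file. It does not use any Nuxeo or Solr API; the function name `scan_in_place` and the record fields are hypothetical, and feeding the records to a search engine (plus text extraction) is left out.

```python
import mimetypes
import os
from datetime import datetime, timezone


def scan_in_place(root):
    """Yield one metadata record per file under root.

    Files are only stat'ed, never opened for writing, moved or copied.
    The record layout is an assumption made for this sketch.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                # Unreadable or vanished file: skip it rather than abort the scan.
                continue
            # Guess the type from the extension only; the content is not read.
            mime, _encoding = mimetypes.guess_type(name)
            modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
            yield {
                "path": os.path.abspath(path),
                "size_bytes": info.st_size,
                "modified": modified.isoformat(),
                "mime_type": mime or "application/octet-stream",
            }
```

Each record keeps the original path as the only link to the file, so the other systems using the archive are not affected. With 3000 gigabytes you would probably want to process the records as a stream (as the generator does) rather than build one big list.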

If you want to do that through Nuxeo, there are some ways, but infrastructure code will be needed. Nuxeo can do it, but this is not on our roadmap at all today.