Best way to change document schemas

I need to make an change to a document schema after the fact. The more particular case is that I need to extract data that is common to 2 schemas and put it in its own schema.

I have only a vague idea how you can do that on a system that has already been put in production.

It would be meaningless to detail my particular problem so here is an example of what I'm trying to do.

Given 2 document types: DocumentA and DocumentB.

Both documents have their own schema SchemaA and SchemaB. The field FieldZ is common to both schemas is semantically the same in both document types.

I want to create another schema SchemaZ, which will include FieldZ, and which will be included in both DocumentA and DocumentB.

0 votes

2 answers

2200 views

ANSWER



Myself I would do the actual data migration step at the SQL level as it's way more efficient and you don't have to write Java or scripting code:

  • backup the SQL database,

  • create the new schema SchemaZ,

  • add the new schema SchemaZ to both document definitions DocumentA and DocumentB,

  • remove the old field FieldZ from the old schemas SchemaA and SchemaB,

  • start Nuxeo once so that the new table, schemaz, is created,

  • stop Nuxeo,

  • write SQL to copy the old field, something like:

    INSERT INTO schemaz (id, fieldz) SELECT id, fieldz FROM schemaa;
    INSERT INTO schemaz (id, fieldz) SELECT id, fieldz FROM schemab;
    
  • start Nuxeo,

  • check that things are ok with the new schema,

  • when you're really sure that all its ok, ALTER the old tables schemaa and schemab to remove the old fieldz column (you should get a WARN about it being an extra unused column at startup).

In any case, make sure you start with a backup.

1 votes



The general approach I would take…

 - Create the new schema
 - Add the new schema to both document definitions
 - BACKUP everything
 - Write and run a piece of code (with all users disconnected) to iterate over the entire set of "A" and "B" documents and copy the value of FieldZ to the new schema FieldZ
 - Delete the field from "A" and "B" original schemas
1 votes