Nuxeo Drive / PostgreSQL saturated

Hi,

We currently use the Nuxeo Drive client in my company (about 140 users). For performance reasons, we separated the different flows between Nuxeo DM and Nuxeo Drive (see diagram).

We have a problem with connections to our PostgreSQL database. Our instances are heavily loaded because they are flooded by the Nuxeo Drive client (refresh every 3 seconds).

Do you have any idea which settings I can change to avoid this saturation?

Thank you in advance

Example of our configuration (nuxeo-prod3) :

# Database configuration  
nuxeo.db.min-pool-size = 5 
nuxeo.db.max-pool-size = 60 
nuxeo.db.validationQuery = SELECT 1 
nuxeo.vcs.min-pool-size = 0 
nuxeo.vcs.max-pool-size = 58 
nuxeo.vcs.blocking-timeout-millis = 100  
nuxeo.vcs.idle-timeout-minutes = 10

I based this on the documentation here: http://doc.nuxeo.com/display/public/ADMINDOC/JDBC+Datasource+Configuration



ANSWER



Created https://jira.nuxeo.com/browse/NXDRIVE-32 for configuration of the delay in the server or through the GUI.

Though we will keep 5 seconds by default for now as it matches most of our use cases of people needing to have fresh information quickly.

As for the SQL errors, this looks like a bug in the Nuxeo code; it would need further investigation through a support issue…

0 votes



For example, what are the hardware specifications of your PostgreSQL server in Nuxeo?

How many simultaneous users do you have?

07/04/2014


Thank you for your feedback.

Indeed, Nuxeo Drive puts a very heavy request load on our servers. We wanted to isolate it so that it does not disturb our intranet portal, which is based on Nuxeo DM.

I attach two error logs from when our Nuxeo Drive server does not work properly.

I have just activated “pg_stat_statements” in PostgreSQL on our production server, so I can check which queries are being run.

Concerning the development of Nuxeo Drive, could the default delay be set to 60 seconds in the Windows and Mac OS bundles, and/or be made configurable in the GUI?

A second idea: as with the automatic update of the Nuxeo Drive client, the server could push the refresh delay to the Nuxeo Drive clients…

Thank you for your work

First Error : “Calling method prepareStatement, connection sharing started in transaction”

Caused by: javax.servlet.ServletException: org.nuxeo.ecm.core.api.ClientException: Failed to get document d79b2933-91c2-487a-b81f-e1c95c3afc67
    at org.nuxeo.ecm.platform.ui.web.download.DownloadServlet.resolveBlob(DownloadServlet.java:195)
    at org.nuxeo.ecm.platform.ui.web.download.DownloadServlet.handleDownloadSingleDocument(DownloadServlet.java:258)
    at org.nuxeo.ecm.platform.ui.web.download.DownloadServlet.doGet(DownloadServlet.java:101)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:621)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:728)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
    at org.nuxeo.wss.servlet.BaseWSSFilter.doFilter(BaseWSSFilter.java:137)
    at org.nuxeo.wss.servlet.FailSafeWSSFilter.doFilter(FailSafeWSSFilter.java:55)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
    at org.nuxeo.ecm.core.management.jtajca.internal.Log4jWebFilter.doFilter(Log4jWebFilter.java:64)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
    at org.nuxeo.ecm.platform.ui.web.rest.FancyURLFilter.doFilter(FancyURLFilter.java:129)
    ... 41 more
Caused by: org.nuxeo.ecm.core.api.ClientException: Failed to get document d79b2933-91c2-487a-b81f-e1c95c3afc67
    at org.nuxeo.ecm.core.api.AbstractSession.getDocument(AbstractSession.java:1254)
    at sun.reflect.GeneratedMethodAccessor71.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.nuxeo.ecm.core.api.TransactionalCoreSessionWrapper.invoke(TransactionalCoreSessionWrapper.java:136)
    at com.sun.proxy.$Proxy104.getDocument(Unknown Source)
    at org.nuxeo.ecm.platform.ui.web.download.DownloadServlet.resolveBlob(DownloadServlet.java:156)
    ... 55 more
Caused by: org.nuxeo.ecm.core.api.DocumentException: Failed to get document: d79b2933-91c2-487a-b81f-e1c95c3afc67
    at org.nuxeo.ecm.core.storage.sql.coremodel.SQLSession.getDocumentById(SQLSession.java:710)
    at org.nuxeo.ecm.core.storage.sql.coremodel.SQLSession.getDocumentByUUID(SQLSession.java:253)
    at org.nuxeo.ecm.core.api.DocumentResolver.resolveReference(DocumentResolver.java:61)
    at org.nuxeo.ecm.core.api.AbstractSession.resolveReference(AbstractSession.java:510)
    at org.nuxeo.ecm.core.api.AbstractSession.getDocument(AbstractSession.java:1250)
    ... 61 more
Caused by: org.nuxeo.ecm.core.storage.StorageException: Could not select: SELECT "id", "parentid", "pos", "name", "isproperty", "primarytype", "mixintypes", "ischeckedin", "baseversionid", "majorversion", "minorversion", "isversion" FROM "hierarchy" WHERE "id" IN (?)
    at org.nuxeo.ecm.core.storage.sql.jdbc.JDBCRowMapper.getSelectRows(JDBCRowMapper.java:457)
    at org.nuxeo.ecm.core.storage.sql.jdbc.JDBCRowMapper.readSimpleRows(JDBCRowMapper.java:255)
    at org.nuxeo.ecm.core.storage.sql.jdbc.JDBCRowMapper.read(JDBCRowMapper.java:221)
    at org.nuxeo.ecm.core.storage.sql.SoftRefCachingRowMapper.read(SoftRefCachingRowMapper.java:380)
    at org.nuxeo.ecm.core.storage.sql.PersistenceContext.getFromMapper(PersistenceContext.java:637)
    at org.nuxeo.ecm.core.storage.sql.PersistenceContext.getMulti(PersistenceContext.java:681)
    at org.nuxeo.ecm.core.storage.sql.SessionImpl.getNodesByIds(SessionImpl.java:660)
    at org.nuxeo.ecm.core.storage.sql.SessionImpl.getNodeById(SessionImpl.java:633)
    at org.nuxeo.ecm.core.storage.sql.SessionImpl.getNodeById(SessionImpl.java:649)
    at org.nuxeo.ecm.core.storage.sql.ra.ConnectionImpl.getNodeById(ConnectionImpl.java:194)
    at org.nuxeo.ecm.core.storage.sql.coremodel.SQLSession.getDocumentById(SQLSession.java:707)
    ... 65 more
Caused by: java.sql.SQLException: Calling method prepareStatement, connection sharing started in transaction org.apache.geronimo.transaction.manager.TransactionImpl@3dc2105 but it is now used in transaction org.apache.geronimo.transaction.manager.TransactionImpl@4fa92bc2
    at org.nuxeo.runtime.api.ConnectionHelper$ConnectionHandle.invoke(ConnectionHelper.java:182)
    at com.sun.proxy.$Proxy91.prepareStatement(Unknown Source)
    at org.nuxeo.ecm.core.storage.sql.jdbc.JDBCRowMapper.getSelectRows(JDBCRowMapper.java:379)
    ... 75 more

Second Error : “Connection PoolingConnection: null is closed.”

Caused by: org.nuxeo.ecm.core.api.ClientException: Failed to get lock info on 335da207-faac-4869-ad71-33bfbe419b96
    at org.nuxeo.ecm.core.api.AbstractSession.getLockInfo(AbstractSession.java:2994)
    at org.nuxeo.ecm.core.api.impl.DocumentModelImpl.getLockInfo(DocumentModelImpl.java:744)
    at org.nuxeo.ecm.core.api.impl.DocumentModelImpl.getLock(DocumentModelImpl.java:696)
    ... 259 more
Caused by: org.nuxeo.ecm.core.api.DocumentException: org.nuxeo.ecm.core.storage.StorageException: Could not select: SELECT "owner", "created" FROM "locks" WHERE "id" = ?
    at org.nuxeo.ecm.core.storage.sql.coremodel.SQLSession.getLock(SQLSession.java:969)
    at org.nuxeo.ecm.core.storage.sql.coremodel.SQLDocumentLive.getLock(SQLDocumentLive.java:415)
    at org.nuxeo.ecm.core.api.AbstractSession.getLockInfo(AbstractSession.java:2992)
    ... 261 more
Caused by: org.nuxeo.ecm.core.storage.StorageException: Could not select: SELECT "owner", "created" FROM "locks" WHERE "id" = ?
    at org.nuxeo.ecm.core.storage.sql.jdbc.JDBCRowMapper.getSelectRows(JDBCRowMapper.java:457)
    at org.nuxeo.ecm.core.storage.sql.jdbc.JDBCRowMapper.readSimpleRow(JDBCRowMapper.java:920)
    at org.nuxeo.ecm.core.storage.sql.jdbc.JDBCMapper.getLock(JDBCMapper.java:1214)
    at org.nuxeo.ecm.core.storage.sql.LockManager.getLock(LockManager.java:132)
    at org.nuxeo.ecm.core.storage.sql.SessionImpl.getLock(SessionImpl.java:1276)
    at org.nuxeo.ecm.core.storage.sql.ra.ConnectionImpl.getLock(ConnectionImpl.java:379)
    at org.nuxeo.ecm.core.storage.sql.coremodel.SQLSession.getLock(SQLSession.java:967)
    ... 263 more
Caused by: java.sql.SQLException: Connection PoolingConnection: null is closed.
    at org.apache.tomcat.dbcp.dbcp.DelegatingConnection.checkOpen(DelegatingConnection.java:398)
    at org.apache.tomcat.dbcp.dbcp.DelegatingConnection.prepareStatement(DelegatingConnection.java:279)
    at org.apache.tomcat.dbcp.dbcp.PoolingDataSource$PoolGuardConnectionWrapper.prepareStatement(PoolingDataSource.java:313)
    at org.nuxeo.ecm.core.storage.sql.jdbc.JDBCRowMapper.getSelectRows(JDBCRowMapper.java:379)
    ... 269 more
0 votes



Hi,

This is interesting, but I wonder why exactly you have a separate instance for the requests coming from Nuxeo Drive instead of load balancing all the requests across the 3 nodes. Have you run tests or benchmarks that support this configuration?

It may be useful to dedicate an instance to the load that Nuxeo Drive generates for upstream synchronization (content modified locally that needs to be updated on the server). The polling part, however (getting changes from the server to send the change summary to the client), mainly involves a query on the audit log table. Since there is only one database in your configuration, it does not matter that one Nuxeo instance handles Nuxeo Drive: all the queries end up in that database, which is indeed probably the bottleneck.

The request for changes (thus the query to the audit log) is done every 5 seconds by default by each client. You can change this parameter client side, by launching the ndrive program with the delay parameter, for instance setting it to 60 seconds:

ndrivew.exe --delay 60
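
To get a feel for what the delay changes, here is a rough back-of-the-envelope sketch in Python. It assumes that all 140 clients mentioned in the question are running at the same time and that each poll triggers one change-summary request on the server; both are simplifying assumptions, and the function name is made up for illustration.

```python
def polls_per_second(clients, delay_seconds):
    """Average number of change-summary requests per second.

    Assumes every client is online and polls exactly once per delay period.
    """
    return clients / delay_seconds


clients = 140  # figure quoted in the question; assumed to be all online

# Compare the default delay (5 s) with the suggested one (60 s).
for delay in (5, 60):
    rate = polls_per_second(clients, delay)
    print(f"delay={delay}s: about {rate:.1f} requests/s")
```

Under those assumptions, going from 5 to 60 seconds divides the polling load on the audit log table by 12.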

For the database configuration, you can also check the Configuring PostgreSQL page which is dedicated to PostgreSQL configuration and tuning for production. The “Reporting problems” section is interesting as it allows you to make a finer analysis of the problem: tables with a lot of entries, slow queries, memory usage, …
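
As a quick sanity check on the pool settings quoted in the question, the worst-case number of connections the Nuxeo nodes could open can be compared with the PostgreSQL max_connections setting. The sketch below assumes 3 nodes that all use the same settings as nuxeo-prod3, that the nuxeo.db and nuxeo.vcs pools are independent, and that max_connections is at the usual PostgreSQL default of 100; none of this is confirmed in the thread, so substitute your real values.

```python
def worst_case_connections(nodes, db_max_pool, vcs_max_pool):
    """Upper bound on connections if every pool on every node fills up."""
    return nodes * (db_max_pool + vcs_max_pool)


nodes = 3                 # assumption: three Nuxeo nodes, identical settings
db_max_pool = 60          # nuxeo.db.max-pool-size from the question
vcs_max_pool = 58         # nuxeo.vcs.max-pool-size from the question
pg_max_connections = 100  # assumption: usual PostgreSQL default

total = worst_case_connections(nodes, db_max_pool, vcs_max_pool)
print(f"worst case: {total} connections, max_connections: {pg_max_connections}")
if total > pg_max_connections:
    print("the pools can ask for more connections than PostgreSQL accepts")
```

If the worst case exceeds max_connections, either the pool sizes or the PostgreSQL limit (and the memory settings that go with it) need to be revisited.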

Hope this helps.

0 votes