Nuxeo fails to start when constrained by the connection pool
Hi
We're fixing a deadlock in our code that happens when the connection pool is exhausted. We've been bad at handling sessions, but we're now fixing them all.
To verify that the issue is really solved, we need to start Nuxeo with a minimal pool.
The problem is that if we set the following in the conf:
org.nuxeo.ecm.core.strictlazyloading=true
nuxeo.db.max-pool-size=1
nuxeo.vcs.max-pool-size=1
nuxeo.vcs.blocking-timeout-millis=1000
Nuxeo won't start properly.
It seems the startup code has its own set of deadlocking issues:
java.lang.RuntimeException: Failed to initialize repository 'default': org.nuxeo.ecm.core.storage.lock.LockException: org.nuxeo.ecm.core.storage.StorageException: Cannot connect to database
at org.nuxeo.ecm.core.repository.RepositoryService.initializeRepository(RepositoryService.java:152)
at org.nuxeo.ecm.core.repository.RepositoryService.applicationStarted(RepositoryService.java:104)
at org.nuxeo.runtime.model.impl.RegistrationInfoImpl.notifyApplicationStarted(RegistrationInfoImpl.java:325)
at org.nuxeo.runtime.osgi.OSGiRuntimeService.notifyComponentsOnStarted(OSGiRuntimeService.java:487)
at org.nuxeo.runtime.osgi.OSGiRuntimeService.fireApplicationStarted(OSGiRuntimeService.java:523)
at org.nuxeo.runtime.osgi.OSGiRuntimeService.frameworkEvent(OSGiRuntimeService.java:533)
at org.nuxeo.osgi.OSGiAdapter.fireFrameworkEvent(OSGiAdapter.java:232)
at org.nuxeo.osgi.application.loader.FrameworkLoader.doStart(FrameworkLoader.java:246)
at org.nuxeo.osgi.application.loader.FrameworkLoader.start(FrameworkLoader.java:126)
at org.nuxeo.runtime.deployment.NuxeoStarter.start(NuxeoStarter.java:118)
at org.nuxeo.runtime.deployment.NuxeoStarter.contextInitialized(NuxeoStarter.java:91)
at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4994)
at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5492)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:901)
at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:877)
at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:649)
at org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:672)
at org.apache.catalina.startup.HostConfig$DeployDescriptor.run(HostConfig.java:1861)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.nuxeo.ecm.core.api.ClientException: org.nuxeo.ecm.core.storage.lock.LockException: org.nuxeo.ecm.core.storage.StorageException: Cannot connect to database
at org.nuxeo.ecm.platform.routing.core.impl.DocumentRoutingServiceImpl.importRouteModel(DocumentRoutingServiceImpl.java:683)
at org.nuxeo.ecm.platform.routing.core.impl.DocumentRoutingServiceImpl.importAllRouteModels(DocumentRoutingServiceImpl.java:658)
at org.nuxeo.ecm.platform.routing.core.listener.RouteModelsInitializator.doInitializeRepository(RouteModelsInitializator.java:38)
at org.nuxeo.ecm.core.repository.RepositoryInitializationHandler.initializeRepository(RepositoryInitializationHandler.java:88)
at org.nuxeo.ecm.core.repository.RepositoryService$3.run(RepositoryService.java:148)
at org.nuxeo.ecm.core.api.UnrestrictedSessionRunner.runUnrestricted(UnrestrictedSessionRunner.java:139)
at org.nuxeo.ecm.core.repository.RepositoryService.initializeRepository(RepositoryService.java:150)
... 23 more
Caused by: org.nuxeo.ecm.core.storage.lock.LockException: org.nuxeo.ecm.core.storage.StorageException: Cannot connect to database
at org.nuxeo.ecm.core.storage.sql.VCSLockManager.removeLock(VCSLockManager.java:290)
at org.nuxeo.ecm.core.storage.sql.SessionImpl.removeLock(SessionImpl.java:1223)
at org.nuxeo.ecm.core.storage.sql.PersistenceContext.removeNode(PersistenceContext.java:850)
at org.nuxeo.ecm.core.storage.sql.SessionImpl.removeNode(SessionImpl.java:1022)
at org.nuxeo.ecm.core.storage.sql.ra.ConnectionImpl.removeNode(ConnectionImpl.java:241)
at org.nuxeo.ecm.core.storage.sql.coremodel.SQLSession.remove(SQLSession.java:749)
at org.nuxeo.ecm.core.storage.sql.coremodel.SQLDocumentLive.remove(SQLDocumentLive.java:156)
at org.nuxeo.ecm.core.api.AbstractSession.removeNotifyOneDoc(AbstractSession.java:1451)
at org.nuxeo.ecm.core.api.AbstractSession.removeDocument(AbstractSession.java:1416)
at org.nuxeo.ecm.core.api.AbstractSession.removeDocument(AbstractSession.java:1405)
at org.nuxeo.ecm.platform.routing.core.persistence.RouteModelsZipImporter.create(RouteModelsZipImporter.java:73)
at org.nuxeo.ecm.platform.filemanager.service.FileManagerService.createDocumentFromBlob(FileManagerService.java:223)
at org.nuxeo.ecm.platform.routing.core.impl.DocumentRoutingServiceImpl.importRouteModel(DocumentRoutingServiceImpl.java:671)
... 29 more
Caused by: org.nuxeo.ecm.core.storage.StorageException: Cannot connect to database
at org.nuxeo.ecm.core.storage.sql.jdbc.JDBCConnection.openConnections(JDBCConnection.java:144)
at org.nuxeo.ecm.core.storage.sql.jdbc.JDBCMapper.connect(JDBCMapper.java:1353)
at org.nuxeo.ecm.core.storage.sql.jdbc.JDBCMapperConnector.invoke(JDBCMapperConnector.java:59)
at com.sun.proxy.$Proxy50.removeLock(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.nuxeo.ecm.core.storage.sql.jdbc.JDBCMapperTxSuspender.doInvoke(JDBCMapperTxSuspender.java:23)
at org.nuxeo.ecm.core.storage.sql.jdbc.JDBCMapperTxSuspender.invoke(JDBCMapperTxSuspender.java:33)
at com.sun.proxy.$Proxy50.removeLock(Unknown Source)
at org.nuxeo.ecm.core.storage.sql.VCSLockManager.removeLock(VCSLockManager.java:283)
... 41 more
Caused by: java.sql.SQLException
at org.tranql.connector.jdbc.TranqlDataSource.getConnection(TranqlDataSource.java:67)
at org.nuxeo.runtime.datasource.geronimo.PooledDataSourceFactory$DataSource.getConnection(PooledDataSourceFactory.java:55)
at org.nuxeo.runtime.datasource.ConnectionHelper.getConnection(ConnectionHelper.java:807)
at org.nuxeo.ecm.core.storage.sql.jdbc.JDBCConnection.openBaseConnection(JDBCConnection.java:152)
at org.nuxeo.ecm.core.storage.sql.jdbc.JDBCConnection.openConnections(JDBCConnection.java:140)
... 52 more
Caused by: javax.resource.ResourceException: No ManagedConnections available within configured blocking timeout ( 10000 [ms] ) for pool org.apache.geronimo.connector.outbound.SinglePoolMatchAllConnectionInterceptor@dc9bed0
at org.apache.geronimo.connector.outbound.AbstractSinglePoolConnectionInterceptor.getConnection(AbstractSinglePoolConnectionInterceptor.java:86)
at org.apache.geronimo.connector.outbound.TransactionEnlistingInterceptor.getConnection(TransactionEnlistingInterceptor.java:49)
at org.apache.geronimo.connector.outbound.TransactionCachingInterceptor.getConnection(TransactionCachingInterceptor.java:109)
at org.apache.geronimo.connector.outbound.ConnectionHandleInterceptor.getConnection(ConnectionHandleInterceptor.java:43)
at org.apache.geronimo.connector.outbound.TCCLInterceptor.getConnection(TCCLInterceptor.java:39)
at org.apache.geronimo.connector.outbound.ConnectionTrackingInterceptor.getConnection(ConnectionTrackingInterceptor.java:66)
at org.apache.geronimo.connector.outbound.AbstractConnectionManager.allocateConnection(AbstractConnectionManager.java:77)
at org.nuxeo.runtime.jtajca.NuxeoContainer$ConnectionManagerWrapper.allocateConnection(NuxeoContainer.java:838)
at org.tranql.connector.jdbc.TranqlDataSource.getConnection(TranqlDataSource.java:62)
... 56 more
What can we do? Can the pool size be changed after startup?
max-pool-size=1 is too small. Nuxeo needs one fixed connection for the lock manager, in addition to the connections needed for repository access. Try at least 2. In cluster mode, add one more, because a connection is also needed for the cluster node handler (aka cluster invalidator).
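As a rough sketch of that advice, the test configuration from the question could be adjusted as shown here. It reuses the property names from the question and assumes a non-clustered instance. Raising both pools to 2 is an assumption on my part: the answer does not say which of the two pools the lock manager draws from, so setting both is the conservative choice.

```properties
# Minimal pool for deadlock testing (sketch, non-clustered instance assumed).
# One connection is held permanently by the lock manager,
# which leaves one for repository access.
org.nuxeo.ecm.core.strictlazyloading=true
nuxeo.db.max-pool-size=2
nuxeo.vcs.max-pool-size=2
nuxeo.vcs.blocking-timeout-millis=1000
```

In cluster mode the same reasoning would give 3 instead of 2, to leave room for the cluster node handler's connection.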