Execution Engine Stopped

Problem

Jobs cannot be submitted to Semarchy xDM because the Execution Engine has stopped.

The PDE log shows errors like this:

PDE Error log snippet
!ENTRY com.semarchy.mdm.backend 4 0 2018-12-30 00:00:15.876
!MESSAGE Unexpected error during purge (Message reported by logger com.semarchy.mdm.integration.purge.PurgeServiceImpl)
!STACK 0
java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.springframework.jdbc.CannotGetJdbcConnectionException: Failed to obtain JDBC Connection; nested exception is org.apache.tomcat.jdbc.pool.PoolExhaustedException: [pool-428-thread-1] Timeout: Pool empty. Unable to fetch a connection in 15 seconds, none available[size:8; busy:8; idle:0; lastwait:15000].
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at com.semarchy.mdm.integration.purge.PurgeServiceImpl.purgeDataLocation(PurgeServiceImpl.java:417)
at com.semarchy.mdm.integration.purge.PurgeSchedulerImpl.execute(PurgeSchedulerImpl.java:108)
at com.semarchy.mdm.integration.purge.PurgeSchedulerImpl$SemarchyPurgeJob.execute(PurgeSchedulerImpl.java:170)
at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573)
Caused by: java.lang.RuntimeException: org.springframework.jdbc.CannotGetJdbcConnectionException: Failed to obtain JDBC Connection; nested exception is org.apache.tomcat.jdbc.pool.PoolExhaustedException: [pool-428-thread-1] Timeout: Pool empty. Unable to fetch a connection in 15 seconds, none available[size:8; busy:8; idle:0; lastwait:15000].
at com.semarchy.platform.engine.core.impl.DefaultStandaloneEngineImpl$SubmitJobDefinition.call(DefaultStandaloneEngineImpl.java:359)
at com.semarchy.platform.engine.core.impl.DefaultStandaloneEngineImpl$SubmitJobDefinition.call(DefaultStandaloneEngineImpl.java:1)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.springframework.jdbc.CannotGetJdbcConnectionException: Failed to obtain JDBC Connection; nested exception is org.apache.tomcat.jdbc.pool.PoolExhaustedException: [pool-428-thread-1] Timeout: Pool empty. Unable to fetch a connection in 15 seconds, none available[size:8; busy:8; idle:0; lastwait:15000].
at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:81)
at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:600)
at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:657)
at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:682)
at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:736)
at org.springframework.jdbc.core.namedparam.NamedParameterJdbcTemplate.query(NamedParameterJdbcTemplate.java:211)
at org.springframework.jdbc.core.namedparam.NamedParameterJdbcTemplate.query(NamedParameterJdbcTemplate.java:218)
at com.semarchy.commons.mda.persistence.bulkloader.ObjInstBulkLoader.loadLevelData(ObjInstBulkLoader.java:126)
at com.semarchy.commons.mda.persistence.bulkloader.ObjInstBulkLoader.loadOwnershipTree(ObjInstBulkLoader.java:91)
at com.semarchy.commons.mda.persistence.bulkloader.ObjInstBulkLoader.loadOwnershipTree(ObjInstBulkLoader.java:100)
at com.semarchy.commons.mda.persistence.bulkloader.ObjInstBulkLoader.loadOwnershipTree(ObjInstBulkLoader.java:100)
at com.semarchy.commons.mda.persistence.bulkloader.ObjInstBulkLoader.loadOwnershipTree(ObjInstBulkLoader.java:60)
at com.semarchy.commons.mda.persistence.bulkloader.ObjInstBulkLoader.loadOwnershipTree(ObjInstBulkLoader.java:52)
at com.semarchy.platform.repository.job.adapter.JobInstanceAdapter.bulkLoadedDataProvider(JobInstanceAdapter.java:96)
at com.semarchy.platform.repository.job.adapter.JobInstanceAdapter.bulkLoadClass(JobInstanceAdapter.java:78)
at com.semarchy.platform.repository.job.adapter.JobInstanceAdapter.getJobDefinition(JobInstanceAdapter.java:49)
at com.semarchy.platform.engine.core.impl.DefaultStandaloneEngineImpl$SubmitJobDefinition.call(DefaultStandaloneEngineImpl.java:345)
... 5 more
Caused by: org.apache.tomcat.jdbc.pool.PoolExhaustedException: [pool-428-thread-1] Timeout: Pool empty. Unable to fetch a connection in 15 seconds, none available[size:8; busy:8; idle:0; lastwait:15000].
at org.apache.tomcat.jdbc.pool.ConnectionPool.borrowConnection(ConnectionPool.java:701)
at org.apache.tomcat.jdbc.pool.ConnectionPool.getConnection(ConnectionPool.java:198)
at org.apache.tomcat.jdbc.pool.DataSourceProxy.getConnection(DataSourceProxy.java:127)
at com.semarchy.commons.sql.proxy.ProxyDataSource.getConnection(ProxyDataSource.java:61)
at com.semarchy.commons.sql.proxy.AutoCommitDataSourceProxy.getConnection(AutoCommitDataSourceProxy.java:22)
at com.semarchy.commons.sql.meta.DataSourceWithMeta.getConnection(DataSourceWithMeta.java:41)
at org.springframework.jdbc.datasource.DataSourceUtils.fetchConnection(DataSourceUtils.java:151)
at org.springframework.jdbc.datasource.DataSourceUtils.doGetConnection(DataSourceUtils.java:115)
at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:78)
... 21 more



# Also in the log, you'll see errors like this:



!ENTRY com.semarchy.mdm.backend 4 0 2019-01-01 17:37:00.104
!MESSAGE Unexpected error during purge (Message reported by logger com.semarchy.mdm.integration.purge.PurgeServiceImpl)
!STACK 0
java.lang.RuntimeException: The engine is currently stopped
at com.semarchy.platform.engine.core.impl.DefaultStandaloneEngineImpl.submit(DefaultStandaloneEngineImpl.java:213)
at com.semarchy.mdm.integration.purge.PurgeServiceImpl.purgeDataLocation(PurgeServiceImpl.java:417)
at com.semarchy.mdm.integration.purge.PurgeSchedulerImpl.execute(PurgeSchedulerImpl.java:108)
at com.semarchy.mdm.integration.purge.PurgeSchedulerImpl$SemarchyPurgeJob.execute(PurgeSchedulerImpl.java:170)
at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573)

!ENTRY org.quartz-scheduler.quartz 4 0 2019-01-01 17:37:00.105
!MESSAGE Job DEFAULT.6da64b5bd2ee-b18ba688-a162-4015-b6a6-ff1c9b3c053c threw an unhandled Exception: (Message reported by logger org.quartz.core.JobRunShell)
!STACK 0
java.lang.RuntimeException: java.lang.RuntimeException: The engine is currently stopped
at com.semarchy.mdm.integration.purge.PurgeServiceImpl.purgeDataLocation(PurgeServiceImpl.java:444)
at com.semarchy.mdm.integration.purge.PurgeSchedulerImpl.execute(PurgeSchedulerImpl.java:108)
at com.semarchy.mdm.integration.purge.PurgeSchedulerImpl$SemarchyPurgeJob.execute(PurgeSchedulerImpl.java:170)
at org.quartz.core.JobRunShell.run(JobRunShell.java:202)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573)
Caused by: java.lang.RuntimeException: The engine is currently stopped
at com.semarchy.platform.engine.core.impl.DefaultStandaloneEngineImpl.submit(DefaultStandaloneEngineImpl.java:213)
at com.semarchy.mdm.integration.purge.PurgeServiceImpl.purgeDataLocation(PurgeServiceImpl.java:417)
... 4 more




Background

This is most often caused when a Semarchy instance has multiple data locations, and all of the data locations are set to purge their old data simultaneously. If a few connections are used by users on the systems and a few are used by integration jobs, then the scheduled purges can find that there are no available connections to connects and delete old data. The default connection pool size is just 8 connections.

Solution

Go into the Workbench and review the purge schedule.

In this example, the purge schedule was created when the system was upgraded to xDM 4.3. By default it is scheduled to run at exactly midnight on Sunday (based on the timezone of the application server). In this screenshot the schedule has been updated to run at 10 minutes past midnight so that it does not run simultaneously with the purges scheduled for other data locations.

If your purge schedules are all the same, then change them. Use something along these lines to run at 10 minute intervals:
0 0 0 ? * SUN
0 10 0 ? * SUN
0 20 0 ? * SUN
0 30 0 ? * SUN

Alternate Solution

Update semarchy.xml to allow more active connections. You may find that staggering the purge schedule fully solves your problem. If so, then there is no need to update semarchy.xml. But if you find that you are still exhausting your supply of available connections, then you should increase the maxActive connections.

The default connection sample for a data location looks like this:

semarchy.xml snippet
        <Resource name="jdbc/SEMARCHY_CUSTOMER_B2C_MDM" auth="Container"
                factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" type="javax.sql.DataSource"
                driverClassName="org.postgresql.Driver" url="jdbc:postgresql://localhost:5432/postgres"
                username="semarchy_customer_b2c_mdm" password="semarchy_customer_b2c_mdm"
                maxActive="8" maxIdle="8" minIdle="0" maxWait="15000" initialSize="1"
                defaultAutoCommit="false" rollbackOnReturn="true" validationQuery="select 1"
                testOnBorrow="true" timeBetweenEvictionRunsMillis="45000"
                validationInterval="60000" />

If you have arrived at this article, you have probably encountered "pool empty" problems. In that case, increase maxActive from "8" to "32" as a good starting point.

The choices here are not one-size-fits-all. You'll need analysis of your expected usage along with experimentation to determine the best parameters for your needs.