Lately in our production environment we saw that some of our services failed when they were called in rapid succession. Under normal load we never had issues, but under heavier load we saw the following error in our BPEL instances:
Failure in Oracle WSM Agent, category= security, function=agent.function.client, stage=fault due to RuntimeException.
java.lang.IllegalArgumentException: Context of the same RID, 0:1:4 already present in this family, f42da30f-5c89-49d9-a4fd-bd2b7e80644e-00154a3f.
It also seemed that those faults were creating stuck threads and, after a while, making our managed servers unstable and unreachable. After some digging around we found that there was a patch for it. See Sporadic error during invocation of a SOA composite, ‘Context of the same RID, xxxxxx already present in this family’ (Doc ID 2150992.1). It has been available since June 2016, but it seems it was never included in a bundle patch. After applying the patch the errors were replaced by warnings:
Sep 13, 2017 11:48:22 AM CEST Warning oracle.dms.context BEA-000000 Duplicate RID detected. Multiple tasks are attempting to execute with the same exec-ctx identifier 9f31c77a-4bee-4f57-a970-9275a6577bcb-00030ae5,0:1:4. Will use the unrooted identifier 9f31c77a-4bee-4f57-a970-9275a6577bcb-00030ae5,1:32171 for this task.
and we had no more stuck threads and all our instances were running fine again.
After installing bundle patch 126.96.36.199.5 we noticed that the flow instance titles of our instances in SOA were flaky: sometimes they appeared, and sometimes they didn’t. After some testing and going back and forth with Oracle support, we were not able to reproduce the bug consistently. We did find a workaround, though: forcing a dehydration. Not the best option in my opinion, but it works. So if you have this issue, just add a dehydrate to the BPEL process and magically see your flow instance titles come back to life again.
When this bug is going to be resolved is still unknown.
Recently we had a timing issue in our project. We were processing certain events, and one event got processed before another, which caused a problem. A quick fix seemed possible: add a Wait activity to the BPEL process that was running too quickly. We added the Wait and set it to 2 seconds. We deployed it and ran our unit test again, but the process seemed to ignore the 2-second Wait activity even though it showed up in the Enterprise Manager trace. We then had a closer look at the Oracle documentation, and the Wait activity turned out to have some special rules.
When specifying a time period for waiting, note the following:
- Wait times cannot be guaranteed if they are scheduled with other events that require processing. Due to this additional processing, the actual wait time can be greater than the wait time specified in the BPEL process.
- Wait times of less than two seconds are ignored by the server. Wait times above two seconds, but less than one minute, may not get executed in the exact, specified time. However, wait times in minutes do execute in the specified time.
- The default value of 2 seconds for wait times is specified with the MinBPELWait property in the System MBean Browser of Oracle Enterprise Manager Fusion Middleware Control Console. You can set this property to any value and the wait delay is bypassed for any waits less than MinBPELWait.
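The threshold rule quoted above can be sketched in a few lines of Java. This is only an illustration of the documented wording, not engine code: the class `WaitRule` and the method `isBypassed` are hypothetical names, and the strict "less than" comparison is taken from the documentation text. How the server treats a wait of exactly MinBPELWait is not something this sketch can settle.

```java
public class WaitRule {

    // Documented default of the MinBPELWait property, expressed in milliseconds.
    static final long DEFAULT_MIN_BPEL_WAIT_MS = 2_000;

    // Hypothetical helper: true when, per the documentation wording,
    // a wait shorter than MinBPELWait would be bypassed by the server.
    static boolean isBypassed(long requestedMs, long minBpelWaitMs) {
        return requestedMs < minBpelWaitMs;
    }

    public static void main(String[] args) {
        long[] requestedWaits = {1_000, 2_000, 3_000, 60_000};
        for (long ms : requestedWaits) {
            System.out.println(ms + " ms wait bypassed: "
                    + isBypassed(ms, DEFAULT_MIN_BPEL_WAIT_MS));
        }
    }
}
```

Even when a wait is not bypassed, the documentation still warns that waits under one minute may not run for exactly the specified time.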
So the 2-second wait didn’t work, which matched what the documentation pointed out. Next we tried setting it to 3 seconds. After deploying the composite again and running the unit tests, we ran into a timeout error.
[2012-11-20T02:12:38.781-04:30] [soa_server1] [TRACE]  [oracle.soa.bpel.engine.delivery] [tid: [ACTIVE].ExecuteThread: '19' for
queue: 'weblogic.kernel.Default (self-tuning)'] [userId: <anonymous>] [ecid: b1ee8223c4e185ca:5044d24d:13b1807f7e8:-8000-
0000000000000672,0:2] [SRC_CLASS: DeliveryHandler] [WEBSERVICE_PORT.name: BPELProcess1_pt] [APP: soa-infra] [composite_name:
WaitTestProject3] [J2EE_MODULE.name: fabric] [SRC_METHOD: initialRequestAnyType] [WEBSERVICE.name: bpelprocess1_client_ep]
[J2EE_APP.name: soa-infra] [[
com.oracle.bpel.client.delivery.ReceiveTimeOutException: Waiting for response has timed out. The conversation id is null. Please
check the process instance for detail.
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
After some looking around, a colleague of mine came up with a reference to Oracle support document 1507094.1. This document states that the Wait activity works well in an asynchronous BPEL process, but not in a synchronous BPEL process where a transaction is required. The Wait activity involves a dehydration: the process is persisted and then continues in a new thread. When the BPEL process requires a transaction, the persist is not complete until the BPEL process completes and the transaction commits.
One of the solutions is possible:
We were trying to connect from BPEL to an HTTPS service, but we ran into SSL problems. After checking all the keystores and their locations, the server still didn’t seem to pick up our keystore.
The error we found in the log looked like this:
Caused by: javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
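A PKIX path building failure generally means the JVM could not chain the server certificate to anything in the trust store it actually loaded. As a rough diagnostic, the sketch below prints which trust store the default JSSE lookup would use and lists its trusted certificate entries. Several things here are assumptions: the class name `TrustStoreCheck` is hypothetical, the password falls back to the JDK default `changeit`, the store is assumed to be readable with the JVM's default keystore type, and it only reflects the standard `javax.net.ssl.trustStore` lookup. The application server may be configured with its own keystores, which this check does not see.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.KeyStore;
import java.util.Collections;

public class TrustStoreCheck {

    // Mirrors the default JSSE lookup order: the javax.net.ssl.trustStore
    // property first, then jssecacerts, then cacerts under lib/security.
    static Path resolveTrustStore(String trustStoreProperty, String javaHome) {
        if (trustStoreProperty != null) {
            return Paths.get(trustStoreProperty);
        }
        Path securityDir = Paths.get(javaHome, "lib", "security");
        Path jssecacerts = securityDir.resolve("jssecacerts");
        return Files.exists(jssecacerts) ? jssecacerts : securityDir.resolve("cacerts");
    }

    public static void main(String[] args) throws Exception {
        Path file = resolveTrustStore(System.getProperty("javax.net.ssl.trustStore"),
                                      System.getProperty("java.home"));
        System.out.println("Trust store picked by default lookup: " + file);

        // Assumption: password given as first argument, else the JDK default.
        char[] password = (args.length > 0 ? args[0] : "changeit").toCharArray();

        KeyStore store = KeyStore.getInstance(KeyStore.getDefaultType());
        try (InputStream in = Files.newInputStream(file)) {
            store.load(in, password);
        }

        // Only trusted certificate entries matter for validating a remote server.
        for (String alias : Collections.list(store.aliases())) {
            if (store.isCertificateEntry(alias)) {
                System.out.println("trusted: " + alias);
            }
        }
    }
}
```

Running it with the same `-Djavax.net.ssl.trustStore=...` argument as the server shows whether that JVM setting points at the keystore you expect. If the certificate of the HTTPS service (or its issuing CA) is not among the listed aliases, that would be consistent with the error above.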