Salem's Euphoria

Sharing Experience

IBM reported process hang for long batch processes


I was looking through the WAS (WebSphere Application Server) logs when I saw this warning:

WSVR0605W: Thread “WebContainer : 2” (0000005a) has been active for 7305026 milliseconds and may be hung.  There is/are 1 thread(s) in total in the server that may be hung.

I dug deeper into the stack trace and found that this particular thread runs a batch function (a batch cleaning job) that may work for 24 hours without returning. Well, WebSphere has a thread monitoring policy that considers a thread hung once it has been active for longer than a predefined threshold (600 seconds by default). If the thread finishes its job after this threshold, WAS reports a false alarm as a kind of apology, and if such false alarms occur many times (100), it increases the threshold by a factor of 1.5.
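The adjustment rule described above can be sketched as a small model. This only illustrates the arithmetic and is not WebSphere's code: the class and method names are hypothetical, and resetting the false-alarm counter after each increase is an assumption.

```java
// Hypothetical model of the adaptive hang threshold described in the text.
// Not WebSphere code; the names and the counter reset are assumptions.
class HangThresholdModel {
    private double thresholdSeconds = 600;   // default threshold from the text
    private final int falseAlarmLimit = 100; // false alarms before an increase
    private int falseAlarms = 0;

    // Call when a thread that was reported as hung finishes after all.
    void recordFalseAlarm() {
        falseAlarms++;
        if (falseAlarms >= falseAlarmLimit) {
            thresholdSeconds *= 1.5;
            falseAlarms = 0; // assumption: counting starts over
        }
    }

    double thresholdSeconds() {
        return thresholdSeconds;
    }

    // Threshold after n false alarms, starting from the defaults.
    static double thresholdAfter(int n) {
        HangThresholdModel model = new HangThresholdModel();
        for (int i = 0; i < n; i++) {
            model.recordFalseAlarm();
        }
        return model.thresholdSeconds();
    }

    public static void main(String[] args) {
        System.out.println(thresholdAfter(99));  // still the default
        System.out.println(thresholdAfter(100)); // first increase
        System.out.println(thresholdAfter(200)); // second increase
    }
}
```

Under this model the threshold goes 600, 900, 1350 seconds and so on, so covering a 5-hour job (18000 seconds) would take 9 increases, that is 900 false alarms.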

Hmmm… I don’t see this behavior in the log files, although the process completed after 5 hours… Moreover, the application reported an error (channel call failure).

A quick way to avoid this warning is to disable thread hang monitoring by setting the monitor's interval property to zero or less. This may not be a good choice for many applications, as it may hide bigger problems.

The safer way is to calculate the maximum time taken by any process in your application and set the threshold to a suitable value above it. In my application that will never be possible, as I’m not the full owner of the database, and I found that the billing system sometimes takes up all available access to the database for many days.
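As a rough sketch of that calculation, assuming you have collected run times for your longest jobs: take the maximum and add a safety margin. The class name, the sample durations and the 20% margin are made up for illustration. As far as I know, the resulting value would go into the com.ibm.websphere.threadmonitor.threshold custom property (in seconds), but check the IBM documentation for your WAS version.

```java
import java.time.Duration;
import java.util.List;

// Hypothetical helper, not part of any WebSphere API.
class ThresholdEstimator {
    // Longest observed duration plus marginPercent, in whole seconds (rounded up).
    static long suggestThresholdSeconds(List<Duration> observed, int marginPercent) {
        if (observed.isEmpty()) {
            throw new IllegalArgumentException("no observed durations");
        }
        long maxMillis = observed.stream()
                .mapToLong(Duration::toMillis)
                .max()
                .getAsLong();
        // Integer arithmetic: scale by (100 + margin) percent, then
        // ceil-divide by 100 (percent) * 1000 (millis per second).
        long scaled = maxMillis * (100 + marginPercent);
        return (scaled + 99_999) / 100_000;
    }

    public static void main(String[] args) {
        List<Duration> runs = List.of(
                Duration.ofMillis(7_305_026), // active time from the warning
                Duration.ofHours(5));         // the run that completed after 5 hours
        System.out.println(suggestThresholdSeconds(runs, 20));
    }
}
```

This only helps when the longest run time is bounded and known; it does not cover the case where another system blocks the database for days.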


Annestou ….



Author: Salem Ben Afia

Big Data & Java developer Search Engine Architect, Lucene Expert
