Incorrect finishing of tim-jboss-4.2

Type: Bug
Status: Closed
Priority: 1 Critical
Resolution: Fixed
Component/s: Integration Modules
Labels:

Assignee: kkannaiy
Reporter: imscooter
Created: September 01, 2009
Votes: 0
Watchers: 3
Updated:
February 12, 2013
Resolved:
November 23, 2009

Attachments

imscooter (11.00 k) application/zip logs.zip
imscooter (11.00 k) application/zip shared-timer.zip

Description

tim-jboss-4.2 неправильно заканчивает свою работу. Когда JBoss заканчивает свою работу, он называет System.exit (), и ‘shutdownHooks’ начинает выполнение. Терракота shutdownHook заканчивает RemoteTransactionManagerImpl.class, и JBoss shutdownHook неразвертывает приложения (EJB/WEB). Если RemoteTransactionManagerImpl прибывает, чтобы закончиться в то время как приложение EJB/WEB staies в терракотовом разделе синхронизации, затем зайти в тупик, случается. API Событий Кластера не полезны в этом случае, потому что уведомление о завершении Terracota является асинхронным с Терракотовой тематикой завершения.

Я думаю, что терракота shutdownHook не должна закончиться, в то время как некоторая тематика блокирует или ждет разделенный объект.

В тесте вложения приложение сети использовало разделенный терракотой объект и некоторые файлы регистрации. Тупик иногда случается в то время как остановка JBoss.

Comments

Toha Bakanovsky 2009-09-01

tim-jboss-4.2 incorrectly finishes its work. When JBoss finishes its work it calls System.exit(), and ‘shutdownHooks’ begins execution. Terracotta shutdownHook finishes RemoteTransactionManagerImpl.class, and JBoss shutdownHook undeploys (EJB/WEB) applications. If RemoteTransactionManagerImpl comes to end while EJB/WEB application staies in terracotta synchronization section, then deadlock happens. Cluster Events API are not useful in this case, because notification about Terracota shutdown are asynchronous with Terracotta shutdown-thread.

I think that terracotta shutdownHook should not finish while some thread locks or waits shared object.

In attachment test web-application used terracotta-shared object and some logs. Deadlock sometimes happens while stop JBoss.

Toha Bakanovsky 2009-09-03

I reviewed the com.tc.object.tx.RemoteTransactionManagerImpl.commit(final ClientTransaction txn) method. And I think some mistakes was made.

So original end of commit() method looks like this:

synchronized (this.lock)  {
    if (isStoppingOrStopped())   {
        // Send now if stop is requested
        sendBatches(true, "commit() : Stop initiated.");
    }
        waitUntilRunning();
        sendBatches(false);
}

I see no reason for waitUntilRunning() in STOPPED or STOP-INITIATED state. Transactions would be sent immediately and PAUSED state be ignored. Thus this code fragment would look like this:

synchronized (this.lock)  {
    if (isStoppingOrStopped())   {
        // Send now if stop is requested
        sendBatches(true, "commit() : Stop initiated.");
    } else {
        waitUntilRunning();
        sendBatches(false);
    }
}

Attached test works correctly by means this correction. Have anyone an ideas about this?

Saravanan Subbiah 2009-09-03

Seems like a reasonable change to make.

Toha Bakanovsky 2009-09-04

I suppose it will be a good idea to notify client listener immediately about Terracotta finishing its work. The most convenient place for this notification is a com.tc.object.ClientShutdownManager#shutdown() method, but DsoCluster interface are not accessible at this method, so the easiest way to notify client is to add client notification in com.tc.object.handshakemanager.ClientHandshakeManagerImpl#shutdown().

Client listener will be able to block Terracotta shutdownHoook and to finish Terracotta using.

So com.tc.object.handshakemanager.ClientHandshakeManagerImpl#shutdown() methiod can look like that:

public void shutdown() { //Toha: first of all notify client() about Terracotta finishing!!! dsoCluster.fireOperationsDisabled();

isShutdown = true;
shutdownCallbacks();   \}

Geert Bevin 2009-09-04

The RemoteTransactionManagerImpl change seems fine by me. However, about the cluster events, this needs some careful investigation to make sure that the right state transitions are still being honored. We still have to make sure that the event isn’t triggered a second time afterwards and that it happens in the right order. Also, we need to consider what happens during the execution of the cluster events. Currently they are implemented by leveraging the SEDA stages in threads in our architecture to prevent user code in cluster listener methods to intervene with critical infrastructure logic. Putting this in the shutdown() method would potentially open up a can of worms during cluster event processing that prevents the proper handling of the shutdown logic.

Toha Bakanovsky 2009-09-04

I have done an mistake in my previous message. We could use com.tc.object.ClientShutdownManager#execute(boolean fromShutdownHook) method. In this method we can clearly detect where method was called from. If method are called from ‘shutdownHook’ we can notify client. I have seen two Terracottas shutdownHooks - “CommonShutDownHook” and shutdownHook from com.tc.object.bytecode.ManagerImpl class.

‘CommonShutDownHook’ just closes com.tc.object.net.DSOClientMessageChannel if reconnection are enabled (I have noted no useful activity else). And com.tc.object.ClientShutdownManager#execute(boolean fromShutdownHook) are called first from ManagerImpl shutdownHook. Terracota seems to be fully operational in this moment.

Abhishek Singh 2009-11-23

Fixed in rev-14046 (trunk) and rev-14047 (tc-3.1)

Added else block so that RemoteTrandactionManagerImpl does no do waitUntilRunning() when in STOPPED or STOP_INITIALIZED state.

Kalai Kannaiyan 2009-12-08

Verified the fix on trunk, 3.2 and 3.1