Cancellation Issues

Cancellation is a serious problem in multi-threaded environments. In short, a well-built program does not use asynchronous cancellation; that is, it does not interrupt another thread about whose state almost nothing is known. Doing so may have disastrous results, such as interrupting the target thread while it is holding a mutex, which blocks other threads forever. The problems may also be more subtle: interrupting an application thread while it holds heap memory causes memory leaks that will probably go unnoticed until a later production phase, when heavy use exposes the memory consumption.

How can this problem be solved? The best solution is probably never to call the stop() method of the Thread class unless you are about to quit your application. If you want to tell another thread that it is time to quit, set a flag that the target thread will read at some point; this is called "kind cancellation". To implement it, you must make sure that your threads never block forever: use timeouts, callbacks, and anything else that ensures each thread is woken up sooner or later, so that when another thread wants to cancel it, the target thread has a chance to read the cancellation request and then quit.
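
The flag-based scheme described above can be sketched with the standard C++ library alone. This is a minimal illustration, not Wefts++ code: the PoliteWorker class and all of its members are hypothetical names, and a short bounded sleep stands in for whatever timed wait a real thread would use.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical worker: all names here are illustrative, not Wefts++ API.
class PoliteWorker {
public:
    void start() {
        m_thread = std::thread([this] { run(); });
    }

    // Kind cancellation: only raise a flag; the worker decides when to quit.
    void requestStop() { m_stopRequested.store(true); }

    void join() {
        if (m_thread.joinable()) m_thread.join();
    }

    bool finishedCleanly() const { return m_finishedCleanly.load(); }

private:
    void run() {
        while (!m_stopRequested.load()) {
            // Never block forever: a bounded sleep stands in for any timed wait.
            std::this_thread::sleep_for(std::chrono::milliseconds(5));
        }
        // The loop is left at a point where no lock is held, so resources
        // can be released here before the thread function returns.
        m_finishedCleanly.store(true);
    }

    std::atomic<bool> m_stopRequested{false};
    std::atomic<bool> m_finishedCleanly{false};
    std::thread m_thread;
};

// Starts a worker, asks it to stop, and reports whether it exited on its own.
bool demoKindCancellation() {
    PoliteWorker worker;
    worker.start();
    std::this_thread::sleep_for(std::chrono::milliseconds(20));
    worker.requestStop();
    worker.join();
    return worker.finishedCleanly();
}
```

The worker reacts to the request within one timeout period at most, which is the price paid for never being interrupted at an unsafe point.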

Wefts++ supports kind cancellation and a quasi-kind cancellation called "deferred". To activate kind cancellation, start a thread with cancellation disabled (see Thread::start). When you then issue a stop() on that thread, a flag is raised in the Thread object; this flag prevents many automatic actions from happening (for example, it prevents other threads from joining the stopped one) and causes the Thread::testCancel() method to terminate your thread.

Deferred cancellation is similar, but it also interrupts some blocking system calls, such as the Wefts::Sleep() function and condition waits (Cond::wait()). Some systems (UNIX/pthread) also support deferred cancellation of I/O calls, so that a stop() can interrupt a blocking read() call. Some UNIX flavors treat I/O calls as true cancellation points (as if a testCancel() were issued inside the I/O call); others just interrupt the I/O call so that your program resumes with an error (usually errno == EINTR on UNIX). The best way to work around this difference is to put a testCancel() right after each blocking I/O call, so that when the call is interrupted because of a cancellation request, the thread is canceled immediately.
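
The "check right after the blocking call" advice can be sketched as follows. This is an illustration under stated assumptions, not Wefts++ code: callThenTestCancel and IoOutcome are hypothetical names, the callable stands in for a read()-like call that returns -1 and sets errno on failure, and the atomic flag stands in for the request that testCancel() would inspect.

```cpp
#include <atomic>
#include <cassert>
#include <cerrno>
#include <functional>

// Outcome of one blocking call wrapped with a cancellation check.
enum class IoOutcome { Completed, Cancelled, Failed };

// Hypothetical helper (not Wefts++ API). `blockingCall` stands in for a
// read()-like call; `cancelRequested` stands in for the flag that
// testCancel() would inspect.
IoOutcome callThenTestCancel(const std::function<long()>& blockingCall,
                             const std::atomic<bool>& cancelRequested,
                             long& result) {
    for (;;) {
        result = blockingCall();
        // The check sits right after the blocking call, so a call that was
        // merely interrupted by a cancellation request still ends the work.
        if (cancelRequested.load()) return IoOutcome::Cancelled;
        if (result >= 0) return IoOutcome::Completed;
        if (errno != EINTR) return IoOutcome::Failed;
        // EINTR without a cancellation request: unrelated interruption, retry.
    }
}
```

On a system that treats the I/O call as a true cancellation point, the check is simply never reached; on a system that only interrupts the call, the check turns the EINTR return into the cancellation.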

On systems that do not provide interruption of blocking calls (for example, Windows), deferred cancellation is emulated. The OS Cooperative File Function Extended Environment (COFFEE) is a layer that uses asynchronous Windows calls where they are available (Windows 2000 and later) and a "controlled asynchronous cancellation" where these operations are not available. The former is fairly simple to implement: a Windows event object can be set to interrupt a pending I/O request immediately and cleanly. When this mechanism cannot be used, the thread is interrupted using immediate unkind cancellation. To keep this under control, threads are given a way to state safely when they are in a condition that makes immediate cancellation possible. Emulated deferred cancellation is therefore implemented by:

checking whether a cancellation request has been issued;
if so, terminating the thread;
if not, setting a flag signaling that the thread can be canceled;
performing the system call;
resetting the flag signaling that the thread can be canceled.

The killer atomically checks whether that flag is set in the target thread: if the flag is not set, the killer issues a kind cancellation request; if it is set, the target thread is killed immediately.
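
The handshake between the target thread and the killer can be modelled as below. This is only a sketch of the decision logic: EmulatedDeferredState and its members are hypothetical names, a mutex is assumed as the means of making the check atomic, and the forced kill itself is deliberately left out.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical bookkeeping for one target thread; none of these names are
// Wefts++ API. Only the decision is modelled, not the forced kill.
class EmulatedDeferredState {
public:
    enum class StopAction { KindRequest, ImmediateKill };

    // Target side, before a blocking system call. Returns false when a
    // cancellation request is already pending and the thread should terminate.
    bool beginBlockingCall() {
        std::lock_guard<std::mutex> guard(m_lock);
        if (m_cancelRequested) return false;
        m_killable = true;
        return true;
    }

    // Target side, right after the system call returns.
    void endBlockingCall() {
        std::lock_guard<std::mutex> guard(m_lock);
        m_killable = false;
    }

    // Killer side: test the flag and record the request in one atomic step.
    StopAction requestStop() {
        std::lock_guard<std::mutex> guard(m_lock);
        m_cancelRequested = true;
        return m_killable ? StopAction::ImmediateKill : StopAction::KindRequest;
    }

private:
    std::mutex m_lock;
    bool m_cancelRequested = false;
    bool m_killable = false;
};
```

In a real port the kill would presumably have to be carried out while the lock is still held; otherwise the target could leave the system call between the check and the kill.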

Note:: The problem with killing another thread in such an unclean manner is that if the system call internally allocates heap memory, that memory is lost. So, even though we provide automatic emulated deferred cancellation points that seem to be safe at the moment, nothing can ensure that the low-level functions that are safe now won't use heap memory in the future.

Another problem is that only the waits under the control of Wefts are automatically managed in this way. Wefts provides OSThread::emuDeferredStart() and OSThread::emuDeferredEnd(), which can be placed around system calls. If you are writing portable code that must run on both Windows and UNIX, you can safely use the emuDeferred calls, as they are defined empty on systems that provide deferred cancellation natively. If you don't want to place these two calls around all your I/O, you can:

define a macro or a function such as deferredRead() and the like, and use it in all your applications, or
use only kind cancellation (which is never a bad choice).

In any case, if you can live with these limitations, the COFFEE layer already goes a long way in this direction and is able to use the best cancellation scheme available on the host system.
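
A deferredRead()-style wrapper of the kind suggested above might look like the following sketch. The exact signatures of OSThread::emuDeferredStart() and OSThread::emuDeferredEnd() are not given in this text, so two stub functions stand in for them here; EmuDeferredScope, the stubs, and the counters are all hypothetical and exist only to make the example self-contained.

```cpp
#include <cassert>
#include <cstddef>

// Stand-ins for OSThread::emuDeferredStart() and OSThread::emuDeferredEnd().
// The real calls are not reproduced here; these stubs only count invocations.
static int g_emuDepth = 0;
static int g_emuCalls = 0;
inline void emuDeferredStartStub() { ++g_emuDepth; ++g_emuCalls; }
inline void emuDeferredEndStub() { --g_emuDepth; }

// Brackets one blocking call; the destructor closes the region on every
// normal exit path, including early returns and exceptions.
class EmuDeferredScope {
public:
    EmuDeferredScope() { emuDeferredStartStub(); }
    ~EmuDeferredScope() { emuDeferredEndStub(); }
    EmuDeferredScope(const EmuDeferredScope&) = delete;
    EmuDeferredScope& operator=(const EmuDeferredScope&) = delete;
};

// Hypothetical wrapper: `readFn` is any read-like callable taking a buffer
// and a size and returning the number of bytes read.
template <typename ReadFn>
long deferredRead(ReadFn readFn, void* buffer, std::size_t size) {
    EmuDeferredScope scope;
    return readFn(buffer, size);
}
```

Keeping the pair of calls inside one small wrapper means application code never has to remember to close the region by hand.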

See also:: How to use OS Cooperative File Function Extended Environment layer

Cleanup system

Cleanup is a concept closely related to automatic cancellation and thread termination. A cleanup sequence is a set of actions performed at thread termination, whether the termination is "natural" (that is, the thread simply finished running) or the result of a cancellation request fulfilled via the Thread::testCancel() method or via the OS deferred cancellation mechanism.

Wefts provides its own cleanup system in two flavors: the thread cleanup stack and the condition wait cleanup routine. Moreover, the Thread class provides an overridable Thread::cleanup() method that is called at thread termination. Finally, after the whole cleanup sequence is done, Wefts automatically takes care of:

setting the thread status to not running + stopped (that is to say, terminated);
lowering the count of running threads;
destroying the thread object if the thread is detached.

The cleanup system relies on a class called CleanupHandler that contains only pure virtual methods. Classes of this kind are called "interfaces", borrowing the terminology from the Java language and general OOP. A class can have its own cleanup handler by deriving from CleanupHandler and providing a handleCleanup( int ) method. The integer parameter can be used, when the class takes part in more than one cleanup sequence, to determine which of the possible actions must be performed.

Cleanup actions are organized in a stack (a last-in-first-out structure), so that it is possible to have "inner" cleanup managers. They are added with the method Thread::pushCleanupHandler( CleanupHandler *, int ) and removed with Thread::popCleanupHandler( bool ). A typical scheme may be:

class MyThread: public Thread, public CleanupHandler
{
   ...
   // code within a thread subclass method
   void *run()
   {
      ...
      pushCleanupHandler( this, 1 );
      // ... do some work and allocate member m_heap1
      testCancel(); // or anything else that can cause the thread to terminate

      pushCleanupHandler( this, 2 );
      // ... do some work and allocate member m_heap2
      if (something)
         return 0; // causes the thread to terminate and the cleanup to run

      // ... free m_heap2
      popCleanupHandler();

      testCancel();
      // ... free m_heap1
      popCleanupHandler();
      ...
   }

   // implement the CleanupHandler interface
   virtual void handleCleanup( int position, void *object )
   {
      if ( position == 2 ) {
         // ... free m_heap2 and m_heap1
      }
      else if ( position == 1 ) {
         // ... free m_heap1 only!
      }
   }
   ...
};

Remember that if cancellation is enabled, you can be canceled only at cancellation points, and if it is disabled, you can't be canceled unless you issue a testCancel(). You can therefore push the handler and then allocate the memory, or do the reverse; the important thing is that there must be no cancellation point in the code between the cleanup handler push and the actions that the handler must undo.

When the method Thread::popCleanup() is called with true as a parameter, the handler is first executed and then removed, so if the cleanup sequence contains long code, you can reuse it in your functions without duplicating it.
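
The push/pop behavior, including the "execute, then remove" variant, can be modelled with a small standalone stack. This is a simplified sketch and not the Wefts implementation: CleanupHandlerSketch, CleanupStackSketch, and RecordingHandler are hypothetical names, the handler takes only the integer position, and what happens to the remaining entries at thread termination is intentionally not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Simplified stand-in for the interface described above; the real
// CleanupHandler declaration may differ.
struct CleanupHandlerSketch {
    virtual ~CleanupHandlerSketch() = default;
    virtual void handleCleanup(int position) = 0;
};

// Hypothetical model of a per-thread cleanup stack (last in, first out).
class CleanupStackSketch {
public:
    void push(CleanupHandlerSketch* handler, int position) {
        m_entries.emplace_back(handler, position);
    }

    // Removes the most recent entry; when `execute` is true the handler is
    // run first and removed afterwards.
    void pop(bool execute = false) {
        if (m_entries.empty()) return;
        if (execute) {
            m_entries.back().first->handleCleanup(m_entries.back().second);
        }
        m_entries.pop_back();
    }

    std::size_t depth() const { return m_entries.size(); }

private:
    std::vector<std::pair<CleanupHandlerSketch*, int>> m_entries;
};

// Records the positions it was asked to clean up, in call order.
struct RecordingHandler : CleanupHandlerSketch {
    std::vector<int> seen;
    void handleCleanup(int position) override { seen.push_back(position); }
};
```

Popping with execute set to true lets the normal code path share the handler's release logic instead of repeating it before each plain pop.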

Condition cleanup

Condition waits have a special cleanup sequence. Conditions are Wefts objects that encapsulate both an object that can be signaled and a Wefts Mutex; when a cancellation request is issued while a thread is waiting on a condition, the Mutex object is locked before the cleanup action is taken. So, if your program uses raw conditions or classes derived from them, you must do one of the following:

avoid deferred cancellation: start the threads that use conditions with cancellation disabled, and use only kind cancellation and timed condition waits; OR
disable cancellation just before wait is called, and re-enable it soon after; OR
implement a correct cleanup sequence.

All the objects using a condition in Wefts (for example Wefts::Subscription, Wefts::RWMutex, etc.) use the third option; this makes them a little less efficient, but allows users to use deferred cancellation schemes.
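
For comparison, the first option (kind cancellation plus timed condition waits) can be sketched with the standard C++ library rather than with Wefts conditions. CancellableMailbox and its members are hypothetical names. Because the waiting thread leaves the wait by returning normally, the mutex is released by ordinary scope exit and no condition cleanup handler is needed.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical mailbox built on the standard library, not on Wefts
// conditions; it only illustrates the "timed wait plus flag" pattern.
class CancellableMailbox {
public:
    // Returns true when data arrived, false when the wait was cancelled.
    bool waitForData() {
        std::unique_lock<std::mutex> lock(m_mutex);
        while (!m_hasData) {
            if (m_cancelRequested) return false;  // unique_lock unlocks here
            // Bounded wait: even a missed notification cannot block forever.
            m_cond.wait_for(lock, std::chrono::milliseconds(10));
        }
        m_hasData = false;
        return true;
    }

    void post() {
        {
            std::lock_guard<std::mutex> guard(m_mutex);
            m_hasData = true;
        }
        m_cond.notify_one();
    }

    void requestCancel() {
        {
            std::lock_guard<std::mutex> guard(m_mutex);
            m_cancelRequested = true;
        }
        m_cond.notify_all();
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_cond;
    bool m_hasData = false;
    bool m_cancelRequested = false;
};

// Cancels a waiting thread, then checks that the mutex is usable again.
bool demoCancelledWait() {
    CancellableMailbox box;
    bool gotData = true;
    std::thread waiter([&] { gotData = box.waitForData(); });
    box.requestCancel();
    waiter.join();
    box.post();  // would deadlock if the cancelled waiter still held the mutex
    return !gotData && box.waitForData();
}
```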

To implement the cleanup sequence for a condition, you have two options. If you are in control of the thread that is going to wait on the condition, you can just use Thread::pushCleanupHandler() and Thread::popCleanupHandler(), knowing that your handler will be called with the condition's mutex locked, and taking good care to release it when you are done. If you can't control the Thread object that is waiting, you can set a cleanup handler specific to the wait using the CleanupHandler parameter of the Condition::wait() method. This works like Thread::pushCleanup, but it is more efficient, it does not require a reference to the Thread object, and it sets only one handler at a time. In this way, you can create a condition that may be stopped without worrying about still holding the mutex on exit:

class MyCondition: public FastCondition, public CleanupHandler
{
   ...
   
   virtual bool wait() {
      return FastCondition::wait( this, 1 );
   }
   
   ...
   
   virtual void handleCleanup( int value ) {
      if (value == 1 )
         unlock();
   }
};

A generic condition cleanup object has recently been added. It just unlocks the mutex associated with a given condition; to use it, call wait( &BasicConditionCleanup, 0 ). This ensures that the condition's mutex is released in very simple schemes.

This is not the default because on some systems, the cleanup sequence may be heavy. In the vast majority of your programs, you won't have cancellation requests sent to threads that may be waiting on a condition, and having this cleanup sequence active by default would be a waste.

Note:: This means that the default is "no condition cleanup", and this may hang your program if a cancellation is issued while a thread is waiting for a condition; always remember this.

See also:: CondCleanup

Cleanup sequence in emulated deferred cancellation

Note:: The following text in this section is meant for "future" development, as Wefts 1.0 should be portable across UNIX and Windows. It is also a reference for developers implementing new low-level thread ports of Wefts.

On systems that do not provide deferred cancellation natively (for example, Windows), it is the killer thread that takes care of calling the cleanup sequence. Although the whole low-level interface is remapped so that the killer thread appears as the creator of the Thread object being cleaned up, and although low-level Mutex objects are made so that a non-owner thread can unlock them, there is no longer any way to access the target thread's Thread Local Storage (TLS). Don't use any TLS data if you are going to implement emulated deferred cancellation, or be sure to release it before entering emulated cancellation points.

Generated on Tue Oct 5 14:57:01 2004 for Wefts by Doxygen 1.3.7