Monday, February 25, 2013

Scalable RW-Lock without limitation on the number of threads


There are two limitations on the classic algorithm of the ScalableRWLock:
  1. The maximum number of threads must be hard-coded.
  2. Whenever a new thread accesses a lock instance, it must first call threadInit(), and before it terminates it must call threadCleanup().
If your application has a small number of threads and doesn't create new threads throughout its running lifetime, then this is not an issue.
If your program has a large number of threads (much larger than the number of real threads in the system), this limitation is going to make your life harder. This is particularly true for applications that use thread-pools or Future patterns, if each task touches one or several lock instances.
The reason behind this difficulty is that, each new thread will take up an entry in the static array of assignedThreads, until the static function threadCleanup() is called from that thread. If you forget to call threadCleanup(), or more than MAX_NUM_THREADS try to concurrently use locks, you get  a failed lock at best and undefined behavior at worst.

To overcome these sort of limitation we created an implementation of the Scalable RWLock without static variables, and that uses a ThreadLocal member to access the Reader's state and a trick with finalize() to cleanup the entry when the thread is terminated.
Here's how it works:

Constructor
In the constructor of the lock, we create a new HashSet<AtomicInteger> named readersState where the state of each Reader will be stored in the form of an AtomicInteger. There is also a ThreadLocal<ReadersEntry> variable named entry that stores an instance of the inner class ReadersEntry belonging to the current thread.

The inner class ReadersEntry
This class is very simple:


    public class ReadersEntry {
        final private ScalableRWLock srwlock;
        final public AtomicInteger state;       
       
        public ReadersEntry(ScalableRWLock srwlock, AtomicInteger state) {
            this.srwlock = srwlock;
            this.state = state;        
        }
       
        protected void finalize() throws Throwable {
            srwlock.removeState(state);
            super.finalize();
        }
    }

Notice that it contains a variable state, which will point to the state of the current thread (if it is a Reader).

Adding and removing states
The sole purpose of the ReadersEntry variable entry is to have a finalize() that is called whenever a thread terminates, thus calling srwlock.removeState(state) which will remove the state from the readersState HashSet, and therefore a Writer will no longer waist time scanning through that state:


    public void removeState(AtomicInteger state) {
        exclusiveLock();
        readersState.remove(state);
        exclusiveUnlock();
    }

One detail is that the removeState() function needs to acquire the lock in write-mode before removing the state from the HashSet in order to have a consistent view of the HashSet.
A new state is added whenever the entry.get() returns null, which means that the current thread has not yet initialized and, therefore, needs to create a new state and add it to the HashSet:


    private void addState() {
        final AtomicInteger state = new AtomicInteger(0);
        ReadersEntry newEntry = new ReadersEntry(this, state);
        entry.set(newEntry);
       
        exclusiveLock();
        readersState.add(state);
        exclusiveUnlock();
    }


Optimization
Scanning the readersState HashSet every time we want to do a write-lock can cause a big performance hit if there are many threads. A simpler approach is to create a helper array that is obtained from HashSet.toArray(). This helper array needs to be recreated every time a new state is added or removed from the HashSet, but the impact on performance should be minimal because the HashSet.add() is only called the first time a given thread tries to lock the lock instance in read-only mode, and the HashSet.remove() is only called when the thread is about to terminate.

This version of the code is implemented in the ConcurrencyFreaks library as ScalableRWLock.java and ScalableReentrantRWLock.java


Considerations for C and C++
Life is so much prettier in the land of Garbage Collectors... once we don't have a garbage collector, we have a lot more to worry about (as if we didn't have enough already), and unfortunately this approach doesn't work on C and C++. I'll try to go into more details in a future post... perhaps in the meantime we'll figure out a way to work around it.


No comments:

Post a Comment