Wednesday, November 12, 2014

Harris with RTTI and AtomicMarkableReference

We talked about Harris linked list on a previous post, which is a lock-free linked list that allows for efficient insertions and removals anywhere on the list.

The problem with Harris linked list is that to implement it on a system with a GC we can't use the algorithm as-is, instead, we need to modify it considerably in one of two ways, either using a Copy-On-Write pattern (in Java we can use AtomicMarkableReference) or Run-Time-Type-Information.
The AtomicMarkableReference implementation is pretty much as described in The Art of Multiprocessor Programming, and you can get our version from github:
https://github.com/pramalhe/ConcurrencyFreaks/blob/master/Java/com/concurrencyfreaks/list/HarrisAMRLinkedList.java
The RTTI implementation is mentioned in a few papers and we couldn't find code for it anywhere, so we made our own implementation on what we think it should be the proper way, and you can also get it from github:
https://github.com/pramalhe/ConcurrencyFreaks/blob/master/Java/com/concurrencyfreaks/list/HarrisRTTILinkedList.java

The other (or same) problem with Harris is that to implement it on a system without a GC, you need a lock-free memory reclamation system like Reference Counters or Hazard Pointers. The problem with using per-node reference counters on Harris list is that they can reduce performance by a factor of 10x or more, and as for Hazard Pointers, it is incredibly difficult to implement Hazard Pointers with Harris list without some significant modifications to the original algorithm (according to Maged Michael, in the last three paragraphs of section 2.2 of this paper).
Just in case you don't know who Maged Michael is, he's the guy that invented Hazard Pointers, and invented the lock-free list/queue that is the basis of Java's ConcurrentLinkedQueue, so if he says something is hard, I trust him.
Unfortunately, I had already spent a significant amount of time trying to merge hazard pointers and harris linked list before I read that part of the paper, so I got to the conclusion that although it may be possible to do it, it will be such a pain that it's not worth the trouble. The reason for all this pain is mostly related to the multiple unlinking occurring in the search() function. Imagine the following scenario on a section of the linked list with four nodes we named A,B,C and D:

A  ->  B  ->  C  ->  D

Now suppose that Thread 1 is holding a reference to node B and has put B as one of its hazard pointers. Meanwhile, Thread 2 comes along and sees that both B.next and C.next are marked to be removed, so it unlinks both at the same time (see section 3 of the search() function), by doing a CAS of A.next from B to D. If the CAS is successful the end result is this:

       B   ->   C  -_/
A          ->         D

where both A.next and C.next are pointing to D.
On the search() function or the delete() function of Harris we can now retire the nodes B and C because they have been unlinked, but the problem is, that although there is currently no thread holding a reference to C, it can be accessed by a thread holding a reference to B, namely Thread 1. The B node itself won't be free()'ed until Thread 1 advances to C, but by the time that happens, the node C may have been already free()'ed and you get a crash when Thread 1 tries to de-reference C.next

After discussing with Andreia for a few hours, we got to the conclusion that we can't unlink more than one node at a time, and that the only one that can call retireNode() is the thread that marked the bit on the node for removal, after it has unlinked it, or confirmed that some other thread has done so... and even then, we're not completely sure that is enough to be correct, and in fact, Maged Michael seems to think the same.

Anyways, there is a better way to implement Harris + Hazard Pointers, which is described in Figure 5 of this paper, so I'll try that one instead and if I manage to get it working, I'll post the C++1x implementation.


No comments:

Post a Comment