The test had a race that could cause two threads to end up with the same
"thread local" value. I believe this would not cause the test to fail,
but it could let the test pass even when the functionality is broken.
The new implementation removes this uncertainty, and removes a lot of
cruft left over from the time this test was written using pthreads.
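
For illustration only, here is a minimal sketch of one race-free way to
check thread-local storage. It is not the code from this change: the use
of C++ and the names tls_value and thread_locals_are_distinct are
assumptions. Each thread stores a distinct value, then waits until both
threads have stored before reading back, so a shared (broken) variable
cannot go unnoticed.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for the thread-local value under test.
thread_local int tls_value = 0;

// Returns true if two concurrently live threads each read back the value
// they wrote themselves.
bool thread_locals_are_distinct() {
    std::atomic<int> written{0};     // how many threads have stored their value
    std::atomic<int> mismatches{0};  // how many threads read back a foreign value

    auto worker = [&](int id) {
        tls_value = id;
        written.fetch_add(1);
        // Wait until both threads have written, so no read can happen
        // before the other thread's write.
        while (written.load() < 2) {
            std::this_thread::yield();
        }
        if (tls_value != id) {
            mismatches.fetch_add(1);
        }
    };

    std::thread a(worker, 1);
    std::thread b(worker, 2);
    a.join();
    b.join();
    return mismatches.load() == 0;
}
```

A test would then call assert(thread_locals_are_distinct()). Without the
wait, one thread could finish before the other starts, and the check
could pass even if both threads shared the same storage.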