Rewrite the pthread spinlock implementation.

    This implementation lets the pthread spinlocks just be spinlocks, as
    they should be. It makes them much faster in every respect.

    Since these spinlocks are used in many other places in the library,
    those routines have now become measurably faster as well.  For
    example, pthread_getspecific has gained substantial speed as a result.

(cherry picked from commit 249898d9ae310959116efa333e4e3e690cf97452)
1 file changed