- This file documents non-portable functions and other issues.
- Non-portable functions included in pthreads-win32
- -------------------------------------------------
- BOOL
- pthread_win32_test_features_np(int mask)
- This routine allows an application to check which
- run-time auto-detected features are available within
- the library.
- The possible features are:
- PTW32_SYSTEM_INTERLOCKED_COMPARE_EXCHANGE
- Return TRUE if the native version of
- InterlockedCompareExchange() is being used.
- This feature is not meaningful in recent
- library versions as MSVC builds only support
- system implemented ICE. Note that all Mingw
- builds use inlined asm versions of all the
- Interlocked routines.
- PTW32_ALERTABLE_ASYNC_CANCEL
- Return TRUE if the QueueUserAPCEx package
- QUSEREX.DLL is available and the AlertDrv.sys
- driver is loaded into Windows, providing
- alertable (pre-emptive) asynchronous thread
- cancellation. If this feature returns FALSE
- then the default async cancel scheme is in
- use, which cannot cancel blocked threads.
- Features may be Or'ed into the mask parameter, in which case
- the routine returns TRUE if any of the Or'ed features would
- return TRUE. At this stage it doesn't make sense to Or features
- but it may some day.
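- As a minimal sketch, a program might probe these features at startup using
- only the function and constants documented above:
- #include <pthread.h>
- #include <stdio.h>
- int main(void)
- {
-   /* Probe the run-time auto-detected features described above. */
-   if (pthread_win32_test_features_np(PTW32_ALERTABLE_ASYNC_CANCEL))
-     printf("Alertable async cancellation is available.\n");
-   else
-     printf("Default async cancel scheme in use; blocked threads cannot be cancelled.\n");
-   if (pthread_win32_test_features_np(PTW32_SYSTEM_INTERLOCKED_COMPARE_EXCHANGE))
-     printf("Native InterlockedCompareExchange is in use.\n");
-   return 0;
- }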
- void *
- pthread_timechange_handler_np(void *)
- To improve tolerance against operator or time service
- initiated system clock changes.
- This routine can be called by an application when it
- receives a WM_TIMECHANGE message from the system. At
- present it broadcasts all condition variables so that
- waiting threads can wake up and re-evaluate their
- conditions and restart their timed waits if required.
- It has the same return type and argument type as a
- thread routine so that it may be called directly
- through pthread_create(), i.e. as a separate thread.
- Parameters
- Although a parameter must be supplied, it is ignored.
- The value NULL can be used.
- Return values
- It can return an error EAGAIN to indicate that not
- all condition variables were broadcast for some reason.
- Otherwise, 0 is returned.
- If run as a thread, the return value is returned
- through pthread_join().
- The return value should be cast to an integer.
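- For example, the following sketch runs the handler as a separate thread and
- retrieves its result through pthread_join(), as described above. The wrapper
- name on_time_change() is hypothetical; an application would typically call it
- on receipt of WM_TIMECHANGE.
- #include <pthread.h>
- #include <errno.h>
- #include <stdio.h>
- static void on_time_change(void)
- {
-   pthread_t t;
-   void * result;
-   if (pthread_create(&t, NULL, pthread_timechange_handler_np, NULL) == 0)
-     {
-       (void) pthread_join(t, &result);
-       /* The return value should be cast to an integer. */
-       if ((int)(size_t) result == EAGAIN)
-         fprintf(stderr, "Not all condition variables were broadcast.\n");
-     }
- }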
- HANDLE
- pthread_getw32threadhandle_np(pthread_t thread);
- Returns the win32 thread handle that the POSIX
- thread "thread" is running as.
- Applications can use the win32 handle to set
- win32 specific attributes of the thread.
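- A sketch of this use (the helper name and the 'worker' argument are
- hypothetical): raise the native priority of a POSIX thread through its
- Win32 handle.
- #include <windows.h>
- #include <pthread.h>
- void boost_native_priority(pthread_t worker)
- {
-   HANDLE h = pthread_getw32threadhandle_np(worker);
-   /* Any Win32-specific thread attribute could be set here. */
-   SetThreadPriority(h, THREAD_PRIORITY_ABOVE_NORMAL);
- }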
- DWORD
- pthread_getw32threadid_np (pthread_t thread)
- Returns the Windows native thread ID that the POSIX
- thread "thread" is running as.
- Only valid when the library is built where
- ! (defined(__MINGW64__) || defined(__MINGW32__)) || defined (__MSVCRT__) || defined (__DMC__)
- and otherwise returns 0.
- int
- pthread_mutexattr_setkind_np(pthread_mutexattr_t * attr, int kind)
- int
- pthread_mutexattr_getkind_np(pthread_mutexattr_t * attr, int *kind)
- These two routines are included for Linux compatibility
- and are direct equivalents to the standard routines
- pthread_mutexattr_settype
- pthread_mutexattr_gettype
- pthread_mutexattr_setkind_np accepts the following
- mutex kinds:
- PTHREAD_MUTEX_FAST_NP
- PTHREAD_MUTEX_ERRORCHECK_NP
- PTHREAD_MUTEX_RECURSIVE_NP
- These are really just equivalent to (respectively):
- PTHREAD_MUTEX_NORMAL
- PTHREAD_MUTEX_ERRORCHECK
- PTHREAD_MUTEX_RECURSIVE
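- A sketch showing the equivalence (the helper name init_recursive_mutex is
- hypothetical):
- #include <pthread.h>
- void init_recursive_mutex(pthread_mutex_t * m)
- {
-   pthread_mutexattr_t attr;
-   pthread_mutexattr_init(&attr);
-   pthread_mutexattr_setkind_np(&attr, PTHREAD_MUTEX_RECURSIVE_NP);
-   /* Equivalent to: pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE); */
-   pthread_mutex_init(m, &attr);
-   pthread_mutexattr_destroy(&attr);
- }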
- int
- pthread_delay_np (const struct timespec *interval);
- This routine causes a thread to delay execution for a specific period of time.
- This period ends at the current time plus the specified interval. The routine
- will not return before the end of the period is reached, but may return an
- arbitrary amount of time after the period has gone by. This can be due to
- system load, thread priorities, and system timer granularity.
- Specifying an interval of zero (0) seconds and zero (0) nanoseconds is
- allowed and can be used to force the thread to give up the processor or to
- deliver a pending cancelation request.
- This routine is a cancelation point.
- The timespec structure contains the following two fields:
- tv_sec is an integer number of seconds.
- tv_nsec is an integer number of nanoseconds.
- Return Values
- If an error condition occurs, this routine returns an integer value
- indicating the type of error. Possible return values are as follows:
- 0 Successful completion.
- [EINVAL] The value specified by interval is invalid.
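- A short sketch of both uses described above (the helper name is illustrative):
- delaying for a fixed interval, and using a zero interval to yield or to
- deliver a pending cancellation request.
- #include <pthread.h>
- void delay_then_yield(void)
- {
-   struct timespec interval = { 1, 500000000 };  /* 1.5 seconds */
-   struct timespec zero = { 0, 0 };
-   pthread_delay_np(&interval);  /* returns no earlier than 1.5s from now */
-   pthread_delay_np(&zero);      /* gives up the processor; cancellation point */
- }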
- int
- pthread_num_processors_np (void)
- This routine (found on HPUX systems) returns the number of processors
- in the system. This implementation actually returns the number of
- processors available to the process, which can be a lower number
- than the system's number, depending on the process's affinity mask.
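- For example (a sketch; the helper name is hypothetical), the value can be
- used to size a worker pool for the current process:
- #include <pthread.h>
- int choose_pool_size(void)
- {
-   int n = pthread_num_processors_np();
-   return (n > 0) ? n : 1;
- }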
- BOOL
- pthread_win32_process_attach_np (void);
- BOOL
- pthread_win32_process_detach_np (void);
- BOOL
- pthread_win32_thread_attach_np (void);
- BOOL
- pthread_win32_thread_detach_np (void);
- These functions contain the code normally run via DllMain
- when the library is used as a dll but which need to be
- called explicitly by an application when the library
- is statically linked. As of version 2.9.0 of the library, static
- builds using either MSC or GCC will call pthread_win32_process_*
- automatically at application startup and exit respectively.
- Otherwise, you will need to call pthread_win32_process_attach_np()
- before you can call any pthread routines when statically linking.
- You should call pthread_win32_process_detach_np() before
- exiting your application to clean up.
- pthread_win32_thread_attach_np() is currently a no-op, but
- pthread_win32_thread_detach_np() is needed to clean up
- the implicit pthread handle that is allocated to a Win32 thread if
- it calls any pthreads routines. Call this routine when the
- Win32 thread exits.
- Threads created through pthread_create() do not need to call
- pthread_win32_thread_detach_np().
- These functions invariably return TRUE except for
- pthread_win32_process_attach_np() which will return FALSE
- if pthreads-win32 initialisation fails.
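- A sketch of the explicit calls for a statically linked build in which the
- automatic 2.9.0 hooks are not in effect:
- #include <pthread.h>
- int main(void)
- {
-   if (!pthread_win32_process_attach_np())
-     return 1;                          /* library initialisation failed */
-   /* ... create, run and join POSIX threads as usual ... */
-   /* A native Win32 thread (i.e. not created by pthread_create) that called
-    * any pthreads routine should call pthread_win32_thread_detach_np()
-    * before it exits. */
-   pthread_win32_process_detach_np();   /* clean up before exiting */
-   return 0;
- }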
- int
- pthreadCancelableWait (HANDLE waitHandle);
- int
- pthreadCancelableTimedWait (HANDLE waitHandle, DWORD timeout);
- These two functions provide hooks into the pthread_cancel
- mechanism that will allow you to wait on a Windows handle
- and make it a cancellation point. Both functions block
- until either the given w32 handle is signaled, or
- pthread_cancel has been called. It is implemented using
- WaitForMultipleObjects on 'waitHandle' and a manual-reset
- w32 event used to implement pthread_cancel.
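- A sketch of a thread routine that waits on an arbitrary Win32 event as a
- cancellation point (hEvent is assumed to have been created elsewhere with
- CreateEvent() and passed in as the thread argument):
- #include <windows.h>
- #include <pthread.h>
- void * worker(void * arg)
- {
-   HANDLE hEvent = (HANDLE) arg;
-   /* Blocks until hEvent is signalled, the 5 second timeout expires,
-    * or this thread is cancelled via pthread_cancel(). */
-   int rc = pthreadCancelableTimedWait(hEvent, 5000);
-   return (void *)(size_t) rc;
- }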
- Non-portable issues
- -------------------
- Thread priority
- POSIX defines a single contiguous range of numbers that determine a
- thread's priority. Win32 defines priority classes and priority
- levels relative to these classes. Classes are simply priority base
- levels to which the defined priority levels are relative, so that
- changing a process's priority class changes the priority of all
- of its threads, while the threads retain the same relativity to each
- other.
- A Win32 system defines a single contiguous monotonic range of values
- that define system priority levels, just like POSIX. However, Win32
- restricts individual threads to a subset of this range on a
- per-process basis.
- The following table shows the base priority levels for combinations
- of priority class and priority value in Win32.
-
- Process Priority Class Thread Priority Level
- -----------------------------------------------------------------
- 1 IDLE_PRIORITY_CLASS THREAD_PRIORITY_IDLE
- 1 BELOW_NORMAL_PRIORITY_CLASS THREAD_PRIORITY_IDLE
- 1 NORMAL_PRIORITY_CLASS THREAD_PRIORITY_IDLE
- 1 ABOVE_NORMAL_PRIORITY_CLASS THREAD_PRIORITY_IDLE
- 1 HIGH_PRIORITY_CLASS THREAD_PRIORITY_IDLE
- 2 IDLE_PRIORITY_CLASS THREAD_PRIORITY_LOWEST
- 3 IDLE_PRIORITY_CLASS THREAD_PRIORITY_BELOW_NORMAL
- 4 IDLE_PRIORITY_CLASS THREAD_PRIORITY_NORMAL
- 4 BELOW_NORMAL_PRIORITY_CLASS THREAD_PRIORITY_LOWEST
- 5 IDLE_PRIORITY_CLASS THREAD_PRIORITY_ABOVE_NORMAL
- 5 BELOW_NORMAL_PRIORITY_CLASS THREAD_PRIORITY_BELOW_NORMAL
- 5 Background NORMAL_PRIORITY_CLASS THREAD_PRIORITY_LOWEST
- 6 IDLE_PRIORITY_CLASS THREAD_PRIORITY_HIGHEST
- 6 BELOW_NORMAL_PRIORITY_CLASS THREAD_PRIORITY_NORMAL
- 6 Background NORMAL_PRIORITY_CLASS THREAD_PRIORITY_BELOW_NORMAL
- 7 BELOW_NORMAL_PRIORITY_CLASS THREAD_PRIORITY_ABOVE_NORMAL
- 7 Background NORMAL_PRIORITY_CLASS THREAD_PRIORITY_NORMAL
- 7 Foreground NORMAL_PRIORITY_CLASS THREAD_PRIORITY_LOWEST
- 8 BELOW_NORMAL_PRIORITY_CLASS THREAD_PRIORITY_HIGHEST
- 8 NORMAL_PRIORITY_CLASS THREAD_PRIORITY_ABOVE_NORMAL
- 8 Foreground NORMAL_PRIORITY_CLASS THREAD_PRIORITY_BELOW_NORMAL
- 8 ABOVE_NORMAL_PRIORITY_CLASS THREAD_PRIORITY_LOWEST
- 9 NORMAL_PRIORITY_CLASS THREAD_PRIORITY_HIGHEST
- 9 Foreground NORMAL_PRIORITY_CLASS THREAD_PRIORITY_NORMAL
- 9 ABOVE_NORMAL_PRIORITY_CLASS THREAD_PRIORITY_BELOW_NORMAL
- 10 Foreground NORMAL_PRIORITY_CLASS THREAD_PRIORITY_ABOVE_NORMAL
- 10 ABOVE_NORMAL_PRIORITY_CLASS THREAD_PRIORITY_NORMAL
- 11 Foreground NORMAL_PRIORITY_CLASS THREAD_PRIORITY_HIGHEST
- 11 ABOVE_NORMAL_PRIORITY_CLASS THREAD_PRIORITY_ABOVE_NORMAL
- 11 HIGH_PRIORITY_CLASS THREAD_PRIORITY_LOWEST
- 12 ABOVE_NORMAL_PRIORITY_CLASS THREAD_PRIORITY_HIGHEST
- 12 HIGH_PRIORITY_CLASS THREAD_PRIORITY_BELOW_NORMAL
- 13 HIGH_PRIORITY_CLASS THREAD_PRIORITY_NORMAL
- 14 HIGH_PRIORITY_CLASS THREAD_PRIORITY_ABOVE_NORMAL
- 15 HIGH_PRIORITY_CLASS THREAD_PRIORITY_HIGHEST
- 15 HIGH_PRIORITY_CLASS THREAD_PRIORITY_TIME_CRITICAL
- 15 IDLE_PRIORITY_CLASS THREAD_PRIORITY_TIME_CRITICAL
- 15 BELOW_NORMAL_PRIORITY_CLASS THREAD_PRIORITY_TIME_CRITICAL
- 15 NORMAL_PRIORITY_CLASS THREAD_PRIORITY_TIME_CRITICAL
- 15 ABOVE_NORMAL_PRIORITY_CLASS THREAD_PRIORITY_TIME_CRITICAL
- 16 REALTIME_PRIORITY_CLASS THREAD_PRIORITY_IDLE
- 17 REALTIME_PRIORITY_CLASS -7
- 18 REALTIME_PRIORITY_CLASS -6
- 19 REALTIME_PRIORITY_CLASS -5
- 20 REALTIME_PRIORITY_CLASS -4
- 21 REALTIME_PRIORITY_CLASS -3
- 22 REALTIME_PRIORITY_CLASS THREAD_PRIORITY_LOWEST
- 23 REALTIME_PRIORITY_CLASS THREAD_PRIORITY_BELOW_NORMAL
- 24 REALTIME_PRIORITY_CLASS THREAD_PRIORITY_NORMAL
- 25 REALTIME_PRIORITY_CLASS THREAD_PRIORITY_ABOVE_NORMAL
- 26 REALTIME_PRIORITY_CLASS THREAD_PRIORITY_HIGHEST
- 27 REALTIME_PRIORITY_CLASS 3
- 28 REALTIME_PRIORITY_CLASS 4
- 29 REALTIME_PRIORITY_CLASS 5
- 30 REALTIME_PRIORITY_CLASS 6
- 31 REALTIME_PRIORITY_CLASS THREAD_PRIORITY_TIME_CRITICAL
-
- Windows NT: Values -7, -6, -5, -4, -3, 3, 4, 5, and 6 are not supported.
- As you can see, the real priority levels available to any individual
- Win32 thread are non-contiguous.
- An application using pthreads-win32 should not make assumptions about
- the numbers used to represent thread priority levels, except that they
- are monotonic between the values returned by sched_get_priority_min()
- and sched_get_priority_max(). E.g. Windows 95, 98, NT, 2000, XP make
- available a non-contiguous range of numbers between -15 and 15, while
- at least one version of WinCE (3.0) defines the minimum priority
- (THREAD_PRIORITY_LOWEST) as 5, and the maximum priority
- (THREAD_PRIORITY_HIGHEST) as 1.
- Internally, pthreads-win32 maps any priority levels between
- THREAD_PRIORITY_IDLE and THREAD_PRIORITY_LOWEST to THREAD_PRIORITY_LOWEST,
- or between THREAD_PRIORITY_TIME_CRITICAL and THREAD_PRIORITY_HIGHEST to
- THREAD_PRIORITY_HIGHEST. Currently, this also applies to
- REALTIME_PRIORITY_CLASS even if levels -7, -6, -5, -4, -3, 3, 4, 5, and 6
- are supported.
- If it wishes, a Win32 application using pthreads-win32 can use the Win32
- defined priority macros THREAD_PRIORITY_IDLE through
- THREAD_PRIORITY_TIME_CRITICAL.
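- For example, a portable way to raise a thread's priority without assuming
- any particular numeric range is to query the limits at run time (a sketch;
- the helper name is hypothetical):
- #include <pthread.h>
- #include <sched.h>
- void set_highest_priority(pthread_t t)
- {
-   struct sched_param param;
-   int policy;
-   if (pthread_getschedparam(t, &policy, &param) == 0)
-     {
-       param.sched_priority = sched_get_priority_max(policy);
-       (void) pthread_setschedparam(t, policy, &param);
-     }
- }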
- The opacity of the pthread_t datatype
- -------------------------------------
- and possible solutions for portable null/compare/hash, etc
- ----------------------------------------------------------
- Because pthread_t is an opaque datatype, an implementation is permitted to define
- pthread_t in any way it wishes. That includes defining some bits, if it is
- scalar, or members, if it is an aggregate, to store information that may be
- extra to the unique identifying value of the ID. As a result, pthread_t values
- may not be directly comparable.
- If you want your code to be portable you must adhere to the following constraints:
- 1) Don't assume it is a scalar data type, e.g. an integer or pointer value. There
- are several other implementations where pthread_t is also a struct. See our FAQ
- Question 11 for our reasons for defining pthread_t as a struct.
- 2) You must not compare them using relational or equality operators. You must use
- the API function pthread_equal() to test for equality.
- 3) Never attempt to reference individual members.
- The problem
- Certain applications would like to be able to access only the 'pure' pthread_t
- id values, primarily to use as keys into data structures to manage threads or
- thread-related data, but this is not possible in a maximally portable and
- standards compliant way for current POSIX threads implementations.
- For implementations that define pthread_t as a scalar, programmers often employ
- direct relational and equality operators on pthread_t. This code will break when
- ported to an implementation that defines pthread_t as an aggregate type.
- For implementations that define pthread_t as an aggregate, e.g. a struct,
- programmers can use memcmp etc., but then face the prospect that the struct may
- include alignment padding bytes or bits as well as extra implementation-specific
- members that are not part of the unique identifying value.
- [While this is not currently the case for pthreads-win32, opacity also
- means that an implementation is free to change the definition, which should
- generally only require that applications be recompiled and relinked, not
- rewritten.]
- Doesn't the compiler take care of padding?
- The C89 and later standards only effectively guarantee element-by-element
- equivalence following an assignment or pass-by-value of a struct or union.
- Therefore the undefined areas of any two otherwise equivalent pthread_t instances
- can still compare differently: for example, comparing two such pthread_t
- variables byte-by-byte with memcmp(&t1, &t2, sizeof(pthread_t)) may give an
- incorrect result. In practice I'm reasonably confident that compilers routinely
- copy the padding bytes as well, mainly because assignment of unions would be far
- too complicated otherwise, but it just isn't guaranteed by the standard.
- Illustration:
- We have two thread IDs t1 and t2
- pthread_t t1, t2;
- In an application we create the threads and intend to store the thread IDs in an
- ordered data structure (linked list, tree, etc) so we need to be able to compare
- them in order to insert them initially and also to traverse.
- Suppose pthread_t contains undefined padding bits and our compiler copies our
- pthread_t [struct] element-by-element, then for the assignment:
- pthread_t temp = t1;
- temp and t1 will be equivalent and correct but a byte-for-byte comparison such as
- memcmp(&temp, &t1, sizeof(pthread_t)) == 0 may not return true as we expect because
- the undefined bits may not have the same values in the two variable instances.
- Similarly if passing by value under the same conditions.
- If, on the other hand, the undefined bits are at least constant through every
- assignment and pass-by-value then the byte-for-byte comparison
- memcmp(&temp, &t1, sizeof(pthread_t)) == 0 will always return the expected result.
- How can we force the behaviour we need?
- Solutions
- Adding new functions to the standard API or as non-portable extensions is
- the only reliable and portable way to provide the necessary operations.
- Remember also that POSIX is not tied to the C language. The most common
- functions that have been suggested are:
- pthread_null()
- pthread_compare()
- pthread_hash()
- A single more general purpose function could also be defined as a
- basis for at least the last two of the above functions.
- First we need to list the freedoms and constraints with respect
- to pthread_t so that we can be sure our solution is compatible with the
- standard.
- What is known or may be deduced from the standard:
- 1) pthread_t must be able to be passed by value, so it must be a single object.
- 2) from (1) it must be copyable so cannot embed thread-state information, locks
- or other volatile objects required to manage the thread it associates with.
- 3) pthread_t may carry additional information, e.g. for debugging or to manage
- itself.
- 4) there is an implicit requirement that the size of pthread_t is determinable
- at compile time and invariant, because the object must be copyable
- (i.e. through assignment and pass-by-value). Such copies must be genuine
- duplicates, not merely copies of a pointer to a common instance, as
- would be the case if pthread_t were defined as an array.
- Suppose we define the following function:
- /* This function shall return its argument */
- pthread_t* pthread_normalize(pthread_t* thread);
- For scalar or aggregate pthread_t types this function would simply zero any bits
- within the pthread_t that don't uniquely identify the thread, including padding,
- such that client code can return consistent results from operations done on the
- result. If the additional bits are a pointer to an associated structure then
- this function would ensure that the memory used to store that associated
- structure does not leak. After normalization the following compare would be
- valid and repeatable:
- memcmp(pthread_normalize(&t1),pthread_normalize(&t2),sizeof(pthread_t))
- Note 1: such comparisons are intended merely to order and sort pthread_t values
- and allow them to index various data structures. They are not intended to reveal
- anything about the relationships between threads, like startup order.
- Note 2: the normalized pthread_t is also a valid pthread_t that uniquely
- identifies the same thread.
- Advantages:
- 1) In most existing implementations this function would reduce to a no-op that
- emits no additional instructions, i.e. after in-lining or optimisation, or if
- defined as a macro:
- #define pthread_normalize(tptr) (tptr)
- 2) This single function allows an application to portably derive
- application-level versions of any of the other required functions.
- 3) It is a generic function that could enable unanticipated uses.
- Disadvantages:
- 1) Less efficient than dedicated compare or hash functions for implementations
- that include significant extra non-id elements in pthread_t.
- 2) Still need to be concerned about padding if copying normalized pthread_t.
- See the later section on defining pthread_t to neutralise padding issues.
- Generally a pthread_t may need to be normalized every time it is used,
- which could have a significant impact. However, this is a design decision
- for the implementor in a competitive environment. An implementation is free
- to define a pthread_t in a way that minimises or eliminates padding or
- renders this function a no-op.
- Hazards:
- 1) Pass-by-reference directly modifies 'thread' so the application must
- synchronise access or ensure that the pointer refers to a copy. The alternative
- of pass-by-value/return-by-value was considered but then this requires two copy
- operations, disadvantaging implementations where this function is not a no-op
- in terms of speed of execution. This function is intended to be used in high
- frequency situations and needs to be efficient, or at least not unnecessarily
- inefficient. The alternative also sits awkwardly with functions like memcmp.
- 2) [Non-compliant] code that uses relational and equality operators on
- arithmetic or pointer style pthread_t types would need to be rewritten, but it
- should be rewritten anyway.
- C implementation of null/compare/hash functions using pthread_normalize():
- /* In pthread.h */
- pthread_t* pthread_normalize(pthread_t* thread);
- /* In user code */
- /* User-level bitclear function - clear bits in loc corresponding to mask */
- void* bitclear (void* loc, void* mask, size_t count);
- typedef unsigned int hash_t;
- /* User-level hash function */
- hash_t hash(void* ptr, size_t count);
- /*
- * User-level pthr_null function - modifies the origin thread handle.
- * The concept of a null pthread_t is highly implementation dependent
- * and this design may be far from the mark. For example, in an
- * implementation "null" may mean setting a special value inside one
- * element of pthread_t to mean "INVALID". However, if that value was zero and
- * formed part of the id component then we may get away with this design.
- */
- pthread_t* pthr_null(pthread_t* tp)
- {
- /*
- * This should have the same effect as memset(tp, 0, sizeof(pthread_t))
- * We're just showing that we can do it.
- */
- void* p = (void*) pthread_normalize(tp);
- return (pthread_t*) bitclear(p, p, sizeof(pthread_t));
- }
- /*
- * Safe user-level pthr_compare function - modifies temporary thread handle copies
- */
- int pthr_compare_safe(pthread_t thread1, pthread_t thread2)
- {
- return memcmp(pthread_normalize(&thread1), pthread_normalize(&thread2), sizeof(pthread_t));
- }
- /*
- * Fast user-level pthr_compare function - modifies origin thread handles
- */
- int pthr_compare_fast(pthread_t* thread1, pthread_t* thread2)
- {
- return memcmp(pthread_normalize(thread1), pthread_normalize(thread2), sizeof(pthread_t));
- }
- /*
- * Safe user-level pthr_hash function - modifies temporary thread handle copy
- */
- hash_t pthr_hash_safe(pthread_t thread)
- {
- return hash((void *) pthread_normalize(&thread), sizeof(pthread_t));
- }
- /*
- * Fast user-level pthr_hash function - modifies origin thread handle
- */
- hash_t pthr_hash_fast(pthread_t* thread)
- {
- return hash((void *) pthread_normalize(thread), sizeof(pthread_t));
- }
- /* User-level bitclear function - modifies and returns the origin array */
- void* bitclear(void* loc, void* mask, size_t count)
- {
- unsigned char* l = (unsigned char*) loc;
- unsigned char* m = (unsigned char*) mask;
- size_t i;
- for (i = 0; i < count; i++) {
- l[i] &= (unsigned char) ~m[i];
- }
- return loc;
- }
- /* Donald Knuth hash */
- hash_t hash(void* ptr, size_t count)
- {
- hash_t hash = (hash_t) count;
- unsigned char* str = (unsigned char*) ptr;
- size_t i;
- for (i = 0; i < count; str++, i++)
- {
- hash = ((hash << 5) ^ (hash >> 27)) ^ ((hash_t) *str);
- }
- return hash;
- }
- /* Example of advantage point (3) - split a thread handle into its id and non-id values */
- pthread_t id = thread, non_id = thread;
- bitclear((void*) &non_id, (void*) pthread_normalize(&id), sizeof(pthread_t));
- A pthread_t type change proposal to neutralise the effects of padding
- Even if pthread_normalize() is available, padding is still a problem because
- the standard only guarantees element-by-element equivalence through
- copy operations (assignment and pass-by-value). So padding bit values can
- still change randomly after calls to pthread_normalize().
- [I suspect that most compilers take the easy path and always byte-copy anyway,
- partly because it becomes too complex to do (e.g. unions that contain sub-aggregates)
- but also because programmers can easily design their aggregates to minimise and
- often eliminate padding].
- How can we eliminate the problem of padding bytes in structs? Could
- defining pthread_t as a union rather than a struct provide a solution?
- In fact, the Linux pthread.h defines most of its pthread_*_t objects (but not
- pthread_t itself) as unions, possibly for this and/or other reasons. We'll
- borrow some element naming from there but the ideas themselves are well known
- - the __align element used to force alignment of the union comes from K&R's
- storage allocator example.
- /* Essentially our current pthread_t renamed */
- typedef struct {
- struct thread_state_t * __p;
- long __x; /* sequence counter */
- } thread_id_t;
- Ensuring that the last element in the above struct is a long ensures that the
- overall struct size is a multiple of sizeof(long), so there should be no trailing
- padding in this struct or the union we define below.
- (Later we'll see that we can handle internal but not trailing padding.)
- /* New pthread_t */
- typedef union {
- char __size[sizeof(thread_id_t)]; /* array as the first element */
- thread_id_t __tid;
- long __align; /* Ensure that the union starts on long boundary */
- } pthread_t;
- This guarantees that, during an assignment or pass-by-value, the compiler copies
- every byte in our thread_id_t, because the compiler guarantees that the __size
- array, which we have ensured is the equal-largest element in the union, retains
- equivalence.
- This means that pthread_t values stored, assigned and passed by value will at least
- carry the value of any undefined padding bytes along and therefore ensure that
- those values remain consistent. Our comparisons will return consistent results and
- our hashes of [zero initialised] pthread_t values will also return consistent
- results.
- We have also removed the need for a pthread_null() function; we can initialise
- at declaration time or easily create our own const pthread_t to use in assignments
- later:
- const pthread_t null_tid = {0}; /* braces are required */
- pthread_t t;
- ...
- t = null_tid;
- Note that we don't have to explicitly make use of the __size array at all. It's
- there just to force the compiler behaviour we want.
- Partial solutions without a pthread_normalize function
- An application-level pthread_null and pthread_compare proposal
- (and pthread_hash proposal by extension)
- In order to deal with the problem of scalar/aggregate pthread_t type disparity in
- portable code I suggest using an old-fashioned union, e.g.:
- Constraints:
- - there is no padding, or padding values are preserved through assignment and
- pass-by-value (see above);
- - there are no extra non-id values in the pthread_t.
- Example 1: A null initialiser for pthread_t variables...
- typedef union {
- unsigned char b[sizeof(pthread_t)];
- pthread_t t;
- } init_t;
- const init_t initial = {0};
- pthread_t tid = initial.t; /* init tid to all zeroes */
- Example 2: A comparison function for pthread_t values
- typedef union {
- unsigned char b[sizeof(pthread_t)];
- pthread_t t;
- } pthcmp_t;
- int pthcmp(pthread_t left, pthread_t right)
- {
- /*
- * Compare two pthread handles in a way that imposes a repeatable but arbitrary
- * ordering on them.
- * I.e. given the same set of pthread_t handles the ordering should be the same
- * each time but the order has no particular meaning other than that. E.g.
- * the ordering does not imply the thread start sequence, or any other
- * relationship between threads.
- *
- * Return values are:
- * 1 : left is greater than right
- * 0 : left is equal to right
- * -1 : left is less than right
- */
- int i;
- pthcmp_t L, R;
- L.t = left;
- R.t = right;
- for (i = 0; i < sizeof(pthread_t); i++)
- {
- if (L.b[i] > R.b[i])
- return 1;
- else if (L.b[i] < R.b[i])
- return -1;
- }
- return 0;
- }
- It has been pointed out that the C99 standard allows for the possibility that
- integer types also may include padding bits, which could invalidate the above
- method. This addition to C99 was specifically included after it was pointed
- out that there was one, presumably not particularly well known, architecture
- that included a padding bit in its 32-bit integer type. See section 6.2.6.2
- of both the standard and the rationale, specifically the paragraph starting at
- line 16 on page 43 of the rationale.
- An aside
- Certain compilers, e.g. gcc and one of the IBM compilers, include a feature
- extension: provided the union contains a member of the same type as the
- object then the object may be cast to the union itself.
- We could use this feature to speed up the pthcmp() function from example 2
- above by casting rather than assigning the pthread_t arguments to the union, e.g.:
- int pthcmp(pthread_t left, pthread_t right)
- {
- /*
- * Compare two pthread handles in a way that imposes a repeatable but arbitrary
- * ordering on them.
- * I.e. given the same set of pthread_t handles the ordering should be the same
- * each time but the order has no particular meaning other than that. E.g.
- * the ordering does not imply the thread start sequence, or any other
- * relationship between threads.
- *
- * Return values are:
- * 1 : left is greater than right
- * 0 : left is equal to right
- * -1 : left is less than right
- */
- int i;
- for (i = 0; i < sizeof(pthread_t); i++)
- {
- if (((pthcmp_t)left).b[i] > ((pthcmp_t)right).b[i])
- return 1;
- else if (((pthcmp_t)left).b[i] < ((pthcmp_t)right).b[i])
- return -1;
- }
- return 0;
- }
- Result thus far
- We can't remove undefined bits if they are there in pthread_t already, but we have
- attempted to render them inert for comparison and hashing functions by making them
- consistent through assignment, copy and pass-by-value.
- Note: Hashing pthread_t values requires that all pthread_t variables be initialised
- to the same value (usually all zeros) before being assigned a proper thread ID, i.e.
- to ensure that any padding bits are zero, or at least the same value for all
- pthread_t. Since all pthread_t values are generated by the library in the first
- instance this need not be an application-level operation.
- Conclusion
- I've attempted to resolve the multiple issues of type opacity and the possible
- presence of undefined bits and bytes in pthread_t values, which prevent
- applications from comparing or hashing pthread handles.
- Two complementary partial solutions have been proposed, one an application-level
- scheme to handle both scalar and aggregate pthread_t types equally, plus a
- definition of pthread_t itself that neutralises padding bits and bytes by
- coercing semantics out of the compiler to eliminate variations in the values of
- padding bits.
- I have not provided any solution to the problem of handling extra values embedded
- in pthread_t, e.g. debugging or trap information that an implementation is entitled
- to include. Therefore none of this replaces the portability and flexibility of API
- functions, but what functions are needed? The threads standard is unlikely to
- include functions that can be implemented by a combination of existing features
- and more generic functions (several references in the threads rationale suggest
- this).
- Therefore I propose that the following function could replace the several functions
- that have been suggested in conversations:
- pthread_t * pthread_normalize(pthread_t * handle);
- For most existing pthreads implementations this function, or macro, would reduce to
- a no-op with zero call overhead.
|