- Thread-safe initialization of static objects - 16 Updates
- What a bug - 7 Updates
- from_chars vs my parse_double<> - 1 Update
- It is my last post - 1 Update
Bonita Montero <Bonita.Montero@gmail.com>: Sep 07 04:54AM +0200

On 07.09.2023 at 00:44, Pavel wrote:
>> is initialized and a mutex which guards the initialization.
> Not needed. A test-and-set instruction on a flag -- that is itself
> constant-initialized -- is sufficient.

Using only one flag would require spin-locking. However, spin-locking is not possible in userspace because a thread holding a spinlock could keep other threads spinning for a long time. Therefore, there is no getting around a solution with a mutex. And both mutex creation and synchronization may fail.
Pavel <pauldontspamtolk@removeyourself.dontspam.yahoo>: Sep 07 12:18AM -0400

Bonita Montero wrote:
>> constant-initialized -- is sufficient.
> Using only one flag would require spin-locking. However, spin-locking
> is not possible in userspace

Not true, a loop with yield in the body is very possible.

> because a thread holding a spinlock could
> keep other threads spinning for a long time. Therefore, there is no
> getting around a solution with a mutex.

It does not have to be a mutex; it can be call_once, a semaphore, anything, actually.

> And creation and mutex synchronization may fail.

If initialization synchronization fails, the initialization can catch and terminate. No need to throw a system error. No need to use C++ synchronization primitives in the initialization code either; nothing prevents the implementation from doing this in a platform-specific manner.
Bonita Montero <Bonita.Montero@gmail.com>: Sep 07 06:27AM +0200

On 07.09.2023 at 06:18, Pavel wrote:
> not true, a loop with yield in the body is very possible.

No one would accept that because that would make the waiters wait much longer than necessary.

> does not have to be a mutex, can be call_once, a semaphore, anything,
> actually.

Every applicable facility for that would rely on kernel-synchronization. You'd need to create a binary semaphore for that and you need to synchronize on that; both may fail.

> If initialization synchronization fails, the initialization can catch
> and terminate. ...

Nothing like that is specified.

"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Sep 06 11:33PM -0700

On 9/6/2023 3:44 PM, Pavel wrote:
>> is initialized and a mutex which guards the initialization.
> Not needed. A test-and-set instruction on a flag -- that is itself
> constant-initialized -- is sufficient.
[...]

You also need to use the appropriate memory barriers. An acquire after the first check, and a release before making the object visible.

// pseudo code, it's been a while.
// Damn, I used to work with threads all of the time.
___________________________
static foo* g_foo = nullptr;

foo* local = g_foo; // atomic load

if (! local)
{
    hash_lock(&g_foo);

    local = g_foo; // atomic load

    if (! local)
    {
        local = new foo;
        // release mb #LoadStore | #StoreStore
        g_foo = local; // atomic store
    }

    else
    {
        // acquire mb #LoadStore | #LoadLoad
    }

    hash_unlock(&g_foo);
}

else
{
    // acquire mb #LoadStore | #LoadLoad
}

local->foobar();
___________________________

Iirc, that is a bare bones DCL.
Bonita Montero <Bonita.Montero@gmail.com>: Sep 07 08:53AM +0200

On 07.09.2023 at 08:33, Chris M. Thomasson wrote:
> You also need to use the appropriate memory barriers. ...

That's all inside the Wikipedia example about DCL. But the discussion was about whether the thread-safe initialization may fail before or after the object's constructor is called because the mutex creation or the kernel-synchronization may fail.
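For readers following the thread, the double-checked locking pattern from the pseudocode above can be sketched with C++11 atomics roughly as follows. This is only a minimal sketch under stated assumptions, not any compiler's actual implementation: the names Foo, g_foo, g_foo_mutex, g_constructions and get_foo are made up for illustration, and a single std::mutex stands in for the hashed lock of the original post.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical payload type; stands in for the `foo` of the pseudocode.
struct Foo {
    int value = 42;
};

std::atomic<Foo*> g_foo{nullptr};     // constant-initialized pointer
std::mutex g_foo_mutex;               // assumption: one plain mutex instead of a hashed lock
std::atomic<int> g_constructions{0};  // counts constructions, for demonstration only

Foo* get_foo() {
    // Fast path: this acquire load pairs with the release store below.
    Foo* local = g_foo.load(std::memory_order_acquire);
    if (local == nullptr) {
        std::lock_guard<std::mutex> lock(g_foo_mutex);
        // Second check under the lock; the mutex already orders this load.
        local = g_foo.load(std::memory_order_relaxed);
        if (local == nullptr) {
            local = new Foo;  // intentionally never deleted in this sketch
            g_constructions.fetch_add(1, std::memory_order_relaxed);
            // Release: the object's contents become visible before the pointer does.
            g_foo.store(local, std::memory_order_release);
        }
    }
    assert(local != nullptr);
    return local;
}
```

Every caller gets the same pointer, and only the first call pays for the lock and the construction. Whether acquiring the mutex itself can fail, which is the actual point of contention in the thread, is not addressed by this sketch.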
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Sep 07 12:16AM -0700

On 9/6/2023 11:53 PM, Bonita Montero wrote:
> was about whether the thread-safe initialization may fail before or
> after the object's constructor is called because the mutex creation
> or the kernel-synchronization may fail.

Usually, the hash table of mutexes is created before any of the program's logic is executed...
Paavo Helde <eesnimi@osa.pri.ee>: Sep 07 12:55PM +0300

On 06.09.2023 21:15, Bonita Montero wrote:
> for all static data objects since it includes a kernel semaphore,
> which is a costly resource.
> To find out if this is true I wrote the below application:
[...]
> code. So the threads don't share a central object. If there would be
> a central mutex used the above code would run for about 10s. But the
> code does run about one second,

There is a note in the standard: "[Note: This definition permits initialization of a sequence of ordered variables concurrently with another sequence. -- end note]"

i.e. there are individual mutexes per

> is done while the DCL-locked creation of the static object. So at last
> static initialization should be declared to throw a system_error. But
> I can't find anything about that in the standard.

Some debugging with VS2022 seems to indicate it is using a Windows critical section for thread-safe statics initialization. EnterCriticalSection() does not return any error code and of course does not throw any C++ exceptions either, so it is supposed to never fail.

Yes, it's true it can throw a Windows structured exception EXCEPTION_POSSIBLE_DEADLOCK (after 30 days by default). But this would be considered a fault in the program. This is what the C++ standard says about deadlocks (again a footnote):

"The implementation must not introduce any deadlock around execution of the initializer. Deadlocks might still be caused by the program logic; the implementation need only avoid deadlocks due to its own synchronization operations."

So I gather that in case the thread-safe static init synchronization fails, there must be a bug in the implementation. No C++ exceptions would be thrown anyway.
Bonita Montero <Bonita.Montero@gmail.com>: Sep 07 02:17PM +0200

On 07.09.2023 at 09:16, Chris M. Thomasson wrote:
> Usually, the hash table of mutexes is created before
> any of the programs logic is executed...

That sounds too complex for me. Creating individual mutexes on demand would be o.k. and I think a lock-free stack with a pool of mutexes would be fancy. A hashtable is too slow for that.
Pavel <pauldontspamtolk@removeyourself.dontspam.yahoo>: Sep 07 10:44AM -0400

Bonita Montero wrote:
>> not true, a loop with yield in the body is very possible.
> No one would accept that because that would make the waiters wait
> much longer than necessary.

How so? Waiters will wait for the time of initialization. The initializing thread will be yielded to and receive virtually as many and as complete time slices as it would under any other scheduling discipline, so it will complete its job in the same or virtually the same time (actually, the higher the system contention level, the more efficient user-space waiting with yield becomes). Hence the time waiters will wait is exactly or virtually the same as when using a mutex.

Also, it should be taken into account that all of the above (and below) is only relevant to the rare case when the initialization is contended; in other words, "no one would accept" would be the exaggeration of the year even if your speculation on "wait much longer" were true -- which it isn't.

> Every applicable facility for that would rely on kernel-synchronization.
> You'd need to create a binary semaphore for that and you need to
> synchronize on that; both may fail.

try_lock in a loop with a sleep or yield wouldn't fail. But as said above, it's not needed. Regardless, your argument assumes too much C++-morphism in the implementation, whereas the implementation can use any platform-specific approach available to it. E.g. pthread_once on Linux does not fail if given valid arguments (which a C++ implementation can provide). Other platforms may have tools of their own to do the job.

>> If initialization synchronization fails, the initialization can catch
>> and terminate. ...
> Nothing like that is specified.

Correct, the above was wrong. But initialization can catch and try again. This does not have to be specified as it is not observable from outside.
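The "flag plus yield loop" idea argued for above can be sketched as follows. This is a hedged illustration only: the three-state flag, the names g_state, g_value, g_init_calls and get_value, and the use of std::this_thread::yield are assumptions of this sketch, and it ignores what should happen if the initializer throws. It does not settle the disagreement about how long waiters spin; it only shows that no mutex or semaphore has to be created.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical states of the guard flag.
constexpr int kUninit = 0;  // nobody has started the initialization
constexpr int kBusy   = 1;  // one thread is running the initializer
constexpr int kDone   = 2;  // the object is ready

std::atomic<int> g_state{kUninit};  // constant-initialized guard flag
int g_value = 0;                    // the "static object" being initialized
int g_init_calls = 0;               // written only by the thread that wins the race

int& get_value() {
    if (g_state.load(std::memory_order_acquire) != kDone) {
        int expected = kUninit;
        if (g_state.compare_exchange_strong(expected, kBusy,
                                            std::memory_order_acquire)) {
            // This thread won the race and runs the initializer.
            g_value = 123;
            ++g_init_calls;
            g_state.store(kDone, std::memory_order_release);
        } else {
            // Another thread is initializing (or has just finished):
            // give up the time slice until the flag says "done".
            while (g_state.load(std::memory_order_acquire) != kDone)
                std::this_thread::yield();
        }
    }
    assert(g_state.load(std::memory_order_acquire) == kDone);
    return g_value;
}
```

After the first call completes, every later call takes only the initial acquire load.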
Bonita Montero <Bonita.Montero@gmail.com>: Sep 07 06:39PM +0200

On 07.09.2023 at 16:44, Pavel wrote:
> How so? Waiters will wait for the time of initialization.
> The initializing thread will be yielded ...

Initializing is usually much faster than a whole timeslice, so yielding would be unacceptable. That's just a stupid idea.

> try_lock in a loop with a sleep or yield wouldn't fail. ...

Do you really think someone would accept spinning with that? This means that an initializing thread which is scheduled away while holding the mutex might keep other threads spinning for a long time.

> implementation whereas the implementation can use any platform
> -specific approach available to it. E.g. pthread_once on Linux
> does not fail if given valid arguments ...

pthread_once could be implemented with a single central semaphore for all operations, and if the implementers know that the synchronization itself doesn't fail and the semaphore is pre-allocated by the runtime, it's possible to survive that synchronization without error. But check this code:

#include <chrono>
#include <cstddef>
#include <iostream>
#include <thread>
#include <type_traits>
#include <utility>
#include <vector>

using namespace std;

// Each instantiation sleeps for one second in its constructor.
template<unsigned Thread>
struct SleepAtInitialize
{
    SleepAtInitialize()
    {
        this_thread::sleep_for( 1s );
    }
};

int main()
{
    // Calls fn.operator ()<I>() once for every index I of the sequence.
    auto unroll = []<size_t ... Indices>( index_sequence<Indices ...>, auto fn )
    {
        ((fn.template operator ()<Indices>()), ...);
    };
    constexpr unsigned N_THREADS = 10;
    vector<jthread> threads;
    threads.reserve( N_THREADS );
    unroll( make_index_sequence<N_THREADS>(),
        [&]<unsigned Thread>()
        {
            // Every thread initializes its own function-local static object.
            threads.emplace_back(
                [&]<unsigned IObj>( integral_constant<unsigned, IObj> )
                {
                    static SleepAtInitialize<IObj> guard;
                },
                integral_constant<unsigned, Thread>() );
        } );
}

This code runs with individual mutexes per object, i.e. the time taken is about one second (with MSVC, libc++ and libstdc++). So when individual mutexes are used the initialization may fail.

> Correct, the above was wrong. But initialization can catch and try
> again. ...

That's a bumbler's solution.
scott@slp53.sl.home (Scott Lurndal): Sep 07 05:27PM

>> The initializing thread will be yielded ...
>Initializing is usually much faster than a whole timeslice,
>so yielding would be unacceptable. That's just a stupid idea.

So tell us, on which operating systems will there be more than one thread running when application static objects are initialized (which happens generally before the application 'main' function is called, and thus before the application has a chance to create any threads)?
Richard Damon <Richard@Damon-Family.org>: Sep 07 11:06AM -0700

On 9/7/23 10:27 AM, Scott Lurndal wrote:
> initialized (which happens generally before the application
> 'main' function is called, and thus before the application
> has a chance to create any threads)?

While GLOBAL static objects get initialized before main starts, function local static objects don't get initialized until the first call of the function. These will need some synchronization if the function is called from multiple threads at "the same time".

Also, some global object could start up a thread in its constructor.
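The distinction drawn above can be seen in a small sketch: a function-local static is initialized on the first call of its function, and since C++11 the implementation must make that initialization safe when several threads arrive at once. The names Widget, instance, run_threads and g_ctor_calls are made up for this illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

std::atomic<int> g_ctor_calls{0};  // counts how often the constructor runs

// Hypothetical type whose construction we want to observe.
struct Widget {
    Widget() { g_ctor_calls.fetch_add(1); }
};

Widget& instance() {
    // Initialized on the first call, not before main(); concurrent callers
    // wait until the one thread running the constructor has finished.
    static Widget w;
    return w;
}

// Calls instance() from n threads and reports how many constructions happened.
int run_threads(int n) {
    assert(n > 0);
    std::vector<std::thread> threads;
    for (int i = 0; i < n; ++i)
        threads.emplace_back([] { instance(); });
    for (auto& t : threads)
        t.join();
    return g_ctor_calls.load();
}
```

However many threads race on the first call, the constructor runs once; how the implementation achieves that (per-object guard, shared lock, something else) is exactly what the thread is arguing about.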
Bonita Montero <Bonita.Montero@gmail.com>: Sep 07 08:24PM +0200

On 07.09.2023 at 19:27, Scott Lurndal wrote:
> initialized (which happens generally before the application
> 'main' function is called, and thus before the application
> has a chance to create any threads)?

I've shown with my code that each static object gets its own mutex with MSVC, libstdc++ and libc++. Here it is again in a simplified version:

#include <chrono>
#include <cstddef>
#include <iostream>
#include <thread>
#include <type_traits>
#include <utility>
#include <vector>

using namespace std;

int main()
{
    // Sleeps for one second in its constructor.
    struct SleepAtInitialize
    {
        SleepAtInitialize()
        {
            this_thread::sleep_for( 1s );
        }
    };
    // Calls fn.operator ()<I>() once for every index I of the sequence.
    auto unroll = []<size_t ... Indices>( index_sequence<Indices ...>, auto fn )
    {
        ((fn.template operator ()<Indices>()), ...);
    };
    constexpr unsigned N_THREADS = 10;
    vector<jthread> threads;
    threads.reserve( N_THREADS );
    unroll( make_index_sequence<N_THREADS>(),
        [&]<unsigned Thread>()
        {
            // Each instantiation of the inner lambda has its own static object.
            threads.emplace_back(
                [&]<unsigned IObj>( integral_constant<unsigned, IObj> )
                {
                    static SleepAtInitialize guard;
                },
                integral_constant<unsigned, Thread>() );
        } );
}

The code runs about one second with all three implementations, so there's a mutex per statically initialized object.

"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Sep 07 12:06PM -0700

On 9/7/2023 5:17 AM, Bonita Montero wrote:
> That sounds too complex for me. Creating individual mutexes on
> demand would be o.k. and I think a lock-free stack with a pool
> of mutexes would be fancy. A hashtable is too slow for that.

No. A simple hash of a pointer into an index works out okay, not too slow at all. Fwiw, check this out, tell me what you think:

https://groups.google.com/g/comp.lang.c++/c/sV4WC_cBb9Q/m/wwYQCG2hAwAJ

It is a quick and crude example simulation of one way to do it. The hash lock table is created before program logic is executed. Any thoughts? Also, we are talking about a slow path wrt DCL.

"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Sep 07 12:20PM -0700

On 9/7/2023 11:06 AM, Richard Damon wrote:
> local static objects don't get initialized until the first call of the
> function. These will need some synchronization if the function is called
> from multiple threads at "the same time".

A simple global hash lock scheme is where we can hash addresses directly into a static locking table. The lock table is created _before_ any program logic is executed.

https://groups.google.com/g/comp.lang.c++/c/sV4WC_cBb9Q/m/wwYQCG2hAwAJ

> Also, some global object could start up a thread in its constructor.

YIKES! Shit. I have had to debug other people's code that did this. Many points of errors... One was a rather common peach of a bug. The constructor would create a thread that would in turn call into a virtual function and start using the object before its constructor was completed. A massive race condition, nasty ones!
Bonita Montero <Bonita.Montero@gmail.com>: Sep 07 09:27PM +0200

On 07.09.2023 at 21:06, Chris M. Thomasson wrote:
> No. A simple hash of a pointer into an index works out okay, not
> too slow at all. Fwiw, check this out, tell me what you think.

Then the number of mutexes would be fixed.
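The hashed lock table under discussion could look roughly like the sketch below. It is not the code behind the linked post; the table size of 64, the shift by 4 bits and the names kLockCount, g_locks, lock_index and lock_for are assumptions chosen for illustration. As noted in the reply above, the number of mutexes is fixed, so unrelated objects may share a lock, and nested initializations that hash to the same slot would need extra care.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>

constexpr std::size_t kLockCount = 64;  // assumed size; must be a power of two
static_assert((kLockCount & (kLockCount - 1)) == 0, "table size must be a power of two");

// std::mutex has a constexpr constructor, so this table can typically be
// set up before any program logic runs.
std::array<std::mutex, kLockCount> g_locks;

// Maps an object's address to a slot in the lock table.
std::size_t lock_index(const void* addr) {
    auto bits = reinterpret_cast<std::uintptr_t>(addr);
    bits >>= 4;  // drop low bits that tend to be zero because of alignment
    return static_cast<std::size_t>(bits) & (kLockCount - 1);
}

// Returns the mutex guarding the object at the given address.
std::mutex& lock_for(const void* addr) {
    return g_locks[lock_index(addr)];
}
```

A caller on the slow path of the double-checked lock would write something like `std::lock_guard<std::mutex> guard(lock_for(&g_foo));`. The lookup is a shift and a mask, which is why it costs little compared with the lock itself.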
red floyd <no.spam.here@its.invalid>: Sep 06 05:51PM -0700

On 9/5/2023 9:14 PM, Keith Thompson wrote:
>> No, I believe that std::unordered_map has a specialization for both
>> const char* and const wchar_t*.
> I don't believe that's correct. Do you have a reference?

Isn't it based on std::less<>? Just checked. My mistake, forget I said anything.
Bonita Montero <Bonita.Montero@gmail.com>: Sep 07 04:51AM +0200

On 06.09.2023 at 20:44, Pavel wrote:
>> The string-object is created for every lookup with a string-literal
>> if the key is also a string-object.
> False. Read the standard.

Quote the standard.

>> That has nothing to do with C++20.
> False. Read the standard, e.g. how to use

Quote the standard.

> template<class K> iterator find(const K& k);
> template<class K> const_iterator find(const K& k) const;
> };

There's a templated K-parameter for the lookup, but internally this parameter needs to be converted to a string-object to be hashed and to be equality-comparable the same way. In theory there might be overloaded specializations that take a string_view or whatever; but that's rather unlikely.
Pavel <pauldontspamtolk@removeyourself.dontspam.yahoo>: Sep 06 11:54PM -0400

Bonita Montero wrote:
> There's a templated K-parameter for the lookup, but internally
> this parameter needs to be converted to a string-object to be
> hashed and to be equality-comparable the same way.

correct

> In theory there might be overloaded specializations that take
> a string_view or whatever;

not needed

> but that's rather unlikely.

close but imprecise. That's simply not there.
Bonita Montero <Bonita.Montero@gmail.com>: Sep 07 06:58AM +0200

On 07.09.2023 at 05:54, Pavel wrote:
>> In theory there might be overloaded specializations that take
>> a string_view or whatever;
> not needed

Without such a specialization a string object needs to be created inside find to have compatible hashing and equality comparison.

>> but that's rather unlikely.
> close but imprecise. That's simply not there.

That's a theoretically valid possibility, so it can't be said that it doesn't exist.
Pavel <pauldontspamtolk@removeyourself.dontspam.yahoo>: Sep 07 11:08AM -0400

Bonita Montero wrote:
>> not needed
> Without such a specialization a string object needs to be created
> inside find to have compatible hashing and equality comparison.

Not true. I was benevolently trying to make you read the standard so you could become a better C++ programmer but you refused. I am therefore giving up on helping you. I will become evil and post the complete standard-compliant example that demonstrates how exactly unordered_map::find can be made to work on a string literal and a map with string keys without creating a string object (in the example, the string is wrapped in StringKey to track constructions easily but you are welcome to remove the wrapper). The code will work as described under C++20 but not any previous version of the standard -- and this is the expected behavior. You still have a chance to improve your C++ programming skills if you read the standard and find the explanation for why this code shall behave like it does.
// ------------- code begin cut here -------------------------
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <iostream>
#include <numeric>
#include <string>
#include <unordered_map>

using namespace std;

// Wraps a string and logs every construction from a C string.
class StringKey {
public:
    StringKey(const char* cStr): s_(cStr) {
        cout << "\n\t(StringKey(" << cStr << ") called)\n";
    }
    const string& getS() const { return s_; }
private:
    string s_;
};

// Transparent hash: hashes a C string and a StringKey the same way.
struct StringKeyHash {
    typedef void is_transparent;
    size_t operator()(const char *s) const {
        assert(!!s);
        return Hash(s, s + strlen(s));
    }
    size_t operator()(const StringKey &s) const {
        return Hash(s.getS().data(), s.getS().data() + s.getS().size());
    }
private:
    static size_t Hash(const char* begin, const char* end) {
        return accumulate(begin, end, (size_t) 0u,
            [](size_t a, char c) -> size_t { return a + (size_t)c; });
    }
};

// Transparent equality: compares StringKey and C string in any combination.
struct StringKeyEq {
    typedef void is_transparent;
    bool operator()(const StringKey& x, const StringKey& y) const {
        return IsEq(x.getS().c_str(), y.getS().c_str());
    }
    bool operator()(const StringKey& x, const char* y) const {
        assert(!!y);
        return IsEq(x.getS().c_str(), y);
    }
    bool operator()(const char* x, const StringKey& y) const {
        assert(!!x);
        return IsEq(x, y.getS().c_str());
    }
private:
    static bool IsEq(const char *x, const char *y) {
        assert(!!x);
        assert(!!y);
        for (;; ++x, ++y) {
            if (*x == *y) {
                if (!*x)
                    return true;
                continue;
            }
            assert(*x != *y);
            return false;
        }
    }
};

int main(int, char*[]) {
    cout << "*** fill up the unordered map\n";
    unordered_map<StringKey, string, StringKeyHash, StringKeyEq> um {
        { "key1", "val1" },
        { "key2", "val2" },
    };
    const char* key3 = "key3";
    const char* key4 = "key4";
    const char* key1 = "key1";
    cout << "*** now do the the find\n";
    const auto i1 = um.find(key1);
    const bool r1 = i1 == um.end();
    const auto i3 = um.find(key3);
    const bool r3 = i3 == um.end();
    const auto i4 = um.find(key4);
    const bool r4 = i4 == um.end();
    cout << "find(" << key1 << "):" << r1 << ' ' <<
        "find(" << key4 << "):" << r4 << ' ' <<
        "find(" << key3 << "):" << r3 << endl;
    return 0;
}
// ------------- code end cut here -------------------------

// --- example run output if compiled by g++ -std=c++20 begin ---
$ ./a.out
*** fill up the unordered map

	(StringKey(key1) called)

	(StringKey(key2) called)
*** now do the the find
find(key1):0 find(key4):1 find(key3):1
// --- example run output if compiled by g++ -std=c++20 end ---

// --- example run output if compiled by g++ -std=c++17 begin ---
$ ./a.out
*** fill up the unordered map

	(StringKey(key1) called)

	(StringKey(key2) called)
*** now do the the find

	(StringKey(key1) called)

	(StringKey(key3) called)

	(StringKey(key4) called)
find(key1):0 find(key4):1 find(key3):1
// --- example run output if compiled by g++ -std=c++17 end ---

>> close but imprecise. That's simply not there.
> That's a theoretically valid possibility,
> so it can't be said that it doesn't exist.

This is "theoretically valid" only for those who refuse to read the standard; the others know the specialization is not there.
Bonita Montero <Bonita.Montero@gmail.com>: Sep 07 05:56PM +0200

On 07.09.2023 at 17:08, Pavel wrote:
> Not true. I was benevolently trying to make you read the standard
> so you could become a better C++ programmer but you refused.

You're having a totally different discussion and don't notice what my point is. Within find for a string key a string key is generated from K if the key is a string-object; that's all. What's wrong with that?

The alternative you're showing below is something completely different, and the idea came across my mind when I discovered what the problem with my bug was. But a string_view could be much more performant if you don't do any additional allocations per inserted node. But that's not my point.

Should I disable just-my-code debugging with MSVC and show you the code which converts a string-pointer to a comparable string-object?

Rest unread. Idiot !
Bonita Montero <Bonita.Montero@gmail.com>: Sep 07 06:04PM +0200

On 07.09.2023 at 17:56, Bonita Montero wrote:
> You're having a totally different discussion and don't notice what my
> point is. Within find for a string key a string key is generated from K
> if the key is a string-object; that's all. What's wrong with that?

Here are the declarations for C++20 from en.cppreference.com:

template< class K > iterator find( const K& x );              (3) (since C++20)
template< class K > const_iterator find( const K& x ) const;  (4) (since C++20)

MSVC has no code for that even when I use C++20. I think MS dropped that because an internal conversion within find() could also be done externally while calling find(). Smart decision.
Bonita Montero <Bonita.Montero@gmail.com>: Sep 07 05:55PM +0200 #include <iostream> #include <type_traits> #include <string> #include <chrono> #include "char_conv.h" using namespace std; using namespace chrono; int main() { string strValue( "-3.14159266359e-300" ); else auto compare = [&]<bool Precise, bool Std>( char const *prefix, uint64_t reference, bool_constant<Precise>, bool_constant<Std> ) { double value; if constexpr( !Std ) parse_double<Precise>( strValue.cbegin(), strValue.cend(), value ); else from_chars( strValue.data(), strValue.data() + strValue.size(), value ); uint64_t bin = bit_cast<uint64_t>( value ); cout << prefix << hexfloat << value; if( unsigned diffBits = 64 - countl_zero( bin ^ reference ); reference ) cout << ": " << diffBits; cout << endl; return bit_cast<uint64_t>( value ); }; uint64_t dummyReference = 0, pdImprecise = compare( "imprecise: ", dummyReference, false_type(), false_type() ), pdPrecise = compare( "precise vs imprecise: ", pdImprecise, true_type(), false_type() ), fcVsImprecise = compare( "from_chars vs. ip: ", pdImprecise, false_type(), true_type() ), fcVsPrecise = compare( "from_chars vs. p: ", pdPrecise, false_type(), true_type() ); } I implemented sth. like from_chars myself. I wanted to have a maximum, precision with that. For the given value above the highest different bit of my precise solution vs. from_chars is bit 11, so 12 bits of the mantissa are more exact. If you take a naive approach and sum up the suffix-digits multiplied by their 10 ^ N value from left to right the internediate values you add become smaller the further you get right and at last won't partipate in the mantissa. So I do the math from right to left by first putting everything into a table (100 digits, if overflow everything is put into a thread-local vector which isn't reallocated for the next call). And if you get further in the 10 ^ N row there are N errors of each / 10 or * 10 operation. 
My function has a template parameter which leads to a computation of the 10 ^ N value for each digit by my own pow10() function, which feeds from a table for each bit of the N-exponent, i.e. there are two tables which store the positive and negative eponent's values (10 ^ (2 ^ N)). With the more precise solution I get a result where the lower 12 mantissa bits are different for the above string value. |
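The table-driven power of ten described above might be sketched as follows. This is not the poster's pow10(): the name pow10_sketch, the table kPow10Pow2 and the limit of 308 are assumptions of this illustration, and negative exponents are handled with one reciprocal instead of the second table mentioned in the post. The point is that 10 ^ N is assembled from at most nine multiplications, one per set bit of N, rather than N repeated multiplications by 10.

```cpp
#include <cassert>
#include <cstdlib>

// 10^(2^k) for k = 0..8. The entries up to 1e16 are exactly representable
// as doubles; the larger ones are the nearest doubles to the true powers.
static const double kPow10Pow2[] = {
    1e1, 1e2, 1e4, 1e8, 1e16, 1e32, 1e64, 1e128, 1e256
};

// Computes 10^exponent by multiplying one table entry per set bit of |exponent|.
double pow10_sketch(int exponent) {
    unsigned n = static_cast<unsigned>(std::abs(exponent));
    assert(n <= 308);  // keep the result within the finite range of double
    double result = 1.0;
    for (unsigned bit = 0; n != 0; ++bit, n >>= 1) {
        if (n & 1u)
            result *= kPow10Pow2[bit];
    }
    // Simplification: a single division instead of a table of negative powers.
    return exponent < 0 ? 1.0 / result : result;
}
```

For example, an exponent of 300 has four set bits (256 + 32 + 8 + 4), so its power of ten is built from four table entries, compared with 300 rounding steps for repeated multiplication by 10.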
Tony Oliver <guinness.tony@gmail.com>: Sep 07 03:36AM -0700

On Wednesday, 6 September 2023 at 23:15:02 UTC+1, Amine Moulay Ramdane wrote:
> Don't worry, i have just posted three more posts about artificial intelligence , since i have just wanted to explain more my views on artificial intelligence, and it is my last post here.
> Thank you,
> Amine Moulay Ramdane.

But of course it won't be, you habitual liar.