- Tricky ... - 18 Updates
- How to get mantissa of long double? - 5 Updates
- Most efficient prefetching distance - 1 Update
- "Improving Stability with Modern C++, Part 1" by Ralph Kootker - 1 Update
| "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Oct 01 05:14PM -0700 On 10/1/2021 3:13 PM, Branimir Maksimovic wrote: > Same thing practically, except linux futex, which is same thing. > Interrestingly Darwin does not have it and I am really interrested > how Apple immplements pthread_mutex? Ohhhh... Good question. I am not sure about Darwin. Actually, there is a way to implement the mutex I showed you using binary semaphores for the slow path. Iirc, it went like this, with the futex part commented out, and the initialization of the binary sema to zero, auto-reset event on windoze, also out: ____________________________ struct ct_futex_mutex { ULONG alignas(CT_L2_ALIGNMENT) m_state; ct_futex_mutex() : m_state(0) { } void lock() { if (InterlockedExchange(&m_state, 1)) { while (InterlockedExchange(&m_state, 2)) { //ULONG cmp = 2; //WaitOnAddress(&m_state, &cmp, sizeof(ULONG), INFINITE); WaitForSingleObject(m_event, INFINITE); } } } void unlock() { if (InterlockedExchange(&m_state, 0) == 2) { //WakeByAddressSingle(&m_state); SetEvent(m_event); } } }; ____________________________ That works as well. Humm... https://github.com/apple/darwin-libpthread/blob/main/man/pthread_mutex_lock.3 https://github.com/apple/darwin-libpthread/blob/main/src/pthread_mutex.c Need to example this! There is another way to create a nice FIFO mutex using a fast semaphore. Iirc, it was called a benaphore: https://www.haiku-os.org/legacy-docs/benewsletter/Issue1-26.html#Engineering1-26 |
| "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Oct 01 05:16PM -0700 On 10/1/2021 5:14 PM, Chris M. Thomasson wrote: > On 10/1/2021 3:13 PM, Branimir Maksimovic wrote: >> On 2021-10-01, Chris M. Thomasson <chris.m.thomasson.1@gmail.com> wrote: [...] > There is another way to create a nice FIFO mutex > using a fast semaphore. Iirc, it was called a benaphore: > https://www.haiku-os.org/legacy-docs/benewsletter/Issue1-26.html#Engineering1-26 Basically, its adding the ability to avoid the kernel using a fast-path on the semaphore logic. Also, its wait-free on the fast-path because it can be implemented using XADD. |
| "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Oct 01 06:00PM -0700 On 9/30/2021 12:10 AM, David Brown wrote: > synchronisation right - and a lot of /really/ tough stuff in getting it > right and optimally efficient on big processors. But "acquire" and > "release" semantics are helpfully named! They are named nicely. Back on the SPARC acquire is: MEMBAR #LoadStore | #LoadLoad Release is: MEMBAR #LoadStore | #StoreStore Ahhh, then we have hardcore: #StoreLoad. SMR required this on the SPARC. |
| "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Oct 01 06:01PM -0700 On 10/1/2021 6:00 PM, Chris M. Thomasson wrote: > Release is: > MEMBAR #LoadStore | #StoreStore > Ahhh, then we have hardcore: #StoreLoad. SMR required this on the SPARC. Heck, it even required it on the x86! LOCK'ed atomic or MFENCE. |
| Branimir Maksimovic <branimir.maksimovic@icloud.com>: Oct 02 01:44AM > slow path. Iirc, it went like this, with the futex part commented out, > and the initialization of the binary sema to zero, auto-reset event on > windoze, also out: Problem is that Apple acts like student newbs. They don't care about API stability or code quality in general. It's like pouring effort into the void; you have to waste a lot of time when programming for macOS... > }; > ____________________________ > That works as well. Humm... yeah... > Need to examine this! There is another way to create a nice FIFO mutex > using a fast semaphore. Iirc, it was called a benaphore: > https://www.haiku-os.org/legacy-docs/benewsletter/Issue1-26.html#Engineering1-26 I'll look... -- 7-77-777 Evil Sinner! |
| Branimir Maksimovic <branimir.maksimovic@icloud.com>: Oct 02 02:14AM
> Need to examine this! There is another way to create a nice FIFO mutex using
> a fast semaphore. Iirc, it was called a benaphore:
> https://www.haiku-os.org/legacy-docs/benewsletter/Issue1-26.html#Engineering1-26

Here is the Apple version:

#include <locale>
#include <iostream>
#include <thread>
#include <semaphore.h>
#include <functional>

#define CT_L2_ALIGNMENT 128
#define CT_THREADS 32
#define CT_ITERS 666666

using ULONG = unsigned long;

struct ct_futex_mutex
{
    sem_t sema;
    alignas(CT_L2_ALIGNMENT) ULONG m_state;

    ct_futex_mutex() : m_state(0) { sem_init(&sema, 0, 0); }

    void lock()
    {
        if (__sync_swap(&m_state, 1))
        {
            while (__sync_swap(&m_state, 2))
            {
                sem_wait(&sema);
            }
        }
    }

    void unlock()
    {
        if (__sync_swap(&m_state, 0) == 2)
        {
            sem_post(&sema);
        }
    }
};

struct ct_shared
{
    ct_futex_mutex m_mtx;
    unsigned long m_count;

    ct_shared() : m_count(0) {}

    ~ct_shared()
    {
        if (m_count != 0)
        {
            std::cout << "counter is totally fubar!\n";
        }
    }
};

void ct_thread(ct_shared& shared)
{
    for (unsigned long i = 0; i < CT_ITERS; ++i)
    {
        shared.m_mtx.lock();
        ++shared.m_count;
        shared.m_mtx.unlock();

        shared.m_mtx.lock();
        --shared.m_count;
        shared.m_mtx.unlock();
    }
}

int main()
{
    std::locale mylocale("");
    std::cout.imbue(mylocale); // use locale number formatting style

    std::thread *threads = new std::thread[CT_THREADS];

    std::cout << "Starting up...\n";

    {
        ct_shared shared;

        for (unsigned long i = 0; i < CT_THREADS; ++i)
        {
            threads[i] = std::thread(ct_thread, std::ref(shared));
        }

        std::cout << "Running...\n";

        for (unsigned long i = 0; i < CT_THREADS; ++i)
        {
            threads[i].join();
        }
    }

    std::cout << "Completed!\n";

    return 0;
}

/**
;^)
>>> stacks.
>> Sure. Whatever you say Bonita. Cough.... Cough... ;^)
> Heh, rsyncing something, looking bloat and have no patience :p
;^)
*/
--
7-77-777 Evil Sinner!
|
| "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Oct 01 07:33PM -0700 On 10/1/2021 7:14 PM, Branimir Maksimovic wrote: >> a fast semaphore. Iirc, it was called a benaphore: >> https://www.haiku-os.org/legacy-docs/benewsletter/Issue1-26.html#Engineering1-26 > Here is Apple version: [...] Excellent! That is basically identical to the bin-sema version! For what its worth, take reference to a man by the name of Alexander Terekhov! And look at the code in pthreads-win32 sources. I used to converse with this genius way back on comp.programming.threads. He was a pleasure to talk to. Actually, I am SenderX way back here: https://groups.google.com/g/comp.programming.threads/c/KepRbFWBJA4/m/pg83oJTzPUIJ ;^) |
| Bonita Montero <Bonita.Montero@gmail.com>: Oct 02 06:39AM +0200 Am 01.10.2021 um 22:04 schrieb Chris M. Thomasson: >> If it hasn't anything other to do than "waiting" for a new entry it >> spins. > Huh? Ever heard of a futex, or an eventcount? Lock-free is without any kernel-structures and polling only. No futex. |
| Bonita Montero <Bonita.Montero@gmail.com>: Oct 02 06:40AM +0200 Am 01.10.2021 um 22:20 schrieb Chris M. Thomasson: >> spins. Lock-free and wait-free datastructures are idiocracy except >> from lock-free stacks. > So a lock-free stack is okay with you, but not a lock-free queue? Why? Because the use-cases of lock-free stacks are so that the lock-free stacks are never polled. |
| "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Oct 01 10:12PM -0700 On 10/1/2021 9:39 PM, Bonita Montero wrote: >> Huh? Ever heard of a futex, or an eventcount? > Lock-free is without any kernel-structures and polling only. > No futex. Lock-free on the fast-path... Ever heard of such a thing? Wow. |
| "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Oct 01 10:12PM -0700 On 10/1/2021 9:40 PM, Bonita Montero wrote: >> So a lock-free stack is okay with you, but not a lock-free queue? Why? > Because the use-cases of lock-free stacks are so that the > lock-free stacks are never polled. Huh? What are you talking about? |
| "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Oct 01 10:14PM -0700 On 10/1/2021 9:40 PM, Bonita Montero wrote: >> So a lock-free stack is okay with you, but not a lock-free queue? Why? > Because the use-cases of lock-free stacks are so that the > lock-free stacks are never polled. Never polled? A slow-path on a lock-free stack can be waited on. |
| Bonita Montero <Bonita.Montero@gmail.com>: Oct 02 07:35AM +0200 Am 02.10.2021 um 07:12 schrieb Chris M. Thomasson: >> Lock-free is without any kernel-structures and polling only. >> No futex. > Lock-free on the fast-path... Ever heard of such a thing? Wow. There is no slow path with lock-free structures. |
| Bonita Montero <Bonita.Montero@gmail.com>: Oct 02 07:36AM +0200 Am 02.10.2021 um 07:14 schrieb Chris M. Thomasson: >> Because the use-cases of lock-free stacks are so that the >> lock-free stacks are never polled. > Never polled? A slow-path on a lock-free stack can be waited on. Then it isn't lock-free. I think I'm talking to a complete moron here. |
| "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Oct 01 10:49PM -0700 On 10/1/2021 10:35 PM, Bonita Montero wrote: >>> No futex. >> Lock-free on the fast-path... Ever heard of such a thing? Wow. > There is no slow path with lock-free structures. Ummmm... You are just trolling me right? A slow path would be what to do when one needs to wait, on say, an empty condition? Humm... Why do you troll? |
| "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Oct 01 10:55PM -0700 On 10/1/2021 10:36 PM, Bonita Montero wrote: >> Never polled? A slow-path on a lock-free stack can be waited on. > Then it isn't lock-free. > I think I'm talking to a complete moron here. Fast-path lock/wait-free, slow-path might have to hit the kernel to wait on certain conditions. You seem to have never differentiated between the two possible paths? |
| Bonita Montero <Bonita.Montero@gmail.com>: Oct 02 08:00AM +0200 Am 02.10.2021 um 07:49 schrieb Chris M. Thomasson: > Ummmm... You are just trolling me right? A slow path would be what to do > when one needs to wait, on say, an empty condition? Humm... Why do you > troll? Lock-free is when there's no kernel-locking involved. And a slow-path involves kernel-locking. |
| Bonita Montero <Bonita.Montero@gmail.com>: Oct 02 08:00AM +0200 Am 02.10.2021 um 07:55 schrieb Chris M. Thomasson: > Fast-path lock/wait-free, slow-path might have to hit the kernel to wait > on certain conditions. You seem to have never differentiated between the > two possible paths? Lock-free is when there's no kernel-locking involved. And a slow-path involves kernel-locking. |
| James Kuyper <jameskuyper@alumni.caltech.edu>: Oct 01 08:19PM -0400 On 10/1/21 2:53 PM, Branimir Maksimovic wrote: ... > long double max = numeric_limits<long double>::max(); > long mantissa = max; // implicit conversion "A prvalue of a floating-point type can be converted to a prvalue of an integer type. The conversion truncates; that is, the fractional part is discarded. The behavior is undefined if the truncated value cannot be represented in the destination type." (7.3.10p1). While it's not required to be the case, on most implementations std::numeric_limits<long double>::max() is WAY too large to be represented by a long, so the behavior of such code is undefined. The minimum value of LDBL_MAX (set by the C standard, inherited by the C++ standard) is 1e37, which would require long to have at least 123 bits in order for that conversion to have defined behavior. And even when it has defined behavior, I can't imagine how you would reach the conclusion that this conversion should be the value of the mantissa. |
| wij <wyniijj@gmail.com>: Oct 01 05:23PM -0700
> > The frexp functions break a floating-point number into a normalized fraction and an integer exponent. They store the integer in the int object pointed to by p.
> So iexp should be set. When I ran the code, it got set to a value of
> 16384.

That's not the problem.

// ----- file t.cpp -----
#include <math.h>
#include <limits>
#include <iostream>
using namespace std;
#define ENDL endl

template<typename T>
int64_t get_mant(T x) {
    int iexp;
    x = frexp(x, &iexp);
    x = ldexp(x, numeric_limits<T>::digits);
    // Note: for long double on x86 the mantissa is 64 bits wide, so this
    // cast overflows int64_t (hence the 0x8000000000000000 result below).
    return static_cast<int64_t>(x);
};

int main() {
    cout << dec << get_mant(numeric_limits<float>::max()) << ", "
         << hex << get_mant(numeric_limits<float>::max()) << ENDL;
    cout << dec << get_mant(numeric_limits<double>::max()) << ", "
         << hex << get_mant(numeric_limits<double>::max()) << ENDL;
    cout << dec << get_mant(numeric_limits<long double>::max()) << ", "
         << hex << get_mant(numeric_limits<long double>::max()) << ENDL;
    return 0;
};
// end file t.cpp -----

$ g++ t.cpp
$ ./a.out
16777215, ffffff
9007199254740991, 1fffffffffffff
-9223372036854775808, 8000000000000000
|
| Branimir Maksimovic <branimir.maksimovic@icloud.com>: Oct 02 02:16AM > And even when it has defined behavior, I can't imagine how you would > reach the conclusion that this conversion should be the value of the > mantissa. It is not undefined as long as long is larger than the mantissa part. OK, correct is long long :P -- 7-77-777 Evil Sinner! |
| James Kuyper <jameskuyper@alumni.caltech.edu>: Oct 02 01:28AM -0400 On 10/1/21 10:16 PM, Branimir Maksimovic wrote: >> reach the conclusion that this conversion should be the value of the >> mantissa. > It is not undefined as long as long is larger than the mantissa part. No, the relevant issue isn't the size of the mantissa. As stated above in my quote from the standard, it's whether "the truncated value cannot be represented in the destination type". std::numeric_limits<long double>::max() doesn't have a fractional part, so the truncated value is the same as the actual value. On my system, for instance, that value is 1.18973e+4932. > OK, correct is long long :P On my system, changing it to long long doesn't make any difference - the maximum value representable by long long is still 9223372036854775807, the same as the maximum value for long; it's still far too small to represent 1.18973e+4932, so the behavior of the conversion is undefined. The actual behavior on my system appears to be saturating at LLONG_MAX == 9223372036854775807. If I change the second line to long long mantissa = 0.75*max; 0.75*max is 8.92299e+4931, which should certainly not have the same mantissa as max itself, but the value loaded into "mantissa" is still 9223372036854775807. |
| James Kuyper <jameskuyper@alumni.caltech.edu>: Oct 02 01:56AM -0400 I accidentally sent this message first to Branimir by e-mail, and he responded in kind. On 10/2/21 1:31 AM, Branimir Maksimovic wrote: >> On 02.10.2021., at 07:27, James Kuyper <jameskuyper@alumni.caltech.edu> wrote: >> On 10/1/21 10:16 PM, Branimir Maksimovic wrote: ... > Problem is that neither long double nor long is defined how small > or large can be… > so it can fit or not… The C++ standard cross-references the C standard for such purposes, and the C standard imposes strict limits on how small those things can be: LLONG_MAX is required to be at least 9223372036854775807, and LDBL_MAX is supposed to be at least 1e37. You are right, however, about there being no limits on how large they can be. It is therefore permissible for an implementation to have LLONG_MAX >= LDBL_MAX, but do you know of any such implementation? In any event, the relevant issue is not the limits imposed on those values by the standard, but the actual values of LLONG_MAX and LDBL_MAX for the particular implementation you're using, and it's perfectly feasible to determine those values from <climits>, <cfloat>, or std::numeric_limits<>::max. What are those values on the implementation you're using? > but question is how to extract mantissa which was answer :P No, that is not the answer. If max did have a value small enough to make the conversion to long long have defined behavior, the result of that conversion would be the truncated value itself (7.3.10p1), NOT the mantissa of the truncated value. What makes you think otherwise? |
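For completeness, a sketch of one way to actually pull the mantissa bits out with frexp/ldexp, scaled into an unsigned 64-bit integer so the long double case (64 mantissa bits on x87) does not overflow the way the earlier int64_t version does. This is only an illustration under those assumptions, not a claim about what the original poster intended; the function name is made up:

#include <cmath>
#include <cstdint>
#include <cinttypes>
#include <cstdio>
#include <limits>

template <typename T>
std::uint64_t mantissa_bits(T x)
{
    static_assert(std::numeric_limits<T>::digits <= 64,
                  "mantissa wider than 64 bits");
    int exp;
    T frac = std::frexp(x, &exp);   // frac in [0.5, 1) for finite, non-zero x
    // Scale the fraction up by the full mantissa width and truncate;
    // the result always fits in uint64_t for types with <= 64 digits.
    return static_cast<std::uint64_t>(
        std::ldexp(frac, std::numeric_limits<T>::digits));
}

int main()
{
    std::printf("%" PRIx64 "\n",
                mantissa_bits(std::numeric_limits<float>::max()));       // ffffff
    std::printf("%" PRIx64 "\n",
                mantissa_bits(std::numeric_limits<double>::max()));      // 1fffffffffffff
    std::printf("%" PRIx64 "\n",
                mantissa_bits(std::numeric_limits<long double>::max())); // ffffffffffffffff with x87 long double
}

On platforms where long double is just double (e.g. 64-bit ARM macOS), the last line prints 1fffffffffffff instead.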
| Bonita Montero <Bonita.Montero@gmail.com>: Oct 02 07:09AM +0200 > Modern CPUs for the last decade have included automatic prefetchers > in the cache subsystems. Usually a mix of stride-based and/or predictive > fetchers. If they were better, my program would give its best result with zero explicit prefetching, and there would be no prefetch instructions at all. > It's very seldom necessary for an application to provide an > explicit prefetching hint except in very unusual circumstances. Automatic prefetchers are dumb. |
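As a concrete point of reference for this thread, a minimal sketch of explicit software prefetching with a tunable distance, assuming the GCC/Clang __builtin_prefetch intrinsic; the function name and the constant are made up for the example:

#include <cstddef>

constexpr std::size_t PREFETCH_DIST = 16;   // elements ahead; tune by measuring

double sum_with_prefetch(const double* data, std::size_t n)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        if (i + PREFETCH_DIST < n)
            // rw = 0 (read), locality = 3 (keep in all cache levels)
            __builtin_prefetch(&data[i + PREFETCH_DIST], 0, 3);
        sum += data[i];
    }
    return sum;
}

For a simple streaming loop like this the hardware prefetcher may already do the job, which is exactly the disagreement in this thread; the only way to settle the best distance is to measure it on the target machine and workload.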
| Lynn McGuire <lynnmcguire5@gmail.com>: Oct 01 08:04PM -0500 "Improving Stability with Modern C++, Part 1" by Ralph Kootker https://medium.com/factset/improving-stability-with-modern-c-part-1-getting-started-f7025e97e1c3 "C++ is a workhorse programming language across the financial industry. At FactSet, it has powered many of our products for almost 30 years. Because our code base predates C++ standardization, we have a lot of legacy code to support. However, we've recently completed a major compiler upgrade that enables our workstation developers to start using modern C++ for the first time. While C++ had a reputation for a steep learning curve, we've found the new features safer, easy to adopt, and more stable for our clients. While the C++11 standard recently had its 10th birthday, there are still developers out there using modern C++ for the first time. With that in mind, we've prepared a series of short introductory posts on the features of C++11 and beyond. We're sharing our journey with the wider C++ community in the hopes that others will find it as useful as we have." Lynn |