Tuesday, December 18, 2018

Digest for comp.lang.c++@googlegroups.com - 22 updates in 7 topics

Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: Dec 18 10:20PM

On 18 Dec 2018 18:58:14 GMT
 
> Now, Udo said that the locale cannot possibly delete the
> facet when the »new« quoted above will throw, so there'd be
> a leak.
 
There may be a bug but I do not think it lies in the code you have
posted.
 
If new throws std::bad_alloc then there is no leak with respect to that
particular allocation because there has been none, and surely neither
locale's constructor nor _Impl's constructor will run? If the _Impl
constructor above throws then the new expression is required to clean
up any memory the operator new() allocated for the _Impl object in
question.
 
Furthermore if the _Impl constructor throws then the locale object's
constructor will execute no further, so it cannot increment __f's
reference count. As the __f object is not passed to _Impl's
constructor I don't immediately see where the problem arises there
either (possibly the object accessed through the __other reference does
something amiss with other facets, who knows). I think you are going
to have to dig deeper into where the leak you have found arises (I have
not examined whether your leak detector works correctly and whether the
bug lies there instead).
Jorgen Grahn <grahn+nntp@snipabacken.se>: Dec 17 11:51PM

On Mon, 2018-12-17, Vir Campestris wrote:
 
> It generates random numbers.
 
> Perhaps they aren't random enough for your needs, which is why your
> concern. But it shouldn't be public facing.
 
I don't understand the "public facing" part.
 
> Perhaps if you tell us a bit more about why you are worried we can help.
 
Yes. "Vulnerability" can mean many different things. I note that
https://en.wikipedia.org/wiki/Mersenne_Twister (unsurprisingly) says
the algorithm is "not cryptographically secure", so the implementation
doesn't matter in that regard.
 
/Jorgen
 
--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
Manfred <invalid@add.invalid>: Dec 18 02:40AM +0100

On 12/18/18 12:51 AM, Jorgen Grahn wrote:
> https://en.wikipedia.org/wiki/Mersenne_Twister (unsurprisingly) says
> the algorithm is "not cryptographically secure", so the implementation
> doesn't matter in that regard.
 
Also in http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/efaq.html
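For what it's worth, "not cryptographically secure" is a property of the
algorithm itself: the stream is fully determined by the seed, and the
internal state of MT19937 can be reconstructed from enough consecutive
outputs. A minimal sketch of the determinism with the standard library
engine:

#include <cassert>
#include <random>

int main() {
    // Two engines seeded identically produce identical output forever:
    // fine for simulation, useless as a source of secrets.
    std::mt19937 a(12345);
    std::mt19937 b(12345);
    for (int i = 0; i < 1000; ++i)
        assert(a() == b());
}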
 
David Brown <david.brown@hesbynett.no>: Dec 18 01:06PM +0100

On 18/12/18 00:51, Jorgen Grahn wrote:
 
>> Perhaps they aren't random enough for your needs, which is why your
>> concern. But it shouldn't be public facing.
 
> I don't understand the "public facing" part.
 
As you say, "vulnerability" can mean many things. For software, it can
mean "vulnerable to buffer overflows, special characters, SQL injection,
malicious input data, etc." I am guessing that this is what Vir meant
here - code like this should not be in a "public facing" part of the
system, meaning you are not going to pass data to it directly from the
outside world. So if the code has a function that expects an integer
between 1 and 100000000, and a vulnerability causing a buffer overflow
if it is called with negative values, that will not matter because the
code will not be used with unchecked values.
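A small sketch of that idea (the names here are made up for
illustration): the internal function assumes its argument is already in
range, and the only code exposed to outside data does the checking.

#include <charconv>
#include <cstdio>
#include <optional>
#include <string>

// Hypothetical internal function: it assumes 1 <= n <= 100000000 and is
// never handed raw input from the outside world.
static unsigned internal_generate(int n) {
    return static_cast<unsigned>(n) * 2654435761u;   // placeholder computation
}

// The public-facing boundary validates untrusted text before calling in.
static std::optional<unsigned> generate_from_text(const std::string& text) {
    int n = 0;
    auto [end, ec] = std::from_chars(text.data(), text.data() + text.size(), n);
    if (ec != std::errc{} || end != text.data() + text.size())
        return std::nullopt;                         // not a well-formed integer
    if (n < 1 || n > 100000000)
        return std::nullopt;                         // outside the contract's range
    return internal_generate(n);
}

int main() {
    if (auto r = generate_from_text("-5")) std::printf("%u\n", *r);
    else std::puts("rejected");                      // prints "rejected"
}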
 
> https://en.wikipedia.org/wiki/Mersenne_Twister (unsurprisingly) says
> the algorithm is "not cryptographically secure", so the implementation
> doesn't matter in that regard.
 
True.
 
"Chris M. Thomasson" <invalid_chris_thomasson@invalid.invalid>: Dec 18 01:30PM -0800

> // Copyright (C) 2000 - 2009, Richard J. Wagner
> // All rights reserved.
> //
 
Fwiw: It really does not create "true" random numbers. Instead, it
outputs pseudo-random numbers. If you are interested in crypto-safe,
HMAC-based generation of pseudo-random numbers (e.g., wrt a good
crypto-safe hash, SHA-384?), look here:
 
http://funwithfractals.atspace.cc/ct_cipher
 
Here is a sample C impl with a hardcoded in-memory secret key of
"Password" and sha256 for the hash used with the HMAC:
 
https://groups.google.com/d/topic/comp.lang.c/a53VxN8cwkY/discussion
(please, read _all_ if interested...)
 
The ciphertext can actually be used as crypto-safe pseudo-random numbers.
 
The PRNG stream would be password protected, such that different
passwords produce radically different streams. Also, the random numbers
are actually used to encrypt files, so we can encrypt files created
from the output of an actual TRNG. The seed for the CSPRNG is the
password and the hash algo.
 
Interested?
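Roughly, the scheme amounts to HMAC(secret key, counter) producing a
keyed block stream. A minimal sketch of that shape using OpenSSL's HMAC
(my own illustration, not the ct_cipher code; link with -lcrypto):

#include <openssl/evp.h>
#include <openssl/hmac.h>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// One block of the keyed stream: HMAC-SHA256(key, counter).
static std::vector<unsigned char> hmac_block(const std::string& key,
                                             std::uint64_t counter) {
    unsigned char md[EVP_MAX_MD_SIZE];
    unsigned int len = 0;
    HMAC(EVP_sha256(),
         key.data(), static_cast<int>(key.size()),
         reinterpret_cast<const unsigned char*>(&counter), sizeof counter,
         md, &len);
    return std::vector<unsigned char>(md, md + len);
}

int main() {
    // A different key ("password") yields a radically different stream.
    for (std::uint64_t i = 0; i < 2; ++i)
        for (unsigned char b : hmac_block("Password", i))
            std::printf("%02x", b);
    std::printf("\n");
}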
Juha Nieminen <nospam@thanks.invalid>: Dec 17 02:23PM


> Often (usually), the objects are accessed randomly via the pointers, and
> the fact that the objects are located contiguously doesn't help in
> that case.
 
It still reduces memory fragmentation, which may increase efficiency
overall.
scott@slp53.sl.home (Scott Lurndal): Dec 17 04:08PM

>> that case.
 
>It still reduces memory fragmentation, which may increase efficiency
>overall.
 
I don't see that as very likely; it is highly dependent upon the
usage model of the data structure in question. But, YMMV.
Juha Nieminen <nospam@thanks.invalid>: Dec 18 09:41AM

>>overall.
 
> I don't see that as too likely, and highly dependent upon the
> usage model of the data structure in question. But, YMMV.
 
If you are, for example, allocating a thousand objects, you reduce
memory fragmentation significantly by making one single allocation
(i.e. an array of a thousand objects) rather than a thousand individual
allocations.
 
(Granted, if these are the very first allocations the program makes,
there will probably be no memory fragmentation. However, in a large
program with tons and tons of allocations and deallocations, the amount
of fragmentation increases with every additional individual allocation
and deallocation. Minimizing the number of allocations, and allocating
large chunks of memory at a time (in the form of arrays), helps reduce
the overall amount of memory fragmentation.)
 
Memory fragmentation increases overall memory consumption and reduces
speed (because it increases the number of cache misses).
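A minimal sketch of the contrast being described (illustrative types,
not anyone's actual code):

#include <memory>
#include <vector>

struct Obj { double payload[4]; };

// One allocation: a thousand objects in a single contiguous block.
void contiguous() {
    std::vector<Obj> objs(1000);
    // ... use objs[i] ...
}

// A thousand allocations: a thousand separate blocks scattered in the heap.
void scattered() {
    std::vector<std::unique_ptr<Obj>> objs;
    objs.reserve(1000);
    for (int i = 0; i < 1000; ++i)
        objs.push_back(std::make_unique<Obj>());
    // ... use *objs[i] ...
}

int main() { contiguous(); scattered(); }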
scott@slp53.sl.home (Scott Lurndal): Dec 18 01:51PM

>reduces memory fragmentation if you make one single allocation (ie. an
>array of a thousand objects), than if you make a thousand individual
>allocations.
 
Why do you say that? If I allocate all 1000 objects immediately
and store their pointers, how is that fragmenting anything? Or
if I allocate all 1000 objects in a single allocation and
divide the allocation up?
 
And, for that matter, who cares if virtual memory is "fragmented"?
(Physical memory is, of course, always "fragmented" into pages).
 
Anyone doing small frequent allocations is already doing something
wrong.
 
 
>Memory fragmentation increses overall memory consumption and slows down
>speed (because it increases the amount of cache misses).
 
1) How does it increase overall memory consumption? Are you
referring to the 8-16 byte overhead per allocation? That's
in the noise.
 
2) How does it 'increase the amount (number) of cache misses',
particularly when each object consumes multiple cache lines?
Juha Nieminen <nospam@thanks.invalid>: Dec 18 07:10PM

> and store their pointers, how is that fragmenting anything? Or
> if I allocate all 1000 objects in a single allocation and
> divide the allocation up?
 
If all your program ever allocates is those 1000 objects, then maybe.
However, programs do a lot more than that.
 
> And, for that matter, who cares if virtual memory is "fragmented"?
> (Physical memory is, of course, always "fragmented" into pages).
 
Traversing RAM in a random order is slower than traversing it
consecutively (regardless of whether it's physical or virtual
memory). When traversing it consecutively, the hardware prefetches
data, among other things.
 
 
> 1) How does it increase overall memory consumption? Are you
> referring to the 8-16 byte overhead per allocation? That's
> in the noise.
 
Because memory fragmentation causes many gaps of free memory to
form between allocated blocks. When these gaps are too small for
a new allocation, additional memory will be allocated even though
there may be a lot of it free. It's just that all that free memory
is in such small gaps that the new allocation may not fit in any of
them.
 
There's a reason why many garbage-collected runtime systems (such
as the Java runtime) perform memory compaction. (Memory compaction
is rearranging all the allocated blocks into consecutive memory,
removing all the free gaps.)
 
> 2) How does it 'increase the amount (number) of cache misses',
> particularly when each object consumes multiple cache lines?
 
Because traversing memory in random order causes more cache misses
than traversing it consecutively.
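A small sketch of the access-pattern contrast (not a rigorous
benchmark): the same data summed in storage order and in a shuffled
order; only the second defeats the prefetcher.

#include <algorithm>
#include <cstdio>
#include <numeric>
#include <random>
#include <vector>

static long long sum_sequential(const std::vector<int>& v) {
    // Consecutive traversal: prefetch-friendly.
    return std::accumulate(v.begin(), v.end(), 0LL);
}

static long long sum_shuffled(const std::vector<int>& v) {
    // Same elements, visited in a random order.
    std::vector<std::size_t> idx(v.size());
    std::iota(idx.begin(), idx.end(), std::size_t{0});
    std::shuffle(idx.begin(), idx.end(), std::mt19937{42});
    long long s = 0;
    for (std::size_t i : idx) s += v[i];
    return s;
}

int main() {
    std::vector<int> v(10'000'000, 1);
    std::printf("%lld %lld\n", sum_sequential(v), sum_shuffled(v));
}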
scott@slp53.sl.home (Scott Lurndal): Dec 18 08:33PM

>> divide the allocation up?
 
>If all your program ever allocates is those 1000 objects, then maybe.
>However, programs do a lot more than that.
 
Most of mine are performance sensitive and allocate upon
startup, or use pools.
 
 
 
>Traversing RAM in a random order is slower than traversing it
>consecutively (regardless of whether it's physical or virtual
>memory).
 
That's certainly not accurate. Access latency to DRAM is not
address sensitive (modulo minor effects such as leaving pages
open in a DIMM, any memory controller interleaving behavior,
and internal fifo backpressure).
 
The various levels (L1I, L1D, L2, LLC) of cache are designed
to reduce the average access latency to DRAM.
 
>When traversing it consecutively, the hardware prefethces
>data and other such things.
 
The team I work with is responsible for architecting
prefetchers and we do extensive real-world workload tracing
in order to have sufficient data to decide whether a particular
prefetcher is achieving the required performance. Few
interesting server workloads access memory using regular strides;
random accesses are much more common (large graph data structures,
for example).
 
>Because traversing memory in random order causes more cache misses
>than traversing it consecutively.
 
That's entirely dependent upon the stride distance. Which in real-world
workloads is generally larger than the cache line size.
Vir Campestris <vir.campestris@invalid.invalid>: Dec 18 09:12PM

On 14/12/2018 14:02, Scott Lurndal wrote:
> avoided. I prefer to think that if you don't understand
> how to use them properly, you shouldn't be programming in
> a language that supports them.
 
I may be one of them.
 
I believe _raw_ pointers should be avoided.
 
Certainly if I find code that says
 
... * foo = new ...
 
alarm bells go off. unique_ptr is almost always a better solution.
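A small illustration of the difference (illustrative type):

#include <memory>

struct Widget { int value = 0; };

void raw_style() {
    Widget* w = new Widget;                // must be deleted on every path,
    // ...                                 // including early returns and exceptions
    delete w;
}

void unique_ptr_style() {
    auto w = std::make_unique<Widget>();   // released automatically
    // ...
}

int main() { raw_style(); unique_ptr_style(); }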
 
Andy
Vir Campestris <vir.campestris@invalid.invalid>: Dec 18 09:16PM

On 14/12/2018 11:25, JiiPee wrote:
> Ye it has like 1000 slots to start with. But the main point I wanted
> vector is because its easy to use (add element in the middle, pushback,
> etc). array is not easy to use.
 
If you are inserting things in the middle frequently vector may not be
the right type either.
 
It involves moving all the objects after the insertion up by one.
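For reference, a quick sketch of what a middle insertion costs in each
container (illustrative only):

#include <iterator>
#include <list>
#include <vector>

int main() {
    std::vector<int> v(1000, 0);
    v.insert(v.begin() + 500, 42);        // shifts ~500 elements up by one

    std::list<int> l(1000, 0);
    auto it = std::next(l.begin(), 500);  // O(n) to reach the position...
    l.insert(it, 42);                     // ...but the insert relinks two pointers
}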
 
Andy
Lynn McGuire <lynnmcguire5@gmail.com>: Dec 18 03:24PM -0600

On 12/14/2018 5:23 AM, JiiPee wrote:
> (pushback, remove, clear, insert). array does not have there.. so its
> difficult to use. how do you easily add an element in the middel of an
> array? :)
 
Yup, the key word there is "easily".
 
Lynn
ram@zedat.fu-berlin.de (Stefan Ram): Dec 18 06:58PM

Is this possible: A bug (memory leak) in GCC?
 
What is your opinion about this?
 
Udo S. recently reported in the German C++ newsgroup
"comp.lang.iso-c++" that GCC's library contains code like:
 
template<typename _Facet> locale::locale(const locale& __other, _Facet* __f)
{ _M_impl = new _Impl(*__other._M_impl, 1);
 
Let me remind you: The locales contain reference-counting
memory management for facets. When the last locale
referencing a facet (by a pointer) goes away, the facet also
will be deleted. (Unless the facet was constructed with
constructor argument »1«.)
 
Now, Udo said that the locale cannot possibly delete the
facet when the »new« quoted above will throw, so there'd be
a leak. He said that the situation with LLVM was similar:
 
void locale::__install_ctor(const locale& other, facet* f, long id)
{ if (f)
__locale_ = new __imp(*other.__locale_, f, id);
else
 
. I then wrote a program for a recent GCC to demonstrate
this supposed leak.
 
The following output of my program shows a situation where
»new _Impl« does not throw, and there is no leak:
 
Allocate facet with new:
New 736 bytes at 0x3ca940.
Create new locale with facet:
Delete 736 bytes at 0x3ca940.
Closing.
Closed.
Everything ok, no memory leak.
 
. The following output of my program shows a situation where
»new _Impl« does throw, and there is a leak:
 
Allocate facet with new:
New 736 bytes at 0x3ca940.
Create new locale with facet:
Throwing:
*** Caught bad_alloc. ***
*** Oh no! memory leak detected! ***
 
You can see my source code below. BTW: Note that an old version
of clang emitted a seemingly spurious warning for my code:
 
main.cpp:112:3: warning: This statement is never executed [clang-analyzer-alpha.deadcode.UnreachableCode]
{ try{ main1(); }
^
 
. Source code (might not be portable and only work with a
recent [as of 2018-12] version of gcc):
 
#include <cstddef>
#include <cstdlib>
#include <iostream>
#include <locale>
#include <sstream>
#include <new>
 
static void * traced_memory = nullptr; /* the address of the facet allocated */
 
/* DO NOT change "trace_next_allocation" here, but only in main1, directly before the allocation! */
static auto trace_next_allocation { 0 };
 
/* DO NOT change "simulate_bad_alloc" here, but only in main1, directly before the allocation! */
static auto simulate_bad_alloc { 0 };
 
/* replace standard operator "new" by the following customized version */
void * operator new( size_t const size )
{ if( simulate_bad_alloc )
  { simulate_bad_alloc = 0;
    ::std::cout << "Throwing:\n";
    throw ::std::bad_alloc {}; }
  auto const mem = malloc(size);
  if( trace_next_allocation )
  { ::std::cout << "New " << size << " bytes at " << mem << ".\n";
    traced_memory = mem;
    trace_next_allocation = 0; }
  return mem; }
 
/* replace standard operator "delete" by the following customized version */
void operator delete( void * const ptr )noexcept
{ if( ptr == traced_memory )::std::cout << "Delete bytes at " << ptr << ".\n";
  free( ptr );
  if( ptr == traced_memory )traced_memory = nullptr; /* ok, it did NOT leak */ }
 
/* replace standard operator "delete (sized)" by the following customized version */
void operator delete( void * const ptr, std::size_t const size )noexcept
{ if( ptr == traced_memory )::std::cout << "Delete " << size << " bytes at " << ptr << ".\n";
  free( ptr );
  if( ptr == traced_memory )traced_memory = nullptr; /* ok, it did NOT leak */ }
 
static void check()
{ ::std::cout <<
    ( traced_memory?
      "*** Oh no! memory leak detected! ***\n":
      "Everything ok, no memory leak.\n" ); }
 
struct csv_whitespace : std::ctype<wchar_t>{};
 
static void main1()
{ { ::std::istringstream const istringstream{ "" };
    ::std::cout << "Allocate facet with new:" << '\n';
    trace_next_allocation = 1;
    auto * const csv_whitespace_facet = new csv_whitespace; /* is this allocation leaked? */
 
    std::cout << "Create new locale with facet:" << '\n';
    simulate_bad_alloc = 1;
    std::locale( istringstream.getloc(), csv_whitespace_facet ); /* object intentionally destroyed immediately after creation */
 
    std::cout << "Closing.\n"; }
  std::cout << "Closed.\n"; }
 
int main()
{ try{ main1(); }
  catch( ::std::bad_alloc& ba ){ ::std::cout << "*** Caught bad_alloc. ***\n"; }
  check(); }
ram@zedat.fu-berlin.de (Stefan Ram): Dec 18 07:00PM

>"comp.lang.iso-c++"
 
de.comp.lang.iso-c++
User <aitor.folgoso@googlemail.com>: Dec 18 04:15AM -0800

I was testing how it can be done. The Qt metacall function works and I was able to implement it, but I was wondering if it was possible to do it using template parameters and functions with different prototypes.
 
I'll check the libsigc++ library, but I wanted a straightforward solution, something I could write with a couple of functions to test.
 
Thanks anyway
 
On Saturday, December 15, 2018 at 22:43:10 UTC+1, Christian Gollwitzer wrote:
"Ross A. Finlayson" <ross.finlayson@gmail.com>: Dec 18 05:29AM -0800


> // invoke should call destinationFunction but without the argument "2"
> }
 
> Can someone help me?
 
Isn't the point to remove the first one?
 
Removing the first parameter or argument
is probably in the runtime.
 
The compiler probably just made it the pointer.
 
Isn't C++ run-time type to the linker or what?
 
Oh, linking object code, eh, always so simple.
 
What about passing arg - sizeof(arg) and skipping zero.
Cholo Lennon <chololennon@hotmail.com>: Dec 18 12:32AM -0300

On 12/15/18 3:25 AM, Chris M. Thomasson wrote:
> #include <mutex>
> #include <cassert>
 
> #define THREADS 4
 
Unbelievable, 2018 and people still using the preprocessor to define
constants :-O
 
> #define CT_MB_ACQ std::memory_order_acquire
> #define CT_MB_REL std::memory_order_release
> #define CT_MB_RLX std::memory_order_relaxed
 
Again, unbelievable :-O
 
 
 
--
Cholo Lennon
Bs.As.
ARG
"Chris M. Thomasson" <invalid_chris_thomasson@invalid.invalid>: Dec 17 08:27PM -0800

On 12/17/2018 7:32 PM, Cholo Lennon wrote:
 
>> #define THREADS 4
 
> Unbelievable, 2018 and people still using the preprocessor to define
> constants :-O
 
A habit from C: Yikes!
 
 
>> #define CT_MB_REL std::memory_order_release
>> #define CT_MB_RLX std::memory_order_relaxed
 
> Again, unbelievable :-O
 
Actually, imvvho, these can be "useful". They allow one to change the
memory order. Imagine observing what happens when one defines CT_MB_ACQ
to std::memory_order_relaxed. Btw, they are slightly shorter... I should
label them with "do not panic"?
 
 
[...]
David Brown <david.brown@hesbynett.no>: Dec 18 01:12PM +0100

On 18/12/18 05:27, Chris M. Thomasson wrote:
> order. Imagine observing what happens when one defines CT_MB_ACQ to
> std::memory_order_relaxed? Btw, they are slightly shorter... I should
> label them with do not panic?
 
How about:
 
const auto ct_mb_acq = std::memory_order_acquire;
 
?
 
 
But don't give them names like this and then change them "to see what
happens". Use a name like "per_thread_memory_order", or something that
identifies how you are using the constant. /Then/ you can change it to
see the effect.
 
You do that with "THREADS" - it is named by usage, and you can change
the value without making the program nonsensical. It would be highly
confusing to write:
 
#define FOUR 4
 
std::thread threads[FOUR];
 
and then experiment with:
 
#define FOUR 8
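Going back to the memory-order constants, the suggestion amounts to
something like this (a sketch, with made-up usage-based names):

#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Named by usage, so they can be changed in one place to observe the effect.
constexpr auto publish_order = std::memory_order_release;
constexpr auto consume_order = std::memory_order_acquire;
constexpr std::size_t worker_count = 4;

std::atomic<int> ready{0};

int main() {
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < worker_count; ++i)
        workers.emplace_back([] { while (!ready.load(consume_order)) { } });
    ready.store(1, publish_order);
    for (auto& t : workers) t.join();
}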
"Chris M. Thomasson" <invalid_chris_thomasson@invalid.invalid>: Dec 17 11:41PM -0800

This allows a thread to get around the ABA problem, in a poor man's RCU
style. The C++ code uses Relacy Race Detector. It eases my mind to run
it as a unit test before I code it in pure C++11.
 
Using simple per-thread locks and a little quiescence detection scheme
to keep things Kosher.
 
https://groups.google.com/forum/#!topic/lock-free/ryIAaCod7OQ
 
https://pastebin.com/raw/FpnvTqvM
 
_________________________
// Simple Per-Thread Mutex Based Proxy Collector
// For Academic and Experimental things...
// Beginner, Moderate?
// In Good Ol' Relacy! Nice as always.
//_______________________________________________
 
 
//#define RL_DEBUGBREAK_ON_ASSERT
//#define RL_MSVC_OUTPUT
//#define RL_FORCE_SEQ_CST
//#define RL_GC
 
 
#include <relacy/relacy_std.hpp>
#include <vector>
#include <algorithm>
#include <iostream>
 
 
// Simple macro-based redirection of the verbose std membars.
#define CT_MB_ACQ std::memory_order_acquire
#define CT_MB_REL std::memory_order_release
#define CT_MB_RLX std::memory_order_relaxed
#define CT_MB_ACQ_REL std::memory_order_acq_rel
#define CT_MB(mp_0) std::atomic_thread_fence(mp_0)
 
 
// Some global vars directing the show...
#define WORKERS 4
#define REAPERS 2 // set it to one for now...
#define ITERS 5
#define THREADS (WORKERS + REAPERS)
 
 
// A User Node
struct unode
{
std::atomic<unode*> m_next;
VAR_T(unsigned long) m_data;
unode(unsigned long data, unode* next = nullptr) : m_data(data),
m_next(next) {}
};
 
 
// The global ptex state
template<std::size_t T_threads>
struct ptex
{
 
// Per-thread state
struct per_thread
{
std::mutex m_quiesce; // essential for this scheme
VAR_T(unsigned long) m_ver_local; // for future use
std::atomic<unsigned long> m_ver_reap; // for future use
 
 
per_thread() : m_ver_local(0), m_ver_reap(0)
{
//std::cout << "ptex::per_thread::per_thread()\n";
}
 
~per_thread() throw()
{
//std::cout << "ptex::per_thread::~per_thread()\n";
}
 
 
// A thread enters a region
void acquire()
{
unsigned long nver = VAR(m_ver_local)++;
m_quiesce.lock($);
m_ver_reap.store(nver, CT_MB_RLX);
CT_MB(CT_MB_ACQ);
}
 
 
// A thread exits a region
void release()
{
CT_MB(CT_MB_REL);
unsigned long ver = VAR(m_ver_local);
m_ver_reap.store(ver + 1, CT_MB_RLX);
m_quiesce.unlock($);
VAR(m_ver_local) = ver + 1;
}
 
 
void quiesce_brute()
{
// Ensure quiescence
m_quiesce.lock($);
m_quiesce.unlock($);
}
};
 
 
per_thread m_threads[T_threads];
std::atomic<unode*> m_collected;
 
 
ptex() : m_collected(nullptr)
{
//std::cout << "ptex::ptex()\n";
}
 
 
~ptex() throw()
{
RL_ASSERT(!m_collected.load(CT_MB_RLX));
//std::cout << "ptex::~ptex()\n";
}
 
// Gain a ref to our threads per-thread struct
per_thread& get_ref() throw()
{
return m_threads[rl::thread_index()];
}
 
// delete everything...
void dump(unode* n)
{
while (n)
{
unode* next = n->m_next.load(CT_MB_RLX);
delete n;
n = next;
}
}
 
 
// We are finished, collect the node...
void collect(unode* n) throw()
{
unode* head = m_collected.load(CT_MB_RLX);
 
do
{
n->m_next.store(head, CT_MB_RLX);
CT_MB(CT_MB_REL); // release
 
} while (!m_collected.compare_exchange_weak(
head, n, CT_MB_RLX));
}
 
 
// Flush the stack
unode* reap() throw()
{
unode* head = m_collected.exchange(nullptr, CT_MB_RLX);
if (head) CT_MB(CT_MB_ACQ); // acquire
return head;
}
 
 
// Wait for a quiesce
void quiesce_brute() throw()
{
for (std::size_t i = 0; i < T_threads; ++i)
{
per_thread& pt = m_threads[i];
//pt.quiesce();
pt.quiesce_brute();
}
}
};
 
 
// Relacy Multex Test...
struct ct_multex_test
: rl::test_suite<ct_multex_test, THREADS>
{
typedef ptex<THREADS> ptex_t;
ptex_t g_ptex; // Global ptex
std::atomic<unode*> g_worker_stack; // Global User Stack
unsigned long g_contrived_shutdown; // Just for shutdown
 
 
void before()
{
g_worker_stack.store(nullptr, CT_MB_RLX);
g_contrived_shutdown = WORKERS;
}
 
 
void after()
{
 
}
 
 
// Add a user item to the stack
void worker_push(unode* n)
{
unode* head = g_worker_stack.load(CT_MB_RLX);
 
do
{
n->m_next.store(head, CT_MB_RLX);
CT_MB(CT_MB_REL); // release
} while (!g_worker_stack.compare_exchange_weak(head, n,
CT_MB_RLX));
}
 
// Remove a user item from the stack
unode* worker_pop_try()
{
unode* head = g_worker_stack.load(CT_MB_RLX);
unode* next = nullptr;
 
do
{
if (!head) break;
CT_MB(CT_MB_ACQ); // acquire
next = head->m_next.load(CT_MB_RLX); // critical load! ;^o
 
} while (!g_worker_stack.compare_exchange_weak(head, next,
CT_MB_RLX));
 
return head;
}
 
 
// A User Worker Example. The main point!
void worker()
{
ptex_t::per_thread& self = g_ptex.get_ref();
//std::cout << "worker[" << rl::thread_index() << "]\n";
 
// perform some work
for (unsigned long i = 0; i < ITERS; ++i)
{
// Push a node
unode* n = new unode(rl::thread_index());
 
worker_push(n);
 
 
// Pop a node
unode* nx = nullptr;
 
for (;;)
{
self.acquire(); // Per thread acquire
 
nx = worker_pop_try(); // In critical region
 
if (nx)
{
unsigned long data = VAR(nx->m_data);
//std::cout << data << "\n";
RL_ASSERT(data < THREADS);
}
 
self.release(); // Per thread release
 
if (nx) break;
 
rl::yield(1, $);
}
 
// Collect a node
g_ptex.collect(nx); // We are fin with nx
 
 
// Quiesce a couple of times per iteration.
if ((i % 2) == 0)
{
//self.quiesce(); // without brute force
}
}
 
// remove our worker
--g_contrived_shutdown;
}
 
 
 
// Holds user items in a stack
struct pending_reap
{
unode* m_head;
unode* m_tail;
 
pending_reap() : m_head(nullptr), m_tail(nullptr) {}
 
// push items
void push(unode* head, unode* tail)
{
if (!m_head)
{
m_head = head;
m_tail = tail;
}
 
else
{
tail->m_next.store(m_head, CT_MB_RLX);
m_head = head;
}
}
 
// dump all
void dump()
{
unode* n = m_head;
m_head = m_tail = nullptr;
 
while (n)
{
unode* next = n->m_next.load(CT_MB_RLX);
delete n;
n = next;
}
}
};
 
 
// Tracks a per_thread quiescence
struct thread_quiesce_node
{
thread_quiesce_node* m_next;
ptex_t::per_thread* m_thread;
 
thread_quiesce_node(ptex_t::per_thread* thread) :
m_next(nullptr), m_thread(thread) {}
};
 
 
// tracks a reapers control of its view of quiescence
struct thread_quiesce
{
thread_quiesce_node* m_head;
thread_quiesce_node* m_ready;
ptex_t& m_ptex;
 
thread_quiesce(ptex_t& ptex) : m_head(nullptr),
m_ready(nullptr), m_ptex(ptex)
{
for (std::size_t i = 0; i < THREADS; ++i)
{
thread_quiesce_node* n =
new thread_quiesce_node(&ptex.m_threads[i]);
 
n->m_next = m_head;
m_head = n;
}
}
 
~thread_quiesce()
{
thread_quiesce_node* n = m_head;
 
// Dump the head
while (n)
{
thread_quiesce_node* next = n->m_next;
delete n;
n = next;
}
 
// Dump ready... We are at dtor anyway.
n = m_ready;
 
while (n)
{
thread_quiesce_node* next = n->m_next;
delete n;
n = next;
}
}
 
 
// Run a quiescence detection cycle
bool quiesce()
{
// Split sort...
thread_quiesce_node* not_ready = nullptr;
thread_quiesce_node* n = m_head;
 
// Iterate...
while (n)
{
thread_quiesce_node* next = n->m_next;
 
if (!n->m_thread->m_quiesce.try_lock($))
{
// NOT ready for the cycle...
n->m_next = not_ready;
not_ready = n;
}
 
else
{
// PERFECT! We can run full steam ahead. :^)
n->m_thread->m_quiesce.unlock($);
n->m_next = m_ready;
m_ready = n;
}
 
n = next;
}
 
m_head = not_ready;
 
if (m_head) return false;
 
// BINGO: TARGET ACQUIRED!
m_head = m_ready;
m_ready = nullptr;
 
return true;
}
};
 
// Add a reap to a reap...
unode* reap_penders(pending_reap& reap)
{
unode* fhead = g_ptex.reap();
 
if (fhead)
{
// Find tail
unode* ftail = fhead;
unode* next = ftail->m_next.load(CT_MB_RLX);
 
while (next)
{
ftail = next;
next = next->m_next.load(CT_MB_RLX);
}
 
// Push onto fresh queue
reap.push(fhead, ftail);
}
 
return fhead;
}
 
 
// Actually deletes user items
void reaper()
{
ptex_t::per_thread& self = g_ptex.get_ref();
//std::cout << "reaper[" << rl::thread_index() << "]\n";
 
pending_reap fresh;
pending_reap old;
thread_quiesce tquiesce(g_ptex);
 
unsigned long qcount = 0;
 
// Reap Loop
{
unode* fhead = nullptr;
 
while (g_contrived_shutdown != 0)
{
reap_penders(fresh);
 
if (tquiesce.quiesce())
{
// The good ol' two step...
pending_reap dumpster = old;
old = fresh;
fresh.m_head = fresh.m_tail = nullptr;
 
dumpster.dump();
}
 
rl::yield(1, $);
}
}
 
// Clean up, for we are fin.
reap_penders(fresh);
 
g_ptex.quiesce_brute(); // wait for a quiescence point
 
// Flush... Drain sound... ;^)
fresh.dump();
old.dump();
}
 
 
// Direct the work to the correct threads... ;^)
void thread(unsigned int tidx)
{
if (tidx < REAPERS)
{
reaper();
}
 
else
{
worker();
}
}
};
 
 
 
// Test away... Or fly? Humm...
int main()
{
{
rl::test_params p;
 
p.iteration_count = 5000000;
//p.execution_depth_limit = 33333;
// p.search_type = rl::sched_bound;
//p.search_type = rl::fair_full_search_scheduler_type;
//p.search_type = rl::fair_context_bound_scheduler_type;
 
rl::simulate<ct_multex_test>(p);
}
 
return 0;
}
 
_________________________
 
 
Any interest?
