- Simulating piping one program into another, and into another - 7 Updates
- Why can't I understand what coroutines are? - 15 Updates
- Periodic message transfers using asio - 1 Update
- In the end, rason will come - 1 Update
- I think that the future looks much more bright for parallel programming - 1 Update
Frederick Gotham <cauldwell.thomas@gmail.com>: Aug 05 03:22AM -0700

So let's say I have three programs. Normally I would run these three programs at the command line as follows:

    prog1 | prog2 | prog3

So let's say I take the source code for these 3 programs and try to combine them into one. So I rename the three 'main' functions and then make a new 'main'. I start off with code like this:

    int main_prog1();
    int main_prog2();
    int main_prog3();

    int main()
    {
        int const retval1 = main_prog1();
        int const retval2 = main_prog2();
        int const retval3 = main_prog3();
        return retval1 & retval2 & retval3;
    }

An alternative method would be to start two more threads so that each line would be processed "on the fly", but for now I'm going to work with one thread, so 'main_prog1' will finish completely before 'main_prog2' begins.

How would you go about doing this? Here's what I'm thinking so far. . .

Go through the code for prog1 and replace all occurrences of "cout" with "cout1". Do the same with prog2 (i.e. cout2). Same goes for standard input (cin2, cin3). Next create a header file with something like:

    #include <sstream>
    extern std::stringstream cout1, cout2;
    static std::stringstream &cin2 = cout1;
    static std::stringstream &cin3 = cout2;

So the first program will write to cout1, and then the second program will read from cin2. So the previous code snippet becomes something like:

    std::stringstream cout1, cout2;
    std::stringstream &cin2 = cout1;
    std::stringstream &cin3 = cout2;

    int main_prog1();
    int main_prog2();
    int main_prog3();

    int main()
    {
        int const retval1 = main_prog1();
        cout1.seekg(0, std::ios::beg);
        int const retval2 = main_prog2();
        cout2.seekg(0, std::ios::beg);
        int const retval3 = main_prog3();
        return retval1 & retval2 & retval3;
    }

Have any of you ever done this before? What do you think of my idea? What way would you do it?

Frederick |
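A related approach, not taken from the thread, avoids renaming `cout` in the source at all: temporarily swap the stream buffers of `std::cin`/`std::cout` to point at stringstreams between stages. A minimal sketch; the stage bodies `main_prog1`/`main_prog2` are made-up stand-ins, not anyone's real programs:

```cpp
#include <cctype>
#include <iostream>
#include <sstream>
#include <string>

// Made-up stand-ins for two of the renamed main() functions.
int main_prog1() { std::cout << "hello\nworld\n"; return 0; }
int main_prog2() {
    // Uppercase each line read from standard input.
    std::string line;
    while (std::getline(std::cin, line)) {
        for (char &c : line) c = std::toupper(static_cast<unsigned char>(c));
        std::cout << line << '\n';
    }
    return 0;
}

// Run one stage with std::cin/std::cout temporarily redirected to
// in-memory streams, restoring the real buffers afterwards.
int run_stage(int (*stage)(), std::istream &in, std::ostream &out) {
    std::streambuf *old_in  = std::cin.rdbuf(in.rdbuf());
    std::streambuf *old_out = std::cout.rdbuf(out.rdbuf());
    int rc = stage();
    std::cin.rdbuf(old_in);
    std::cout.rdbuf(old_out);
    return rc;
}

// Simulate "prog1 | prog2" in one process, one thread.
std::string run_pipeline() {
    std::stringstream buf1, buf2;
    run_stage(&main_prog1, buf2 /* input unused */, buf1);  // prog1 writes buf1
    run_stage(&main_prog2, buf1, buf2);                     // prog2: buf1 -> buf2
    return buf2.str();
}
```

With this shape, adding the threaded "on the fly" variant later only means replacing the stringstreams with a synchronised queue; the stage code itself stays untouched.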
Jorgen Grahn <grahn+nntp@snipabacken.se>: Aug 05 11:48AM On Mon, 2019-08-05, Frederick Gotham wrote: > So let's say I have three programs. Normally I would run these three > programs at the commandline as follows: > prog1 | prog2 | prog3 That's a core idea in the Unix world, yes. > So let's say I take the source code for these 3 programs and try to > combine them into one. But why? If you have a problem that can be solved with a Unix pipeline, count yourself lucky. There's no drop-in replacement in C++ or elsewhere, except possibly in functional languages like Haskell or Erlang. /Jorgen -- // Jorgen Grahn <grahn@ Oo o. . . \X/ snipabacken.se> O o . |
"Öö Tiib" <ootiib@hot.ee>: Aug 05 05:48AM -0700 On Monday, 5 August 2019 13:23:01 UTC+3, Frederick Gotham wrote: > So let's say I have three programs. Normally I would run these three programs at the commandline as follows: > prog1 | prog2 | prog3 More or less; there are typically some command line arguments. > So let's say I take the source code for these 3 programs and try to combine them into one. Before doing it you should think why. Are pipes inefficient for your use case? There is Boost.Interprocess with plenty of tools for more efficient inter-process communication. Do you hope for optimizations in interfaces between modules? The streams won't allow much anyway. Do such modules share a lot of code? Use shared objects or DLLs. > Have any of you ever done this before? What do you think of my idea? What way would you do it? I like to keep modules small if possible. I have gone in the other direction: split a single large code base into several. A frequent example: kicking filters/converters of old, rarely used file formats, versions or functionality out into separate, rarely used processes. It lets the main processing module use a single input format/version and a single output format/version, and that can simplify it a lot. It can cause some performance hit for rarely used functionality, but the more frequently needed modules load and execute quicker and take fewer resources. I have sometimes replaced pipes with RPC so I can do more than pipes allow and can spread the modules across different hosts more easily. The only thing needed for such decisions is to collect statistics about frequency and performance of feature usage. That can be tricky with on-premise or embedded software (which C++ is often about). |
James Kuyper <jameskuyper@alumni.caltech.edu>: Aug 05 09:58AM -0400 On 8/5/19 6:22 AM, Frederick Gotham wrote: > } > An alternative method would be to start two more threads so that each line would be processed "on the fly", but for now I'm going to work with one thread, so 'main_prog1' will finish completely before 'main_prog2' begins. > How would you go about doing this? Offhand, I would do "prog1 | prog2 | prog3" - it's a lot simpler and in many contexts can be more efficient. Why do you want to take a different approach? |
Szyk Cech <szykcech@spoko.pl>: Aug 05 05:28PM +0200 On 05.08.2019 15:58, James Kuyper wrote: > Offhand, I would do "prog1 | prog2 | prog3" - it's a lot simpler and in > many contexts can be more efficient. Why do you want to take a different > approach? Try debugging prog1, prog2 and prog3 simultaneously... |
Paavo Helde <myfirstname@osa.pri.ee>: Aug 05 07:41PM +0300 On 5.08.2019 18:28, Szyk Cech wrote: >> many contexts can be more efficient. Why do you want to take a different >> approach? > Try debug prog1, prog2 and prog3 simultaneously... Why would I want to do that? One of the most important benefits of modular design like in "prog1 | prog2 | prog3" is better localization of problems, so the system can be debugged one component at a time, making the task *much* easier. |
James Kuyper <jameskuyper@alumni.caltech.edu>: Aug 05 09:57AM -0700 If the three programs interacted with each other, directly or indirectly, by any method other than the pipeline, the OP's suggestion wouldn't work. If they interact only through the pipeline, there's no need to debug them simultaneously. Debug the first program while dumping its output to a file; debug the second program while reading from that file and dumping to a second file; debug the third program while reading from the second file. |
"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Aug 05 01:11AM +0200 On 04.08.2019 11:37, Juha Nieminen wrote: > that are completely mysterious even to me, I have *absolutely no idea* > what they are. I just don't get it. > I go to the Wikipedia page "Coroutine"... and it tells me nothing. [snip] I'll address the conceptual only. When I saw this posting earlier today I thought I'd whip up a concrete example, like I implemented coroutines in the mid 1990's. However, I discovered that I would then be late for a dinner, so I put it on hold. Coroutines are just cooperative multitasking: multitasking with a single thread of execution but multiple call stacks. When you CALL a coroutine you create a new instance, with its own stack. There is nothing like that in the standard library. When a coroutine TRANSFERs to another coroutine it's like a `longjmp`, except that `longjmp` was designed to jump up to an earlier point in a call chain, while a coroutine transfer jumps to a separate call chain. As I recall you're familiar with 16-bit Windows programming. In 16-bit Windows (Windows 3.x in the early 1990s) each program execution was a coroutine. When a program was launched, a stack area was allocated for it. That's a call of a coroutine. When the program called `GetMessage` or `Yield`, some other program instance would get a chance to continue to run (after waiting for /its/ call to `GetMessage` or `Yield` to return). That's a coroutine transfer. The 16-bit Windows program executions were /huge/, heavy coroutines. However, the main advantage of coroutines in ordinary programming is as a very light-weight multitasking solution. Since there's only one thread of execution there are fewer synchronization issues. In particular one doesn't have to worry about whether one coroutine sees the memory changes effected by some other coroutine. Cheers & hth., - Alf |
Sam <sam@email-scan.com>: Aug 04 09:32PM -0400 Chris Vine writes: > > instead of getting torn into shreds in order to conform to event loop or > > callback-based design patterns. > I disagree. And "hundreds of connections" is not good enough. You can disagree all you want. Historical facts prove otherwise. Since ancient days, the simplest tasks on Unixes were done as pipelined tasks, as multiple execution contexts. $ sort <file | uniq -c That dates back decades. From the earliest days, shells were explicitly designed to execute pipelined commands. Pipelined commands were their main feature. All of this is pretty much I/O bound. Multiple execution threads. Because of that, and battle-tested over decades, Unix was tuned to effortlessly implement tasks that use multiple processes working in parallel. Threads were just the next evolution, dropping most of what little was left in terms of per-execution context overhead. Linux inherited Unix's legacy, and orientation, towards low-overhead multiple execution threads. All the historical network server processes on Unix, then Linux, were multi-process based. Even though select() existed pretty much since 4.2 BSD days, it was virtually unheard of for a network server to be a monolithic process, multiplexing for its clients using select(). An incoming network connection starts a new process, just for that network connection. Didn't matter what network service it was. SMTP, finger, login, or what, it was all one process per connection. There was a reason for that. It didn't take an Einstein to figure out that with a monolithic server, once it picked up an I/O event it had to do what that event needs to do, and nothing else can happen until that's done. Even if another client blurted out a packet, too bad, so sad. It needs to wait until the current I/O event's processing is done. It made a lot more sense to simply have another available CPU, not being busy with anything, run with it.
And, goshdarndammit, if that socket had a different execution thread sucking from it, why, that spare CPU will know exactly what to do. So, monolithic processes were rare. They are a little bit more common today than before, but they are still a rarity. And, today, even network servers that kick off multiple processes per client can be found. So, sorry, but facts disagree: multiple execution contexts, either as standalone processes or lightweight intra-process threads, rule the roost in I/O bound contexts. Linux inherits Unix's legacy in this respect, and the differences between processes and threads, on Linux, are mostly cosmetic. clone() creates both of them. You just specify which parts of the parent process (virtual memory map, file descriptors, filesystem namespace etc…) are shared with the new execution context, and that's pretty much it. I can understand why coming from a MS-Windows background makes someone frown at execution threads, unless all they do is stay away from the operating system and work entirely on their own, spinning the CPU. The only reason stuff like IOCP exists on Windows is because Windows sucks at multi-threading and multi-processing. It always sucked, and will likely always suck. But that's not the case with the real world out there. Being able to have a clean, encapsulated, logical execution thread, always busy with the forward march of progress from the start pointing to the finish line, without being forced to contort into an event/dispatch based framework, results in smaller, leaner code without all that event-based/dispatching cruft. It doesn't matter which one of the file descriptors is now ready for I/O. It no longer matters. The OS kernel already figured it out, and there's no good reason for userspace to piss away more CPU cycles doing exactly the same thing. The OS kernel knows which I/O is done, and which thread is waiting on it, and it already knows whose ass to kick into gear.
As such, execution threads are perfect for I/O-based load. It is no wonder that Linux rules the roost in TOP500 space. It's a dirty little secret that all those supercomputers are just a bunch of machines with optimized networking tying hundreds of thousands of threads together. Yes, threads. Gee whiz, how can that possibly happen? |
Robert Wessel <robertwessel2@yahoo.com>: Aug 04 10:04PM -0500 >writing clean, orderly code that runs logically from start to finish, >instead of getting torn into shreds in order to conform to event loop or >callback-based design patterns. If you only have "hundreds" of connections, use any technique you like, scaling isn't an issue. |
"Chris M. Thomasson" <invalid_chris_thomasson_invalid@invalid.com>: Aug 04 08:10PM -0700 On 8/4/2019 2:36 PM, Sam wrote: > its crap multithreading. > A thread per connection scales perfectly fine, on Linux. Even with > hundreds of connections. Hundreds? Try tens of thousands... Back on WinNT 4.0, my server code could handle around 40,000 concurrent TCP connections, using a handful of threads. There were bursts of activity, where many connections were being rapidly created and destroyed. My stress testing client programs tried to swamp the server with various scenarios. This was a long time ago. |
Paavo Helde <myfirstname@osa.pri.ee>: Aug 05 09:36AM +0300 On 4.08.2019 22:30, Chris Vine wrote: > summary wrong (and if so, what's the point of completion ports in the > first place)? I find it hard to believe microsoft would have gone for > the one thread per connection option favoured by my respondent. IOCP is used in Windows to *reduce* the number of needed threads (at least in user space), which is important because in Windows the thread creation is a pretty heavyweight operation. So it's strange that IOCP is mentioned as an example of "Threads can work fairly well with IO", it's rather the opposite, at least on Windows. The Linux equivalent of IOCP is AIO, see e.g. "man aio_read". However, some googling suggests the Linux implementation is not yet optimal (I guess this means it is not yet built into the kernel to the same extent as in Windows). Also, the popular Boost.Asio library has "asynchronous" already in its name and uses async IO in the background as appropriate. It also has implementations of coroutines, so maybe one can study its documentation and examples in order to get familiar with coroutines. BTW, Boost.Asio does not create a "thread per connection". Vice versa, it expects the client program to set up a thread pool (with the number of threads based on the number of cores, typically) which is reused for all requests. |
Martijn van Buul <pino@dohd.org>: Aug 05 07:46AM * Juha Nieminen: > C++20 will introduce support for coroutines. To this day, for reasons > that are completely mysterious even to me, I have *absolutely no idea* > what they are. I just don't get it. I've been using Lua in the past, and I was heavily using coroutines at some point. I don't claim to be an expert on these matters, but I'll tell you why *I* used them. The key is in the following snippet: > remove some items from q > use the items > yield to produce ... except that the snippet threw away the child with the bath water, by explicitly yielding to a specific target in both directions. That doesn't add any real benefit indeed, but when done differently, coroutines offer two benefits: 1) They decouple the consumer and the producer. 2) You don't need to maintain state in an explicit container; you can just use local variables on the stack. Coroutines certainly aren't necessary here ("But I can do it with a function call to an interface method" is a straw man argument, as no one claimed otherwise - there are always multiple solutions to a single problem), but they made some solutions a lot more elegant. I would suggest reading chapter 9 of "Programming in Lua". I know it's not C++, but maybe it'll help as a primer. An older version is online at https://www.lua.org/pil/9.1.html -- Martijn van Buul - pino@dohd.org |
Juha Nieminen <nospam@thanks.invalid>: Aug 05 08:31AM > results back to the main program. A coroutine that yields waits until > it's told to resume by the main program; a coroutine that returns > simply goes away and cannot be told to resume. I'm still not sure how that's different from a regular function. Or, perhaps more precisely, a lambda function (because as far as I understand, a coroutine can have its own state, just like a lambda can have captured variables, making the lambda effectively a stateful functor.) Is the difference that a coroutine can "return" (so to speak) from anywhere inside itself, and the next time it gets "called" again it resumes from that point forward, rather than from the beginning (like a lambda would)? I'm still not exactly sure how that's so useful in practice (compared to stateful functors, like lambdas.) |
Juha Nieminen <nospam@thanks.invalid>: Aug 05 08:37AM > This problem was solved a long time ago. It is called "threads". I think that with "threads" you are implying pre-emptive multitasking. Doesn't that introduce lots of problems relating to mutual exclusion? Quite a significant portion of bugs out there are related to multithreading problems. |
Juha Nieminen <nospam@thanks.invalid>: Aug 05 08:42AM > a bunch of state from local variables), and then continue from that > point (still those dozen subroutine calls and loops deep, with state > intact), rather than at the beginning of the routine again. From all the text and discussion out there I'm getting the picture that coroutines are a bit like lambda functions (ie. essentially stateful functors), with the difference that the coroutine can be "called" again in such a manner that it continues from where it "returned" last time, rather than always from the beginning (which would happen with lambdas). Is that correct? Is there a simple example of a situation where this is significantly beneficial compared to a lambda or an explicitly written functor (ie. a class with member variables and an operator())? |
Juha Nieminen <nospam@thanks.invalid>: Aug 05 08:50AM > Coroutines are just cooperative multitasking, multitasking with a single > thread of execution but multiple call stacks. When you CALL a coroutine > you create a new instance, with its own stack. Does this mean that if a coroutine calls other functions normally (which themselves may then call other functions and so on), this chain of calls will use its own stack that's separate and independent from the stack that's used by main() and everything it calls? And if any of those functions along the chain "yields" (or whatever the term was to "exit" this "pseudo-thread"), to later be returned to that point of execution, all that stack is still there and this chain of function calls will continue as before? If that's the case, then your explanation gave me completely new insight into coroutines that I didn't know nor understand before. |
Martijn van Buul <pino@dohd.org>: Aug 05 09:51AM * Juha Nieminen: > from anywhere inside itself, and the next time it gets "called" > again it resumes from that point forward, rather than from the > beginning (like a lambda would)? Yup. It acts like a blocking call, in a way. > I'm still not exactly sure how that's so useful in practice > (compared to stateful functors, like lambdas.) Consider a producer/consumer implemented using threads. In this case 'publish' and 'consume' operate on some kind of queue, possibly with a depth of 1. If a producer publishes something to a full queue, it blocks. If the consumer tries to consume from an empty queue, it blocks.

    class ProducerClass
    {
        [...]
        void mainLoop()
        {
            while ( ... )
            {
                Item newItem;
                [... do work on newItem ...]
                publish(newItem);
            }
        }
    };

    class ConsumerClass
    {
        [...]
        void mainLoop()
        {
            while ( ... )
            {
                std::vector<Item> packet;
                packet.reserve(10);
                for (int i = 0; i < packet.capacity(); ++i)
                {
                    Item newItem = consume();
                    [... possibly do something on newItem ...]
                    packet.push_back(newItem);
                }
                [... do something on a packet of 10 items ...]
            }
        }
    };

In this case, having the consumer and producer execute in their own thread offers no performance benefit, but it might still be very beneficial to implement it this way, depending on details of either consumer or producer. As with all examples, the above example is a bit silly. In this case, with a 1:1 relationship between consumer and producer, such a threaded solution would offer no parallelisation, so using threads here only serves to simplify the implementation of the consumer or producer - performance-wise it's detrimental. In this case, coroutines would offer a compromise: the separation and implementation benefits of a threaded solution, without the overhead caused by blocking. Another possible use of coroutines would be iterators. Suppose I have

    struct Foo
    {
    };

    struct Bar
    {
        [...]
        std::list<Foo> foos;
    };

    struct Quux
    {
        [...]
        std::vector<Bar> bars;
    };

    struct Wibble
    {
        std::map< ..., Quux> quuxMap;
    };

Suppose I want to offer a way to iterate over all instances of "Foo" inside a "Wibble", without exposing any of the classes in between (because they're private to an implementation, for example). There are numerous options:

* Create a std::container<Foo> GetFoos(const Wibble &) method. Could be expensive (because Foo is expensive to copy), impossible (because Foo has a deleted copy constructor) - and is generally undesirable anyway.

* Create an ApplyOnFoos(const Wibble &, const std::function<void(const Foo&)> &callback) method (or something similar using templates, details schmetails) that iterates over all elements and calls the callback function for every instance of 'Foo'. Could work, but the receiving end might have to capture quite a bit to make this work.

* Create an iterator class along the lines of

    class outputFunctor
    {
        outputFunctor(Wibble &) {...}
        std::optional<std::reference_wrapper<Foo>> operator()() {...}
    };

(Or create an STL iterator, which is essentially the same.) Will definitely work, but the implementation is going to be awkward. The first two have the benefit that their implementation can use simple constructs (more to the point: a nested range-based for loop will do the trick). Without coroutines, implementing the functor will require storing iterators to the intermediate classes inside the functor. With coroutines, you can implement it using the same simple nested-range for loop.
I haven't touched C++'s coroutines yet (I'm waiting for C++20 to arrive) so I don't know whether the required boilerplate will offset any implementation benefits, but in Lua it would be something like

    function outputIterator(wibble)
        local _generator = coroutine.create(
            function()
                for _, quux in pairs(wibble.quuxMap) do
                    for _, bar in pairs(quux.bars) do
                        for _, foo in pairs(bar.foos) do
                            coroutine.yield(foo)
                        end
                    end
                end
            end)
        return function ()
            local _, nextFoo = coroutine.resume(_generator)
            return nextFoo
        end
    end

usage:

    local myWibble = [....]
    for foo in outputIterator(myWibble) do
        print("Yay, a foo!", tostring(foo))
    end

(Untested, and my Lua is a bit rusty) -- Martijn van Buul - pino@dohd.org |
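For contrast, here is roughly what the third option above (the hand-rolled functor) looks like in C++ without coroutines — the part called awkward, because one iterator per nesting level must be stored and kept mutually consistent. The type names mirror the post; the traversal logic is a sketch of my own, not anyone's real code:

```cpp
#include <list>
#include <map>
#include <vector>

// Names mirror the post; members trimmed to what the traversal needs.
struct Foo { int id; };
struct Bar { std::list<Foo> foos; };
struct Quux { std::vector<Bar> bars; };
struct Wibble { std::map<int, Quux> quuxMap; };

// The hand-rolled iterator functor: one stored iterator per nesting
// level, all of which must be advanced in lockstep by hand.
class FooIterator {
public:
    explicit FooIterator(Wibble &w) : w_(w), q_(w_.quuxMap.begin()) {
        if (q_ != w_.quuxMap.end()) {
            b_ = q_->second.bars.begin();
            if (b_ != q_->second.bars.end()) f_ = b_->foos.begin();
        }
        settle();
    }

    // Returns the next Foo, or nullptr when the traversal is done.
    Foo *next() {
        if (q_ == w_.quuxMap.end()) return nullptr;
        Foo *out = &*f_;
        ++f_;
        settle();
        return out;
    }

private:
    // Advance the iterator triple until f_ points at a real Foo,
    // or q_ hits the end of the outer map.
    void settle() {
        while (q_ != w_.quuxMap.end()) {
            while (b_ != q_->second.bars.end()) {
                if (f_ != b_->foos.end()) return;  // found one
                ++b_;
                if (b_ != q_->second.bars.end()) f_ = b_->foos.begin();
            }
            ++q_;
            if (q_ != w_.quuxMap.end()) {
                b_ = q_->second.bars.begin();
                if (b_ != q_->second.bars.end()) f_ = b_->foos.begin();
            }
        }
    }

    Wibble &w_;
    std::map<int, Quux>::iterator q_;
    std::vector<Bar>::iterator b_;
    std::list<Foo>::iterator f_;
};
```

The entire settle() dance is exactly the state that a coroutine version keeps for free in its suspended nested for loops.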
David Brown <david.brown@hesbynett.no>: Aug 05 12:06PM +0200 On 05/08/2019 10:37, Juha Nieminen wrote: > Doesn't that introduce lots of problems relating to mutual exclusion? > Quite a significant portion of bugs out there are related to > multithreading problems. Yes, it certainly does. One way to think about coroutines is like threads, but cooperatively multitasking instead of preemptive multitasking. You, the programmer, have full control over which coroutine is running at a time, and when it can be blocked. This has disadvantages, of course - it means more manual control, and it cannot take advantage of multiple cores (you need to run your coroutines from different threads for that). But it has the advantage of being simpler and lighter (not every system is a multi-core multi-GHz monster), and you don't need to worry about locking, synchronisation, atomic accesses, races, contention, or anything like that. I am looking forward to coroutines for my sub-GHz single core embedded systems. |
Juha Nieminen <nospam@thanks.invalid>: Aug 05 12:20PM > One way to think about coroutines is like threads, but cooperatively > multitasking instead of preemptive multitasking. From another forum I got the impression that C++20 coroutines are stackless. Meaning that they don't have their own stack that's separate from the one used by main(). Which means that you can't have a chain of function calls in a coroutine, have one of those functions "yield", and then later return back to that point and resume execution as normal. (Basically, and if I understand correctly, only the coroutine function itself can "yield", not any of the functions it might call in the normal way.) This would make coroutines rather different from threads. Actual threads have each their own stack that they use for function call parameters etc, and which is independent from other threads. In other words, it sounds to me like coroutines are, essentially, lambda functions, or stateful functor objects, with a jump at the beginning of the function to the position in the code that last yielded. |
Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: Aug 05 01:35PM +0100 On Mon, 5 Aug 2019 08:42:48 -0000 (UTC) > Is there a simple example of a situation where this is significantly > beneficial compared to a lambda or an explicitly written functor > (ie. a class with member variables and an operator())? This is essentially correct although in speaking of lambda functions with closures you are moving from the concept ("what is a coroutine?") to one possible implementation. Asymmetric coroutines (ones which can only suspend to the caller) are indeed like ordinary functions (not necessarily lambda functions) except that they can emit an enhanced return statement at some point in the function's execution and, by some arrangement to be implemented, return to that point later. Symmetric coroutines don't "return" in quite this way - instead they yield to some other coroutine, which need not be the same as the one that last resumed them. One possible but fragile implementation of a coroutine is a stateful functor built by hand along the lines you describe. For example, for an asymmetric coroutine you could construct a functor whose operator() has three return statements as possible exit points. The functor object could have an enum as member data which describes what point the function has at present reached. When the functor is to "yield" it sets the enum to the correct point, saves its local variables to the functor's data variables and then returns. When invoked again ("resumed") it first examines the enum and jumps to the correct part of the operator() function where it is to resume, and if necessary re-establishes its local state. Note also that C has setjmp() and longjmp() as primitives which are similar to delimited continuations (but not recommended for use in C++ because longjmp gives undefined behaviour if non-trivial objects are in scope). In addition to symmetric and asymmetric coroutines, you can also have stackful and stackless coroutines.
"Stackless" is a bit misleading: it means that the coroutine does not need a stack while it is suspended (clearly it needs one when resumed), allowing the stack to be reused in the meantime. |
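A sketch of the fragile hand-built asymmetric coroutine Chris describes: a functor whose operator() has several return statements as yield points, with an enum member recording where to resume. All names and the yielded values are illustrative, not from any library:

```cpp
#include <optional>

// A "coroutine" built by hand: each call resumes where the last one
// yielded, with "local" state saved in the object between calls.
class HandMadeCoroutine {
public:
    std::optional<int> operator()() {
        switch (state_) {
        case State::Start:
            x_ = 1;                        // establish local state
            state_ = State::AfterFirstYield;
            return x_;                     // first yield point
        case State::AfterFirstYield:
            x_ *= 10;                      // resume with state intact
            state_ = State::AfterSecondYield;
            return x_;                     // second yield point
        case State::AfterSecondYield:
            x_ += 5;
            state_ = State::Done;
            return x_;                     // third and last yield point
        case State::Done:
            break;
        }
        return std::nullopt;               // "returned": cannot resume
    }

private:
    enum class State { Start, AfterFirstYield, AfterSecondYield, Done };
    State state_ = State::Start;
    int x_ = 0;                            // saved "local variable"
};
```

The fragility is plain to see: every local variable and every resume point must be spelled out by hand, which is precisely the bookkeeping a language-level coroutine automates.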
"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Aug 05 04:59PM +0200 On 05.08.2019 10:50, Juha Nieminen wrote: > themselves may then call other functions and so on), this chain of calls > will use its own stack that's separate and independent from the stack > that's used by main() and everything it calls? Conceptually, yes, and with a completely general implementation such as Windows fibers or the old Boost coroutines, or e.g. Modula-2 coroutines, yes. However, C++ coroutines will be so-called "stackless" coroutines as an optimization and restriction. According to ¹cppreference, "Coroutines are stackless: they suspend execution by returning to the caller". As I read it that means all coroutines using the C++20 syntax. "Stackless" is possible by restricting the place that a coroutine transfer can be specified to the coroutine body itself, with no transfer in any code that it calls. With this restriction one ensures that the coroutine's stack has a /very/ small maximum size when a transfer happens. And its own parameters and expression evaluation is then all that it needs its own little stack for, because when a call out of the routine can't cause a transfer, then such a call will return with the stack that's used for the call reverted back to the state at the call. Which means that calls out of the coroutine can be done on a stack that's common to all such "stackless" coroutines. The restriction also plays another rôle: with C++20 coroutines a function is designated as a coroutine, i.e. that's the way that you specify that it is a coroutine, by containing one of the keywords that specify a transfer, namely any of `co_await`, `co_yield` or `co_return`. > term was to "exit" this "pseudo-thread"), to later be returned to that > point of execution, all that stack is still there and this chain of > function calls will continue as before? Yes, with general coroutines. But not with C++20 coroutines, due to the optimization & restriction.
> If that's the case, then your explanation gave me completely new insight > into coroutines that I didn't know nor understand before. Thanks. Cheers!, - Alf Links: ¹ https://en.cppreference.com/w/cpp/language/coroutines |
Jorgen Grahn <grahn+nntp@snipabacken.se>: Aug 05 12:05PM On Sun, 2019-08-04, Paavo Helde wrote: > On 4.08.2019 5:53, M Powell wrote: >> For starters I apologize if I'm in the wrong forum. A colleague is >> using asio IO You later write about TCP so I assume that's what it really is: TCP I/O implemented using some part of the huge and varied Asio library. >> to transfer messages between two applications. The >> message size is 300 bytes and one of the apps is leveraging a Replace 'is leveraging' with 'uses' and you'll annoy fewer people. >> before incrementing the counter and sending the message >> The process repeats periodically at the interval specified. app1 >> has metrics that tracks transfer rate and dropped messages. Dropped messages over TCP? > especially the variations in its performance. And the 10 kHz number is > not something special, similar problems are there always when you > require something to happen during some fixed time. I'm pretty sure it would be both. TCP is a stream-oriented protocol with no real-time properties; if the stack sees you do tiny writes to the stream at a high rate it might (and /should/) start lumping them together to avoid wasting computer and network resources. Sounds to me like the protocol is misdesigned, if it expects heartbeats to work over TCP in a timely fashion, even at 10 or 100 Hz. > For hard guarantees one should use some real-time OS instead. In > Windows/Linux one must accept the possibility of occasional slowdowns, > and code accordingly. [snip more stuff I agree with] /Jorgen -- // Jorgen Grahn <grahn@ Oo o. . . \X/ snipabacken.se> O o . |
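If small, timely writes really are required over TCP, the usual first mitigation for the coalescing Jorgen describes is to disable Nagle's algorithm with the `TCP_NODELAY` socket option. A POSIX-sockets sketch; note this reduces batching but still gives none of the real-time guarantees he says TCP lacks:

```cpp
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

// Turn off Nagle's algorithm on a TCP socket, so that small writes
// go out immediately instead of being coalesced by the stack.
bool disable_nagle(int fd) {
    int flag = 1;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY,
                      &flag, sizeof flag) == 0;
}
```

Asio exposes the same option as `boost::asio::ip::tcp::no_delay`, so the colleague's code would not need raw sockets to apply it.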
"Öö Tiib" <ootiib@hot.ee>: Aug 05 02:56AM -0700 On Friday, 2 August 2019 21:46:30 UTC+3, David Brown wrote: > bool foo2(int x, int y, int z) { > return (x + z) > (y + z); > } The proposal basically is that "-fwrapv" is the default. The std::numeric_limits<int>::is_modulo must be true by default. So one who wants the optimization needs to use some kind of: #pragma GCC optimize "-fno-wrapv" The proposal leaves those optimizations beyond scope to keep itself simple. > Turning broken code with arbitrary bad behaviour into > broken code with predictable bad behaviour is not particularly useful. That is where I have different experience. At least 60% of effort put into development seems to be about fixing defects, and part of that cost is caused by unreliability or instability of misbehavior that makes it harder to figure out what actually is wrong. Lots of debugging tools are basically turning bad behavior that does who knows what into reliable bad behavior that raises signals, throws exceptions, breaks or terminates. > compilers, and yet it also is so unlikely to be correct code that > compilers should warn about it whenever possible and require specific > settings to disable the warning? Isn't that a little inconsistent? Yes, the wrapping feature makes logical defects behave more predictably, and yes, I consider that good. Yes, the wrapping feature is sometimes useful also on its own. Yes, there are compiler intrinsic functions so I can live without the feature. Yes, I would still like warnings. Yes, I can live without warnings. Yes, a way to disable warnings can be good. Yes, a way to enable non-wrapping optimizations can be good. Yes, I can live without non-wrapping optimizations in 95% of code and do those manually in the rest. I am not sure how any of that is inconsistent. There just are priorities regarding what I favor more, and these priorities are likely a bit different for all people.
> #pragma STDC_OVERFLOW_TRAP
> #pragma STDC_OVERFLOW_UNDEFINED
> (or whatever variant is preferred)

That would indeed be even better, but what the proposal suggested was simpler to implement and to add to the standard; it leaves such pragmas possible but out of its scope.

> Again, what do you think does not work with -fwrapv?

I have used it rarely and experimentally. It did sometimes optimize int loops when it should not. Yes, loop optimization might give up to a 10% performance difference in extreme cases, but that again argues for requiring some "-fno-wrapv" to allow the compiler to do that optimization, not the other way around. As long as "-fwrapv" is defective while std::numeric_limits<int>::is_modulo is false, it is valid to weasel out of each such case by saying that it is not a bug.

> and efficiently - but there is no reason not to have
> -fsanitize=signed-integer-overflow for your PC-based unit tests and
> simulations.

I did not say that tests rely on undefined behavior, or that code relies on undefined behavior. I mean that most actual code (including unit tests) written by humans (and gods we can't hire) contains defects. It reduces the effort of finding and fixing these defects when they behave more uniformly. Using various debugging tools is a good idea that helps to reduce that effort too, but it is orthogonal to this and not in conflict with it.

> If there were enough benefit from the additional behaviour, that would
> be fair enough. But there isn't any benefit of significance - correct
> code remains correct after this change, and broken code remains broken.

Thanks, you have a point there. If people start to use that wrapping behavior a lot to achieve various effects, then diagnosing it will become more and more of a false positive for those people. I suspect that people will use it only in limited but important cases (like for self-diagnosing code or for cryptography). Another possible option would be to standardize compiler intrinsic functions for those cases.
That means the (sometimes surprising) optimizations will stay valid by default. I haven't seen people objecting much when they then need to mark or rewrite questionable places to suppress false-positive diagnostics about well-defined code. I likely miss some depth here, or am too naive about something else; it is hard to predict the future. |
aminer68@gmail.com: Aug 04 05:01PM -0700

Hello,

I think that the future looks much brighter for parallel programming, because even though race condition detection is an NP-hard problem, there exist "scalable" race detectors that rapidly become more powerful with the growth in performance of scalable algorithms on parallel computers. Look for example here to notice it:

Scalable race detection for Android applications
https://dl.acm.org/citation.cfm?id=2814303

I think deadlocks are much easier to detect.

Thank you,
Amine Moulay Ramdane. |
You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page. To unsubscribe from this group and stop receiving emails from it, send an email to comp.lang.c+++unsubscribe@googlegroups.com. |