- Simulating piping one program into another, and into another - 7 Updates
- Why can't I understand what coroutines are? - 15 Updates
- Periodic message transfers using asio - 1 Update
- In the end, rason will come - 1 Update
- I think that the future looks much more bright for parallel programming - 1 Update
Frederick Gotham <cauldwell.thomas@gmail.com>: Aug 05 03:22AM -0700

So let's say I have three programs. Normally I would run these three programs at the command line as follows:

    prog1 | prog2 | prog3

So let's say I take the source code for these 3 programs and try to combine them into one. So I rename the three 'main' functions and then make a new 'main'. I start off with code like this:

    int main_prog1();
    int main_prog2();
    int main_prog3();

    int main()
    {
        int const retval1 = main_prog1();
        int const retval2 = main_prog2();
        int const retval3 = main_prog3();
        return retval1 & retval2 & retval3;
    }

An alternative method would be to start two more threads so that each line would be processed "on the fly", but for now I'm going to work with one thread, so 'main_prog1' will finish completely before 'main_prog2' begins.

How would you go about doing this? Here's what I'm thinking so far. . .

Go through the code for prog1 and replace all occurrences of "cout" with "cout1". Do the same with prog2 (i.e. cout2). Same goes for standard input (cin2, cin3). Next create a header file with something like:

    #include <sstream>
    extern std::stringstream cout1, cout2;
    static std::stringstream &cin2 = cout1;
    static std::stringstream &cin3 = cout2;

So the first program will write to cout1, and then the second program will read from cin2. So the previous code snippet becomes something like:

    std::stringstream cout1, cout2;
    std::stringstream &cin2 = cout1;
    std::stringstream &cin3 = cout2;

    int main_prog1();
    int main_prog2();
    int main_prog3();

    int main()
    {
        int const retval1 = main_prog1();
        cout1.seekg(0, std::ios::beg);
        int const retval2 = main_prog2();
        cout2.seekg(0, std::ios::beg);
        int const retval3 = main_prog3();
        return retval1 & retval2 & retval3;
    }

Have any of you ever done this before? What do you think of my idea? What way would you do it?

Frederick |
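A related approach, not taken from the thread, avoids renaming `cout` in the source at all: temporarily swap the stream buffers of `std::cin`/`std::cout` to point at stringstreams between stages. A minimal sketch; the stage bodies `main_prog1`/`main_prog2` are made-up stand-ins, not anyone's real programs:

```cpp
#include <cctype>
#include <iostream>
#include <sstream>
#include <string>

// Made-up stand-ins for two of the renamed main() functions.
int main_prog1() { std::cout << "hello\nworld\n"; return 0; }
int main_prog2() {
    // Uppercase each line read from standard input.
    std::string line;
    while (std::getline(std::cin, line)) {
        for (char &c : line) c = std::toupper(static_cast<unsigned char>(c));
        std::cout << line << '\n';
    }
    return 0;
}

// Run one stage with std::cin/std::cout temporarily redirected to
// in-memory streams, restoring the real buffers afterwards.
int run_stage(int (*stage)(), std::istream &in, std::ostream &out) {
    std::streambuf *old_in  = std::cin.rdbuf(in.rdbuf());
    std::streambuf *old_out = std::cout.rdbuf(out.rdbuf());
    int rc = stage();
    std::cin.rdbuf(old_in);
    std::cout.rdbuf(old_out);
    return rc;
}

// Simulate "prog1 | prog2" in one process, one thread.
std::string run_pipeline() {
    std::stringstream buf1, buf2;
    run_stage(&main_prog1, buf2 /* input unused */, buf1);  // prog1 writes buf1
    run_stage(&main_prog2, buf1, buf2);                     // prog2: buf1 -> buf2
    return buf2.str();
}
```

With this shape, adding the threaded "on the fly" variant later only means replacing the stringstreams with a synchronised queue; the stage code itself stays untouched.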
Jorgen Grahn <grahn+nntp@snipabacken.se>: Aug 05 11:48AM On Mon, 2019-08-05, Frederick Gotham wrote: > So let's say I have three programs. Normally I would run these three > programs at the commandline as follows: > prog1 | prog2 | prog3 That's a core idea in the Unix world, yes. > So let's say I take the source code for these 3 programs and try to > combine them into one. But why? If you have a problem that can be solved with a Unix pipeline, count yourself lucky. There's no drop-in replacement in C++ or elsewhere, except possibly in functional languages like Haskell or Erlang. /Jorgen -- // Jorgen Grahn <grahn@ Oo o. . . \X/ snipabacken.se> O o . |
"Öö Tiib" <ootiib@hot.ee>: Aug 05 05:48AM -0700 On Monday, 5 August 2019 13:23:01 UTC+3, Frederick Gotham wrote: > So let's say I have three programs. Normally I would run these three programs at the commandline as follows: > prog1 | prog2 | prog3 More or less; there are typically some command line arguments. > So let's say I take the source code for these 3 programs and try to combine them into one. Before doing it you should think why. Are pipes inefficient for your use case? There is Boost.Interprocess with plenty of tools for more efficient inter-process communication. Do you hope for optimizations in interfaces between modules? The streams won't allow much anyway. Do such modules share a lot of code? Use shared objects or DLLs. > Have any of you ever done this before? What do you think of my idea? What way would you do it? I like to keep modules small if possible. I have gone in the other direction: split a single large code base into several. A frequent example: kicking filters/converters of old, rarely used file formats, versions or functionality out into separate, rarely used processes. It lets the main processing module use a single input format/version and a single output format/version, and that can simplify it a lot. It can cause some performance hit for rarely used functionality, but the more frequently needed modules load and execute quicker and take fewer resources. I have sometimes replaced pipes with RPC so I can do more than pipes allow and can spread the modules across different hosts more easily. The only thing needed for such decisions is to collect statistics about frequency and performance of feature usage. That can be tricky with on-premise or embedded software (which C++ is often about). |
James Kuyper <jameskuyper@alumni.caltech.edu>: Aug 05 09:58AM -0400 On 8/5/19 6:22 AM, Frederick Gotham wrote: > } > An alternative method would be to start two more threads so that each line would be processed "on the fly", but for now I'm going to work with one thread, so 'main_prog1' will finish completely before 'main_prog2' begins. > How would you go about doing this? Offhand, I would do "prog1 | prog2 | prog3" - it's a lot simpler and in many contexts can be more efficient. Why do you want to take a different approach? |
Szyk Cech <szykcech@spoko.pl>: Aug 05 05:28PM +0200 On 05.08.2019 15:58, James Kuyper wrote: > Offhand, I would do "prog1 | prog2 | prog3" - it's a lot simpler and in > many contexts can be more efficient. Why do you want to take a different > approach? Try debugging prog1, prog2 and prog3 simultaneously... |
Paavo Helde <myfirstname@osa.pri.ee>: Aug 05 07:41PM +0300 On 5.08.2019 18:28, Szyk Cech wrote: >> many contexts can be more efficient. Why do you want to take a different >> approach? > Try debug prog1, prog2 and prog3 simultaneously... Why would I want to do that? One of the most important benefits of modular design like in "prog1 | prog2 | prog3" is better localization of problems, so the system can be debugged one component at a time, making the task *much* easier. |
James Kuyper <jameskuyper@alumni.caltech.edu>: Aug 05 09:57AM -0700 If the three programs interacted with each other, directly or indirectly, by any method other than the pipeline, the OP's suggestion wouldn't work. If they interact only through the pipeline, there's no need to debug them simultaneously. Debug the first program while dumping its output to a file; debug the second program while reading from that file and dumping to a second file; debug the third program while reading from the second file. |
"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Aug 05 01:11AM +0200 On 04.08.2019 11:37, Juha Nieminen wrote: > that are completely mysterious even to me, I have *absolutely no idea* > what they are. I just don't get it. > I go to the Wikipedia page "Coroutine"... and it tells me nothing. [snip] I'll address the conceptual only. When I saw this posting earlier today I thought I'd whip up a concrete example, like I implemented coroutines in the mid 1990's. However, I discovered that I would then be late for a dinner, so I put it on hold. Coroutines are just cooperative multitasking: multitasking with a single thread of execution but multiple call stacks. When you CALL a coroutine you create a new instance, with its own stack. There is nothing like that in the standard library. When a coroutine TRANSFERs to another coroutine it's like a `longjmp`, except that `longjmp` was designed to jump up to an earlier point in a call chain, while a coroutine transfer jumps to a separate call chain. As I recall you're familiar with 16-bit Windows programming. In 16-bit Windows (Windows 3.x in the early 1990s) each program execution was a coroutine. When a program was launched, a stack area was allocated for it. That's a call of a coroutine. When the program called `GetMessage` or `Yield`, some other program instance would get a chance to continue to run (after waiting for /its/ call to `GetMessage` or `Yield` to return). That's a coroutine transfer. The 16-bit Windows program executions were /huge/, heavy coroutines. However, the main advantage of coroutines in ordinary programming is as a very light-weight multitasking solution. Since there's only one thread of execution there are fewer synchronization issues. In particular one doesn't have to worry about whether one coroutine sees the memory changes effected by some other coroutine. Cheers & hth., - Alf |
Sam <sam@email-scan.com>: Aug 04 09:32PM -0400 Chris Vine writes: > > instead of getting torn into shreds in order to conform to event loop or > > callback-based design patterns. > I disagree. And "hundreds of connections" is not good enough. You can disagree all you want. Historical facts prove otherwise. Since ancient days, the simplest tasks on Unixes were done as pipelined tasks, as multiple execution contexts. $ sort <file | uniq -c That dates back decades. From the earliest days, shells were explicitly designed to execute pipelined commands. Pipelined commands were their main feature. All of this is pretty much I/O bound. Multiple execution threads. Because of that, and battle-tested over decades, Unix was tuned to effortlessly implement tasks that use multiple processes working in parallel. Threads were just the next evolution, dropping most of what little was left in terms of per-execution context overhead. Linux inherited Unix's legacy, and orientation, towards low-overhead multiple execution threads. All the historical network server processes on Unix, then Linux, were multi-process based. Even though select() existed pretty much since 4.2 BSD days, it was virtually unheard of for a network server to be a monolithic process, multiplexing for its clients using select(). An incoming network connection starts a new process, just for that network connection. Didn't matter what network service it was. SMTP, finger, login, or what, it was all one process per connection. There was a reason for that. It didn't take an Einstein to figure out that with a monolithic server, once it picked up an I/O event it had to do what that event needs to do, and nothing else can happen until that's done. Even if another client blurted out a packet, too bad, so sad. It needs to wait until the current I/O event's processing is done. It made a lot more sense to simply have another available CPU, not being busy with anything, run with it.
And, goshdarndammit, if that socket had a different execution thread sucking from it, why, that spare CPU will know exactly what to do. So, monolithic processes were rare. They are a little bit more common today than before, but they are still a rarity. And, today, even network servers that kick off multiple processes per client can be found. So, sorry, but facts disagree: multiple execution contexts, either as standalone processes or lightweight intra-process threads, rule the roost in I/O bound contexts. Linux inherits Unix's legacy in this respect, and the differences between processes and threads, on Linux, are mostly cosmetic. clone() creates both of them. You just specify which parts of the parent process (virtual memory map, file descriptors, filesystem namespace etc…) are shared with the new execution context, and that's pretty much it. I can understand why coming from a MS-Windows background makes someone frown at execution threads, unless all they do is stay away from the operating system and work entirely on their own, spinning the CPU. The only reason stuff like IOCP exists on Windows is because Windows sucks at multi-threading and multi-processing. It always sucked, and will likely always suck. But that's not the case with the real world out there. Being able to have a clean, encapsulated, logical execution thread, always busy with the forward march of progress from the start pointing to the finish line, without being forced to contort into an event/dispatch based framework, results in smaller, leaner code without all that event-based/dispatching cruft. It doesn't matter which one of the file descriptors is now ready for I/O. It no longer matters. The OS kernel already figured it out, and there's no good reason for userspace to piss away more CPU cycles doing exactly the same thing. The OS kernel knows which I/O is done, and which thread is waiting on it, and it already knows whose ass to kick into gear.
As such, execution threads are perfect for I/O-based load. It is no wonder that Linux rules the roost in TOP500 space. It's a dirty little secret that all those supercomputers are just a bunch of machines with optimized networking tying hundreds of thousands of threads together. Yes, threads. Gee whiz, how can that possibly happen? |
Robert Wessel <robertwessel2@yahoo.com>: Aug 04 10:04PM -0500 >writing clean, orderly code that runs logically from start to finish, >instead of getting torn into shreds in order to conform to event loop or >callback-based design patterns. If you only have "hundreds" of connections, use any technique you like, scaling isn't an issue. |
"Chris M. Thomasson" <invalid_chris_thomasson_invalid@invalid.com>: Aug 04 08:10PM -0700 On 8/4/2019 2:36 PM, Sam wrote: > its crap multithreading. > A thread per connection scales perfectly fine, on Linux. Even with > hundreds of connections. Hundreds? Try tens of thousands... Back on WinNT 4.0, my server code could handle around 40,000 concurrent TCP connections, using a handful of threads. There were bursts of activity, where many connections were being rapidly created and destroyed. My stress testing client programs tried to swamp the server with various scenarios. This was a long time ago. |
Paavo Helde <myfirstname@osa.pri.ee>: Aug 05 09:36AM +0300 On 4.08.2019 22:30, Chris Vine wrote: > summary wrong (and if so, what's the point of completion ports in the > first place)? I find it hard to believe microsoft would have gone for > the one thread per connection option favoured by my respondent. IOCP is used in Windows to *reduce* the number of needed threads (at least in user space), which is important because in Windows the thread creation is a pretty heavyweight operation. So it's strange that IOCP is mentioned as an example of "Threads can work fairly well with IO", it's rather the opposite, at least on Windows. The Linux equivalent of IOCP is AIO, see e.g. "man aio_read". However, some googling suggests the Linux implementation is not yet optimal (I guess this means it is not yet built into the kernel to the same extent as in Windows). Also, the popular Boost.Asio library has "asynchronous" already in its name and uses async IO in the background as appropriate. It also has implementations of coroutines, so maybe one can study its documentation and examples in order to get familiar with coroutines. BTW, Boost.Asio does not create a "thread per connection". Vice versa, it expects the client program to set up a thread pool (with the number of threads based on the number of cores, typically) which is reused for all requests. |
Martijn van Buul <pino@dohd.org>: Aug 05 07:46AM * Juha Nieminen: > C++20 will introduce support for coroutines. To this day, for reasons > that are completely mysterious even to me, I have *absolutely no idea* > what they are. I just don't get it. I've been using Lua in the past, and I was heavily using coroutines at some point. I don't claim to be an expert on these matters, but I'll tell you why *I* used them. The key is in the following snippet: > remove some items from q > use the items > yield to produce ... except that the snippet threw away the child with the bath water, by explicitly yielding to a specific target in both directions. That doesn't add any real benefit indeed, but when done differently, coroutines offer two benefits: 1) They decouple the consumer and the producer. 2) You don't need to maintain state in an explicit container; you can just use local variables on the stack. Coroutines certainly aren't necessary here ("But I can do it with a function call to an interface method" is a straw man argument, as no one claimed otherwise - there are always multiple solutions to a single problem), but they made some solutions a lot more elegant. I would suggest reading chapter 9 of "Programming in Lua". I know it's not C++, but maybe it'll help as a primer. An older version is online at https://www.lua.org/pil/9.1.html -- Martijn van Buul - pino@dohd.org |
Juha Nieminen <nospam@thanks.invalid>: Aug 05 08:31AM > results back to the main program. A coroutine that yields waits until > it's told to resume by the main program; a coroutine that returns > simply goes away and cannot be told to resume. I'm still not sure how that's different from a regular function. Or, perhaps more precisely, a lambda function (because as far as I understand, a coroutine can have its own state, just like a lambda can have captured variables, making the lambda effectively a stateful functor.) Is the difference that a coroutine can "return" (so to speak) from anywhere inside itself, and the next time it gets "called" again it resumes from that point forward, rather than from the beginning (like a lambda would)? I'm still not exactly sure how that's so useful in practice (compared to stateful functors, like lambdas.) |
Juha Nieminen <nospam@thanks.invalid>: Aug 05 08:37AM > This problem was solved a long time ago. It is called "threads". I think that with "threads" you are implying pre-emptive multitasking. Doesn't that introduce lots of problems relating to mutual exclusion? Quite a significant portion of bugs out there are related to multithreading problems. |
Juha Nieminen <nospam@thanks.invalid>: Aug 05 08:42AM > a bunch of state from local variables), and then continue from that > point (still those dozen subroutine calls and loops deep, with state > intact), rather than at the beginning of the routine again. From all the text and discussion out there I'm getting the picture that coroutines are a bit like lambda functions (ie. essentially stateful functors), with the difference that the coroutine can be "called" again in such a manner that it continues from where it "returned" last time, rather than always from the beginning (which would happen with lambdas). Is that correct? Is there a simple example of a situation where this is significantly beneficial compared to a lambda or an explicitly written functor (ie. a class with member variables and an operator())? |
Juha Nieminen <nospam@thanks.invalid>: Aug 05 08:50AM > Coroutines are just cooperative multitasking, multitasking with a single > thread of execution but multiple call stacks. When you CALL a coroutine > you create a new instance, with its own stack. Does this mean that if a coroutine calls other functions normally (which themselves may then call other functions and so on), this chain of calls will use its own stack that's separate and independent from the stack that's used by main() and everything it calls? And if any of those functions along the chain "yields" (or whatever the term was to "exit" this "pseudo-thread"), to later be returned to that point of execution, all that stack is still there and this chain of function calls will continue as before? If that's the case, then your explanation gave me completely new insight into coroutines that I didn't know nor understand before. |
Martijn van Buul <pino@dohd.org>: Aug 05 09:51AM * Juha Nieminen: > from anywhere inside itself, and the next time it gets "called" > again it resumes from that point forward, rather than from the > beginning (like a lambda would)? Yup. It acts like a blocking call, in a way. > I'm still not exactly sure how that's so useful in practice > (compared to stateful functors, like lambdas.) Consider a producer/consumer implemented using threads. In this case 'publish' and 'consume' operate on some kind of queue, possibly with a depth of 1. If a producer publishes something to a full queue, it blocks. If the consumer tries to consume from an empty queue, it blocks.

    class ProducerClass
    {
        [...]
        void mainLoop()
        {
            while ( ... )
            {
                Item newItem;
                [... do work on newItem ...]
                publish(newItem);
            }
        }
    };

    class ConsumerClass
    {
        [...]
        void mainLoop()
        {
            while ( ... )
            {
                std::vector<Item> packet;
                packet.reserve(10);
                for (int i = 0; i < packet.capacity(); ++i)
                {
                    Item newItem = consume();
                    [... possibly do something on newItem ...]
                    packet.push_back(newItem);
                }
                [... do something on a packet of 10 items ...]
            }
        }
    };

In this case, having the consumer and producer execute in their own thread offers no performance benefit, but it might still be very beneficial to implement it this way, depending on details of either consumer or producer. As with all examples, the above example is a bit silly. In this case, with a 1:1 relationship between consumer and producer, such a threaded solution would offer no parallelisation, so using threads here only serves to simplify the implementation of the consumer or producer - performance-wise it's detrimental. In this case, coroutines would offer a compromise: the separation and implementation benefits of a threaded solution, without the overhead caused by blocking. Another possible use of coroutines would be iterators. Suppose I have

    struct Foo
    {
    };

    struct Bar
    {
        [...]
        std::list<Foo> foos;
    };

    struct Quux
    {
        [...]
        std::vector<Bar> bars;
    };

    struct Wibble
    {
        std::map< ..., Quux> quuxMap;
    };

Suppose I want to offer a way to iterate over all instances of "Foo" inside a "Wibble", without exposing any of the classes in between (because they're private to an implementation, for example). There are numerous options:

* Create a std::container<Foo> GetFoos(const Wibble &) method. Could be expensive (because Foo is expensive to copy), impossible (because Foo has a deleted copy constructor) - and is generally undesirable anyway.

* Create an ApplyOnFoos(const Wibble &, const std::function<void(const Foo&)> &callback) method (or something similar using templates, details schmetails) that iterates over all elements and calls the callback function for every instance of 'Foo'. Could work, but the receiving end might have to capture quite a bit to make this work.

* Create an iterator class along the lines of

    class outputFunctor
    {
        outputFunctor(Wibble &) {...}
        std::optional<std::reference_wrapper<Foo>> operator()() {...}
    };

(Or create an STL iterator, which is essentially the same.) Will definitely work, but the implementation is going to be awkward. The first two have the benefit that their implementation can use simple constructs (more to the point: a nested range-based for loop will do the trick). Without coroutines, implementing the functor will require storing iterators to the intermediate classes inside the functor. With coroutines, you can implement it using the same simple nested-range for loop.
I haven't touched C++'s coroutines yet (I'm waiting for C++20 to arrive) so I don't know whether the required boilerplate will offset any implementation benefits, but in Lua it would be something like

    function outputIterator(wibble)
        local _generator = coroutine.create(
            function()
                for _, quux in pairs(wibble.quuxMap) do
                    for _, bar in pairs(quux.bars) do
                        for _, foo in pairs(bar.foos) do
                            coroutine.yield(foo)
                        end
                    end
                end
            end)
        return function ()
            local _, nextFoo = coroutine.resume(_generator)
            return nextFoo
        end
    end

usage:

    local myWibble = [....]
    for foo in outputIterator(myWibble) do
        print("Yay, a foo!", tostring(foo))
    end

(Untested, and my Lua is a bit rusty) -- Martijn van Buul - pino@dohd.org |
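For contrast, here is roughly what the third option above (the hand-rolled functor) looks like in C++ without coroutines — the part called awkward, because one iterator per nesting level must be stored and kept mutually consistent. The type names mirror the post; the traversal logic is a sketch of my own, not anyone's real code:

```cpp
#include <list>
#include <map>
#include <vector>

// Names mirror the post; members trimmed to what the traversal needs.
struct Foo { int id; };
struct Bar { std::list<Foo> foos; };
struct Quux { std::vector<Bar> bars; };
struct Wibble { std::map<int, Quux> quuxMap; };

// The hand-rolled iterator functor: one stored iterator per nesting
// level, all of which must be advanced in lockstep by hand.
class FooIterator {
public:
    explicit FooIterator(Wibble &w) : w_(w), q_(w_.quuxMap.begin()) {
        if (q_ != w_.quuxMap.end()) {
            b_ = q_->second.bars.begin();
            if (b_ != q_->second.bars.end()) f_ = b_->foos.begin();
        }
        settle();
    }

    // Returns the next Foo, or nullptr when the traversal is done.
    Foo *next() {
        if (q_ == w_.quuxMap.end()) return nullptr;
        Foo *out = &*f_;
        ++f_;
        settle();
        return out;
    }

private:
    // Advance the iterator triple until f_ points at a real Foo,
    // or q_ hits the end of the outer map.
    void settle() {
        while (q_ != w_.quuxMap.end()) {
            while (b_ != q_->second.bars.end()) {
                if (f_ != b_->foos.end()) return;  // found one
                ++b_;
                if (b_ != q_->second.bars.end()) f_ = b_->foos.begin();
            }
            ++q_;
            if (q_ != w_.quuxMap.end()) {
                b_ = q_->second.bars.begin();
                if (b_ != q_->second.bars.end()) f_ = b_->foos.begin();
            }
        }
    }

    Wibble &w_;
    std::map<int, Quux>::iterator q_;
    std::vector<Bar>::iterator b_;
    std::list<Foo>::iterator f_;
};
```

The entire settle() dance is exactly the state that a coroutine version keeps for free in its suspended nested for loops.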
David Brown <david.brown@hesbynett.no>: Aug 05 12:06PM +0200 On 05/08/2019 10:37, Juha Nieminen wrote: > Doesn't that introduce lots of problems relating to mutual exclusion? > Quite a significant portion of bugs out there are related to > multithreading problems. Yes, it certainly does. One way to think about coroutines is like threads, but cooperatively multitasking instead of preemptive multitasking. You, the programmer, have full control over which coroutine is running at a time, and when it can be blocked. This has disadvantages, of course - it means more manual control, and it cannot take advantage of multiple cores (you need to run your coroutines from different threads for that). But it has the advantage of being simpler and lighter (not every system is a multi-core multi-GHz monster), and you don't need to worry about locking, synchronisation, atomic accesses, races, contention, or anything like that. I am looking forward to coroutines for my sub-GHz single core embedded systems. |
Juha Nieminen <nospam@thanks.invalid>: Aug 05 12:20PM > One way to think about coroutines is like threads, but cooperatively > multitasking instead of preemptive multitasking. From another forum I got the impression that C++20 coroutines are stackless. Meaning that they don't have their own stack that's separate from the one used by main(). Which means that you can't have a chain of function calls in a coroutine, have one of those functions "yield", and then later return back to that point and resume execution as normal. (Basically, and if I understand correctly, only the coroutine function itself can "yield", not any of the functions it might call in the normal way.) This would make coroutines rather different from threads. Actual threads have each their own stack that they use for function call parameters etc, and which is independent from other threads. In other words, it sounds to me like coroutines are, essentially, lambda functions, or stateful functor objects, with a jump at the beginning of the function to the position in the code that last yielded. |
Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: Aug 05 01:35PM +0100 On Mon, 5 Aug 2019 08:42:48 -0000 (UTC) > Is there a simple example of a situation where this is significantly > beneficial compared to a lambda or an explicitly written functor > (ie. a class with member variables and an operator())? This is essentially correct although in speaking of lambda functions with closures you are moving from the concept ("what is a coroutine?") to one possible implementation. Asymmetric coroutines (ones which can only suspend to the caller) are indeed like ordinary functions (not necessarily lambda functions) except that they can emit an enhanced return statement at some point in the function's execution and, by some arrangement to be implemented, return to that point later. Symmetric coroutines don't "return" in quite this way - instead they yield to some other coroutine, which need not be the same as the one that last resumed them. One possible but fragile implementation of a coroutine is a stateful functor built by hand along the lines you describe. For example, for an asymmetric coroutine you could construct a functor whose operator() has three return statements as possible exit points. The functor object could have an enum as member data which describes what point the function has at present reached. When the functor is to "yield" it sets the enum to the correct point, saves its local variables to the functor's data variables and then returns. When invoked again ("resumed") it first examines the enum and jumps to the correct part of the operator() function where it is to resume, and if necessary re-establishes its local state. Note also that C has setjmp() and longjmp() as primitives which are similar to delimited continuations (but not recommended for use in C++ because longjmp gives undefined behaviour if non-trivial objects are in scope). In addition to symmetric and asymmetric coroutines, you can also have stackful and stackless coroutines.
"Stackless" is a bit misleading: it means that the coroutine does not need a stack while it is suspended (clearly it needs one when resumed), allowing the stack to be reused in the meantime. |
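A sketch of the fragile hand-built asymmetric coroutine Chris describes: a functor whose operator() has several return statements as yield points, with an enum member recording where to resume. All names and the yielded values are illustrative, not from any library:

```cpp
#include <optional>

// A "coroutine" built by hand: each call resumes where the last one
// yielded, with "local" state saved in the object between calls.
class HandMadeCoroutine {
public:
    std::optional<int> operator()() {
        switch (state_) {
        case State::Start:
            x_ = 1;                        // establish local state
            state_ = State::AfterFirstYield;
            return x_;                     // first yield point
        case State::AfterFirstYield:
            x_ *= 10;                      // resume with state intact
            state_ = State::AfterSecondYield;
            return x_;                     // second yield point
        case State::AfterSecondYield:
            x_ += 5;
            state_ = State::Done;
            return x_;                     // third and last yield point
        case State::Done:
            break;
        }
        return std::nullopt;               // "returned": cannot resume
    }

private:
    enum class State { Start, AfterFirstYield, AfterSecondYield, Done };
    State state_ = State::Start;
    int x_ = 0;                            // saved "local variable"
};
```

The fragility is plain to see: every local variable and every resume point must be spelled out by hand, which is precisely the bookkeeping a language-level coroutine automates.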
"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Aug 05 04:59PM +0200 On 05.08.2019 10:50, Juha Nieminen wrote: > themselves may then call other functions and so on), this chain of calls > will use its own stack that's separate and independent from the stack > that's used by main() and everything it calls? Conceptually, yes, and with a completely general implementation such as Windows fibers or the old Boost coroutines, or e.g. Modula-2 coroutines, yes. However, C++ coroutines will be so-called "stackless" coroutines as an optimization and restriction. According to ¹cppreference, "Coroutines are stackless: they suspend execution by returning to the caller". As I read it that means all coroutines using the C++20 syntax. "Stackless" is possible by restricting the place that a coroutine transfer can be specified to the coroutine body itself, with no transfer in any code that it calls. With this restriction one ensures that the coroutine's stack has a /very/ small maximum size when a transfer happens. And its own parameters and expression evaluation is then all that it needs its own little stack for, because when a call out of the routine can't cause a transfer, then such a call will return with the stack that's used for the call reverted back to the state at the call. Which means that calls out of the coroutine can be done on a stack that's common to all such "stackless" coroutines. The restriction also plays another rôle: with C++20 coroutines a function is designated as a coroutine, i.e. that's the way that you specify that it is a coroutine, by containing one of the keywords that specify a transfer, namely any of `co_await`, `co_yield` or `co_return`. > term was to "exit" this "pseudo-thread"), to later be returned to that > point of execution, all that stack is still there and this chain of > function calls will continue as before? Yes, with general coroutines. But not with C++20 coroutines, due to the optimization & restriction.
> If that's the case, then your explanation gave me completely new insight > into coroutines that I didn't know nor understand before. Thanks. Cheers!, - Alf Links: ¹ https://en.cppreference.com/w/cpp/language/coroutines |
Jorgen Grahn <grahn+nntp@snipabacken.se>: Aug 05 12:05PM On Sun, 2019-08-04, Paavo Helde wrote: > On 4.08.2019 5:53, M Powell wrote: >> For starters I apologize if I'm in the wrong forum. A colleague is >> using asio IO You later write about TCP so I assume that's what it really is: TCP I/O implemented using some part of the huge and varied Asio library. >> to transfer messages between two applications. The >> message size is 300 bytes and one of the apps is leveraging a Replace 'is leveraging' with 'uses' and you'll annoy fewer people. >> before incrementing the counter and sending the message >> The process repeats periodically at the interval specified. app1 >> has metrics that tracks transfer rate and dropped messages. Dropped messages over TCP? > especially the variations in its performance. And the 10 kHz number is > not something special, similar problems are there always when you > require something to happen during some fixed time. I'm pretty sure it would be both. TCP is a stream-oriented protocol with no real-time properties; if the stack sees you do tiny writes to the stream at a high rate it might (and /should/) start lumping them together to avoid wasting computer and network resources. Sounds to me like the protocol is misdesigned, if it expects heartbeats to work over TCP in a timely fashion, even at 10 or 100 Hz. > For hard guarantees one should use some real-time OS instead. In > Windows/Linux one must accept the possibility of occasional slowdowns, > and code accordingly. [snip more stuff I agree with] /Jorgen -- // Jorgen Grahn <grahn@ Oo o. . . \X/ snipabacken.se> O o . |
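If small, timely writes really are required over TCP, the usual first mitigation for the coalescing Jorgen describes is to disable Nagle's algorithm with the `TCP_NODELAY` socket option. A POSIX-sockets sketch; note this reduces batching but still gives none of the real-time guarantees he says TCP lacks:

```cpp
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

// Turn off Nagle's algorithm on a TCP socket, so that small writes
// go out immediately instead of being coalesced by the stack.
bool disable_nagle(int fd) {
    int flag = 1;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY,
                      &flag, sizeof flag) == 0;
}
```

Asio exposes the same option as `boost::asio::ip::tcp::no_delay`, so the colleague's code would not need raw sockets to apply it.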
"Öö Tiib" <ootiib@hot.ee>: Aug 05 02:56AM -0700 On Friday, 2 August 2019 21:46:30 UTC+3, David Brown wrote: > bool foo2(int x, int y, int z) { > return (x + z) > (y + z); > } The proposal basically is that "-fwrapv" is the default. The std::numeric_limits<int>::is_modulo must be true by default. So one who wants the optimization needs to use some kind of: #pragma GCC optimize "-fno-wrapv" The proposal leaves those optimizations beyond scope to keep itself simple. > Turning broken code with arbitrary bad behaviour into > broken code with predictable bad behaviour is not particularly useful. That is where I have different experience. At least 60% of effort put into development seems to be about fixing defects, and part of that cost is caused by unreliability or instability of misbehavior that makes it harder to figure out what actually is wrong. Lots of debugging tools are basically turning bad behavior that does who knows what into reliable bad behavior that raises signals, throws exceptions, breaks or terminates. > compilers, and yet it also is so unlikely to be correct code that > compilers should warn about it whenever possible and require specific > settings to disable the warning? Isn't that a little inconsistent? Yes, the wrapping feature makes logical defects behave more predictably, and yes, I consider that good. Yes, the wrapping feature is sometimes useful also on its own. Yes, there are compiler intrinsic functions so I can live without the feature. Yes, I would still like warnings. Yes, I can live without warnings. Yes, a way to disable warnings can be good. Yes, a way to enable non-wrapping optimizations can be good. Yes, I can live without non-wrapping optimizations in 95% of code and do those manually in the rest. I am not sure how any of that is inconsistent. There just are priorities regarding what I favor more, and these priorities are likely a bit different for all people.
> #pragma STDC_OVERFLOW_TRAP
> #pragma STDC_OVERFLOW_UNDEFINED
> (or whatever variant is preferred)

That would indeed be even better, but what the proposal suggested was simpler to implement and to add to the standard; it leaves such pragmas possible but out of its scope.

> Again, what do you think does not work with -fwrapv?

I have used it rarely and experimentally. It did sometimes optimize int loops when it should not. Yes, loop optimization might give up to a 10% performance difference in extreme cases, but that again argues for requiring some "-fno-wrapv" to allow the compiler to do that optimization, not the other way around. As long as "-fwrapv" is defective while std::numeric_limits<int>::is_modulo is false, it is valid to weasel out of each such case by saying that it is not a bug.

> and efficiently - but there is no reason not to have
> -fsanitize=signed-integer-overflow for your PC-based unit tests and
> simulations.

I did not say that tests rely on undefined behavior, or that code relies on undefined behavior. I mean that most actual code (including unit tests) written by humans (and gods we can't hire) contains defects. It reduces the effort of finding and fixing these defects when they behave more uniformly. Using various debugging tools is a good idea that helps to reduce that effort too, but it is orthogonal to this and not in conflict with it.

> If there were enough benefit from the additional behaviour, that would
> be fair enough. But there isn't any benefit of significance - correct
> code remains correct after this change, and broken code remains broken.

Thanks, you have a point there. If people start to use that wrapping behavior a lot to achieve various effects, then diagnosing it will become more and more of a false positive for those people. I suspect that people will use it only in limited but important cases (like for self-diagnosing code or for cryptography). Another possible option would be to standardize compiler intrinsic functions for those cases.
That means the (sometimes surprising) optimizations will stay valid by default. I haven't seen people objecting much when they then need to mark or rewrite questionable places to suppress false-positive diagnostics about well-defined code. I likely miss some depth here, or am too naive about something else; it is hard to predict the future. |
aminer68@gmail.com: Aug 04 05:01PM -0700

Hello,

I think that the future looks much brighter for parallel programming, because even though race condition detection is an NP-hard problem, there exist "scalable" race detectors that rapidly become more powerful with the growth in performance of scalable algorithms on parallel computers. Look for example here to notice it:

Scalable race detection for Android applications
https://dl.acm.org/citation.cfm?id=2814303

I think deadlocks are much easier to detect.

Thank you,
Amine Moulay Ramdane. |
You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page. To unsubscribe from this group and stop receiving emails from it, send an email to comp.lang.c+++unsubscribe@googlegroups.com. |