Wednesday, July 18, 2018

Digest for comp.lang.c++@googlegroups.com - 25 updates in 8 topics

Tim Rentsch <txr@alumni.caltech.edu>: Jul 18 03:42AM -0700


> is there something such as a mark and compact garbage collector?
> if the simple implementation is difficult perhaps with the
> combined used of smart pointers?
 
You have asked a question about a very big topic.
 
Automatic memory management (often generically called "garbage
collection", or GC) has a wide range of approaches and
implementation strategies. None of these questions have
simple answers. (Before going further, disclaimer: I know
very little about C++'s smart pointers.)
 
Reference counting (which I believe is the approach C++ smart
pointers take) has several advantageous properties: it can be
implemented locally; it tends to be "smooth" in that memory is
deallocated in small slices rather than a bunch at once; freeing
happens more or less immediately, and deterministically; and it
is easy to understand. Reference counting also has several
disadvantageous properties: it will not reclaim cyclic structures
unless they explicitly have the cycles removed; it consumes more
resources (as a general rule) than more global schemes, in both
space and time; the invariants maintained must be kept exactly
right, which makes it reliant on destructors being run, which may
lead to hard-to-understand behavior in the presence of exception
processing. Reference counting works well for certain classes
of applications.
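
A minimal sketch of the cycle problem, phrased in terms of
std::shared_ptr (assuming that is the reference-counting mechanism in
question):

    #include <memory>

    struct Node {
        std::shared_ptr<Node> next;   // owning link
    };

    int main()
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;   // a keeps b alive
        b->next = a;   // b keeps a alive: the counts can never reach zero
    }   // both nodes leak; their destructors never run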
 
More global schemes, such as mark/sweep, compacting collectors,
or generation scavenging, centralize the memory management
function, and usually are what people mean when they say "GC".
Some techniques require more control over the language than is
available in C++. Some additional important comments:
 
Having GC available doesn't completely eliminate the need to do
manual or explicit memory management. It does reduce it by a
very large factor, but sometimes it's important to nil out a
pointer, or take some other step.
 
Early GC schemes were what is called "stop the world" collectors,
where all other processing stops until the collector is done.
That is no longer true in modern GC implementations. In fact,
for some time now there have been GC algorithms where the collector
runs in a separate thread.
 
Conservative collectors sound horrifying, but in practice they
work effectively enough so the difference can be ignored in most
applications, especially on machines with 64-bit pointers. Some
measurements were done for the Boehm collector, which should be
easy to find if someone is interested to look.
 
GC has a reputation (at least with some people) as being slower
than manual memory management. This reputation is not deserved.
Bjarne wrote about this some years ago, and pointed out that it
is notoriously difficult to compare the time costs of the two
approaches. Certainly in some cases programs get faster after
switching to a GC-based scheme (it was the Boehm collector
in particular in one case I'm familiar with).
 
Probably the biggest downside of GC-style management is loss of
control over when (or sometimes even whether) finalizers are
run. This is a special case of the earlier comment about GC
not eliminating the need to manage some resources manually in
some cases.
 
Probably the biggest upside of GC-style memory management is a
big increase in productivity, which has been measured as a
factor somewhere between 1.5 and 2. I know of no studies that
disagree with these findings.
 
I think it is commonly true that various developers are either
pro-GC or anti-GC. I try to be more agnostic about it: there
are some cases where having GC is an enormous boon, and other
cases where it is essential to maintain manual control. I don't
think either stance is right all the time. So I hope this has
given you a flavor of the various pluses and minuses of the
different possibilities.
Paavo Helde <myfirstname@osa.pri.ee>: Jul 18 05:23PM +0300

On 18.07.2018 13:42, Tim Rentsch wrote:
 
> Reference counting also has several
> disadvantageous properties: it will not reclaim cyclic structures
> unless they explicitly have the cycles removed;
 
Using std::weak_ptr might provide some mitigation here.
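
For instance, a back link is the natural place for the weak_ptr; a
minimal sketch:

    #include <memory>

    struct Node {
        std::shared_ptr<Node> next;   // owning link
        std::weak_ptr<Node>   prev;   // non-owning back link, breaks the cycle
    };

    int main()
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
        b->prev = a;                            // does not bump a's reference count
        if (std::shared_ptr<Node> p = b->prev.lock())
            p->next.reset();                    // safe to use *p while the lock holds it
    }                                           // both nodes are destroyed normally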
 
> it consumes more
> resources (as a general rule) than more global schemes, in both
> space and time;
 
Depends on the smart pointer. For example, std::unique_ptr consumes zero
resources in both space and time. Yes, std::shared_ptr is a bit
heavyweight because the reference-count update is required to be
multithread-safe. A single-thread-only smart pointer is much faster in
multithreaded programs, but its usage obviously requires more care.
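
A quick way to see the space side of that claim (the exact sizes are
implementation details, not guarantees):

    #include <iostream>
    #include <memory>

    int main()
    {
        std::cout << sizeof(int*) << '\n'                   // e.g. 8
                  << sizeof(std::unique_ptr<int>) << '\n'   // typically also 8
                  << sizeof(std::shared_ptr<int>) << '\n';  // typically 16: object pointer
                                                            // plus control-block pointer
    }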
 
> right, which makes it reliant on destructors being run, which may
> lead to hard-to-understand behavior in the presence of exception
> processing.
 
Not sure what you want to say here? That if the program is buggy it
might not work properly? Or that resources other than memory cannot be
released by GC? Yes, that's the main problem with GC.
 
[...]
> Probably the biggest upside of GC-style memory management is a
> big increase in productivity, which has been measured as a
> factor somewhere between 1.5 and 2.
 
Compared to what? To the proper C++ code using std::vector, std::string,
std::make_unique, std::make_shared, or to the C or C-style C++ code
calling malloc/free or new/delete manually?
 
I'm sure GC suits fine some programs.
woodbrian77@gmail.com: Jul 18 09:42AM -0700

On Wednesday, July 18, 2018 at 9:24:09 AM UTC-5, Paavo Helde wrote:
> > factor somewhere between 1.5 and 2.
 
> Compared to what? To the proper C++ code using std::vector, std::string,
> std::make_unique,
 
That was my question also.
 
> std::make_shared, or to the C or C-style C++ code
> calling malloc/free or new/delete manually?
 
Vector and string use new/delete manually.
 
 
Brian
Ebenezer Enterprises - Enjoying programming again.
https://github.com/Ebenezer-group/onwards
Ralf Fassel <ralfixx@gmx.de>: Jul 18 11:23AM +0200

* "Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>
| On 17.07.2018 15:17, Thiago Adams wrote:
| > int main()
| > {
| > X x;
| > F();
| > Y y;
| > }
 
| In this case, without a `catch` that catches the exception, you are
| not guaranteed that destructors are called. It's up to the
| implementation.
 
You mean, main() is special with regards to exceptions and DTORs? Isn't
the whole point of exception safety actually to guarantee that DTORs are
called when exceptions occur? Consider std::lock_guard and related...
 
IMHO in the above the DTOR of X should always get called regardless of
F() throwing or not.
 
R'
Bo Persson <bop@gmb.dk>: Jul 18 12:26PM +0200

On 2018-07-18 11:23, Ralf Fassel wrote:
 
> IMHO in the above the DTOR of X should always get called regardless of
> F() throwing or not.
 
> R'
 
No, main is not special (in this regard). The difference is having a
try-catch, or not.
 
Implementations are allowed to first go looking for the catch clause,
and only then run the destructors for the stack objects. Or it could
unwind one stack frame at a time while searching for the catch-clause.
 
It makes a difference if there is NO try-catch in the current line of
execution, so that the thrown exception leaves main. That could mean that
the program is just terminated without executing any destructors.
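
A small demonstration of the difference (a sketch; without the handler
the behaviour is implementation-defined, as described above):

    #include <cstdio>
    #include <stdexcept>

    struct X { ~X() { std::puts("~X"); } };

    void f() { throw std::runtime_error("boom"); }

    int main()
    {
        // With the try/catch, the destructor of x is guaranteed to have
        // run by the time the handler is entered.  Remove it and let the
        // exception leave main, and whether "~X" is printed at all is
        // implementation-defined: terminate() may be called without any
        // unwinding.
        try {
            X x;
            f();
        }
        catch (const std::exception&) {
            std::puts("caught");
        }
    }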
 
 
Bo Persson
Ben Bacarisse <ben.lists@bsb.me.uk>: Jul 18 12:35PM +0100

>> superfluous. It's the default for `main`, in both C and C++.
 
> It's unclear to me if "return 0;" or "return EXIT_SUCCESS;" is the default
> in C++ when no explicit return is specified in main().
 
When there is no explicit return, main behaves as if return 0; were
executed. This is the same in C and C++.
 
> (In the vast majority of systems EXIT_SUCCESS is 0, but I think theoretically
> it could be something else.)
 
It's true that EXIT_SUCCESS need not be equal to 0, but a return from
main with either value (or, indeed, a call to std::exit with either
value as the argument) will be seen as a successful termination.
 
Obviously neither standard can really say much about what that means,
but both C and C++ make it clear that they are equivalent as far as
signalling success or failure is concerned.
 
--
Ben.
jameskuyper@alumni.caltech.edu: Jul 18 04:57AM -0700

On Wednesday, July 18, 2018 at 4:00:27 AM UTC-4, Juha Nieminen wrote:
...
> It's unclear to me if "return 0;" or "return EXIT_SUCCESS;" is the default
> in C++ when no explicit return is specified in main().
 
"If control reaches the end of main without encountering a return statement,
the effect is that of executing
return 0;" (3.6.1p5)
Paavo Helde <myfirstname@osa.pri.ee>: Jul 18 03:04PM +0300

On 18.07.2018 11:00, Juha Nieminen wrote:
> in C++ when no explicit return is specified in main().
 
> (In the vast majority of systems EXIT_SUCCESS is 0, but I think theoretically
> it could be something else.)
 
In 18.5/8: "If status is zero or EXIT_SUCCESS, an implementation-defined
form of the status successful termination is returned."
 
So 0 and EXIT_SUCCESS appear to be at least interchangeable, even if not
exactly the same.
Thiago Adams <thiago.adams@gmail.com>: Jul 18 06:04AM -0700

On Wednesday, July 18, 2018 at 4:58:03 AM UTC-3, Juha Nieminen wrote:
> principle quite strictly here. (In this case it means that code that isn't
> using exceptions shouldn't suffer a speed penalty just because it has to
> be prepared for an exception to happen somewhere along the line.)
 
It must have some overhead in size because it has to build
the "landing pads" and the state-to-landing-pad tables.
It also has to update some state (let's say change one pointer,
I don't know) while the code is completing the ctors.
 
Maybe "zero overhead" is meant in comparison with a manual
solution based on error codes.
 
I don't know how heavy the computation during stack
unwinding is, but the document [1] says it uses
compression. The compression suggests that the state
tables can be large.
 
[1] https://itanium-cxx-abi.github.io/cxx-abi/exceptions.pdf
 
 
Something interesting is to compare the size of the current
exception mechanisms against this:
 
Zero-overhead deterministic exceptions: Throwing values
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0709r0.pdf
 
 
The "zero-overhead" is also something unrealistic in my option.
Error propagation has a cost.
 
The memory that holds the throw object also
is especial - not on the stack - and some
implementations may have to allocate the memory
dynamically (not sure) for this object.
 
Maybe this is the reason why some people says
that the computation after throw is not deterministic.
"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Jul 18 04:20PM +0200

>> Yes, that `catch` does make a difference. Note that the `return 0;` is
>> superfluous. It's the default for `main`, in both C and C++.
 
> Int is also the default return type for main()
 
You're posting to a C++ group only.
 
Implicit int was removed in the first C++ standard, in 1998, with the
following rationale:
 
C++98 Annex §C1.5 clause 7.1.5, "In C++, implicit int creates several
opportunities for ambiguity between expressions involving function-like
casts and declarations. Explicit declaration is increasingly considered
to be proper style. Liaison with WG14 (C) indicated support for (at
least) deprecating implicit int in the next revision of C."
 
 
> so lets bin that too, oh and
> untyped function parameters default to int
 
Nope. See above.
 
 
> compiler has to say about it.
 
> You do love to put brevity above clarity, were you a Perl coder in a past
> life?
 
This personal attack following a demonstration of incompetence sounds
like a "Boltar" that I've killfiled six identities of.
 
Plink.
alfswibblingagain@theshed.com: Jul 18 03:18PM

On Wed, 18 Jul 2018 16:20:51 +0200
>>> superfluous. It's the default for `main`, in both C and C++.
 
>> Int is also the default return type for main()
 
>You're posting to a C++ group only.
 
Except you mentioned C.
 
>Implicit int was removed in the first C++ standard, in 1998, with the
>following rationale:
 
See above.
 
>casts and declarations. Explicit declaration is increasingly considered
>to be proper style. Liaison with WG14 (C) indicated support for (at
>least) deprecating implicit int in the next revision of C."
 
Fascinating.
 
>> so lets bin that too, oh and
>> untyped function parameters default to int
 
>Nope. See above.
 
For C yes. See above.
 
>This personal attack following a demonstration of incompetence, sounds
>like a "Boltar" that I've killfiled six identities of.
 
>Plink.
 
Your killfile must be full of junk by now. You obviously still haven't figured
out that I just generate these id's at random with no limit. Perhaps educate
yourself as to how NNTP works.
Bo Persson <bop@gmb.dk>: Jul 18 05:23PM +0200

On 2018-07-18 15:04, Thiago Adams wrote:
>> be prepared for an exception to happen somewhere along the line.)
 
> It must have some overhead in size because it has to build
> the "landing pads" and tables state->landing pads.
 
This can be a disk size overhead only. Using virtual memory the tables
and code can be swapped in on demand.
 
And most of the possible exceptions will not be thrown during a normal
execution, so they don't really have to be loaded.
 
> Also it has to change some state (let's say change one pointer,
> I don't known) while the code is completing ctors.
 
The constructors would have to execute anyway, so how is this an overhead?
 
 
> Maybe zero overhead is compared with manual solution
> of error codes.
 
Tons of "if (result == error) goto end;" IS overhead. :-)
 
And is problematic in C++ if the goto jumps over the construction of
some objects.
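
A sketch of the two styles being compared, with made-up do_step /
do_step_or_throw helpers:

    #include <string>

    // Hypothetical helpers, just for the sketch.
    int  do_step()          { return 0; }   // returns nonzero on error
    void do_step_or_throw() { }             // throws on error instead

    int error_code_style()
    {
        std::string s = "working buffer";
        int err = do_step();
        if (err) return err;        // every call needs its own check,
        err = do_step();            // and the checks sit on the hot path
        if (err) return err;
        return 0;
        // The C idiom "if (err) goto cleanup;" is even worse here: a goto
        // that jumps over the initialization of s does not compile in C++.
    }

    int exception_style()
    {
        std::string s = "working buffer";
        do_step_or_throw();         // no per-call branch; the unwind tables
        do_step_or_throw();         // are consulted only if something throws
        return 0;
    }

    int main()
    {
        return error_code_style() + exception_style();
    }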
 
> stack unwinding, but the document [1] says it uses
> compression. The compression means that the states
> table can be large.
 
Pretty heavy, but doesn't happen that often.
 
 
 
Bo Persson
Bo Persson <bop@gmb.dk>: Jul 18 05:27PM +0200

On 2018-07-18 17:23, Bo Persson wrote:
 
>> Also it has to change some state (let's say change one pointer,
>> I don't known) while the code is completing ctors.
 
> The constructors would have to execute anyway, so how is this an overhead?
 
Oh, I was thinking destructors here. Never mind!
 
"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Jul 18 05:41PM +0200

On 18.07.2018 17:18, "Boltar" wrote:
 
>> Plink.
 
> Your killfile must be full
 
Eighth identity of "Boltar" killfiled.
Paavo Helde <myfirstname@osa.pri.ee>: Jul 18 06:58PM +0300

On 18.07.2018 16:04, Thiago Adams wrote:
 
> It must have some overhead in size because it has to build
> the "landing pads" and tables state->landing pads.
 
Yes, it trades some disk space for speed, but the size of executables
does not interest most people nowadays (except some nitwits in this group).
 
> Also it has to change some state (let's say change one pointer,
> I don't known) while the code is completing ctors.
 
In principle the state of the IP (instruction pointer) could be enough;
I don't know how it is actually done.
 
> stack unwinding, but the document [1] says it uses
> compression. The compression means that the states
> table can be large.
 
I believe they deal with relative jump offsets which are typically small
numbers, so they are compressing away the high-order zero bytes. Can
probably give something like 8x compression. Decompressing is a bit
tedious but as this happens on the exceptional code path only it does
not matter.
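
If I remember right, the encoding used there is the DWARF-style LEB128;
decoding the unsigned form is only a few lines (a sketch, not the actual
unwinder code):

    #include <cstdint>

    // Decode one unsigned LEB128 value starting at *p and advance p.
    // Each byte holds 7 payload bits; the top bit means "more bytes
    // follow", so small offsets take a single byte.
    std::uint64_t read_uleb128(const unsigned char*& p)
    {
        std::uint64_t result = 0;
        unsigned shift = 0;
        unsigned char byte;
        do {
            byte = *p++;
            result |= static_cast<std::uint64_t>(byte & 0x7fu) << shift;
            shift += 7;
        } while (byte & 0x80u);
        return result;
    }

    int main()
    {
        const unsigned char encoded[] = { 0xE5, 0x8E, 0x26 };   // 624485 encoded
        const unsigned char* p = encoded;
        return read_uleb128(p) == 624485 ? 0 : 1;
    }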
"Rick C. Hodgin" <rick.c.hodgin@gmail.com>: Jul 18 08:22AM -0400

On 7/17/2018 3:26 PM, Rick C. Hodgin wrote:
> Somebody posted on comp.lang.asm.x86 an algorithm for fizz buzz.
> It made me think of one here in C/C++.
 
An optimization occurred to me driving in to work this morning.
 
 
>     // Cycling bit patterns
>     uint32_t three_data = 0x924;
>     uint32_t five_data  = 0x84210;
 
Shift the five_data bit pattern left by one bit:
 
uint32_t five_data = 0x84210 << 1;
 
>         for (int lnI = 1; lnI <= 100; ++lnI)
>         {
>             funcs[(three_data & 1) + (2 * (five_data & 1))].func(lnI);
 
It removes the need for a multiply here:
 
funcs[(three_data & 1) + (five_data & 2)].func(lnI);
 
>             three_data = (three_data >> 1) | ((three_data & 1) << 11);
>             five_data  = (five_data  >> 1) | ((five_data  & 1) << 19);
 
Then shift over 20 instead of 19:
 
five_data = (five_data >> 1) | ((five_data & 1) << 20);
 
>     }
 
> The rules of the FizzBuzz game are here:
 
>     http://wiki.c2.com/?FizzBuzzTest
 
 
Just popped in my head while driving. If there are any others, please
post them. I like this solution. Seems simple and elegant.
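
For anyone who wants to run it, here is a consolidated sketch of the
multiply-free idea (a reconstruction, not the posted code verbatim).
One caveat: for the five-cycle to keep a period that is a multiple of
five, the pre-shifted bit pattern has to wrap within the same 20 bits,
so below the pattern is 0x08421 with the wrap shift left at 19, rather
than 0x84210 << 1 with a wrap shift of 20.

    #include <cstdint>
    #include <cstdio>

    void fizz(int)     { std::puts("fizz"); }
    void buzz(int)     { std::puts("buzz"); }
    void fizzbuzz(int) { std::puts("fizzbuzz"); }
    void other(int n)  { std::printf("%d\n", n); }

    int main()
    {
        void (*funcs[4])(int) = { other, fizz, buzz, fizzbuzz };

        // Bit set where the count is a multiple of 3 (tested at bit 0) or
        // of 5 (tested at bit 1, i.e. the five-pattern is stored
        // pre-shifted), so the index is (three & 1) + (five & 2) with no
        // multiply.
        std::uint32_t three_data = 0x924;     // 12-bit cycle, bits 2, 5, 8, 11
        std::uint32_t five_data  = 0x08421;   // 20-bit cycle, bits 0, 5, 10, 15

        for (int i = 1; i <= 100; ++i) {
            funcs[(three_data & 1) + (five_data & 2)](i);
            three_data = (three_data >> 1) | ((three_data & 1) << 11);  // rotate 12 bits
            five_data  = (five_data  >> 1) | ((five_data  & 1) << 19);  // rotate 20 bits
        }
    }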
 
--
Rick C. Hodgin
Ben Bacarisse <ben.lists@bsb.me.uk>: Jul 18 03:38PM +0100


>>     http://wiki.c2.com/?FizzBuzzTest
 
> Just popped in my head while driving. If there are any others, please
> post them. I like this solution. Seems simple and elegant.
 
It's always good to see a wide range of opinions. I find your
assessment very odd as the program contains redundant code, unnecessary
globals, overly mysterious constants, odd variable names, unused
variables, an unnecessary include and overly specific types. These are
details though, and I'll explain the biggest issue I have with it later
on.
 
Removing the unnecessary struct, making local data local, simplifying
the types and writing the bit patterns in their simplest and (to me)
most obvious form, we get:
 
#include <stdio.h>
 
void fizz(int num) { printf("fizz\n"); }
void buzz(int num) { printf("buzz\n"); }
void fizzbuzz(int num) { printf("fizzbuzz\n"); }
void other(int num) { printf("%d\n", num); }
 
int main(void)
{
    void (*funcs[4])(int) = { other, fizz, buzz, fizzbuzz };
    int threes = 1 << 2, fives = 1 << 4;

    for (int i = 1; i <= 100; i++) {
        funcs[(threes & 1) + (2 * (fives & 1))](i);
        threes = (threes >> 1) | ((threes & 1) << 2);
        fives = (fives >> 1) | ((fives & 1) << 4);
    }
}
 
though even this tidied-up version has a magic expression in the array
index.
 
But the biggest problem is that the output depends on some state that
must be updated in line with the loop variable -- if the loop were
changed to test only every other int, the code would break. The 'i++'
and the two bit rotations are tied together.
 
I'd favour a solution where there is a single, separable bit of code
that prints the required output based solely on the number being
considered. For example
 
#include <stdio.h>
 
int main(void)
{
    const char *const f = "fizz", *const b = "buzz";
    for (int i = 1; i <= 100; i++) {
        switch (i % 15) {
        case 0:
            fputs(f, stdout);
            fputs(b, stdout);
            break;
        case 3: case 6: case 9: case 12:
            fputs(f, stdout);
            break;
        case 5: case 10:
            fputs(b, stdout);
            break;
        default:
            printf("%d", i);
        }
        putchar('\n');
    }
}
 
--
Ben.
"Rick C. Hodgin" <rick.c.hodgin@gmail.com>: Jul 18 10:51AM -0400

On 7/18/2018 10:38 AM, Ben Bacarisse wrote:
> variables, an unnecessary include and overly specific types. These are
> details though, and I'll explain the biggest issue I have with it later
> on.
 
Every time you post anything to me it's negative, and overtly so.
I'm to the point of ignoring your posts now from this point forward.
 
Please continue to post for others to read. They can learn from my
mistakes, and your chastising of me.
 
--
Rick C. Hodgin
 
PS -- Once again you've posted something to both comp.lang.c and
comp.lang.c++, but set the follow-up only to comp.lang.c.
ram@zedat.fu-berlin.de (Stefan Ram): Jul 18 01:31PM

>Enforce at compile time that an object can only be
>instantiated on the stack.
 
I see no way to accomplish this. Wild guess: Overwrite
operator »new« to fail for such classes?
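
In current C++ the usual way to spell that guess is to delete the
allocation functions, which makes »new X« a compile-time error (though
it cannot stop the object from being a member of something that itself
lives on the heap):

    #include <cstddef>

    class StackOnly {
    public:
        StackOnly() = default;

        // Direct dynamic allocation of this type is now a compile error.
        static void* operator new(std::size_t)   = delete;
        static void* operator new[](std::size_t) = delete;
    };

    int main()
    {
        StackOnly ok;                      // fine: automatic storage
        // StackOnly* p = new StackOnly;   // error: operator new is deleted
        (void)ok;
    }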
 
>Enforce at compile time that an object can only be
>instantiated on the heap.
 
That should be possible by all-private constructors
and a factory function that only returns a pointer.
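
A minimal sketch of that approach (returning std::unique_ptr rather
than a raw pointer is just one possible choice):

    #include <memory>

    class HeapOnly {
    public:
        // The factory is the only way to create one; it always hands back
        // ownership of a dynamically allocated object.  (std::make_unique
        // cannot see the private constructor, hence the plain new.)
        static std::unique_ptr<HeapOnly> create()
        {
            return std::unique_ptr<HeapOnly>(new HeapOnly);
        }
    private:
        HeapOnly() = default;              // not callable from outside
    };

    int main()
    {
        auto p = HeapOnly::create();       // fine
        // HeapOnly h;                     // error: constructor is private
        (void)p;
    }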
 
>This is to ensure that no "big" objects are instantiated on
>the stack.
 
The object also can put a small object into storage with
automatic lifetime and carry the rest in storage with
dynamic lifetime. Parts of C++ were designed just for
objects of this type.
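
That is essentially the std::vector / std::string design; a minimal
sketch of such a handle:

    #include <cstddef>
    #include <memory>

    // A small handle with automatic lifetime, carrying the big payload
    // in dynamic storage.
    class BigBuffer {
    public:
        explicit BigBuffer(std::size_t n)
            : size_(n), data_(std::make_unique<double[]>(n)) {}

        double&     operator[](std::size_t i) { return data_[i]; }
        std::size_t size() const              { return size_; }

    private:
        std::size_t               size_;
        std::unique_ptr<double[]> data_;    // the "big" part
    };

    int main()
    {
        BigBuffer b(1000000);   // the handle on the stack is just a couple of words
        b[0] = 3.14;
    }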
% <Persent@gmail.com>: Jul 18 06:23AM -0700

On 2018-07-17 3:32 AM, Storage Unit wrote:
> those paths contain non-ASCII characters.
 
>> Microsoft has been reliant on hard-coded paths since before Windows,
>> and Windows 10 is no different.
 
then why aren't you using 10
stayprivate@gmail.com: Jul 18 06:14AM -0700

I would like to create a class/template but want to constrain at compile time where it can live. It seems like type_traits or the upcoming concepts cannot be used for that.
 
I have two scenarios:
- Enforce at compile time that an object can only be instantiated on the stack. This is to implement a "safe" dynarray-like object.
- Enforce at compile time that an object can only be instantiated on the heap. This is to ensure that no "big" objects are instantiated on the stack.
 
I guess being able to enforce/prevent an object being global could be useful. For example, an object you don't want to be global because its constructor depends on some stuff initialised in main().
 
Does this idea make any kind of sense?
 
I can see it being complicated to implement, with scenarios that might be impossible to detect at compile time. Maybe it's a feature more suited to a static analyser?
 
Regards.
Juha Nieminen <nospam@thanks.invalid>: Jul 18 07:53AM

> A good suggestion IMO. A problem with lists (as I think Bjarne has
> pointed out) is they suffer from lack of cache locality, so in
> practice arrays/vectors nearly always do better.
 
That's completely true, but in this case it may be a micro-optimization,
even premature optimization, with little to no benefit.
 
If this were a number-crunching example, where the list of elements is
being traversed and modified millions of times per second, as fast as
the computer can possibly do it, I would never use a linked list
(unless it's absolutely necessary, which is the case with a few
algorithms; rarely, but they exist).
 
However, this example is a list of a couple dozen elements, give or take,
which is traversed and modified once per frame (ie. typically 60 times
per second). It's hardly a bottleneck.
 
Moreover, depending on the particular implementation, the projectiles
themselves may be dynamically allocated objects (which is very common
with many game engines, where sprites are typically dynamically
allocated), so you are not really saving a lot of dynamic allocations
by not using a linked list.
 
The advantage of using std::list is that the code becomes simpler,
and thus the likelihood of bugs is smaller.
 
Know your tools, and use them efficiently.
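
For concreteness, the per-frame "update and drop the dead ones" loop
both ways, with a made-up Projectile type:

    #include <algorithm>
    #include <iterator>
    #include <list>
    #include <vector>

    struct Projectile {                     // hypothetical type for the sketch
        bool alive = true;
        void update() { /* move, collide, maybe set alive = false */ }
    };

    // List version: erasing one element leaves every other iterator
    // valid, so the loop stays simple.
    void update_all(std::list<Projectile>& ps)
    {
        for (auto it = ps.begin(); it != ps.end(); ) {
            it->update();
            it = it->alive ? std::next(it) : ps.erase(it);
        }
    }

    // Vector version: erase-remove after the pass, since erasing
    // mid-loop would invalidate the iterators.
    void update_all(std::vector<Projectile>& ps)
    {
        for (auto& p : ps) p.update();
        ps.erase(std::remove_if(ps.begin(), ps.end(),
                                [](const Projectile& p) { return !p.alive; }),
                 ps.end());
    }

    int main()
    {
        std::list<Projectile>   l(10);
        std::vector<Projectile> v(10);
        update_all(l);
        update_all(v);
    }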
Bo Persson <bop@gmb.dk>: Jul 18 12:14PM +0200

On 2018-07-18 09:50, Tim Rentsch wrote:
 
> I wasn't offering any opinion about how much C++ is to write home
> about. My comment is only about the laughable proposition that
> C++-minus-templates is only mildly larger than C.
 
Yeah! One data point:
 
K&R "The C Programming Language" - 200 pages
 
Stroustrup "The C++ Programming Language" - 1300 pages
 
 
 
Bo Persson
scott@slp53.sl.home (Scott Lurndal): Jul 18 01:01PM


>Yeah! One data point:
 
>K&R "The C Programming Language" - 200 pages
 
>Stroustrup "The C++ Programming Language" - 1300 pages
 
Ah, you're only off by an order of magnitude.
 
Stroustrup "The C++ Programming Language" (July 1987). 328 pages including index
and a sample 'string' class implementation.
 
_That's_ a "mild superset" of C.
Ian Collins <ian-news@hotmail.com>: Jul 18 09:16PM +1200

On 18/07/18 21:08, jacobnavia wrote:
> message. They must be different since errno should have been set by sqrt.
 
> But for clang they aren't.
 
> Is this a bug in clang?
 
Old clang?
 
$ clang++ -std=c++14 x.cc; ./a.out
Error: Numerical argument out of domain
 
$ g++ -std=c++14 x.cc; ./a.out
Error: Numerical argument out of domain
 
$ clang++ --version
clang version 7.0.0-svn337135-1~exp1+0~20180715204310.423~1.gbp8ac377
(trunk)
g++ --version
g++ (Ubuntu 8-20180414-1ubuntu2) 8.0.1 20180414 (experimental) [trunk
revision 259383]
 
Even an old one:
 
$ clang++-4.0 -std=c++14 x.cc; ./a.out
Error: Numerical argument out of domain
 
--
Ian.
