Wednesday, July 4, 2018

Digest for comp.lang.c++@googlegroups.com - 25 updates in 8 topics

David Brown <david.brown@hesbynett.no>: Jul 04 08:28AM +0200

On 03/07/18 18:33, Bart wrote:
 
> Yes, you can find out how crap or otherwise a compiler is at handling
> lots of such expansions. The fact that it crashed (not due to memory)
> suggests there is a bug.
 
gcc is a /big/ program. It has bugs. It would be nice if it did not,
but there are few big programs that are bug-free. Sometimes a
stress-test like this can be useful for finding bugs - so yes, that is a
use of such a bizarre source file, even though I was looking for a use
of the code itself.
 
> f()
 
> But it'll be interesting to see what magic C++'s modules will perform;
> here's my take on the problem:
 
(The flexibility mentioned is from ideas like metaclasses, that are
still a long way from standardisation - modules are about improving
build times.)
 
 
> What is the internal state of the compiler just after those 100,000
> lines? Devise a way to remember that state, or to restore that state if
> compiling multiple small modules with the same invocation of the compiler.
 
That is roughly what modules are. A compiled module is specific to the
compiler - it is not portable in any way. It will include analysed code
in whatever internal formats suit the compiler. It might also contain
bits of generated object code, but often the object code itself is
generated later once the code is all combined.
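
For illustration, a minimal sketch in the Modules TS syntax (the
design was still in flux at the time, and the .cppm extension is one
compiler's convention):

// mathtools.cppm - module interface unit, compiled once into the
// compiler's internal representation
export module mathtools;
export int max_of(int a, int b) { return a > b ? a : b; }

// main.cpp - importing reads the compiled module, not the source text
import mathtools;
int main() { return max_of(3, 4); }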
 
Pre-compiled headers, supported by many compilers, are a sort of
internal compiler state dump generated after a set of include files.
The problem has always been that C and C++ are too flexible about things
- the compilation may change due to different macros being defined
(which could be set in the command line), different options, different
ordering of the include files, and all sorts of little details that
usually don't make a difference, but could in theory.
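
Concretely, gcc's version works something like this (a sketch - the
header name is invented; see the gcc manual on precompiled headers):

// all.h - the big set of common headers, to be precompiled:
#include <string>
#include <vector>
//
// Compile the header itself:  g++ -x c++-header all.h -o all.h.gch
// Any later translation unit that starts with '#include "all.h"'
// picks up all.h.gch automatically - provided the macros, options
// and so on match, which is exactly the fragility described above.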
 
Modules have restrictions to stop this kind of thing - you can't have
conditional compilation based on the value of the __DATE__ macro, and
other gems, in your module code. They can be independent, and you can
use many modules in any order in your code.
 
>> serious candidate for such a thing.
 
> Is that why it seems to be an extension in gcc for C++? There it uses <?
> and >? operators rather than min and max.
 
That extension existed for a while, because someone had thought it was a
good idea and added it to gcc. It was dropped long ago - in gcc 4,
from 2005 - as part of the push towards stricter C++ standards
conformance in gcc.
No one complained, because no one used it.
 
So yes, the gcc extension has shown that you /can/ make working min and
max operators as extensions to C++ (and it would work for C too). It
also shows that it is pointless to do so. Learn from the mistakes of
others, rather than copying them.
 
 
> "Since <? and >? are built into the compiler, they properly handle
> expressions with side-effects; int min = i++ <? j++; works correctly."
 
> https://gcc.gnu.org/onlinedocs/gcc-3.4.3/gcc/Min-and-Max.html
 
It also notes that there are other gcc extensions that let you avoid
side-effect complications with a min and max macro.
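
For reference, that means something like this sketch, using the GNU
statement-expression and __typeof__ extensions (accepted by both gcc
and g++):

#define MAX(a, b)               \
  ({ __typeof__(a) _a = (a);    \
     __typeof__(b) _b = (b);    \
     _a > _b ? _a : _b; })

void demo(int i, int j)
{
    int m = MAX(i++, j++);  // each argument is evaluated exactly once
    (void)m;
}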
David Brown <david.brown@hesbynett.no>: Jul 04 08:32AM +0200

On 03/07/18 16:59, Scott Lurndal wrote:
 
>> So, when people say that C++ is slow to compile (compared to equivalent
>> code in other languages), then what is the reason?
 
> Who says that?
 
Lots of C++ users? It is one of the main motivations for modules, which
is an eagerly awaited upcoming feature for C++. (It is not the only
motivation for modules, of course.)
 
Of course, there are many factors involved, and the speed of compilation
(and the speed you want) will vary hugely from project to project and
developer to developer. For many, compilation speed is not an issue -
but for many others, it is.
 
Mind you, I am not sure how you get equivalent code in other compiled
languages in order to compare the speed. I suppose you could measure
build times for KDE and Gnome.
"Chris M. Thomasson" <invalid_chris_thomasson@invalid.invalid>: Jul 03 11:36PM -0700

On 7/3/2018 4:20 PM, Bart wrote:
 
>> One of my tests for developing CAlive required creating a little parser
>> that converted C code to x86 assembly.
 
> That sort of sounds like a compiler...
[...]
 
> I decided to start my C compiler project on 25-Dec-16. It was usable by
> Easter '17. (However that is now just an interesting, quirky project as
> it plays no part in my current code development, except as a test program.)
 
Bart also said he fairly quickly altered his compiler to handle C11
generic selections (_Generic); read here:
 
https://groups.google.com/d/msg/comp.lang.c/i31kS7cJ70w/cfkEVXjfAAAJ
 
Thanks again Bart. :^)
Wouter Verhelst <w@uter.be>: Jul 04 10:22AM +0200

On 03-07-18 11:23, David Brown wrote:
> it). You also need at least -O1 to get good static warnings.
 
> Sure, many people prefer -O0 for debugging - but it is not "obviously"
> the case :-)
 
Fair enough; I hadn't considered that, but it makes sense.
 
> separate action. For bigger projects, I find that it is often linking
> that takes the noticeable time, rather than compiling (unless I change a
> commonly used header, forcing many files to be re-compiled).
 
That's my experience too.
 
The problem of linking time is much reduced by proper use of "static",
or of compiler pragmas that declare whether a symbol is exported -
something I see some less experienced programmers forget about. But
yeah, even so, large projects tend to take a while to link.
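
For example (a sketch - the function names are invented, and the
attribute is the GCC/Clang spelling, typically combined with
-fvisibility=hidden):

// internal linkage (the C++ counterpart of 'static' at file scope):
// the linker never sees this symbol at all
namespace {
int helper(int x) { return 2 * x; }
}

// explicitly exported from a library built with -fvisibility=hidden:
__attribute__((visibility("default")))
int api_entry(int x) { return helper(x) + 1; }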
Wouter Verhelst <w@uter.be>: Jul 04 10:26AM +0200

On 03-07-18 14:04, Bart wrote:
> who like small tools that work more or less instantly. Compilation
> considered as a task that converts a few hundred KB of input to a few
> hundred KB of output, shouldn't really take that long.
 
There are cycles, and then there are cycles.
 
I do care about cycles -- but about cycles of my resulting code, *not*
about cycles of my compiler. If the compiler spends 100 extra seconds
to build an application that runs about a second on average and comes
out 1% faster, then that application only needs to run 10,000 times
before the extra compile time has paid for itself.
 
An optimizing compiler's job is way more than just "convert a few
hundred KB of input into a few hundred KB of output".
 
I guess your opinion and mine of what compilers do is not the same thing.
Ian Collins <ian-news@hotmail.com>: Jul 04 09:27PM +1200

On 04/07/18 20:22, Wouter Verhelst wrote:
> compiler pragmas to declare a symbol as an export (or not), which I see
> some less experienced programmers forget about; but yeah, even so, large
> projects tend to take a while being linked together.
 
That shouldn't be a problem with modern tools. I guess our project is
quite large, but the link is reasonably quick:
 
$ ls -lh bin/Posix_Debug/Tests
-rwxr-xr-x 1 ian ian 205M Jul 4 21:23 bin/Posix_Debug/Tests
 
$ rm bin/Posix_Debug/Tests
 
$ time ninja bin/Posix_Debug/Tests
[1/1] LINK bin/Posix_Debug/Tests
 
real 0m3.334s
user 0m3.004s
sys 0m0.310s
 
Not too shabby...
 
--
Ian.
David Brown <david.brown@hesbynett.no>: Jul 04 11:33AM +0200

On 04/07/18 10:22, Wouter Verhelst wrote:
> compiler pragmas to declare a symbol as an export (or not), which I see
> some less experienced programmers forget about; but yeah, even so, large
> projects tend to take a while being linked together.
 
There is no shortage of "static" in my code!
 
I wouldn't call linking time a problem, not for the projects I work on -
but it is often longer than the compiles. When you have a traditional
make-based compile-then-link build, builds usually only involve a single
compile of the source file you have changed. You might need a few
compiles when changing a header, but those run in parallel. The
linking, however, is serial and scales with the number of files and the
size of the project. (Big projects can do more complex builds, with
incremental linkage.)
Bart <bc@freeuk.com>: Jul 04 11:45AM +0100

On 04/07/2018 10:33, David Brown wrote:
> linking, however, is serial and scales with the number of files and the
> size of the project. (Big projects can do more complex builds, with
> incremental linkage.)
 
I could never understand why linking was such a big deal, or why it
took so long. (I'm remembering back to mainframe and minicomputer
days.)
 
When it started to impinge on my work on microcomputers, I just wrote my
own version, which I called a loader - it was just a question of reading
a series of object files, doing some fixups, and writing out the result
in the format demanded by the OS for its executable files. It took no
time at all.
 
More recently it has again become a problem - not for its speed (my
biggest project might take a quarter of a second to link via gcc), but
because (1) there was this huge dependency on gcc; (2) gcc would add a
bunch of crap to the executable; (3) if I used 'ld' separately, it
wouldn't work reliably on Windows 10; (4) if I used an independent
linker, that had issues of its own that rendered it unusable.
 
So again I had to tackle the problem.
 
(By first eliminating object files: my compilers generate .asm files.
The assembler does the whole job by reading and assembling all the
.asm files of a project - it's /very/ quick - then fixing up the
resulting in-memory code into the required .exe format and writing out
the file.
 
That last step appears to take 1ms for a 300KB executable (I assume the
OS completes the actual file write after my app terminates). In any case
'linking' is no longer a problem as the concept itself has been largely
eliminated.
 
(In this scheme, external libraries must be .dll (shared library) files
which become imports in the .exe file.))
 
--
bart
David Brown <david.brown@hesbynett.no>: Jul 04 01:52PM +0200

On 04/07/18 12:45, Bart wrote:
 
> Linking is something I could never understand why it was such a big deal
> or why it took so long. (I'm remembering back to mainframe and
> minicomputer days.)
 
I guess this is another one of these cases where you have a nice,
efficient simple solution and the rest of the programming world has
vastly over-complicated and slow solutions. It must be frustrating
being so much smarter than the rest of the world put together!
 
Linking /can/ be simple, for simple systems, few features, small
programs, simple languages. Outside that, in the real world of
professional programming, it gets more complicated.
 
Most of my projects aren't big enough for linking time to be an issue,
but on some the link of the whole project can take longer than
individual compiles. (Not longer than all the compiles - but as noted
earlier, builds frequently only need to re-compile one file. Linking,
using a simple build, starts from scratch each time.)
 
> bunch of crap to the executable; (3) if I used 'ld' separately, it
> wouldn't work reliably on Windows 10; (4) if I used an independent
> linker, that had issues of its own that rendered it unusable.
 
So by "bunch of crap", you mean "vital things that I don't understand" ?
Bart <bc@freeuk.com>: Jul 04 01:35PM +0100

On 04/07/2018 12:52, David Brown wrote:
 
> I guess this is another one of these cases where you have a nice,
> efficient simple solution and the rest of the programming world has
> vastly over-complicated and slow solutions.
 
Pretty much, yes. In the 80s I was writing commercial applications and
the development process WAS much streamlined by using just a simple
loader (which, as it happened, also managed overlays).
 
So in that case, the traditional linker was not necessary. I don't know
what weird and wonderful applications other people were writing, but I'd
wager a big chunk of them were just as straightforward as mine.
 
 
> Linking /can/ be simple, for simple systems, few features, small
> programs, simple languages. Outside that, in the real world of
> professional programming, it gets more complicated.
 
In what way? The task is still to take a load of .o, .a, .so files, and
produce an executable or another .so file, right?
 
>> because (1) there was this huge dependency on gcc; (2) gcc would add a
>> bunch of crap to the executable
 
> So by "bunch of crap", you mean "vital things that I don't understand" ?
 
Things to do with C startup code, which wasn't relevant for me as I was
linking a generic program not specific to C or gcc. Actually, if you run:
 
gcc hello.c
 
then even this simple program invokes ld using the 48 parameters shown
below. You are free not to regard this as a 'bunch of crap'.
 
Since in my case I would be trying to link a program represented in its
entirety by the ASM given at the end of this post, which exports one
symbol 'main', and imports two symbols 'printf' and 'exit' from
msvcrt.dll, then my opinion as to the relevance of all this junk may
well be different to yours.
 
>> More recently it has again become a problem - not for its speed (my
>> biggest project might take a quarter of a second to link via gcc),
 
(That's on a desktop PC. On a cheap laptop using SD cards for storage,
linking was taking several seconds. I can't remember how much faster my
'linker' was; I'll have to find the charger lead before I can test it.)
 
--------------------------------------------------------------
1: -plugin
2: c:/TDM/bin/../libexec/gcc/x86_64-w64-mingw32/5.1.0/liblto_plugin-0.dll
3: -plugin-opt=c:/TDM/bin/../libexec/gcc/x86_64-w64-mingw32/5.1.0/lto-wrapper.exe
4: -plugin-opt=-fresolution=C:\Users\user\AppData\Local\Temp\ccqzJCDj.res
5: -plugin-opt=-pass-through=-lmingw32
6: -plugin-opt=-pass-through=-lgcc
7: -plugin-opt=-pass-through=-lmoldname
8: -plugin-opt=-pass-through=-lmingwex
9: -plugin-opt=-pass-through=-lmsvcrt
10: -plugin-opt=-pass-through=-lpthread
11: -plugin-opt=-pass-through=-ladvapi32
12: -plugin-opt=-pass-through=-lshell32
13: -plugin-opt=-pass-through=-luser32
14: -plugin-opt=-pass-through=-lkernel32
15: -plugin-opt=-pass-through=-lmingw32
16: -plugin-opt=-pass-through=-lgcc
17: -plugin-opt=-pass-through=-lmoldname
18: -plugin-opt=-pass-through=-lmingwex
19: -plugin-opt=-pass-through=-lmsvcrt
20: -m
21: i386pep
22: --exclude-libs=libpthread.a
23: -Bdynamic
24: c:/TDM/bin/../lib/gcc/x86_64-w64-mingw32/5.1.0/../../../../x86_64-w64-mingw32/lib/../lib/crt2.o
25: c:/TDM/bin/../lib/gcc/x86_64-w64-mingw32/5.1.0/crtbegin.o
26: -Lc:/TDM/bin/../lib/gcc/x86_64-w64-mingw32/5.1.0
27: -Lc:/TDM/bin/../lib/gcc
28: -Lc:/TDM/bin/../lib/gcc/x86_64-w64-mingw32/5.1.0/../../../../x86_64-w64-mingw32/lib/../lib
29: -Lc:/TDM/bin/../lib/gcc/x86_64-w64-mingw32/5.1.0/../../../../lib
30: -Lc:/TDM/bin/../lib/gcc/x86_64-w64-mingw32/5.1.0/../../../../x86_64-w64-mingw32/lib
31: -Lc:/TDM/bin/../lib/gcc/x86_64-w64-mingw32/5.1.0/../../..
32: C:\Users\user\AppData\Local\Temp\ccjoN5Hg.o
33: -lmingw32
34: -lgcc
35: -lmoldname
36: -lmingwex
37: -lmsvcrt
38: -lpthread
39: -ladvapi32
40: -lshell32
41: -luser32
42: -lkernel32
43: -lmingw32
44: -lgcc
45: -lmoldname
46: -lmingwex
47: -lmsvcrt
48: c:/TDM/bin/../lib/gcc/x86_64-w64-mingw32/5.1.0/crtend.o
 
-------------------------------------------------------
`main::
sub Dstack, 8
 
sub Dstack, 32
mov D10, KK1
call `printf*
add Dstack, 32
 
sub Dstack, 32
mov D10, 0
call exit*
 
segment idata
align 8
KK1:db "Hello World!",10,0
 
 
--
bart
Tim Rentsch <txr@alumni.caltech.edu>: Jul 02 01:42AM -0700

>>>> long long : max_llong, unsigned long long : max_ullong, \
>>>> float : max_float, double : max_double, long double : max_ldouble \
>>>> )(a, b)
 
[...]
 
> the binding of the '>' operator is taken at the point at which the
> 'make_max' macro is called and not at the point where that macro is
> defined. [...]
 
I see at least three other criticisms:
 
(1) Doesn't cover all standard types;
 
(2) A bug in one of the types it does cover;
 
(3) Falls down badly in cases where the two arguments have
different types.
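
For what it's worth, (3) is the kind of thing a C++ template handles
naturally; a sketch ('max_of' is an invented name):

#include <type_traits>

template <typename A, typename B>
constexpr std::common_type_t<A, B> max_of(A a, B b)
{
    using C = std::common_type_t<A, B>;
    return static_cast<C>(a) > static_cast<C>(b)
         ? static_cast<C>(a) : static_cast<C>(b);
}

// max_of(1, 2.5) yields 2.5 as a double; a _Generic selection keyed
// on a single argument's type cannot produce that common type.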
Wouter Verhelst <w@uter.be>: Jul 02 09:02AM +0200

On 01-07-18 18:47, Bart wrote:
> bit of AST, doing the same optimisations and the same <max> pattern
> recognition and the conversion into intrinsic versions or into optimised
> code, over and over again.
 
Yeah, that's true, but who cares? If the compiler is slow during
compilation so that it can produce a fast program, I couldn't care less.
 
The resulting program will have the optimized max opcode at every place
where it is needed, and that's what matters -- not how the compiler gets
there, IMO.
 
> spend its time on. Implementing 'max' etc efficiently can be done once,
> instead of in N different places in a program, every time it is
> compiled, and for every other programmer and application that uses it.
 
As far as I'm concerned, the compiler can spend an hour optimizing a
simple "hello world" program if that means that program's runtime is cut
in half. Yes, I'm exaggerating, but you get the point.
 
Given your past statements, I'm sure you disagree with that. That's fine :-)
Keith Thompson <kst-u@mib.org>: Jul 01 12:14PM -0700

> On 01.07.2018 11:08, Bart wrote:
[...]
 
> They corresponded directly, and still correspond, to very common
> processor instructions, and at that time it was more the programmer's
> job, and not the compiler's, to optimize the resulting machine code.
 
[...]
 
> Of course, historically that was ten years or so after development of C
> started, PC 1981 versus plain C 1971. I think the original C development
> was on a PDP-10.
 
Off-by-three error. It was a PDP-7.
 
> I had some limited exposure to the PDP-11, but all I
> remember about the assembly language was that it was peppered with @
> signs (probably indicating macros),
 
No, @ (or parentheses) indicate "deferred" addressing modes.
For example R3 refers to the R3 register, (R3) or @R3 refers to
the memory location whose address is stored in R3.
 
> and that the registers were numbered
> and memory-mapped.
 
Numbered, yes (R0..R7, where R7 is the PC (Program Counter)), but not
memory-mapped.
 
> But I'm pretty sure that if the PDP-11 didn't have
> add to register, I'd remember that. And so, presumably also the PDP-10.
 
If I recall correctly,
 
ADD #42, R0
 
would add 42 to the contents of R0 and store the sum in R0.
 
I'm less familiar with the PDP-7, but I think it also had 2-operand
instructions, where one of the operands was the target.
 
[...]
 
--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
David Brown <david.brown@hesbynett.no>: Jul 04 03:19PM +0200

On 04/07/18 14:35, Bart wrote:
> symbol 'main', and imports two symbols 'printf' and 'exit' from
> msvcrt.dll, then my opinion as to the relevance of all this junk may
> well be different to yours.
 
Your problem is that you like to write "hello world" programs, and get
upset that tools for real code are more sophisticated than you need.
These options appear to be handling a variety of libraries that most
/real/ programs need, or that are needed to support C and C++ standards.
 
gcc (and related programs, like binutils, libraries, etc.) is not a toy
- it is a real development tool, with a great deal of features. You
only use a tiny part of those features - most gcc users only use a small
part (though I expect few use as little as you). But gcc was not made
for you - it was not, like your own little compiler, written
specifically to suit /your/ personal needs and wants. You only want a
tool that compiles "hello world", and maybe stretches to programs with 2 or
3 source files - other people want a compiler that works well for
programs with hundreds of thousands of source files and hundreds of MB
of code. You only need a subset of an ancient C standard - other people
want the latest versions of several different programming languages.
You only need a couple of variations of a single cpu, others want
support for many dozen processors. And so on.
 
All this means that compiling and linking might take a few seconds on a
PC that was outdated decades ago, when it is possible for a limited
toy tool to work in a fraction of a second. Speed doesn't matter when
your tool can't do the job other people need.
 
You are comparing a Tonka Truck with a Toyota HiAce, using the time
taken to get out of the garage as a benchmark.
 
Juha Nieminen <nospam@thanks.invalid>: Jul 04 07:12AM

> Le 01/07/2018 à 23:40, Vir Campestris a écrit :
>> Use the old syntax. Then you can type i +=2 to skip the alternate ones.
 
> Why is the index not included as info with the new syntax?
 
A range-based for loop uses, as its name implies, an iterator range.
An iterator range can cover more than just random access data containers
(for which an "index" value makes sense).
 
You could have an integral *counter* that tells how many elements have
been processed so far, but strictly speaking that's not an "index".
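
For instance (a sketch; 'use' and the container are placeholders):

#include <cstddef>
#include <vector>

void use(int element, std::size_t count);  // invented consumer

void demo(const std::vector<int>& container)
{
    std::size_t count = 0;
    for (const auto& element : container) {
        use(element, count); // count = elements processed before this
        ++count;             // one; an index only for random-access
    }                        // containers traversed front to back
}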
Vir Campestris <vir.campestris@invalid.invalid>: Jul 01 10:40PM +0100

On 30/06/2018 19:09, jacobnavia wrote:
> __index, so there is no way to do that C like code with the new syntax.
 
> We lost the index with the new syntax.
 
> Or I am just wrong and I have overlooked something?
 
Use the old syntax. Then you can type i += 2 to skip the alternate ones.
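
I.e. something like this (a sketch; 'v' and 'process' are
placeholders):

#include <cstddef>
#include <vector>

void process(int x);  // invented

void every_other(const std::vector<int>& v)
{
    for (std::size_t i = 0; i < v.size(); i += 2) {
        process(v[i]);  // visits elements 0, 2, 4, ...
    }
}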
 
Andy
"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Jul 04 10:11AM +0200

On 01.07.2018 19:08, Paavo Helde wrote:
 
> This is one of the few occasions I have found macros useful. All my
> mutex locks go through macros which define a lock variable, e.g:
 
> SCOPED_LOCK(g_pages_mutex);
 
This seems to imply one such macro for each type where the RAII wrapper
idiom is used.
 
I wonder how you dealt with the name of that object, considering that
one might want to have two or more such objects in the same scope?
 
 
> above checks.
 
> 4. The lock variable name is hidden which is fine as it does not matter
> most of the time (it's needed only when one wants to wait on the mutex).
 
A more general macro like the one exemplified below keeps benefits 1
and 4, but if used directly appears to lose benefits 2 and 3.
 
Direct usage would go like
 
using Lock = std::lock_guard<std::mutex>;
 
//...
BLAHBLAH_WITH( Lock{ g_pages_mutex } )
{
// code
}
 
where a C++17 definition of BLAHBLAH_WITH can go like this:
 
#define BLAHBLAH_WITH( ... ) \
if( const auto& _ = __VA_ARGS__; !!&_ ) // !!&_ avoids a warning about unused '_'
 
Here the variable's name isn't so much of a problem since the macro
introduces a nested scope.
 
* * *
 
I've found that the main problem with defining such a macro is to make
the usage syntax natural.
 
With earlier versions of C++ one would have to use a definition where
the usage would look like
 
BLAHBLAH_WITH( Lock,( g_pages_mutex ) )
 
or
 
BLAHBLAH_WITH( Lock, g_pages_mutex )
 
in order to separate the constructor arguments from the type.
 
The latter usage syntax is perhaps more conventional, e.g. it's used by
`std::thread` constructor.
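
I.e. the thing to run comes first, then the arguments to pass to it (a
sketch; 'run_task' is an invented callable):

#include <thread>

void run_task(int n);  // invented worker

void demo()
{
    std::thread t{ run_task, 42 }; // callable first, then its arguments
    t.join();
}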
 
* * *
 
A lesser problem is the all uppercase shouting and noisy prefixes like
BLAHBLAH_. The fearless programmer can use `$` as a macro name prefix.
It's non-standard but all compilers seem to support it.
 
Potential problem: when Herb Sutter tried that for his library code
(just after I'd tried that for my experimental library code), with a
large number of testers, some complained that their company used some
preprocessing scheme where the `$` was significant. So as I understand
it he ditched that scheme. But it's there, for the fearless. :)
 
E.g. after seeing your posting I defined
 
 
----------------------------------------------------------------------
#pragma once
 
// For example
// $with( Lock{ mutex } ) access_resource();
 
#if not defined( CPPX_NO_DOLLAR_NAMES )
#   define $with CPPX_WITH
#endif

#define CPPX_WITH( ... ) \
if( const auto& _ = __VA_ARGS__; !!&_ ) // presumed: same construct as BLAHBLAH_WITH above (post truncated in digest)
