Thursday, September 9, 2021

Digest for comp.lang.c++@googlegroups.com - 24 updates in 1 topic

Bart <bc@freeuk.com>: Sep 09 01:27AM +0100

On 09/09/2021 00:06, Ian Collins wrote:
> and proprietary projects fall into that category.  Even our little
> embedded product has 2600+ source and 4500+ header files.  Combining
> those would probably take longer than building them!
 
Have you tried it?
 
I've just downloaded the sources of the GMP project that has been
discussed, from github.
 
There are about 960 .c and .h files in all (not including any that may
be synthesised when building, but including files specific to many
different targets).
 
Extracting all those files from the directory structure, into a single
file, took my script 0.6 seconds.
 
That 5MB file contained 173K lines of code, which is not huge. (The 5
language projects of mine that I'd posted about, coincidentally also
totalled 173K lines of code; and also coincidentally took 0.6 seconds to
build into 5 executables.)
 
The 'configure' script that comes with this is 30.5K lines of somewhat
denser code than the C sources.
 
That's like needing a heavy 100-line script in order to compile a
600-line program.
 
I couldn't see much in the way of .s files, or inline asm() statements,
so I don't know where that comes from.
Ian Collins <ian-news@hotmail.com>: Sep 09 01:05PM +1200

On 09/09/2021 12:27, Bart wrote:
>> embedded product has 2600+ source and 4500+ header files.  Combining
>> those would probably take longer than building them!
 
> Have you tried it?
 
Nope.
 
> language projects of mine that I'd posted about, coincidentally also
> totalled 173K lines of code; and also coincidentally took 0.6 seconds to
> build into 5 executables.)
 
So what happens if you get an error at line 16,789? How do you map that
back to the original source?
 
It's much easier to combine C files than it is to combine C++ where
there's often a fair bit of code in the headers. Even if it were
feasible to combine all of the files, the resulting TU would probably
break the compiler on a machine with a reasonable amount of RAM. Even
if it could be compiled, it would take an age on a single core.
 
--
Ian.
Juha Nieminen <nospam@thanks.invalid>: Sep 09 08:11AM

> arithmetics however can be satisfied by not using GMP but other things
> for example Boost.Multiprecision (that has modern interface and is quite
simple to build for a wide range of platforms).
 
Yeah, why would anybody want to use the most efficient and best library
if they don't have access to a multi-million-dollar server farm? Even
the notion is totally ludicrous!
MisterMule@stubborn.uk: Sep 09 08:54AM

On Wed, 8 Sep 2021 20:22:52 +0300
>when
>> moo.h changes and main.c should be recompiled when anything changes?
 
>Such dependencies are taken care of automatically by the gcc -MD option,
 
Not unless the compiler is clairvoyant, they aren't.
MisterMule@stubborn.uk: Sep 09 08:55AM

On Thu, 9 Sep 2021 10:58:13 +1200
 
>> I've written cross platform code for years. Generally its 1 or 2 lines in
>> the Makefile that need to be commented in/out. How is CMake any simpler?
 
>Windows?
 
No, but then CMake and make are rarely used on Windows; it's VC++ solution files.
Paavo Helde <myfirstname@osa.pri.ee>: Sep 09 12:15PM +0300

> No, but then CMake and make are rarely used on Windows; it's VC++ solution files.
 
Yes, on Windows VC++ project files are often used, and they can be
easily produced by cmake:
 
> cmake.exe -h
[...]
Generators
 
The following generators are available on this platform (* marks default):
* Visual Studio 16 2019        = Generates Visual Studio 2019 project files.
                                 Use -A option to specify architecture.
  Visual Studio 15 2017 [arch] = Generates Visual Studio 2017 project files.
                                 Optional [arch] can be "Win64" or "ARM".
  Visual Studio 14 2015 [arch] = Generates Visual Studio 2015 project files.
                                 Optional [arch] can be "Win64" or "ARM".
  Visual Studio 12 2013 [arch] = Generates Visual Studio 2013 project files.
                                 Optional [arch] can be "Win64" or "ARM".
  Visual Studio 11 2012 [arch] = Generates Visual Studio 2012 project files.
                                 Optional [arch] can be "Win64" or "ARM".
  Visual Studio 10 2010 [arch] = Generates Visual Studio 2010 project files.
                                 Optional [arch] can be "Win64" or "IA64".
  Visual Studio 9 2008 [arch]  = Generates Visual Studio 2008 project files.
                                 Optional [arch] can be "Win64" or "IA64".
 
[...]
MisterMule@stubborn.uk: Sep 09 09:19AM

On Thu, 9 Sep 2021 12:15:07 +0300
>files.
 
>Yes, on Windows VC++ project files are often used, and they can be
>easily produced by cmake:
 
Using one obtuse tool to produce project files that are even more obtuse
makes me glad I don't do Windows dev now. The VC++ build system is an utter
mess hidden behind a pretty front end.
Bart <bc@freeuk.com>: Sep 09 10:29AM +0100

On 09/09/2021 02:05, Ian Collins wrote:
>>> those would probably take longer than building them!
 
>> Have you tried it?
 
> Nope.
 
So you're guessing. IME pushing files around is insignificant compared
with building them, and that's with my fast tools.
 
My timings all take advantage of files having been recently processed and
therefore cached. Which will be the case if you've just downloaded some
project and unpacked the files.
 
>> build into 5 executables.)
 
> So what happens if you get an error at line 16,789?  How do you map that
> back to the original source?
 
What happens if you get an error in line 16,789 of the configure file?
 
(Which for GMP is this line:
 
$CC -Zdll -Zcrtdll -o $output_objdir/$soname $libobjs $deplibs
$compiler_flags $output_objdir/$libname.def~
 
But I don't get that far:
 
"'.\configure' is not recognized as an internal or external command,
operable program or batch file.")
 
What happens if you get an error on line 1234 of a makefile which is
already gobbledygook? (Which I used to get all the time, though not
specifically on that line.)
 
In the case of a single file C distribution, you will have checked it
compiles correctly using a range of recommended compilers before making
it available.
 
The range of likely errors is considerably smaller than the vast
possibilities of going wrong with scripts, makefiles, running assorted
other languages (like m4) and various source files not being in the right
place or not being found because they don't exist.
 
With a single file, it's relatively easy to check if it's present or not!
 
I was anyway suggesting a streamlined version of the sources, not
necessarily one amalgamated file. Like having them in one place.
 
(However, about 30% of GMP's files have clashing names, which suggests
it needs to sort something out with the various targets. For
example, have a separate version for each; after all, I don't need the
source code for MIPS etc. when I want to build for x64.)
 
> feasible to combine all of the files, the resulting TU would probably
> break the compiler on a machine with a reasonable amount of RAM.  Even
> if it could be compiled, it would take an age on a single core.
 
That's just a mark against the approach used by C++. I don't even
attempt to build such projects, and can't use libraries with a C++ API.
 
But it also happens that some of the easiest C libraries to use are
implemented as a single header file (just .h, not even .h and .c;
for example stb_image.h). Those are the ones where the authors have given
some thought to ease of deployment.
Ian Collins <ian-news@hotmail.com>: Sep 09 10:19PM +1200

>>> the Makefile that need to be commented in/out. How is CMake any simpler?
 
>> Windows?
 
> No, but then CMake and make are rarely used on Windows; it's VC++ solution files.
 
You said you've written cross platform code for years and asked how does
CMake simplify things. If cross platform includes Windows, it certainly
does.
 
Even the Visual Studio team recommends it for cross platform development.
 
https://docs.microsoft.com/en-us/cpp/build/get-started-linux-cmake?view=msvc-160
 
https://devblogs.microsoft.com/cppblog/cmake-support-in-visual-studio/
 
--
Ian.
Ian Collins <ian-news@hotmail.com>: Sep 09 10:28PM +1200

On 09/09/2021 21:29, Bart wrote:
 
>> Nope.
 
> So you're guessing. IME pushing files around is insignificant compared
> with building them, and that's with my fast tools.
 
We use distributed builds to compile our code.
 
 
>> So what happens if you get an error at line 16,789?  How do you map that
>> back to the original source?
 
> What happens if you get an error in line 16,789 of the configure file?
 
We don't have one. We have a premake Lua script.
 
Most of the open source libraries we use have CMake scripts.
 
>> it it could be compiled, it would take an age on a single core.
 
> That's just a mark against the approach used by C++. I don't even
> attempt to build such projects, and can't use libraries with a C++ API.
 
So why are you complaining in a C++ group?
 
You have failed to address the question of how distributed teams work
with code. At some point, even if they distribute a single source (like
sqlite) there must be a system to build individual files.
 
You also gloss over the fact that any reasonably large project will
build significantly faster as discrete files.
 
--
Ian
Bart <bc@freeuk.com>: Sep 09 12:40PM +0100

On 09/09/2021 11:28, Ian Collins wrote:
 
>> What happens if you get an error in line 16,789 of the configure file?
 
> We don't have one.  We have a premake Lua script.
 
> Most of the open source libraries we use have CMake scripts.
 
CMake on Windows is a 93MB installation with 6000 files. For something I
don't quite get the point of. (If it does scripting, Lua is a better bet
with usually a single, smallish executable.)
 
 
>> That's just a mark against the approach used by C++. I don't even
>> attempt to build such projects, and can't use libraries with a C++ API.
 
> So why are you complaining in a C++ group?
 
I can't remember; I think the thread has veered across both C and C++
groups and I haven't paid much attention.
 
> You have failed to address the question of how distributed teams work
> with code.
 
I don't have experience with such code, so I don't know the problems and
the approaches I might use to fix them.
 
But I do have experience of trying to use open-source products, where
what would be a trivial-to-build project turns into an exasperating hunt
for the necessary information, which is buried with two levels of
encryption inside makefiles and scripts.
 
> At some point, even if they distribute a single source (like
> sqlite) there must be a system to build individual files.
 
In order to build Lua (that script language mentioned above), the
hundreds of lines of makefiles and scripts can actually be reduced down
to a list of 34 .c files that need to be compiled into the same
executable; that's it.
 
I used to build it with an @ file containing that list of files plus
whatever compile options were needed.
 
I just wish more projects would also provide that basic information.
Some projects, like yours, may be too big for that to be practical, but
very many could benefit from the approach.
 
People building such things will usually be programmers; they can
arrange their own scripting when necessary!
 
> You also gloss over the fact that any reasonably large project will
> build significantly faster as discrete files.
 
The actual compile-time is not that critical for something that
hopefully you only do once. It will still be faster than running that
brain-dead configure script!
 
Optimisation is not important for the first step of getting /anything/
working, so that you can use a fast compiler, or turn off some options.
But when you do need the extra speed, a single file gives you
whole-program optimisation for free.
 
It's true that discrete files make parallelising a build more
practical. But it's also true that a fast compiler can generate up to
1-2MB of unoptimised binary code per second per core; so exactly how big
is the thing you're building?
Paavo Helde <myfirstname@osa.pri.ee>: Sep 09 02:41PM +0300


> Using one obtuse tool to produce project files that are even more obtuse
> makes me glad I don't do Windows dev now. The VC++ build system is an utter
> mess hidden behind a pretty front end.
 
I fail to see how cooperating with the most widely used build system on a
platform might make anything obtuse.
 
But if it does, then I'm all happy about that; I have hurt myself enough
times with supposedly sharp tools like autoconf or plain makefiles.
Paavo Helde <myfirstname@osa.pri.ee>: Sep 09 02:47PM +0300

>>> moo.h changes and main.c should be recompiled when anything changes?
 
>> Such dependencies are taken care of automatically by the gcc -MD option,
 
> Not unless the compiler is clairvoyant, they aren't.
 
I am sure Clairvoyant is happy if somebody pays them $$ for features
which other people have enjoyed for free for decades.
MisterMule@stubborn.uk: Sep 09 04:09PM

On Thu, 9 Sep 2021 22:19:03 +1200
 
>You said you've written cross platform code for years and asked how does
>CMake simplify things. If cross platform includes Windows, it certainly
>does.
 
Well, my cross-platform work doesn't include Windows. It includes Linux, MacOS, *BSD
and very occasionally Solaris. I had the misfortune of being forced to do
Windows dev for 6 months using the abortion known as VC++ and I have no
intention of doing it again.
 
>Even the Visual Studio team recommends it for cross platform development.
 
That's nice.
MisterMule@stubborn.uk: Sep 09 04:13PM

On Thu, 9 Sep 2021 14:47:12 +0300
 
>> Not unless the compiler is clairvoyant, they aren't.
 
>I am sure Clairvoyant is happy if somebody pays them $$ for features
>which other people have enjoyed for free for decades.
 
No, they haven't, because the compiler has no idea which files to rebuild if
it doesn't know what those files are. The build system knows, the compiler
doesn't.
"Öö Tiib" <ootiib@hot.ee>: Sep 09 09:25AM -0700

On Thursday, 9 September 2021 at 11:11:51 UTC+3, Juha Nieminen wrote:
> Yeah, why would anybody want to use the most efficient and best library
> if they don't have access to a multi-million-dollar server farm? Even
> the notion is totally ludicrous!
 
There are no silver-bullet libraries. Depending on the problem, other
libraries (like NTL, LiDIA or CLN) can work faster than GMP. But it indeed
has to be quite a problem, on quite a data centre, for such a choice to be
important and give a significantly faster ROI.
James Kuyper <jameskuyper@alumni.caltech.edu>: Sep 09 12:41PM -0400

>>> On Wed, 8 Sep 2021 09:38:24 +1200
>>> Ian Collins <ian-news@hotmail.com> wrote:
>>>> On 08/09/2021 04:41, Paavo Helde wrote:
...
>>> the Makefile that need to be commented in/out. How is CMake any simpler?
 
>> Windows?
 
> No, but then CMake and make are rarely used on Windows; it's VC++ solution files.
 
I don't have a lot of Windows experience, but I have worked on two
different projects that targeted Windows platforms. Both of those
projects used CMake to simplify the process of making the software
portable to both Windows and Linux.
I don't like CMake, because when it built a program incorrectly, I had a
great deal of trouble trying to figure out what I needed to change to
make it build the project correctly. But to be fair, I never had the
time I would have needed to become anywhere near as familiar with CMake
as I was with Unix makefiles. That's probably also due to the fact that
I was 30 when I first used make, and nearly twice that age when I first
used CMake - the mind does get less flexible with age.
James Kuyper <jameskuyper@alumni.caltech.edu>: Sep 09 12:41PM -0400

>>> moo.h changes and main.c should be recompiled when anything changes?
 
>> Such dependencies are taken care of automatically by the gcc -MD option,
 
> Not unless the compiler is clairvoyant, they aren't.
 
That option causes a dependencies file to be created specifying all the
dependencies that the compiler notices during compilation. That file can
then be used to avoid unnecessary re-builds the next time the same file
is compiled. The dependency file is therefore always one build
out-of-date; if you created any new dependencies, or removed any old
ones, the dependencies file will be incorrect until after the next time
you do a build. It's therefore not a perfect solution - but neither is
it useless.
James Kuyper <jameskuyper@alumni.caltech.edu>: Sep 09 12:42PM -0400


> No, they haven't, because the compiler has no idea which files to rebuild if
> it doesn't know what those files are. The build system knows, the compiler
> doesn't.
 
 
A C compiler is supposed to replace a #include directive with the
contents of the specified file - if it doesn't know what file that is,
how can a compiler perform that replacement? If it does know what file
that is, it can record that dependency for use in later runs, which is
precisely what gcc -MD does.
 
Most compilers I'm familiar with also invoke the linker themselves, and
therefore know precisely which files they told the linker to link
together. They also use information provided by the linker to determine
which library files the linker ended up linking into the program.
scott@slp53.sl.home (Scott Lurndal): Sep 09 06:37PM

>target_link_libraries(...)
>file(GLOB MY_SOURCES *.cpp *.h)
>target_sources(... ${MY_SOURCES})
 
TOP=..
include $(TOP)/Makefile.defs
 
SOURCES = analyze.cpp
SOURCES += base.cpp
SOURCES += breakpoint.cpp
SOURCES += cf.cpp
SOURCES += channel.cpp
SOURCES += clear.cpp
SOURCES += command.cpp
SOURCES += control.cpp
SOURCES += decompile.cpp
SOURCES += delete.cpp
SOURCES += dis.cpp
SOURCES += dump.cpp
SOURCES += exchange.cpp
SOURCES += haltwait.cpp
SOURCES += iocbdump.cpp
SOURCES += iodump.cpp
SOURCES += load.cpp
SOURCES += mem.cpp
SOURCES += mp.cpp
SOURCES += quit.cpp
SOURCES += rle.cpp
SOURCES += run.cpp
SOURCES += save.cpp
SOURCES += search.cpp
SOURCES += so.cpp
SOURCES += start.cpp
SOURCES += state.cpp
SOURCES += status.cpp
SOURCES += step.cpp
SOURCES += stop.cpp
SOURCES += table.cpp
SOURCES += to.cpp
SOURCES += trace.cpp
 
OBJECTS = $(SOURCES:.cpp=.o)
 
all: $(LIBMP)
 
install:: all
 
$(LIBMP): $(OBJECTS)
ar cr $@ $+
 
include $(TOP)/Makefile.rules
-include *.d
 
Yes, I could replace the SOURCES += with a single wildcard expansion in gmake,
but I prefer them listed individually for flexibility in the contents of the
directory containing the (sub) makefile.
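For comparison, the wildcard form alluded to above would collapse the whole list to one line (GNU make only; a sketch):

```make
SOURCES = $(wildcard *.cpp)        # every .cpp in the directory, unconditionally
OBJECTS = $(SOURCES:.cpp=.o)
```

The explicit list trades brevity for control: a scratch or experimental file left in the directory never silently ends up in the library.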
scott@slp53.sl.home (Scott Lurndal): Sep 09 06:43PM


>Yeah, why would anybody want to use the most efficient and best library
>if they don't have access to a multi-million-dollar server farm? Even
>the notion is totally ludicrous!
 
GMP is an open source project. I'm sure they'd welcome your contributions
to the GMP library and/or build environment that help make it useful to you.
David Brown <david.brown@hesbynett.no>: Sep 09 10:17PM +0200

On 09/09/2021 18:41, James Kuyper wrote:
> ones, the dependencies file will be incorrect until after the next time
> you do a build. It's therefore not a perfect solution - but neither is
> it useless.
 
The trick is to have makefile rules (or the equivalent in whatever build
system you use) along with gcc, so that the dependency file not only
labels the object file as dependent on the C or C++ file and all the
include files it uses, recursively, but also labels the dependency file
itself as dependent on the same files. Then if the source file or
includes are changed, the dependency file is re-created, and make is
smart enough to reload that dependency file to get the new dependencies
for building the object file.
 
The makefile rules involved are close to APL in readability, but once
you have figured out what you need, you can re-use it for any other
project. And it solves the problem you have here.
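Those rules look roughly like this in GNU make (a sketch, assuming gcc and sources in the current directory):

```make
SRCS := $(wildcard *.c)
OBJS := $(SRCS:.c=.o)
DEPS := $(OBJS:.o=.d)

# -MMD writes a .d file as a side effect of compiling; -MT lists both
# the .d file and the .o file as targets, so either is remade whenever
# any source or header it depends on changes.
%.o: %.c
	$(CC) -MMD -MT '$(@:.o=.d) $@' -c $< -o $@

# Pull in the recorded dependencies; the leading '-' makes make ignore
# missing .d files on the first, clean build.
-include $(DEPS)
```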
 
 
So, for example, if you have these files:
 
a.h
---
#include "b.h"
 
b.h
---
#define TEST 1
 
c.c
---
#include "a.h"
#include <stdio.h>
 
int main(void) {
printf("Test is %d\n", TEST);
}
 
 
Then "gcc -MD c.c" makes a file
 
c.d
---
c.o: c.c /usr/include/stdc-predef.h a.h b.h /usr/include/stdio.h \
/usr/include/x86_64-linux-gnu/bits/libc-header-start.h \
/usr/include/features.h /usr/include/x86_64-linux-gnu/sys/cdefs.h \
/usr/include/x86_64-linux-gnu/bits/wordsize.h \
...
 
 
Using "gcc -MMD c.c" is more helpful, usually, because it skips the
system includes:
 
c.d
---
c.o: c.c a.h b.h
 
 
But the real trick is "gcc -MMD -MT 'c.d c.o' c.c" :
 
c.d
---
c.d c.o: c.c a.h b.h
 
 
Now "make" knows that the dependency file is also dependent on the C
file and headers.
Ian Collins <ian-news@hotmail.com>: Sep 10 09:51AM +1200

On 09/09/2021 23:40, Bart wrote:
> what would be a trivial-to-build project turns into an exasperating hunt
> for the necessary information which is buried with two levels of
> encryption inside makefiles and scripts.
 
We build binaries for 28 open source projects for four platforms
(includes Windows). All are trivial to build except for openssl, which
is a bitch to build on Windows.
 
> hundreds of lines of makefiles and scripts can actually be reduced down
> to a list of 34 .c files that need to be compiled into the same
> executable; that's it.
 
Reducing down to a list of source files is effectively what build
generators like CMake and premake do. In our case, we use premake to
generate ninja files which ninja build uses to perform a flat compile of
every file that matches the search rules in the project tree.
 
 
> The actual compile-time is not that critical for something that
> hopefully you only do once. It will still be faster than running that
> brain-dead configure script!
 
If you change a top level header, you build a lot and often. Everyone
who pulls your changes also builds a lot.
 
> working, so that you can use a fast compiler, or turn off some options.
> But when you do need the extra speed, a single file gives you
> whole-program optimisation for free.
 
Optimization may not be important initially, but error checking is.
It's not uncommon for embedded products to be unusable unless optimised,
so cross compile builds nearly always default to optimised builds.
 
> practical. But it's also true that a fast compiler can generate up to
> 1-2MB of unoptimised binary code per second per core; so exactly how big
> is the thing you're building?
 
2,600 C++ source files x 3 platforms + Windows.
 
--
Ian.
Bart <bc@freeuk.com>: Sep 10 12:07AM +0100

On 09/09/2021 22:51, Ian Collins wrote:
>> But when you do need the extra speed, a single file gives you
>> whole-program optimisation for free.
 
> Optimization may not be important initially, but error checking is.
 
For development, certainly. But I'm mainly talking about a product or
application someone may want to use, for which, for one reason or
another, a ready-to-run binary is not available.
 
Then, bearing in mind that (1) this should be a finished debugged
application; (2) a process already exists to flatten even an untidy
source repository into a small number (often one), of linear binary
files, it might be possible to take just one step back from that final
stage, and have a flat representation not quite yet committed to a
specific target.
 
All my current language programs take multiple files as input, and
produce a single file as output.
 
While it's no surprise that outputs such as EXE or DLL are single files,
this also applies to OBJ (one for the whole program); ASM (one for the
whole program); C where supported (one for the whole program); PCL (my
new portable IL, again representing an entire program); MA (a specific
amalgamation of the source files of a project).
 
So, getting one-file outputs is a by-product of how I do things, but it
turns out to be useful for not-quite-binary distributions too, and is
likely to be useful also on somewhat larger scales than my own projects.
 
I doubt any real products work like this, or that anyone is actively
working on making non-binary distributions smaller, simpler, faster, or
more foolproof (they are mainly intent on creating bigger products!)
 
People could however give more thought to making things better by
conventional means, and minimising dependencies.
 
> It's
> not uncommon for embedded products to be unusable unless optimised, so
> cross compile builds nearly always default to optimised builds.
 
This might be another characteristic of C++ code, where lots of
boilerplate is generated, and it /needs/ an optimising compiler to
reduce it all down.
 
(I normally keep away from such features in a language, but my new
intermediate language, if set to generate C source, will produce
absolutely appalling code in a subset I call 'Linear-C'. That also
/needs/ an optimising compiler to remove all the redundancy.)