Wednesday, April 3, 2019

Digest for comp.lang.c++@googlegroups.com - 25 updates in 7 topics

"Rick C. Hodgin" <rick.c.hodgin@gmail.com>: Apr 03 10:55AM -0400

In developing a new IDE and general compiler philosophy, it's occurred
to me that I wanted a way to keep the compiler settings with the
source files, rather than having them in a project settings file.
 
That way any compiler that was able to parse the source file could
parse the settings as well, allowing multiple tools to work natively
on the source files without having to know how to parse a particular
project file or solution (as in Visual Studio).
 
I was wondering what people think of this?
 
-----
In general, it would be a block at the top (included within the
comment characters) so older compilers would simply ignore it:
 
/*compiler settings {
}*/
 
The IDE would read this section, load the input, set the various
flags, and present a screen which allows those settings to be viewed,
along with all the other settings that are available. Clicking and
setting things beyond their default values would write them to that
section.
 
The source file would maintain the settings within that block, but
the IDE wouldn't show them inline; it would separate out that block
and present it in its own tab or as an option when chosen. When
saving the source file back to disk, both that block and the normal
source code would be written.
 
In this way, the settings always go with the file. Options are
encoded for 32-bit and 64-bit compile targets, different ISAs,
different pre- and post-build events to copy input or output files
or run secondary processing, etc.
 
Example:
 
/*compiler_settings {
    char  = 8;        // Bits
    short = 16;
    int   = 32;
    long  = 64;

    requires   = 3.4; // Some version of a standard
    code align = 16;  // Paragraph alignment on code
    data align = 1;   // Byte alignment on data
    extern     = "C"; // Use unmangled exports

    // Generic build settings here
    build lib {
    }
    build dll {
    }
    build exe {
    }

    // Special settings here for 32-bit x86 code
    target x86 {
        build lib {
        }
        build dll {
        }
        build exe {
        }
    }

    // Special settings here for 64-bit x86 code
    target x64 {
    }

    // Special settings here for 32-bit ARM code
    target arm-32 {
    }

    // Special settings here for 64-bit ARM code
    target arm-64 {
    }

    // Et cetera
}*/
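
As a rough illustration of the tooling side (the marker string and the
function names here are placeholders, not a finished design), a tool
could peel that block off before handing the rest to an ordinary
compiler:

    #include <fstream>
    #include <iostream>
    #include <sstream>
    #include <string>

    // Sketch only: extract the leading /*compiler_settings { ... }*/ block
    // from a source file so a separate tool can parse it.
    std::string extract_settings_block(const std::string& path)
    {
        std::ifstream in(path);
        std::ostringstream ss;
        ss << in.rdbuf();
        const std::string text = ss.str();

        const std::string open = "/*compiler_settings";
        const auto begin = text.find(open);
        if (begin == std::string::npos)
            return "";                      // no settings block present
        const auto end = text.find("*/", begin);
        if (end == std::string::npos)
            return "";                      // unterminated comment; ignore it
        return text.substr(begin + open.size(), end - (begin + open.size()));
    }

    int main(int argc, char* argv[])
    {
        if (argc > 1)
            std::cout << extract_settings_block(argv[1]) << '\n';
    }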
 
Any thoughts?
 
--
Rick C. Hodgin
Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Apr 03 05:21PM +0100

On 03/04/2019 15:55, Rick C. Hodgin wrote:
 
>         // Et cetera
>     }*/
 
> Any thoughts?
 
Most settings are platform specific, so it would be foolish to put them in
the source file; and the sizes of the fundamental types aren't settings at
all but platform/hardware-specific implementation details, so it makes even
less sense for them to be in the source file.
 
A) You are trying to solve a problem with a deeply flawed solution.
B) You are trying to solve a problem that no longer exists with the advent
of CMake and such.
 
Makefiles have always been separate from the source files that are built,
and with good reason: we want a separation of concerns, keeping WHAT we
are building apart from HOW we are building it.
 
/Flibble
 
--
"You won't burn in hell. But be nice anyway." – Ricky Gervais
 
"I see Atheists are fighting and killing each other again, over who
doesn't believe in any God the most. Oh, no..wait.. that never happens." –
Ricky Gervais
 
"Suppose it's all true, and you walk up to the pearly gates, and are
confronted by God," Bryne asked on his show The Meaning of Life. "What
will Stephen Fry say to him, her, or it?"
"I'd say, bone cancer in children? What's that about?" Fry replied.
"How dare you? How dare you create a world to which there is such misery
that is not our fault. It's not right, it's utterly, utterly evil."
"Why should I respect a capricious, mean-minded, stupid God who creates a
world that is so full of injustice and pain. That's what I would say."
Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Apr 03 05:24PM +0100

On 03/04/2019 17:21, Mr Flibble wrote:
 
> Makefiles have always been separate from the source files that are built,
> and with good reason: we want a separation of concerns, keeping WHAT we
> are building apart from HOW we are building it.
 
Oh and I forgot to mention that your idea of using comments to do such
things is egregious: we have #pragma for a reason.
 
/Flibble
 
"Rick C. Hodgin" <rick.c.hodgin@gmail.com>: Apr 03 01:09PM -0400

On 4/3/2019 12:21 PM, Mr Flibble wrote:
> source file and the sizes of the fundamental types aren't settings at all but
> platform/hardware specific implementation details so it makes even less sense
> for them to be in the source file.
 
Specifying the required size of each fundamental type allows the compiler
to generate a warning if the platform's default sizes are not those sizes.
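
For comparison, the size requirement by itself can already be stated
portably in the source with static_assert; a minimal sketch (long long
used for the 64-bit case, since long is 32 bits on some common platforms):

    #include <climits>

    // Sketch only: fail the build if the fundamental types don't have the
    // sizes this code assumes.
    static_assert(CHAR_BIT == 8, "char must be 8 bits");
    static_assert(sizeof(short) * CHAR_BIT == 16, "short must be 16 bits");
    static_assert(sizeof(int) * CHAR_BIT == 32, "int must be 32 bits");
    static_assert(sizeof(long long) * CHAR_BIT == 64, "long long must be 64 bits");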
 
 
> Makefiles have always been separate from the source files that are built,
> and with good reason: we want a separation of concerns, keeping WHAT we
> are building apart from HOW we are building it.
 
This solution provides for a new makefile format:
 
file.obj: file1.cpp file2.cpp file3.cpp
data.obj: data1.cpp data2.cpp
 
file.exe: file.obj data.obj
 
The developer goes to a single source location to edit their code and
settings (the IDE provides it for them).
 
Much simpler.
 
--
Rick C. Hodgin
"Rick C. Hodgin" <rick.c.hodgin@gmail.com>: Apr 03 01:18PM -0400

On 4/3/2019 12:24 PM, Mr Flibble wrote:
> Oh and I forgot to mention that your idea of using comments to do such
> things is egregious: we have #pragma for a reason.
 
I don't know of an efficient way to do a multi-line #pragma. And I'm
not particularly keen on having a block of #pragma keywords.
 
In CAlive, the syntax would be:
 
compiler settings
{
    // Settings here
}

linker settings
{
    // Settings here
}
 
I suggest the commented /*compiler settings {..} linker settings {..}*/
syntax for backward compatibility with those systems that still use
legacy CMake builders, for example.
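
For reference, the block-of-#pragmas alternative would presumably look
something like the following; the pragma name here is purely
hypothetical, no existing compiler defines it:

    // Hypothetical pragmas only -- no existing compiler defines these.
    #pragma compiler_setting(requires, 3.4)
    #pragma compiler_setting(code_align, 16)
    #pragma compiler_setting(data_align, 1)
    #pragma compiler_setting(extern_style, "C")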
 
--
Rick C. Hodgin
"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Apr 03 04:58AM +0200

On 02.04.2019 17:33, Alf P. Steinbach wrote:
> definition `*(*(arr + 0) + 5)` which is equivalent to `*(*arr + 5)`.
 
> And `arr[1][0]` is by definition `*(arr[1] + 0)` which in turn is by
> definition `*(*(arr + 5) + 0)` which is `*(*arr + 5)`.
 
Ouch, sorry about the typo.
 
`arr[1][0]` is by definition `*(arr[1] + 0)` which in turn is by
definition `*(*(arr + 1) + 0)`,
 
which is equivalent to `*((*arr + 1×5) + 0)` (with "(*" not "*("), which
is `*(*arr + 5)`
 
To understand how that result expression works:
 
1) `arr` decays to pointer to first sub-array.
2) `*arr` forms an lvalue referring to that whole sub-array.
3) That sub-array expression decays to pointer to first `int` item.
4) The compiler adds 5 to that pointer.
5) The result, a pointer one past the end of the first sub-array, is dereferenced.
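
For what it's worth, a tiny program shows the two expressions naming the
same location on a typical implementation (whether the access through the
second form is well-defined is of course exactly what is in dispute):

    #include <cstdio>

    int main()
    {
        int arr[2][5] = {};

        // Same address on any common implementation; whether dereferencing
        // *(*arr + 5) is well-defined is the question being argued here.
        std::printf("&arr[1][0]: %p\n", static_cast<void*>(&arr[1][0]));
        std::printf("*arr + 5  : %p\n", static_cast<void*>(*arr + 5));
    }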
 
You have argued that this, the definition of the indexing, is Undefined
Behavior, possibly on the grounds that the derivation step covered by my
typo (marked above by a blank line) is not specified directly by the
standard; and by interpreting the standard's
 
"point to elements of the same array object"
 
as not referring to a multidimensional array object, because you
maintained that the standard had no notion of such.
 
I showed by direct quote that the standard does indeed use that term and
have that notion, and it constitutes a much more reasonable interpretation.
 
If it hadn't then filing a Defect Report would IMO be in order.
 
 
> On which platform are these expressions not equivalent?
 
> I.e., when you say it holds on "most platforms", on which platform does
> the equivalence not hold?
 
 
Cheers!,
 
- Alf
Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Mar 31 10:05PM +0100

On 31/03/2019 21:35, Bonita Montero wrote:
 
> That shouldn't be a problem since they could partially point to the
> shifted elements. That could be guaranteed by the standard without any
> restrictions on the implementation.

And you had the nerve to call my capabilities into question. Invalid
iterators should never be used even if they still "point" to valid objects
as this is undefined behaviour according to the standard.
 
/Flibble
 
Ben Bacarisse <ben.usenet@bsb.me.uk>: Apr 03 11:41AM +0100


>>> Consider int arr[5][5]. Now accessing arr[0][5] is one kind of
>>> buffer overflow and it is bad that it is equivalent to accessing
>>> arr[1][0] on most platforms.
<cut>
 
 
> "point to elements of the same array object"
 
> as not referring to a multidimensional array object, because you
> maintained that the standard had no notion of such.
 
Given int arr[5][5]; the expression arr[0][5] is a special case because
constructing a pointer "just past" the end of an array (or just after a
single non-array object) is specifically defined.
 
To take a more clear-cut example, is it your view that arr[0][6] is also
well-defined in C++ and that it corresponds to arr[1][1]? The wording
looks similar to that of the C standard, and it is generally regarded as
undefined in C, though it will usually work of course.
 
The problem with relying on the "elements of the same array object"
wording is that in an 'int arr[5][5]' there are only two plausible
arrays that that text could be referring to. One is arr itself, which
has 5 elements. The other is arr[0], which also has 5 elements.
Neither of these has a 7th element (indexed by 6).
 
The other problem with that wording is that it only applies to
subtracting two pointers. The wording that explains P + N says:

"if the expression P points to the i-th element of an array object,
the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the
value n) point to, respectively, the i + n-th and i − n-th elements of
the array object, provided they exist."
 
This is why I explained the problem in terms of the number of elements.
There is no array that has enough elements in this example.
 
--
Ben.
jameskuyper@alumni.caltech.edu: Apr 03 06:51AM -0700

On Monday, April 1, 2019 at 6:18:35 PM UTC-4, Alf P. Steinbach wrote:
> consecutive items, no padding, in the multidimensional array.
 
> Perhaps you were not aware?
 
> Or if you were aware of that, can you give an example of the UB way?
 
int array[3][4];
int *p = array[1]; // pointer to the first element of the second row
p[5] = 6;          // indexes past the end of array[1]: undefined behaviour
jameskuyper@alumni.caltech.edu: Apr 03 07:00AM -0700

On Tuesday, April 2, 2019 at 5:58:30 AM UTC-4, Alf P. Steinbach wrote:
> On 02.04.2019 10:49, Öö Tiib wrote:
...
> > and global indexing by other means in it is not defined.
 
> That sounds incorrect but it depends on what you mean by "global indexing."
 
It's not a well-defined term, but in context I would assume he's referring
to the use of x[0] to access elements of x[i] where i>0.
 
> > and [] refer) does explicitly tell that it is undefined behavior to
> > go any farther than from (p + 0) to (p + N) with it.
 
> No, it doesn't.
 
"If the expression P points to element x[i] of an array object x with n
elements, the expressions P + J and J + P (where J has the value j)
point to the (possibly-hypothetical) element x[i + j]
if 0 <= i + j <= n; otherwise, the behavior is undefined."
 
p-1 and p+N+1 both violate the condition near the end of that sentence,
and therefore would have undefined behavior. The standard makes no
exception from that rule for the special case where the array x is
itself an element of another array.
jameskuyper@alumni.caltech.edu: Apr 03 08:10AM -0700

On Tuesday, April 2, 2019 at 7:23:40 AM UTC-4, Alf P. Steinbach wrote:
> On 02.04.2019 12:59, Öö Tiib wrote:
...
 
> Do you agree that for the abstract machine defined by the standard, the
> history of how a valid pointer value was computed does not matter if
> that history did not involve UB (as it cleary does not for (P)+1)?
 
No. The history of how a valid pointer was computed can affect what it
is legal to do with it.
 
...
> > that (p + N + 1) is explicitly told to be undefined behavior.
 
> Not if you require (P+2) to be equivalent to ((P+1)+1), which is well
> defined.
 
Comparing equal doesn't mean that they're required to be equivalent,
only that they're required to represent the same address in memory.
 
The fact that the standard says the behavior is undefined gives
implementors permission to, for instance, create heavy pointers that
keep a record of start and end of the array from which they are
obtained, and to cause problems if an attempt is made to add or subtract
from them a number that puts them outside the valid range, or to
dereference them when they point one past the end of that range. That
would incur an enormous performance penalty, but some compilers have a
mode where they enable such a feature for debugging purposes. Because
the behavior is undefined, turning on that option does not render the
implementation non-conforming.
 
More realistically, an implementation is allowed to, for example, look
at the expressions array[0][i] and array[1][j], and assume that neither
expression will be evaluated in a context where i and j have values that
are out of range, and that it is therefore unnecessary to consider the
possibility that they alias the same location. Since the behavior would
be undefined in those cases, ignoring that possibility would not render
the implementation non-conforming.
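
A rough illustration of that last point (my own sketch, not wording from
the standard):

    // Given int array[2][5]: an optimizer that assumes i and j stay within
    // their own rows may treat array[0][i] and array[1][j] as non-aliasing,
    // and so may fold the returned value to the constant 1.
    int probe(int (&array)[2][5], int i, int j)
    {
        array[0][i] = 1;
        array[1][j] = 2;
        return array[0][i];
    }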
 
> But we can reason about the standard imposing requirements such as P+2
> having to be split up in two operations (P+1)+1, and so on, that for
 
Such a re-write does nothing to avoid either of the possibilities I
mentioned above. If p+2 would trigger a bounds-check, then so would
(p+1)+1. array[0]+(N-1)+1 might compare equal to array[1], but unlike
the second expression, it would be undefined behavior to use the first
one to access the value stored in that location, so an implementation need
not consider the possibility that the two expressions alias each other.

> at the technical cost of the inefficiency of "fat pointers", and the
> higher level cost of breaking a really large amount of existing code,
> and introducing the notion that the history of a pointer matters.
 
No, that notion was introduced a long time ago, when a similar clause was
first written into C90; C++ merely inherited the notion.
jameskuyper@alumni.caltech.edu: Apr 03 09:21AM -0700

On Tuesday, April 2, 2019 at 9:57:14 AM UTC-4, Alf P. Steinbach wrote:
> On 02.04.2019 14:48, Öö Tiib wrote:
...
 
> So, say you do P2 = P+1, in the case where P2 is guaranteed to compare
> equal to a pointer to the first element of the next inner array.
 
> Is the history, that P2 was computed as P+1, forgotten at some point?
 
When the lifetime of P2 ends, or when it is assigned a new value.
 
> Or will it be UB to dereference the stored pointer value in P2, just as
> it is with the expression P+1?
 
Yes.
 
> P2 points to the first item in an array,
 
It points one past the end of an array, and compares equal to a pointer
that points at the first element of the next array, but it has undefined
behavior if used to actually access that element.
 
> ... but as I understand it you mean
> that it's UB to form P2+1, and well-defined to form P2-1, because it
> /came from/ an earlier part; is that reasonable, do you think?
 
Correct.
 
> Or can your argument be applied to P2 also, that no arithmetic
> whatsoever can be done with it?
 
Incorrect.
 
> Can't go forward because it came from previous inner array. ...
 
Correct.
 
> ... Can't go
> backward
 
Incorrect, because your claim that
 
> because it's at the start of an array.
 
is also incorrect.
 
...
 
> But think about it.
 
> The nice checking processor can't prevent me from traversing the array
> one step at a time, which is indisputably well-defined.
 
I dispute it.
 
> All this extra hardware and fat pointer overhead is surely what the C++
> committee had in mind, a small cost indeed to pay for detecting some
> programmers' bad practices that give buffer overruns.
 
What they had in mind was not to do that kind of processing. If you need
to do something like that, define a single large array, and then use
slicing to simulate multi-dimensionality, using the same techniques as
std::valarray.
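
A minimal sketch of that approach, using a flat buffer with computed
indices (std::valarray with std::gslice would serve the same purpose):

    #include <cstddef>
    #include <vector>

    // Sketch: one large allocation, with 2-D indexing computed by hand, so
    // every element access stays inside a single array object.
    class Grid
    {
    public:
        Grid(std::size_t rows, std::size_t cols)
            : cols_(cols), data_(rows * cols) {}

        int& operator()(std::size_t r, std::size_t c)
        { return data_[r * cols_ + c]; }

        const int& operator()(std::size_t r, std::size_t c) const
        { return data_[r * cols_ + c]; }

    private:
        std::size_t cols_;
        std::vector<int> data_;
    };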
 
...
> > I wrote about it above how I think that it is meant.
 
> No, you didn't.
 
You're saying he's lying about the reason he had for writing that?
 
 
> There are no buffer overflows in code to iterate through a
> multidimensional array.
 
> The wording it appears that you focus on is for individual arrays.
 
It's for arrays in general, whether complete objects in their own right,
or an array whose elements are themselves arrays, or an array that's an
element of another array.
 
...
> >>>> It's just a sabotage meme originating with some unreasoning socially
> >>>> oriented first-year students that had a Very Ungood Teacher™ (VUT™).
 
> >>>> Or at least that's my theory. :)
 
Would you consider that an accurate description of the members of the C
committee? (I mention the C committee rather than the C++ committee only
because I know where to find a document in which the C committee has
addressed this question - I'm not certain whether the C++ committee has
ever felt a need to make its own distinct decision on this matter.)
 
Defect Report #017 dated 10 Dec 1992 to C89:
> following code has undefined behavior:
 
> int a[4][5];
 
> a[1][7] = 0; /* undefined */
 
When C++ was standardized in 1998, it contained essentially the same
wording for all of the relevant clauses as was used in C. They didn't
make any changes to justify concluding that DR 017 didn't also apply to
C++. To the best of my knowledge, neither committee has ever reversed
the decision of DR 017.
Horizon68 <horizon@horizon.com>: Apr 03 06:26AM -0700

Hello..
 
 
We're about to take a big step back to the centralized/controlled past
 
The end of the desktop?
 
Read more here:
 
https://www.computerworld.com/article/3384713/the-end-of-the-desktop.html
 
 
 
Thank you,
Amine Moulay Ramdane.
Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Mar 30 09:27PM

My universal compiler will cause a technological singularity.
 
https://neos.dev
 
/Flibble
 
James Kuyper <jameskuyper@alumni.caltech.edu>: Mar 28 08:21PM -0400

On 3/28/19 13:25, Paavo Helde wrote:
> of the number of seconds since 00:00 January 1, 1970 (a C run-time
> time_t value), that indicates when the file was created."
 
> I do not see the high 32 bits of the time_t value stored anywhere.
 
I think they're assuming that you can make a reasonable guess as to what
they are. After all, they're going to remain 0 until 2106-02-07T06:28:16
<https://www.timeanddate.com/date/dateadded.html?m1=01&d1=01&y1=1970&type=add&ay=&am=&aw=&ad=&h1=&i1=&s1=&ah=&ai=&as=4294967296&rec=>
Bonita Montero <Bonita.Montero@gmail.com>: Mar 28 06:31PM +0100

> :-) Whatever timestamp fir posted It is unlikely time_t like you describe.
 
I just checked a PE built by myself and converted the hex-value
given by dumpbin ("5C9CBC41 time date stamp Thu Mar 28 13:21:21
2019") to a decimal value with calc.exe and put it into:
http://www.onlineconversion.com/unix_time.htm
And I got exactly the date and time the EXE was built.
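
For reference, pulling that field out of the file directly takes only a
few lines; a sketch, assuming a little-endian host (per the PE layout,
e_lfanew sits at offset 0x3C and the 4-byte TimeDateStamp sits 8 bytes
past the "PE\0\0" signature it points to):

    #include <cstdint>
    #include <cstdio>
    #include <ctime>
    #include <fstream>

    // Sketch: read the COFF TimeDateStamp of a PE file and print it as a date.
    int main(int argc, char* argv[])
    {
        if (argc < 2) return 1;
        std::ifstream f(argv[1], std::ios::binary);

        std::uint32_t e_lfanew = 0;                 // offset of "PE\0\0"
        f.seekg(0x3C);
        f.read(reinterpret_cast<char*>(&e_lfanew), 4);

        std::uint32_t stamp = 0;                    // seconds since 1970
        f.seekg(e_lfanew + 8);
        f.read(reinterpret_cast<char*>(&stamp), 4);

        std::time_t t = static_cast<std::time_t>(stamp);
        std::printf("TimeDateStamp = 0x%08lX  %s",
                    static_cast<unsigned long>(stamp), std::ctime(&t));
    }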
Robert Wessel <robertwessel2@yahoo.com>: Mar 28 08:45PM -0500

On Thu, 28 Mar 2019 19:25:34 +0200, Paavo Helde
 
>I do not see the high 32 bits of the time_t value stored anywhere. Maybe
>the TimeDateStamp field is meant just as a convenient label for telling
>apart different versions of the file?
 
 
IIRC, .NET executables do something funky with the TimeDateStamp
field.
Bonita Montero <Bonita.Montero@gmail.com>: Mar 28 05:49PM +0100

The timestamp is a time_t value, i.e. the number of seconds elapsed
since 1.1.1970.
So we are not off-topic here because this isn't Windows-specific. ;-)
Paavo Helde <myfirstname@osa.pri.ee>: Mar 28 07:25PM +0200

On 28.03.2019 18:49, Bonita Montero wrote:
> The timestamp is a time_t value, i.e. the number of seconds elapsed
> since 1.1.1970.
> So we are not off-topic here because this isn't Windows-specific. ;-)
 
The TimeDateStamp in the COFF header is documented as "The low 32 bits
of the number of seconds since 00:00 January 1, 1970 (a C run-time
time_t value), that indicates when the file was created."
 
I do not see the high 32 bits of the time_t value stored anywhere. Maybe
the TimeDateStamp field is meant just as a convenient label for telling
apart different versions of the file?
Paavo Helde <myfirstname@osa.pri.ee>: Mar 28 01:02PM +0200

On 28.03.2019 11:01, Juha Nieminen wrote:
>> If file reading is a performance bottleneck then one should use mmap
>> instead.
 
> In which version of the C++ standard was mmap introduced?
 
That's what I said. You care about other things more than about the
performance. The other things appear to be standard conformance and
convenience (reading the whole file in one go).
 
There is nothing wrong with these preferences; that's a perfectly fine
approach, but then it sounds a bit silly to complain about the time
wasted on std::vector initialization.
 
I just made a little performance test, reading a 2.3 GB file and summing
all its bytes. The results are here:
 
large vector: 1.55176 s
large new[] : 1.40286 s, 9.59564 % win
small vector: 0.768879 s, 50.4511 % win
small new[] : 0.759881 s, 51.031 % win
mmap : 0.46249 s, 70.1958 % win
 
Here, large means the whole file read into a single buffer, and small
means a 16k buffer.
 
IIRC your approach was "large vector" (read the whole file into a
std::vector). So, using an uninitialized buffer with new[] would win ca
10% in this task (that's much more than I expected, must be because the
file is already in OS caches). That's the overhead you complained about.
 
However, by using a smaller buffer and thus reducing stress on the memory
allocator you can win 50% instead, fully standard-conformant.
 
And finally, if you care about performance more than having pure
standard-conforming code, then you can use memory mapping and win about 70%.
 
 
Code follows (Windows-only, no error checks, sorry):
 
#include <iostream>
#include <numeric>
#include <string>
#include <vector>     // std::vector buffers below
#include <functional>
#include <chrono>
#include <algorithm>
#include <cstdio>     // fopen/fread/fclose
#include <io.h>
#include <Windows.h>
 
int main() {
 
std::string filename = "D:/test/columbus/Case 00647038.zip";
unsigned int x1, x2, x3, x4, x5;
 
// put mmap first to warm caches up and still win
auto start3 = std::chrono::steady_clock::now();
{
HANDLE h = ::CreateFileA(filename.c_str(), GENERIC_READ,
FILE_SHARE_READ|FILE_SHARE_WRITE, NULL, OPEN_EXISTING,
FILE_ATTRIBUTE_NORMAL, NULL);
LARGE_INTEGER li;
::GetFileSizeEx(h, &li);
size_t n = li.QuadPart;
HANDLE m = ::CreateFileMapping(h, NULL, PAGE_READONLY, 0, 0, NULL);
unsigned char* view = static_cast<unsigned char*>(::MapViewOfFile(m,
FILE_MAP_READ, 0, 0, n));
x3 = std::accumulate(view, view+n, 0u);
::UnmapViewOfFile(view);
::CloseHandle(m);
::CloseHandle(h);
}
auto finish3 = std::chrono::steady_clock::now();
 
auto start1 = std::chrono::steady_clock::now();
{
FILE* f = fopen(filename.c_str(), "rb");
size_t n = _filelengthi64(fileno(f));
std::vector<unsigned char> a(n);
fread(a.data(), 1, n, f);
fclose(f);
x1 = std::accumulate(a.begin(), a.end(), 0u);
 
}
auto finish1 = std::chrono::steady_clock::now();
 
auto start2 = std::chrono::steady_clock::now();
{
FILE* f = fopen(filename.c_str(), "rb");
size_t n = _filelengthi64(fileno(f));
unsigned char* b = new unsigned char[n];
fread(b, 1, n, f);
x2 = std::accumulate(b, b+n, 0u);
delete[] b;
fclose(f);
}
auto finish2 = std::chrono::steady_clock::now();
 
auto start4 = std::chrono::steady_clock::now();
{
FILE* f = fopen(filename.c_str(), "rb");
size_t n = _filelengthi64(fileno(f));
const size_t bufferSize = 4*4096;
std::vector<unsigned char> a(bufferSize);
x4 = 0;
while (true) {
size_t k = fread(a.data(), 1, bufferSize, f);
x4 = std::accumulate(a.data(), a.data()+k, x4);
if (k<bufferSize) {
break;
}
}
fclose(f);
}
auto finish4 = std::chrono::steady_clock::now();
 
auto start5 = std::chrono::steady_clock::now();
{
FILE* f = fopen(filename.c_str(), "rb");
size_t n = _filelengthi64(fileno(f));
const size_t bufferSize = 4*4096;
unsigned char* a = new unsigned char[bufferSize];
x5 = 0;
while (true) {
size_t k = fread(a, 1, bufferSize, f);
x5 = std::accumulate(a, a+k, x5);
if (k<bufferSize) {
break;
}
}
delete[] a;
fclose(f);
}
auto finish5 = std::chrono::steady_clock::now();
 
auto dur1 =
std::chrono::duration_cast<std::chrono::duration<double>>(finish1-start1);
auto dur2 =
std::chrono::duration_cast<std::chrono::duration<double>>(finish2-start2);
auto dur3 =
std::chrono::duration_cast<std::chrono::duration<double>>(finish3-start3);
auto dur4 =
std::chrono::duration_cast<std::chrono::duration<double>>(finish4-start4);
auto dur5 =
std::chrono::duration_cast<std::chrono::duration<double>>(finish5-start5);
 
std::cout << "mmap : " << dur3.count() << " s, " <<
100.0*(dur1.count()-dur3.count())/dur1.count() << " % win\n";
std::cout << "large vector: " << dur1.count() << " s\n";
std::cout << "large new[] : " << dur2.count() << " s, " <<
100.0*(dur1.count()-dur2.count())/dur1.count() << " % win\n";
std::cout << "small vector: " << dur4.count() << " s, " <<
100.0*(dur1.count()-dur4.count())/dur1.count() << " % win\n";
std::cout << "small new[] : " << dur5.count() << " s, " <<
100.0*(dur1.count()-dur5.count())/dur1.count() << " % win\n";
 
if (x1!=x2 || x1!=x3 || x1!=x4 || x1!=x5) {
std::cerr << "Something wrong\n";
}
return x1-x2;
}
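
For the curious, the POSIX flavour of the mmap branch would look roughly
like this (untested sketch, error checks omitted as above):

    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>
    #include <numeric>

    // Sketch of the memory-mapped read-and-sum for POSIX systems.
    unsigned sum_file_mmap(const char* filename)
    {
        int fd = ::open(filename, O_RDONLY);
        struct stat st;
        ::fstat(fd, &st);
        size_t n = static_cast<size_t>(st.st_size);

        unsigned char* view = static_cast<unsigned char*>(
            ::mmap(nullptr, n, PROT_READ, MAP_PRIVATE, fd, 0));
        unsigned sum = std::accumulate(view, view + n, 0u);

        ::munmap(view, n);
        ::close(fd);
        return sum;
    }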
Juha Nieminen <nospam@thanks.invalid>: Mar 28 09:03AM

> And this waste is completely insignificant in this case because file
> access takes orders of magnitudes more time.
 
Why should I be paying that extra time, no matter how "insignificant"
it may be? I thought the design principle of C++ is that you don't have
to pay for what you don't use. In this case I'm not using, at all, the
fact that std::vector zero-initializes its contents when you resize it,
yet I'm still forced to pay for it.
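
For what it's worth, the zeroing can be avoided without leaving the
standard library by asking for default-initialized storage; a sketch
(std::make_unique_for_overwrite is C++20, and plain new unsigned char[n]
gives the same effect in earlier standards):

    #include <cstddef>
    #include <memory>

    // Sketch: a buffer whose bytes are left uninitialized rather than
    // value-initialized, so nothing is paid for zeroing before fread (or
    // istream::read) overwrites it.
    std::unique_ptr<unsigned char[]> make_buffer(std::size_t n)
    {
        return std::make_unique_for_overwrite<unsigned char[]>(n);
    }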
 
--- news://freenews.netfront.net/ - complaints: news@netfront.net ---
Juha Nieminen <nospam@thanks.invalid>: Mar 28 09:01AM

> If file reading is a performance bottleneck then one should use mmap
> instead.
 
In which version of the C++ standard was mmap introduced?
 
--- news://freenews.netfront.net/ - complaints: news@netfront.net ---
Melzzzzz <Melzzzzz@zzzzz.com>: Mar 31 11:23AM


> Umm, because I was hoping to use C++ threading as it's syntactically tidier
> than pthreads, and getting lambdas to work with pthreads would probably be a
> PITA (I don't know, never tried).
it's easy:
eg
#include <functional>
#include <stdio.h>

using std::function;

// The callback side takes a plain void* (as a C API such as pthreads would)
// and casts it back to the std::function it knows was passed in.
void f(void* lambda) {
    function<void(int)>* pf = (function<void(int)>*)lambda;
    (*pf)(5);
}

int main() {
    int j = 5;
    function<void(int)> l = [j](int){ printf("%d\n", j); };
    f(&l);
}
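
Hooking that up to pthread_create itself is barely any more work; a sketch
(compile with -pthread; a std::thread would of course be tidier still):

    #include <pthread.h>
    #include <functional>
    #include <stdio.h>

    // Sketch: the start routine unwraps a std::function passed as void* and
    // runs it, so a capturing lambda can be handed to pthread_create.
    void* trampoline(void* arg) {
        auto* task = static_cast<std::function<void()>*>(arg);
        (*task)();
        return nullptr;
    }

    int main() {
        int j = 5;
        std::function<void()> task = [j]{ printf("%d\n", j); };

        pthread_t tid;
        pthread_create(&tid, nullptr, trampoline, &task);
        pthread_join(tid, nullptr);
    }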
 
 
 
--
press any key to continue or any other to quit...
blt_rHkx@rjrnwk17q3gki8i5ps4p50dlbz.org: Mar 31 10:22AM

On Sun, 31 Mar 2019 09:27:34 GMT
>> shared_mutex seems to be mentioned but the functionality seems different to
>> me, plus it's 2017 only anyway which rules out the compiler I'm using.
 
>Use pthread_rwlock_t, as you don't have c++17 compliant compiler.
 
Umm, because I was hoping to use C++ threading as it's syntactically tidier
than pthreads, and getting lambdas to work with pthreads would probably be a
PITA (I don't know, never tried).
"Chris M. Thomasson" <invalid_chris_thomasson_invalid@invalid.com>: Mar 31 02:02PM -0700

> Is there any equivalent of the pthreads pthread_rwlock_t in C++ threading?
> shared_mutex seems to be mentioned but the functionality seems different to
> me, plus it's 2017 only anyway which rules out the compiler I'm using.
 
Fwiw, you just might be interested in the following thread:
 
https://groups.google.com/d/topic/comp.lang.c++/DBIG55vCBSA/discussion
 
Here is my crude C++ code:
 
https://pastebin.com/raw/1QtPCGhV
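
Fwiw, failing C++17's std::shared_mutex, a thin wrapper over
pthread_rwlock_t gets most of the way there; a rough sketch that exposes
enough of the shared-mutex interface to work with std::unique_lock and
std::shared_lock (the latter needs C++14):

    #include <pthread.h>

    // Rough sketch: wrap pthread_rwlock_t behind the lock/lock_shared
    // interface, so it can be used directly or via std::unique_lock /
    // std::shared_lock without needing std::shared_mutex.
    class rw_mutex
    {
    public:
        rw_mutex()  { pthread_rwlock_init(&lk_, nullptr); }
        ~rw_mutex() { pthread_rwlock_destroy(&lk_); }

        rw_mutex(const rw_mutex&) = delete;
        rw_mutex& operator=(const rw_mutex&) = delete;

        void lock()            { pthread_rwlock_wrlock(&lk_); }
        bool try_lock()        { return pthread_rwlock_trywrlock(&lk_) == 0; }
        void unlock()          { pthread_rwlock_unlock(&lk_); }

        void lock_shared()     { pthread_rwlock_rdlock(&lk_); }
        bool try_lock_shared() { return pthread_rwlock_tryrdlock(&lk_) == 0; }
        void unlock_shared()   { pthread_rwlock_unlock(&lk_); }

    private:
        pthread_rwlock_t lk_;
    };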