Thursday, February 15, 2018

Digest for comp.lang.c++@googlegroups.com - 13 updates in 4 topics

legalize+jeeves@mail.xmission.com (Richard): Feb 15 06:53PM

[Please do not mail me a copy of your followup]
 
"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com> spake the secret code
 
>However, both g++ 7.1 and Visual C++ 2017 fail to implement a proper
>`constexpr` `std::string_view` constructor taking `char const*`.
 
I bumped into this when I made the string view koan. I'm surprised
that g++ got it wrong; I thought I tested it on g++ and got it
working.
 
 
> constexpr auto newline =
> is_windows? string_view{ "\r\n", 2 } : string_view{ "\n", 1 };
> } // namespace os
 
Here's an alternative workaround (untested):
 
namespace os
{
using std::string_view;
using namespace std::string_view_literals;
 
// ...
 
constexpr auto newline = is_windows ? "\r\n"sv : "\n"sv;
}
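
A fuller sketch of that workaround, so it can be checked at compile time. The `is_windows` definition is an assumption (the original elides it) and tests only `_WIN32`. The `sv` literal carries its own length, so the `char const*` constructor is not involved.

```cpp
#include <string_view>

namespace os
{
    using std::string_view;
    using namespace std::string_view_literals;

    // Assumption: _WIN32 alone decides the platform; the original post
    // does not show how is_windows is defined.
#if defined(_WIN32)
    constexpr bool is_windows = true;
#else
    constexpr bool is_windows = false;
#endif

    // Each sv literal already knows its length at compile time.
    constexpr string_view newline = is_windows ? "\r\n"sv : "\n"sv;

    // Fails to compile if the constant is not usable in a constant expression.
    static_assert(newline.size() == (is_windows ? 2u : 1u));
} // namespace os
```

This needs C++17 for both `std::string_view` and the message-less `static_assert`.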
--
"The Direct3D Graphics Pipeline" free book <http://tinyurl.com/d3d-pipeline>
The Terminals Wiki <http://terminals-wiki.org>
The Computer Graphics Museum <http://computergraphicsmuseum.org>
Legalize Adulthood! (my blog) <http://legalizeadulthood.wordpress.com>
jak <please@nospam.ty>: Feb 15 07:55PM +0100

Il 15/02/2018 18:34, James Kuyper ha scritto:
> putchar('\n');
 
> return 0;
> }
 
I'm sorry; my poor English is probably keeping me from understanding
what you disagree with. I took your sample code and it does not behave
differently from mine: when I open the file in text mode and write a
'\n' to it, I get a file that is 2 bytes long on Windows and only 1 on
Linux. The two files contain 0D 0A and 0A respectively. Reopening the
files, this time in binary mode, I read exactly what they contain,
whether with fread, fgetc or fgets, and that tells me which line ending
the operating system uses. Maybe you mean that I'm using the fgets
function inappropriately because the file was opened in binary mode? In
that case you're right, but I thought it would work anyway and I
accepted the imprecision. If I misunderstood, then I apologize.
This program gives the same result as yours:
 
#include <stdio.h>
int main(void)
{
    char str[3];
    // Write a new line in text mode.
    FILE *file = fopen("line_ending.txt", "w");
    putc('\n', file);
    fclose(file);
 
    // Read and hexdump the file in binary mode.
    file = fopen("line_ending.txt", "rb");
    fgets(str, 3, file);
    fputs("Line ending:", stdout);
    for(int c = 0; c < 3; c++)
        printf(" %#x", str[c]);
    putchar('\n');
 
    return 0;
}
legalize+jeeves@mail.xmission.com (Richard): Feb 15 06:55PM

 
Jorgen Grahn <grahn+nntp@snipabacken.se> spake the secret code
 
> [...] (E.g. Python has os.linesep,
>but I suspect it's not used much.)
 
Sometimes you need to preserve the existing line endings. I fought
with Python at one point over this and lost; Python simply insisted on
mangling the data in the file to match the EOL convention of the OS
instead of preserving the EOL convention of the file.
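
One way to preserve a file's own convention, sketched in C++ rather than Python: read the file in binary mode, find the first line terminator in the raw bytes, and reuse it when writing. The helper name `detect_eol` is hypothetical, and the sketch assumes the file uses one convention throughout.

```cpp
#include <string_view>

// Hypothetical helper: returns the first line terminator found in raw bytes
// that were read in binary mode ("\r\n", "\n" or a lone "\r"), or an empty
// view if the data contains no terminator at all.
constexpr std::string_view detect_eol(std::string_view data)
{
    const auto pos = data.find_first_of("\r\n");
    if (pos == std::string_view::npos)
        return {};
    // A '\r' immediately followed by '\n' counts as one two-byte terminator.
    if (data[pos] == '\r' && pos + 1 < data.size() && data[pos + 1] == '\n')
        return data.substr(pos, 2);
    return data.substr(pos, 1);
}
```

The returned view points into `data`, so it is only valid while the original buffer is alive.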
legalize+jeeves@mail.xmission.com (Richard): Feb 15 06:56PM

 
Paavo Helde <myfirstname@osa.pri.ee> spake the secret code
 
>Appears this is not so easy, one should use at least
 
>#if defined(_WIN32) || defined(__CYGWIN__)
 
Well, there's your problem. Don't use cygwin, it's an abomination :)
Paavo Helde <myfirstname@osa.pri.ee>: Feb 15 09:16PM +0200

On 15.02.2018 20:56, Richard wrote:
 
>> Appears this is not so easy, one should use at least
 
>> #if defined(_WIN32) || defined(__CYGWIN__)
 
> Well, there's your problem. Don't use cygwin, it's an abomination :)
 
Well, I don't, I just use it for testing random snippets for c.l.c++ ;-)
 
It was just that the pipe example from jak did not compile in Visual
C++, but compiled with Cygwin g++ and produced 0x0D 0x0A, despite not
having _WIN32 defined.
 
OTOH, writing "\n" into a C++ text ofstream and rereading it with a
binary ifstream produced just 0x0A with the Cygwin g++. Not sure why a
pipe and a file behave differently. If anything, it's just another
reason for avoiding the "text mode".
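
The file experiment can be sketched roughly as follows. This is not the exact code that was run; the function name `raw_newline_bytes` and the file name are assumptions.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>

// Writes a single '\n' through a text-mode ofstream, then returns the raw
// bytes found when the same file is read back through a binary ifstream.
std::string raw_newline_bytes(const std::string& path)
{
    assert(!path.empty());
    {
        std::ofstream out(path);              // text mode is the default
        out << '\n';
    }                                         // closed and flushed here

    std::ifstream in(path, std::ios::binary); // no newline translation
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Calling `raw_newline_bytes("line_ending.txt")` and printing each byte in hex shows what the implementation actually stored for a text-mode newline.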
 
Cheers
Paavo
James Kuyper <jameskuyper@verizon.net>: Feb 15 03:07PM -0500

On 02/15/2018 01:55 PM, jak wrote:
> Il 15/02/2018 18:34, James Kuyper ha scritto:
...
> file was opened in binary mode? in this case you're right but I thought
> it would work anyway and I allowed the imprecision. If I misunderstood,
> then I apologize because I did not understand.
 
Part of the problem is that you probably don't have access to any system
where line endings are indicated in a manner that would cause serious
problems for fgets(). Neither do I, so I can't demonstrate the problem
directly, but I can tell you what I would expect your program to do on
systems using some of the more popular alternatives:
 
> int main(void)
> {
> char str[3];
 
This array is uninitialized.
 
> fputs("Line ending:", stdout);
> for(int c = 0; c < 3; c++)
> printf(" %#x", str[c]);
 
Unless your code filled up the entire array (which is possible), you're
printing at least one uninitialized value. For character types, that's
safe, but in general that's something to be avoided. I'll ignore the
uninitialized values in my discussion below.
 
> putchar('\n');
 
> return 0;
> }
 
Let's consider what your code would do in systems with a variety of
different ways of representing the end of a line.
 
1. End of line is indicated by a newline character
Your program would cause str[0] to end up containing '\n', while str[1]
would contain a terminating '\0'.
 
2. End of line is indicated by "\n\r".
In text mode, fgets() would read both the '\n' and the '\r', but would
convert them into a single '\n' before storing it in str[0], the same as
in case 1.
However, since you're using binary mode, fgets would stop after reading
in '\n', storing it in str[0], leaving the '\r' unread, and would place
a terminating '\0' in str[1], exactly the same as in text mode.
Therefore, it would NOT have read the entire line.
 
3. End of line is indicated by "\r\n".
In text mode, fgets would read both the '\r' and the '\n', but would
convert them into a single '\n' before storing it in str[0].
However, in binary mode, fgets() would simply read in the '\r' directly
into str[0], then reading the '\n' into str[1], at which point it would
stop and place a terminating '\0' in str[2]. In this case, it would read
the entire line.
 
4. End of line is indicated by '\r'.
In text mode, fgets() would read in the '\r' and convert it to '\n'
before storing it in str[0].
However, in binary mode, it would read the '\r' and then reach the end
of the file without having read in a newline character. It would store
the '\r' in str[0] and a terminating '\0' in str[1], and return str;
fgets() returns a null pointer and leaves the array unchanged only if
end of file is reached before any characters have been read. So it
would report '\r' without ever having seen a newline character. Your
code doesn't check the return value of fgets() - to be fair, I
suppressed all error checking in my code, too.
 
5. Text files are stored in fixed-size blocks of length 256 bytes, with
the first byte of each block indicating how many bytes of that block are
used. The blocks are padded with blanks. The result of writing a '\n'
character to a file is a block containing a use-count of 0 followed by
255 blank characters.
In text mode, fgets() would read in the entire block, notice that the
use count is 0, and write only a single '\n' to str[0], and a
terminating '\0' to str[1].
However, in binary mode, it would read the use count of 0 into str[0],
and the first padding blank character into str[1]. It would NEVER read
in a new-line character from a text file. It would, however, reach the
limit of 3 characters that you gave fgets(), so it would insert a
terminating '\0' in str[2], without having read in the entire line.
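
Cases 1 to 4 can be tried on an ordinary system by writing the bytes in binary mode, so that nothing is translated, and then reading them back with fgets() exactly as jak's program does. The helper names below are hypothetical. For a lone '\r', the C standard has fgets() return a null pointer only when end of file arrives before any character has been read, so the sketch expects the '\r' to be stored.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: stores `bytes` verbatim in a temporary binary file,
// then reads them back with fgets(str, 3, file), as jak's program does.
// Returns what fgets() stored, or "<null>" if it returned a null pointer.
std::string fgets_binary(const std::string& bytes)
{
    std::FILE* file = std::tmpfile();         // binary update mode ("wb+")
    if (file == nullptr)
        return "<tmpfile failed>";

    std::fwrite(bytes.data(), 1, bytes.size(), file);
    std::rewind(file);

    char str[3] = {};                         // initialized, unlike the original
    const char* result = std::fgets(str, sizeof str, file);
    std::fclose(file);
    return result ? std::string(str) : std::string("<null>");
}

// What binary-mode fgets() stores for each line-ending convention.
void check_cases()
{
    assert(fgets_binary("\n") == "\n");       // case 1: whole line read
    assert(fgets_binary("\n\r") == "\n");     // case 2: '\r' left unread
    assert(fgets_binary("\r\n") == "\r\n");   // case 3: whole line read
    assert(fgets_binary("\r") == "\r");       // case 4: EOF after one char
    assert(fgets_binary("") == "<null>");     // EOF before any char: null
}
```

Case 5 is left out because it depends on a record-oriented file system that the sketch cannot reproduce.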
 
Exercise for the student: If text files are stored in fixed-size blocks
of length 256, with the end of a line indicated by padding the rest of
the block with null characters, what would jak's program do?
Vir Campestris <vir.campestris@invalid.invalid>: Feb 15 09:24PM

On 15/02/2018 05:09, Robert Wessel wrote:
> will be a sequence of records).
 
> Unfortunately it's implementation dependent whether a stringstream pays
> any attention to binary mode, so that's not an option.
 
I last used a system with files that are sequences of records in the
1980s - and I didn't run C on it.
 
I suspect that writing a newline in text, then reading back in binary,
would give you nothing at all. But I'm not sure.
 
It's kind of implicit in the design of C I/O that files are _not_
sequences of records (let alone clever things like ISAM and HRAM files!).
 
And it's implicit in the design of record-based filesystems that you
don't read a byte, or a char, or any such thing - you read a record.
 
Then process that.
 
Andy
--
Yes, I do mean the 1980s. Early 80s, so I won't be surprised to be out
of date. And curiously when I search for ISAM and HRAM Google tells me
about Islam...
legalize+jeeves@mail.xmission.com (Richard): Feb 15 09:49PM

 
Vir Campestris <vir.campestris@invalid.invalid> spake the secret code
 
>Yes, I do mean the 1980s. Early 80s, so I won't be surprised to be out
>of date. And curiously when I search for ISAM and HRAM Google tells me
>about Islam...
 
Yeah, you have to go to Wikipedia, otherwise search engines helpfully
"correct" your spelling into something you didn't search for.
 
<https://en.wikipedia.org/wiki/ISAM>
 
Around the same time as you, I was using RSTS/E on a PDP-11 and they
had "Record Management Services (RMS)" for record-based file I/O.
<https://en.wikipedia.org/wiki/Record_Management_Services>
Paavo Helde <myfirstname@osa.pri.ee>: Feb 15 06:31AM +0200

> How do I get the OS-dependent newline character(s) and store them in a variable?
 
> const char newline[]=std::endl; // this line shows the intention but doesn't work
 
In practice (assuming the old classic Mac OS is derelict):
 
#ifdef _WIN32
const char newline[]="\r\n";
#else
const char newline[]="\n";
#endif
