Wednesday, December 3, 2014

Digest for comp.lang.c++@googlegroups.com - 24 updates in 6 topics

Christopher Pisz <nospam@notanaddress.com>: Dec 03 02:26PM -0600

I need to replace any occurrence of "\r\n" occurring in an istream with
just "\n".

I could do this by copying the stream contents to a string, then
replacing the occurrences within the string, and creating another stream
object from that string.
 
Is there a more efficient way?
Victor Bazarov <v.bazarov@comcast.invalid>: Dec 03 03:34PM -0500

On 12/3/2014 3:26 PM, Christopher Pisz wrote:
> replacing the occurrences within the string, and creating another stream
> object from that string.
 
> Is there a more efficient way?
 
First you need to ask yourself, what is inefficient about it?
Second, why bother with creating another stream when you can simply
incorporate that action into your stream parser?
 
V
--
I do not respond to top-posted replies, please don't ask
Christopher Pisz <nospam@notanaddress.com>: Dec 03 02:42PM -0600

On 12/3/2014 2:34 PM, Victor Bazarov wrote:
> Second, why bother with creating another stream when you can simply
> incorporate that action into your stream parser?
 
> V
 
 
I figured the copies were inefficient. The docs say that creating a
stringstream from a const string just moves the stringbuf pointer, but
I don't think the reverse works the same way. I could be wrong.
 
But then, I don't even see how to create a string from an istream unless
it was a stringstream and I cast it, but I can't be sure what kind of
stream I am receiving as a parameter.
 
My parser originally just worked with string, but then I made it work
with istream so that I could work with filestream and stringstream both
for on disk or in memory text. I did this just by fudging someone else's
3000 line unreadable chunk of poopy where they had taken a string and
created a stream and then used getline on it. It passed testing there,
so I didn't bother messing with it.
 
Now I come across the windows newline scenario in the data and have to
fix that.
Christopher Pisz <nospam@notanaddress.com>: Dec 03 03:11PM -0600

On 12/3/2014 2:34 PM, Victor Bazarov wrote:
> Second, why bother with creating another stream when you can simply
> incorporate that action into your stream parser?
 
> V
 
Seems very inefficient to be copying string to stream and back again.
Here is a compilable example of what I have:
 
// Standard Includes
#include <iostream>
#include <sstream>
#include <vector>

// Existing function to replace occurrences of a string within a string
std::string ReplaceAllOccurrences(const std::string & original,
                                  const std::string & search,
                                  const std::string & replacement)
{
    std::string temp = original;
    size_t position = 0;

    while((position = temp.find(search, position)) != std::string::npos)
    {
        temp.replace(position, search.length(), replacement);
        position += replacement.length();
    }

    return temp;
}

// Function I am working on
void Test(std::istream & stream)
{
    // Verify the stream is good
    if(!stream)
    {
        // Error - stream was given in an error state
        throw std::exception("Stream to be parsed was given with an error state set");
    }

    // Replace all occurrences of "\r\n" with "\n" so the newline can be
    // handled the same way
    // TODO - This can't be the most efficient way
    stream.seekg (0, stream.end);
    const unsigned length = stream.tellg();
    stream.seekg (0, stream.beg);

    std::string temp('\0', length);
    temp = ReplaceAllOccurrences(temp, "\r\n", "\n");
    std::istringstream formattedStream(temp);

    // Snip the actual work
}

// Test
int main()
{
    std::istringstream testData("Hello\r\nI am a Windows string\r\nBecause I like carriage returns\r\n");
    Test(testData);
}
Christopher Pisz <nospam@notanaddress.com>: Dec 03 03:18PM -0600

On 12/3/2014 3:11 PM, Christopher Pisz wrote:
 
>> V
 
> Seems very inefficient to be copying string to stream and back again.
> Here is a compilable example of what I have:
 
Whoops, left a few mistakes in the former listing.
Corrected code:
 
// Standard Includes
#include <iostream>
#include <sstream>
#include <vector>

// Existing function to replace occurrences of a string within a string
std::string ReplaceAllOccurrences(const std::string & original,
                                  const std::string & search,
                                  const std::string & replacement)
{
    std::string temp = original;
    size_t position = 0;

    while((position = temp.find(search, position)) != std::string::npos)
    {
        temp.replace(position, search.length(), replacement);
        position += replacement.length();
    }

    return temp;
}

// Function I am working on
void Test(std::istream & stream)
{
    // Verify the stream is good
    if(!stream)
    {
        // Error - stream was given in an error state
        throw std::exception("Stream to be parsed was given with an error state set");
    }

    // Replace all occurrences of "\r\n" with "\n" so the newline can be
    // handled the same way
    // TODO - This can't be the most efficient way
    stream.seekg (0, stream.end);
    const unsigned length = stream.tellg();
    stream.seekg (0, stream.beg);

    std::string temp(length, '\0');
    stream.read(&temp[0], length);

    temp = ReplaceAllOccurrences(temp, "\r\n", "\n");
    std::istringstream formattedStream(temp);

    // Snip the actual work
}

// Test
int main()
{
    std::istringstream testData("Hello\r\nI am a Windows string\r\nBecause I like carriage returns\r\n");
    Test(testData);
}
Geoff <geoff@invalid.invalid>: Dec 03 01:42PM -0800

On Wed, 03 Dec 2014 14:26:48 -0600, Christopher Pisz
>replacing the occurrences within the string, and creating another stream
>object from that string.
 
>Is there a more efficient way?
 
I was under the impression that \r\n only appears in disk files and
the system automatically converts them to \n in memory as the file is
read in text mode. Do streams behave differently?
 
Why not just do the replacement in the stream as you read it in?
 
Many moons ago I wrote a small and dirty utility in C for Windows that
converted \n to \r\n on files and I had to use fopen(inpath, "rb") to
accomplish it on Windows. I never wrote an equivalent \r\n to \n tool
and I never tried doing it in C++.
Victor Bazarov <v.bazarov@comcast.invalid>: Dec 03 04:43PM -0500

On 12/3/2014 3:42 PM, Christopher Pisz wrote:
> so I didn't bother messing with it.
 
> Now I come across the windows newline scenario in the data and have to
> fix that.
 
OK, this is how I do it when I need to (and that's not often). Use it if
you think it suits you, or maybe it'll give you an idea of your own.

I would extract the lines from the stream using std::getline with \n as
the delimiter. Once the line (a string) is in my possession, I check
whether it ends with \r\n (I don't remember whether the delimiter is also
put in the string; if not, check for \r only). If so, I simply remove the
\r from the ending and proceed with extracting the rest of the
information from that string.
 
If you need multiple strings, then the next tool in the pipeline should
be the concatenator. If not, the parser.
 
Since you have somebody's piece of code that already uses 'getline',
all you need to add is a small function to do the \r clean-up in each
line you get. Call your function right after 'getline' returns a string.
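
A minimal sketch of that per-line clean-up, assuming C++11 or later. The names GetLineNoCR and SplitLines are hypothetical and do not come from the thread. std::getline discards the '\n' delimiter itself, so only a trailing '\r' needs to be checked:

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: read one line and drop a trailing '\r' that is
// left over from a "\r\n" line ending.
std::istream& GetLineNoCR(std::istream& in, std::string& line)
{
    // std::getline removes the '\n' delimiter but keeps any '\r' before it.
    if (std::getline(in, line) && !line.empty() && line.back() == '\r')
    {
        line.pop_back();
    }
    return in;
}

// Hypothetical usage: split in-memory text into lines, accepting both
// "\n" and "\r\n" line endings.
std::vector<std::string> SplitLines(const std::string& text)
{
    std::istringstream in(text);
    std::vector<std::string> lines;
    std::string line;
    while (GetLineNoCR(in, line))
    {
        lines.push_back(line);
    }
    return lines;
}
```

Because GetLineNoCR takes any std::istream, the same call should work for file streams and string streams alike, which seems to be the situation described earlier in the thread.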
 
If you need more information, do ask.
 
V
--
I do not respond to top-posted replies, please don't ask
Jorgen Grahn <grahn+nntp@snipabacken.se>: Dec 03 09:56PM

On Wed, 2014-12-03, Victor Bazarov wrote:
 
>>> First you need to ask yourself, what is inefficient about it?
>>> Second, why bother with creating another stream when you can simply
>>> incorporate that action into your stream parser?
 
...

> Since you have somebody's piece of code that already uses 'getline',
> all you need to add is a small function to do the \r clean-up in each
> line you get. Call your function right after 'getline' returns a string.
 
Seconded. It's not technically elegant, but it's a simple,
maintainable solution -- at least for people like me who prefer to
read streams line by line.
 
On the other hand, writing an istream which acts as a filter stacked
on top of another istream can be an interesting exercise. Doesn't
Boost have that kind of stuff?
 
/Jorgen
 
--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
Jorgen Grahn <grahn+nntp@snipabacken.se>: Dec 03 10:00PM

On Wed, 2014-12-03, Geoff wrote:
 
> I was under the impression that \r\n only appears in disk files and
> the system automatically converts them to \n in memory as the file is
> read in text mode. Do streams behave differently?
 
That's the typical behavior on some systems (MS-DOS etc) but e.g. on
Unix there's no such mechanism ... and sometimes you get DOS text
files on Unix, and someone expects you to handle them sensibly anyway.
 
/Jorgen
 
--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
Ian Collins <ian-news@hotmail.com>: Dec 04 11:15AM +1300

Christopher Pisz wrote:
> replacing the occurrences within the string, and creating another stream
> object from that string.
 
> Is there a more efficient way?
 
Write your own streambuf and do the filtering in its underflow() member?
 
--
Ian Collins
Geoff <geoff@invalid.invalid>: Dec 03 02:22PM -0800

On Wed, 03 Dec 2014 16:43:57 -0500, Victor Bazarov
>in the string or not, if not then check for \r only), if so I simply
>change the ending to remove \r and proceed with extracting the rest of
>information from that string.
 
std::getline strips the delimiter by default. Reading a Windows \r\n
from a file stream yields a string that has no carriage return or
newline in it.
 
std::getline(input, str, '\n') and std::getline(input, str) behave
identically so there is really no need to explicitly delimit with
newline unless you prefer it for style/maintenance purposes.
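
A small sketch of the two overloads, with hypothetical function names. Note the assumption behind it: an in-memory std::istringstream performs no newline translation, so the '\r' stays in the extracted line here. The carriage return only disappears on its own when the platform translates "\r\n" while reading a file in text mode.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: first line using the default delimiter.
std::string FirstLineDefault(const std::string& text)
{
    std::istringstream in(text);
    std::string line;
    std::getline(in, line);
    return line;
}

// Hypothetical helper: first line with '\n' passed explicitly.
std::string FirstLineExplicit(const std::string& text)
{
    std::istringstream in(text);
    std::string line;
    std::getline(in, line, '\n');
    return line;
}
```

For the input "a\r\nb" both helpers return "a\r": the '\n' is stripped, the '\r' is not.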
Geoff <geoff@invalid.invalid>: Dec 03 02:30PM -0800

On 3 Dec 2014 22:00:19 GMT, Jorgen Grahn <grahn+nntp@snipabacken.se>
wrote:
 
 
>That's the typical behavior on some systems (MS-DOS etc) but e.g. on
>Unix there's no such mechanism ... and sometimes you get DOS text
>files on Unix, and someone expects you to handle them sensibly anyway.
 
I see. I am often the recipient of the converse on Windows, converting
Unix EOLs to DOS. :) Somehow I got the mistaken impression that
Christopher was running his test on Windows.
Luca Risolia <luca.risolia@linux-projects.org>: Dec 03 11:43PM +0100

Il 03/12/2014 22:18, Christopher Pisz ha scritto:
 
> Test(testData);
> }
> main.cpp: In function 'void Test(std::istream&)':
 
g++ -std=c++14 -O2 -Wall -pedantic -pthread main.cpp && ./a.out
 
main.cpp:28:85: error: no matching function for call to 'std::exception::exception(const char [54])'

throw std::exception("Stream to be parsed was given with an error state set");
Luca Risolia <luca.risolia@linux-projects.org>: Dec 03 11:44PM +0100

Il 03/12/2014 22:18, Christopher Pisz ha scritto:
 
> Whoops, left a few mistakes in the former listing.
> Corrected code:
 
 
g++ -std=c++14 -O2 -Wall -pedantic -pthread main.cpp && ./a.out
 
main.cpp:28:85: error: no matching function for call to 'std::exception::exception(const char [54])'

throw std::exception("Stream to be parsed was given with an error state set");
Ian Collins <ian-news@hotmail.com>: Dec 04 11:48AM +1300

Luca Risolia wrote:
> 'std::exception::exception(const char [54])'
 
> throw std::exception("Stream to be parsed was given with an
> error state set");
 
That probably should be std::runtime_error.
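
A minimal sketch of that change, with hypothetical function names. Standard std::exception has no constructor taking a message; some implementations (Microsoft's, as far as I know) accept one as an extension, which would explain why the listing built for the original poster. std::runtime_error takes the message portably:

```cpp
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: the stream check from the listing, throwing a
// standard exception type that carries a message.
void RequireGoodStream(std::istream& stream)
{
    if (!stream)
    {
        throw std::runtime_error("Stream to be parsed was given with an error state set");
    }
}

// Hypothetical usage: returns true when the check throws for a stream
// that is already in a failed state.
bool DemoThrowsOnFailedStream()
{
    std::istringstream input("data");
    input.setstate(std::ios_base::failbit);  // simulate an earlier failed read
    try
    {
        RequireGoodStream(input);
    }
    catch (const std::exception& e)  // runtime_error derives from std::exception
    {
        return std::string(e.what()) ==
               "Stream to be parsed was given with an error state set";
    }
    return false;
}
```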
 
--
Ian Collins
Luca Risolia <luca.risolia@linux-projects.org>: Dec 03 11:56PM +0100

Il 03/12/2014 23:48, Ian Collins ha scritto:
> That probably should be std::runtime_error.
 
Not only that...
I wonder why people don't try to compile their code before posting.
red floyd <no.spam@its.invalid>: Dec 03 12:24PM -0800

On 12/3/2014 9:03 AM, Victor Bazarov wrote:
 
> Try declaring a single char and initializing it with a *single quote*
> without an escape sequence... The question mark rule may have something
> to do with trigraph sequences.
 
GMTA, Victor!
Victor Bazarov <v.bazarov@comcast.invalid>: Dec 03 03:32PM -0500

On 12/3/2014 3:24 PM, red floyd wrote:
>> without an escape sequence... The question mark rule may have something
>> to do with trigraph sequences.
 
> GMTA, Victor!
 
Exactly what I thought, too. It's recursive! ;-)
 
V
--
I do not respond to top-posted replies, please don't ask
Martijn Lievaart <m@rtij.nl.invlalid>: Dec 03 08:47PM +0100

On Wed, 03 Dec 2014 12:35:26 +0000, Chris Vine wrote:
 
>> ++i; ++i; has the same effect and is clearer.
 
> I agree. This is for pedagogical purposes only. I hope it would never
> be in live code.
 
Yeah, ++++i feels like some other unspecified computer language. :-)
 
M4
Alain Ketterlin <alain@dpt-info.u-strasbg.fr>: Dec 03 06:53PM +0100

> }
 
> where: typedef vector<int> vector_t;
> (could be also plain array so I post this to clc as well)
 
A plain array would have been better, because the explanation has
nothing to do with vectors/templates/whatnot
 
> or 10x(AVX2) faster loop!
> I am really stumbled as I can't see why type of index variable restricts
> this optimization.
 
Yes: unsigned has well-defined semantics regarding overflow, which
signed has not. Therefore, the compiler is free to make whatever
assumption it likes when the behavior is undefined. See:
 
http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html
 
which I think is the best introduction to the subject, and includes lots
of examples.
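
A short illustration of the difference, with hypothetical function names. It only shows the language rule itself, not the loop from the original question:

```cpp
#include <limits>

// Unsigned arithmetic is defined to wrap around modulo 2^N, so the
// compiler has to preserve that behavior.
unsigned NextUnsigned(unsigned i)
{
    return i + 1u;
}

// Signed overflow is undefined behavior, so a portable program has to
// check before incrementing instead of relying on wrap-around.
bool CanIncrement(int i)
{
    return i < std::numeric_limits<int>::max();
}
```

NextUnsigned applied to the largest unsigned value yields 0, while incrementing the largest int has no defined result at all, which is what leaves the optimizer free to assume it never happens.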
 
-- Alain.
James Kuyper <jameskuyper@verizon.net>: Dec 03 01:13PM -0500

On 12/03/2014 12:53 PM, Alain Ketterlin wrote:
>> (could be also plain array so I post this to clc as well)
 
> A plain array would have been better, because the explanation has
> nothing to do with vectors/templates/whatnot
 
He's already posted (nearly three hours ago) a message indicating that
the problem does not occur when using plain arrays, so the explanation
must indeed have something to do with vectors/templates/whatnot - to be
more precise, it seems to be connected to the use of ghs.size().
Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Dec 03 05:46PM

On 03/12/2014 12:27, Vincenzo Mercuri wrote:
>> object.
 
> I didn't say it doesn't. I said that 'const T& obj' and 'T& obj' are not
> the same thing. Forcing something to be something else is another story.
 
They could both refer to the same object though and if the object is
non-const it is perfectly valid to cast away const from the reference.
This is totally different to your static_cast example.
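
A minimal sketch of the valid case, with hypothetical names. The key assumption is that the referenced object itself was not declared const; if it were, the write would be undefined behavior:

```cpp
// Hypothetical helper: modifies the referenced object through a const
// reference. Only valid when the object itself is non-const.
void IncrementThroughConstRef(const int& ref)
{
    const_cast<int&>(ref) += 1;
}

// Hypothetical usage: 'value' is a non-const object, so casting away
// const from a reference to it is well defined.
int Demo()
{
    int value = 41;
    IncrementThroughConstRef(value);
    return value;
}
```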
 
/Flibble
Martijn Lievaart <m@rtij.nl.invlalid>: Dec 03 12:40PM +0100

On Wed, 03 Dec 2014 11:11:13 +0000, JiiPee wrote:
 
> }
 
> return 0;
> }
 
 
martijn@garfield:~/t$ g++ -std=c++11 -pedantic -c -o vararray.o
vararray.cpp
vararray.cpp: In function 'int main()':
vararray.cpp:11:21: warning: ISO C++ forbids variable length array
'arr' [-Wvla]
int arr[howMany];
^
 
It's an extension from C, known as Variable Length Array (VLA). It's not
legal C++. It's accepted by many compilers unless you tell them to be
pedantic.
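
A portable alternative, sketched with a hypothetical function name and the howMany parameter borrowed from the warning above, is to let std::vector own the run-time-sized storage:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical replacement for 'int arr[howMany];' when howMany is only
// known at run time. The elements are value-initialized to 0.
std::vector<int> MakeArray(std::size_t howMany)
{
    return std::vector<int>(howMany);
}
```

The result can be indexed like the array (arr[i]), and arr.data() gives a plain int* where one is needed.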
 
HTH,
M4
JiiPee <no@notvalid.com>: Dec 03 12:25PM

On 03/12/2014 11:40, Martijn Lievaart wrote:
> pedantic.
 
> HTH,
> M4
 
Oh OK, thanks. So I guess it's not totally portable either, as it's not
legal... so not good to use, I would think.