Wednesday, September 9, 2015

Digest for comp.lang.c++@googlegroups.com - 11 updates in 5 topics

ram@zedat.fu-berlin.de (Stefan Ram): Sep 09 10:55PM

>scanning every character, deleting many and replacing some with other
>data characters. Is it better (more efficient/faster) to use a string
>iterator or the ".at(pos)" function to do this type of work? TIA
 
It depends on the details. If you want to write the result
into another file deleting all »x«, for example, the following
might be sufficiently fast without using ::std::string at all.
 
#include <fstream>
#include <iterator>
 
int main()
{ ::std::ifstream i{ "C:\\example\\source.txt" };
  if( i )
  { ::std::ofstream o{ "C:\\example\\target.txt" };
    if( o )
    { ::std::istreambuf_iterator< char > end; /* end-of-stream */
      ::std::istreambuf_iterator< char > p{ i };
      ::std::ostreambuf_iterator< char > q{ o };
      for( ; p != end; ++p )
      { char const ch = *p;
        if( ch != 'x' )
          *q++ = ch; }}}}
ram@zedat.fu-berlin.de (Stefan Ram): Sep 09 11:27PM

>It is important that the cost of copying/moving is still linear.
>In the worst case, two 'unnecessary' moves for each inserted element.
 
Nowadays, when people talk about whether »push_back« is a
good idea, they usually do this in the context of
discussions about when to use »emplace_back« instead.
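The usual distinction in those discussions can be shown in a minimal sketch (the Record type and its fields are made up for illustration):

```cpp
#include <string>
#include <vector>

// A type whose construction from pieces is cheaper than building a
// temporary first and then moving it into the vector.
struct Record
{
    std::string key;
    int value;
    Record(std::string k, int v) : key(std::move(k)), value(v) {}
};

std::vector<Record> build()
{
    std::vector<Record> v;
    v.push_back(Record{"a", 1});  // constructs a temporary, then moves it
    v.emplace_back("b", 2);       // constructs in place from the arguments
    return v;
}
```

Both calls end with the same elements in the vector; emplace_back merely skips the temporary.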
 
However, in general, such micro-optimizations matter much less
than a maintainable large-scale structure of the source code,
which always leaves room to apply micro-optimizations later,
when deemed necessary.
Jorgen Grahn <grahn+nntp@snipabacken.se>: Sep 09 10:55PM

On Thu, 2015-08-27, Bo Persson wrote:
> array in C to a dynamically resized array in C++, and blame that on the
> language.
 
> Bad code runs slowly in any language.
 
Late comment:
 
A silly thing to do /in benchmarking/ against C arrays, yes; and IIRC
that was the context.
 
But push_back() into a vector<T> is not generally a "stupid idea"
which makes "performance go down the drain". There will be a few
reallocations (log N of them, or something) and the associated copying
of T objects, but then you have all the performance benefits of a
std::vector<T>.
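That "log N of them" can be observed directly by watching capacity() change during a fill; a small sketch (the function name is mine):

```cpp
#include <cstddef>
#include <vector>

// Count how often push_back reallocates by watching capacity() change.
// Growth is geometric, so the count is O(log N), not O(N).
std::size_t count_reallocations(std::size_t n)
{
    std::vector<int> v;
    std::size_t reallocations = 0;
    std::size_t cap = v.capacity();
    for (std::size_t i = 0; i < n; ++i)
    {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != cap)
        {
            ++reallocations;
            cap = v.capacity();
        }
    }
    return reallocations;
}
```

Even for a million elements this comes out to a few dozen reallocations at most, whatever the exact growth factor.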
 
In C people often resort to linked lists for tasks like these, and
that's frequently a worse choice from a performance point of view.
 
/Jorgen
 
--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
bartekltg <bartekltg@gmail.com>: Sep 10 01:20AM +0200

On 10.09.2015 00:55, Jorgen Grahn wrote:
> reallocations (log N of them, or something) and the associated copying
> of T objects, but then you have all the performance benefits of a
> std::vector<T>.
 
It is important that the cost of copying/moving is still linear.
In the worst case, two 'unnecessary' moves for each inserted element.
 
And if I have the slightest idea how much memory I will need,
I can always use reserve(): no reallocation, no moving of objects,
and no ending up with twice as much memory as needed (if I had bad luck).
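A sketch of that, checking that the buffer really never moves after the reserve() (the function name is mine):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// With reserve() up front there is one allocation and no element
// ever has to be moved to a new buffer during the fill.
std::vector<int> fill_with_reserve(std::size_t n)
{
    std::vector<int> v;
    v.reserve(n);                     // one allocation up front
    const int* const buf = v.data();  // remember where the buffer is
    for (std::size_t i = 0; i < n; ++i)
        v.push_back(static_cast<int>(i));
    assert(v.data() == buf);          // no reallocation happened
    return v;
}
```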
 
bartekltg
MikeCopeland <mrc2323@cox.net>: Sep 09 03:15PM -0700

Given files of several thousand records, each up to 1400 characters,
scanning every character, deleting many and replacing some with other
data characters. Is it better (more efficient/faster) to use a string
iterator or the ".at(pos)" function to do this type of work? TIA
 
Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Sep 09 11:23PM +0100

On 09/09/2015 23:15, MikeCopeland wrote:
> scanning every character, deleting many and replacing some with other
> data characters. Is it better (more efficient/faster) to use a string
> iterator or the ".at(pos)" function to do this type of work? TIA
 
The problem is not iterating/indexing but the complexity of
delete/insert operations which for std::string is O(n). You might want
to look at my container neolib::segmented_array instead.
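That O(n) is per erase/insert, so erasing characters one by one from a long string goes quadratic; even with plain std::string a single-pass rewrite keeps the total linear. A sketch, with the deleted and replaced characters made up as examples:

```cpp
#include <string>

// Single pass over the input: O(n) total, instead of O(n) per erase.
// As an example, delete every 'x' and replace every tab with a space.
std::string filter(const std::string& in)
{
    std::string out;
    out.reserve(in.size());
    for (char ch : in)
    {
        if (ch == 'x')
            continue;        // delete
        else if (ch == '\t')
            out += ' ';      // replace
        else
            out += ch;       // keep
    }
    return out;
}
```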
 
/Flibble
Christopher Pisz <nospam@notanaddress.com>: Sep 09 05:30PM -0500

On 9/9/2015 5:15 PM, MikeCopeland wrote:
 
> Given files of several thousand records, each up to 1400 characters,
> scanning every character, deleting many and replacing some with other
> data characters. Is it better (more efficient/faster) to use a string
> iterator or the ".at(pos)" function to do this type of work? TIA
 
Without any knowledge of what said records look like, how they can be
broken up, etc., I doubt much advice can be given.
 
Given a string of n characters, scanning it with an iterator has the
same overall complexity as scanning it by index with std::string::at,
AFAIK (though at() adds a bounds check, and throws std::out_of_range,
on every call).
 
Deleting and replacing can get a little hairier as to whether you
want an index or an iterator, but you must be careful with both,
because after a delete they will no longer refer to the element you
think they do.
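The usual way to stay safe while erasing during a traversal is to revalidate through the iterator that erase() returns; a sketch (the function name is mine):

```cpp
#include <string>

// Remove every occurrence of `bad` without ever using a stale iterator:
// erase() returns an iterator to the element after the erased one.
void erase_all(std::string& s, char bad)
{
    for (auto it = s.begin(); it != s.end(); )
    {
        if (*it == bad)
            it = s.erase(it);  // revalidate via the return value
        else
            ++it;
    }
}
```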
 
I would wonder, though, given your description "files of several
thousand records", why you would be playing with string at all. Storing
records in files is a common enough task that there are tons of third
party libraries out there for it that will probably do a better job of
it than one could do on one's own. XML, JSON, binary serializers,
they're all out there.
 
--
I have chosen to troll filter/ignore all subthreads containing the
words: "Rick C. Hodgins", "Flibble", and "Islam"
So, I won't be able to see or respond to any such messages
---
Jorgen Grahn <grahn+nntp@snipabacken.se>: Sep 09 10:34PM

On Wed, 2015-09-09, MikeCopeland wrote:
> Given files of several thousand records, each up to 1400 characters,
> scanning every character, deleting many and replacing some with other
> data characters. Is it better (more efficient/faster)
 
Your first focus should be on clarity and maintainability. You may
find that the work takes a few milliseconds no matter how you do it.
 
> to use a string
> iterator or the ".at(pos)" function to do this type of work? TIA
 
Depends entirely on what you're doing to the strings. Perhaps some
third technique is the best, e.g. regular expressions.
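For instance, a deletion and a substitution in one go with <regex> might look like this (a sketch; the patterns are made up, since we don't know what is deleted or replaced):

```cpp
#include <regex>
#include <string>

// As an example: delete all digits, then collapse runs of
// two or more spaces into a single space.
std::string clean(const std::string& in)
{
    static const std::regex digits("[0-9]+");
    static const std::regex spaces(" {2,}");
    std::string tmp = std::regex_replace(in, digits, "");
    return std::regex_replace(tmp, spaces, " ");
}
```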
 
Of the two you mention I prefer iterators (or const char* used the
same way) because I'm so used to that idiom, and because they mix well
with the standard algorithms.
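Mixing with the standard algorithms typically means the erase/remove idiom plus std::replace; a sketch (the characters chosen are examples only):

```cpp
#include <algorithm>
#include <string>

// Delete every 'x' in one linear pass, then turn every ';' into ','.
void scrub(std::string& s)
{
    // std::remove shifts the kept characters forward; erase trims the tail.
    s.erase(std::remove(s.begin(), s.end(), 'x'), s.end());
    std::replace(s.begin(), s.end(), ';', ',');
}
```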
 
/Jorgen
 
--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
bartekltg <bartekltg@gmail.com>: Sep 10 01:09AM +0200

On 10.09.2015 00:23, Mr Flibble wrote:
>> scanning every character, deleting many and replacing some with other
>> data characters. Is it better (more efficient/faster) to use a string
>> iterator or the ".at(pos)" function to do this type of work? TIA
 
I agree with the others: to give a reasonable answer we have
to know more.
 
> The problem is not iterating/indexing but the complexity of
> delete/insert operations which for std::string is O(n). You might want
> to look at my container neolib::segmented_array instead.
 
Isn't the rope from the 'almost standard' SGI STL a similar container?
#include <ext/rope>
https://www.sgi.com/tech/stl/Rope.html
It probably is installed on OP's computer already.
 
bartekltg
mrs@kithrup.com (Mike Stump): Sep 09 04:32PM

In article <ms20p3$gbh$1@dont-email.me>,
>> Also have you tried -fpermissive?
 
>Yes, of course that works. But doing that kind of stuff exposes me to
>the QA manager that surely will look at that with a bad eye
 
That was invented for people just like you, for companies just like
your company, for situations, wait for it, just like this. [ blink ]
 
Use it; in 2 more decades, when your company goes away or the product
is no longer shipped or maintained, the use of the flag goes away too.
 
If you expect to last more than 2 decades, you can start staging in
fixes for the issues, and in another 2 years, when you get the last one
fixed, you can remove the flag. Management should be able to provide
guidance if they plan for the company, and the software, to be around.
1971 powerChina <chinapower1971@gmail.com>: Sep 08 08:25PM -0700

On Tuesday, September 8, 2015 at 10:19:47 PM UTC+8, Doug Mika wrote:
 
> PS.
> It's taken from:
> https://msdn.microsoft.com/en-us/library/system.string.copyto(v=vs.110).aspx?cs-save-lang=1&cs-lang=cpp#code-snippet-2
 
http://sourceforge.net/projects/pwwhashmap/
You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page.
To unsubscribe from this group and stop receiving emails from it send an email to comp.lang.c+++unsubscribe@googlegroups.com.
