soft and program: Digest for comp.lang.c++@googlegroups.com

converting a string containing a comma to a number - 5 Updates
enum initialization - 5 Updates
move semantics - not with temporary objects - 3 Updates
converting a string containing a comma to a number - 2 Updates

converting a string containing a comma to a number

Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: Nov 11 01:05AM

On Thu, 9 Nov 2017 20:10:17 -0600

> > Thanks,
> > Lynn

> A string such as 4,800.1 or 3,334.5e9.

The simplest solution is to imbue a locale into a std::stringstream
object. Given a std::stringstream object 's',
s.imbue(std::locale("en_US")) would do what you want. You can then
input a string with a standard English numeric representation and
output a double of the required value.
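
A minimal sketch of the imbue approach, assuming C++17. Locale names are platform-specific: "en_US" may have to be spelled "en_US.UTF-8" on Linux, and the locale may not be installed at all, in which case the std::locale constructor throws std::runtime_error. The helper name parse_with_locale is hypothetical.

```cpp
#include <cassert>
#include <locale>
#include <optional>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: parse `text` as a double using the numeric
// conventions of the named locale. Returns std::nullopt if the locale
// is not available or the text is not one complete number.
std::optional<double> parse_with_locale(const std::string& text,
                                        const char* locale_name)
{
    std::istringstream in(text);
    try {
        in.imbue(std::locale(locale_name));  // throws if the name is unknown
    } catch (const std::runtime_error&) {
        return std::nullopt;
    }
    double value = 0.0;
    if (!(in >> value))
        return std::nullopt;   // not a number, or the grouping was rejected
    in >> std::ws;             // skip trailing whitespace, if any
    if (!in.eof())
        return std::nullopt;   // something other than a number followed
    return value;
}

// Usage sketch. Only the "C" locale is guaranteed to exist everywhere.
void demo_parse_with_locale()
{
    assert(parse_with_locale("4800.1", "C").has_value());
    assert(!parse_with_locale("4,800.1", "C"));  // "C" has no digit grouping
}
```

Where an English locale is installed, parse_with_locale("4,800.1", "en_US.UTF-8") should yield 4800.1; the sketch deliberately reports a missing locale as "no value" instead of letting the exception escape.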

This won't be particularly efficient. If efficiency is important then
a simple parser which makes the substitutions you want would be better.
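
One possible reading of "a simple parser which makes the substitutions", sketched under the assumption that every comma can simply be dropped before the text is handed to std::strtod. The name parse_ignoring_commas is hypothetical, and the sketch assumes the global C locale still uses '.' as the decimal point.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <optional>
#include <string>

// Hypothetical helper: remove every ',' and let std::strtod do the rest.
// Deliberately naive: it does not check where the commas were.
std::optional<double> parse_ignoring_commas(std::string text)
{
    text.erase(std::remove(text.begin(), text.end(), ','), text.end());
    if (text.empty())
        return std::nullopt;
    char* end = nullptr;
    const double value = std::strtod(text.c_str(), &end);
    // Succeed only if strtod consumed the whole string.
    if (end != text.c_str() + text.size())
        return std::nullopt;
    return value;
}

// Usage sketch.
void demo_parse_ignoring_commas()
{
    assert(parse_ignoring_commas("3,334.5e9").value() == 3334.5e9);
    assert(!parse_ignoring_commas("abc"));
    // The price of simplicity: misplaced commas are accepted as well.
    assert(parse_ignoring_commas("4,8,0,0").value() == 4800.0);
}
```

This avoids the stream and locale machinery entirely, at the cost of accepting input that a stricter grammar would reject.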

Chris

Jorgen Grahn <grahn+nntp@snipabacken.se>: Nov 11 08:41AM

On Fri, 2017-11-10, Ben Bacarisse wrote:
> For example, the input might include other commas that have to be
> preserved, or the removal of all commas allows input to be parsed which
> should be treated as an error.

Like 33,,,34.,5,,e,9.

> Without knowing more about the overall
> requirement it's hard to give good advice.

Yes. IMHO, not looking at the concrete requirements has been the
problem with this thread all along.

My five cents: if I can define what's valid syntax myself, then I'd do
that first. If that means I cannot use strtod() and friends and
cannot get any help from the locale system, then so be it.
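
A sketch of "define what's valid syntax myself", assuming one particular grammar picked only for illustration: an optional sign, then either plain digits or comma-separated groups of three, then an optional fraction and exponent. The grammar and the name parse_grouped are assumptions, not something anyone in the thread specified.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <optional>
#include <regex>
#include <string>

// Hypothetical grammar: [sign] (digits | d{1,3}(,ddd)+) [.digits] [e[sign]digits]
std::optional<double> parse_grouped(const std::string& text)
{
    static const std::regex grammar(
        R"([+-]?(\d{1,3}(,\d{3})+|\d+)(\.\d+)?([eE][+-]?\d+)?)");
    if (!std::regex_match(text, grammar))
        return std::nullopt;                 // syntax error, reject early
    std::string digits = text;
    digits.erase(std::remove(digits.begin(), digits.end(), ','), digits.end());
    // Assumes the global C locale uses '.' as the decimal point.
    return std::strtod(digits.c_str(), nullptr);
}

// Usage sketch.
void demo_parse_grouped()
{
    assert(parse_grouped("3,334.5e9").value() == 3334.5e9);
    assert(parse_grouped("1234").value() == 1234.0);
    assert(!parse_grouped("33,,,34.,5,,e,9"));  // stray commas
    assert(!parse_grouped("1000,000"));         // first group too long
}
```

Validating first and converting second keeps the two concerns apart: the regular expression decides what is legal, and strtod only ever sees text that already passed.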

WRT locales, they are, to me, simplistic and problematic. For example,
the real situation with fractional numbers and people in Sweden:

1234,56     The official syntax.

1 234.56    Even better, but cannot be represented in ASCII since
            the thousands separator is a half-width non-breaking
            space. A full space is too much (and would make
            it more than one token in any surrounding syntax).

            I don't think the locales reflect this practice, so you
            see thousands separators too rarely in computer output.

1234.56     Everyone recognizes this too, since it's what computers
            and pocket calculators have printed for decades.

1.23456e3   Programmers recognize any C syntax.

And so on.

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

bartc <bc@freeuk.com>: Nov 11 11:58AM

On 11/11/2017 08:41, Jorgen Grahn wrote:
> and pocket calculators have printed for decades.

> 1.23456e3 Programmers recognize any C syntax.

> And so on.

I'm very surprised that C thinks this stuff belongs in such a low level
language, rather than in an application, or a library outside the
language itself. (And even applications have trouble with it: try adding
spaces to a credit-card number in a web-form.)

C doesn't even have numeric separators to help writing source, yet
printf is expected to display proper thousands separators depending on
locale? And here we also expect it to figure out whether a comma is
properly a thousands separator or not, or if the comma is used correctly
in 1000,000.

Usually when such a language does input (from a text file, or from the
keyboard), then commas will separate different numbers, or a number and
non-number.

You can't reasonably have a comma separating thousands unless the string
representing the number has already isolated it. For example, it's from
an edit-box on a form. And then, as suggested, it really needs proper
validation if this is user-input.

--
bartc

Keith Thompson <kst-u@mib.org>: Nov 11 12:50PM -0800

bartc <bc@freeuk.com> writes:
[...]
> C doesn't even have numeric separators to help writing source, yet
> printf is expected to display proper thousands separators depending on
> locale?

No, it isn't. POSIX specifies "%'d", for example, but ISO C has
no facility for handling thousands separators in input or output.
(It does let you query the current locale.)
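
The locale query mentioned above can be sketched with the standard localeconv() function. The POSIX "%'d" flag is left out on purpose because it is not ISO C; the two helper names are hypothetical.

```cpp
#include <cassert>
#include <clocale>
#include <string>

// Thousands separator of the current C locale, as reported by the
// standard localeconv() query. Empty in the default "C" locale.
std::string current_thousands_sep()
{
    const std::lconv* conv = std::localeconv();
    return conv->thousands_sep;
}

// Decimal point of the current C locale ("." in the "C" locale).
std::string current_decimal_point()
{
    return std::localeconv()->decimal_point;
}

// Usage sketch.
void demo_localeconv()
{
    std::setlocale(LC_ALL, "C");
    assert(current_thousands_sep().empty());
    assert(current_decimal_point() == ".");
}
```

The query only reports what the locale says; applying the separator when formatting or parsing is still left to the program.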

[...]

--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

bartc <bc@freeuk.com>: Nov 11 10:01PM

On 11/11/2017 20:50, Keith Thompson wrote:

> No, it isn't. POSIX specifies "%'d", for example, but ISO C has
> no facility for handling thousands separators in input or output.
> (It does let you query the current locale.)

It works with lccwin.

But there seems a thin dividing line between C headers and POSIX ones,
which isn't exactly obvious in this list:

http://pubs.opengroup.org/onlinepubs/9699919799/idx/head.html

And if I try and compile any open source using only C standard headers,
it will likely fail.

--
bartc

enum initialization

Jorgen Grahn <grahn+nntp@snipabacken.se>: Nov 11 08:54AM

On Fri, 2017-11-10, Gareth Owen wrote:

> So you've a potential cost (someone not understanding the default -
> slightly less comprehensible code), and literally no benefit at all besides
> far less typing than it took to explain it.

People /do/ understand the default. Writing superfluous text is
stupid (I've been told to write "unsigned int" instead of just
"unsigned" for this reason.)

> If you're using a pattern, breaking the pattern for *no* *good* *reason*
> is a poor idea. People like patterns, and people understand patterns.

But /that/ is the real reason I agree, and wouldn't write the enum
that way.

In fact, I wouldn't do anything fancy at all and just write:

foo = 1,
bar = 2,
baz = 4,
bat = 8,
xxx = 16,
...

I'd also stop and think if I need such an enum in the first place.

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

Gareth Owen <gwowen@gmail.com>: Nov 11 10:21AM

> baz = 4,
> bat = 8,
> xxx = 16,

I'm totally fine with that. It's

foo = 1,
bar,
baz = 4,
bat = 8,
xxx = 16,

that's gratuitously difficult to parse.
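
What the reader has to work out in the mixed form can be checked at compile time. A small sketch; the enum name Flags is hypothetical, the enumerators are the ones from the thread.

```cpp
enum Flags {
    foo = 1,
    bar,        // no initializer: value of the previous enumerator plus one
    baz = 4,
    bat = 8,
    xxx = 16
};

static_assert(bar == 2, "bar silently continues from foo");
static_assert((foo | bar | baz | bat | xxx) == 31, "five distinct bits");
```

Here the default happens to continue the powers-of-two pattern, which is exactly why the omission is easy to misread.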

Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Nov 11 03:19PM

On 11/11/2017 10:21, Gareth Owen wrote:
> bat = 8,
> xxx = 16,

> that's gratuitously difficult to parse.

Why decimal? I think the following is a better pattern:

foo = 0b000001,
bar = 0b000010,
baz = 0b000100,
bat = 0b001000,
xxx = 0b010000,
/Flibble

Gareth Owen <gwowen@gmail.com>: Nov 11 07:26PM

> bat = 0b001000,
> xxx = 0b010000,
> /Flibble

That's pretty clear if your compiler supports C++14 -

foo = 0001,
bar = 0002,
baz = 0004,
bat = 0010,
xxx = 0020,

works similarly diagonally if it doesn't
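
A compile-time check that the three spellings in this subthread name the same values. Binary literals need C++14; a leading 0 makes a literal octal.

```cpp
// Each line spells one value three ways: decimal, binary, octal.
static_assert(1  == 0b000001 && 0b000001 == 0001, "foo");
static_assert(2  == 0b000010 && 0b000010 == 0002, "bar");
static_assert(4  == 0b000100 && 0b000100 == 0004, "baz");
static_assert(8  == 0b001000 && 0b001000 == 0010, "bat");
static_assert(16 == 0b010000 && 0b010000 == 0020, "xxx");

// The leading zero is significant: 0010 is eight, not ten.
static_assert(0010 != 10, "octal, not decimal");
```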

Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Nov 11 08:30PM

On 11/11/2017 19:26, Gareth Owen wrote:
> bat = 0010,
> xxx = 0020,

> works similarly diagonally if it doesn't

Octal? Sneaky.

/Flibble

move semantics - not with temporary objects

porparek@gmail.com: Nov 11 03:09AM -0800

Hi,
It is commonly said that move semantics is closely tied to temporary objects.
The following example shows that it isn't:

class MyClass {...};

MyClass fun ( void ) {
    MyClass myObj;
    return myObj;
}

int main ( void ) {
    MyClass myObj;
    myObj = fun();
    return 0;
}

When I build it with "g++ -O3", no temporary object is ever created inside fun(). It doesn't matter whether I define a move constructor and a move assignment operator in MyClass or not. I think that in this example move semantics has nothing to do with temporary objects, so saying that move semantics is closely related to temporary objects is wrong.
In the example above the compiler just knows that myObj inside fun() is going to evaporate, and this is why it uses the move assignment operator to steal the resources from it.

Please let me know whether I'm right or not.

thanks in advance

"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Nov 11 01:52PM +0100

> knows that myObj inside fun() is going to evaporate and this is why
> it uses move assignment operator to steal the resources from it.

> Please let me know whether I'm right or not.

This is commonly known as Return Value Optimization, or RVO, where each
call site passes the address of where the function result should be
constructed.

RVO is one kind of copy construction elision.

However, in C++11 and later it's not just copy constructor invocations
that can be elided: also move constructor invocations can be elided,
which, depending on MyClass, is most probably what happens here.

Cheers & hth.,

- Alf

Chris Vine <chris@cvine--nospam--.freeserve.co.uk>: Nov 11 02:16PM

On Sat, 11 Nov 2017 03:09:54 -0800 (PST)
> knows that myObj inside fun() is going to evaporate and this is why
> it uses move assignment operator to steal the resources from it.

> Please let me know whether I'm right or not.

You have been told about return value optimization in a separate answer.
RVO is one of the few occasions where the C++ standard permits
copy/move constructors to be elided for non-trivial types. Move
construction doesn't occur in your example.
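
One way to see which special member functions actually run is to count them. A minimal sketch, assuming C++17 (for inline static members); Counted is a hypothetical stand-in for MyClass. The move constructor count is left unchecked inside the example because eliding it is permitted but not required here.

```cpp
#include <cassert>

// Hypothetical stand-in for MyClass that counts its special members.
struct Counted {
    static inline int copy_ctor = 0, move_ctor = 0;
    static inline int copy_assign = 0, move_assign = 0;

    Counted() = default;
    Counted(const Counted&) { ++copy_ctor; }
    Counted(Counted&&) noexcept { ++move_ctor; }
    Counted& operator=(const Counted&) { ++copy_assign; return *this; }
    Counted& operator=(Counted&&) noexcept { ++move_assign; return *this; }
};

Counted fun()
{
    Counted myObj;
    return myObj;   // may be elided (NRVO); otherwise moved, never copied
}

void run_example()
{
    const int moves_before = Counted::move_assign;
    Counted myObj;
    myObj = fun();  // fun() is a prvalue, so move assignment is selected
    assert(Counted::move_assign == moves_before + 1);
    assert(Counted::copy_ctor == 0 && Counted::copy_assign == 0);
}
```

With "g++ -O3" (and typically without optimization too) Counted::move_ctor stays at 0 because the return is elided; the one move that does happen is the assignment in the caller.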

On your more general point about move semantics and temporaries, it
would be more correct to say that move semantics are associated with
rvalues, because rvalues bind to rvalue references. An unnamed object
(temporary) is a prvalue and is one example of an rvalue. Another is an
lvalue cast to rvalue using std::move (that produces an xvalue, which
is another category of rvalue).

This will give you more information:
http://en.cppreference.com/w/cpp/language/value_category
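
The three categories can be made visible with decltype((expr)), which yields T& for an lvalue, T&& for an xvalue and plain T for a prvalue. A sketch assuming C++17; the helper names is_lvalue, is_xvalue and is_prvalue and the function make are hypothetical.

```cpp
#include <type_traits>
#include <utility>

struct MyClass {};

MyClass make() { return MyClass{}; }

// Classify the type produced by decltype((expr)).
template <class T> constexpr bool is_lvalue  = std::is_lvalue_reference_v<T>;
template <class T> constexpr bool is_xvalue  = std::is_rvalue_reference_v<T>;
template <class T> constexpr bool is_prvalue = !std::is_reference_v<T>;

MyClass named;  // a named object

static_assert(is_lvalue<decltype((named))>,
              "a name is an lvalue");
static_assert(is_xvalue<decltype((std::move(named)))>,
              "std::move(lvalue) is an xvalue");
static_assert(is_prvalue<decltype((make()))>,
              "a call returning by value is a prvalue");
static_assert(is_prvalue<decltype((MyClass{}))>,
              "an unnamed temporary is a prvalue");
```

Both xvalues and prvalues are rvalues, so either one can bind to a MyClass&& parameter such as that of a move constructor or move assignment operator.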

Chris

converting a string containing a comma to a number

ram@zedat.fu-berlin.de (Stefan Ram): Nov 11 12:18AM

>{ ::std::cout << stod( replace( "4,800.1"s, ","s, ""s ))<< '\n';
> ::std::cout << stod( replace( "3,334.5e9"s, ","s, ""s ))<< '\n'; }

main.cpp

#include <iostream>
#include <ostream>
#include <sstream>
#include <locale>
#include <string>

using namespace ::std::literals;

struct s : ::std::numpunct< char>
{ char do_thousands_sep() const override { return ','; }
::std::string do_grouping() const override { return "\3"; }};

static double double_value_of( ::std::string const & string )
{ ::std::stringstream source { string };
source.imbue( ::std::locale( source.getloc(), new s ));
double number; source >> number; return number; }

int main()
{ ::std::cout << double_value_of( "4,800.1"s )<< '\n';
::std::cout << double_value_of( "3,334.5e9"s )<< '\n'; }

transcript

4800.1
3.3345e+012

ram@zedat.fu-berlin.de (Stefan Ram): Nov 11 01:24AM

>source.imbue( ::std::locale( source.getloc(), new s ));

::std::locale reference-counts its facets: a facet constructed
with the default refs == 0 is deleted when the last locale holding
it is destroyed, so the »new« above is not a leak!
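
The ownership question can be tested directly. A sketch with a hypothetical facet, TrackedPunct, that records its own destruction; it relies on the standard's rule that a facet constructed with refs == 0 is deleted by the implementation once the last locale containing it is gone.

```cpp
#include <cassert>
#include <locale>

// Hypothetical facet that reports when it is destroyed.
struct TrackedPunct : std::numpunct<char> {
    explicit TrackedPunct(bool& destroyed) : destroyed_(destroyed) {}
    ~TrackedPunct() override { destroyed_ = true; }
    bool& destroyed_;
};

// Returns true if the locale deleted the facet that was created with new.
bool locale_deletes_facet()
{
    bool destroyed = false;
    {
        // numpunct's constructor defaults to refs == 0, which hands
        // ownership of the facet to the locale objects containing it.
        std::locale loc(std::locale::classic(), new TrackedPunct(destroyed));
        assert(!destroyed);
    }   // the only locale holding the facet is destroyed here
    return destroyed;
}
```

A facet constructed with refs == 1 would instead stay alive until the program deletes it, which is the case where a bare new would need a matching cleanup.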


Saturday, November 11, 2017

Digest for comp.lang.c++@googlegroups.com - 15 updates in 4 topics
