Thursday, July 18, 2019

Digest for comp.lang.c++@googlegroups.com - 6 updates in 1 topic

Christian Gollwitzer <auriocus@gmx.de>: Jul 18 07:06AM +0200

Am 17.07.19 um 23:55 schrieb Soviet_Mario:
> format.
 
> I need a parsing function aware of the human-written date-time.
> But I'll also explore RegEx patterns to define narrower contexts ...
 
 
There are attempts to do that. For example, there is a function in Tcl
that tries to guess the format from the data; it's described here:
https://www.tcl.tk/man/tcl8.6/TclCmd/clock.htm#M80
 
But you'll get an "opinion" at best, because the problem is provably
impossible. Even humans cannot solve it perfectly. For example, if you
see 05/07/19, it could be the 5th of July 2019 or the 7th of May 2019. The
first interpretation is more likely if it was written by a British
writer, the second one if it was written by an American. For 13/05/2019
it is not ambiguous, because there is no 13th month.
 
There is a list of traditional forms available here:
https://en.wikipedia.org/wiki/Date_format_by_country
 
You could pick the most likely formats from this list according to the
origin of your data, try to parse the string with each of them, check
that the month is between 1 and 12, and, if more than one interpretation
remains, throw an error.
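 
The procedure above could be sketched in C++ roughly as follows. This is
only a minimal sketch under stated assumptions: the input is three
slash-separated numbers with a four-digit year, only the day/month/year
and month/day/year orders are tried, and the names Date, valid_date and
parse_slash_date are made up for illustration. It also checks the day
against the month length, which goes slightly beyond the month check
described above.

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>
#include <vector>

struct Date { int year, month, day; };  // hypothetical helper type

// True if (y, m, d) is a real Gregorian calendar date.
bool valid_date(int y, int m, int d) {
    if (m < 1 || m > 12 || d < 1) return false;
    static const int days[] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    bool leap = (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
    int limit = days[m - 1] + ((m == 2 && leap) ? 1 : 0);
    return d <= limit;
}

// Parses "a/b/yyyy", trying both day/month and month/day order.
// Throws if no interpretation is valid, or if two different ones are.
Date parse_slash_date(const std::string& text) {
    int a = 0, b = 0, y = 0, consumed = 0;
    if (std::sscanf(text.c_str(), "%d/%d/%d%n", &a, &b, &y, &consumed) != 3 ||
        consumed != static_cast<int>(text.size()))
        throw std::invalid_argument("not a slash-separated date: " + text);

    std::vector<Date> candidates;
    if (valid_date(y, b, a)) candidates.push_back({y, b, a});            // day/month/year
    if (a != b && valid_date(y, a, b)) candidates.push_back({y, a, b});  // month/day/year

    if (candidates.empty())
        throw std::invalid_argument("no valid interpretation: " + text);
    if (candidates.size() > 1)
        throw std::invalid_argument("ambiguous date: " + text);
    return candidates.front();
}
```

With this sketch, parse_slash_date("13/05/2019") yields 13 May 2019,
while parse_slash_date("05/07/2019") throws because both orders are valid.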
 
If it is for a one-off thing, e.g. you have a long list and want to
parse that once, I'd just use a Tcl script to convert the dates.
 
Christian
Paavo Helde <myfirstname@osa.pri.ee>: Jul 18 09:33AM +0300

On 18.07.2019 0:52, Soviet_Mario wrote:
> I've read a program can be made "a demon" (a TSR) so as a last resort I
> could willy nilly write a small piece of string-to-struct program to
> keep resident.
 
Wow, I have not heard the term "TSR" (terminate and stay resident) used
for a long time! Sweet times! Too bad they have been gone for many years!
 
The idea to create a separate service/daemon for calling an external
program, just because boost::date_time looks too complicated, is just
hilarious! What happened to popen()?
 
Hint: the date/time handling looks complicated because it is. A good
start is to realize that, because of time zones, a single calendar date
lasts around 48 hours worldwide. So based on a datetime stamp alone it is
not possible to say on which date the event happened, even when leaving
relativistic effects and the Julian calendar out of play.
 
I have my own code based on Boost, for parsing dates in multiple
formats, but it only accepts non-ambiguous formats like "2018-03-01".
Anything with slashes is rejected as ambiguous, the caller must figure
them out beforehand.
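 
A strict "reject anything ambiguous" policy of this kind might look like
the sketch below. It is not the Boost-based code mentioned above; it uses
std::regex (C++17 for std::optional), and the names IsoDate and
parse_iso_date are hypothetical. The range check is deliberately coarse
and does not validate the day against the month length.

```cpp
#include <optional>
#include <regex>
#include <string>

struct IsoDate { int year, month, day; };  // hypothetical helper type

// Accepts only "YYYY-MM-DD"; anything else (slashes, dots, missing
// digits) is rejected instead of being guessed at.
std::optional<IsoDate> parse_iso_date(const std::string& text) {
    static const std::regex pattern(R"((\d{4})-(\d{2})-(\d{2}))");
    std::smatch m;
    if (!std::regex_match(text, m, pattern)) return std::nullopt;

    IsoDate d{std::stoi(m[1].str()), std::stoi(m[2].str()), std::stoi(m[3].str())};
    // Coarse sanity check only: month 1..12, day 1..31.
    if (d.month < 1 || d.month > 12 || d.day < 1 || d.day > 31) return std::nullopt;
    return d;
}
```

Here parse_iso_date("2018-03-01") succeeds, while "01/03/2018" comes back
empty and is left for the caller to sort out.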
 
> But I'm far from happy to try an inter-process data exchange.
 
A side note: for portable inter-process date-time exchange one should
only use the ISO datetime UTC format, e.g. "2005-12-31T14:30:25.00173Z".
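 
Producing such a string from C++ can be sketched with the standard
<ctime> functions. The function name to_iso_utc is made up for
illustration, the sketch assumes time_t counts seconds since 1970-01-01
UTC (as on POSIX systems), and it emits whole seconds only, without the
fractional part shown in the example above.

```cpp
#include <ctime>
#include <string>

// Formats a time_t as "YYYY-MM-DDTHH:MM:SSZ" (UTC, whole seconds only).
// Returns an empty string if the value cannot be converted.
std::string to_iso_utc(std::time_t t) {
    const std::tm* utc = std::gmtime(&t);  // note: gmtime uses a shared static buffer
    if (!utc) return std::string();
    char buf[32];
    std::size_t n = std::strftime(buf, sizeof buf, "%Y-%m-%dT%H:%M:%SZ", utc);
    return std::string(buf, n);
}
```

For example, to_iso_utc(std::time(nullptr)) gives the current time in
this form.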
Soviet_Mario <SovietMario@CCCP.MIR>: Jul 18 02:44PM +0200

On 18/07/19 08:33, Paavo Helde wrote:
 
> Wow, I have not heard the term "TSR" (terminate and stay
> resident) used for a long time! Sweet times! Too bad they
> are gone already for many years!
 
:) :) :) LOL :)
I had said : I'm very rusty and old-fashioned-minded :)
 
 
> The idea to create a separate service/daemon for calling an
> external program, just because boost::date_time looks too
> complicated, is just hilarious!
 
Well, maybe. I must admit I have no real idea of the
problems, as I have never actually created a daemon, let
alone tried to communicate with one. The Gambas code in itself
would be trivial, but I suspect a lot of problems at
system-level
 
> What happened to popen()?
 
sorry ? I don't understand what you mean ...
 
 
> Hint: the date/time handling looks complicated because it
> is.
 
yes I know. To restrict the scope, I just want to support
english formats and my "locale" (ita). No more than this.
And to refer to UTC as a frame reference.
 
> multiple formats, but it only accepts non-ambiguous formats
> like "2018-03-01". Anything with slashes is rejected as
> ambiguous, the caller must figure them out beforehand.
 
mmm. Ugly thing to have to deal with erratic user inputs :\
 
 
> A side note: for portable inter-process date-time exchange
> one should only use the ISO datetime UTC format, e.g.
> "2005-12-31T14:30:25.00173Z".
 
Yes, I would like every date to be converted to UTC once it
has been parsed.
 
 
--
1) Resistere, resistere, resistere.
2) Se tutti pagano le tasse, le tasse le pagano tutti
Soviet_Mario - (aka Gatto_Vizzato)
Soviet_Mario <SovietMario@CCCP.MIR>: Jul 18 02:50PM +0200

On 18/07/19 07:06, Christian Gollwitzer wrote:
> more likely if it was written by a British writer, the
> second one if it was written by an American. For 13/05/2019
> it is not ambiguous, because there is no 13th month.
 
yes .... "some" extra data, not deduced from context, might
be provided to the parser to resolve ambiguous cases.
 
 
 
> There is a list of traditional forms available here:
> https://en.wikipedia.org/wiki/Date_format_by_country
 
nice ! TY
 
 
> If it is for a one off thing, e.g. you have a long list and
> want to parse that once, I'd just use a Tcl script to
> convert the dates.
 
Unfortunately I don't know Tcl.
Thanks for the reply.
 
 
--
1) Resistere, resistere, resistere.
2) Se tutti pagano le tasse, le tasse le pagano tutti
Soviet_Mario - (aka Gatto_Vizzato)
James Kuyper <jameskuyper@alumni.caltech.edu>: Jul 18 09:24AM -0400

On 7/18/19 8:44 AM, Soviet_Mario wrote:
...
> yes I know. To restrict the scope, I just want to support
> english formats and my "locale" (ita). No more than this.
> And to refer to UTC as a frame reference.
 
Before you attempt anything that ambitious, I recommend you start small,
with ISO 8601 <https://en.wikipedia.org/wiki/ISO_8601>. It doesn't refer
to a single format, but to a family of closely related formats. For
instance, you can leave out trailing parts, such as the seconds, minutes,
hour, day, or month. You can also use 2019-W15-5 to indicate the 5th day
of the 15th week of 2019, or 2019-123 to refer to the 123rd day of 2019.
 
Just covering all of the variants of ISO 8601 will be a complicated but
well-defined task, and care has been taken to make sure that the
interpretation is never ambiguous. Once you have completed that, you can
decide whether you want to go on to cover other date formats.
 
I predict you'll get tired of dealing with all the complications of date
formats long before you finish.
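 
To give a feel for just one of these variants, here is a minimal C++
sketch that converts an ordinal date such as 2019-123 into year, month
and day. It handles only the exact "YYYY-DDD" form (C++17 for
std::optional); the names CalendarDate and parse_ordinal_date are
invented for this example, and week dates like 2019-W15-5 are not covered.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>

struct CalendarDate { int year, month, day; };  // hypothetical helper type

// Parses an ISO 8601 ordinal date "YYYY-DDD" into a calendar date.
std::optional<CalendarDate> parse_ordinal_date(const std::string& text) {
    if (text.size() != 8 || text[4] != '-') return std::nullopt;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (i == 4) continue;  // the hyphen was checked above
        if (!std::isdigit(static_cast<unsigned char>(text[i]))) return std::nullopt;
    }
    int year = std::stoi(text.substr(0, 4));
    int ordinal = std::stoi(text.substr(5, 3));

    bool leap = (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    if (ordinal < 1 || ordinal > (leap ? 366 : 365)) return std::nullopt;

    static const int days[] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    for (int m = 0; m < 12; ++m) {
        int len = days[m] + ((m == 1 && leap) ? 1 : 0);
        if (ordinal <= len) return CalendarDate{year, m + 1, ordinal};
        ordinal -= len;  // skip past this month
    }
    return std::nullopt;  // not reached: ordinal was range-checked above
}
```

With this sketch, "2019-123" maps to 3 May 2019.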
Keith Thompson <kst-u@mib.org>: Jul 18 01:41PM -0700

> On 18/07/19 08:33, Paavo Helde wrote:
[...]
> system-level
 
>> What happened to popen()?
 
> sorry ? I don't understand what you mean ...
 
If you're on a UNIX-like system, run the "man popen" command.
 
popen() is a function, defined by POSIX (but not by the C or C++
standard) that lets you invoke an external command and capture
its output. Using it is *much* simpler than creating a daemon and
setting up a communication channel. I believe it's also available
under Windows.
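 
A minimal sketch of what that looks like from C++, assuming a POSIX
system (the Windows C runtime spells the functions _popen and _pclose).
The wrapper name run_and_capture is made up for illustration.

```cpp
#include <stdio.h>   // popen, pclose, fgets (POSIX)
#include <stdexcept>
#include <string>

// Runs `command` through the shell and returns everything it
// writes to its standard output.
std::string run_and_capture(const std::string& command) {
    FILE* pipe = popen(command.c_str(), "r");
    if (!pipe) throw std::runtime_error("popen failed");

    std::string output;
    char buffer[256];
    while (fgets(buffer, sizeof buffer, pipe)) output += buffer;

    pclose(pipe);  // waits for the command to finish
    return output;
}
```

For instance, run_and_capture("date -u +%Y-%m-%dT%H:%M:%SZ") returns the
output of the date command as a string. Since the command goes through
the shell, it should not be built from untrusted input.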
 
--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
Will write code for food.
void Void(void) { Void(); } /* The recursive call of the void */