Christian Gollwitzer <auriocus@gmx.de>: Jul 18 07:06AM +0200

On 17.07.19 at 23:55, Soviet_Mario wrote:
> format.
> I need a parsing function aware of the human-written date-time.
> But I'll also explore RegEx patterns to define narrower contexts ...

There are attempts to do that. For example, there is a function in Tcl which tries to guess the format from the data; it's described here:

https://www.tcl.tk/man/tcl8.6/TclCmd/clock.htm#M80

But you'll get an "opinion" at best, because the problem is provably impossible. Even humans cannot solve it perfectly. For example, if you see

05/07/19

it could be the 5th of July 2019 or the 7th of May 2019. The first interpretation is more likely if it was written by a British writer, the second if it was written by an American. 13/05/2019 is not ambiguous, because there is no 13th month.

There is a list of traditional forms available here:

https://en.wikipedia.org/wiki/Date_format_by_country

You could pick the most likely formats from this list according to the origin of your data, try to parse the string with each of them, check that the month is between 1 and 12, and throw an error if more than one interpretation remains.

If it is for a one-off thing, e.g. you have a long list and want to parse it once, I'd just use a Tcl script to convert the dates.

Christian |
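The "try each candidate format, keep it only if exactly one interpretation survives" idea from the message above could be sketched in C++17 roughly as follows. This is only an illustration under assumptions: the names `Date`, `Order` and `parse_slashed` are made up for this sketch, only numeric `a/b/c` input is handled, and two-digit years are assumed to mean 20xx.

```cpp
#include <cassert>
#include <cstdio>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Plain calendar date (hypothetical type for this sketch).
struct Date {
    int year, month, day;
    bool operator==(const Date& o) const {
        return year == o.year && month == o.month && day == o.day;
    }
};

// Candidate field orders for "a/b/c" input: day/month/year or month/day/year.
enum class Order { DMY, MDY };

static bool is_leap(int y) { return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0; }

static bool is_valid(const Date& d) {
    static const int days[] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    if (d.month < 1 || d.month > 12 || d.day < 1) return false;
    int limit = days[d.month - 1] + ((d.month == 2 && is_leap(d.year)) ? 1 : 0);
    return d.day <= limit;
}

// Interpret three numbers under one order. Assumption: a two-digit year means 20xx.
static std::optional<Date> interpret(int a, int b, int c, Order o) {
    if (c < 0) return std::nullopt;
    int year = c < 100 ? 2000 + c : c;
    Date d = (o == Order::DMY) ? Date{year, b, a} : Date{year, a, b};
    if (!is_valid(d)) return std::nullopt;
    return d;
}

// Try every candidate order; succeed only if exactly one distinct date survives.
Date parse_slashed(const std::string& text, const std::vector<Order>& candidates) {
    assert(!candidates.empty());  // precondition: at least one format to try
    int a, b, c;
    char tail;
    // The trailing %c must NOT match, otherwise there is junk after the date.
    if (std::sscanf(text.c_str(), "%d/%d/%d%c", &a, &b, &c, &tail) != 3)
        throw std::invalid_argument("not a numeric a/b/c date: " + text);
    std::vector<Date> hits;
    for (Order o : candidates) {
        if (auto d = interpret(a, b, c, o)) {
            bool seen = false;
            for (const Date& h : hits) seen = seen || (h == *d);
            if (!seen) hits.push_back(*d);
        }
    }
    if (hits.empty()) throw std::invalid_argument("no valid interpretation: " + text);
    if (hits.size() > 1) throw std::invalid_argument("ambiguous date: " + text);
    return hits.front();
}
```

With both orders as candidates, "13/05/2019" resolves to 13 May 2019, while "05/07/19" throws because both readings are valid. Passing a single order (say, only `Order::DMY` for data known to be British or Italian) is the "extra data" that resolves the ambiguity.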
Paavo Helde <myfirstname@osa.pri.ee>: Jul 18 09:33AM +0300

On 18.07.2019 0:52, Soviet_Mario wrote:
> I've read a program can be made "a demon" (a TSR) so as a last resort I
> could willy nilly write a small piece of string-to-struct program to
> keep resident.

Wow, I have not heard the term "TSR" (terminate and stay resident) used for a long time! Sweet times! Too bad they have been gone for many years already!

The idea of creating a separate service/daemon for calling an external program, just because boost::date_time looks too complicated, is just hilarious! What happened to popen()?

Hint: date/time handling looks complicated because it is. A good start is to realize that a single date lasts around 48 hours. So based on a datetime stamp alone it is not possible to say on which date the event happened, even when leaving relativistic effects and the Julian calendar out of play.

I have my own code based on Boost for parsing dates in multiple formats, but it only accepts non-ambiguous formats like "2018-03-01". Anything with slashes is rejected as ambiguous; the caller must figure those out beforehand.

> But I'm far from happy to try an inter-process data exchange.

A side note: for portable inter-process date-time exchange one should only use the ISO datetime UTC format, e.g. "2005-12-31T14:30:25.00173Z". |
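As a rough illustration of the ISO UTC exchange format recommended above, a `time_t` can be rendered with the standard `std::gmtime` and `std::strftime`. The function name `to_iso_utc` is made up for this sketch, fractional seconds are dropped because `time_t` normally has one-second resolution, and the usual POSIX epoch (1970-01-01T00:00:00Z) is assumed.

```cpp
#include <cassert>
#include <ctime>
#include <string>

// Format a time_t as an ISO 8601 UTC timestamp such as "2005-12-31T14:30:25Z".
// Assumption: whole seconds only; sub-second precision would need another clock type.
std::string to_iso_utc(std::time_t t) {
    // std::gmtime converts to broken-down UTC time. It returns a pointer to
    // shared static storage, so this sketch is not thread-safe.
    const std::tm* utc = std::gmtime(&t);
    assert(utc != nullptr);
    char buf[32];
    std::size_t n = std::strftime(buf, sizeof buf, "%Y-%m-%dT%H:%M:%SZ", utc);
    return std::string(buf, n);
}
```

The trailing `Z` marks the value as UTC, so the receiving process does not need to know the sender's time zone or locale.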
Soviet_Mario <SovietMario@CCCP.MIR>: Jul 18 02:44PM +0200

On 18/07/19 08:33, Paavo Helde wrote:
> Wow, I have not heard the term "TSR" (terminate and stay
> resident) used for a long time! Sweet times! Too bad they
> are gone already for many years!

:) :) :) LOL :)
As I said: I'm very rusty and old-fashioned-minded :)

> The idea to create a separate service/daemon for calling an
> external program, just because boost::date_time looks too
> complicated, is just hilarious!

Well, maybe. I must admit I have no real idea of the problems, as I have never actually created a daemon, let alone tried to communicate with one. The Gambas code in itself would be trivial, but I suspect a lot of problems at the system level.

> What happened to popen()?

Sorry? I don't understand what you mean ...

> Hint: the date/time handling looks complicated because it
> is.

Yes, I know. To restrict the scope, I just want to support English formats and my "locale" (ita). No more than this. And to use UTC as the frame of reference.

> multiple formats, but it only accepts non-ambiguous formats
> like "2018-03-01". Anything with slashes is rejected as
> ambiguous, the caller must figure them out beforehand.

Mmm. It's an ugly thing to have to deal with erratic user input :\

> A side note: for portable inter-process date-time exchange
> one should only use the ISO datetime UTC format, e.g.
> "2005-12-31T14:30:25.00173Z".

Yes, I would like every date to be converted to UTC once parsed.

--
1) Resistere, resistere, resistere.
2) Se tutti pagano le tasse, le tasse le pagano tutti
Soviet_Mario - (aka Gatto_Vizzato) |
Soviet_Mario <SovietMario@CCCP.MIR>: Jul 18 02:50PM +0200

On 18/07/19 07:06, Christian Gollwitzer wrote:
> more likely if it was written by a British writer, the
> second one if it was written by an American. For 13/05/2019
> it is not ambiguous, because there is no 13th month.

Yes .... "some" extra data, not deduced from context, might be provided to the parser to resolve ambiguous cases.

> There is a list of traditional forms available here:
> https://en.wikipedia.org/wiki/Date_format_by_country

Nice! TY

> If it is for a one off thing, e.g. you have a long list and
> want to parse that once, I'd just use a Tcl script to
> convert the dates.

Unfortunately I don't know Tcl.

Thanks for the reply.

--
1) Resistere, resistere, resistere.
2) Se tutti pagano le tasse, le tasse le pagano tutti
Soviet_Mario - (aka Gatto_Vizzato) |
James Kuyper <jameskuyper@alumni.caltech.edu>: Jul 18 09:24AM -0400

On 7/18/19 8:44 AM, Soviet_Mario wrote:
...
> yes I know. To restrict the scope, I just want to support
> english formats and my "locale" (ita). No more than this.
> And to refer to UTC as a frame reference.

Before you attempt anything that ambitious, I recommend you start small: ISO 8601 <https://en.wikipedia.org/wiki/ISO_8601> doesn't refer to a single format, but a family of closely related formats. For instance, you can leave out trailing parts, such as the seconds, minutes, hour, day, or month. You can also use 2019-W15-5 to indicate the 5th day of the 15th week of 2019, or 2019-123 to refer to the 123rd day of 2019.

Just covering all of the variants of ISO 8601 will be a complicated but well defined task, and careful attention has been taken to make sure that the interpretation is never ambiguous. Once you have completed that, you can decide whether you want to go on to cover other date formats. I predict you'll get tired of dealing with all the complications of date formats long before you finish. |
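To give a feel for one of the ISO 8601 variants mentioned above, here is a minimal C++17 sketch that converts an ordinal date such as "2019-123" into year, month and day. The names `Ymd` and `parse_ordinal` are invented for this illustration, only the extended `YYYY-DDD` form is handled, and week dates like 2019-W15-5 are deliberately left out.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>

// Calendar date produced by the conversion (hypothetical type for this sketch).
struct Ymd { int year, month, day; };

static bool is_leap(int y) { return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0; }

// Parse an ISO 8601 ordinal date "YYYY-DDD" (day of year, 001..365 or 366).
// Returns std::nullopt for anything that is not exactly that shape.
std::optional<Ymd> parse_ordinal(const std::string& text) {
    if (text.size() != 8 || text[4] != '-') return std::nullopt;
    for (std::size_t i = 0; i < text.size(); ++i)
        if (i != 4 && !std::isdigit(static_cast<unsigned char>(text[i])))
            return std::nullopt;

    int year = std::stoi(text.substr(0, 4));
    int doy = std::stoi(text.substr(5, 3));
    if (doy < 1 || doy > (is_leap(year) ? 366 : 365)) return std::nullopt;

    const int days[] = {31, is_leap(year) ? 29 : 28, 31, 30, 31, 30,
                        31, 31, 30, 31, 30, 31};
    int month = 0;
    // Subtract whole months until the remainder fits inside the current month.
    while (doy > days[month]) doy -= days[month++];
    assert(month < 12);  // guaranteed by the range check on doy above
    return Ymd{year, month + 1, doy};
}
```

For "2019-123" this yields 3 May 2019. Each further variant (week dates, truncated forms, the basic format without hyphens) needs its own small routine of this kind, which is where the workload described above comes from.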
Keith Thompson <kst-u@mib.org>: Jul 18 01:41PM -0700

> On 18/07/19 08:33, Paavo Helde wrote:
[...]
> system-level
>> What happened to popen()?
> sorry ? I don't understand what you mean ...

If you're on a UNIX-like system, run the "man popen" command.

popen() is a function, defined by POSIX (but not by the C or C++ standard), that lets you invoke an external command and capture its output. Using it is *much* simpler than creating a daemon and setting up a communication channel.

I believe it's also available under Windows.

--
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
Will write code for food.
void Void(void) { Void(); } /* The recursive call of the void */ |
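For readers who have not used it, a minimal sketch of the popen() approach described above might look like this on a POSIX system. The helper name `run_and_capture` is made up for this example, and on Windows the corresponding calls are spelled `_popen` and `_pclose`.

```cpp
#include <stdio.h>    // popen/pclose are POSIX, declared in <stdio.h>
#include <stdexcept>
#include <string>

// Run a shell command and return everything it wrote to standard output.
// Throws if the command cannot be started or exits with a non-zero status.
std::string run_and_capture(const std::string& command) {
    FILE* pipe = popen(command.c_str(), "r");  // "r": read the child's stdout
    if (!pipe) throw std::runtime_error("popen failed for: " + command);

    std::string output;
    char buf[256];
    while (fgets(buf, sizeof buf, pipe)) output += buf;

    // pclose waits for the child and reports its termination status.
    int status = pclose(pipe);
    if (status != 0) throw std::runtime_error("command failed: " + command);
    return output;
}
```

A call such as `run_and_capture("date -u +%Y-%m-%dT%H:%M:%SZ")` would return the current UTC time as text. An external date-parsing helper could be invoked the same way, with no daemon or communication channel to set up. Note that the command string goes through the shell, so untrusted user input should not be pasted into it unescaped.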