Thursday, August 3, 2023

Digest for comp.lang.c++@googlegroups.com - 2 updates in 2 topics

Lynn McGuire <lynnmcguire5@gmail.com>: Aug 03 04:35PM -0500

"Inside STL: The vector" by Raymond Chen
https://devblogs.microsoft.com/oldnewthing/20230802-00/?p=108524
 
"The C++ language comes with a standard library, and although
implementations are welcome to implement the library types in whatever
manner they choose, there are constraints imposed by the standard which
often force one of a small number of possible implementations."
 
"The std::vector is one of those types which is constrained to the point
that there's really only one viable implementation."
 
Lynn

Paavo Helde <eesnimi@osa.pri.ee>: Aug 03 11:36AM +0300

03.08.2023 01:55 MarioCPPP wrote:
> On 02/08/23 20:15, Paavo Helde wrote:
 
>> For extracting the content of unknown pages
 
> they are not unknown: they are .odt files exported as HTML by LibreOffice.
 
Well, that makes things easier. If we can exclude complications such as
CDATA, HTML comments, and nested <p> tags, then it might indeed be
possible to use a regex to extract some content.
 
Be sure to use a non-greedy regex to match the closest end tag </p>, and
the equivalent of /s or dotall for '.' to match newlines (or use
(.|\r|\n) instead of dot). This seems to work at first glance:
 
grep -Pzo '<p (.|\r|\n)*?</p>' abc.xhtml
 
(-P is needed for grep to support non-greedy search; -z makes grep treat
the whole file as a single record, so that a match can span several lines).
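 
Since this is a C++ group, the same idea can be sketched with std::regex. This is only a minimal sketch under the same assumptions as above (no CDATA, no HTML comments, no nested <p> tags). The function name extract_paragraphs is hypothetical and not from this thread. The ECMAScript grammar used by std::regex has no dotall flag, so [\s\S] stands in for "any character including newlines", and *? keeps the match non-greedy.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (name chosen for illustration only): returns every
// "<p ...>...</p>" span found in the input, in document order.
// Assumes no nested <p> tags, no CDATA sections and no HTML comments.
std::vector<std::string> extract_paragraphs(const std::string& html) {
    // "<p " followed by the shortest possible run of any characters
    // (including \r and \n) up to the closest "</p>".
    static const std::regex re(R"(<p [\s\S]*?</p>)");

    std::vector<std::string> result;
    for (std::sregex_iterator it(html.begin(), html.end(), re), end;
         it != end; ++it) {
        result.push_back(it->str());
    }
    return result;
}
```

Like the grep pattern, this requires a space after "<p", so a bare <p> tag without attributes is not matched. For very large inputs some std::regex implementations can become slow or run out of stack on patterns like this, so a real HTML/XML parser is still the safer choice.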