- "Inside STL: The vector" by Raymond Chen - 1 Update
- [OT] Help for a RegEx - 1 Update
Lynn McGuire <lynnmcguire5@gmail.com>: Aug 03 04:35PM -0500

"Inside STL: The vector" by Raymond Chen
https://devblogs.microsoft.com/oldnewthing/20230802-00/?p=108524

"The C++ language comes with a standard library, and although implementations are welcome to implement the library types in whatever manner they choose, they are constraints imposed by the standard which often force one of a small number of possible implementations."

"The std::vector is one of those types which is constrained to the point that there's really only one viable implementation."

Lynn
Paavo Helde <eesnimi@osa.pri.ee>: Aug 03 11:36AM +0300

On 03.08.2023 01:55, MarioCPPP wrote:
> On 02/08/23 20:15, Paavo Helde wrote:
>> For extracting the content of unknown pages
> they are not unknown : they are .odt exported as HTML, by LibreOffice.

Well, that makes things easier. If we can exclude some complications like CDATA, HTML comments and nested <p> tags, then it might indeed be possible to use a regex to extract some content.

Be sure to use a non-greedy regex so that the match stops at the closest end tag </p>, and the equivalent of /s or dotall so that '.' matches newlines (or use (.|\r|\n) instead of the dot). This seems to work at first glance:

    grep -Po '<p (.|\r|\n)*?</p>' abc.xhtml

(-P is needed for grep to support non-greedy search).
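Since this is a C++ group, the same idea can be sketched with std::regex. This is only a minimal illustration under the assumptions stated above (no CDATA, no HTML comments, no nested <p> tags). The helper name extract_paragraphs is hypothetical and not from the thread. The default ECMAScript grammar has no dotall flag, so [\s\S] is used in place of the dot, and *? keeps the match non-greedy. The whole document is searched as one string, which matters because grep normally matches line by line (unless given -z), so a paragraph spanning several lines may be missed by the command above.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): returns every "<p ...>...</p>"
// span found in html. Like the grep pattern, it requires a space after "<p",
// so a bare "<p>" without attributes is not matched.
std::vector<std::string> extract_paragraphs(const std::string& html) {
    // [\s\S] matches any character including newlines;
    // *? is non-greedy, so each match ends at the closest "</p>".
    static const std::regex paragraph(R"(<p [\s\S]*?</p>)");

    std::vector<std::string> result;
    for (std::sregex_iterator it(html.begin(), html.end(), paragraph), end;
         it != end; ++it) {
        result.push_back(it->str());
    }
    return result;
}
```

Calling extract_paragraphs on the contents of abc.xhtml, read into a std::string, would return one string per paragraph. For very large files or less regular markup, a real HTML/XML parser is likely the safer choice.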