Friday, August 4, 2023

Digest for comp.lang.c++@googlegroups.com - 5 updates in 2 topics

Lynn McGuire <lynnmcguire5@gmail.com>: Aug 03 07:42PM -0500

"Inside STL: The string" by Raymond Chen
https://devblogs.microsoft.com/oldnewthing/20230803-00/?p=108532
 
"You might think that a std::string (and all of its friends in the
std::basic_string family) are basically a vector of characters
internally. But strings are organized differently due to specific
optimizations permitted for strings but not for vectors."
 
I've always thought the internal buffer was a cool idea.
 
Lynn
Paavo Helde <eesnimi@osa.pri.ee>: Aug 04 09:17AM +0300

04.08.2023 03:42 Lynn McGuire wrote:
> internally. But strings are organized differently due to specific
> optimizations permitted for strings but not for vectors."
 
> I've always thought the internal buffer was a cool idea.
 
You mean small string optimization? Yes, that's nifty. Still, I think it
could be made better.
 
Current mainstream (64-bit) implementations use an SSO buffer of 16 bytes.
However, when a string is used inside a union which is larger, it could
well use a larger buffer, but there is no way to set this up.
 
A polymorphic variant class which I once made is 24 bytes. The last byte
in the class is the variant type tag, which is chosen to be 0 for small
strings, so that I can store zero-terminated small UTF-8 strings of up
to 23 bytes in it. I do not record the string length separately for
small strings as it is cheap to just calculate it by strlen() whenever
needed.
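
The layout described above could be sketched roughly as follows. This is only an illustration under assumptions: the class name, member names, and the choice to reject strings with embedded NULs are all hypothetical, since the post does not show the actual class. It stores only the small-string case and omits the other variant alternatives.

```cpp
#include <cstddef>
#include <cstring>
#include <string_view>

// Hypothetical 24-byte variant; only the small-string alternative is shown.
class SmallVariant {
public:
    static constexpr std::size_t kSize = 24;
    static constexpr std::size_t kMaxSmall = kSize - 1;  // 23 usable bytes
    static constexpr char kTagSmallString = 0;           // tag doubles as the NUL

    // Default state: an empty small string (all bytes zero).
    SmallVariant() { std::memset(bytes_, 0, kSize); }

    // Stores s inline if it fits; returns false and leaves the object
    // unchanged otherwise. Embedded NULs are rejected because the length
    // is later recovered with strlen().
    bool set_small_string(std::string_view s) {
        if (s.size() > kMaxSmall) return false;
        if (s.find('\0') != std::string_view::npos) return false;
        std::memset(bytes_, 0, kSize);  // also sets the tag byte to 0
        if (!s.empty()) std::memcpy(bytes_, s.data(), s.size());
        return true;
    }

    // The last byte is the type tag; 0 means "small string".
    bool is_small_string() const { return bytes_[kSize - 1] == kTagSmallString; }

    const char* c_str() const { return bytes_; }

    // No stored length: computed on demand, as the post describes.
    std::size_t size() const { return std::strlen(bytes_); }

private:
    char bytes_[kSize];
};

static_assert(sizeof(SmallVariant) == 24, "variant should stay at 24 bytes");
```

For a 23-byte string, bytes 0 to 22 hold the characters and the zero tag in byte 23 serves as the terminator, so strlen() still stops in the right place.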
Lynn McGuire <lynnmcguire5@gmail.com>: Aug 04 02:55PM -0500

On 8/4/2023 1:17 AM, Paavo Helde wrote:
> to 23 bytes in it. I do not record the string length separately for
> small strings as it is cheap to just calculate it by strlen() whenever
> needed.
 
We compress large strings of more than 1,000 bytes so this is
interesting to me. Some of our strings go up to a GB in size.
 
Lynn
MarioCPPP <NoliMihiFrangereMentulam@libero.it>: Aug 04 02:01AM +0200

On 03/08/23 10:36, Paavo Helde wrote:
> <p (.|\r|\n)*?</p>
 
Interesting. I tried this one and it detects most of the
paragraphs, except those that do not have attributes
within the <p> opening tag.
 
Is there a way to also include those?
 
 
--
1) Resist, resist, resist.
2) If everyone pays their taxes, the taxes get paid by everyone
MarioCPPP
Ben Bacarisse <ben.usenet@bsb.me.uk>: Aug 04 01:26AM +0100


> Interesting. I tried this one and it detects most of the paragraphs, except
> those that do not have attributes within the <p> opening tag.
 
> Is there a way to also include those?
 
PH's regex insists on a space after the "<p". Whilst this is not
exactly the same as requiring an attribute it will be effectively the
same. You could try
 
<p[ >](.|\r|\n)*?</p>
 
but I can't stress enough -- none of this can really work in all cases.
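
To see the difference between the two patterns, here is a small sketch using std::regex with its default ECMAScript grammar. The helper name count_matches and the sample input are made up for illustration; the thread does not say which regex engine was actually used, so behaviour elsewhere may differ.

```cpp
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Hypothetical helper: counts non-overlapping matches of pattern in text.
std::size_t count_matches(const std::string& text, const std::string& pattern) {
    const std::regex re(pattern);  // ECMAScript grammar by default
    return static_cast<std::size_t>(std::distance(
        std::sregex_iterator(text.begin(), text.end(), re),
        std::sregex_iterator()));
}

// The original pattern: requires a space right after "<p".
const std::string kWithSpace = R"(<p (.|\r|\n)*?</p>)";

// The suggested pattern: accepts a space or '>' right after "<p".
const std::string kSpaceOrClose = R"(<p[ >](.|\r|\n)*?</p>)";

// Sample input (made up): one <p> with an attribute, one bare <p>,
// and one where a newline follows "<p".
const std::string kSample =
    "<p class=\"a\">one</p>\n"
    "<p>two\nlines</p>\n"
    "<p\nid=\"x\">three</p>";
```

With this sample, count_matches(kSample, kWithSpace) finds only the first paragraph, while count_matches(kSample, kSpaceOrClose) also finds the bare <p>. Neither pattern matches the third paragraph, where a newline follows "<p", which is one small instance of why this approach cannot cover every case.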
 
--
Ben.