soft and program: Digest for comp.lang.c++@googlegroups.com


Is this undefined behavior? - 2 Updates
Augments and variable length data forms - 4 Updates
Fake switch or fake loop only to break - 4 Updates
Fake switch or fake loop only to break - 1 Update
DCAS-atomic - 3 Updates

"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Jun 15 03:52PM -0700

On 6/15/2020 9:44 AM, Juha Nieminen wrote:
> pointers, requiring you to manually free the allocated memory
> (although it may well be that you deliberately extracted this code
> from its RAII context to simplify it for the sake of example).

I need at least one raii object that can wrap the existing code, to call
header_alloc on ctor, and header_free on dtor. The power of RAII is that
it can most likely be adapted to existing explicit apis, create/destroy.
Automate it. One of my favorite examples is good ol' ScopeGuard. :^)
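
A minimal sketch of the kind of wrapper described above. The original header_alloc and header_free are not shown in this digest, so the `header` struct and both function signatures below are hypothetical stand-ins; only the constructor/destructor pairing is the point.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-ins for the explicit create/destroy API in the thread.
struct header {
    std::size_t size;
};

inline header* header_alloc(std::size_t size) {
    header* h = static_cast<header*>(std::malloc(sizeof(header)));
    if (h) h->size = size;
    return h;
}

inline void header_free(header* h) {
    std::free(h);  // free(nullptr) is a no-op, so a failed alloc is safe here
}

// RAII wrapper: header_alloc in the constructor, header_free in the destructor.
class header_guard {
public:
    explicit header_guard(std::size_t size) : h_(header_alloc(size)) {}
    ~header_guard() { header_free(h_); }

    // Non-copyable, so the header is freed exactly once.
    header_guard(const header_guard&) = delete;
    header_guard& operator=(const header_guard&) = delete;

    header* get() const { return h_; }
    header* operator->() const { assert(h_); return h_; }
    explicit operator bool() const { return h_ != nullptr; }

private:
    header* h_;
};
```

With this in place, a function can return from anywhere and the header is still released when the guard goes out of scope. A std::unique_ptr with a custom deleter would probably do the same job with less code.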

> functions to be exited from pretty much anywhere).

> For very small programs it may be completely fine, though. No need
> to over-engineer it if you never intend to use it in a larger program.

RAII would not hurt here; imagine that the code is as it is. Well, we
can wrap it up. Imvho, it's a nice, convenient feature of C++.

"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Jun 15 04:04PM -0700

On 6/15/2020 3:52 PM, Chris M. Thomasson wrote:
>> to over-engineer it if you never intend to use it in a larger program.

> RAII would not hurt here; imagine that the code is as it is. Well, we
> can wrap it up. Imvho, it's a nice, convenient feature of C++.
For some reason this reminds me of some of my older memory allocators
where the header was aligned on a large boundary. I could do it two
ways. One was round down to get at the header, and another experiment
rounded down then subtracted the size of the header.

Take a cache line allocator, if the header fits in a cache line, then it
can be the first element in an array of lines.
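
Both ways of finding the header can be sketched roughly as follows. The 4096-byte block alignment, the `block_header` layout and the function names are assumptions made for illustration; the original allocators are not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed block alignment; the masking trick needs a power of two.
constexpr std::uintptr_t kBlockAlign = 4096;

// Hypothetical header layout.
struct block_header {
    std::size_t object_size;
};

// Round an address down to a multiple of 'align' (a power of two).
inline std::uintptr_t round_down(std::uintptr_t addr, std::uintptr_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return addr & ~(align - 1);
}

// Way 1: the header sits at the start of the aligned block.
inline block_header* header_at_block_start(void* p) {
    auto addr = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<block_header*>(round_down(addr, kBlockAlign));
}

// Way 2: the header sits just below the aligned boundary,
// so round down and then subtract the size of the header.
inline block_header* header_before_block(void* p) {
    auto addr = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<block_header*>(
        round_down(addr, kBlockAlign) - sizeof(block_header));
}
```

Either way, any pointer into the block leads back to its header with a mask and at most one subtraction, with no per-object bookkeeping.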

Augments and variable length data forms

Bart <bc@freeuk.com>: Jun 15 09:53PM +0100

> if (fh)
> {
> fread(d, 1, sizeof(data), fh);

Here, you wouldn't normally know the size of 'data' (assuming this is
one of your new kinds of structs). The size depends on the strings it
contains.

Let me ask: if the block of data you've read contains N pairs of
lengths and strings, once it is read into your SDay structure (which
let's say is defined, conveniently, with N pairs of length and strings
too), how does the program locate the offset of the i'th
length (the offset of the i'th string will be 4 bytes further on)?

(I say offset, but you seem to have dropped the idea of using structs
with multiple strings, for an array of single-string elements, using a
flat representation with variable-length elements.)

What magic is performed to do that? And if there is no magic, you just
have to schlepp along the data just as I showed in my getstrn()
function, then what is the point of creating an extensive language
feature just for that one specific data format, out of dozens of
possible such layouts?

In any case, as I understand C++, you can achieve this anyway in that
language; that is, access the fields as though they were at the same
fixed offsets of a normal struct, and adapt that for a number of
different formats. But even without C++, any ordinary language can
access the same; see my example below.

> {
> printf("Memory: %d %04d %s\n", i, dMem->length, dMem->name);
> printf("Disk: %d %04d %s\n", i, dDisk->length, dDisk->name);

Don't forget that strings are supposed to be non-zero-terminated; this
needs more work.

> }
> }
> }

This is another example using the code I posted previously. First, to
create an external file containing the binary data, I used this script:

strings:=("One","Two","Three","Four")

f:=createfile("data")
forall s in strings do
outlong(f,s.len)
print @f,s
od
outlong(f,0xFFFFFFFF)

closefile(f)

This produces this binary file (might wrap)

0000: 03 00 00 00 4f 6e 65 03 00 00 00 54 77 6f 05 00  [....One....Two..]
0010: 00 00 54 68 72 65 65 04 00 00 00 46 6f 75 72 ff  [..Three....Four.]
0020: ff ff ff                                         [...]

Now I can read it with this C code, using the routines I've posted
(tweaked for non-zero-terminated), and a routine 'readfile', not shown:

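#include <stdio.h>

/* readfile, countstrings, getlength and getstring are the routines
   mentioned above; their definitions are not shown here. */
int main(void) {
    char* s;
    char* data;
    int nstrings, length;

    data = readfile("data");

    nstrings = countstrings(data);

    for (int i=1; i<=nstrings; ++i) {
        length = getlength(data,i);
        s = getstring(data,i);

        printf("%d: %04d \"%.*s\"\n",i,length,length,s);
    }
}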

It displays:

1: 0003 "One"
2: 0003 "Two"
3: 0005 "Three"
4: 0004 "Four"

Determining the size and/or end of the data is still an issue, something
you've glossed over, but I've had to deal with it because mine is actual
working code, unlike yours. So I used a sentinel still.
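
The routines called in main are not reproduced in this digest. As a rough sketch only, they could look something like the following for the format in the hex dump: a 32-bit length, then that many bytes, repeated, with 0xFFFFFFFF as the sentinel. The names countstrings, getlength and getstring come from the post; the bodies, the helpers read_length and find_record, and the assumption that file and host share the same byte order are guesses.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Sentinel length that ends the table, as in the hex dump.
constexpr std::uint32_t kEndMarker = 0xFFFFFFFFu;

// Read a 32-bit length without assuming alignment.
// Assumes the data uses the host's byte order (little-endian in the dump).
inline std::uint32_t read_length(const char* p) {
    std::uint32_t n;
    std::memcpy(&n, p, sizeof n);
    return n;
}

// Walk to the i'th record (1-based, as in the main above).
// Each step skips one length field plus the bytes it announces.
inline const char* find_record(const char* data, int i) {
    assert(i >= 1);
    const char* p = data;
    for (int k = 1; k < i; ++k)
        p += sizeof(std::uint32_t) + read_length(p);
    return p;
}

// Count records by walking until the sentinel length is found.
inline int countstrings(const char* data) {
    int n = 0;
    for (const char* p = data; read_length(p) != kEndMarker;
         p += sizeof(std::uint32_t) + read_length(p))
        ++n;
    return n;
}

inline int getlength(const char* data, int i) {
    return static_cast<int>(read_length(find_record(data, i)));
}

// The string bytes start right after the length; they are not zero-terminated.
inline const char* getstring(const char* data, int i) {
    return find_record(data, i) + sizeof(std::uint32_t);
}
```

Every access to record i walks over the i-1 records before it, which is the "schlepping along the data" cost discussed above.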

rick.c.hodgin@gmail.com: Jun 15 03:15PM -0700

On Monday, June 15, 2020 at 4:53:58 PM UTC-4, Bart wrote:

> Here, you wouldn't normally know the size of 'data' (assuming this is
> one of your new kinds of structs). The size depends on the strings it
> contains.

fseek(fh, 0, SEEK_END);
long size = ftell(fh); // ftell returns long
fseek(fh, 0, SEEK_SET);

SDay* d = (SDay*)malloc(size);

And if you're receiving data over a network, you know the packet size
by one of a multitude of ways.

> let's say is defined, conveniently, with N pairs of length and strings
> too), how does the program locate the offset of the length of the i'th
> length (the offset of the i'th string will be 4 bytes further on)?

I don't know where the disconnect is, but the purpose of defining this
variable length structure is so that you can pass data of varying
lengths, or no data, as a data object or an array element if in a list,
and process it without having to jump through coding hoops to do so.
The compiler hides all of the access code. You just deal with members
as though they were fixed.

The only requirement is, the lengths defining variable portions have to
be defined in the structure before their variable portion counter-part,
so that the length is a constant.

> (I say offset, but you seem to have dropped the idea of using structs
> with multiple strings, for an array of single-string elements, using a
> flat representation with variable-length elements.)

I used your example back to you. The important part of the struct
definition is the two-element pair:

int length;
char variable_data[0..length];

You can have as many of those as you want in your struct.

> function, that what is the point of creating an extensive language
> feature just for that one specific data format, out of dozens of
> possible such layouts?

Because it is fast. You pass data, and it's already ready to be read.
If I know I'm receiving an email, I can define an SEmail struct, with
all of the variable components it might have, and then access them in
a fixed manner with zero parsing and immediate access to all members.

> fixed offsets of a normal struct, and adapt that for a number of
> different formats. But even without C++, any ordinary language can
> access the same; see my example below.

I know of no way to do that. And especially not with the simple syntax
this extension has.

> Determining the size and/or end of the data is still an issue, something
> you've glossed over, but I've had to deal with it because mine is actual
> working code, unlike yours. So I used a sentinel still.

Okay, code up your example with this form:

struct SAttachment
{
    int type;
    int length;
    char attachment_data[0..length];
};

struct SEmail
{
    int length_from;
    int length_to;
    int length_cc;
    int length_bcc;
    int length_subject;
    int length_header;
    int length_body;
    int attachmentCount;

    char from    [0..length_from];
    char to      [0..length_to];
    char cc      [0..length_cc];
    char bcc     [0..length_bcc];
    char subject [0..length_subject];
    char header  [0..length_header];
    char body    [0..length_body];

    // Label is an offset to a location in the structure, but
    // one that doesn't consume data space.
    label attachments; // Begins a series of SAttachment
                       // structs of attachmentCount
                       // elements long.
};

You pass the SEmail struct over the network, or write it to disk and
read it back in later. When you do, everything is already defined.
You reference each member by its name and the compiler injects code
to compute the correct offset for you.

SEmail* e = get_email();
int handle = prepare_email(e->from, e->to, e->cc, e->bcc,
                           e->subject, e->header, e->body);

SAttachment* a = e->attachments; // No type checking
for (int i = 0; i < e->attachmentCount; ++i, ++a) // ++a steps over one variable-length SAttachment
    prepare_attachment(a->type, a->length, a->attachment_data);

That simple bit of source code would allow you to express variable
length data items so beautifully in code, while hiding the variable
access portions of everything.
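
For comparison, the offset computation that the proposed extension would hide can be written by hand in standard C++. The sketch below is not the proposed feature, only an approximation of what the injected code might amount to. It assumes the layout of SEmail above with 32-bit ints in host byte order, leaves out the attachments, and the class name `email_view` and its members are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Read-only view over a buffer laid out like the SEmail proposal:
// eight 32-bit values (seven lengths and attachmentCount), then the
// seven variable-length character fields back to back.
class email_view {
public:
    enum field { from, to, cc, bcc, subject, header, body, field_count };

    explicit email_view(const char* buf) : buf_(buf) {}

    // Length of one field, read from the fixed-size prefix.
    std::uint32_t length(field f) const { return read_u32(f); }

    // attachmentCount is the eighth value in the prefix.
    std::uint32_t attachment_count() const { return read_u32(field_count); }

    // Start of a field = size of the prefix + lengths of all earlier fields.
    // The bytes are not zero-terminated; pair this with length(f).
    const char* data(field f) const {
        assert(f < field_count);
        std::size_t off = kPrefixSize;
        for (int i = 0; i < f; ++i)
            off += read_u32(i);
        return buf_ + off;
    }

private:
    static constexpr std::size_t kPrefixSize = 8 * sizeof(std::uint32_t);

    // memcpy avoids alignment assumptions about the incoming buffer.
    std::uint32_t read_u32(int index) const {
        std::uint32_t v;
        std::memcpy(&v, buf_ + index * sizeof v, sizeof v);
        return v;
    }

    const char* buf_;
};
```

A caller would write something like `e.data(email_view::subject)` together with `e.length(email_view::subject)`. No parsing pass is needed, but each access still sums the lengths of the preceding fields.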

Yes it's a desirable feature.

--
Rick C. Hodgin

rick.c.hodgin@gmail.com: Jun 15 03:25PM -0700

On Sunday, June 14, 2020 at 10:50:54 AM UTC-4, Bart wrote:

> Now that you have variable-element-length arrays, how would you define data
> structures, for this, and how would it be initialised from P? Please, no
> magic!

The purpose of this new feature is that you define the data type at
the generation side, and at the consumption side, which is on the
other side of some kind of data interchange, be it over a network,
be it saved to a disk file and later loaded for batch processing,
be it for inter-process communication within a single system, or
whatever.

The same SWhatever struct is defined to create it. The same SWhatever
struct is defined to read it.

The purpose is data interchange between a source and a destination.

I've thought about how to address your need above.
You essentially have two pieces of incoming data. The first is
the count (number of strings). The second is a list of the type
of struct I'm creating:

struct SString
{
int32 length;
char string[0..length];
};

You would set the SString* variable to where length1 starts, and
then simply iterate:

int count = *(int32*)P;
SString* s = (SString*)((char*)P + sizeof(int32));

for (int i = 0; i < count; ++i, ++s)
{
// s->string is available here
}

> <string2>
> ...
> <stringN>

This requires parsing because it's zero-terminated.

> ...
> <0xFFFFFFFF> # int32, means end of string table
> <stringN>

for (SString* s = (SString*)P; s->length != -1; ++s)
{
// s->string is available here
}

> (so a length of 10 means 10+1 bytes follow).

> Beyond that it gets too complex to have a static data structure, for
> example with mixed, variable content.

Anything with a NULL-termination that marks the end of the string,
like a typical C string, requires parsing.

That's what this variation is seeking to avoid. It requires zero
pre-parsing before use, at the expense of storing the string lengths,
and the modifications to the C/C++ compiler.

--
Rick C. Hodgin

"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Jun 15 03:27PM -0700

>> Here's a challenge for you: a binary file has already been loaded into
>> memory, and P is a char* pointer, pointing into part of it that has this
>> stringtable data, very similar to your count/string pairs:
[...]

As a starter, take a look at DCOM:

https://en.wikipedia.org/wiki/Distributed_Component_Object_Model

Or Corba.

Fake switch or fake loop only to break

Frederick Gotham <cauldwell.thomas@gmail.com>: Jun 15 02:44PM -0700

[I have multi-posted this to comp.lang.c and comp.lang.c++]

Have you ever seen code written as follows?

if ( cond1 )
{
    if ( cond2 )
    {
        if ( cond3 )
        {
            if ( cond4 )
            {
                DoSomething();
            }
        }
    }
}

Well some people think that the above is very poorly written code, and they prefer it like this:

if ( !cond1 ) return;

if ( !cond2 ) return;

if ( !cond3 ) return;

if ( !cond4 ) return;

DoSomething();

If you can't simply return from the function, and instead want to skip over a section of code, you could use 'goto':

if ( !cond1 ) goto Label_At_End;

if ( !cond2 ) goto Label_At_End;

if ( !cond3 ) goto Label_At_End;

if ( !cond4 ) goto Label_At_End;

DoSomething();

Label_At_End:
;

Some programmers and some firms are very much against the use of 'goto'. In order to avoid using 'goto' today, I instead used a switch statement like this:

switch (true)
{
default:;

if ( !cond1 ) break;

if ( !cond2 ) break;

if ( !cond3 ) break;

if ( !cond4 ) break;

DoSomething();
}

Another alternative would have been to use a 'do' loop as follows:

do
{
default:;

if ( !cond1 ) break;

if ( !cond2 ) break;

if ( !cond3 ) break;

if ( !cond4 ) break;

DoSomething();
} while (false);

Does anyone else use fake switches and fake loops like this just to exploit the 'break' keyword?
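
In C++ specifically, one more way to get the same early exit without 'goto', a fake switch or a fake loop is an immediately invoked lambda, where 'return' leaves only the lambda. This is an alternative that was not proposed in the post; the function name `run_if_all` and the counter standing in for DoSomething() are hypothetical.

```cpp
#include <cassert>

// Returns true only when every condition held and the work was done.
inline bool run_if_all(bool cond1, bool cond2, bool cond3, bool cond4,
                       int& counter) {
    bool done = false;

    // Immediately invoked lambda: 'return' exits only the lambda, so it
    // plays the role that 'break' plays in the fake switch or fake loop.
    [&] {
        if (!cond1) return;
        if (!cond2) return;
        if (!cond3) return;
        if (!cond4) return;
        ++counter;  // stands in for DoSomething()
        done = true;
    }();

    // Code here still runs after an early exit, like code after Label_At_End.
    assert(!done || (cond1 && cond2 && cond3 && cond4));
    return done;
}
```

Whether this reads better than a small named helper function is a matter of taste; the lambda mainly keeps the skipped section next to the code around it.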

Frederick Gotham <cauldwell.thomas@gmail.com>: Jun 15 02:46PM -0700

On Monday, June 15, 2020 at 10:45:11 PM UTC+1, Frederick Gotham wrote:

> do
> {
> default:;

I copy-pasted the code from the fake 'switch' and forgot to remove the 'default'.

Lew Pitcher <lew.pitcher@digitalfreehold.ca>: Jun 15 06:02PM -0400

On June 15, 2020 17:58, Stefan Ram wrote:

> int main2() { if( cond3 )main3(); }
> int main1() { if( cond2 )main2(); }
> int main( ) { if( cond1 )main1(); }

if ((cond1) && (cond2) && (cond3) && (cond4)) DoSomething();

--
Lew Pitcher
"In Skills, We Trust"

"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Jun 15 03:04PM -0700

On 6/15/2020 2:44 PM, Frederick Gotham wrote:
> }
> }
> }
[...]

Yes. Way back in looking at some adventure games written in BASIC on my
Apple IIgs.

Fake switch or fake loop only to break

ram@zedat.fu-berlin.de (Stefan Ram): Jun 15 09:58PM

> }
> }
>}

int main3() { if( cond4 )DoSomething(); }
int main2() { if( cond3 )main3(); }
int main1() { if( cond2 )main2(); }
int main( ) { if( cond1 )main1(); }

DCAS-atomic

Bonita Montero <Bonita.Montero@gmail.com>: Jun 15 09:20PM +0200

> __sync_bool_compare_and_swap_16 and my linker can't find it:
> "undefined reference to `__sync_bool_compare_and_swap_16'"
> Any ideas ?

I got it. I have to compile it with -march=x86-64.

"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Jun 15 01:59PM -0700

On 6/15/2020 8:44 AM, Bonita Montero wrote:
> the volatile methods as well as the compare-exchange-methods that accept
> only one memory-consistency-parameter.
> Here it is:
[...]

Actually, if the underlying system supports lock free DWCAS then C++
"should" support it on said system.

https://groups.google.com/d/msg/lock-free/X3fuuXknQF0/Ho0H1iJgmrQJ

If not, then you need to go rogue. Take careful note of the
memory_order_acq_rel membar.

For an atomic CAS with acquire semantics, the membar would go _after_ the CAS.

For an atomic CAS with release semantics, the membar would go _before_ the CAS.

"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Jun 15 02:16PM -0700

On 6/15/2020 1:59 PM, Chris M. Thomasson wrote:

> An atomic CAS with acquire semantics, the membar would go _after_ the CAS.

> An atomic CAS with release semantics, the membar would go _before_ the CAS.

Fwiw, think about the standalone (atomic_thread_fence) membars required
for a general purpose lock:
________________________________
Atomic RMW to take the lock

Acquire Membar

[critical section]

Release Membar

Atomic RMW to release the lock
________________________________

An acquire/release RMW would look like:
________________________________
Release Membar

Atomic RMW

Acquire Membar
________________________________

See how the membars line up with the mutex case?
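
The two diagrams can be sketched with std::atomic_thread_fence roughly as follows. This is one possible rendering, not code from the thread: the class name `fence_spinlock` and the helper `cas_acq_rel` are made up, and the atomic operations are deliberately relaxed so that all ordering comes from the standalone fences.

```cpp
#include <atomic>
#include <cassert>

// Spinlock whose ordering comes only from standalone fences.
class fence_spinlock {
public:
    void lock() {
        // Atomic RMW to take the lock.
        while (locked_.exchange(true, std::memory_order_relaxed)) {
            // spin until the previous value was 'false'
        }
        // Acquire membar goes after the RMW.
        std::atomic_thread_fence(std::memory_order_acquire);
    }

    bool try_lock() {
        if (locked_.exchange(true, std::memory_order_relaxed))
            return false;
        std::atomic_thread_fence(std::memory_order_acquire);
        return true;
    }

    void unlock() {
        assert(locked_.load(std::memory_order_relaxed));  // must be held
        // Release membar goes before the RMW.
        std::atomic_thread_fence(std::memory_order_release);
        // Atomic RMW to release the lock.
        locked_.exchange(false, std::memory_order_relaxed);
    }

private:
    std::atomic<bool> locked_{false};
};

// Acquire/release CAS spelled out with fences on both sides of a relaxed CAS.
inline bool cas_acq_rel(std::atomic<int>& a, int& expected, int desired) {
    std::atomic_thread_fence(std::memory_order_release);  // before the CAS
    bool ok = a.compare_exchange_strong(expected, desired,
                                        std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_acquire);  // after the CAS
    return ok;
}
```

In ordinary code the same effect is usually requested directly on the operation, for example with std::memory_order_acq_rel on the CAS; the fence form is shown only because it makes the placement visible.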



Monday, June 15, 2020

Digest for comp.lang.c++@googlegroups.com - 14 updates in 5 topics
