Sunday, June 14, 2020

Digest for comp.lang.c++@googlegroups.com - 11 updates in 4 topics

Nikki Locke <nikki@trumphurst.com>: Jun 14 10:23PM

Available C++ Libraries FAQ
 
URL: http://www.trumphurst.com/cpplibs/
 
This is a searchable list of libraries and utilities (both free
and commercial) available to C++ programmers.
 
If you know of a library which is not in the list, why not fill
in the form at http://www.trumphurst.com/cpplibs/cppsub.php
 
Maintainer: Nikki Locke - if you wish to contact me, please use the form on the website.
Bart <bc@freeuk.com>: Jun 14 12:53AM +0100

>> } S3;
 
> Runtime parsing. Having the variable struct removes the need for
> parsing. The data packet is ready to go as-is.
 
Even if some external data was a fixed number of three length+string
pairs, you would still need to know the total size. That means traversing
the data.
 
If that total size is already known (eg. the size of a file containing
only those three), then accessing the 2nd and 3rd strings will still
need traversing the chain of lengths. That sounds like something that
would need to be done on every access, unless some auxiliary data
structure is set up.
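
As a rough illustration of the auxiliary data structure mentioned above, the following C++ sketch walks the chain of lengths once and records where each string starts, so later accesses need no traversal. The names StrRef and build_index are hypothetical, and the layout (a 4-byte host-endian length followed by that many bytes, with no terminator) is an assumption rather than something fixed by the thread.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical index entry: one per string found in the packed buffer.
struct StrRef {
    const char* ptr;       // first byte of the string inside the buffer
    std::uint32_t length;  // length read from the 4-byte prefix
};

// Walk the <length><bytes> records once and remember where each string starts.
// Assumes host-endian lengths; stops at the end of the buffer or at a
// truncated record.
std::vector<StrRef> build_index(const char* data, std::size_t size) {
    std::vector<StrRef> index;
    std::size_t pos = 0;
    while (size - pos >= 4) {
        std::uint32_t len;
        std::memcpy(&len, data + pos, 4);  // memcpy avoids unaligned reads
        pos += 4;
        if (len > size - pos) break;       // record runs past the buffer
        index.push_back({data + pos, len});
        pos += len;
    }
    return index;
}
```

With the index built, the 2nd string is index[1].ptr with length index[1].length. The cost is one pass over the data plus the memory for the index, which is the trade-off the paragraph above points at.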
 
And if that needs to be done anyway, then what is the point of the new
feature? Just use a function. Then maybe use C++ to be able to write x.u
instead of getstr(x,2).
 
 
Here's a mockup of such data, with the routines needed to extract
lengths and strings at random. Here, the only data structure needed is a
char array. I've used zero-terminated strings in the data, to simplify
printing them, but the data contained embedded strings anyway.
 
And I've used a sentinel to make this test simpler, since we don't have
a real context:
 
-----------------------------------------------
 
#include <stdio.h>
#include <stdint.h>
#include <string.h>
 
char data[]=
"\x06\x00\x00\x00" "Monday\x00"
"\x07\x00\x00\x00" "Tuesday\x00"
"\x09\x00\x00\x00" "Wednesday\x00"
"\x08\x00\x00\x00" "Thursday\x00"
"\x06\x00\x00\x00" "Friday\x00"
"\x08\x00\x00\x00" "Saturday\x00"
"\x06\x00\x00\x00" "Sunday\x00"
"\xFF\xFF\xFF\xFF";
 
 
int getstrn(char* data, int n, char** str) {
    char* p=data;
    int32_t length;

    if (n<=0) return -1;

    do {
        memcpy(&length,p,4);
        if (length==0xFFFFFFFF) return -1;
        p += 4;
        if (str) *str = p;
        p += length+1;
    } while (--n);

    return length;
}

int countstrings(char* data){
    int n=0, length;

    while (1) {
        length=getstrn(data,n+1,NULL);
        if (length==-1) return n;
        ++n;
    }
}

int getlength(char* data,int n) {
    return getstrn(data,n,NULL);
}

char* getstring(char* data,int n) {
    char* s;
    getstrn(data,n,&s);
    return s;
}

int main(void) {
    char* s;
    int nstrings, length;

    nstrings = countstrings(data);

    for (int i=1; i<=nstrings; ++i) {
        length = getlength(data,i);
        s = getstring(data,i);

        printf("%d: %04d \"%s\"\n",i,length,s);
    }
}
 
-----------------------------------------------
 
The output is:
 
1: 0006 "Monday"
2: 0007 "Tuesday"
3: 0009 "Wednesday"
4: 0008 "Thursday"
5: 0006 "Friday"
6: 0008 "Saturday"
7: 0006 "Sunday"
 
Note that real data could use a variable-length field for the string
length (some of my formats do). That can be accommodated here, but is
not practical using your scheme.
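
To make the variable-length length field concrete, here is a minimal C++ sketch. The thread does not say which encoding is meant, so this assumes an unsigned LEB128-style scheme (7 payload bits per byte, high bit set on all but the last byte); decode_varlen is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>

// Decode a variable-length string length from at most `avail` bytes.
// Assumed encoding (not taken from the thread): 7 data bits per byte,
// least significant group first, high bit set means "more bytes follow".
// Returns the number of bytes consumed, or 0 if the field is truncated
// or longer than 5 bytes.
std::size_t decode_varlen(const unsigned char* p, std::size_t avail,
                          std::uint32_t* out) {
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < avail && i < 5; ++i) {
        value |= static_cast<std::uint32_t>(p[i] & 0x7F) << (7 * i);
        if ((p[i] & 0x80) == 0) {  // last byte of the field
            *out = value;
            return i + 1;
        }
    }
    return 0;
}
```

A reader function such as getstrn could call this in place of the fixed 4-byte memcpy and advance by the returned byte count, which is why a function-based approach accommodates it easily.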
rick.c.hodgin@gmail.com: Jun 14 04:59AM -0700

On Saturday, June 13, 2020 at 7:53:52 PM UTC-4, Bart wrote:
 
> Even if some external data was a fixed number of three length+string
> pairs, you would still need to know the total size. That means traversing
> the data.
 
You wouldn't need to know the length in advance if it's a properly
formatted string. You would traverse it until it terminates with a NULL
or some other stop code, such as all three a, b, c being 0.
 
But even so, when you read a network packet, you know the length. When
you open a file you can get the file length.
 
> need traversing the chain of lengths. That sounds like something that
> would need to be done on every access, unless some auxiliary data
> structure is set up.
 
That's where the compiler comes in. It adds that code automatically so
you don't have to do it. It allows you to use the data, and the
compiler handles the mechanics of access.
 
> And if that needs to be done anyway, then what is the point of the new
> feature? Just use a function. Then maybe use C++ to be able to write x.u
> instead of getstr(x,2).
 
It changes the data definition and logic portions of your source code
into the example I write below.
 
 
> Note that real data could use a variable-length field for the string
> length (some of my formats do). That can be accommodated here, but is
> not practical using your scheme.
 
1) Define the variable length structure. Clear. Concise:
 
struct SDay
{
    int length;
    char name[0..length];
};
 
2) Populate the data. Clear. Concise:
 
SDay data[] =
{
    { auto, "Sunday" },
    { auto, "Monday" },
    { auto, "Tuesday" },
    { auto, "Wednesday" },
    { auto, "Thursday" },
    { auto, "Friday" },
    { auto, "Saturday" }
};
 
3) Access the elements as they are, even though they're variable.
Clear. Concise:
 
auto p = data[0];
 
for (int i = 0; i < sizeof(data) / sizeof(data[0]); ++i, ++p)
printf("%d: %04d \"data\", i, p->length, p->name);
 
The mechanics of calculating length is handled by the compiler at
compile-time. The mechanics of calculating the offset of p->name
is handled via an injected function or inline. The mechanics of
stepping forward for the ++p is handled via an injected function
or inline.
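
For comparison, here is a sketch of what such injected helpers might look like if written by hand in today's C++. The proposal does not specify a memory layout, so this assumes each SDay record is a 4-byte host-endian length followed by exactly that many characters, with no padding or terminator. The names sday_length, sday_name and sday_next are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Assumed record layout: [int32 length][length chars], packed back to back.

// Equivalent of p->length: read the 4-byte prefix.
inline std::int32_t sday_length(const char* rec) {
    std::int32_t len;
    std::memcpy(&len, rec, sizeof len);  // memcpy avoids unaligned reads
    return len;
}

// Equivalent of p->name: the characters start right after the prefix.
inline const char* sday_name(const char* rec) {
    return rec + sizeof(std::int32_t);
}

// Equivalent of ++p: skip the prefix and the characters of this record.
inline const char* sday_next(const char* rec) {
    return sday_name(rec) + sday_length(rec);
}
```

Under these assumptions the loop would advance with rec = sday_next(rec) instead of ++p. Note that reaching record i still means stepping over records 0..i-1, since the records have different sizes.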
 
The compiler injects code to make your use of that feature easier.
And ultimately, it allows you to assemble data to transfer from
one source to another (inter-process communication, over the
Internet, saved to a disk file and read later) without any parsing
prior to use. It allows immediate propagation and use of that data
without hindrance, with only the smallest performance hit from
the pointer calculations for member access. Many of those could be
removed by placing any fixed members at the start of the struct or
class, and having variable-length items only at the end, with
some optimization rearrangements.
 
It's a usable tool, Bart. If it existed natively in C or C++, it
would've already found wide use in normal data storage as it is a
consistent need for real-world data, and especially so in
transmitting data over a network.
 
--
Rick C. Hodgin
rick.c.hodgin@gmail.com: Jun 14 05:01AM -0700

> printf("%d: %04d \"data\", i, p->length, p->name);
 
:-) Me and my brain...
 
printf("%d: %04d \"%s\"\n", i, p->length, p->name);
 
--
Rick C. Hodgin
Scott Newman <scott69@gmail.com>: Jun 14 02:02PM +0200

If you're in a context where your authority isn't honored,
you're constantly trolling.
rick.c.hodgin@gmail.com: Jun 14 05:20AM -0700

On Sunday, June 14, 2020 at 8:02:34 AM UTC-4, Scott Newman wrote:
> If you're in a context where your authority isn't honored,
> you're constantly trolling.
 
Another term for that activity is "leading." It depends on the
validity of the underlying principles to determine which it is.
 
For example, compare your template construction idea to the code
I posted for Bart ... and decide for yourself.
 
--
Rick C. Hodgin
Bart <bc@freeuk.com>: Jun 14 03:50PM +0100


> auto p = data[0];
 
> for (int i = 0; i < sizeof(data) / sizeof(data[0]); ++i, ++p)
> printf("%d: %04d \"data\", i, p->length, p->name);
 
You've changed the goalposts. First by treating each length/string in
isolation (like that, it reduces to ordinary C variable length structs,
where the only variable element is the last one, at a fixed offset).
 
Second by introducing another feature, which is that of arrays with
variable length elements, as I assume these are. If not, then you might
as well just use ordinary C++ features here:
 
#include <string>
#include <iostream>

using namespace std;

string ss [] = {"Monday","Tuesday","Wednesday","Thursday",
                "Friday","Saturday","Sunday"};

int main(void) {
    for (string s : ss)
        cout<<" "<<s.size()<<" "<<s<<endl;
}
 
Which is clearer and more concise than your proposal.
 
What it doesn't have is a flat, inline representation of the data. But
if this is supposed to mirror what is going on in an external file
(perhaps via mmap), then I don't think that an element like:
 
<count> <'count' characters>
 
is universal enough to warrant special language features, since there
can be dozens of different elements, all mixed up, and in sequences
which themselves have a count field. Or count fields could themselves be
variable length.
 
Here's a challenge for you: a binary file has already been loaded into
memory, and P is a char* pointer, pointing into part of it that has this
stringtable data, very similar to your count/string pairs:
 
<number of strings> # int32 (assume suitable endian)
<length1> # int32
<string1> # length1 bytes, not 0-terminated
<length2>
<string2>
...
<lengthN>
<stringN>
 
Now that you have variable-element-length arrays, how would you define
data structures for this, and how would they be initialised from P?
Please, no magic!
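
For reference, here is a sketch of how this first layout can be read with an ordinary function and standard containers, with no new language feature. It is not an answer given in the thread. read_string_table and the avail bound are assumptions added for safety, and the int32 fields are assumed to be host-endian.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Reads <number of strings> followed by that many <length><bytes> records
// starting at p. `avail` (an assumption, not part of the challenge) is the
// number of readable bytes at p. Strings are copied out; returns false if
// the data is truncated.
bool read_string_table(const char* p, std::size_t avail,
                       std::vector<std::string>* out) {
    std::uint32_t count;
    if (avail < 4) return false;
    std::memcpy(&count, p, 4);
    std::size_t pos = 4;
    out->clear();
    for (std::uint32_t i = 0; i < count; ++i) {
        std::uint32_t len;
        if (avail - pos < 4) return false;    // no room for a length field
        std::memcpy(&len, p + pos, 4);
        pos += 4;
        if (len > avail - pos) return false;  // string runs past the data
        out->emplace_back(p + pos, len);      // not 0-terminated in the file
        pos += len;
    }
    return true;
}
```

Copying into std::string gives up the flat, inline representation. Storing pointer and length pairs instead would keep the data in place at the cost of managing lifetimes by hand.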
 
Here's a variation; we still want the lengths accessible, but they have
to be determined:
 
<number of strings>
<string1> # zero-terminated string
<string2>
...
<stringN>
 
Here's a variation of that first one:
 
Just before P was this tag:
 
<tag meaning string table follows>
 
P points here:
<length1> # int32
<string1> # length1 bytes, not 0-terminated
<length2>
<string2>
...
<lengthN>
<stringN>
<0xFFFFFFFF> # int32, means end of string table
 
Another variation of the first, is where the strings are zero-terminated
(so a length of 10 means 10+1 bytes follow).
 
Beyond that it gets too complex to have a static data structure, for
example with mixed, variable content.
The Real Non Homosexual <cdalten@gmail.com>: Jun 13 06:20PM -0700

This person paid $10 dollars to post the following winner ad in the computer gigs section..
 
https://sfbay.craigslist.org/sfc/cpg/d/san-francisco-greatest-message-ever/7141385276.html
rick.c.hodgin@gmail.com: Jun 14 05:16AM -0700

On Saturday, June 13, 2020 at 9:20:35 PM UTC-4, The Real Non Homosexual wrote:
> This person paid $10 dollars to post the following winner ad in the computer gigs section..
 
> https://sfbay.craigslist.org/sfc/cpg/d/san-francisco-greatest-message-ever/7141385276.html
 
 
One thing to note: Christianity doesn't call people to religion. It
is a teaching by Jesus that calls us to repentance and eternal life.
It's different than religion.
 
Religion says, "Do."
Jesus says "Done."
 
Jesus completed all work required for our salvation at the cross by
taking our sin upon his own shoulders, and then dying with that sin
charged to his account. He went before God to pay the price of our
sin so that we are set free.
 
Religion has you doing rituals and things.
 
Salvation has you set free from sin, restored to eternal life.
 
Do not confuse the two. What Jesus offers people is salvation from
judgment and condemnation. He offers the path back to God, to be a
part of Heaven, to be a part of God's Kingdom plans.
 
--
Rick C. Hodgin
Pavel <pauldontspamtolk@removeyourself.dontspam.yahoo>: Jun 14 12:25AM -0400

Keith Thompson wrote:
> should be updated to state that <errno.h> shall provide a macro
> definition for errno.
 
> (POSIX apparently goes back to 1998.
more like 1988 if you believe Wikipedia. I recall first learning about it
circa 1993. Back then, any reference to UNIX would elicit a quip "which UNIX?"
from one shrewd friend of mine.
 
-Pavel
I wonder if that wording was
Keith Thompson <Keith.S.Thompson+u@gmail.com>: Jun 13 11:16PM -0700

> Keith Thompson wrote:
[...]
> more like 1988 if you believe Wikipedia. I recall first learning about it
> circa 1993. Back then, any reference to UNIX would elicit a quip "which UNIX?"
> from one shrewd friend of mine.
 
In fact Wikipedia is where I got that information. I blame the
standard that puts the '8' and '9' keys next to each other on
my keyboard. (Yes, I meant to type 1988.)
 
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */