soft and program: Digest for comp.lang.c++@googlegroups.com

sizeof(bitfield struct) - 7 Updates
What does operating on raw bytes mean in a C++ context? - 9 Updates
Module libraries - 6 Updates

Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Nov 05 06:56PM

On 04/11/2018 20:19, Rick C. Hodgin wrote:
[snip]

> member of their types, and not represented by the size of their
> bit encoding. I actually consider it to be a flaw in C/C++ to do
> it that way.

And Satan invented fossils, yes?

/Flibble

--
"You won't burn in hell. But be nice anyway." – Ricky Gervais

"I see Atheists are fighting and killing each other again, over who
doesn't believe in any God the most. Oh, no..wait.. that never happens." –
Ricky Gervais

"Suppose it's all true, and you walk up to the pearly gates, and are
confronted by God," Bryne asked on his show The Meaning of Life. "What
will Stephen Fry say to him, her, or it?"
"I'd say, bone cancer in children? What's that about?" Fry replied.
"How dare you? How dare you create a world to which there is such misery
that is not our fault. It's not right, it's utterly, utterly evil."
"Why should I respect a capricious, mean-minded, stupid God who creates a
world that is so full of injustice and pain. That's what I would say."

"Rick C. Hodgin" <rick.c.hodgin@gmail.com>: Nov 05 11:55AM -0800

On Monday, November 5, 2018 at 1:56:51 PM UTC-5, Mr Flibble wrote:
> And Satan invented fossils, yes?

No. You're listening to someone else suggesting that. Not
the Bible, not me, not the truth. You're letting falseness
rule in your thinking, and not the light of truth.

--
Rick C. Hodgin

Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Nov 05 07:57PM

On 05/11/2018 19:55, Rick C. Hodgin wrote:
[snip]
> No. You're listening to someone else suggesting that. Not
> the Bible, not me, not the truth. You're letting falseness
> rule in your thinking, and not the light of truth.

And Satan invented fossils, yes?

/Flibble

Ian Collins <ian-news@hotmail.com>: Nov 06 09:59AM +1300

On 05/11/18 09:19, Rick C. Hodgin wrote:

> member of their types, and not represented by the size of their
> bit encoding. I actually consider it to be a flaw in C/C++ to do
> it that way.

Why? Bit fields are deliberately loosely specified (apart from the
size..) with much being implementation defined (such as the ordering)
because of differences in hardware support. If you want 16 bits, use
uint16_t. Conversely, if you only want to map a couple of bits in a 32
bit register, you'd use a uint32_t - without silly #defines!
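
A rough sketch of both halves of that point, not code from the thread: the struct layout, the register fields, and every name below (Flags16, kEnableBit, kModeShift, kModeMask, setMode) are made up for illustration. The bit-field struct's size and bit ordering are left to the implementation, while the mask-based version on a uint32_t behaves the same everywhere.

```cpp
#include <cstdint>

// Hypothetical 16-bit layout. sizeof(Flags16) and the order of the fields
// within the storage unit are implementation-defined.
struct Flags16 {
    std::uint16_t ready : 1;
    std::uint16_t mode  : 3;
    std::uint16_t       : 12; // unnamed padding bits
};

// Hypothetical 32-bit control register described with typed constants
// instead of #defines.
constexpr std::uint32_t kEnableBit = 1u << 0;
constexpr std::uint32_t kModeShift = 4;
constexpr std::uint32_t kModeMask  = 0x7u << kModeShift;

// Return a copy of 'reg' with the 3-bit mode field replaced by 'mode'.
constexpr std::uint32_t setMode(std::uint32_t reg, std::uint32_t mode)
{
    return (reg & ~kModeMask) | ((mode << kModeShift) & kModeMask);
}

static_assert(setMode(kEnableBit, 5) == 0x51u, "mode 5 lands in bits 4..6");
```

Printing sizeof(Flags16) on a given compiler shows what that implementation chose; the mask arithmetic does not depend on that choice.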

--
Ian.

"Rick C. Hodgin" <rick.c.hodgin@gmail.com>: Nov 05 01:03PM -0800

On Monday, November 5, 2018 at 3:59:31 PM UTC-5, Ian Collins wrote:
> because of differences in hardware support. If you want 16 bits, use
> uint16_t. Conversely, if you only want to map a couple of bits in a 32
> bit register, you'd use a uint32_t - without silly #defines!

That's a philosophical position, the one C/C++ took.

My view is if I define a pattern of 16 bits in a particular
order, the machine had better represent them as I indicate,
and when I step through some structure I'm expecting the ptr
to move by the size of the bits, and not their expressed size.

I truly view this as a fundamental failure of C/C++.

--
Rick C. Hodgin

Ian Collins <ian-news@hotmail.com>: Nov 06 10:20AM +1300

On 06/11/18 10:03, Rick C. Hodgin wrote:
> order, the machine had better represent them as I indicate,
> and when I step through some structure I'm expecting the ptr
> to move by the size of the bits, and not their expressed size.

So if you only declare a 13 bit pattern?

If you wanted 13 bits, why did you use a 32 bit underlying type?

--
Ian.

"Rick C. Hodgin" <rick.c.hodgin@gmail.com>: Nov 05 01:27PM -0800

On Monday, November 5, 2018 at 4:20:39 PM UTC-5, Ian Collins wrote:
> > to move by the size of the bits, and not their expressed size.

> So if you only declare a 13 bit pattern?

> If you wanted 13 bits, why did you use a 32 bit underlying type?

Increment by bit count b:

step_size = (b / 8) + ((b % 8) == 0 ? 0 : 1);
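
For concreteness, the formula above can be written as a small function. This is a sketch that assumes 8-bit bytes; the names stepSize and stepSizeCompact are illustrative and not from the post. The second form is the usual round-up idiom and gives the same result.

```cpp
#include <cstddef>

// Whole bytes needed to hold 'bits' bits, assuming 8 bits per byte.
constexpr std::size_t stepSize(std::size_t bits)
{
    return (bits / 8) + ((bits % 8) == 0 ? 0 : 1);
}

// Equivalent round-up form.
constexpr std::size_t stepSizeCompact(std::size_t bits)
{
    return (bits + 7) / 8;
}

static_assert(stepSize(13) == 2, "13 bits need two bytes");
static_assert(stepSize(16) == 2, "16 bits need two bytes");
static_assert(stepSize(0) == 0, "no bits, no bytes");
static_assert(stepSizeCompact(13) == stepSize(13), "both forms agree");
```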

--
Rick C. Hodgin

What does operating on raw bytes mean in a C++ context?

Paul <pepstein5@gmail.com>: Nov 05 04:58AM -0800

On Sunday, November 4, 2018 at 6:52:28 PM UTC, Pavel wrote:

> > Paul

> HTH
> -Pavel

Ok, the following code should satisfy requirements but it hasn't been
extensively tested. Feedback is welcome. I decided to code from
scratch without using library functions.

Thanks,

Paul

// Problem is https://cryptopals.com/sets/1/challenges/1
//49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6d
// should produce
// SSdtIGtpbGxpbmcgeW91ciBicmFpbiBsaWtlIGEgcG9pc29ub3VzIG11c2hyb29t
#include <iostream>
#include <unordered_map>
#include <cctype>
#include <cmath>
#include <string>
#include <utility>
#include <vector>

std::unordered_map<char, int> buildHexMap()
{
    std::unordered_map<char, int> hexMap;
    for(char letter = 'a'; letter <= 'f'; ++letter)
        hexMap[std::toupper(letter)] = hexMap[letter] = 10 - 'a' + letter;

    // Start at '0' so that every digit is mapped explicitly, rather than
    // relying on operator[] default-inserting 0 for a missing key.
    for(char letter = '0'; letter <= '9'; ++letter)
        hexMap[letter] = letter - '0';

    return hexMap;
}

// Use this map to build a std::vector<int> from a string
std::vector<int> hex(const std::string& hexString, std::unordered_map<char, int> hexMap = buildHexMap())
{
    std::vector<int> result(hexString.size());
    for(int i = 0; i < result.size(); ++i)
        result[i] = hexMap[hexString[i]];

    return result;
}

// https://en.wikipedia.org/wiki/Base64 is reference
std::unordered_map<int, char> build64Map()
{
    constexpr int alphabetSize = 26;
    std::unordered_map<int, char> base64Map;
    for(char letter = 'A'; letter <= 'Z'; ++letter)
    {
        base64Map[letter - 'A'] = letter;
        base64Map[alphabetSize - 'a' + std::tolower(letter)] = std::tolower(letter);
    }

    for(char letter = '0'; letter <= '9'; ++letter)
        base64Map[52 - '0' + letter] = letter;

    base64Map[62] = '+';
    base64Map[63] = '/';

    return base64Map;
}

// A naive conversion can result in excessive zeros at the front.
// These are now removed.
std::vector<int> trim(const std::vector<int>& vec)
{
    int i = 0;
    while(i < vec.size() && !vec[i])
        ++i;

    if(i == vec.size())
        return {0};

    return std::vector<int>(vec.begin() + i, vec.end());
}

// A block of 3 hex digits is equivalent to a block of two base 64 digits
// Use this equivalence to transform a vector of hex digits to a vector of base 64 digits
std::vector<int> hexToBase64(const std::vector<int>& hex)
{
    if(hex.empty())
        return hex;

    constexpr int hexBlock = 3;
    constexpr int base64Block = 2;
    constexpr int convertBase = 64;
    constexpr int hexBase = 16;

    const double hexSize = hex.size(); // cast to double for ceiling operation

    std::vector<int> result( std::ceil(hexSize/hexBlock) * base64Block);
    int finalComponent = result.size() - 1;
    for(int i = hexSize - 1; i >= 0; i -= hexBlock)
    {
        int hexValue = hex[i];
        if(i)
            hexValue += hexBase * hex[i - 1];
        if(i >= 2)
            hexValue += hexBase * hexBase * hex[i - 2];

        result[finalComponent--] = hexValue % convertBase;
        result[finalComponent--] = hexValue / convertBase;
    }

    return trim(result);
}

std::string hexToBase64(const std::string& hexString)
{
    const std::vector<int>& base64 = hexToBase64(hex(hexString));
    std::unordered_map<int, char> base64Map = build64Map();
    std::string result;
    for(int i : base64)
        result += base64Map[i];

    return result;
}

int main()
{
    const std::string hex = "49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6d";
    const std::string answer = "SSdtIGtpbGxpbmcgeW91ciBicmFpbiBsaWtlIGEgcG9pc29ub3VzIG11c2hyb29t";

    std::cout << ( hexToBase64(hex) == answer ? "Test passed" : "Test failed");
}

Juha Nieminen <nospam@thanks.invalid>: Nov 05 02:59PM

> SSdtIGtpbGxpbmcgeW91ciBicmFpbiBsaWtlIGEgcG9pc29ub3VzIG11c2hyb29t

> I'm confused by the instruction: "Always operate on raw bytes, never on encoded strings. Only use hex and base64 for pretty-printing."

> What does "raw bytes" mean in terms of the input/output parameters.

It simply means that if the input is in hexadecimal, you first decode it
into the corresponding bytes, e.g. into a std::vector<unsigned char>
(hexadecimal "00" corresponds to the byte value 0, "01" corresponds to
the byte value 1 and so on, up to "ff" corresponding to the byte value
255), and then you output those bytes in base64.

Converting from ASCII hexadecimal representation into bytes is quite
easy: For each pair of ASCII characters, check whether the first is
between '0' and '9', and if it is, subtract '0' from it. If it's between
'a' and 'f', subtract 'a' from it and add 10. This gives you the upper
4 bits. Do the same for the second character, and it gives you the lower
4 bits. If you calculated them e.g. into the variables ub and lb,
the byte value will be ub*16+lb.
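
A minimal sketch of the decoding step just described, assuming an ASCII-compatible character set and 8-bit bytes. The names nibble and hexToBytes are illustrative; ub and lb are the variables from the description. Accepting uppercase digits and throwing on malformed input are additions that the description does not ask for.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Value of one hex digit, or -1 if 'c' is not a hex digit.
inline int nibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Decode a string of hex digits into raw bytes, two digits per byte.
inline std::vector<unsigned char> hexToBytes(const std::string& hex)
{
    if (hex.size() % 2 != 0)
        throw std::invalid_argument("odd number of hex digits");

    std::vector<unsigned char> bytes;
    bytes.reserve(hex.size() / 2);
    for (std::size_t i = 0; i < hex.size(); i += 2) {
        const int ub = nibble(hex[i]);     // upper 4 bits
        const int lb = nibble(hex[i + 1]); // lower 4 bits
        if (ub < 0 || lb < 0)
            throw std::invalid_argument("non-hex character in input");
        bytes.push_back(static_cast<unsigned char>(ub * 16 + lb));
    }
    return bytes;
}
```

With this, hexToBytes("49276d") yields the three bytes 0x49, 0x27, 0x6d.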

Converting bytes into base64 is a bit more complicated but there
are easy tutorials out there.
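
For reference, one way the encoding step can look. This is a sketch assuming 8-bit bytes and the standard Base64 alphabet with '=' padding; the function name toBase64 is illustrative. Every group of three bytes (24 bits) becomes four 6-bit indices into the alphabet.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Encode raw bytes as Base64, padding the final group with '='.
inline std::string toBase64(const std::vector<unsigned char>& bytes)
{
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    std::string out;
    out.reserve((bytes.size() + 2) / 3 * 4);
    for (std::size_t i = 0; i < bytes.size(); i += 3) {
        // Number of real bytes in this group: 1, 2 or 3.
        const std::size_t n = bytes.size() - i < 3 ? bytes.size() - i : 3;

        // Pack up to three bytes into 24 bits, zero-filling missing ones.
        unsigned long group = 0;
        for (std::size_t k = 0; k < 3; ++k) {
            group <<= 8;
            if (k < n) group |= bytes[i + k];
        }

        out += alphabet[(group >> 18) & 0x3F];
        out += alphabet[(group >> 12) & 0x3F];
        out += n > 1 ? alphabet[(group >> 6) & 0x3F] : '=';
        out += n > 2 ? alphabet[group & 0x3F] : '=';
    }
    return out;
}
```

The bytes 'M', 'a', 'n' encode to "TWFu"; dropping the last byte gives "TWE=".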

"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Nov 05 05:35PM +0100

On 05.11.2018 15:59, Juha Nieminen wrote:
> Do the same for the second character, and it gives you the lower
> 4 bits. If you calculated them eg. into the variables ub and lb,
> the byte value will be ub*16+lb.

Are you sure that CHAR_BIT, the number of bits per byte, equals 8?
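
If code does rely on 8-bit bytes, one option is to state that at compile time rather than assume it silently. A small sketch; the helper packNibbles is a made-up name, while CHAR_BIT comes from <climits>.

```cpp
#include <climits>

// Refuse to compile on platforms where a byte is not 8 bits wide.
static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");

// Combine an upper and a lower 4-bit value into one octet.
constexpr unsigned char packNibbles(unsigned ub, unsigned lb)
{
    return static_cast<unsigned char>(((ub & 0xFu) << 4) | (lb & 0xFu));
}

static_assert(packNibbles(0xF, 0x0) == 0xF0, "upper nibble goes in bits 4..7");
```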

> Converting bytes into base64 is a bit more complicated but there
> are easy tutorials out there.

Cheers!,

- Alf

Paul <pepstein5@gmail.com>: Nov 05 12:01PM -0800

On Monday, November 5, 2018 at 2:59:41 PM UTC, Juha Nieminen wrote:
> the byte value will be ub*16+lb.

> Converting bytes into base64 is a bit more complicated but there
> are easy tutorials out there.

Thanks a lot, but you're giving me advice on how to do something that
I thought I had already done. Is there any reason why the code I presented
is not a valid solution?

Paul

Jorgen Grahn <grahn+nntp@snipabacken.se>: Nov 05 08:15PM

On Mon, 2018-11-05, Paul wrote:

> Ok, the following code should satisfy requirements but it hasn't been
> extensively tested. Feedback is welcome. I decided to code from
> scratch without using library functions.

What library functions? You use plenty of the standard library (not
doing so would be crazy) but of course you don't use someone else's
Base64 encoder.

> // Problem is https://cryptopals.com/sets/1/challenges/1

I still don't understand it, but I accept your interpretation that the
input is a string of hex digits, even though that's problematic (see
below).

> //49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6d
> // should produce
> // SSdtIGtpbGxpbmcgeW91ciBicmFpbiBsaWtlIGEgcG9pc29ub3VzIG11c2hyb29t

That just repeats what main() says better.

...

Your hex decoder below is one reason I don't like the interpretation
of the exercise: you have to do tedious input validation and error
handling. "Hello world!" isn't a hex string. "Abc" is probably a typo
rather than 1 1/2 byte. "01 f0 ff" is a hex string that's
human-readable, but you don't handle that one well.

For reference, this is such a function I've written, and used a lot.
I pretty much need all of the documented features for it to be useful
in practice.

/**
 * Decode [begin .. end) from a hex dump (e.g. "f0 00 ba 12")
 * into octet buffer 'buf', which is assumed to be large enough.
 *
 * Tolerated input is hex digits and whitespace. Any amount of whitespace
 * is ok, except it must not appear between nybbles:
 *
 *   "12 34 56" - ok
 *   "1234 56"  - also fine; same thing
 *   "123456"   - also fine
 *   "123 456"  - not ok; 0x12 is returned and "3 456" remains
 *                undecoded
 *
 * Returns the number of octets read, and updates 'begin' to
 * the first undecoded character much like strtoul(3) does.
 */
size_t hexread(uint8_t* const buf,
const char** begin, const char* const end);
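
Only the declaration is given in the post. A possible implementation that follows the documented contract might look like the sketch below; this is a guess, not the author's code. The helper hexDigit is a made-up name, and leaving trailing whitespace unconsumed is an assumption.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>

namespace {
// Value of one hex digit, or -1 if 'c' is not a hex digit.
int hexDigit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}
} // namespace

std::size_t hexread(std::uint8_t* const buf,
                    const char** begin, const char* const end)
{
    const char* p = *begin; // first character not yet decoded
    std::size_t n = 0;
    for (;;) {
        // Whitespace is allowed before an octet, never inside one.
        const char* q = p;
        while (q != end && std::isspace(static_cast<unsigned char>(*q)))
            ++q;
        if (end - q < 2)
            break; // fewer than two characters left
        const int hi = hexDigit(q[0]);
        const int lo = hexDigit(q[1]);
        if (hi < 0 || lo < 0)
            break; // not a complete octet; stop without consuming it
        buf[n++] = static_cast<std::uint8_t>(hi * 16 + lo);
        p = q + 2;
    }
    *begin = p; // assumption: trailing whitespace is left unconsumed
    return n;
}
```

On the input "123 456" this stores 0x12, returns 1, and leaves 'begin' pointing at "3 456", matching the last case in the comment.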

> std::vector<int> hex(const std::string& hexString,
> std::unordered_map<char, int> hexMap = buildHexMap())

Why would you ever want to pass in a different "hexmap"?

[snip]

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

Juha Nieminen <nospam@thanks.invalid>: Nov 05 08:21PM

> Are you sure that CHAR_BIT, the number of bits per byte, equals 8?

Is any computer system where CHAR_BIT isn't 8 even running
anymore?

woodbrian77@gmail.com: Nov 05 12:50PM -0800

On Saturday, November 3, 2018 at 7:21:47 PM UTC-5, Ben Bacarisse wrote:

> I disagree with the advice you've had that std::string is OK for this
> sort of work. You might get away with it for this first task, but zero
> bytes can be a problem in std::string objects.
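
One concrete way zero bytes can bite, as a sketch; the function names are made up. A std::string can hold embedded zero bytes, but constructing one from a plain C string stops at the first of them.

```cpp
#include <cstddef>
#include <string>

// Built from a C string: the constructor stops at the first zero byte.
inline std::size_t sizeFromCString()
{
    return std::string("ab\0cd").size();
}

// Built with an explicit length: all five bytes are kept.
inline std::size_t sizeWithExplicitLength()
{
    return std::string("ab\0cd", 5).size();
}
```

The first function returns 2 and the second returns 5. The same truncation happens whenever the result of c_str() is handed to code that expects a zero-terminated string.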

I agree. Just because a tech giant does something
doesn't mean they know what they are doing.
https://duckduckgo.com is proving that every day, right?

Brian
Ebenezer Enterprises
https://github.com/Ebenezer-group/onwards

scott@slp53.sl.home (Scott Lurndal): Nov 05 08:52PM

>> Are you sure that CHAR_BIT, the number of bits per byte, equals 8?

>Is any computer system where CHAR_BIT isn't 8 even running
>anymore?

Yes. The Unisys ClearPath Dorado systems descended from the Sperry/Univac 1100
come to mind immediately.

woodbrian77@gmail.com: Nov 05 01:15PM -0800

On Monday, November 5, 2018 at 2:22:02 PM UTC-6, Juha Nieminen wrote:
> > Are you sure that CHAR_BIT, the number of bits per byte, equals 8?

> Is any computer system where CHAR_BIT isn't 8 even running
> anymore?

I think for servers, desktops and phones CHAR_BIT is
almost always 8, but embedded devices are another story.

Brian
Ebenezer Enterprises
http://webEbenezer.net

Module libraries

Thiago Adams <thiago.adams@gmail.com>: Nov 04 03:53PM -0800

On Sunday, November 4, 2018 at 9:24:42 PM UTC-2, Pavel wrote:
> ident etc.

> See e.g.
> https://stackoverflow.com/questions/15773282/access-ident-information-in-an-executable

I don't think it is a good idea to increase the size to keep
this information.

> file per program or library to contain them all"? If yes, why does it even has
> to be a C/C++ file -- its role is clearly providing list of files rather than
> C/C++ code).

We have two options using the same feature.

One option is intrusive.
If you create a new library you can just add in your header
the corresponding source.

file1.h
-----------

/*
File1.h
*/

#pragma once
#pragma source "File1.c"
...

---

The other option is non-intrusive. You can create a different file
to describe all sources. This could be a build.txt, but .h is a
good extension. The compiler will expand the pragma source.

I would also like pragma includedir and pragma lib.

And pragma once span.

pragma once span is to reuse the parsed header in more than
one source file. It will take into account the macros for the
first inclusion and after that it will not be expanded anymore.

Christian Gollwitzer <auriocus@gmx.de>: Nov 05 09:37AM +0100

Am 02.11.18 um 20:16 schrieb Thiago Adams:
> #elif LINUX
> #pragma source "..\Scr\ConsoleLinux.c"
>

Monday, November 5, 2018

Digest for comp.lang.c++@googlegroups.com - 22 updates in 3 topics
