- Some bytes, perchance to view - 14 Updates
- a linked list with element different types in C++ - 2 Updates
- Implementation of a CSPRNG algorithm in C - 7 Updates
Daniel <danielaparker@gmail.com>: Oct 04 08:34PM -0700 I'd like to define using bytes_view = std::basic_string_view<uint8_t>; and have it work cross-platform. It compiles with VS2015. But do I need to worry about whether the specialization std::char_traits<uint8_t> always exists? Or would it be safer to define my own character traits? Thanks, Daniel |
"Öö Tiib" <ootiib@hot.ee>: Oct 04 10:59PM -0700 On Thursday, 5 October 2017 06:34:34 UTC+3, Daniel wrote: > It compiles with vs2015. But do I need to worry if the specialization > std::char_traits<uint8_t> always exists? Or would it be safer to > define my own character traits? Theoretically there can be a platform where uint8_t does not exist, but in practice I don't think that any such platform has a C++ compiler. When uint8_t exists, it is either unsigned char or some "extended unsigned integer type". IIRC the standard requires char_traits only for char, wchar_t, char16_t and char32_t. So classes that need char_traits (like std::basic_fstream<uint8_t>, std::basic_string_view<uint8_t> or std::basic_string<uint8_t>) are not required to work. Why do you want to use uint8_t for text? |
Daniel <danielaparker@gmail.com>: Oct 05 06:35AM -0700 On Thursday, October 5, 2017 at 1:59:18 AM UTC-4, Öö Tiib wrote: > unsigned integer type". > IIRC standard requires char_traits only for char, wchar_t, char16_t > and char32_t. I guess that means it's not required for "unsigned char" or "signed char" either. > (like std::basic_fstream<uint8_t> or std::basic_string_view<uint8_t> > or std::basic_string<uint8_t>) are not required to work. > Why you want to use uint8_t for text? What is text :-) The use is for binary strings, which would be written as base64 for JSON, or as the bytes themselves for CBOR. Daniel |
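Whether std::char_traits<uint8_t> happens to be one of the required specializations depends on what uint8_t aliases on a given platform. The following is a minimal sketch of how that could be probed at compile time, assuming a C++11 compiler; the names has_uint8, uint8_is_char, uint8_is_unsigned_char, uint8_alias() and check_uint8_alias() are made up for illustration and are not from the thread.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// char, signed char and unsigned char are three distinct types,
// even though char has the same range as one of the other two.
static_assert(!std::is_same<char, signed char>::value, "distinct types");
static_assert(!std::is_same<char, unsigned char>::value, "distinct types");

#ifdef UINT8_MAX
// The optional exact-width typedef exists on this platform.
constexpr bool has_uint8 = true;
constexpr bool uint8_is_char = std::is_same<std::uint8_t, char>::value;
constexpr bool uint8_is_unsigned_char =
    std::is_same<std::uint8_t, unsigned char>::value;
#else
// No exact 8-bit type (for example when CHAR_BIT is 16).
constexpr bool has_uint8 = false;
constexpr bool uint8_is_char = false;
constexpr bool uint8_is_unsigned_char = false;
#endif

// Describes what uint8_t is on this platform (hypothetical helper).
inline const char* uint8_alias() {
    if (!has_uint8) return "absent";
    if (uint8_is_char) return "char";
    if (uint8_is_unsigned_char) return "unsigned char";
    return "extended integer type";
}

// Sanity check: uint8_t cannot alias both char and unsigned char.
inline void check_uint8_alias() {
    assert(!(uint8_is_char && uint8_is_unsigned_char));
}
```

Only the "char" case is covered by the specializations the standard requires; in the other cases a library may still provide the primary template, but portable code probably should not count on it.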
"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Oct 05 04:24PM +0200 On 10/5/2017 7:59 AM, Öö Tiib wrote: >> define my own character traits? > Theoretically there can be a platform where uint8_t does not exist but > in practice I don't think that any such platform has C++ compiler. The usual example of CHAR_BIT > 8 has been Texas Instruments digital signal processors, with CHAR_BIT = 16, and C++ compilers. I guess one could check for existence via the UINT8_MAX macro. Cheers!, - Alf |
"James R. Kuyper" <jameskuyper@verizon.net>: Oct 05 12:11PM -0400 On 2017-10-05 09:35, Daniel wrote: > On Thursday, October 5, 2017 at 1:59:18 AM UTC-4, Öö Tiib wrote: ... >> and char32_t. > I guess that means it's not required for "unsigned char" or "signed char" > either. Correct. In particular, std::char_traits<uint8_t> will exist only if uint8_t is a typedef for char; if it's a typedef for unsigned char, that specialization will not exist. That's true even if char is an unsigned type: char, unsigned char, and signed char are always three distinct types, even though char is required to represent the same range of values as one of the other two types. >> or std::basic_string<uint8_t>) are not required to work. >> Why you want to use uint8_t for text? > What is text :-) Text is the purpose for which std::basic_string<> was created. If you're just looking for an array of uint8_t, then use one of the other standard containers, such as std::vector<uint8_t>. If there's any feature that std::basic_string<> has, which isn't shared by any of the other standard containers, and you need to make use of that feature, then what you're working with probably is text, in some sense. |
Daniel <danielaparker@gmail.com>: Oct 05 10:23AM -0700 On Thursday, October 5, 2017 at 12:11:31 PM UTC-4, James R. Kuyper wrote: > std::basic_string<> has, which isn't shared by any of the other > standard containers, and you need to make use of that feature, then what > you're working with probably is text, in some sense. The question was about the need for a bytes_view, for which there's nothing in the standard library, and whether it was sensible or stupid to base it on std::basic_string_view<uint8_t,?>. I'm leaning towards stupid :-) Regarding text, as far as I can tell, std::basic_string<> doesn't really offer much more than a sequence container of 8, 16 or 32 bit items, with the additional favour of appending a zero with c_str(), with no text semantics except when they coincide with the usual operations on fixed sized items in a sequence container, at least for the default definitions of std::char_traits<>. In practice people seem to either (1) use std::string to hold utf-8 octets, using the member functions when they make sense, and for the rest, using extra functions for determining length in characters (codepoints), iterating over characters (codepoints), etc. Or (2), what you see sometimes on Windows platforms, using std::wstring to hold utf-16 units and using extra functions. Daniel |
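As an illustration of the "extra functions" mentioned above, here is a minimal sketch of counting code points in a std::string that holds UTF-8 octets. It assumes the input is already valid UTF-8 and does no validation; utf8_length() and check_utf8_length() are hypothetical names, not part of any library discussed in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts code points by counting every byte that is not a
// continuation byte (bit pattern 10xxxxxx).
// Assumes the input is valid UTF-8; nothing is validated here.
inline std::size_t utf8_length(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}

// Usage: size() counts octets, utf8_length() counts code points.
inline void check_utf8_length() {
    const std::string euro = "\xE2\x82\xAC";  // U+20AC, three octets
    assert(euro.size() == 3);
    assert(utf8_length(euro) == 1);
}
```

The member functions of std::string keep working on octets, which is why a helper like this has to live outside the class.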
"James R. Kuyper" <jameskuyper@verizon.net>: Oct 05 02:18PM -0400 On 2017-10-05 13:23, Daniel wrote: > std::basic_string_view<uint8_t,?>. I'm leaning towards stupid :-) > Regarding text, as far as I can tell, std::basic_string<> doesn't really > offer much more than a sequence container of 8, 16 or 32 bit items, with the basic_string has no such restriction on the sizes of the things it can contain. It has implementation-provided specializations for char, wchar_t, char16_t and char32_t, but there's no requirement that either of those first two types have a size that matches any of the three sizes you've listed. And you can specialize for any user-defined non-array POD type, as long as you provide a specialization of char_traits<> for that same type which meets the requirements specified in 21.2.1. > additional favour of appending a zero with c_str(), In the general case, that's charT() (21.4.5p2) which is not necessarily zero. > except when they coincide with the usual operations on fixed sized items in > a sequence container, at least for the default definitions of > std::char_traits<>. Most of the features that distinguish basic_string from other container types are those listed in 21.4.7, which is, unsurprisingly, titled "String operations". If you don't intend to use any of those operations, you probably should use an ordinary container type. |
"James R. Kuyper" <jameskuyper@verizon.net>: Oct 05 02:31PM -0400 On 2017-10-05 14:18, James R. Kuyper wrote: > On 2017-10-05 13:23, Daniel wrote: ... > wchar_t, char16_t and char32_t, but there's no requirement that either > of those first two types have a size that matches any of the three sizes > you've listed. Actually, there's no such requirement for any of those types, char16_t and char32_t are required to be typedefs for uint_least16_t and uint_least32_t, respectively, which need not have a size of exactly 16 or 32 bits, respectively. It's extremely likely that each of those four types will have one of those three sizes, but it's not a requirement. |
Daniel <danielaparker@gmail.com>: Oct 05 12:22PM -0700 On Thursday, October 5, 2017 at 2:18:34 PM UTC-4, James R. Kuyper wrote: > Most of the features that distinguish basic_string from other container > types are those listed in 21.4.7, which is, unsurprisingly, titled > "String operations". Except for c_str(), and variants of the "string operations" that apply to "null terminated" strings, it seems to me that all of those operations would apply equally to CBOR binary strings. There are no text semantics. std::string, for example, doesn't know about utf8, about continuation bytes, even though that's often what it holds these days. > If you don't intend to use any of those operations, > you probably should use an ordinary container type. Rather, my question was about the need for a bytes_view, for which there's nothing in the standard library, and about the advisability or lack thereof of basing one on std::basic_string_view<uint8_t,?>, I'm leaning towards no. Daniel |
"James R. Kuyper" <jameskuyper@verizon.net>: Oct 05 03:56PM -0400 On 2017-10-05 15:22, Daniel wrote: > Except for c_str(), and variants of the "string operations" that apply > to "null terminated" strings, it seems to me that all of those operations > would apply equally to CBOR binary strings. Everything I know about CBOR is from what I just read at <https://en.wikipedia.org/wiki/CBOR>. Is that what you're referring to? How and why would you want to apply any of the basic_string<>::find*() member functions to CBOR binary strings? It would look at bytes that contain the header, the payload, or the data, without discriminating between them. I can't imagine why you'd want to use any facility on a CBOR string that wasn't aware of the distinction between those parts of the data format. Similarly, how and why would you want to use substr()? I can imagine a use for a container that was aware of the CBOR format, and which parsed the items in a CBOR string into actual data items. I imagine that this container might internally use an array or a standard container of uint8_t to work on the string. But why would any of basic_string<>'s special capabilities be of any particular use for that purpose? As I said before, one of the non-string oriented standard containers would seem to be a better choice. > Rather, my question was about the need for a bytes_view, for which there's > nothing in the standard library, and about the advisability or lack thereof > of basing one on std::basic_string_view<uint8_t,?>, I'm leaning towards no. You haven't really explained anything about what bytes_view is supposed to do, which makes it hard to answer that question. You indicated that it has something to do with JSON and CBOR, which would incline me to agree with your "no". |
"Öö Tiib" <ootiib@hot.ee>: Oct 05 01:36PM -0700 On Thursday, 5 October 2017 22:22:55 UTC+3, Daniel wrote: > would apply equally to CBOR binary strings. There are no text semantics. > std::string, for example, doesn't know about utf8, about continuation bytes, > even though that's often what it holds these days. Currently one likely uses std::string to represent UTF-8 in C++. The literal u8"text" is of type const char[], so no additional conversions are needed. > Rather, my question was about the need for a bytes_view, for which there's > nothing in the standard library, and about the advisability or lack thereof > of basing one on std::basic_string_view<uint8_t,?>, I'm leaning towards no. If there is a need for a class referring to a contiguous sequence of values of type T (that are not characters) somewhere in memory, then maybe use some non-standard library class like gsl::span<T>? Specializing 'std::char_traits' for uint8_t values that are not really meant to be characters, just to get 'std::basic_string' to work, just to get 'std::basic_string_view' to work, can (I feel) confuse more than it helps. Where do you get these "byte strings" whose "bytes views" you need? Don't you also need 'std::codecvt<uint8_t>' for that? On the other hand, Microsoft apparently did it. On the third hand, Microsoft has questionable practices. And on the fourth hand, I don't really know your use cases, the rest of your software, or your plans. ;) |
Daniel <danielaparker@gmail.com>: Oct 05 01:46PM -0700 On Thursday, October 5, 2017 at 3:56:56 PM UTC-4, James R. Kuyper wrote: > Everything I know about CBOR is from what I just read at > <https://en.wikipedia.org/wiki/CBOR>. Is that what you're referring to? https://tools.ietf.org/html/rfc7049 is a better reference. CBOR supports two types of strings: utf8 encoded, and binary. A binary string is just a contiguous sequence of arbitrary bytes. If formatted to text, it would typically be output as base64. > between them. I can't imagine why you'd want to use any facility on a > CBOR string that wasn't aware of the distinction between those parts of > the data format. Similarly, how and why would you want to use substr()? Point taken :-) On the other hand, you can't sensibly use substr on a utf8 encoded string either, at least for arbitrary indices. find can work, but only because of UTF-8's self-synchronizing features. > and which parsed the items in a CBOR string into actual data items. I > imagine that this container might internally use an array or a standard > container of uint8_t to work on the string. Yes, I have one, to encode/decode between CBOR and an unpacked JSON variant. https://github.com/danielaparker/jsoncons/blob/master/doc/ref/cbor/encode_cbor.md > You haven't really explained anything about what bytes_view is supposed > to do Analogous to string_view, a non-mutable non owning holder of a contiguous sequence of bytes, supporting member functions const uint8_t* data() const, length(), operator==, operator[], begin(), end(), perhaps a couple of others. I was going to write one, but I noticed that somebody else's project in this space was using using bytes_view = std::experimental::basic_string_view<char>; so I thought I'd run that by here, to see what people here thought. All other things equal, I'd prefer to leverage existing things than to introduce new things. That's all. Daniel |
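The interface described above could be sketched as a small standalone class that does not involve std::char_traits at all. This is only an illustration under stated assumptions: it assumes uint8_t exists on the target platform, it uses C++11 only, and the exact member set (including empty() and operator!=) is a guess at the "couple of others", not the design that ended up in any real project.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// A non-mutable, non-owning view of a contiguous sequence of bytes.
class bytes_view {
public:
    using value_type = std::uint8_t;
    using const_iterator = const std::uint8_t*;

    bytes_view() noexcept : data_(nullptr), length_(0) {}
    bytes_view(const std::uint8_t* data, std::size_t length) noexcept
        : data_(data), length_(length) {}

    const std::uint8_t* data() const noexcept { return data_; }
    std::size_t length() const noexcept { return length_; }
    bool empty() const noexcept { return length_ == 0; }

    // Bounds are checked only when assertions are enabled.
    std::uint8_t operator[](std::size_t pos) const {
        assert(pos < length_);
        return data_[pos];
    }

    const_iterator begin() const noexcept { return data_; }
    const_iterator end() const noexcept { return data_ + length_; }

    // Two views are equal when they have the same length and the same bytes.
    friend bool operator==(const bytes_view& lhs, const bytes_view& rhs) noexcept {
        return lhs.length_ == rhs.length_ &&
               std::equal(lhs.begin(), lhs.end(), rhs.begin());
    }
    friend bool operator!=(const bytes_view& lhs, const bytes_view& rhs) noexcept {
        return !(lhs == rhs);
    }

private:
    const std::uint8_t* data_;
    std::size_t length_;
};
```

The trade-off is a few dozen lines of new code in exchange for not depending on a char_traits specialization that the standard does not require.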
Daniel <danielaparker@gmail.com>: Oct 05 02:43PM -0700 On Thursday, October 5, 2017 at 2:32:03 PM UTC-4, James R. Kuyper wrote: > uint_least32_t, respectively, which need not have a size of exactly 16 > or 32 bits, respectively. It's extremely likely that each of those four > types will have one of those three sizes, but it's not a requirement. Thanks for remarking on that, I'd overlooked that. I find it lacking that there's nothing in basic_string that tags the encoding, and have been using sizeof(CharT) as an indicator of that, e.g. assuming wchar_t holds utf16 if sizeof(wchar_t) == 16, or utf32 if sizeof(wchar_t) == 32. I realize this isn't technically correct. Is there at least a presumption that char32_t holds utf32? as there's nothing that prevents you from stuffing utf8 or utf16 into it. Daniel |
"James R. Kuyper" <jameskuyper@verizon.net>: Oct 05 06:18PM -0400 On 2017-10-05 17:43, Daniel wrote: >> Actually, there's no such requirement for any of those types, char16_t >> and char32_t are required to be typedefs for uint_least16_t and >> uint_least32_t, respectively, which need not have a size of exactly 16 That's not quite right - I was thinking of C, where that statement was perfectly correct. In C++, char16_t and char32_t are their own distinct types. But it's still correct to say that 16 and 32 bits, respectively, are only minimum values for the widths of those types. There's no requirement that they be exactly that size. > been using sizeof(CharT) as an indicator of that, e.g. assuming > wchar_t holds utf16 if sizeof(wchar_t) == 16, or utf32 if sizeof(wchar_t) > == 32. ... I presume you mean sizeof(...)*CHAR_BIT? The encoding used for narrow (char) and wide (wchar_t) strings and characters is completely implementation-defined. There's no guarantee that it has anything to do with either ASCII or Unicode. I gather that, particularly in Japan, it is (or at least, used to be) commonplace for neither of them to have either encoding. > ... I realize this isn't technically correct. Is there at least a > presumption that char32_t holds utf32? as there's nothing that prevents > you from stuffing utf8 or utf16 into it. You're right - there's nothing to prevent you from stuffing an arbitrary numeric value that's within range into any object of either type. However, there are facilities for creating and interpreting utf-8, utf-16 and utf-32 strings, and those facilities use char, char16_t, and char32_t, respectively. "A string literal that begins with u8, such as u8"asdf", is a UTF-8 string literal and is initialized with the given characters as encoded in UTF-8. Ordinary string literals and UTF-8 string literals are also referred to as narrow string literals. 
A narrow string literal has type "array of n const char", where n is the size of the string as defined below, and has static storage duration (3.7). A string literal that begins with u, such as u"asdf", is a char16_t string literal. A char16_t string literal has type "array of n const char16_t", where n is the size of the string as defined below; it has static storage duration and is initialized with the given characters. A single c-char may produce more than one char16_t character in the form of surrogate pairs. A string literal that begins with U, such as U"asdf", is a char32_t string literal. A char32_t string literal has type "array of n const char32_t", where n is the size of the string as defined below; it has static storage duration and is initialized with the given characters." (2.14.5p7-10) "... The specialization codecvt<char16_t, char, mbstate_t> converts between the UTF-16 and UTF-8 encoding forms, and the specialization codecvt <char32_t, char, mbstate_t> converts between the UTF-32 and UTF-8 encoding forms." (22.4.1.4p3). "For the facet codecvt_utf8: — The facet shall convert between UTF-8 multibyte sequences and UCS2 or UCS4 (depending on the size of Elem) within the program. ... For the facet codecvt_utf16: — The facet shall convert between UTF-16 multibyte sequences and UCS2 or UCS4 (depending on the size of Elem) within the program. ... For the facet codecvt_utf8_utf16: — The facet shall convert between UTF-8 multibyte sequences and UTF-16 (one or two 16-bit codes) within the program." (22.5p4-6) |
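The points about widths and literal types can be checked at compile time. The following is a small sketch, assuming a C++11 or later compiler; bits_in() and check_bits_in() are hypothetical helper names. The u8 prefix is left out on purpose because the type of a u8 literal changed in C++20.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Width of a character type in bits: sizeof counts bytes, not bits,
// so it has to be multiplied by CHAR_BIT.
template <class CharT>
constexpr std::size_t bits_in() {
    return sizeof(CharT) * CHAR_BIT;
}

// 16 and 32 bits are only minimum widths.
static_assert(bits_in<char16_t>() >= 16, "char16_t has at least 16 bits");
static_assert(bits_in<char32_t>() >= 32, "char32_t has at least 32 bits");

// In C++ these are distinct types, not typedefs as in C.
static_assert(!std::is_same<char16_t, std::uint_least16_t>::value,
              "char16_t is its own type");
static_assert(!std::is_same<char32_t, std::uint_least32_t>::value,
              "char32_t is its own type");

// A string literal is an lvalue, so decltype yields a reference to an
// array of n const elements (one character plus the terminator here).
static_assert(std::is_same<decltype(u"a"), const char16_t (&)[2]>::value,
              "u literal is an array of const char16_t");
static_assert(std::is_same<decltype(U"a"), const char32_t (&)[2]>::value,
              "U literal is an array of const char32_t");

// Usage: char is exactly CHAR_BIT bits wide by definition.
inline void check_bits_in() {
    assert(bits_in<char>() == CHAR_BIT);
}
```

None of this says anything about the encoding of char or wchar_t strings, which stays implementation-defined.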
Jorgen Grahn <grahn+nntp@snipabacken.se>: Oct 05 07:02PM On Wed, 2017-10-04, Jerry Stuckle wrote: > On 10/3/2017 11:48 PM, Ian Collins wrote: >> On 10/ 4/17 03:10 PM, Jerry Stuckle wrote: ... >>> real answer to the problem. I think this is a case of poor design. >> Representing JSON objects or something similar. > And why would that be necessary? (He didn't say it was necessary.) > objects, i.e. between systems or in files. Every time I've used JSON > objects the first thing I've done in getting one is create a real object > out of it - and do whatever is appropriate for that object. I tend to do the same. There's already the external representation (JSON) and the internal one (the data structures my code does concrete work with); I don't want a third one, especially if it's more open-ended than the other two. I guess design bias like that is the reason I've never used variant/any/etc. /Jorgen -- // Jorgen Grahn <grahn@ Oo o. . . \X/ snipabacken.se> O o . |
Jerry Stuckle <jstucklex@attglobal.net>: Oct 05 05:35PM -0400 On 10/5/2017 3:02 PM, Jorgen Grahn wrote: >>> Representing JSON objects or something similar. >> And why would that be necessary? > (He didn't say it was necessary.) Then why bring it up? > open-ended than the other two. > I guess design bias like that is the reason I've never used variant/any/etc. > /Jorgen For me it's not bias. I've just never found a need for it - there have always been better ways. -- ================== Remove the "x" from my email address Jerry Stuckle jstucklex@attglobal.net ================== |
ribeiroalvo@gmail.com: Oct 05 11:44AM -0700 I need help implementing an algorithm of my own in C. Can someone help me? Here is the algorithm:

Melgo, a csprng by Ribeiro Alvo, 2017

Description
    a,b,c,d and n as ( 16 bit )
    i as ( 64 bit )
    n = 2**16-1
    a = Initialized in [ 0 , n ]
    b = Initialized in [ 0 , n ]
    c = Initialized in [ 0 , n ]
    d = 2**13
    X[ from 0 to n ] = Initialized with 61 bit values

Key-scheduling algorithm
    for i from 0 to n
        a = 1 + [ a + c ] mod n
        b = 1 + [ b + a ] mod n
        c = 1 + [ c + b ] mod n
        X[i] = X[i] + a * b * c * d + a
    endfor

Pseudo-random generation algorithm
    i = 0
    while GeneratingOutput:
        i = i + 1
        X(a) = [X(a) + X(b)] mod 2**62
        a = [a + c + i] mod [n + 1]
        Output [X(a) + X(b)] mod 2**56
        b = [b + a] mod [n + 1]
        Output [X(b) + x(c)] mod 2**56
        c = [c + b] mod [n + 1]
        Output [X(c) + x(a)] mod 2**56
    endwhile

Thank you |
red floyd <dont.bother@its.invalid>: Oct 05 01:18PM -0700 > I need help to implement the algorithm of my own in C language. > Someone can help me ? > [redacted] You may have come to the wrong place. This is comp.lang.c++. You may want to try comp.lang.c instead. |
ribeiroalvo@gmail.com: Oct 05 01:40PM -0700 On Thursday, 5 October 2017 at 20:19:00 UTC, red floyd wrote: > > [redacted] > You may have come to the wrong place. This is comp.lang.c++. > You may want to try comp.lang.c instead. Thanks, I'll do that. But if a C++ version is possible, I would also appreciate it. |
Ben Bacarisse <ben.usenet@bsb.me.uk>: Oct 05 09:43PM +0100 > I need help to implement the algorithm of my own in C language. For C, I'd post in comp.lang.c. This group is for C++. Since I've included C source, I've set the followup-to header to avoid a language debate. If you reply, please honour that header. > c = Initialized in [ 0 , n ] > d = 2**13 > X[ from 0 to n ] = Initialized with a 61 bit values How are a, b, c and X initialised? Below, there's a hint you mean 62-bit values. > Key-scheduling algorithm > for i from 0 to n Does i ever get to n? I.e. is the upper bound of the for inclusive or not? > a = 1 + [ a + c ] mod n > b = 1 + [ b + a ] mod n > c = 1 + [ c + b ] mod n What do the []s mean here? Is it grouping the "mod n"? I.e. a = 1 + ((a + c) mod n) > X[i] = X[i] + a * b * c * d + a Are the X[i] supposed to be reduced to being 61-bit (or 62-bit) values? I'm guessing yes. > c = [c + b] mod [n + 1] > Output [X(c) + x(a)] mod 2**56 > endwhile It would be much better to use consistent notation. Keep [] for indexing and use () for arithmetic grouping. It's not clear if the output refers to three separate outputs or if the generator is to make one 168-bit number at a time. Since making three separate 56-bit values is more interesting, that's the interpretation I've chosen. Here's a first draft. You need C99 or later.

    #include <stdio.h>

    /* Returns the next 56-bit output; each round yields three of them. */
    unsigned long long csprng(void)
    {
        const unsigned d = 1 << 13;
        const unsigned n = 0xFFFF;
        const unsigned long long mask_62_bits = (1ull << 62) - 1;
        const unsigned long long mask_56_bits = (1ull << 56) - 1;

        static unsigned a, b, c;
        static unsigned long long X[0x10000];
        static int state = 0;
        static unsigned long long i = 0;

        switch (state) {
        case 0:
            if (i == 0) {
                /*
                 * Here we want to set initial values for a, b, c and X,
                 * but I don't know how that is supposed to be done.
                 */
                /* Key schedule; k avoids shadowing the static counter i. */
                for (unsigned long k = 0; k <= n; k++) {
                    a = 1 + ((a + c) % n);
                    b = 1 + ((b + a) % n);
                    c = 1 + ((c + b) % n);
                    X[k] += (unsigned long long)a * b * c * d + a;
                    X[k] &= mask_62_bits;
                }
            }
            i += 1;
            X[a] = (X[a] + X[b]) & mask_62_bits;
            a = (a + c + i) & n;
            state = 1;
            return (X[a] + X[b]) & mask_56_bits;
        case 1:
            b = (b + a) & n;
            state = 2;
            return (X[b] + X[c]) & mask_56_bits;
        case 2:
            c = (c + b) & n;
            state = 0;
            return (X[c] + X[a]) & mask_56_bits;
        }
        return 0; /* not reached: state is always 0, 1 or 2 */
    }

    int main(void)
    {
        for (int i = 0; i < 100; i++)
            printf("%llu\n", csprng());
    }

-- Ben. |
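Earlier in the thread the poster also asked for a C++ version. The following is a rough sketch of how the same interpretation (inclusive key-schedule bound, [] read as grouping, three separate 56-bit outputs per round) might look as a class. Several things here are assumptions rather than part of the posted algorithm: the class name Melgo, the next() member, and above all the seeding, which the thread leaves unspecified and which this sketch simply takes from the caller. Nothing in this thread establishes that the algorithm is cryptographically secure, so this is for experimentation only.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

class Melgo {
public:
    // How a, b, c and X are seeded is not specified in the thread;
    // taking all four from the caller is an assumption of this sketch.
    Melgo(std::uint32_t a, std::uint32_t b, std::uint32_t c,
          std::vector<std::uint64_t> x)
        : a_(a & n), b_(b & n), c_(c & n), x_(std::move(x)) {
        if (x_.size() != table_size) {
            throw std::invalid_argument("X must hold 65536 values");
        }
        // Key schedule, with the upper bound taken as inclusive.
        for (std::size_t k = 0; k < table_size; ++k) {
            a_ = 1 + (a_ + c_) % n;
            b_ = 1 + (b_ + a_) % n;
            c_ = 1 + (c_ + b_) % n;
            const std::uint64_t product = std::uint64_t{a_} * b_ * c_ * d;
            x_[k] = (x_[k] + product + a_) & mask62;
        }
    }

    // One round of the generator: three 56-bit outputs.
    std::array<std::uint64_t, 3> next() {
        ++i_;
        x_[a_] = (x_[a_] + x_[b_]) & mask62;
        a_ = static_cast<std::uint32_t>((a_ + c_ + i_) & n);
        const std::uint64_t first = (x_[a_] + x_[b_]) & mask56;
        b_ = (b_ + a_) & n;
        const std::uint64_t second = (x_[b_] + x_[c_]) & mask56;
        c_ = (c_ + b_) & n;
        const std::uint64_t third = (x_[c_] + x_[a_]) & mask56;
        // All three indices must stay inside the table.
        assert(a_ <= n && b_ <= n && c_ <= n);
        return {{first, second, third}};
    }

private:
    static constexpr std::uint32_t n = 0xFFFF;
    static constexpr std::size_t table_size = 0x10000;
    static constexpr std::uint64_t d = std::uint64_t{1} << 13;
    static constexpr std::uint64_t mask62 = (std::uint64_t{1} << 62) - 1;
    static constexpr std::uint64_t mask56 = (std::uint64_t{1} << 56) - 1;

    std::uint32_t a_, b_, c_;
    std::vector<std::uint64_t> x_;
    std::uint64_t i_ = 0;
};
```

Compared with the draft using static locals, keeping the state in an object allows several independent generators and makes the missing seeding step explicit in the constructor signature.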
"James R. Kuyper" <jameskuyper@verizon.net>: Oct 05 04:49PM -0400 > Thanks > I'll do it > But if a C ++ version is possible, I would also appreciate it. If it can be done in C, it can generally also be done in C++, using almost exactly the same code. There might also be a C++ way of doing it that's better, using radically different code. Note: regardless of what language you want to use, if this is a homework assignment, many of the people who can give you the best help will generally not give you that help until you've first made an attempt to do it yourself. If your code doesn't work as intended, or maybe even fails to compile, you can post your code here and people will be quite happy to help you fix it - but they won't do your homework for you. If this isn't homework, people who are competent to do so generally expect to get paid for doing programming work for you. How much are you willing to offer, and by what payment method? |
ribeiroalvo@gmail.com: Oct 05 02:08PM -0700 On Thursday, 5 October 2017 at 20:49:25 UTC, James R. Kuyper wrote: > If this isn't homework, people who are competent to do so generally > expect to get paid for doing programming work for you. How much are you > willing to offer, and by what payment method? This is neither homework nor for commercial purposes. See: www.number.com/Melgo.html From now on this subject will be dealt with in https://groups.google.com/forum/#!forum/comp.lang.c |
ribeiroalvo@gmail.com: Oct 05 02:10PM -0700 Correction. http://www.number.com.pt/Melgo.html |