- Thank you. (was:Re: uint32_t is not the same as long unsigned int ?) - 3 Updates
- Wading through template instantiation errors for allocator - 3 Updates
- std::copy with std::ostream_iterator<std::array...> - 1 Update
- little contest - 4 Updates
- C++ threshold for "stupid" sorting algorithms (O(n^2)) - 1 Update
- Union type punning in C++ - 1 Update
- Don't be fooled by cpp.sh - 2 Updates
wolfgang bauer <schutz@gmx.de>: Jan 04 03:15PM +0100 Till now I was not aware of some details. Thank you all, for shedding some light onto it. |
James Kuyper <jameskuyper@alumni.caltech.edu>: Jan 04 09:48AM -0500

On 1/4/20 2:56 AM, Keith Thompson wrote:
>> ranges into bit counts to better match your question.
> A quibble: the required ranges of values for the standard integer types
> are copied from the C standard, but are not incorporated by reference.

In my copy of n4567.pdf, 18.3.3 says:

"1 Table 31 describes the header <climits>.
2 The contents are the same as the Standard C library header <limits.h>"

18.3.3p2 is precisely the kind of wording that "incorporated by reference" means to me. What does it mean to you?

The "description" in table 31 is the only place in the C++ standard where all of the *_MIN and *_MAX macros are referred to. The meanings of those macros, and the maximum and minimum (respectively) permitted values for those macros, are given only in the C standard. If you read only the C++ standard, you might not even realize that it imposes any limits on the sizes of integer types, however indirectly.
|
Keith Thompson <Keith.S.Thompson+u@gmail.com>: Jan 04 02:48PM -0800

> for those macros which are only given in the C standard. If you read
> only the C++ standard, you might not even realize that it imposes any
> limits on the sizes of integer types, however indirectly.

You're right. N4567 3.9.1 [basic.fundamental] paragraph 3 says:

    The signed and unsigned integer types shall satisfy the constraints
    given in the C standard, section 5.2.4.2.1.

My earlier post was based on N4842, which is a working draft for C++20 (and happens to be the document that I had open at the time). In that draft, 6.8.1 [basic.fundamental] paragraph 3 includes a table showing the minimum widths of the 5 signed integer types:

    Type          Minimum width N
    signed char    8
    short         16
    int           16
    long          32
    long long     64

The widths are sufficient to specify the ranges, since unlike the current edition of the C++ standard, N4842 mandates 2's-complement for signed integers, including the extra negative value:

    The range of representable values for a signed integer type is
    −2**(N−1) to 2**(N−1) − 1 (inclusive), where N is called the
    *width* of the type.
    ...
    An unsigned integer type has the same object representation, value
    representation, and alignment requirements (6.7.6) as the
    corresponding signed integer type. For each value x of a signed
    integer type, the value of the corresponding unsigned integer type
    congruent to x modulo 2**N has the same value of corresponding bits
    in its value representation.

(expressions tweaked to avoid superscripts).

Of course N4842 is not a standard, and I should have checked the current edition.

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
[Note updated email address]
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */
|
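The minimum widths in the table above can be checked at compile time. This is only a sketch: `width_of` is a hypothetical helper, not something from the thread or the standard library, and it assumes that the width of a type is the `digits` reported by `std::numeric_limits` plus one sign bit for signed types.

```cpp
#include <cassert>
#include <climits>
#include <limits>

// Hypothetical helper: width in bits of an integer type, i.e. the number of
// value bits plus the sign bit (if the type is signed).
template <typename T>
constexpr int width_of() {
    return std::numeric_limits<T>::digits +
           (std::numeric_limits<T>::is_signed ? 1 : 0);
}

// Minimum widths as listed in the N4842 table quoted in the post.
static_assert(width_of<signed char>() >= 8,  "signed char is at least 8 bits");
static_assert(width_of<short>()       >= 16, "short is at least 16 bits");
static_assert(width_of<int>()         >= 16, "int is at least 16 bits");
static_assert(width_of<long>()        >= 32, "long is at least 32 bits");
static_assert(width_of<long long>()   >= 64, "long long is at least 64 bits");

// unsigned char has no padding bits, so its width equals CHAR_BIT.
static_assert(width_of<unsigned char>() == CHAR_BIT, "unsigned char uses every bit");
```

If a `static_assert` fails, the translation unit does not compile, so simply including this in a build is the whole test.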
Frederick Gotham <cauldwell.thomas@gmail.com>: Jan 04 06:12AM -0800

On Friday, January 3, 2020 at 7:33:08 PM UTC, Paavo Helde wrote:
> probably need to define some extra stuff. The allocator requirements
> have been in great flux in the recent standards, I'm not sure what it is
> missing exactly.

I've tried this on three compilers: GNU, Microsoft, Clang

The original code which has two template parameters, "typename T, std::size_t capacity", only compiles on the Clang compiler.

The second version with only one template parameter, "typename T", compiles on all three compilers.

Since the second version works on all three compilers, I don't think that this problem is anything to do with how allocators are implemented in the respective standard libraries for these three compilers. Making the change from two parameters to one parameter shouldn't cause compilation to fail.

Here's the code for the second version which works on all three compilers (I've just commented out the 2nd parameter):

#include <cstddef>  /* size_t */
#include <new>      /* Only for bad_alloc */

std::size_t constexpr capacity = 4;  /* This is instead of a template parameter */

template<typename T /*, std::size_t capacity*/ >
class StaticAllocator {
public:

    typedef T value_type;

protected:

    static T buf[capacity];

public:

    T *allocate(std::size_t const n)
    {
        if (n > capacity)
            throw std::bad_alloc();

        return buf;
    }

    void deallocate(T *, std::size_t)
    {
        /* Do Nothing */
    }
};

template<typename T /*, std::size_t capacity*/ >
T StaticAllocator<T /*,capacity*/ >::buf[capacity];

using std::size_t;

#include <vector>
using std::vector;

#include <iostream>
using std::cout;
using std::endl;

auto main(void) -> int
{
    vector< char, StaticAllocator<char /*, 4*/ > > v;

    v.push_back('a');
    v.push_back('b');
    v.push_back('c');
    v.push_back('d');

    for (auto const &elem : v)
        cout << elem << endl;

    vector< char, StaticAllocator<char /*, 4 */> > v2;

    v2.push_back('x');
    v2.push_back('y');
    v2.push_back('z');

    for (auto const &elem : v2)
        cout << elem << endl;

    // Now try the first vector again

    for (auto const &elem : v)
        cout << elem << endl;
}
|
Bo Persson <bo@bo-persson.se>: Jan 04 03:53PM +0100

On 2020-01-04 at 15:12, Frederick Gotham wrote:
> for (auto const &elem : v)
> cout << elem << endl;
> }

Seems like the culprit is the rebind member template from the allocator requirements. MSVC uses that to make sure that the allocator used for vector<T> really allocates T's:

    using _Rebind_alloc_t = typename allocator_traits<_Alloc>::template rebind_alloc<_Value_type>;

The allocator table says:

    A::template rebind<U>::other (optional)[1]

with the very important note:

"Notes: rebind is only optional (provided by std::allocator_traits) if this allocator is a template of the form SomeAllocator<T, Args>, where Args is zero or more additional template type parameters."

https://en.cppreference.com/w/cpp/named_req/Allocator#cite_note-1

As your second template parameter is a non-type template parameter (the value 4), it doesn't *fully* comply with these requirements and so a compiler doesn't have to accept it.

Apparently, some compilers might work if the allocator is *almost* correct, but they don't have to.

Bo Persson
|
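One way to satisfy the requirement quoted above while keeping a non-type parameter is to spell out `rebind` by hand. The following is a sketch under stated assumptions, not the code from the thread: `CappedAllocator` is a hypothetical name, and it takes its memory from `::operator new` instead of a shared static buffer so that the example stays safe when several containers use it.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>
#include <vector>

// Hypothetical allocator with a non-type template parameter. Because
// Capacity is not a type, std::allocator_traits cannot synthesize rebind,
// so the allocator provides it explicitly.
template <typename T, std::size_t Capacity>
class CappedAllocator {
public:
    using value_type = T;

    template <typename U>
    struct rebind {
        using other = CappedAllocator<U, Capacity>;
    };

    CappedAllocator() = default;

    // Converting constructor, needed when a container rebinds the allocator.
    template <typename U>
    CappedAllocator(const CappedAllocator<U, Capacity>&) noexcept {}

    T* allocate(std::size_t n) {
        if (n > Capacity)
            throw std::bad_alloc();  // refuse requests above the cap
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }

    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};

// All instances are interchangeable: memory from one can be freed by another.
template <typename T, typename U, std::size_t N>
bool operator==(const CappedAllocator<T, N>&, const CappedAllocator<U, N>&) noexcept {
    return true;
}

template <typename T, typename U, std::size_t N>
bool operator!=(const CappedAllocator<T, N>&, const CappedAllocator<U, N>&) noexcept {
    return false;
}

// allocator_traits now picks up the member rebind and keeps the capacity.
static_assert(std::is_same<
                  std::allocator_traits<CappedAllocator<char, 4>>::rebind_alloc<int>,
                  CappedAllocator<int, 4>>::value,
              "rebind must preserve the capacity parameter");
```

With this in place, `std::vector<char, CappedAllocator<char, 4>>` can be instantiated directly, with no need to wrap the capacity in an outer class template.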
Frederick Gotham <cauldwell.thomas@gmail.com>: Jan 04 02:34PM -0800

Bo wrote:
> rebind is only optional (provided by std::allocator_traits) if this
> allocator is a template of the form SomeAllocator<T, Args>, where Args
> is zero or more additional template type parameters."

Well spotted.

Here's my workaround:

#include <cstddef>  /* size_t */
#include <new>      /* Only for bad_alloc */

template <std::size_t capacity>
class Outer {
public:  /* StaticAllocator must be accessible as Outer<N>::StaticAllocator */

    template<typename T>
    class StaticAllocator {
    public:

        typedef T value_type;

    protected:

        static T buf[capacity];

    public:

        T *allocate(std::size_t const n)
        {
            if (n > capacity)
                throw std::bad_alloc();

            return buf;
        }

        void deallocate(T *, std::size_t)
        {
            /* Do Nothing */
        }
    };
};

template<std::size_t capacity>
template<typename T>
T Outer<capacity>::StaticAllocator<T>::buf[capacity];

using std::size_t;

#include <vector>
using std::vector;

#include <iostream>
using std::cout;
using std::endl;

auto main(void) -> int
{
    vector< char, Outer<4>::StaticAllocator<char> > v;

    v.push_back('a');
    v.push_back('b');
    v.push_back('c');
    v.push_back('d');

    for (auto const &elem : v)
        cout << elem << endl;

    vector< char, Outer<4>::StaticAllocator<char> > v2;

    v2.push_back('x');
    v2.push_back('y');
    v2.push_back('z');

    for (auto const &elem : v2)
        cout << elem << endl;

    // Now try the first vector again

    for (auto const &elem : v)
        cout << elem << endl;
}

Now I can get back to testing my new allocator.
|
Ike Naar <ike@sdf.lonestar.org>: Jan 04 09:46PM

> bool operator<(const struct Foo &r) const { //needed for set
> if (i<r.i) return true;
> return j<r.j;

This looks suspect. Do you want (2,1) to be less than (1,2)?
If you want to define a lexicographical order on (i,j), the comparison should be

    return i<r.i || (i==r.i && j<r.j);
|
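With the quoted comparison, (2,1) < (1,2) and (1,2) < (2,1) are both true, so it is not a strict weak ordering and `std::set` cannot rely on it. A common way to get the lexicographical order is `std::tie`; the sketch below assumes `Foo` has just the two `int` members `i` and `j` seen in the quote, which may not match the original struct.

```cpp
#include <cassert>
#include <set>
#include <tuple>

// Assumed layout: only the members i and j visible in the quoted snippet.
struct Foo {
    int i;
    int j;

    // Lexicographical order on (i, j): compare i first, then j on ties.
    // Equivalent to: i < r.i || (i == r.i && j < r.j)
    bool operator<(const Foo& r) const {
        return std::tie(i, j) < std::tie(r.i, r.j);
    }
};
```

`std::tie` builds tuples of references, and tuple comparison is already lexicographical, so adding a third member later only means extending both argument lists.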
"Öö Tiib" <ootiib@hot.ee>: Jan 04 05:21AM -0800

On Saturday, 4 January 2020 12:04:24 UTC+2, Bonita Montero wrote:
> > - the TIFF file format
> > - linker symbols as seen in the output from 'nm -CP'
> That has nothing to do with parsing.

Calling all "deserialization" "parsing" is what is pedantically wrong. But your "nothing to do" is an exaggeration. Parsing is a subset of those activities: deserialization of text formats.
|
Jorgen Grahn <grahn+nntp@snipabacken.se>: Jan 04 03:04PM

On Sat, 2020-01-04, Öö Tiib wrote:
> pedantically wrong to do. But your "nothing to do" is
> exaggeration. Parsing is subset of activities, deserialization
> of text formats.

(Note that the output from nm -CP is text, an address and a C++ name.)

I was not aware at all that there is a distinction. If there is one, it must be hard to draw the line. Recursive definition?

Not that it matters much. The real reason I brought it up is that there's often a choice when designing a data format:

- use XML or JSON or similar, and you need help parsing it
- make up your own simpler format (often "key: value" is enough) and
  you can make your own parser, can use normal Unix tools on it ...
  but don't get any help from XML or JSON tools.

This second option may not be popular right now, but it does exist.

/Jorgen

--
  // Jorgen Grahn <grahn@  Oo  o.   .     .
\X/     snipabacken.se>   O  o   .
|
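To give a feel for how small a hand-written parser for the second option can be, here is a sketch of a "key: value" reader. Everything in it is an assumption made for illustration: the names `trim` and `parse_key_values` are hypothetical, the first colon on a line is taken as the separator, and lines without a colon are silently skipped.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: strip leading and trailing whitespace.
inline std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const auto first = s.find_first_not_of(ws);
    if (first == std::string::npos)
        return "";
    const auto last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Hypothetical parser: one "key: value" pair per line. A later line with
// the same key overwrites the earlier value.
inline std::map<std::string, std::string> parse_key_values(std::istream& in) {
    std::map<std::string, std::string> result;
    std::string line;
    while (std::getline(in, line)) {
        const auto colon = line.find(':');
        if (colon == std::string::npos)
            continue;  // not a "key: value" line, skip it
        result[trim(line.substr(0, colon))] = trim(line.substr(colon + 1));
    }
    return result;
}
```

Because the format is line-oriented, the same files also work with grep, sed and awk, which is the trade-off described above: less tooling support than XML or JSON, but much less machinery.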
"Öö Tiib" <ootiib@hot.ee>: Jan 04 12:39PM -0800

On Saturday, 4 January 2020 17:04:29 UTC+2, Jorgen Grahn wrote:
> I was not aware at all that there is a distinction. If there is one,
> it must be hard to draw the line. Recursive definition?
> Not that matters much.

Are you saying the term has widened from linguistics (where parsing is semantic analysis of text) to computer science, where it now means any kind of deserialization? I am the last person to ask about the extent of modern English anyway.

> you can make your own parser, can use normal Unix tools on it ...
> but don't get any help from XML or JSON tools.
> This second option may not be popular right now, but it does exist.

I myself like to use well-established portable formats (like, say, PNG for raster pictures). Then I use JSON for everything for which I don't have such a format. That can result in a lot of files, which I prefer to zip into one file in Open Document Format style.
|
Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Jan 04 08:50PM

>> the top 5% of parsers performance-wise.
> Of course you did. Did you solve the 3 body problem and world peace at the
> same time?

Yes I did: if you don't believe me then you simply have to look at the source code as it is on github. Now kindly fuck off.

/Flibble

--
"Snakes didn't evolve, instead talking snakes with legs changed into snakes." - Rick C. Hodgin

"You won't burn in hell. But be nice anyway." – Ricky Gervais

"I see Atheists are fighting and killing each other again, over who doesn't believe in any God the most. Oh, no..wait.. that never happens." – Ricky Gervais

"Suppose it's all true, and you walk up to the pearly gates, and are confronted by God," Byrne asked on his show The Meaning of Life. "What will Stephen Fry say to him, her, or it?"
"I'd say, bone cancer in children? What's that about?" Fry replied.
"How dare you? How dare you create a world to which there is such misery that is not our fault. It's not right, it's utterly, utterly evil."
"Why should I respect a capricious, mean-minded, stupid God who creates a world that is so full of injustice and pain. That's what I would say."
|
Soviet_Mario <SovietMario@CCCP.MIR>: Jan 04 05:58PM +0100

On 03/01/20 20:12, Öö Tiib wrote:
> The <algorithm> also you need to compile in compiler set to C++17.
> That perhaps means adding
> CONFIG += c++17

Thanks to you both. I'd have to verify whether the installed GCC supports such a recent standard. But I guess some suitable version of <algorithm> also exists in less recent versions, or at least I hope so.

--
1) Resist, resist, resist.
2) If everyone pays taxes, the taxes are paid by everyone
Soviet_Mario - (aka Gatto_Vizzato)
|
Bonita Montero <Bonita.Montero@gmail.com>: Jan 04 02:22PM +0100

> I suspect the results will be highly dependent on details, like the
> exact chip you are using, and where you draw the line between "small
> blocks" and "big blocks".

Here's a little benchmark that compares rep movsq with avx-copying (without loop-unrolling!):

C++-Code:

#include <Windows.h>
#include <iostream>
#include <cstring>
#include <cstdint>
#include <chrono>
#include <intrin.h>

using namespace std;
using namespace chrono;

extern "C" void fAvx( __m256 *src, __m256 *dst, size_t size, size_t repts );
extern "C" void fMovs( __m256 *src, __m256 *dst, size_t size, size_t repts );

int main()
{
    size_t const PAGE = 4096,
                 ROUNDS = 100'000;
    char *pPage = (char *)VirtualAlloc( nullptr, 2 * PAGE, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE );
    __m256 *src = (__m256 *)pPage,
           *dst = (__m256 *)(pPage + PAGE);
    memset( pPage, 0, 2 * PAGE );
    using timestamp = time_point<high_resolution_clock>;
    for( size_t size = 1; size <= (PAGE / 32); ++size )
    {
        timestamp start = high_resolution_clock::now();
        fAvx( src, dst, size, ROUNDS );
        uint64_t avxNs = (uint64_t)duration_cast<nanoseconds>( high_resolution_clock::now() - start ).count();
        start = high_resolution_clock::now();
        fMovs( src, dst, size, ROUNDS );
        uint64_t movsNs = (uint64_t)duration_cast<nanoseconds>( high_resolution_clock::now() - start ).count();
        cout << "size: " << size << "\tavx:\t" << avxNs / 1.0E6 << "\tmovs\t" << movsNs / 1.0E6 << endl;
    }
}

Asm-Code:

_TEXT SEGMENT

; void fAvx( __m256 *src, __m256 *dst, size_t count, size_t repts );
; rcx: src
; rdx: dst
; r8: count
; r9: repts

fAvx PROC
    test    r9, r9
    jz      zero
    test    r8, r8
    jz      zero
    mov     rax, r8
    shl     rax, 5
    add     rax, rcx
    sub     rdx, rcx
    mov     r10, rcx
    mov     r11, rdx
    jmp     avxLoop
reptLoop:
    mov     rcx, r10
    mov     rdx, r11
avxLoop:
    vmovups ymm0, [rcx]
    vmovups [rcx+rdx], ymm0
    add     rcx, 32
    cmp     rcx, rax
    jne     avxLoop
    dec     r9
    jnz     reptLoop
zero:
    ret
fAvx ENDP

; void fMovs( __m256 *src, __m256 *dst, size_t count, size_t repts );
; rcx: src
; rdx: dst
; r8: count
; r9: repts

fMovs PROC
    test    r9, r9
    jz      zero
    push    rsi
    push    rdi
    mov     r10, rcx
    mov     r11, rdx
    lea     rdx, [r8 * 4]
reptLoop:
    mov     rsi, r10
    mov     rdi, r11
    mov     rcx, rdx
    rep movsq
    dec     r9
    jnz     reptLoop
    pop     rdi
    pop     rsi
zero:
    ret
fMovs ENDP

_TEXT ENDS
END

That's the relative speedup of AVX over rep movsq:

size: 1 1383,79%
size: 2 737,12%
size: 3 433,35%
size: 4 342,41%
size: 5 283,20%
size: 6 431,57%
size: 7 351,47%
size: 8 340,53%
size: 9 314,24%
size: 10 325,57%
size: 11 270,96%
size: 12 327,83%
size: 13 296,13%
size: 14 275,73%
size: 15 284,19%
size: 16 317,27%
size: 17 331,54%
size: 18 266,05%
size: 19 287,00%
size: 20 281,83%
size: 21 276,17%
size: 22 261,85%
size: 23 263,01%
size: 24 251,48%
size: 25 247,98%
size: 26 237,64%
size: 27 239,66%
size: 28 187,04%
size: 29 185,92%
size: 30 189,09%
size: 31 168,90%
size: 32 179,31%
size: 33 220,31%
size: 34 192,71%
size: 35 207,33%
size: 36 214,69%
size: 37 156,90%
size: 38 169,47%
size: 39 184,87%
size: 40 159,98%
size: 41 175,79%
size: 42 156,60%
size: 43 162,29%
size: 44 155,36%
size: 45 158,09%
size: 46 164,42%
size: 47 154,88%
size: 48 164,17%
size: 49 155,84%
size: 50 157,59%
size: 51 148,29%
size: 52 152,67%
size: 53 139,59%
size: 54 149,78%
size: 55 140,99%
size: 56 146,94%
size: 57 142,01%
size: 58 148,15%
size: 59 141,62%
size: 60 152,89%
size: 61 152,00%
size: 62 149,20%
size: 63 150,13%
size: 64 150,45%
size: 65 140,96%
size: 66 132,11%
size: 67 142,80%
size: 68 135,96%
size: 69 146,18%
size: 70 140,17%
size: 71 139,63%
size: 72 139,22%
size: 73 131,02%
size: 74 145,43%
size: 75 138,23%
size: 76 132,02%
size: 77 142,05%
size: 78 135,97%
size: 79 136,52%
size: 80 138,93%
size: 81 136,06%
size: 82 138,59%
size: 83 139,08%
size: 84 134,50%
size: 85 136,64%
size: 86 134,28%
size: 87 133,35%
size: 88 129,82%
size: 89 138,07%
size: 90 132,57%
size: 91 125,16%
size: 92 138,73%
size: 93 135,70%
size: 94 131,55%
size: 95 126,62%
size: 96 134,87%
size: 97 130,83%
size: 98 129,21%
size: 99 126,70%
size: 100 133,07%
size: 101 129,39%
size: 102 129,12%
size: 103 125,27%
size: 104 124,14%
size: 105 131,78%
size: 106 132,87%
size: 107 131,40%
size: 108 128,29%
size: 109 122,95%
size: 110 121,13%
size: 111 121,73%
size: 112 126,26%
size: 113 130,87%
size: 114 131,31%
size: 115 124,70%
size: 116 119,53%
size: 117 121,42%
size: 118 120,34%
size: 119 125,65%
size: 120 124,95%
size: 121 130,36%
size: 122 128,35%
size: 123 128,25%
size: 124 127,47%
size: 125 124,28%
size: 126 124,14%
size: 127 122,69%
size: 128 122,76%

So movsq is never faster.
Here's the result graphically: https://app.unsee.cc/#45f34f42
So it's also exactly the opposite of what Melzzz said: movsq becomes more competitive as the block size rises.
|
boltar@nowhere.org: Jan 04 12:43PM

On Fri, 03 Jan 2020 10:09:04 -0600
>done is not really relevant), but has explicitly decided *not* to
>support C99 and later. That's changed a smidgen, of late, and will
>VLAs no longer being mandatory,

They're not? Figures, aside from variadic macros they're the only thing in C99 that I found useful.
|
boltar@nowhere.org: Jan 04 12:45PM

On Fri, 3 Jan 2020 17:22:08 +0100
>Actually, it is.
>Section 6.5.6p5 of the C standard says "The result of the binary +
>operator is the sum of the operands." There will be an equivalent

Ok, you got me there. I genuinely didn't expect the bleeding obvious to be included in the standard but then I have better things to do with my time than read it.

>definition in the C++ standard if you choose to look for it.
>At what point will you realise you'll benefit more by trying to learn
>from other people, rather than continually making a fool of yourself?

It's such fun winding you all up :)
|