- "Doing UTF-8 in Windows" by Mircea Neacsu - 5 Updates
- "2020-02 Prague ISO C++ Committee Trip Report — 🎉 C++20 is Done! 🎉" - 2 Updates
- Union type punning in C++ redux - 7 Updates
- transactional memory idea - 2 Updates
Lynn McGuire <lynnmcguire5@gmail.com>: Feb 19 03:35PM -0600

"Doing UTF-8 in Windows" by Mircea Neacsu
https://www.codeproject.com/Articles/5252037/Doing-UTF-8-in-Windows

"This is (yet another!) article on how to handle UTF-8 encoding on a platform that still encourages the UTF-16 encoding. I am also providing a small library for this purpose. The code works, it is clean, easy to understand and small."

"This is an implementation of the solution advocated in the UTF-8 Everywhere manifesto. I would strongly encourage you to go read the whole document to get indoctrinated ☺."
http://utf8everywhere.org/

We are finally moving our software to UTF-8. It is horrendous so far.

Lynn

Jorgen Grahn <grahn+nntp@snipabacken.se>: Feb 19 10:16PM

On Wed, 2020-02-19, Lynn McGuire wrote:
...
> We are finally moving our software to UTF-8. It is horrendous so far.

Can you expand on that? E.g. moving from what?

/Jorgen

--
  // Jorgen Grahn <grahn@  Oo  o.   .     .
\X/     snipabacken.se>   O  o   .

Lynn McGuire <lynnmcguire5@gmail.com>: Feb 19 04:26PM -0600

On 2/19/2020 4:16 PM, Jorgen Grahn wrote:
>> We are finally moving our software to UTF-8. It is horrendous so far.
> Can you expand on that? E.g. moving from what?
> /Jorgen

ASCII.

Our Windows user interface has 450,000 lines of code in C++. Our Calculation Engine has 700,000 lines of F77 and 10,000+ lines of C and C++.

Lynn

Jorgen Grahn <grahn+nntp@snipabacken.se>: Feb 19 10:48PM

On Wed, 2020-02-19, Lynn McGuire wrote:
>> Can you expand on that? E.g. moving from what?
>> /Jorgen
> ASCII.

Then you're already doing UTF-8! (Only half-joking.)

> Our Windows user interface has 450,000 lines of code in C++.
> Our Calculation Engine has 700,000 lines of F77 and 10,000+ lines of C
> and C++.

I guess this is much work or little, depending on how much that code cares about the actual contents of strings.

/Jorgen

--
  // Jorgen Grahn <grahn@  Oo  o.   .     .
\X/     snipabacken.se>   O  o   .

Lynn McGuire <lynnmcguire5@gmail.com>: Feb 19 05:06PM -0600

On 2/19/2020 4:48 PM, Jorgen Grahn wrote:
> I guess this is much work or little, depending on how much that code cares
> about the actual contents of strings.
> /Jorgen

Anything that calls the Win32 API or opens a file ...

Lynn

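To make the "anything that calls the Win32 API" remark concrete: under the UTF-8 Everywhere approach, strings stay UTF-8 inside the program and are widened to UTF-16 only at the API boundary. The following is a minimal portable sketch of that widening step. The helper name `utf8_to_utf16` is hypothetical and is not taken from the article's library; on Windows the same job is normally done with `MultiByteToWideChar(CP_UTF8, ...)`. Pure ASCII input passes through unchanged, which is the sense in which ASCII code is "already doing UTF-8".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: decode UTF-8 bytes into UTF-16 code units.
// Throws std::invalid_argument on malformed input.
inline std::u16string utf8_to_utf16(const std::string& in)
{
    // Smallest code point that legitimately needs a sequence of each length.
    static constexpr std::uint32_t min_cp[] = {0, 0, 0x80, 0x800, 0x10000};

    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t len = 0;
        if (lead < 0x80)                { cp = lead;        len = 1; }  // ASCII
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; len = 2; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; len = 3; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; len = 4; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (in.size() - i < len)
            throw std::invalid_argument("truncated UTF-8 sequence");
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        assert(len >= 1 && len <= 4);
        // Reject overlong forms, surrogates and values beyond U+10FFFF.
        if (cp < min_cp[len] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("invalid UTF-8 code point");

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
        i += len;
    }
    return out;
}
```

A caller would convert just before a wide-character call and keep `std::string` everywhere else, so the conversion cost is confined to the code that touches the operating system.
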
Lynn McGuire <lynnmcguire5@gmail.com>: Feb 19 04:08PM -0600

"2020-02 Prague ISO C++ Committee Trip Report — 🎉 C++20 is Done! 🎉"
https://www.reddit.com/r/cpp/comments/f47x4o/202002_prague_iso_c_committee_trip_report_c20_is/

Wow, that is a lot of new stuff that I probably will not use.

Lynn

Jorgen Grahn <grahn+nntp@snipabacken.se>: Feb 19 10:21PM

On Wed, 2020-02-19, Lynn McGuire wrote:
> "2020-02 Prague ISO C++ Committee Trip Report — 🎉 C++20 is Done! 🎉"
> https://www.reddit.com/r/cpp/comments/f47x4o/202002_prague_iso_c_committee_trip_report_c20_is/
> Wow, that is a lot of new stuff that I probably will not use.

I wonder what happened to Stroustrup's request to slow down the development? I read no C++-related news, so I am unaware of any results.

/Jorgen

--
  // Jorgen Grahn <grahn@  Oo  o.   .     .
\X/     snipabacken.se>   O  o   .

Daniel <danielaparker@gmail.com>: Feb 18 08:40PM -0800

I've been following the "Union type punning in C++" posts with some interest, but not exactly sure of the conclusion. Would the following be legal C++?

    #include <string>
    #include <new>

    // pod type
    struct A
    {
        uint8_t tag;
    };

    // non-pod
    struct B
    {
        A a;
        std::string s;

        B(const std::string& s)
            : a{1}, s(s)
        {
        }
    };

    struct C
    {
        A a;
        double d;

        C(double d)
            : a{ 2 }, d(d)
        {
        }
    };

    class V
    {
        union
        {
            A a;
            B b;
            C c;
        };
    public:
        V(const std::string& s)
        {
            ::new(&b) B(s);
        }
        V(double d)
        {
            ::new(&c) C(d);
        }
        ~V()
        {
            switch (tag())
            {
            case 1:
                b.~B();
                break;
            case 2:
                c.~C();
                break;
            default:
                break;
            }
        }
        uint8_t tag() const
        {
            return a.tag;
        }
    };

What if B and C were defined through inheritance from A instead, i.e.

    struct B : A
    {
        std::string s;

        B(const std::string& s)
            : A{1}, s(s)
        {
        }
    };

    struct C : A
    {
        double d;

        C(double d)
            : A{ 2 }, d(d)
        {
        }
    };

Thanks,
Daniel

Pavel <pauldontspamtolk@removeyourself.dontspam.yahoo>: Feb 19 01:21AM -0500

Daniel wrote:
> return a.tag;
> }
> };

I think no: it is not allowed to inspect a.tag regardless of the constructor used to construct V because `tag' is not a part of the common initial sequence of either V::a and V::b or of V::a and V::c (I think the common initial sequence is empty in both cases).

> {
> }
> };

I think no, for same reason.

I think changing implementation of tag() to either of { return b.tag; } or { return c.tag; } would make the code valid (then of course having V::a member would be unnecessary).

> Thanks,
> Daniel

FWIW,
-Pavel

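A minimal sketch of the variant Pavel describes, under two assumptions that are not in Daniel's post: the union holds only B and C, and `std::string` is swapped for `std::uint64_t` so that the standard-layout requirement does not depend on the library implementation. With composition the tag is reached as `b.a.tag`. As I read the common-initial-sequence rule, that read is permitted whichever of the two members is active, because both structs begin with the same `A a` member.

```cpp
#include <cassert>
#include <cstdint>
#include <new>
#include <type_traits>

struct A { std::uint8_t tag; };

struct B {
    A a;                 // first member: shared with C
    std::uint64_t n;
    explicit B(std::uint64_t n) : a{1}, n(n) {}
};

struct C {
    A a;                 // first member: shared with B
    double d;
    explicit C(double d) : a{2}, d(d) {}
};

// The common-initial-sequence rule only applies to standard-layout types.
static_assert(std::is_standard_layout<B>::value, "B must be standard layout");
static_assert(std::is_standard_layout<C>::value, "C must be standard layout");

class V {
    union {
        B b;
        C c;
    };
public:
    explicit V(std::uint64_t n) { ::new (&b) B(n); }  // b becomes the active member
    explicit V(double d)        { ::new (&c) C(d); }  // c becomes the active member

    // Reads the tag through b even when c is active; `a` is part of the
    // common initial sequence of B and C.
    std::uint8_t tag() const {
        const std::uint8_t t = b.a.tag;
        assert(t == 1 || t == 2);
        return t;
    }
    // B and C are trivially destructible in this sketch, so no destructor
    // dispatch on the tag is needed.
};
```

With a non-trivial member such as `std::string`, the destructor would again have to switch on `tag()` and call `b.~B()` or `c.~C()` explicitly, as in the original post.
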
Daniel <danielaparker@gmail.com>: Feb 19 06:09AM -0800

On Wednesday, February 19, 2020 at 1:21:36 AM UTC-5, Pavel wrote:
> > Would the following be legal C++?
> > snipped
> I think no

Would the following (non-union) alternative be legal C++?

    #include <string>
    #include <new>
    #include <algorithm>

    enum class tag_type : uint8_t {b,c};

    struct A
    {
        tag_type tag;
    };

    struct B : A
    {
        uint8_t extra;
        uint64_t n;

        B(uint64_t n)
            : A{tag_type::b}, n(n)
        {
        }
    };

    struct C : A
    {
        double d;

        C(double d)
            : A{tag_type::c}, d(d)
        {
        }
    };

    class V
    {
        static constexpr size_t data_size = std::max(sizeof(B), sizeof(C));
        static constexpr size_t data_align = std::max(alignof(B), alignof(C));
        typedef typename std::aligned_storage<data_size, data_align>::type data_t;
        data_t data_;
    public:
        V(uint8_t n)
        {
            ::new(&data_) B(n);
        }
        V(double d)
        {
            ::new(&data_) C(d);
        }
        ~V()
        {
            switch (tag())
            {
            case tag_type::b:
                reinterpret_cast<const B*>(&data_)->~B();
                break;
            case tag_type::c:
                reinterpret_cast<const C*>(&data_)->~C();
                break;
            default:
                break;
            }
        }
        tag_type tag() const
        {
            return reinterpret_cast<const A*>(&data_)->tag;
        }
    };

"Öö Tiib" <ootiib@hot.ee>: Feb 19 08:25AM -0800 On Wednesday, 19 February 2020 16:10:23 UTC+2, Daniel wrote: > > > snipped > > I think no > Would the following (non-union) alternative be legal C++? No. You have wrong idea that base classes are better. Resulting are not standard layout types. It is because the requirement (in [class.prop]) "has all non-static data members and bit-fields in the class and its base classes first declared in the same class" is not fulfilled. And so "common initial sequence" does not apply and "address of class object is same as address of its first non-static data member object" does not also apply. > return reinterpret_cast<const A*>(&data_)->tag; > } > }; I suggest to get rid of base classes with data members and have A as first data member. Then it is proper approach. Additional note: When you want to have standard library classes as data members of your standard layout classes then always static_assert in code that these are: static_assert(std::is_standard_layout<std::string>::value , "This code needs std::string to be standard layout"); It is because standard does not require it and that can turn your reinterpret_cast of pointer of object into pointer of its first member into undefined behavior. |
Daniel <danielaparker@gmail.com>: Feb 19 09:08AM -0800

On Wednesday, February 19, 2020 at 11:26:22 AM UTC-5, Öö Tiib wrote:
> <snipped>
> No.
> <snipped>

That's very helpful, thanks.

For this exercise, the design goals are correctness, compactness (assume 10's of millions of V's), and encapsulation of the B's and C's, in that order. For the last example, assuming eight byte alignment, it would be desirable to have sizeof(V) == 16.

To that end, my next question is, would this be legal C++:

    #include <string>
    #include <new>
    #include <algorithm>
    #include <cstring>

    struct B
    {
        uint8_t tag;
        uint8_t extra;
        uint64_t n;

        B(uint64_t n, uint8_t extra = 0)
            : tag{1}, n(n), extra(extra)
        {
        }
    };

    struct C
    {
        uint8_t tag;
        double d;

        C(double d)
            : tag{2}, d(d)
        {
        }
    };

    class V
    {
        static constexpr size_t data_size = std::max(sizeof(B), sizeof(C));
        static constexpr size_t data_align = std::max(alignof(B), alignof(C));
        typedef typename std::aligned_storage<data_size, data_align>::type data_t;
        data_t data_;
    public:
        V(uint8_t n)
        {
            ::new(&data_) B(n);
        }
        V(double d)
        {
            ::new(&data_) C(d);
        }
        ~V()
        {
            switch (tag())
            {
            case 1:
                reinterpret_cast<const B*>(&data_)->~B();
                break;
            case 2:
                reinterpret_cast<const C*>(&data_)->~C();
                break;
            default:
                break;
            }
        }
        uint8_t tag() const
        {
            uint8_t t;
            std::memcpy(&t, &data_, sizeof(uint8_t));
            return t;
        }
    };

> It is because standard does not require it and that can turn
> your reinterpret_cast of pointer of object into pointer of its
> first member into undefined behavior.

Thanks for pointing that out. For my purposes, I do need to support std::allocator_traits<Alloc>::pointer including fancy pointers.

Daniel

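The standard-layout point raised in this thread can be checked at compile time. The sketch below uses hypothetical names (`Base`, `Derived`, `Composed`) rather than the A, B and C from the posts. It shows that putting data members in both a base and a derived class loses standard layout, while holding the tagged struct as the first data member keeps it, and with it the guarantee that the object and its first member share an address.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

struct Base { std::uint8_t tag; };

// Data members are declared in both the base and the derived class,
// so this is not a standard-layout type.
struct Derived : Base { double d; };

// All data members are declared in one class; the tagged struct comes first.
struct Composed { Base a; double d; };

static_assert(!std::is_standard_layout<Derived>::value,
              "members split across base and derived: not standard layout");
static_assert(std::is_standard_layout<Composed>::value,
              "composition keeps standard layout");

// For a standard-layout class, a pointer to the object can be converted to a
// pointer to its first non-static data member.
inline const Base* first_member(const Composed& c)
{
    const Base* p = reinterpret_cast<const Base*>(&c);
    assert(p == &c.a);
    return p;
}
```

The same `static_assert` pattern applies to any library type used as a member, since the standard does not promise that types such as `std::string` are standard layout.
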
"Öö Tiib" <ootiib@hot.ee>: Feb 19 12:22PM -0800 On Wednesday, 19 February 2020 19:09:20 UTC+2, Daniel wrote: > { > case 1: > reinterpret_cast<const B*>(&data_)->~B(); Why const B* not B*? > return t; > } > }; Yes, the classes are standard layout and therefore in tag() you could just return *reinterpret_cast<uint8_t const*>(&data_); as well. > std::allocator_traits<Alloc>::pointer > including fancy pointers. > Daniel Yes, when you are unsure if certain member in your B or C is standard layout or not then check std::is_standard_layout about the member or about whole B or C. |
Daniel <danielaparker@gmail.com>: Feb 19 12:59PM -0800

On Wednesday, February 19, 2020 at 3:22:52 PM UTC-5, Öö Tiib wrote:
> > reinterpret_cast<const B*>(&data_)->~B();
> > break;
> Why const B* not B*?

No reason. Copied that piece from code that accesses the object.

> Yes, the classes are standard layout and therefore in tag()
> you could just return *reinterpret_cast<uint8_t const*>(&data_);
> as well.

Thanks! very much appreciate your feedback.

Daniel

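A compact sketch of where this thread ends up, with a few stated deviations from the posted code: an `alignas` byte array stands in for `std::aligned_storage`, the constructors take `std::uint64_t` and `double` and build B and C respectively, and the destructor dispatch is dropped because both alternatives are trivially destructible here. The tag is read with `std::memcpy` from the start of the buffer, which relies on B and C being standard layout with the tag as their first member.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <new>
#include <type_traits>

struct B {
    std::uint8_t tag;
    std::uint8_t extra;
    std::uint64_t n;
    explicit B(std::uint64_t n, std::uint8_t extra = 0) : tag{1}, extra(extra), n(n) {}
};

struct C {
    std::uint8_t tag;
    double d;
    explicit C(double d) : tag{2}, d(d) {}
};

// The tag must sit at offset 0 of both alternatives.
static_assert(std::is_standard_layout<B>::value, "B must be standard layout");
static_assert(std::is_standard_layout<C>::value, "C must be standard layout");

class V {
    static constexpr std::size_t data_size  = std::max(sizeof(B), sizeof(C));
    static constexpr std::size_t data_align = std::max(alignof(B), alignof(C));

    // Raw storage large and aligned enough for either alternative.
    alignas(data_align) unsigned char data_[data_size];
public:
    explicit V(std::uint64_t n) { ::new (static_cast<void*>(data_)) B(n); }
    explicit V(double d)        { ::new (static_cast<void*>(data_)) C(d); }

    // Copies the first byte of the stored object, which is its tag.
    std::uint8_t tag() const {
        std::uint8_t t;
        std::memcpy(&t, data_, sizeof t);
        assert(t == 1 || t == 2);
        return t;
    }
};
```

On a typical 64-bit platform with 8-byte alignment for `double` and `std::uint64_t`, both alternatives occupy 16 bytes, so `sizeof(V)` meets the 16-byte goal from the post; that figure is platform-dependent rather than guaranteed.
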
Pavel <pauldontspamtolk@removeyourself.dontspam.yahoo>: Feb 19 12:28AM -0500

Bonita Montero wrote:
>> your "nice thing" above. Readable code separates concerns and your "nice
>> thing" does the opposite.
> The code I've shown doesn't have a bad redability.

I have never said it had bad readability. I said your "nice thing" serving no purpose was an obfuscation pretending to be simple while not being so. It would be a bad idea (now I am saying "bad") to use it as an example to do anything useful.

Your original "transaction" code was rather readable, although it had an unnecessary "else" clause that made it more complex than necessary. While fixing the bug you worsened the readability from ok to rather poor (again I am not calling it "bad"). A better readable code for "transaction", after the bug fix and preserving your preferences for indentation and braces but not new lines and using magic numbers vs symbolic constants could be, for example:

    enum TsxStatusCategory: int { TSX_ABORTED_DONT_RETRY = -1, TSX_ABORTED_CAN_RETRY = 0, TSX_COMMITTED = 1 };

    template<typename L>
    inline TsxStatusCategory doTsxTransaction( L &lambda ) // function name should be a verb [clause]
    {
        unsigned code = _xbegin(); // unsinged is not unsigned BTW
        if( code == _XBEGIN_STARTED )
        {
            lambda();
            _xend();
            return TSX_COMMITTED;
        }
        // `else' served no purpose here other than obfuscation
        if (code & _XABORT_EXPLICIT)
            return TSX_ABORTED_DONT_RETRY;
        if (code & _XABORT_RETRY)
            return TSX_ABORTED_CAN_RETRY;
        return TSX_ABORTED_DONT_RETRY;
    }

> Yours has a bad readabiliy.

Yes, it does and as said earlier it is its point.

Bonita Montero <Bonita.Montero@gmail.com>: Feb 19 07:28AM +0100

> I have never said it had bad readability. I said your "nice thing" serving
> no purpose was an obfuscation pretending to be simple while not being so. ...

Whoever finds the contradiction may keep it.

> return TSX_ABORTED_CAN_RETRY;
> return TSX_ABORTED_DONT_RETRY;
> }

That's a matter of taste.

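A three-way status of the kind returned by doTsxTransaction in this thread suggests a retry loop on the calling side. The sketch below is an assumption about how such a caller might look and is not code from either poster: `TsxStatus`, `runWithRetry` and `maxAttempts` are hypothetical names, and the transaction attempt is passed in as a callable so the loop can be exercised without TSX hardware.

```cpp
#include <cassert>

// Hypothetical enum mirroring the three outcomes discussed in the thread.
enum class TsxStatus : int { AbortedDontRetry = -1, AbortedCanRetry = 0, Committed = 1 };

// Calls attempt() up to maxAttempts times. Returns true if an attempt
// committed. Otherwise runs fallback() once (for example, the same work
// under a lock) and returns false.
template <typename Attempt, typename Fallback>
bool runWithRetry(Attempt attempt, Fallback fallback, int maxAttempts)
{
    assert(maxAttempts >= 0);
    for (int i = 0; i < maxAttempts; ++i) {
        const TsxStatus status = attempt();
        if (status == TsxStatus::Committed)
            return true;
        if (status == TsxStatus::AbortedDontRetry)
            break;  // retrying cannot help, go straight to the fallback
        // AbortedCanRetry: loop and try again
    }
    fallback();
    return false;
}
```

Keeping the retry policy in one small function leaves the transaction body and the fallback path as separate callables, which fits the separation-of-concerns argument made earlier in the thread.
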