- Avoid 'int', 'long' and 'short'... - 9 Updates
- style guides on "#undef" - 4 Updates
- Gratuitous buffer flushing - 3 Updates
- concrete classes - 1 Update
- Checking if a linked list is circular with smart pointers - 2 Updates
"Öö Tiib" <ootiib@hot.ee>: Jun 28 07:10AM -0700 On Sunday, 28 June 2015 16:53:35 UTC+3, Rosario19 wrote: > so wuold be ok for | too and not etc > etc > where is the problem with endianess of the number? Rosario, we did talk about keeping data in some portable binary format. We have portable binary formats for to achieve that one computer saves it to disk or sends over internet and other computer reads or receives it and both understand it in same way. Different computers may keep the numbers in their own memory with different endianness. So when computer reads or writes the bytes of binary format then it must take care that those are ordered correctly. That is what we call taking care about endianness in portable format. |
BGB <cr88192@hotmail.com>: Jun 28 12:17PM -0500 On 6/28/2015 8:43 AM, Öö Tiib wrote: > particularly funny since neither C nor C++ contain standard way for > detecting endianness compile-time. There are some libraries that use > every known non-standard way for that and so produce minimal code. yeah. I prefer fixed endianness formats, personally. granted, probably most everyone else does as well, as most formats are this way. just a few formats exist where the endianness depends on whichever computer saved the file, with magic numbers to detect when swapping is needed. I find these annoying. some of my tools (script language and C compiler, as an extension) have the ability to specify the endianness for variables and pointer types (so, you can be sure the value is stored as either big or little endian, regardless of native endianness), and implicitly also makes it safe for misaligned loads/stores. namely, if you know a value needs to be a particular way, then chances are the person is willing to pay whatever CPU cycles are needed to make it that way. typically, for compile-time stuff, there is a mess of #define's and #ifdef's for figuring out the target architecture and other things, picking appropriate type-sizes and setting values for things like whether or not the target supports misaligned access, ... I guess it could be nicer if more of this were standardized. > then it feels reasonable at least to consider 4 bit wide entries. The > processors crunch numbers at ungodly speeds but it is 4 times shorter > table than one with 16 bit wide entries. could be, but the table entries in this case were fairly unlikely to be that much below 16 bits (so 8 bit or smaller would not have been useful). these were basically offsets within a TLV lump. where you would have one TLV lump which contains lots of payload data (as an array of variable-sized sub-lumps packed end-to-end), and a table to say where each sub-lump is within that bigger lump (to better allow random access). 
in most of the cases, the lumps were between kB or maybe up to a few MB, so 16 or 24 bits are the most likely cases, and always using 32-bits "to be safe" would be a waste. it would be fairly easy to compress them further, but this would require decoding them before they could be used, which was undesirable in this case. > case? OTOH storage for texts can be significant if there are lot of texts > or lot of translations. Number of PC software let to download and install > translations separately or optionally. yeah, probably should have been clearer. this was for string literals/values in a VM. in the predecessor VM, M-UTF-8 had been used for all the string literals (except the UTF-16 ones), which mostly worked (since direct per-character access is fairly rare), but it meant doing something like "str[idx]" would take 'O(n)' time (and looping over a string per-character would be O(n^2)...). in the use-case for the new VM, I wanted O(1) access here (mostly to make things more predictable, *), but also didn't want the nearly pure waste that is UTF-16 strings. however, the language in question uses UTF-16 as its logical model (so, from high-level code, it appears as if all strings are UTF-16). in the language, strings are immutable, so there is no issue with the use of ASCII or similar for the underlying storage. in C, it isn't really an issue mostly as C makes no attempt to gloss over in-memory storage, so you can just return the raw byte values or similar. *: the VM needs to be able to keep timing latencies bounded, which basically weighs against doing anything in the VM where the time-cost can't be easily predicted in advance. 
wherever possible, all operations need to be kept O(1), with the operation either being able to complete in the available time-step (generally 1us per "trace"), or the VM will need to halt and defer execution until later (blocking is not allowed, and any operations which may result in unexpected behaviors, such as halting, throwing an exception, ... effectively need to terminate the current trace, which makes them more expensive). for some related reasons, the VM is also using B-Trees rather than hash tables in a few places (more predictable, if slower, than hashes, but less memory waste than AVL or BST variants). likewise, because of their structure, it is possible to predict in advance (based on the size of the tree) approximately how long it will take to perform the operation. > keeping the text dictionaries Huffman encoded all time. If to keep texts > Huffman encoded anyway then UCS-2 or UTF-16 are perfectly fine and there > are no need for archaic tricks like Windows-1252 or Code Page 437. granted, but in this case, it is mostly for string literals, rather than bulk text storage. Windows-1252 covers most general use-cases for text (and is fairly easy to convert to/from UTF-16, as for most of the range the characters map 1:1). CP-437 is good mostly for things like ASCII art and text-based UIs. for literals, it will be the job of the compiler to sort out which format to use. bulk storage will tend to remain in compressed UTF-8. though a more specialized format could be good. I had good results before compressing short fragments (such as character strings) with a combination of LZ77 and MTF+Rice Coding, which for small pieces of data did significantly better than Deflate or LZMA. however, the MTF makes it slower per-character than a Huffman-based option. basically, options like Deflate or LZMA are largely ineffective for payloads much under 200-500 bytes or so, but are much more effective as payloads get bigger. |
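The offset table described earlier in this post (16-, 24- or 32-bit entries pointing at sub-lumps) could be read roughly as follows. This is only a sketch under assumptions: the post does not say how the entries are laid out, so the little-endian byte order, the packed layout and the name `readOffset` are all invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Read entry `index` from a packed table whose entries are each
// `entryBytes` wide (for example 2, 3 or 4) and are stored least
// significant byte first. Layout and name are assumptions.
inline std::uint32_t readOffset(const unsigned char* table,
                                std::size_t entryBytes,
                                std::size_t index)
{
    assert(table != nullptr);
    assert(entryBytes >= 1 && entryBytes <= 4);
    const unsigned char* p = table + index * entryBytes;
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < entryBytes; ++i)
        value |= static_cast<std::uint32_t>(p[i]) << (8 * i);
    return value;
}
```

Each lookup touches a fixed number of bytes, so access stays O(1) and needs no decoding pass, which matches the stated goal of predictable timing. Byte-wise reads also sidestep misaligned loads for the 24-bit case.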
JiiPee <no@notvalid.com>: Jun 28 08:01PM +0100 On 26/06/2015 20:39, Mr Flibble wrote: > ... #include <cstdint> instead! > /Flibble Just watching Scott Meyers videos. He seems to also always use : int a =9; not fastint32 a = 9; if int was wrong, surely they would not teach people using int, right? :) |
JiiPee <no@notvalid.com>: Jun 28 08:15PM +0100 On 28/06/2015 20:01, JiiPee wrote: > not > fastint32 a = 9; > if int was wrong, surely they would not teach people using int, right? :) also note that Scott recommends to use auto a = 9; so using auto. so letting the computer decide the type !!! What will you say about this??? :) |
woodbrian77@gmail.com: Jun 28 12:56PM -0700 On Saturday, June 27, 2015 at 6:47:38 PM UTC-5, Öö Tiib wrote: > All programs that use sounds or images technically use binary formats > but those are abstracted far under some low level API from programmers. > I did not mean that. I didn't mean that either. Brian Ebenezer Enterprises - In G-d we trust. http://webEbenezer.net |
"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Jun 28 10:13PM +0200 On 28-Jun-15 9:15 PM, JiiPee wrote: > auto a = 9; > so using auto. so letting the ccomputer to deside the type !!! > What will you say about this??? :) Using `auto` to declare a variable without an explicit type communicates well to the compiler but not to a human reader. Also it's longer to write and read than just `int`. And one can't generally adopt this as a convention, e.g. it doesn't work for a variable without initializer, so it's not forced by a convention. Therefore I consider it an abuse of the language. As to why, I guess that Scott has to use all kinds of fancy features just to grab and keep interest from the audience. And maybe so that novices can ask "what's the `auto`, huh?", so that he can explain it. Explaining things and giving advice is after all how he makes a living. Cheers & hth., - Alf -- Using Thunderbird as Usenet client, Eternal September as NNTP server. |
JiiPee <no@notvalid.com>: Jun 28 09:22PM +0100 On 28/06/2015 21:13, Alf P. Steinbach wrote: >> What will you say about this??? :) > Using `auto` to declare a variable without an explicit type > communicates well to the compiler but not to a human reader. In Visual Studio you can hover the mouse and see the real type quite easily, also works on other compilers. > Also it's longer to write and read than just `int`. And one can't > generally adopt this as a convention, e.g. it doesn't work for a > variable without initializer, so it's not forced by a convention. think about a function returning the number of elements in a container... you could wrongly put: unsigned long getSize(); if it actually returns a 64 bit integer. auto would find the right type straight away. > Therefore I consider it an abuse of the language. in many places it makes coding safer because the auto always finds the correct type. You can get bugs by putting a wrong type... and people have done that when reading forums. > just to grab and keep interest from the audience. And maybe so that > novices can ask "what's the `auto`, huh?", so that he can explain it. > Explaining things and giving advice is after all how he makes a living. with complex types it can increase safety, because auto always gets it right. We might get the type wrong which might cause bugs. But auto definitely is not always best for sure even if somebody likes it. |
JiiPee <no@notvalid.com>: Jun 28 09:23PM +0100 On 28/06/2015 21:13, Alf P. Steinbach wrote: > Using `auto` to declare a variable without an explicit type > communicates well to the compiler but not to a human reader. Also it's > longer to write and read than just `int`. but if you take an average over all types, auto would win big time. on average auto makes type names much shorter if all types are considered. |
jt@toerring.de (Jens Thoms Toerring): Jun 28 10:33PM > > communicates well to the compiler but not to a human reader. > In Visual Studio you can hover the mouse and see the real type quite > easily, also works on other compilers. Please keep in mind that VS is an IDE with an attached compiler (beside a lot of other things). So this won't work "on other compilers", since a compiler is a program to compile code and not something you can "hover over" with the mouse. You may be surprised, but not everyone is using an IDE (for various reasons) - or even a graphical user interface - all of the time (and thus a mouse or something similar)... > unsigned long getSize(); > if it actually returns a 64 bit integer. auto would find the right type > straight away. If you define a variable and assign to it the return value of a function then it's relatively clear what the type will be - it can be easily found out by looking at the function declaration. But something like auto a = 0; is a bit different: you have to look very carefully at that '0' to figure out if this will end up being an 'int' or perhaps something else. And it can be prone to getting the wrong type by forgetting (or mis-typing) some character after the '0' that makes the variable have a different type. There's definitely a readability issue. > in many places it makes coding safer because the auto always finds the > correct type. You can get bugs by putting a wrong type... and people > have done that when reading forums. Yes, but cases like int a = 0.0f; are places where this isn't the case. 'auto' is very useful in cases like for ( auto it = xyz.begin(); it != xyz.end(); ++it ) instead of maybe for ( std::pair< std::vector< std::pair< int, char const * >, double >, std::vector< std::string > >::iterator it = xyz.begin( ); it != xyz.end( ); ++it ) since you will be aware of the type of 'xyz', but auto a = 0ull; is different since it makes the type of 'a' hard to recognize at a glance. 
And you must not forget any part of the 'ull' at the end, or you'll get something you never wanted and thus don't expect. It actually creates a new class of possible bugs. Regards, Jens -- \ Jens Thoms Toerring ___ jt@toerring.de \__________________________ http://toerring.de |
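Jens's point about literal suffixes can be checked at compile time. The sketch below is illustrative only; the function name `literalSuffixDemo` and the variable names are made up. It shows that initializers differing only in their suffix make `auto` deduce different types.

```cpp
#include <type_traits>

// Each initializer differs only in its literal suffix, yet `auto`
// deduces a different type every time. (Hypothetical demo function.)
inline void literalSuffixDemo()
{
    auto a = 0;     // int
    auto b = 0u;    // unsigned int
    auto c = 0ull;  // unsigned long long
    auto d = 0.0;   // double
    auto e = 0.0f;  // float

    static_assert(std::is_same<decltype(a), int>::value, "a is int");
    static_assert(std::is_same<decltype(b), unsigned int>::value, "b is unsigned int");
    static_assert(std::is_same<decltype(c), unsigned long long>::value, "c is unsigned long long");
    static_assert(std::is_same<decltype(d), double>::value, "d is double");
    static_assert(std::is_same<decltype(e), float>::value, "e is float");

    // Silence unused-variable warnings; the checks above are the point.
    (void)a; (void)b; (void)c; (void)d; (void)e;
}
```

Dropping or mistyping a suffix silently changes the deduced type. With an explicit type on the left-hand side, the same slip would at most change the initializer's value.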
David Brown <david.brown@hesbynett.no>: Jun 28 09:53PM +0200 On 28/06/15 21:19, Stefan Ram wrote: > #undef should not normally be needed. Its use can lead > to confusion with respect to the existence or meaning of > a macro when it is used in the code Use guide C - avoid macros unless they really are the clearest and best way to solve the problem at hand. But don't use #undef except in /really/ special code, as it leads to confusion - macros should normally have exactly the same definition at all times in the program, or at least within the file. |
"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Jun 28 10:24PM +0200 On 28-Jun-15 9:19 PM, Stefan Ram wrote: > #undef should not normally be needed. Its use can lead > to confusion with respect to the existence or meaning of > a macro when it is used in the code Ordinary include guards are incompatible with guide A. Well I could agree with a preference for #pragma once instead of include guards (and just don't support any e.g. IBM compiler that doesn't support the pragma), but /requiring/ that one doesn't use include guards is IMHO to go too far. It's more work and less clear, but if someone wants to, hey. As an example that's incompatible with guide B, in Windows desktop programming one will normally, nowadays, defined UNICODE before including <windows.h>. The definition doesn't matter, just that it's defined. But if it is defined in code and there is a previous definition one will get a sillywarning with e.g. g++ or Visual C++. And a simple solution is to #undef it, like this: #undef UNICODE #define UNICODE #include <windows.h> And this is very normal code. Rules to be mechanically followed are generally not compatible with C++ programming, which requires Some Intelligence Applied™. Therefore I think that neither guide referred to and quoted above, can be of very high quality. Cheers & hth., - Alf [Sorry, yet again I inadvertently applied Google Groups experience and hit "Reply". I'm currently searching for the "Unsend" button.] -- Using Thunderbird as Usenet client, Eternal September as NNTP server. |
"Öö Tiib" <ootiib@hot.ee>: Jun 28 01:28PM -0700 On Sunday, 28 June 2015 22:19:56 UTC+3, Stefan Ram wrote: > #undef should not normally be needed. Its use can lead > to confusion with respect to the existence or meaning of > a macro when it is used in the code I use macros for things that are impossible without macros. These are mostly things for better runtime debug diagnostics or traces. Examples: I can't optionally use compiler-specific extensions without macros. I can't get current source code file name, function name, line number or compiling time without macros. I can't both stringize and evaluate part of code without macros. Otherwise I avoid macros. The ones that I use I define in general configuration header that is included everywhere and I never #undef any of those. |
"Öö Tiib" <ootiib@hot.ee>: Jun 28 01:43PM -0700 On Sunday, 28 June 2015 23:24:36 UTC+3, Alf P. Steinbach wrote: > just don't support any e.g. IBM compiler that doesn't support the > pragma), but /requiring/ that one doesn't use include guards is IMHO to > go too far. IBM XL C/C++ certainly supports pragma once. AFAIK only Oracle Solaris Studio does not support it from C++ compilers still under active maintenance. Forbidding include guards is still perhaps going too far with style guide. |
ram@zedat.fu-berlin.de (Stefan Ram): Jun 28 12:48AM >Can anybody tell why would somebody want to flush the stream with end? Usually, ::std::cout is flushed before ::std::cin is used for reading or before the program exits, so one would want to flush it, when this does not suffice. See also: »::std::ios_base::unitbuf«, »::std::cin.tie()«. |
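The two mechanisms named here can be inspected directly. The following sketch uses invented helper names (`coutIsTiedToCin`, `flushesEveryWrite`) and only standard library calls. It reports whether `std::cout` is tied to `std::cin` and whether a stream has `unitbuf` set.

```cpp
#include <cassert>   // used by the checks in the usage note below
#include <iostream>

// True when std::cout is flushed automatically before each read
// from std::cin, which is the default set-up.
inline bool coutIsTiedToCin()
{
    return std::cin.tie() == &std::cout;
}

// True when the given stream flushes after every output operation.
inline bool flushesEveryWrite(const std::ostream& os)
{
    return static_cast<bool>(os.flags() & std::ios_base::unitbuf);
}
```

With the default configuration, `assert(coutIsTiedToCin())` holds, `flushesEveryWrite(std::cerr)` is true and `flushesEveryWrite(std::cout)` is false. Writing `std::cout << std::unitbuf;` switches `std::cout` to flushing after every output operation, which is one way to get crash-safe output without sprinkling `std::endl` through the code.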
ram@zedat.fu-berlin.de (Stefan Ram): Jun 28 03:34PM Just for fun I'd like to point out that the meaning of »concrete class« might have changed in Lippman's »C++ primer«. The edition of 2005 still defines: »A concrete class is a class that exposes, rather than hides, its implementation.«. This seems to comply with Stroustrup's notion of »concrete types« (and »concrete class« in that context). But the more recent 5th edition of the »C++ primer« now seems to use »concrete class« in the other sense of »a class that is not an abstract class«, although it possibly does not give an explicit definition for this term anymore. A class that is not concrete but owns resources sometimes is called a »resource handle«. I would use this term for ::std::unique_ptr, but not for ::std::string, because in the case of the former handling the resource is the primary task, but in the case of the latter the resource is just a means to be a variable-length (mutable) string. What kind of classes are out there?

- POD class
- primitive class
- regular class
- trivially copyable type
- trivial type
- standard-layout type
- canonical class
- concrete class (a term with at least two different meanings)
- abstract class
- literal class
- resource handle class
- class with value semantics
- class with reference semantics
- constexpr class

Any other kind that comes to your mind? |
ram@zedat.fu-berlin.de (Stefan Ram): Jun 28 07:19PM guide A: Don't use macros! OK, (...) treat them as a last resort. (...) And #undef them after you've used them, if possible. guide B: #undef should not normally be needed. Its use can lead to confusion with respect to the existence or meaning of a macro when it is used in the code |
"Öö Tiib" <ootiib@hot.ee>: Jun 28 08:21AM -0700 On Sunday, 28 June 2015 05:56:09 UTC+3, Richard wrote: > printf("%s\n", s); > fflush(stdout); > ...and so-on. Indeed because it adds clutter and we are lazy. If we had no 'endl' then we would rarely write such code in C++ as well: std::cout << i << '\n' << std::flush; std::cout << s << '\n' << std::flush; > If we wouldn't flush the buffer on every line in C, or in any other > language that supported buffered I/O (C#, Java, etc.), why are we > chronically doing this in C++? We are not. We typically do not use <iostream> for massive (so it does affect performance) text I/O and on case when we do then we avoid 'operator<<' whatsoever since that thing trashes performance even more terribly than superfluous flushing. We do use the streams primarily for slow human-readable I/O and even that is primarily for debugging. Now in debugging context I have been actually annoyed that the damn 'printf' did not flush it before it crashed or broke into breakpoint. Therefore it makes sense in code that demonstrates some feature or crash to novice to use 'endl' liberally because novice may want to step it in debugger. > I submit it is simply because people are immitating what they see > around them without thinking about it. You are correct that people do lot of things without thinking too lot about it. Otherwise it is hard to get things done timely. The particular topic is example of something that is good that you brought up since I tend to use '\n' and 'std::endl' in mix but have long stopped thinking about why I do it exactly like I do. |
Rosario19 <Ros@invalid.invalid>: Jun 28 06:20PM +0200 On Sun, 28 Jun 2015 08:21:57 -0700 (PDT), Öö Tiib wrote: >avoid 'operator<<' whatsoever since that thing trashes performance >even more terribly than superfluous flushing. >We do use the streams primarily for slow human-readable I/O and I do not agree: standard input and output can be used through pipes to connect programs |
"Öö Tiib" <ootiib@hot.ee>: Jun 28 09:48AM -0700 On Sunday, 28 June 2015 19:20:43 UTC+3, Rosario19 wrote: > i'm not agree > standard input and output can be use with trhu pipes > for connect programs How that contradicts with what I wrote above? I do not understand where is difference. "Primarily" does not mean "always" but it means "for most part" and "mainly". I wrote above even separately about the cases like the one that you pointed out: "when we do then we avoid 'operator<<' whatsoever since that thing trashes performance even more terribly than superfluous flushing." |
Victor Bazarov <v.bazarov@comcast.invalid>: Jun 28 11:56AM -0400 On 6/28/2015 11:34 AM, Stefan Ram wrote: > class with reference semantics > constexpr class > Any other kind that comes to your mind? Empty class (sometimes used to denote a type that is different from any other type in your program). V -- I do not respond to top-posted replies, please don't ask |
Paul <pepstein5@gmail.com>: Jun 28 07:53AM -0700 On Thursday, June 25, 2015 at 2:28:35 PM UTC+1, Öö Tiib wrote: > so you should keep. Iterators should not manage the object > they navigate. Smart pointers (that automatically manage) > are therefore very bad iterators. This is my revised code which uses smart pointers. I also coded another direct way of testing for cycles by seeing if the pointers repeat. Does this seem ok? Thanks a lot for your feedback.

    #include <cstdio>
    #include <unordered_set>
    #include <vector>
    #include <algorithm>
    #include <iostream>
    #include <memory>

    struct Node
    {
        int data;
        std::shared_ptr<Node> next;
    };

    // A fast pointer and a slow pointer are both initiated at the head.
    // Circular if the slow pointer is ever ahead of the fast pointer.
    bool isCycle(std::shared_ptr<Node> head)
    {
        auto slowPointer = head;
        auto fastPointer = head;

        while(fastPointer && fastPointer->next && fastPointer->next->next)
        {
            slowPointer = slowPointer->next;
            fastPointer = fastPointer->next->next;
            if(fastPointer == slowPointer || fastPointer->next == slowPointer)
                return true;
        }
        return false;
    }

    // A direct algorithm to tell if a cycle is present by seeing if a pointer address repeats.
    bool isCycleDirect(std::shared_ptr<Node> head)
    {
        std::unordered_set<std::shared_ptr<Node>> nodePointers;
        while(head)
        {
            // If trying to insert something already inserted, then must contain cycles.
            if(nodePointers.find(head) != nodePointers.end())
                return true;
            nodePointers.insert(head);
            head = head->next;
        }
        return false;
    }

    // Test against the expected results.
    void testCycle(std::shared_ptr<Node> head, bool expected)
    {
        printf(isCycle(head) == expected ? "Results as expected\n" : "This test case failed\n");
    }

    // Set up tests for small numbers of nodes
    void smallTests()
    {
        std::shared_ptr<Node> emptyList;
        testCycle(emptyList, false);

        std::shared_ptr<Node> List1(new Node);
        std::shared_ptr<Node>ListCircular2(new Node);
        std::shared_ptr<Node>ListNonCircular2(new Node);
        std::shared_ptr<Node>ListCircular3(new Node);
        std::shared_ptr<Node>ListNonCircular3(new Node);

        List1->next = nullptr;
        List1->data = 1;
        testCycle(List1, false);

        ListCircular2 = List1;
        ListCircular2 -> next = ListCircular2;
        testCycle(ListCircular2, true);

        ListNonCircular2 = ListCircular2;
        ListNonCircular2->next = std::shared_ptr<Node>(new Node);
        ListNonCircular2->next->data = 2;
        ListNonCircular2->next->next = nullptr;
        testCycle(ListNonCircular2, false);

        ListNonCircular3 = ListNonCircular2;
        ListNonCircular3->next->next = std::shared_ptr<Node>(new Node);
        ListNonCircular3->next->next->data = 3;
        ListNonCircular3->next->next->next = nullptr;
        testCycle(ListNonCircular3, false);

        ListCircular3 = ListNonCircular3;
        ListCircular3->next->next->next = ListCircular3;
        testCycle(ListCircular3, true);
    }

    int main()
    {
        smallTests();
        return 0;
    }

Paul |
"Öö Tiib" <ootiib@hot.ee>: Jun 28 08:40AM -0700 On Sunday, 28 June 2015 17:53:42 UTC+3, Paul wrote: > > they navigate. Smart pointers (that automatically manage) > > are therefore very bad iterators. > This is my revised code which uses smart pointers. I also coded another direct way of testing for cycles by seeing if the pointers repeat. Does this seem ok? Thanks a lot for your feedback. You decided to use smart pointers as iterators in 'isCycle'. I already did try to explain why smart pointers are terrible iterators. If you don't make or accustom a class for iterator then raw pointer is still better than smart pointer. Your 'isCycleDirect' seems quite expensive to make 'unordered_set' of whole list. You should perhaps try and compare the two with list of million of entries. Your 'smallTests' is still broken in sense that it leaks memory. |
You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page. To unsubscribe from this group and stop receiving emails from it send an email to comp.lang.c+++unsubscribe@googlegroups.com. |