- Simulating Halt Decider (SHD) Copyright (c) 2022 Mr Flibble - 1 Update
- "C++ on the Move" by Darryl K. Taft - 13 Updates
- Why no structured bindings in catch()? - 1 Update
- SIGSEGV in stl_iterator.h - 9 Updates
- Pass by rvalue reference - 1 Update
olcott <polcott2@gmail.com>: Mar 10 10:48AM -0600 On 3/10/2023 6:46 AM, Mr Flibble wrote: > Flibble Simulating Halt Decider is the *first* SHD that solves the > halting problem. > /Flibble I came up with the idea that the Peter Linz halting problem proof is decidable as non-halting six years ago in this forum. On 3/11/2017 3:13 PM [Infinitely Recursive input on HP Proofs] Message-ID: <918df253-d4f0-4370-8f73-88e6690380a1@googlegroups.com> All of the conventional halting theorem proofs have this same issue. Instead of merely detecting the pathological relationship and rejecting this input, the otherwise "impossible" input is correctly determined to be non-halting. -- Copyright 2023 Olcott "Talent hits a target no one else can hit; Genius hits a target no one else can see." Arthur Schopenhauer |
Bonita Montero <Bonita.Montero@gmail.com>: Mar 10 01:38PM +0100 On 10.03.2023 at 09:17, Alf P. Steinbach wrote: > When the first .jpg virus was discovered in 2002 I was happy, well > a little, because it was an obvious possibility to me but my students > laughed at me when I mentioned it, middle 1990's. There might be inconsistent content inside the file which might cause intentional "misinterpretation" by the parser. But that's rather unlikely. |
Bonita Montero <Bonita.Montero@gmail.com>: Mar 10 03:20PM +0100 On 10.03.2023 at 14:10, Michael S wrote: > Today .jpg viruses are indeed not likely. Show me any recent incident ... > But PDF is inherently more vulnerable than jpeg. Only if you use something like Acrobat or Foxit that is capable of executing embedded JavaScript. With other readers that's very unlikely. > Even on the most basic level of functionality PDF is an encapsulated > PostScript, which is Turing-complete programming language. ... PDF is stripped-down PostScript without those capabilities. And PostScript can't access external resources. |
David Brown <david.brown@hesbynett.no>: Mar 10 03:54PM +0100 On 10/03/2023 13:27, Öö Tiib wrote: > internal gates than setting all bits to 0 does." > So the whole reason of such design is lack of market to trap on division > by zero there. Are you basing this on your intimate knowledge of the design decisions made by multiple cpu design teams? Everyone who has made it through primary school knows it makes no sense to divide by zero. It therefore makes no sense for a processor to aim for any particular result when someone tries to do the impossible. If a processor already has strong support for hardware exceptions as a debugging aid and for safe handling of multiple independent programs (as you get on a "big" general purpose cpu), then it is likely to have a trap of some sort on division by zero. If it is a chip optimised for low cost, size or power, for dedicated single-program microcontroller usage, then division by zero can be a "don't care". It might cause a reset, a hang, a nonsense result, a consistent and documented result - whatever. It would be a terrible idea for an efficient language to try to define what happens here, forcing compilers to generate pointless extra run-time checks just because some people have bugs in their code. >> left undefined behaviour). > C and C++ do not want to be portable assemblers of whatever chips, > these want to be programming languages. I have no idea what you are talking about here. C and C++ are not, and never have been, "portable assemblers" - they are high level programming languages defined by standards and abstract machines, not the behaviour of particular processors. That is precisely why trying to retrofit the language standards to match particular processor hardware is such a bad idea. > are no reason why something that hardware does not handle (despite > its cost to performance if handled is hard to notice) should be left > undefined behavior. Of course there is. You have got this whole thing ass-backwards. 
(For any Americans here who don't understand that phrase, it means putting the donkey behind the cart.) C and C++ leave a number of things as "implementation dependent", when it is reasonable to expect that all hardware will be able to efficiently support /some/ behaviour, but the details of what the behaviour is can vary a lot. They leave some things as "undefined behaviour" when it is reasonable to expect that not all hardware can efficiently support /any/ defined and consistent behaviour. But there are many other reasons for having "undefined behaviour". That also includes cases where there simply is no sensible concept of "correct" behaviour, or even of expected and useful behaviour. Division by zero (for integers) is an example. No matter what definition you pick for it, it will be wrong. That includes trapping, or returning particular values, or setting processor flags. If you try to divide by zero, you have a bug in your code. The bug is /before/ the division. That applies equally to languages that do run-time checking and throw exceptions for division by zero - your code is nonsensical, and therefore wrong. Hardware cannot fix the bug in your program. The compiler cannot fix the bug, nor can run-time checks. (Sometimes these can help you find the bug.) It makes no sense for a language to try to define behaviour on division by zero - it is far better for the language to say "don't do that". Why someone would then want the language to say what will happen when you do something contrary to the rules of the language, is beyond my understanding. > on job market. But the work is anyway plentiful, largely badly or > not done and I could do something more interesting instead of > helping to track down some undefined behavior. What a load of drivel. No one wants /pointless/ undefined behaviour - that is a tautology. I have already explained why undefined behaviour in cases like division by zero and signed integer overflow is directly /useful/ and /beneficial/. 
If you don't accept my arguments there, fair enough. But at least accept that some programmers view things that way. No programmer has to memorise undefined behaviour. /Everything/ is undefined except the behaviours that are explicitly defined by the language (and/or documented extensions in a particular tool). You have to memorise the /defined/ behaviours in C and C++ - this is called "learning the language". If you don't know what the code you are writing actually means, you are not doing your job correctly. (Of course we all make mistakes sometimes - part of the job is finding and fixing these.) And the examples discussed here, such as division by zero, are so obviously incorrect code that you should not have to learn the rules specifically. (There are certainly things in the C and C++ standards that are explicitly labelled as "undefined behaviour" that could be changed to required diagnostics for the compiler or linker, or perhaps to implementation-dependent behaviour or even fully defined behaviour. But division by 0 and signed integer overflow are not amongst them.) |
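The argument above, that the bug sits /before/ the division, can be sketched in code. This is only an illustration, not anything posted in the thread: the helper name `checked_div` is hypothetical, and returning std::optional (C++17) is just one of several ways a caller could be made to handle the invalid case explicitly.

```cpp
#include <cassert>
#include <limits>
#include <optional>

// Hypothetical helper: refuses to perform the two integer divisions whose
// behaviour the language leaves undefined, so the caller has to decide what
// a missing result means for the program.
std::optional<int> checked_div(int dividend, int divisor) {
    if (divisor == 0) {
        return std::nullopt;  // integer division by zero is undefined
    }
    if (dividend == std::numeric_limits<int>::min() && divisor == -1) {
        return std::nullopt;  // quotient is not representable in int
    }
    return dividend / divisor;  // well defined: truncates towards zero
}
```

A caller that receives an empty optional still has a logic error to track down; the check only moves the failure to a place where it can be reported.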
David Brown <david.brown@hesbynett.no>: Mar 10 04:13PM +0100 On 10/03/2023 14:10, Michael S wrote: >> rather unlikely. > Today .jpg viruses are indeed not likely. But PDF is inherently more vulnerable > than jpeg. Yes. JPEG format does not include executable code of any kind, so its use as a malware format relies on bugs in JPEG handling software that incorrectly execute data in the file as code. This is how the infamous jpeg viruses worked on Windows (relying on bugs in Internet Exploder, IIRC). Many other file formats /do/ support executable code. Some of Microsoft's font file formats, for example, can happily include any kind of executable code. > which is Turing-complete programming language. Above that PDF often contain > other embedded formats that are expected to be displayed by the viewer. > And above all that many PDF viewers support JS scripts in the documents. PDF files support only a very small fraction of Postscript, and are not (AFAIUI) suitable for malware in themselves. At worst, you might have a PDF that contains infinite loops or other denial-of-service attacks. Postscript files are much more powerful in terms of their programming language. But as you say, the key vector for PDF malware is embedded Javascript. PDF files can also contain built-in attachments - if an attached executable file can be run from Javascript via an insecure reader, or if the user can be persuaded to start it themselves, then there are few limits to how bad it can be. |
kalevi@kolttonen.fi (Kalevi Kolttonen): Mar 10 03:14PM > No, most C or C++ programmers would /not/ say that. Most have no idea > what an "accumulator" is. I am quite a lousy C programmer who knows only the basics of C++, but I have also programmed a little bit in assembly. I certainly do know how a simple CPU works, and I'd expect this to be very common knowledge among C programmers. It is no rocket science unless you intend to get intimately familiar with all the features of a modern, complex CPU. The "register" keyword is probably the closest that the C language comes to a CPU and the compilers are free to ignore "register", but if you learn the C language, you should also learn the basics of how a CPU and memory access work. It is also good to know something about how a C compiler works, so that you understand how simple C programs are translated into assembly. When I went to my University over 20 years ago, all these things were required knowledge very early in the education process. br, KK |
David Brown <david.brown@hesbynett.no>: Mar 10 04:38PM +0100 On 10/03/2023 16:01, Scott Lurndal wrote: >> are a few idiomatic cases, like "unsigned int x = -1;".) > For that: > unsigned int x = ~0u; Yes, but the advantage of the "-1" version is that it scales automatically to the size of the variable - you don't need to write "~0ull" or whatever. |
David Brown <david.brown@hesbynett.no>: Mar 10 04:45PM +0100 On 10/03/2023 16:14, Kalevi Kolttonen wrote: > a simple CPU works, and I'd expect this to be very common knowledge > among C programmers. It is no rocket science unless you intend to > get intimately familiar with all the features of a modern, complex CPU. Most programmers know nothing about assembly. And even many of those that do, would not be particularly familiar with an "accumulator" - or would use the term incorrectly. Most modern processors are not accumulator based. (x86 has a heritage that stretches back to an accumulator-based model, but has long ceased to follow that style.) (I think it is good for C and C++ programmers to have some familiarity with assembly on their main target processors.) > to a CPU and the compilers are free to ignore "register", but if you > learn the C language, you should also learn the basics of how a > CPU and memory access works. Yes (although compilers can't quite ignore "register" - they are obliged to complain if you try to take the address of a "register" variable or parameter). > that you understand how simple C programs are translated into > assembly. When I went to my University over 20 years ago, all these > things were required knowledge very early in the education process. I agree. But I don't think many programmers /are/ familiar with the workings of processors. I have no statistics, but I don't think more than a fraction of C and C++ programmers learned programming from comprehensive university-level education - short courses and self-learning are, I think, more common. |
David Brown <david.brown@hesbynett.no>: Mar 10 04:48PM +0100 On 10/03/2023 16:23, Malcolm McLean wrote: > That might not go in. I'm showing what the run time check would be, on > a simple imaginary but realistic basic processor. That's more effective than > talking in vague generalities. Perhaps that is true. |
kalevi@kolttonen.fi (Kalevi Kolttonen): Mar 10 04:02PM > than a fraction of C and C++ programmers learned programming from > comprehensive university-level education - short courses and > self-learning are, I think, more common. A long time ago I heard of a Java programmer who found C to be totally incomprehensible. Java was the only language he knew, and as Java is full of mandatory classes/objects everywhere, his understanding of a computer's inner workings was based very heavily on the notion of classes/objects. For instance, he could not see how C code could even exist "outside" any classes. I am convinced he had no understanding of how von Neumann machine based CPUs work either. He could have been a brilliant Java programmer, though. br, KK |
David Brown <david.brown@hesbynett.no>: Mar 10 04:53PM +0100 On 10/03/2023 16:12, Malcolm McLean wrote: > with poor diagnostic facilities. Then you can put a division by zero in the code, > and if the error message triggers, you know that execution has reached that point. > Sometimes you need to resort to these stratagems. I agree that trapping can be useful as a debugging aid - though it is important to remember that the bug is /not/ at the point when the trap comes, but some unknown time prior to that. But it would be wrong for the language to /require/ such behaviour. You can be looking for different features at different stages of development, or for different kinds of program and different kinds of target. Leaving the behaviour undefined in the standards gives users and toolchain developers the freedom to provide options and better features, while defining the behaviour limits things and leaves some platforms obliged to generate inefficient code unnecessarily, while simultaneously stopping more powerful tools from providing more aid to the developer. > If division by zero raises a non-signalling NaN, you might want to use that to generate > a NaN, which sometimes can be useful. If you are talking about floating point here, the mathematical and programming model is significantly different from integer arithmetic. |
scott@slp53.sl.home (Scott Lurndal): Mar 10 04:36PM >Yes, but the advantage of the "-1" version is that it scales >automatically to the size of the variable - you don't need to write >"~0ull" or whatever. You can always use ~0ull and it will be truncated as needed. But I'd argue the programmer should be aware of that caveat when writing the code. |
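The three spellings of "all bits set" discussed in this sub-thread can be compared side by side. The function names below are hypothetical and exist only for illustration; the conversions themselves follow the language rules for unsigned types (conversion is modulo 2^N).

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// -1 is converted modulo 2^N to the destination type, so it yields all ones
// whatever the width of T.
template <typename T>
constexpr T all_ones_from_minus_one() {
    return static_cast<T>(-1);
}

// ~0u is evaluated as unsigned int first and only then converted. If T is
// wider than unsigned int, the upper bits of the result stay zero.
template <typename T>
constexpr T all_ones_from_not_0u() {
    return static_cast<T>(~0u);
}

// ~0ull is evaluated as unsigned long long and truncated on conversion to a
// narrower unsigned type, which still leaves every bit of T set.
template <typename T>
constexpr T all_ones_from_not_0ull() {
    return static_cast<T>(~0ull);
}
```

On a platform with 32-bit unsigned int, the `~0u` form silently gives 0x00000000FFFFFFFF for a 64-bit destination, which is the caveat both posters are alluding to.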
scott@slp53.sl.home (Scott Lurndal): Mar 10 04:27PM >usage, then division by zero can be a "don't care". It might cause a >reset, a hang, a nonsense result, a consistent and documented result - >whatever. Here's what one of the Burroughs mainframes did with DIV: ==== Function ==== The Divide instruction divides the contents of one memory location **B** by the contents of a second memory location **A** storing the remainder in the **B** data field and storing the quotient in a third memory location **C**. The length of the __dividend__ field must be greater than the length of the __divisor__ field (**BF** greater than **AF**). The length of the __quotient__ field is the difference in length of the **A** and **B** fields (**BF** - **AF**). If the result is too large to fit into the __quotient__ field or if **BF** is not greater than **AF**, the division is not performed, the contents of **B** and **C** are unchanged, the [[processor_state:comparison_flags|Comparison Flags]] are unchanged and the [[processor_state:overflow_flag|Overflow Flag]] is set. If the absolute value of the __divisor__ is not greater than the absolute value of the equivalent number of leading digits of the __dividend__, the division is not performed and the [[processor_state:overflow_flag|Overflow Flag]] is set with the [[processor_state:comparison_flags|Comparison Flags]] remaining unchanged. Note that a __divisor__ which is zero will fail this test and the [[processor_state:overflow_flag|Overflow Flag]] will be set. Store the absolute value of the quotient when the __quotient__ field data type is unsigned (**UN** or **UA**). Store the standard **EBCDIC** form of the result sign as the first digit of the result when the __quotient__ field data type is **SN**. Fill the zone digit with the **EBCDIC** numeric subset code (**F**) when the __quotient__ field data type is alphanumeric (**UA**). 
Store the absolute value of the remainder when the __remainder__ field data type is unsigned (**UN** or **UA**). Fill the zone digit with the **EBCDIC** numeric subset code (**F**) when the __remainder__ field data type is alphanumeric (**UA**). When the __remainder__ field data type is signed numeric (**SN**), then the absolute value of the remainder is stored after the __remainder__ sign digit, leaving the __remainder__ sign digit with the original contents of the __dividend__ sign digit. Only the numeric digits of an alphanumeric field enter into the operation. Unsigned (**UN** or **UA**) operands are assumed to be positive. The sign of a __quotient__ is positive if the sign of the __divisor__ and the __dividend__ are the same or the __quotient__ is zero, otherwise the sign is negative. If the __dividend__ data type is **SN**, the sign of the __dividend__ will be left unchanged in memory and will thus become the sign of the __remainder__. Therefore this final __remainder__ sign could be other than *C* or *D* and a __remainder__ of zero magnitude could have a negative sign. If the operand data contains undigits other than in the sign digit, cause an //Invalid Arithmetic Data// fault. See [[compatibility_notes:a.16|Compatibility Notes A.16]]. ==== Comparison Flags ==== In all cases except overflow, set the [[processor_state:comparison_flags|Comparison Flags]] to indicate whether the sum is greater than (**HIGH**), equal to (**EQUAL**) or less than (**LOW**) zero. |
kalevi@kolttonen.fi (Kalevi Kolttonen): Mar 10 04:48PM > Typical case is that programmer did invoke undefined behavior but > program appears to work like he wanted ... until it does not. Who > benefits from that? Apologies for mentioning UB in C in a C++ newsgroup, but everybody needs to realize that it is very important to avoid UB at all costs. The following is a true horror story and much worse things than this could happen. When I was working as a system administrator, we used Cyrus IMAPD as email storage. I guess the Cyrus version was 2.4.17 and if I remember right, at the time it was running on Red Hat Enterprise Linux 7. This Cyrus version had worked flawlessly for years, but unfortunately the C code had UB. In the code handling mailboxes database, I think I remember there was an incorrect invocation of strcpy, a well-known C standard library function. The strcpy manual page clearly states: The strings may not overlap, and the destination string dest must be large enough to receive the copy The strings did overlap, but nobody ever noticed it, since glibc implementation of strcpy guarded against this mistake. As we know, with UB, anything could happen, including producing the correct behavior as intended by the programmer. Then one day glibc maintainers decided to optimize their AMD64 implementation of strcpy, and the guards for detecting overlapping strings were removed. Next thing we knew was that the mailboxes database was getting more and more corrupted, ending up in such a bad state that Cyrus IMAPD would no longer start. It was not a fun task to restore the mailboxes database into working order. Lesson: Never rely on UB! br, KK |
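For readers who have not met this particular trap: the sketch below is NOT the Cyrus IMAPD code, which is not shown in the post. It is a hypothetical helper, `drop_prefix`, that shifts a string within its own buffer, the kind of operation where source and destination overlap. std::memmove is specified to handle overlapping ranges, whereas std::strcpy on the same arguments would be undefined behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: removes the first `count` characters of a
// NUL-terminated buffer in place. Source and destination overlap, so
// std::strcpy(buf, buf + count) would be undefined here.
void drop_prefix(char* buf, std::size_t count) {
    const std::size_t len = std::strlen(buf);
    if (count > len) {
        count = len;  // never read past the terminator
    }
    // len - count characters remain; the +1 copies the terminating NUL too.
    std::memmove(buf, buf + count, len - count + 1);
}
```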
Muttley@dastardlyhq.com: Mar 10 04:14PM On Fri, 10 Mar 2023 05:04:51 -0800 (PST) >Swift throws objects that have Error interface. Error and NSError are >exchangeable and interoperate but neither has some kind of generic >catches that can structurally bind to whatever is thrown. Thanks for the heads up Mr Apple. Usually reflection means the runtime can check types on the fly and do the exact thing I mentioned but I've never used Obj-C so it was a guess. |
Jivanmukta <jivanmukta@poczta.onet.pl>: Mar 10 11:49AM +0100

enum index_what : unsigned short { i_actions, i_methods, i_functions, i_properties, i_variables }; // don't change the order
constexpr unsigned short index_what_num = i_variables - i_actions + 1; |
Jivanmukta <jivanmukta@poczta.onet.pl>: Mar 10 11:26AM +0100 On 9.03.2023 at 19:03, Jivanmukta wrote: > Question: how to TRACE values: identifiers[what].end(), ids.begin(), > ids.end()? I failed to cast them to unsigned long. Could you answer me please? I suspect there's something wrong with ids. |
Jivanmukta <jivanmukta@poczta.onet.pl>: Mar 10 11:41AM +0100
On 10.03.2023 at 02:33, Andrey Tarasevich wrote:
>> ids.begin(), ids.end());
> The obvious and the most likely candidate is the value of `what`. What
> is the value of `what`? Does it go out of `identifiers` range?

TRACE("ids.size: " << ids.size());
if (ids.size() > 0) { // I don't know whether this makes sense?
    int what = stoi(attr_what.value());
    TRACE("what == " << what);
    TRACE("size == " << identifiers[what].size());
    TRACE("before insert");
    identifiers[what].insert(identifiers[what].end(), ids.begin(), ids.end());
    TRACE("after insert"); // !!! bug: sometimes ids is empty, and then SIGSEGV
}

what == 1 is TRACEd. Then SIGSEGV. size is not TRACEd. What does it mean?

I have defined:

identifiers_vector identifiers[index_what_num]

typedef std::vector<std::string> strvector;

class identifiers_vector : public strvector {
public:
    identifiers_vector() = default;
    identifiers_vector(const strvector &v) {
        for (std::string s : v) {
            push_back_identifier(s);
        }
    }
    int index_of_identifier(const std::string &needle) const {
        return index_of_string(*this, needle);
    }
    bool has_identifier(const std::string &needle) const {
        return index_of_identifier(needle) >= 0;
    }
    void push_back_identifier(const std::string &s) {
        if (!has_identifier(s)) {
            push_back(s);
        }
    }
    void sort_by_descending_length() {
        std::sort(begin(), end(), [](std::string a, std::string b) { return a.length() > b.length(); });
    }
}; |
Paavo Helde <eesnimi@osa.pri.ee>: Mar 10 01:15PM +0200 On 10.03.2023 at 12:26, Jivanmukta wrote: >> Question: how to TRACE values: identifiers[what].end(), ids.begin(), >> ids.end()? I failed to cast them to unsigned long. > Could you answer me please? I suspect there's something wrong with ids. Looks like so. Most probably you have corrupted your data by using code which has UB. As you do not want to show your code, nobody can guess where the bug is. My crystal ball says it is on line 42, but then again I have not oiled it for a while. |
Paavo Helde <eesnimi@osa.pri.ee>: Mar 10 01:23PM +0200 On 10.03.2023 at 12:41, Jivanmukta wrote: > is empty, and then SIGSEGV > } > what == 1 is TRACEd. Then SIGSEGV. size is not TRACEd. What does it mean? Most probably it means that the size of 'identifiers' is less than 1. Why don't you single-step through your code in the debugger and monitor the data values directly? From a quick look at your code it looks like you are trying to reinvent std::set or std::map, poorly. Maybe you should start from some book covering the C++ standard library? |
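The remark about reinventing std::set could look roughly like this in practice. This is a sketch under assumptions: the function name `unique_by_descending_length` is hypothetical, and it assumes the goal of the original class is only "no duplicates, longest identifiers first", which the posted code suggests but does not state.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical replacement for the push_back_identifier() +
// sort_by_descending_length() pair: std::set removes duplicates without a
// linear search per insertion, then the unique names are ordered by length.
std::vector<std::string> unique_by_descending_length(const std::vector<std::string>& ids) {
    const std::set<std::string> unique(ids.begin(), ids.end());
    std::vector<std::string> result(unique.begin(), unique.end());
    // stable_sort keeps the alphabetical order of equally long names.
    std::stable_sort(result.begin(), result.end(),
                     [](const std::string& a, const std::string& b) {
                         return a.length() > b.length();
                     });
    return result;
}
```

No inheritance from std::vector is needed for this, which also sidesteps the style concern raised later in the thread.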
Jivanmukta <jivanmukta@poczta.onet.pl>: Mar 10 12:42PM +0100
On 10.03.2023 at 12:23, Paavo Helde wrote:
>> }
>> what == 1 is TRACEd. Then SIGSEGV. size is not TRACEd. What does it mean?
> Most probably it means that the size of 'identifiers' is less than 1.

How is that possible? Here is the declaration of the identifiers parameter:

static bool load_cache(string cache_filename, const char *parent_node_name, const char *nodes_name, wstring dir, wstring result_dir, string options, string delim,
        bool &cached, identifiers_vector identifiers[index_what_num], apostrophed_strings_maps_map *strings) {

> Why don't you single-step through your code in the debugger and monitor
> the data values directly?

Because I can't - wherever I set a breakpoint I get a SIGSEGV. |
Jivanmukta <jivanmukta@poczta.onet.pl>: Mar 10 02:39PM +0100
On 10.03.2023 at 12:42, Jivanmukta wrote:
>> Why don't you single-step through your code in the debugger and
>> monitor the data values directly?
> because i can't - wherever i set breakpoint i have sigsegv

There's something wrong with my identifiers array. Here is how I allocate it:

for (auto dir = vendor_frameworks_dirs.begin(); dir != vendor_frameworks_dirs.end(); ++dir) {
    *dir = normalize_path(*dir, delim);
    TRACE("allocating identifiers vectors");
    identifiers_vector *ptr = new identifiers_vector[index_what_num];
    framework_identifiers.insert(make_pair(*dir, ptr));
    TRACE("identifiers vectors allocated and inserted");
}

Could *ptr be uninitialized???!!!

I have defined:

identifiers_vector identifiers[index_what_num]

typedef std::vector<std::string> strvector;

class identifiers_vector : public strvector {
public:
    identifiers_vector() = default;
    identifiers_vector(const strvector &v) {
        for (std::string s : v) {
            push_back_identifier(s);
        }
    }
    int index_of_identifier(const std::string &needle) const {
        return index_of_string(*this, needle);
    }
    bool has_identifier(const std::string &needle) const {
        return index_of_identifier(needle) >= 0;
    }
    void push_back_identifier(const std::string &s) {
        if (!has_identifier(s)) {
            push_back(s);
        }
    }
    void sort_by_descending_length() {
        std::sort(begin(), end(), [](std::string a, std::string b) { return a.length() > b.length(); });
    }
}; |
Paavo Helde <eesnimi@osa.pri.ee>: Mar 10 04:54PM +0200
On 10.03.2023 at 13:42, Jivanmukta wrote:
> result_dir, string options, string delim,
> bool &cached, identifiers_vector
> identifiers[index_what_num], apostrophed_strings_maps_map *strings) {

Here you are using C-style arrays or pointers, that's probably the root of the problems. This parameter declaration

    identifiers_vector identifiers[index_what_num]

is equivalent to

    identifiers_vector* identifiers

by the ancient C rules. Throw it out and use either

    identifiers_vector& identifiers

or

    std::vector<identifiers_vector>& identifiers

depending on whether you want one or many identifiers_vector vectors, I am not able to tell which is what you want. Now you have ordered many and are probably passing one or zero of them.

Also, remove any mention of 'new' and raw pointers from your code, there is no need to complicate one's life without any reason. The std::vector is perfectly capable of allocating any needed memory for its data internally.

Also note that deriving from std::vector is usually considered bad style as it is not meant for deriving, but that's another topic. |
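The parameter adjustment described in this reply can be checked at compile time, and a fixed-size alternative is easy to sketch. Everything below is illustrative: `takes_c_array`, `identifier_table` and `append_ids` are hypothetical names, std::array is one possible replacement next to the std::vector suggested above, and plain std::vector<std::string> stands in for the poster's identifiers_vector class. The enum is copied from the earlier post; C++17 is assumed.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <vector>

enum index_what : unsigned short { i_actions, i_methods, i_functions, i_properties, i_variables };
constexpr std::size_t index_what_num = i_variables - i_actions + 1;

using strvector = std::vector<std::string>;

// An array-typed parameter is adjusted to a pointer; the bound is ignored,
// so the callee cannot know how many elements were actually passed.
void takes_c_array(strvector identifiers[index_what_num]);
static_assert(std::is_same_v<decltype(takes_c_array), void(strvector*)>);

// std::array keeps the element count in the type, so a wrong-sized argument
// is rejected at compile time.
using identifier_table = std::array<strvector, index_what_num>;

// Hypothetical helper: appends `ids` to the bucket selected by `what` and
// reports an out-of-range index instead of indexing blindly.
bool append_ids(identifier_table& table, int what, const strvector& ids) {
    if (what < 0 || static_cast<std::size_t>(what) >= table.size()) {
        return false;
    }
    strvector& bucket = table[static_cast<std::size_t>(what)];
    bucket.insert(bucket.end(), ids.begin(), ids.end());
    return true;
}
```

The range check matters here because `what` comes from parsed input (stoi of an attribute value in the posted code), so it cannot be trusted to be a valid index.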
Jivanmukta <jivanmukta@poczta.onet.pl>: Mar 10 04:54PM +0100
On 10.03.2023 at 12:42, Jivanmukta wrote:
>> Why don't you single-step through your code in the debugger and
>> monitor the data values directly?
> because i can't - wherever i set breakpoint i have sigsegv

When this code is executed:

TRACE("what == " << what);
(void)identifiers[what];
TRACE("after identifiers[what]");
TRACE("size == " << identifiers[what].size());

what == 1 and "after identifiers[what]" are TRACEd. Then SIGSEGV. Why does .size() cause SIGSEGV? Does it mean there's something wrong with the identifiers[1] object? I can't debug, I must TRACE. |
Andrey Tarasevich <andreytarasevich@hotmail.com>: Mar 10 03:48PM -0800 On 03/10/23 2:19 PM, Pawel Por wrote: >> argument and chooses move constructor for this initialization. > I think this is what I do here, "unname" a reference via std::move. Am I correct ? > ml.push_front(std::move(dog)); Yeah, except that `dog` here is not a reference but rather a full-blown object. But the effect is pretty much the same: that `std::move` "unnames" an lvalue, turns it into an xvalue. > That's why I expect the move constructor to be called while passing an argument to MyList<T>::push_front(T&&). No, your expectations are completely unjustified. The constructor will not be called here. Again: passing by reference never calls any constructors, unless there's a conversion involved. And in your example there's no conversion. -- Best regards, Andrey |
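The distinction drawn in this reply (binding to an rvalue reference versus constructing a new object from it) can be observed with a counter. The `Dog` type below is a hypothetical stand-in, not the poster's class, and the functions `inspect` and `take` are illustrative only; C++17 is assumed for the inline static member and for guaranteed copy elision in `take`.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that counts how often its move constructor runs.
struct Dog {
    static inline int moves = 0;
    Dog() = default;
    Dog(const Dog&) = default;
    Dog(Dog&&) noexcept { ++moves; }
};

// Binding an argument to Dog&& creates no object, so no constructor runs.
void inspect(Dog&& d) { (void)d; }

// The move constructor runs here, and only here, because a new Dog is
// initialised from the parameter.
Dog take(Dog&& d) { return Dog(std::move(d)); }
```

std::move itself is only a cast to an rvalue reference; the counter changes when something like `take` actually initialises a new object, which is what a list's push_front(T&&) would do internally when it builds its node.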