- Improving runtime by preventing a hashtable from growing - 5 Updates
- [QT creator on "nix"] - "outer" padding within arrays of structures - was getting a strict 8 bit (1 byte) array with no padding - 2 Updates
- [QT creator on "nix"] - getting a strict 8 bit (1 byte) array with no padding - 9 Updates
- Almost Always Auto? (was Re: Substitution failure can be an error) - 2 Updates
- Where are the exception-objects allocated? - 1 Update
| Jorgen Grahn <grahn+nntp@snipabacken.se>: Nov 23 09:38AM On Fri, 2019-11-22, Paul wrote: > Given a sequence in the form of a vector<int> and a non-negative integer r, > find the length of the longest subsequence which has range <= r. > By "range", I mean the stats definition of max - min. Surely that problem must be possible to express in a more understandable way. What does "the stats definition of max - min" mean? What does "stats" mean, for that matter? Since max(subseq) <= max(seq) and min(subseq) >= min(seq), it seems to me the answer is always "the full sequence" or "there isn't one". /Jorgen -- // Jorgen Grahn <grahn@ Oo o. . . \X/ snipabacken.se> O o . |
| Paul <pepstein5@gmail.com>: Nov 23 04:40AM -0800 On Saturday, November 23, 2019 at 9:38:42 AM UTC, Jorgen Grahn wrote: > mean? What does "stats" mean, for that matter? > Since max(subseq) <= max(seq) and min(subseq) >= min(seq), it seems to > me the answer is always "the full sequence" or "there isn't one". Thanks for your interest. The answer depends on both the vector and r. The vector can be considered as a sequence with vec[0], vec[1] being the first two terms for example. As a sequence it has lots of subsequences. Consider all the subsequences S which have the property that max(S) - min(S) <= r. The problem is to find the maximum length of S for all subsequences S which have that property. Another way to say it is that we want the maximum length of all subsequences with range <= r. However, that might be confusing because there's a set-theoretic definition of "range" and a statistical definition of "range" and these definitions are very different. I mean the statistical definition of "range". Paul Epstein |
| Paul <pepstein5@gmail.com>: Nov 23 04:48AM -0800 On Saturday, November 23, 2019 at 12:41:08 PM UTC, Paul wrote: > However, that might be confusing because there's a set-theoretic definition > of "range" and a statistical definition of "range" and these definitions > are very different. I mean the statistical definition of "range". As an example, remember that my function is called hashLongest hashLongest({2,1,3,1}, 1) = 3. [2, 1 ,1] is a subsequence which has range 1. That subsequence has length 3. If any subsequence has range <=1 then its length will always be <= 3. That is why 3 is the answer for this example. Paul |
| "Öö Tiib" <ootiib@hot.ee>: Nov 23 02:20PM -0800 On Friday, 22 November 2019 19:31:22 UTC+2, Paul wrote: > The code is below. > I solved it by hashing as this is much faster than sorting for long vectors. > (I tested for vectors of size >= 10 million). Perhaps your vectors were biased to have only narrow range of integers in those and searched for ranges were very low (like 1 or 2) otherwise your solution is clearly sub-optimal solution of the given problem. The std::sort of vector of 10 millions of really random integers tends to be about twice faster than counting equal elements of it into hash table. Also searching for longest subsequence in sorted vector is O(length) linear while your search from hashtable is O(length*range) rectangular. |
| Paul <pepstein5@gmail.com>: Nov 23 03:02PM -0800 On Saturday, November 23, 2019 at 10:20:57 PM UTC, Öö Tiib wrote: > hash table. Also searching for longest subsequence in sorted vector is > O(length) linear while your search from hashtable is O(length*range) > rectangular. Thanks for your interest. My experimental results don't agree with your post. ***However, there is definitely the possibility that my testing is flawed.*** Accordingly, I will post my entire code, including testing. I will report the results for r = 100 although many other values of r give similar results. The vector has a size of 100 million. It is generated by 100 million calls to std::rand(). The sorting algorithm code and hashtable code will be posted below. The two algorithms were tested against the same vector. The longest subsequence with a range of <= 100 had length 310012 The hashing algorithm took 2.4 seconds and the sorting algorithm took 8.3 seconds. The IDE was a free version of Visual Studio // ************************ ALGOS AND TEST ****************************** // Given a sequence expressed as a vector, and given a non-negative integer r: // Find the maximum length of all subsequences which have a range <= r. // range is defined to be the maximum - minimum as in statistics. // This solution should have O(rN) // Originally, this was in a coding test with r = 1. #include <iostream> #include <vector> #include <unordered_map> #include <algorithm> #include <chrono> #include <cstdlib> #include <utility> #include <climits> int hashLongest(const std::vector<int>& vec, int r) { // For each int in the vector, count how many times it occurs. std::unordered_map<int, int> allCounts; for (int v : vec) ++allCounts[v]; // For each int k in the vector, count how many values of x occur // such that x >= k and x <= k + r std::unordered_map <int, int> leftCounts; // The maximum value of leftCounts is the result int Max = 0; for (auto a : allCounts) { for (int i = 0; i <= r; ++i) if ((long long)a.first + i > INT_MAX) break; else // if (allCounts.find(a.first + i) != allCounts.end()) leftCounts[a.first] += allCounts[a.first + i]; Max = std::max(Max, leftCounts[a.first]); } return Max; } // Previous sorting approach to solve the same problem. // length is continually incremented. int sortLongest(std::vector<int>& vec, int r) { std::sort(vec.begin(), vec.end()); int length = 0; for (int behind = 0; behind < vec.size(); ++behind) { int ahead = behind + length; // Maximal length uncovered so far. // After sorting, see if lower point of the maximal length subinterval is within r of the uppermost point // If so we can extend the length and move to the next point for testing. while (ahead < vec.size() && (long long)vec[behind] + r >= vec[ahead]) ++ahead, ++length; } return length; } // Randomly generate a testing vector // Default to a length of ten million. std::vector<int> generate(int length = 100'000'000) { std::vector<int> vec(length); for (int& i : vec) i = std::rand(); return vec; } // A given vector is tested using either the sorting method // or the hashing method. // A bool is true for the hashing method and false otherwise. // Returned is a pair whose first value is the maximum length // according to the problem constraints, and whose second value // is the number of microseconds taken. std::pair<int, int> longest(std::vector<int>& vec, int r, bool hash) { auto start = std::chrono::high_resolution_clock::now(); int maxLength = hash ? hashLongest(vec, r) : sortLongest(vec, r); auto stop = std::chrono::high_resolution_clock::now(); int time = std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count(); return { maxLength, time }; } // A display function to display the information from a run void display(bool hash, std::pair<int, int> info, int length = 10'000'000, int r = 1) { std::cout << "Experiment done on a randomly generated vector of length " << length; std::cout << "\nAn attempt was made to find the maximum length of a subsequence of range " << r; std::cout << "\nThe method used was " << (hash ? "hashing" : "sorting"); std::cout << "\nThe result is " << info.first << " and the time taken in microseconds is " << info.second << std::endl; } // Test both methods. int main() { std::vector<int> vec = generate(); std::pair<int, int> hashResult = longest(vec, 100, true); display(true, hashResult, vec.size(), 100); std::pair<int, int> sortResult = longest(vec, 100, false); display(false, sortResult, vec.size(), 100); } |
| Soviet_Mario <SovietMario@CCCP.MIR>: Nov 23 01:17PM +0100 I have realized I don't know an aspect of padding vs packing management. In the other thread I have been explained of the aspects of padding/packing INSIDE a given structure/class/union and similar. But now I realize I know nothing for sure about packing in between different element within arrays of such structure/class/union and similar. for a start I haven't found a documentation about the #pragma pack directive applied to a whole array is it possible (and how) to suggest the compiler to pack items ? (*) Are there differences in the outcome of such hints depending on actual size of the packed structures ? I am figuring out a packed struct of ODD total size, when allocated in an array, it could lead to either non packed array or to addresses in ODD/EVEN pattern (is it possible ? Non asking if efficient, just possible). secondly, how much is it, reasonably, the no-padding threshold in an array of class/struct/union objects ? There is some criterion, preferably less restrictive than "powers of two" size of objects" ? I feel quite sure that 64 bit multiple are packed without any suggestion. But, for example, 32 bit ? 16 bit ? The hardware per se has no problem : in fact it can manage COMPACT arrays of native types (int, short int) without memory waste. I would hope that the same could hold for structured types of the same size, but not so sure. What is the truth ? Could packing (assuming it is possible to force the compiler to do it) lead to addressing problems (of the contained data member) ? I am not speaking about SLOWING problems, but strictly crashes or silent wrong addressing or ... dunno ! TY -- 1) Resistere, resistere, resistere. 2) Se tutti pagano le tasse, le tasse le pagano tutti Soviet_Mario - (aka Gatto_Vizzato) |
| Keith Thompson <kst-u@mib.org>: Nov 23 12:35PM -0800 > But now I realize I know nothing for sure about packing in > between different element within arrays of such > structure/class/union and similar. Packing for arrays is not a thing. Every object type has a constant size, which you can query using sizeof. Each element of an array is allocated the size of the element's type. For example, given: some_type arr[N]; it's guaranteed that sizeof (arr) == N * sizeof (some_type). (We can use that to determine the number of elements in an array: sizeof arr / sizeof arr[0]). If a non-standard #pragma pack has been applied to some_type itself, that might change the value of sizeof (some_type), but it does not change the relationship between sizeof (some_type) and sizeof (some_type[N]). The size of any type is a multiple of its alignment, which is what makes arrays possible. If some_type is struct some_type { long big; char small; }; then there is likely to be padding after small, precisely so that in an array of some_type elements each big sub-element is correctly aligned. The padding is within the structure. There is no padding between array elements. [...] -- Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst> Working, but not speaking, for Philips Healthcare void Void(void) { Void(); } /* The recursive call of the void */ |
| Soviet_Mario <SovietMario@CCCP.MIR>: Nov 23 12:25AM +0100 Il 22/11/19 20:11, James Kuyper ha scritto: >> I have abandoned the unsigned Bit_Field : 8 solution as soon >> as you (a lot of people) stated that u_int_8 MUST have > uint8_t here the macro or typedef is u_int8_t actually > any C99+ implementation that fails to support uint8_t, and on any C > platform where unsigned char doesn't have 8 bits, you can't do what you > want to do anyway. ok, that's a fact > the fact that you are trying to pack something that cannot possibly be > bigger than a byte, and therefore cannot have an alignment requirement > bigger than 1, that renders the pragma useless and unnecessary. when I say "in that case" I am not extrapolating, I am contextualizing to the specific case > ... >> u_int_32 AllChannels; > uint_32 maybe QT headers decorate differently here it is u_int32_t >> the files are packed. > OK - which fields are positioned differently from the way they would be > positioned if not packed? as I said, I do not have a working complete program yet, and I hope to get to a scenario where this wrong situation is avoided at all -- 1) Resistere, resistere, resistere. 2) Se tutti pagano le tasse, le tasse le pagano tutti Soviet_Mario - (aka Gatto_Vizzato) |
| James Kuyper <jameskuyper@alumni.caltech.edu>: Nov 22 11:56PM -0500 On 11/22/19 6:25 PM, Soviet_Mario wrote: >> uint8_t > here the macro or typedef is > u_int8_t actually Everyone who's been talking with you about this subject has been talking about std::uint8_t from <cstdint>. If you're talking about something else, you need to identify what it is and where is it is declared. In gcc, and in many other compilers as well, there's a command line option (-E in gcc) that produces output after pre-processing, but before any other of the translation phases, but with #line directives inserted to make sure that if you compile the output from -E, any resulting error messages will refer to the correct file. Compile one of the modules that uses u_int8_t using -E. Search the output for the first occurrence of u_int8_t, which should be a typedef declaration. What file is referred to by the last #line directive preceding that declaration? ... >>> u_int_32 AllChannels; >> uint_32 > maybe QT headers decorate differently A search of the QT online reference manual for u_int8_t turns up nothing: <https://doc.qt.io/qt-5/search-results.html?q=%22u_int8_t%22> Nor do I get anything from u_int_32: <https://doc.qt.io/qt-5/search-results.html?q=%22u_int_32%22> Whatever is the source of this typedef, I don't think it's QT. > as I said, I do not have a working complete program yet, and > I hope to get to a scenario where this wrong situation is > avoided at all I think you can avoid it perfectly without doing anything at all about it, because it can't occur. I recommend not wasting any more time worrying about it until you can come up with a plausible reason for doing so. |
| Keith Thompson <kst-u@mib.org>: Nov 22 10:04PM -0800 Soviet_Mario <SovietMario@CCCP.MIR> writes: [...] > without PRAGMA, do I have any explicit or implicit warranty > that my union be 32 bit long ? If so, PRAGMA is useless. If > not, it seems it might be useful. Fixing the spelling of uintN_t (n=8,32) and adding a tag to the union so you can use it to define objects: union u { struct { uint8_t Red, Green, Blue, Alpha; }; uint32_t AllChannels; }; (You have an anonymous struct, which is standard but wasn't added until C11, so older compilers might not support it, or might not support it by default.) You're guaranteed that the Red and AllChannels members start at the same location, the very beginning of the union. You're guaranteed that Red, Green, Blue, and Alpha are at increasing offsets in the order you defined them. It's very very likely that the struct and the union will both be exactly 32 bits. A compiler is not prohibited from adding padding between members or after the last one, but there's unlikely to be any good reason to do so in this case. The most likely way this could fail is if a compiler insisted on making the structure 64 bits. You can easily add a compile-time or run-time assertion that sizeof (struct u) == 4 && CHAR_BIT == 8 so your program won't compile, or at least won't run, in the unlikely event that it's used with an implementation that doesn't meet your requirements. There is *no* guarantee about which of the four uint8_t members correspond to which part of AllChannels. Google "endianness" if you're not familiar with this. It's very common for Red to overlap either the high-order byte or the low-order byte of AllChannels. The other 22 permutations are rare or nonexistent, but Google "PDP-11 middle-endian" for an odd case. If you're doing bitwise operations on the entire 32-bit word without caring about the order of the bytes, then that might not be an issue. Attempting to use #pragma pack is very likely to do nothing *at best*. [...] -- Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst> Working, but not speaking, for Philips Healthcare void Void(void) { Void(); } /* The recursive call of the void */ |
| Soviet_Mario <SovietMario@CCCP.MIR>: Nov 23 11:12AM +0100 Il 23/11/19 05:56, James Kuyper ha scritto: > Everyone who's been talking with you about this subject has been talking > about std::uint8_t from <cstdint>. If you're talking about something > else, you need to identify what it is and where is it is declared. I'm using QT who seems to like to replicate anything its own way even when avoiding specific Q-prefixes. so, HERE, including #include <sys/types.h> I find this type (everyone might guess it is equivalent to the other), u_int8_t, which finally stems from a typedef to UNSIGNED CHAR. > any other of the translation phases, but with #line directives inserted > to make sure that if you compile the output from -E, any resulting error > messages will refer to the correct file. I'm not so confident with QT environment to manually set options to the compiler as yet, I just use the ide-forms to set or unset some warning levels (I keep enabled almost all even if the outcome is very verbose) > Compile one of the modules that uses u_int8_t using -E. Search the > output for the first occurrence of u_int8_t, which should be a typedef > declaration. it is an unsigned char, I saw it DIRECTLY opening the included header file (sys/types.h) this type is also replicated in another header <stdint.h> actually, ending up always in unsigned char > <https://doc.qt.io/qt-5/search-results.html?q=%22u_int8_t%22> > Nor do I get anything from u_int_32: > <https://doc.qt.io/qt-5/search-results.html?q=%22u_int_32%22> sorry, but not an issue. > Whatever is the source of this typedef, I don't think it's QT. omg ... are you saying I am just lieing for the fun of it ??? what the heck am I supposed to know about which kind of headers has included this install ? I am using the FREE version from a .DEB package (which in turn launched an updated and installed tons of "kits" by itself"). I just opened the headers and never dreamed about editing them. Just did a search. the header file reports /* Copyright (C) 1991-2016 Free Software Foundation, Inc. This file is part of the GNU C Library. The GNU C Library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version. The GNU C Library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details. You should have received a copy of the GNU Lesser General Public License along with the GNU C Library; if not, see <http://www.gnu.org/licenses/>. */ /* * POSIX Standard: 2.6 Primitive System Data Types <sys/types.h> */ and the macros and typedefs for such sized ints seem to apply for GCC >= 2.7 (but I might misunderstand) #if !__GNUC_PREREQ (2, 7) -- 1) Resistere, resistere, resistere. 2) Se tutti pagano le tasse, le tasse le pagano tutti Soviet_Mario - (aka Gatto_Vizzato) |
| Soviet_Mario <SovietMario@CCCP.MIR>: Nov 23 11:26AM +0100 Il 23/11/19 07:04, Keith Thompson ha scritto: >> not, it seems it might be useful. > Fixing the spelling of uintN_t (n=8,32) and adding a tag to the union so > you can use it to define objects: again ... Keith, I did not edit the header manually, and sized ints HERE are spelled as I wrote. ...omissis... #ifdef __USE_MISC /* Old compatibility names for C types. */ typedef unsigned long int ulong; typedef unsigned short int ushort; typedef unsigned int uint;
Subscribe to:
Post Comments (Atom)
|
No comments:
Post a Comment