- string_view problem - 12 Updates
Ralf Goertz <me@myprovider.invalid>: Jun 27 08:43AM +0200 Am Wed, 26 Jun 2019 12:11:57 -0000 (UTC) > > tester=permuter+permuter.substr(0,permuter.size()-1); //* > If you are aiming for the utmost extreme speed and optimization, lines > like this should *immediately* ring alarm bells in your head. I thought a little about that. While I can see your point, I remember having heard (I am no professional programer, so I never properly learnt C++) that every variable should be declared where it is first used. Which somewhat conflicts with your statement: > heavy algorithms is going to kill the efficiency. (There are > situations where it simply can't be avoided, but if you can avoid it, > you should.) I think I also heard that compilers usually generate code that optimises away all but the first allocation. In my case the size of permuter doesn't change within the loop so the memory allocated in the first run could simply be reused in the subsequent runs, no? |
Juha Nieminen <nospam@thanks.invalid>: Jun 27 07:57AM > having heard (I am no professional programer, so I never properly learnt > C++) that every variable should be declared where it is first used. > Which somewhat conflicts with your statement: That's a principle relating to syntax, not relating to where objects should be created. In old C variables needed to be declared at the beginning of the function, even if they were only needed later in the code. In other words, you had to write like: void foo() { int i, j; // lots of code here... i = 5; j = 10; // etc... } In C++, and in modern C, it's recommended for clarity to do it like: void foo() { // lots of code here... int i = 5, j = 10; // etc... } Obviously the advise is not saying that if you have some buffer of fixed size for calculations, and this buffer is used inside the innermost loop of a heavy algorithm, you should create the buffer again and again inside that inner loop. > away all but the first allocation. In my case the size of permuter > doesn't change within the loop so the memory allocated in the first run > could simply be reused in the subsequent runs, no? You are not only creating the 'tester' string in the inner loop, but you are also creating temporary strings. These strings get destroyed at the end of the scope where they are created. The compiler isn't going to optimize that away. |
Ralf Goertz <me@myprovider.invalid>: Jun 27 10:37AM +0200 Am Thu, 27 Jun 2019 07:57:22 -0000 (UTC) > you are also creating temporary strings. These strings get destroyed > at the end of the scope where they are created. The compiler isn't > going to optimize that away. So the following would be okay and the allocation is done just once? Also, will it be be initialised as the constructor says although this is not needed at all? size_t const size=permuter.size(); do { bool found(false); boyer_moore_searcher b(permuter.begin(),permuter.end()); for (int i=0;i<result.size();++i) { string tester(size*2-2,' '); //or vector<char> copy(result[i].begin()+1,result[i].end(),tester.begin()); copy(result[i].begin() ,result[i].end()-1,tester.begin()+size-1); if (b(tester.begin(),tester.end()).first!=tester.end()) { found=true; break; } } … } while (next_permutation(permuter.begin()+1,permuter.end())); |
James Kuyper <jameskuyper@alumni.caltech.edu>: Jun 27 07:26AM -0400 On 6/27/19 3:57 AM, Juha Nieminen wrote: > Ralf Goertz <me@myprovider.invalid> wrote: ... >> I thought a little about that. While I can see your point, I remember >> having heard (I am no professional programer, so I never properly learnt >> C++) that every variable should be declared where it is first used. That's not always possible, if the scope for the variable needs to be larger than the scope in which the first use occurs. The key requirement is that it be declared no later than first use, and preferably as late as possible, consistent with the way it will be used. > should be created. In old C variables needed to be declared at the > beginning of the function, even if they were only needed later in > the code. ... That may have been true in the very earliest days of C, but it hasn't been true since at least K&R C, which allowed you to declare objects at the start of any compound statement, not just at the start of the outermost compound statement of a function: int func(in) int in; { if(in) { int total; ... C++ added the ability to intermingle declarations and statements, which was later adopted by C99 as well. However, the freedom to place declarations in positions other than the start of the function dates back at least to K&R C. ... > size for calculations, and this buffer is used inside the innermost loop > of a heavy algorithm, you should create the buffer again and again inside > that inner loop. Actually, I would not hesitate to define a buffer as a C style array inside a loop, if there's no need for the contents of that buffer to carry over from one pass of the loop to another. I would normally expect the compiler to simply reuse the same piece of memory during each pass through the loop, at no extra cost. This is particularly true if there's a need to zero initialize the buffer between passes, which can be achieved by simply providing a {0} initializer for the buffer, rather than having to call memset(). In C++, I would hesitate to define a standard container in that location, because I couldn't rely on the compiler to figure out how to optimize away the constructor and destructor calls that nominally must occur at the start and end of each pass through the loop. |
Tim Rentsch <tr.17687@z991.linuxsc.com>: Jun 27 05:49AM -0700 > My idea for the program was to go over all linear arrangements (with the > first element fixed) and store those which were no circular permutations > of any of the previously stored arrangements. [...] This algorithm does a lot more work than is needed. The program can be much faster (I ran a 14 14 input in about 0.65 seconds). Hint: it isn't necessary to store any previous results to check if next_permutation() gives a string none of whose rotations have been seen before. |
Manfred <noname@add.invalid>: Jun 27 04:15PM +0200 On 6/27/2019 1:26 PM, James Kuyper wrote: > larger than the scope in which the first use occurs. > The key requirement is that it be declared no later than first use, and > preferably as late as possible, consistent with the way it will be used. The "preferably" part is subject to debate, and I don't think it should be stated as a general recommendation. That is probably what you meant with "consistent with the way it will be used", but it doesn't sound that clear. Some experienced programmers consider the habit of intermingling declarations and statements as sloppy practice, and this is not just a matter of being old fashioned. As an example, when I have to implement an algorithm of some sort, I take care to identify the data elements that compose the state of the logic involved. Managing a component's state requires proper discipline, so I consider such state data best declared at the top of the context where the algorithm is defined (which may well be a nested scope), so as to make clear that it belongs to a whole which is key to the functioning of the following procedure. |
Ralf Goertz <me@myprovider.invalid>: Jun 27 04:52PM +0200 Am Thu, 27 Jun 2019 05:49:37 -0700 > > of the previously stored arrangements. [...] > This algorithm does a lot more work than is needed. The program > can be much faster (I ran a 14 14 input in about 0.65 seconds). What number of permutations did you get? > Hint: it isn't necessary to store any previous results to check > if next_permutation() gives a string none of whose rotations have > been seen before. I am surprised to hear that. How else do I know that when I get to the permutation ABBA in the example above, it is the same as AABB so that I must discard it? The example also shows that the equivalence classes are in general not of the same size which is the very reason the formula is rather complicated. (The reference I gave corrected an erroneous previous attempt by another author.) 15 years ago I thought a lot about circular permutations and I still don't think I have an intuitive understanding of them. At that time I lacked the ability (and the computer power) to enumerate them. 14 14 wouldn't have sufficed anyway. The goal was to prove that for certain inputs there where permutations without neighbouring elements of the same category. The smallest relevant example (which is [probably] the only such example where there is no such permutation) is 3 1 1. I ended up proving the theorem for infinitely many (but not all) inputs with a completely different approach. But the idea of coming to grips with the problem by looking at the actual permutations somehow stuck. So I am not only surprised that there is a much simpler algorithm but also very impressed that you found it with apparent ease. Regrettably, your hint doesn't help me enough. So could you please elaborate? |
James Kuyper <jameskuyper@alumni.caltech.edu>: Jun 27 08:59AM -0700 On Thursday, June 27, 2019 at 10:15:55 AM UTC-4, Manfred wrote: > > preferably as late as possible, consistent with the way it will be used. > The "preferably" part is subject to debate, and I don't think it should > be stated as a general recommendation. I do. > That is probably what you meant with "consistent with the way it will be > used", I don't think so. The main thing I meant by that comment is that each variable must be declared in a location that gives it sufficient scope to be referred to by name in all of the parts of the code that need to refer to it by name. That requirement sometimes constrains how close a declaration can be to the point of first use. That isn't the issue you're talking about. > Some experienced programmers consider the habit of intermingling > declarations and statements as sloppy practice, and this is not just a > matter of being old fashioned. I know that they do, and I disagree. I think that declaring an identifier as close as possible to the point of first use makes it easier to find and understand, and minimizes the opportunities for conflict with other identifiers. Some people feel that a different location, such as the top of the function, would make it easier to find, but I haven't seen a reasonable explanation for that feeling. |
Tim Rentsch <tr.17687@z991.linuxsc.com>: Jun 27 09:57AM -0700 > where the algorithm is defined (which may well be a nested scope), > so as to make clear that it belongs to a whole which is key to the > functioning of the following procedure. I agree with your sentiments. The policy of declaring variables "preferably as late as possible" is overly simplistic, and over sold. |
Tim Rentsch <tr.17687@z991.linuxsc.com>: Jun 27 10:16AM -0700 > to the permutation ABBA in the example above, it is the same as > AABB so that I must discard it? [...] Regrettably, your hint > doesn't help me enough. So could you please elaborate? Consider looking at ABBA. It has four rotations, which is to say the original plus three non-zero rotations: ABBA BBAA BAAB AABB The last of these comes lexicographically before ABBA. Therefore we must have seen AABB earlier, and ABBA is a duplicate. (The "earlier" in the above statement relies on next_permutation() giving its results in lexicographic order, but if what we're doing is counting them the order doesn't matter as long as we eventually get every permutation - we will count the "smallest" rotations, and skip the rest, in whatever order they might appear.) |
Ralf Goertz <me@myprovider.invalid>: Jun 27 08:10PM +0200 Am Thu, 27 Jun 2019 10:16:17 -0700 > is counting them the order doesn't matter as long as we eventually > get every permutation - we will count the "smallest" rotations, > and skip the rest, in whatever order they might appear.) Great, thanks! I still need 2.7 seconds for 14 14, but that was not doable before. Maybe you've got some insight as to why I still seem to be much slower than you (assuming our hardware is comparable). #include <iostream> #include <vector> #include <algorithm> #include <iterator> #include <string> using namespace std; template<typename T> bool has_smaller_permutation(T const &v) { static T rotator(v.size(),' '); for (auto i=1;i<v.size();++i) { copy(v.begin(),v.begin()+i,copy(v.begin()+i,v.end(),rotator.begin())); if (rotator<v) return true; } return false; } int main(int argc, char *argv[]) { vector<char> permuter; //string permuter; char c='A'; for (auto i=1;i<argc;++i) { auto n=stoi(argv[i]); permuter.insert(permuter.end(),n,c++); } if (!permuter.size()) { cout<<0<<endl; return 0; } vector<string> result; size_t sum=0; do { if (!has_smaller_permutation(permuter)) { //copy(permuter.begin(),permuter.end(),ostream_iterator<char>(cout,"")); cout<<'\n'; ++sum; } } while (next_permutation(permuter.begin()+1,permuter.end())); cout<<sum<<endl; } |
Keith Thompson <kst-u@mib.org>: Jun 27 12:06PM -0700 > On 6/27/19 3:57 AM, Juha Nieminen wrote: [...] > was later adopted by C99 as well. However, the freedom to place > declarations in positions other than the start of the function dates > back at least to K&R C. [...] Yes. Historical note: In the 1975 C Reference Manual, declarations are allowed only at the top of a function definition. Compound statements cannot contain declarations. K&R1, 1978, redefined compound statements to allow zero or more declarations followed by zero or more statements. https://www.bell-labs.com/usr/dmr/www/cman.pdf -- Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst> Will write code for food. void Void(void) { Void(); } /* The recursive call of the void */ |
You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page. To unsubscribe from this group and stop receiving emails from it send an email to comp.lang.c+++unsubscribe@googlegroups.com. |
No comments:
Post a Comment