Bonita Montero <Bonita.Montero@gmail.com>: Nov 29 09:26PM +0100

Why is sorting a pointer-array to strings slower than sorting the
strings themselves? I've checked that with this code:

#include <iostream>
#include <vector>
#include <fstream>
#include <string>
#include <algorithm>
#include <random>
#include <concepts>
#include <chrono>
#include <execution>

using namespace std;
using namespace chrono;

int main( int argc, char **argv )
{
    if( argc < 2 )
        return EXIT_FAILURE;
    using vc_t = vector<char>;
    using vc_it = vc_t::iterator;
    vector<char> block;
    {
        ifstream ifs;
        ifs.exceptions( ios_base::badbit );
        ifs.open( argv[1], ios_base::binary );
        ifs.seekg( 0, ios_base::end );
        streampos size = ifs.tellg();
        if( size > (size_t)-1 || !size )
            return EXIT_FAILURE;
        ifs.seekg( 0, ios_base::beg );
        block.resize( (size_t)size );
        ifs.read( &*block.begin(), (streamsize)size );
    }
    cout << "read" << endl;
    // calls callback( str, len ) for every whitespace-separated token
    // and returns the number of tokens
    auto parse = [&]<typename Callback>( Callback callback ) -> size_t
        requires requires( Callback callback, char const *str, size_t len )
        {
            { callback( str, len ) };
        }
    {
        size_t n = 0;
        for( vc_it cit = block.begin(); cit != block.end(); )
        {
            vc_it lineBegin = cit;
            for( ; cit != block.end() && (unsigned char)*cit > ' '; ++cit );
            if( cit != lineBegin )
                ++n, callback( &*lineBegin, cit - lineBegin );
            for( ; cit != block.end() && (unsigned char)*cit <= ' '; ++cit );
        }
        return n;
    };
    vector<string> strings;
    size_t sumLen = 0;
    strings.reserve( parse( [&]( char const *, size_t len ) { sumLen += len; } ) );
    if( !strings.capacity() )
        return EXIT_FAILURE;
    cout << "counted, avg length: "
         << (double)(ptrdiff_t)sumLen / (ptrdiff_t)strings.capacity() << endl;
    parse( [&]( char const *str, size_t len ) { strings.emplace_back( str, len ); } );
    cout << "parsed" << endl;
    sort( strings.begin(), strings.end() );
    strings.erase( unique( strings.begin(), strings.end() ), strings.end() );
    cout << "uniqued" << endl;
    strings.shrink_to_fit();
    block.resize( 0 ), block.reserve( 0 );
    // random permutation that never leaves an element in place
    auto randomize = []<typename RandomIt>( RandomIt begin, size_t n )
        requires random_access_iterator<RandomIt>
    {
        mt19937_64 mt;
        uniform_int_distribution<size_t> uidIdx( 0, n - 1 );
        for( size_t i = 0, j; i != n; ++i )
        {
            while( (j = uidIdx( mt )) == i );
            swap( begin[i], begin[j] );
        }
    };
    randomize( strings.begin(), strings.size() );
    auto start = high_resolution_clock::now();
    sort( strings.begin(), strings.end() );
    double ns = (double)(int64_t)duration_cast<nanoseconds>(
        high_resolution_clock::now() - start ).count() / (ptrdiff_t)strings.size();
    cout << "string-sort: " << ns << endl;
    vector<string *> pStrings;
    pStrings.reserve( strings.size() );
    for( string &str : strings )
        pStrings.emplace_back( &str );
    randomize( pStrings.begin(), pStrings.size() );
    start = high_resolution_clock::now();
    sort( pStrings.begin(), pStrings.end(),
        []( string *left, string *right ) { return *left < *right; } );
    ns = (double)(int64_t)duration_cast<nanoseconds>(
        high_resolution_clock::now() - start ).count() / (ptrdiff_t)strings.size();
    cout << "pString-sort: " << ns << endl;
}

I've read 1.4 million strings with an average length of 9 characters,
so most strings should fall under the short string optimization, and
swapping the string objects should have a higher cost than swapping
two pointers, or a pointer and a length. Yet sorting the string
pointers is actually nearly twice as slow as sorting the string
objects. Why is that?
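The SSO assumption can be checked directly. A minimal sketch, relying
on the implementation-specific (but common) detail that an SSO
string's data() points into the string object itself; the SSO
threshold is 15 chars on libstdc++ and MSVC, 22 on libc++:

#include <iostream>
#include <string>

int main()
{
    std::string s( "123456789" ); // 9 chars, the average length in the test
    // With the small-string optimization the character buffer lives
    // inside the string object itself, so data() points into the
    // object's own footprint instead of to a separate heap allocation.
    char const *p = s.data();
    char const *obj = reinterpret_cast<char const *>( &s );
    bool sso = p >= obj && p < obj + sizeof s;
    std::cout << "sizeof(std::string): " << sizeof s
              << ", SSO in effect: " << std::boolalpha << sso << std::endl;
}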
Marcel Mueller <news.5.maazl@spamgourmet.org>: Nov 29 10:54PM +0100

On 29.11.21 at 21:26, Bonita Montero wrote:
> Why is sorting a pointer-array to strings slower than sorting the
> strings themselves? I've checked that with this code:
[...]
> so most strings should fall under the short string optimization, and
> swapping the string objects should have a higher cost than swapping
> two pointers, or a pointer and a length.

Swapping a few bytes is not the performance killer. Cache misses are
the performance killer. Every time you touch a memory address that is
not in the cache, you pay the latency of RAM, and this latency has not
improved significantly in the last two decades.

> Yet sorting the string pointers is actually nearly twice as slow as
> sorting the string objects. Why is that?

Using the pointers touches more distinct memory locations: the pointer
array and the string storage. With an array that large, almost every
(random) access is a cache miss.

Marcel
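Marcel's two-locations point can be reproduced in isolation. A minimal,
self-contained sketch (the figures will vary by machine; N and the
string contents are made up for illustration):

#include <algorithm>
#include <chrono>
#include <cstddef>
#include <iostream>
#include <random>
#include <string>
#include <vector>

using namespace std;
using namespace chrono;

int main()
{
    // Build N short strings (inside the SSO limit) and a parallel
    // array of pointers to them.
    constexpr size_t N = 1'000'000;
    mt19937_64 mt;
    vector<string> strings;
    strings.reserve( N );
    for( size_t i = 0; i != N; ++i )
        strings.emplace_back( to_string( mt() % 1'000'000'000 ) ); // <= 9 chars
    vector<string *> pStrings;
    pStrings.reserve( N );
    for( string &s : strings )
        pStrings.emplace_back( &s );

    // Pointer sort: each comparison loads a pointer slot from pStrings
    // and then dereferences into a string object living elsewhere,
    // touching two different memory locations per operand.
    shuffle( pStrings.begin(), pStrings.end(), mt );
    auto start = high_resolution_clock::now();
    sort( pStrings.begin(), pStrings.end(),
          []( string *l, string *r ) { return *l < *r; } );
    auto ptrNs = duration_cast<nanoseconds>(
        high_resolution_clock::now() - start ).count();

    // Value sort: for SSO strings the compared characters sit directly
    // in the vector's contiguous storage, one location per operand.
    shuffle( strings.begin(), strings.end(), mt );
    start = high_resolution_clock::now();
    sort( strings.begin(), strings.end() );
    auto valNs = duration_cast<nanoseconds>(
        high_resolution_clock::now() - start ).count();

    cout << "pointer-sort: " << ptrNs / (double)N << " ns/elem\n"
         << "value-sort:   " << valNs / (double)N << " ns/elem\n";
}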
Juha Nieminen <nospam@thanks.invalid>: Nov 29 05:57AM

> Assuming that a three-way comparison is often a bit more expensive,
> it might even be less efficient for some collections.

And not only does it require a slower three-way comparison, it also
uses two conditionals instead of the one that
std::binary_search/std::lower_bound uses. Depending on the situation,
the additional conditional may actually make the code slightly slower
(especially since the CPU will likely mispredict it in about half the
iterations, on average). Unpredictable conditionals are performance
killers.
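To make the branch counts concrete, here is a hand-rolled sketch of
the two loop shapes under discussion (illustrative, not the actual
library code):

#include <cstddef>

// Canonical lower_bound shape: exactly one data-dependent comparison
// per halving step, and the trip count depends only on n.  With a
// single branch, compilers can often compile the update to a
// conditional move, removing misprediction cost entirely.
template<typename It, typename T>
It lb_one_branch( It begin, std::size_t n, T const &key )
{
    while( n )
    {
        std::size_t half = n / 2;
        It mid = begin + half;
        if( *mid < key )
        {
            begin = mid + 1;
            n -= half + 1;
        }
        else
            n = half;
    }
    return begin;
}

// Early-exit shape: a second, data-dependent equality branch in every
// step.  It evaluates to true at most once per search, but until then
// it is one more conditional in the loop.
template<typename It, typename T>
It bs_two_branch( It begin, It end, T const &key )
{
    It last = end;
    while( begin != end )
    {
        It mid = begin + (end - begin) / 2;
        if( *mid == key ) // the extra conditional
            return mid;
        if( *mid < key )
            begin = mid + 1;
        else
            end = mid;
    }
    return last; // not found
}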
Bonita Montero <Bonita.Montero@gmail.com>: Nov 29 08:26AM +0100

On 29.11.2021 at 06:57, Juha Nieminen wrote:
> And not only does it require a slower three-way comparison, ...

Check the generated code: depending on the type, the result of the
three-way comparison is already in the flags.

> ..., it also uses two conditionals instead of the one that
> std::binary_search/std::lower_bound uses.

The ==-conditional is predictable almost every time since it matches
at most once per search. But you're right: in most cases this code is
slower than with lower_bound. However, I've developed a lower_bound
alternative which is slightly faster than std::lower_bound:

#include <cassert>  // assert
#include <compare>  // std::strong_ordering
#include <concepts> // std::convertible_to
#include <cstddef>  // std::size_t
#include <iterator> // std::random_access_iterator

template<typename RandomIt, typename T, typename Pred>
inline RandomIt xlower_bound( RandomIt begin, RandomIt end, T const &key, Pred pred )
    requires std::random_access_iterator<RandomIt>
        && requires( Pred pred, decltype(*RandomIt()) &elem, T const &key )
        {
            { pred( elem, key ) } -> std::convertible_to<std::strong_ordering>;
        }
{
    using namespace std;
    assert(end >= begin);
    size_t hit = end - begin, lower = 0, upper = hit, mid;
    while( lower != upper )
    {
        mid = lower + (upper - lower) / 2;
        if( pred( begin[mid], key ) >= 0 )
            hit = mid, upper = mid;
        else
            lower = mid + 1;
    }
    return begin + hit;
}

On a vector of 1.4 million string_view objects I get about 3% more
performance on my computer with clang 13.
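For reference, a minimal usage example of the xlower_bound above; the
word list is made up for illustration:

#include <compare>
#include <iostream>
#include <string_view>
#include <vector>
// plus the xlower_bound template from above

int main()
{
    using namespace std;
    vector<string_view> words { "alpha", "beta", "delta", "gamma" }; // sorted
    auto it = xlower_bound( words.begin(), words.end(), string_view( "delta" ),
        []( string_view &elem, string_view const &key ) -> strong_ordering
        { return elem <=> key; } );
    if( it != words.end() && *it == "delta" )
        cout << "found at index " << it - words.begin() << endl;
}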
Bonita Montero <Bonita.Montero@gmail.com>: Nov 29 02:02PM +0100

This is a small benchmark (vsv_it is vector<string_view>::iterator):

auto bench = [&]<typename SearchFn>( SearchFn searchFn ) -> double
    requires requires( SearchFn searchFn, vsv_it iterator, string_view const &key )
    {
        { searchFn( iterator, iterator, key ) } -> convertible_to<vsv_it>;
    }
{
    size_t sumExist = 0;
    auto start = high_resolution_clock::now();
    for( size_t it = ITERATIONS; it; --it )
        sumExist += searchFn( strings.begin(), strings.end(),
            strings[uidIndex( mt )] ) != strings.end();
    double ns = (double)(int64_t)duration_cast<nanoseconds>(
        high_resolution_clock::now() - start ).count() / ITERATIONS;
    aSumExist += sumExist;
    return ns;
};
double lb = bench( []( vsv_it begin, vsv_it end, string_view const &key ) -> vsv_it
{
    vsv_it it = lower_bound( begin, end, key );
    if( it == end || *it != key )
        return end;
    return it;
} );
cout << "lb: " << lb << endl;
double xlb = bench( []( vsv_it begin, vsv_it end, string_view const &key ) -> vsv_it
{
    vsv_it it = xlower_bound( begin, end, key,
        []( string_view &elem, string_view const &key ) -> strong_ordering
        { return elem <=> key; } );
    if( it == end || *it != key )
        return end;
    return it;
} );
cout << "xlb: " << xlb << endl;
double xbs = bench( []( vsv_it begin, vsv_it end, string_view const &key )
{
    return xbinary_search( begin, end, key,
        []( string_view &elem, string_view const &key ) -> strong_ordering
        { return elem <=> key; } );
} );
cout << "xbs: " << xbs << endl;

With 1,444,207 sorted string_view items of average length 8.68825
characters:

lower_bound:    1835.47 ns
xlower_bound:   1741.33 ns
xbinary_search: 1845.06 ns

I think with such a large dataset the runtime is determined not by the
algorithm but by the random memory accesses.
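xbinary_search itself isn't quoted in this digest. A plausible sketch,
consistent with the xlower_bound above (three-way predicate, early
exit on equality); a guess at the shape, not necessarily Bonita's
actual code:

#include <compare>  // std::strong_ordering
#include <cstddef>  // std::size_t
#include <iterator> // std::random_access_iterator

template<typename RandomIt, typename T, typename Pred>
inline RandomIt xbinary_search( RandomIt begin, RandomIt end, T const &key, Pred pred )
    requires std::random_access_iterator<RandomIt>
{
    std::size_t lower = 0, upper = end - begin;
    while( lower != upper )
    {
        std::size_t mid = lower + (upper - lower) / 2;
        std::strong_ordering o = pred( begin[mid], key );
        if( o == 0 ) // early exit, hit at most once per search
            return begin + mid;
        if( o < 0 )
            lower = mid + 1;
        else
            upper = mid;
    }
    return end; // not found
}

The timings do support the memory-bound reading: log2 of 1.4 million
is about 20.5 halving steps, so ~1800 ns per lookup is roughly 90 ns
per probe, on the order of a main-memory access.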


