- what is the fastest way to store a list of unique pointers ? - 13 Updates
- Anyway to uuencode and uudecode a string in C++ - 3 Updates
- what is the fastest way to store a list of unique pointers ? - 3 Updates
- Visual C++ 2008 - Using a pure virtual destructor. Any side effects I should be aware of ? - 4 Updates
| Manfred <invalid@invalid.add>: Aug 12 01:38AM +0200 On 08/12/2017 01:27 AM, Alf P. Steinbach wrote: >> I think that std::set is the closest to the current behaviour >> (std::map using only keys). > Can't say that I understand what you're thinking of here. keys in a std::map are ordered, aren't they? Since the OP is not using the map values, but only the keys, then I think that such usage of the map is practically equivalent to using a std::set. >> And I think it is faster than unordered_set too, although I have to >> say I did not check. > Not this either. Since when did hashing become slower than a tree search? You may be right here, as I said, I did not check. |
| "Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Aug 12 03:37AM +0200 On 12.08.2017 01:38, Manfred wrote: >>> (std::map using only keys). >> Can't say that I understand what you're thinking of here. > keys in a std::map are ordered, aren't they? Oh. Well consider these points: • The order of raw pointers is pretty arbitrary. • `std::map` was not chosen for its good fit to the problem: they're not using the mapping. I.e. there is AFAICS no reason to believe the order is significant even if the keys were of a kind with a natural order. > Since the OP is not using the map values, but only the keys, then I > think that such usage of the map is practically equivalent to using a > std::set. Yes. >>> say I did not check. >> Not this either. Since when did hashing become slower than a tree search? > You may be right here, as I said, I did not check. An implementation of `unordered_set` can be slower than `set` for a given use case, but it's unlikely. The reason is that a search in a tree structure, while logarithmic, is on average much slower than an on average constant time lookup. Cheers & hth., - Alf |
| Marcel Mueller <news.5.maazl@spamgourmet.org>: Aug 12 08:44AM +0200 On 12.08.17 00.11, Lynn McGuire wrote: > What is the fastest way to store a list of unique pointers ? Fastest for scalable sets are in general hash sets. In C++ they are implemented by unordered_set<>. set<> itself uses Red-Black-Tree or something similar. Although the complexity is O(log n) this is in practice amazingly inefficient because of the large memory overhead (in case of small content) and the bad locality (degrading cache efficiency). I almost never use this containers (including map<>). A reasonable compromise to the ordered standard containers are B-Trees. Almost every database uses them. Unfortunately there is still no B-Tree implementation in the standard. > We were storing a list of unique pointers in a std::vector list. But > searching it is slow before deciding to add a new pointer. This is the worst choice. You ended up with complexity O(n²). > always "add" the pointer to the list but never have to look it up. But, > the resulting value is always just set to 1. The resulting std::map > uses about 10% of the time that the std::vector did. Use unordered_set<> as there is no need to keep the pointers in some particular order as well as there is no need for an associative container. Marcel |
| Melzzzzz <Melzzzzz@zzzzz.com>: Aug 12 06:57AM > A reasonable compromise to the ordered standard containers are B-Trees. > Almost every database uses them. Unfortunately there is still no B-Tree > implementation in the standard. There is nothing that prevents implementing std::map/set as btree. -- press any key to continue or any other to quit... |
| Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Aug 12 10:29AM +0100 On 12/08/2017 07:44, Marcel Mueller wrote: > of the large memory overhead (in case of small content) and the bad > locality (degrading cache efficiency). I almost never use this > containers (including map<>). One can use a pool allocator with set/map to improve allocation speed and cache locality; boost has fast pool allocator for this purpose. /Flibble |
| "Öö Tiib" <ootiib@hot.ee>: Aug 12 03:23AM -0700 On Saturday, 12 August 2017 12:29:40 UTC+3, Mr Flibble wrote: > > containers (including map<>). > One can use a pool allocator with set/map to improve allocation speed > and cache locality; boost has fast pool allocator for this purpose. Yes, but this tells us that for example when we have 42 000 'Growl*' some of what share value then you suggest that set with pool allocator like 'std::set<Growl*, std::less<Growl*>, boost::fast_pool_allocator<Growl*>>' is faster to fill with those than just 'std::set<Growl*>'. The real question (that OP seemingly had) is if it is also faster than 'std::unordered_set<Growl*>'. |
| Manfred <noname@invalid.add>: Aug 12 01:52PM +0200 On 08/12/2017 03:37 AM, Alf P. Steinbach wrote: > • `std::map` was not chosen for its good fit to the problem: they're not > using the mapping. I.e. there is AFAICS no reason to believe the order > is significant even if the keys were of a kind with a natural order. Whatever the ordering for raw pointers, still std::map keys are ordered - their intent has been since the beginning of time that of fast search. I agree with you that the OP is misusing it, but, with respect to fast lookup, the choice makes sense. > given use case, but it's unlikely. > The reason is that a search in a tree structure, while logarithmic, is > on average much slower than an on average constant time lookup. That is typically true for large sets. Thanks for the clarification. |
| Manfred <noname@invalid.add>: Aug 12 04:04PM +0200 On 08/12/2017 03:40 PM, Stefan Ram wrote: > vector, because the implementations of the sets are so slow. > But my measurements seem to confirm the naïve assumption: > That unordered_set is fastest. This obviously depends on the usage of the container: for the general case of random or sequential access, then vector wins. When it comes to more complex operations like searching, then it is understandable that other types of container work better. What Bjarne often shows is that in unexpected cases, like insertion, that list would naively be better equipped for, actually vector wins - so the recommendation is to use vector as a default, but this does not mean always. |
| Marcel Mueller <news.5.maazl@spamgourmet.org>: Aug 12 04:26PM +0200 On 12.08.17 08.57, Melzzzzz wrote: >> Almost every database uses them. Unfortunately there is still no B-Tree >> implementation in the standard. > There is nothing that prevents implementing std::map/set as btree. NACK! The guarantee that iterators to untouched elements remain valid on insert/delete makes a reasonable B-Tree implementation impossible. Marcel |
| Marcel Mueller <news.5.maazl@spamgourmet.org>: Aug 12 04:26PM +0200 On 12.08.17 11.29, Mr Flibble wrote: >> containers (including map<>). > One can use a pool allocator with set/map to improve allocation speed > and cache locality; boost has fast pool allocator for this purpose. In this case only the memory overhead of roughly one order of magnitude (for small objects) remains. And well, if I always have to use a pool allocator to keep the performance reasonable the question arises whether it is the right container. Since I usually do not place large objects into a container std::set and std::map is of no particular use for me. Marcel |
| Melzzzzz <Melzzzzz@zzzzz.com>: Aug 12 03:21PM > NACK! > The guarantee that iterators to untouched elements remain valid on > insert/delete makes a reasonable B-Tree implementation impossible. Well, that makes restrictions to implementations, for sure. I tested BTree implementation in Rust against my AVL tree . Iteration through btree is order of magnitude faster then binary tree. Insertion and deletion is about twice as fast in btree. That is just because btree is cache friendly. -- press any key to continue or any other to quit... |
| Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Aug 12 06:33PM +0100 On 12/08/2017 15:26, Marcel Mueller wrote: > container. > Since I usually do not place large objects into a container std::set and > std::map is of no particular use for me. Anyone who claims that std::set and std::map is no use to them is obviously Doing It Wrong (TM). They are essential containers and should be used when it is appropriate to use them. If you don't know when it is appropriate to use them or you think that it is never appropriate to use them then you are newb. /Flibble |
| Paavo Helde <myfirstname@osa.pri.ee>: Aug 13 12:10AM +0300 On 12.08.2017 19:56, Stefan Ram wrote: > retrieving ::std::set 713,790.944 > . This means that when we actually find some numbers > again, now ::std::unordered_set is fastest. It appears that your definition of "sorted_vector" is useless (a poor implementation of std::set). A proper "sorted_vector" is what you fill by e.g. push_back() first, then sort once and keep (at least mostly) constant afterwards. This should get rid of the apparent 100x overhead in filling it and make the numbers more comparable. This is not to say that it might beat std::unordered_set, the latter may still outwin because it does not have to keep the data in a given order. Cheers Paavo |
| kushal bhattacharya <bhattacharya.kushal4@gmail.com>: Aug 11 09:54PM -0700 On Saturday, August 12, 2017 at 12:46:54 AM UTC+5:30, Paavo Helde wrote: > UTF-8 charset declaration. > hth > Paavo as in the base 64 spec it states that the encoding format is used for transport preservation purposes used mainly in mail systems |
| Jorgen Grahn <grahn+nntp@snipabacken.se>: Aug 12 07:14AM On Fri, 2017-08-11, Ian Collins wrote: > On 08/11/17 05:43 PM, kushal bhattacharya wrote: ... >> socket too.So thats whyi would prefer some good encoding mechanism >> here > Sockets are typically used to transfer binary data, That could be misinterpreted; lots of common protocols use text. But yes, e.g. a TCP socket has always been two streams of bytes. > why do you see special characters as a problem? I think his problem is really about protocol design. If he wants to do more than send a single string over a single TCP socket, with no verification, then he has to do something, e.g. send a content-length like HTTP does, or define an end marker (and escape it in the data) like SMTP does. /Jorgen -- // Jorgen Grahn <grahn@ Oo o. . . \X/ snipabacken.se> O o . |
| Paavo Helde <myfirstname@osa.pri.ee>: Aug 12 11:46PM +0300 On 12.08.2017 7:54, kushal bhattacharya wrote: >> hth >> Paavo > as in the base 64 spec it states that the encoding format is used for transport preservation purposes used mainly in mail systems Yes, for transport via e-mail systems ca 30 years ago. So tell me again why your C++ data is sent to PHP via 30-year old e-mail systems? Cheers Paavo |
| ram@zedat.fu-berlin.de (Stefan Ram): Aug 12 01:18AM >The first idea that would spring to my mind would be a >sorted vector with a binary search, because it should I measured the runtime of this loop: for( unsigned long l = 0; l < 10'000'000; ++l ) container.insert( rand() ); . unordered_set was the fastest and set the slowest with my custom "sorted_vector" in between: std::unordered_set<unsigned long> 337,000.000 sorted_vector<unsigned long> 1'294,000.000 std::set<unsigned long> 2'024,000.000 . Did not measure retrieval because OP asked for "store". (It is possible that some noise have been added to the times and/or all times were multiplied with the same factor to protect my privacy.) |
| ram@zedat.fu-berlin.de (Stefan Ram): Aug 12 01:40PM >std::unordered_set<unsigned long> 337,000.000 >sorted_vector<unsigned long> 1'294,000.000 >std::set<unsigned long> 2'024,000.000 Above, one can read the times for inserts. Today, I measured the times for retrievals after the container was filled with 10'000'000 random numbers using: ::timer t; decltype(s)::const_iterator const e = s.end(); int i; for( unsigned long l = 0; l < 10'000'000; ++l ) i = i +( e == s.find( rand() )); t.print(); escape( &i ); . And the results were very similar to the results for inserts. Since parts of the vector have to be shifted during inserts, I would have expected lookups on the sorted_vector to be much faster than inserts, but this was not observable (they were only slightly faster). Both Herb and Chandler (and to some extend also Bjarne) suggest, as far as I have understood them, to always use a vector, because the implementations of the sets are so slow. But my measurements seem to confirm the naīve assumption: That unordered_set is fastest. |
| ram@zedat.fu-berlin.de (Stefan Ram): Aug 12 04:56PM >case of random or sequential access, then vector wins. >When it comes to more complex operations like searching, then it is >understandable that other types of container work better. You are right. In my case, I used »random()« but I was not aware of RAND_MAX still being 32767 or so and therefore, the final section of the sorted_vector rarely got shifted. I now use a random-number generator that generates random numbers up to ::std::numerics_limits<unsigned long>::max() or so and get these results: filling ::sorted_vector 79'893,747.427 retrieving ::sorted_vector 238,574.119 filling ::std::unordered_set 360,994.757 retrieving ::std::unordered_set 248,835.371 filling ::std::set 741,265.933 retrieving ::std::set 722,563.192 . Here are the results with »::randomize( 0 );« added to find the same numbers again: filling ::sorted_vector 83'803,941.903 retrieving ::sorted_vector 266,760.153 filling ::std::unordered_set 362,649.427 retrieving ::std::unordered_set 161,849.880 filling ::std::set 754,459.535 retrieving ::std::set 713,790.944 . This means that when we actually find some numbers again, now ::std::unordered_set is fastest. There also might have been another error in my code yesterday, because I reported different results yesterday. My source code is not cleaned-up, but just so that you can check my results (estimated runtime between 1 and 20 minutes): #include <initializer_list> #include <chrono> #include <functional> #include <iostream> #include <vector> #include <unordered_set> #include <set> #include <algorithm> #include <climits> #include <cstdlib> #include <string> #include <iomanip> #include <ios> #include <random> using namespace ::std::literals; static void escape( volatile void * p ) { asm volatile( "" : : "g"(p) : "memory" ); } static void escape( void * p ) /* by Chandler Carruth */ { asm volatile( "" : : "g"(p) : "memory" ); } template< typename T > ::std::string format( T value ) { double privacy_factor = 1.0; /* default: 1.0 */ double privacy_noise = 0.0; /* default: 0.0 */ ::std::string s; int i = 0; int n = 0; long double v = value; v = v * privacy_factor; v = v *( 1.0 + privacy_noise * ( rand() /( 1.0 + RAND_MAX )- 0.5 )); value = v; if( value ) while( value ) { if( i &&( i % 3 == 0 )) { s = " .,';:-=%#$"[ i/3 ] + s; ++n; } /* above there is a bug: i/3 might be larger than the string */ ++i; ++n; s = "0123456789"[ value % 10 ] + s; value /= 10; } else s = "0"; while( n++ < 20 )s = " " + s; return s; } struct timer { using timepoint = ::std::chrono::high_resolution_clock::time_point; ::timer::timepoint t1; ::timer::timepoint t2; decltype( ::timer::t2 - ::timer::t1 )sum{}; timer(){ this->reset(); } void reset() { this->t1 = ::std::chrono::high_resolution_clock::now(); this->t2 = t1; sum = t2 - t1; } void print( ::std::string const s, int const text_width, int const number_width ) { this->t2 = ::std::chrono::high_resolution_clock::now(); this->sum += this->t2 - this->t1; ::std::cout << ::std::setw( text_width )<< ::std::left << s << ::std::setw( number_width )<< ::std::right << format(( sum ).count()) << '\n'; }}; template <class T> struct sorted_vector { ::std::vector< T >vector; using iterator = typename ::std::vector< T >::iterator ; using const_iterator = typename ::std::vector< T >::const_iterator ; iterator begin() { return vector.begin(); } iterator end() { return vector.end(); } const_iterator begin() const { return vector.begin(); } const_iterator end() const { return vector.end(); } explicit sorted_vector(): vector{} { } iterator insert( const T & t ) { iterator i = lower_bound( begin(), end(), t ); if( i == end() || t < *i ) { vector.insert( i, t ); } return i; } const_iterator find( const T& t ) const { const_iterator i = lower_bound( begin(), end(), t ); return i == end() || t < *i ? end() : i; }}; auto const seed { ::std::chrono::system_clock::now().time_since_epoch().count() }; auto linear_congruential_engine { ::std::linear_congruential_engine< unsigned long, 40014uL,0uL,2147483563uL >( seed ) }; auto dice_distribution { ::std::uniform_int_distribution< unsigned long >{ 0, ULONG_MAX }}; void randomize( unsigned long const value ) { linear_congruential_engine.seed( value ); } unsigned long dice() { return dice_distribution( linear_congruential_engine ); } template< typename T > static inline void fill( T & container ) { ::randomize( 0 ); for( unsigned long l = 0; l < 1'000'000; ++l )container.insert( dice() ); escape( &container ); } template< typename T > static inline void retrieve( T const & container ) { typename T::const_iterator const e = container.end(); int i = 0; ::randomize( 0 ); for( unsigned long l = 0; l < 1'000'000; ++l )i = i+( e == container.find( dice() )); escape( &i ); } template< typename T > static inline void measure( ::std::string const name, int const text_width, int const number_width ) { ::timer timer; T container; timer.reset(); fill( container ); timer.print( "filling "s + name, text_width, number_width ); timer.reset(); retrieve( container ); timer.print( "retrieving "s + name, text_width, number_width ); } void measurement() { int const text_width = 32; int const number_width = 14; measure< ::sorted_vector< unsigned long >>( "::sorted_vector", text_width, number_width ); measure< ::std::unordered_set< unsigned long >>( "::std::unordered_set", text_width, number_width ); measure< ::std::set< unsigned long >>( "::std::set", text_width, number_width ); ::std::cout << '\n'; } int main() { for( int i = 0; i < 3; ++i )measurement(); } |
| Manfred <invalid@invalid.add>: Aug 12 01:30AM +0200 On 08/11/2017 05:42 PM, R.Wieser wrote: > I'm calling an external (to the VC++ program) object (stored in a > dynamically loaded DLL) which methods are based on the COM model, meaning > that the first argument is supposed to be a 'this' reference. I suspect that you are a bit misled about what you are trying to do - just a suspect as you did not post any code sample. I suggest you post some sample of both ends of the interface you are trying to work with. The COM model you mention has a number of layers, most of which are Microsoft-specific, and your reference about the 'this' pointer (not reference) looks like there is some misunderstanding going on here. |
| "R.Wieser" <address@not.available>: Aug 12 11:52AM +0200 Kalle, > I meant, the C++ standard does not define the format or > existence of vtables, Ah, yes. I found that rather odd: the mechanism by which virtual methods interface with the actual class do not seem to be described/defined. But even as that is so, the mechanism is proven to work on both windows as well as Linux (where the programs external object is placed in an .so file), and has been doing so for at least the last decade. So the VTable mechanism (even if not named as such) is not quite as unstable (prone to changes for whatever reason at whatever time) as it might appear at first glance ... :-) But yes, that concern is still lingering in the back of my mind. Thanks for reminding me. :-( :-) Regards, Rudy Wieser -- Origional message: Kalle Olavi Niemitalo <kon@iki.fi> schreef in berichtnieuws 87zib5q3x0.fsf@Niukka.kon.iki.fi... |
| "R.Wieser" <address@not.available>: Aug 12 12:32PM +0200 Manfred, > I suspect that you are a bit misled about what you are trying > to do. I'm listening, please continue. But I don't think you quite understood the situation here: The mechanism *already works*. I'm just trying to "neat it up" a bit. My current thread is about if I can use destructor declared as pure virtual to call the destructor method in the external object. As it turns out, I can't use "__stdcall" when declaring the classes virtual destructor *, as it does not cause the "this" reference/pointer to be transferred as the first argument. No problem though, I'm just keep using what I already got. :-) *actualy I can, it just doesn't do anything there. :-( > I suggest you post some sample of both ends of the interface > you are trying to work with. You already got the vc++ end. Anything wrong with it ? How ? > and your reference about the 'this' pointer (not reference) looks > like there is some misunderstanding going on here. Poteto, potato. I think you will be able to understand from the example I posted in my initial message in the previous thread how I'm using it, and thereby what I ment. If not ... To be honest, I have no idea when you guys call it a reference, and when a pointer. To me the two are the same. https://stackoverflow.com/questions/57483/what-are-the-differences-between-a -pointer-variable-and-a-reference-variable-in Oh, nice. So in VC++ a reference is just a special, limited kind of pointer. I guess I ment reference than. Regards, Rudy Wieser -- Origional message: Manfred <invalid@invalid.add> schreef in berichtnieuws omlelm$g80$1@gioia.aioe.org... > On 08/11/2017 05:42 PM, R.Wieser wrote: > > I'm calling an external (to the VC++ program) object (stored in a > > dynamically loaded DLL) which methods are based on the COM model, meaning |
| Manfred <noname@invalid.add>: Aug 12 02:23PM +0200 On 08/12/2017 12:32 PM, R.Wieser wrote: >> I suspect that you are a bit misled about what you are trying >> to do. > I'm listening, please continue. It looks as if you are trying to coerce two different C++ runtime implementations in the same process space - which would be trouble (it is undefined behaviour at best). > But I don't think you quite understood the situation here: The mechanism > *already works*. I'm just trying to "neat it up" a bit. The question is if it works by coincidence or because it is correct. Based on the information provided, both can be the case. > argument. > No problem though, I'm just keep using what I already got. :-) > *actualy I can, it just doesn't do anything there. :-( It is not clear if you are trying to destroy an object by explicitly calling its destructor, which could also be trouble. >> I suggest you post some sample of both ends of the interface >> you are trying to work with. > You already got the vc++ end. Anything wrong with it ? How ? It depends on what is on the other end, how the objects are allocated on both ends, what is the C++ runtime implementation on both ends, and what more about the ABI. > Poteto, potato. I think you will be able to understand from the example I > posted in my initial message in the previous thread how I'm using it, and > thereby what I ment. If not ... It's not poteto potato. The 'this' pointer (not reference) is passed by the C++ implementation, according to the rules dictated by the ABI. Instead of talking about ECX register and the like, you should make clear what is the ABI specified by your implementation (which must be the same on both ends, by the way) > -pointer-variable-and-a-reference-variable-in > Oh, nice. So in VC++ a reference is just a special, limited kind of > pointer. I guess I ment reference than. I guess you guessed wrong... |
| You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page. To unsubscribe from this group and stop receiving emails from it send an email to comp.lang.c+++unsubscribe@googlegroups.com. |
No comments:
Post a Comment