- Hashing for unordered containers? - 9 Updates
- High-tech, higher calling was: Onwards and upwards - 1 Update
- C++ 20 modules - 1 Update
Jorgen Grahn <grahn+nntp@snipabacken.se>: Oct 11 08:37AM It strikes me I still don't know the best way to do this. When I write small classes (e.g. a class Photo for the file name of a photo) I tend to add an operator< because: - It's easy. - It's natural; it makes sense to show there's an ordering. - I can then use Photo as a map or set key whenever I want. However, when I want to use Photo as a key in unordered_map or unordered_set (which is at least as reasonable as using it in a map or set) what I end up doing is: - Add a size_t Photo::hash() const member, implemented in terms of funny things like std::hash<std::string> {} (). - Specialize the std::hash class for Photo, and arrange for its operator() to call Photo::hash(). That's a lot of boilerplate, compared to operator<. Is there a better way? I know I can do things to the std::unordered_set itself, but would prefer not to (and I doubt there would be less boilerplate, even for a single instantiation). My context is C++11 or C++14; a C++17 or later solution would be interesting, but of no immediate use to me. /Jorgen -- // Jorgen Grahn <grahn@ Oo o. . . \X/ snipabacken.se> O o . |
Bonita Montero <Bonita.Montero@gmail.com>: Oct 11 01:20PM +0200 > operator() to call Photo::hash(). > That's a lot of boilerplate, compared to operator<. Is there a better > way? 1. A unordered map is usually faster. 2. You can define the hashing-class as a lambda and use decltype(hc) as the teplate-parameter. |
"Öö Tiib" <ootiib@hot.ee>: Oct 11 05:54AM -0700 On Sunday, 11 October 2020 11:37:23 UTC+3, Jorgen Grahn wrote: > operator() to call Photo::hash(). > That's a lot of boilerplate, compared to operator<. Is there a better > way? If you always hash photos in same way then it is perhaps fine like that: namespace std { template <> struct hash<Photo> { size_t operator()(const Photo& p) const noexcept { return std::hash<std::string>{}(p.name()); } }; } 8 lines is not lot. The std::equal_to uses Photo::operator== already so (unless your case needs some potentially confusing differences) that isn't needed to specialize. > I know I can do things to the std::unordered_set itself, but would > prefer not to (and I doubt there would be less boilerplate, even for > a single instantiation). The plain std::unordered_set<Photo> is fine when std::hash<Photo> and std::equal_to<Photo> compile. > My context is C++11 or C++14; a C++17 or later solution would be > interesting, but of no immediate use to me. What I wrote was AFAIK so since C++11 ... C++17 did not touch it. C++17 added std::pmr::unordered_set but that is of no concern in current thread. |
Brian Wood <woodbrian77@gmail.com>: Oct 11 07:38AM -0700 On Sunday, October 11, 2020 at 7:55:01 AM UTC-5, Öö Tiib wrote: > } > }; > } A little indentation is helpful in my opinion: namespace std { template <> struct hash<Photo> { size_t operator()(Photo const& p) const noexcept { return ::std::hash<::std::string>{}(p.name()); } }; } Not sure if I'd use noexcept there. Brian Ebenezer Enterprises https://github.com/Ebenezer-group/onwards |
Jorgen Grahn <grahn+nntp@snipabacken.se>: Oct 11 02:51PM On Sun, 2020-10-11, Öö Tiib wrote: > }; > } > 8 lines is not lot. It's IMO a lot, compared to the operator< that std::map and friends need. And (like I hinted at, but didn't explain) I can't hash this class via its public interface (it doesn't have a name() method) so I have to delegate to a Photo::hash() method. "A lot" is subjective. Perhaps it's better to say I think it makes people prefer std::set to std::unordered set, especially for short-lived containers which maybe just help eliminate duplicates from a sequence. And maybe use std::strings as keys rather than their own types. As you can tell, it's a bit frustrating to me that the better containers (when you don't need ordering etc) mean more work. But, at least it seems I haven't missed some nice shortcut that everyone else uses. /Jorgen -- // Jorgen Grahn <grahn@ Oo o. . . \X/ snipabacken.se> O o . |
Jorgen Grahn <grahn+nntp@snipabacken.se>: Oct 11 03:02PM On Sun, 2020-10-11, Bonita Montero wrote: >> That's a lot of boilerplate, compared to operator<. Is there a better >> way? > 1. A unordered map is usually faster. Yes, that's the only reason I bother with it. (Another reason would be if my type was easy to hash but hard to compare, but I don't think that has happened to me yet.) > 2. You can define the hashing-class as a lambda and use > decltype(hc) as the teplate-parameter. That /did/ occur to me this morning. It would make it possible to write std::unordered_set<Photo, decltype([] (const Photo& p) {...})> my_set; but it seems perverse to (seemingly) create a lambda function just for its type. And more importantly, I want the "hashability" to be a property of class Photo so I can use it anywhere. I may e.g. want to build on it when I make MugShot and std::pair<Photo, Photo> hashable. /Jorgen -- // Jorgen Grahn <grahn@ Oo o. . . \X/ snipabacken.se> O o . |
Manfred <noname@add.invalid>: Oct 11 05:18PM +0200 On 10/11/2020 5:02 PM, Jorgen Grahn wrote: > its type. And more importantly, I want the "hashability" to be a > property of class Photo so I can use it anywhere. I may e.g. want to > build on it when I make MugShot and std::pair<Photo, Photo> hashable. From cppreference.com: Named requirements: UnorderedAssociativeContainer (std::set belongs to this set ;) ) "Unordered associative containers are parametrized by ... Hash, a Hash function object which acts as hash function on Key ..." https://en.cppreference.com/w/cpp/named_req/UnorderedAssociativeContainer named requirements: Hash "A Hash is a /function object/ ..." https://en.cppreference.com/w/cpp/named_req/Hash So, instead of decltype(...) you may use std::function<size_t(const Photo&)> |
"Öö Tiib" <ootiib@hot.ee>: Oct 11 10:43AM -0700 On Sunday, 11 October 2020 17:51:34 UTC+3, Jorgen Grahn wrote: > class via its public interface (it doesn't have a name() method) so > I have to delegate to a Photo::hash() method. > "A lot" is subjective. Yes. I've often been helping with code bases of hundreds of thousands of lines and so it is non-issue for my subjective taste. C++ is verbose, I use it not because it is terse. Some C-like languages like Swift are way more laconic but these have their own issues. Program is put together of patterns and it is business of compiler to reduce it to minimal binary code. > short-lived containers which maybe just help eliminate duplicates from > a sequence. And maybe use std::strings as keys rather than their own > types. May be it is indeed better to use unordered_map<std::string, Photo>. I think of such idiom as "dictionary idiom" as it is rather common. That does compile when <string> is included. What you search for photo in your set of photos with? Without file name in external interface? Dummy photo? > containers (when you don't need ordering etc) mean more work. > But, at least it seems I haven't missed some nice shortcut that > everyone else uses. I have not felt like that is issue. Most searchable containers I use are sorted vectors, hash-table based containers or Boost.Intrusives. Sometimes I use my own implementation of sorted deque (as std::deque does not let to specify block sizes). The std::set or std::map I avoid because these either lose in performance or lack characteristics that I need. For example Boost.Intrusive sets are way more verbose than std::sets to handle as these do not own the elements but when used correctly then have great qualities that nothing else provides. For example it is possible to have polymorphic objects in those very efficiently, similarly to boost::poly_collection::base_collection. Consider std::set<unique_ptr<Base>> ... boost::intrusive::set just wipes floors with it in every benchmark. |
Jorgen Grahn <grahn+nntp@snipabacken.se>: Oct 11 05:50PM On Sun, 2020-10-11, Brian Wood wrote: > On Sunday, October 11, 2020 at 7:55:01 AM UTC-5, Öö Tiib wrote: ... >> }; >> } > A little indentation is helpful in my opinion: Tiib's post /was/ indented; Google must have broken it for you. /Jorgen -- // Jorgen Grahn <grahn@ Oo o. . . \X/ snipabacken.se> O o . |
Brian Wood <woodbrian77@gmail.com>: Oct 10 10:07PM -0700 On Sunday, September 30, 2018 at 11:38:32 AM UTC-5, Brian Wood wrote: > If you would indulge me, I'd like to have a little guessing > game as far as what quote I am thinking of in regards to this > milestone. Now I can't remember what the quote was, but I'm still working on the software and appreciate ideas on how to improve it. Brian Ebenezer Enterprises - Enjoying programming again. https://github.com/Ebenezer-group/onwards |
Brian Wood <woodbrian77@gmail.com>: Oct 10 09:21PM -0700 > the compilers knows what lib it has to include? > I cannot see any motivation for modules apart of (fast compilation > and hiding macros). For years I was looking forward to them, but they are so slow in coming that I don't think about them much now. Maybe in a few years it will be different. Brian Ebenezer Enterprises https://github.com/Ebenezer-group/onwards |
You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page. To unsubscribe from this group and stop receiving emails from it send an email to comp.lang.c+++unsubscribe@googlegroups.com. |
No comments:
Post a Comment