Sunday, October 11, 2020

Digest for comp.lang.c++@googlegroups.com - 11 updates in 3 topics

Jorgen Grahn <grahn+nntp@snipabacken.se>: Oct 11 08:37AM

It strikes me I still don't know the best way to do this.
 
When I write small classes (e.g. a class Photo for the file name of a
photo) I tend to add an operator< because:
 
- It's easy.
- It's natural; it makes sense to show there's an ordering.
- I can then use Photo as a map or set key whenever I want.
 
However, when I want to use Photo as a key in unordered_map or
unordered_set (which is at least as reasonable as using it in a map or
set) what I end up doing is:
 
- Add a size_t Photo::hash() const member, implemented in terms
of funny things like std::hash<std::string> {} ().
- Specialize the std::hash class for Photo, and arrange for its
operator() to call Photo::hash().
 
That's a lot of boilerplate, compared to operator<. Is there a better
way?
 
I know I can do things to the std::unordered_set itself, but would
prefer not to (and I doubt there would be less boilerplate, even for
a single instantiation).
 
My context is C++11 or C++14; a C++17 or later solution would be
interesting, but of no immediate use to me.
 
/Jorgen
 
--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
Bonita Montero <Bonita.Montero@gmail.com>: Oct 11 01:20PM +0200

> operator() to call Photo::hash().
> That's a lot of boilerplate, compared to operator<. Is there a better
> way?
 
1. A unordered map is usually faster.
2. You can define the hashing-class as a lambda and use
decltype(hc) as the teplate-parameter.
"Öö Tiib" <ootiib@hot.ee>: Oct 11 05:54AM -0700

On Sunday, 11 October 2020 11:37:23 UTC+3, Jorgen Grahn wrote:
> operator() to call Photo::hash().
 
> That's a lot of boilerplate, compared to operator<. Is there a better
> way?
 
If you always hash photos in same way then it is perhaps fine like that:
 
namespace std {
template <>
struct hash<Photo> {
size_t operator()(const Photo& p) const noexcept {
return std::hash<std::string>{}(p.name());
}
};
}
 
8 lines is not lot. The std::equal_to uses Photo::operator== already
so (unless your case needs some potentially confusing differences)
that isn't needed to specialize.
 
> I know I can do things to the std::unordered_set itself, but would
> prefer not to (and I doubt there would be less boilerplate, even for
> a single instantiation).
 
The plain std::unordered_set<Photo> is fine when std::hash<Photo>
and std::equal_to<Photo> compile.
 
> My context is C++11 or C++14; a C++17 or later solution would be
> interesting, but of no immediate use to me.
 
What I wrote was AFAIK so since C++11 ... C++17 did not touch it.
C++17 added std::pmr::unordered_set but that is of no concern in
current thread.
Brian Wood <woodbrian77@gmail.com>: Oct 11 07:38AM -0700

On Sunday, October 11, 2020 at 7:55:01 AM UTC-5, Öö Tiib wrote:
> }
> };
> }
 
A little indentation is helpful in my opinion:
 
namespace std {
template <>
struct hash<Photo> {
size_t operator()(Photo const& p) const noexcept {
return ::std::hash<::std::string>{}(p.name());
}
};
}
 
Not sure if I'd use noexcept there.
 
 
Brian
Ebenezer Enterprises
https://github.com/Ebenezer-group/onwards
Jorgen Grahn <grahn+nntp@snipabacken.se>: Oct 11 02:51PM

On Sun, 2020-10-11, Öö Tiib wrote:
> };
> }
 
> 8 lines is not lot.
 
It's IMO a lot, compared to the operator< that std::map and friends
need. And (like I hinted at, but didn't explain) I can't hash this
class via its public interface (it doesn't have a name() method) so
I have to delegate to a Photo::hash() method.
 
"A lot" is subjective. Perhaps it's better to say I think it makes
people prefer std::set to std::unordered set, especially for
short-lived containers which maybe just help eliminate duplicates from
a sequence. And maybe use std::strings as keys rather than their own
types.
 
As you can tell, it's a bit frustrating to me that the better
containers (when you don't need ordering etc) mean more work.
 
But, at least it seems I haven't missed some nice shortcut that
everyone else uses.
 
/Jorgen
 
--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
Jorgen Grahn <grahn+nntp@snipabacken.se>: Oct 11 03:02PM

On Sun, 2020-10-11, Bonita Montero wrote:
>> That's a lot of boilerplate, compared to operator<. Is there a better
>> way?
 
> 1. A unordered map is usually faster.
 
Yes, that's the only reason I bother with it. (Another reason would be
if my type was easy to hash but hard to compare, but I don't think
that has happened to me yet.)
 
> 2. You can define the hashing-class as a lambda and use
> decltype(hc) as the teplate-parameter.
 
That /did/ occur to me this morning. It would make it possible to
write
 
std::unordered_set<Photo, decltype([] (const Photo& p) {...})> my_set;
 
but it seems perverse to (seemingly) create a lambda function just for
its type. And more importantly, I want the "hashability" to be a
property of class Photo so I can use it anywhere. I may e.g. want to
build on it when I make MugShot and std::pair<Photo, Photo> hashable.
 
/Jorgen
 
--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
Manfred <noname@add.invalid>: Oct 11 05:18PM +0200

On 10/11/2020 5:02 PM, Jorgen Grahn wrote:
> its type. And more importantly, I want the "hashability" to be a
> property of class Photo so I can use it anywhere. I may e.g. want to
> build on it when I make MugShot and std::pair<Photo, Photo> hashable.
 
From cppreference.com:
 
Named requirements: UnorderedAssociativeContainer (std::set belongs to
this set ;) )
 
"Unordered associative containers are parametrized by ... Hash, a Hash
function object which acts as hash function on Key ..."
https://en.cppreference.com/w/cpp/named_req/UnorderedAssociativeContainer
 
named requirements: Hash
"A Hash is a /function object/ ..."
https://en.cppreference.com/w/cpp/named_req/Hash
 
So, instead of decltype(...) you may use
std::function<size_t(const Photo&)>
 
"Öö Tiib" <ootiib@hot.ee>: Oct 11 10:43AM -0700

On Sunday, 11 October 2020 17:51:34 UTC+3, Jorgen Grahn wrote:
> class via its public interface (it doesn't have a name() method) so
> I have to delegate to a Photo::hash() method.
 
> "A lot" is subjective.
 
Yes. I've often been helping with code bases of hundreds of
thousands of lines and so it is non-issue for my subjective taste.
C++ is verbose, I use it not because it is terse. Some C-like
languages like Swift are way more laconic but these have their
own issues.
Program is put together of patterns and it is business of compiler
to reduce it to minimal binary code.
 
> short-lived containers which maybe just help eliminate duplicates from
> a sequence. And maybe use std::strings as keys rather than their own
> types.
 
May be it is indeed better to use unordered_map<std::string, Photo>.
I think of such idiom as "dictionary idiom" as it is rather common.
That does compile when <string> is included. What you search
for photo in your set of photos with? Without file name in
external interface? Dummy photo?
 
> containers (when you don't need ordering etc) mean more work.
 
> But, at least it seems I haven't missed some nice shortcut that
> everyone else uses.
 
I have not felt like that is issue. Most searchable containers
I use are sorted vectors, hash-table based containers or
Boost.Intrusives. Sometimes I use my own implementation of
sorted deque (as std::deque does not let to specify block sizes).
The std::set or std::map I avoid because these either lose in
performance or lack characteristics that I need.
 
For example Boost.Intrusive sets are way more verbose than
std::sets to handle as these do not own the elements but when
used correctly then have great qualities that nothing else provides.
For example it is possible to have polymorphic objects in those
very efficiently, similarly to boost::poly_collection::base_collection.
Consider std::set<unique_ptr<Base>> ... boost::intrusive::set just
wipes floors with it in every benchmark.
Jorgen Grahn <grahn+nntp@snipabacken.se>: Oct 11 05:50PM

On Sun, 2020-10-11, Brian Wood wrote:
> On Sunday, October 11, 2020 at 7:55:01 AM UTC-5, Öö Tiib wrote:
...
>> };
>> }
 
> A little indentation is helpful in my opinion:
 
Tiib's post /was/ indented; Google must have broken it for you.
 
/Jorgen
 
--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .
Brian Wood <woodbrian77@gmail.com>: Oct 10 10:07PM -0700

On Sunday, September 30, 2018 at 11:38:32 AM UTC-5, Brian Wood wrote:
 
> If you would indulge me, I'd like to have a little guessing
> game as far as what quote I am thinking of in regards to this
> milestone.
 
Now I can't remember what the quote was, but I'm still
working on the software and appreciate ideas on how
to improve it.
 
 
Brian
Ebenezer Enterprises - Enjoying programming again.
https://github.com/Ebenezer-group/onwards
Brian Wood <woodbrian77@gmail.com>: Oct 10 09:21PM -0700

> the compilers knows what lib it has to include?
 
> I cannot see any motivation for modules apart of (fast compilation
> and hiding macros).
 
For years I was looking forward to them, but they are
so slow in coming that I don't think about them much
now. Maybe in a few years it will be different.

 
Brian
Ebenezer Enterprises
https://github.com/Ebenezer-group/onwards
You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page.
To unsubscribe from this group and stop receiving emails from it send an email to comp.lang.c+++unsubscribe@googlegroups.com.

No comments: