
Digest for comp.lang.c++@googlegroups.com - 25 updates in 3 topics

Sam <sam@email-scan.com>: Jun 12 08:50AM -0400

Bonita Montero writes:
 
>> use it, if it exists. Otherwise the container template will use only the
>> standard-specified interface.
 
> Extremely unlikely.
 
Whether it is unlikely or not, it's perfectly doable.
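For illustration, here is a minimal sketch of how a container template could probe for such a non-standard allocator extension with a C++20 concept and otherwise fall back to the standard-specified interface (the allocate_with_size member and the vec container are hypothetical names, not any existing API):

#include <concepts>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical extension: the allocator reports how much it really allocated.
template<typename A>
concept sized_allocator = requires(A a, std::size_t n) {
    { a.allocate_with_size(n) }
        -> std::same_as<std::pair<typename A::value_type*, std::size_t>>;
};

template<typename T, typename Alloc = std::allocator<T>>
struct vec {
    void grow(std::size_t n) {
        Alloc a;
        if constexpr (sized_allocator<Alloc>) {
            // Extension present: keep whatever extra capacity we got for free.
            auto [p, got] = a.allocate_with_size(n);
            data_ = p; cap_ = got;
        } else {
            // Standard-specified interface only.
            data_ = a.allocate(n); cap_ = n;
        }
    }
    ~vec() { if (data_) Alloc{}.deallocate(data_, cap_); }
    T* data_ = nullptr;
    std::size_t cap_ = 0;
};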
Bonita Montero <Bonita.Montero@gmail.com>: Jun 12 01:39PM +0200

> I'd started working as a consultant. I'd inadvertently used `free` to
> deallocate something POD-ish allocated with `new`, it worked, and when
> someone remarked on it I argued about the same as you do now. ...
 
That's not what I meant.
Sam <sam@email-scan.com>: Jun 12 08:52AM -0400

Bonita Montero writes:
 
> That's why you can replace malloc and change all the allocations in
> your program. Having such a special allocator that returns the size
> actually allocated would break that.
 
I just explained how this can be done. Nobody has to replace malloc. Just
because you are incapable of wrapping your brain around this concept doesn't
mean that it cannot be done.
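As a concrete (and deliberately platform-specific) sketch: on glibc, an allocator can simply ask malloc how much it really handed out, via malloc_usable_size(), without anyone replacing malloc. The allocate_with_size name is again hypothetical; macOS spells the query malloc_size() and MSVC spells it _msize():

#include <cstddef>
#include <cstdlib>
#include <new>
#include <utility>
#include <malloc.h>   // glibc-specific: malloc_usable_size()

template<typename T>
struct sized_alloc {
    using value_type = T;

    // Non-standard extension: return the pointer plus the usable element count.
    std::pair<T*, std::size_t> allocate_with_size(std::size_t n) {
        void* p = std::malloc(n * sizeof(T));
        if (!p) throw std::bad_alloc{};
        return { static_cast<T*>(p), malloc_usable_size(p) / sizeof(T) };
    }

    T* allocate(std::size_t n) { return allocate_with_size(n).first; }
    void deallocate(T* p, std::size_t) noexcept { std::free(p); }
};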
Bonita Montero <Bonita.Montero@gmail.com>: Jun 12 03:31PM +0200

>> It's not specified how I described it, but it's expected that
>> the containers _indirectly_ allocate via malloc() so that the
 
> Maybe you expect that, but not anyone who actually understands C++.
 
That's not a matter of how C++ is defined but of how it's actually
implemented. Since many people expect the whole memory allocation
to be replaceable by exchanging malloc() and free(), the major
C++ runtimes are all implemented in that way.
Sam <sam@email-scan.com>: Jun 12 06:59PM -0400

Bonita Montero writes:
 
> to have replaceable memory allocation to install a faster allocator
> like mimalloc. Therefore it's the most realistic assumption to say
> that everything in the standard library is indirectly based on malloc().
 
Once more: what "people expect" is immaterial. As long as a particular
implementation is in compliance with the C++ standard, that's the end
of it. And I'm afraid your expectation has no footing there: the C++
standard explicitly states that it does not specify whether operator new
uses malloc.
 
# 17.6.2.1 Single-object forms [new.delete.single]
#
# [[nodiscard]] void* operator new(std::size_t size);
#
# [[nodiscard]] void* operator new(std::size_t size, std::align_val_t alignment);
#
# …
#
# Whether the attempt involves a call to the C standard library functions
# malloc or aligned_alloc is unspecified.
 
The C++ standard authoritatively asserts that it's unspecified whether new
is implemented in terms of malloc. Too bad if someone's expectations were
otherwise.
 
Now, I don't actually see what that has to do with the original proposition
of std::vector::reserve()'s optimization, but since (for some odd reason)
you chose to sidetrack into your expectations of new's behavior, I have to
set the record straight:
 
The C++ standard does not require that operator new must be implemented in
terms of malloc. Feel free to "expect" otherwise. Your expectations may not
be realized, unfortunately.
 
The End.
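For completeness: what the standard does specify is that a program may replace the global allocation functions. Whether the default ones call malloc is the implementation's business. A minimal conforming replacement (using malloc here purely for brevity; it could use anything with the required semantics) looks roughly like this:

#include <cstdlib>
#include <new>

void* operator new(std::size_t size) {
    if (size == 0) size = 1;
    for (;;) {
        if (void* p = std::malloc(size)) return p;
        // Required loop: give the installed new-handler a chance, else throw.
        if (std::new_handler h = std::get_new_handler()) h();
        else throw std::bad_alloc{};
    }
}

void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }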
Bonita Montero <Bonita.Montero@gmail.com>: Jun 13 07:31AM +0200

> What "many people expect" is immaterial. ...
 
It's the most realistic assumption.
Bonita Montero <Bonita.Montero@gmail.com>: Jun 13 02:03PM +0200

> Maybe a reasonable assumption on many platforms, but an incorrect
> one, and one that can lead to hard to find bugs when it isn't true.
 
No, this doesn't imply any possible bugs.
Sam <sam@email-scan.com>: Jun 13 08:09AM -0400

Bonita Montero writes:
 
>> Once more: what "people expect" is immaterial. ...
 
> It's realistic.
 
Realistically, people are also always going to whine when they rely on
unspecified behavior and their code breaks.
 
Also realistically: nobody will care.
 
> Rest of your stupid stuff unread.
 
I believe you 100%, just like I believe that you will not read every
character of this post.
Sam <sam@email-scan.com>: Jun 13 08:11AM -0400

Bonita Montero writes:
 
>> Maybe a reasonable assumption on many platforms, but an incorrect
>> one, and one that can lead to hard to find bugs when it isn't true.
 
> No, this doesn't imply any possible bugs.
 
Ok, you go ahead and write some code that depends on it, then wait until it
breaks, inevitably, after a compiler/library upgrade, then get back to us.
 
We'll figure out whether it's a bug, or not, at that time.
Sam <sam@email-scan.com>: Jun 13 11:23AM -0400

Bonita Montero writes:
 
>> malloc. ...
 
> But that's what a lot of developers expect, and because of that standard
> libraries are implemented in that way.
 
All the voices in your head do not translate to "a lot of developers".
It's still only you.
 
 
> ... rest of your stupid stuff unread.
 
Did I call it, or what?
 
# But you will probably just not read any of this, right?
 
Damn, I'm good.
Bonita Montero <Bonita.Montero@gmail.com>: Jun 13 06:55PM +0200

> Who said that I want to replace malloc?
> Nobody's talking about replacing malloc, but replacing only operator
> new, and nothing else.
 
No, we're talking about replacing malloc() and thereby, in the end,
changing how C++ implementations allocate.
Bonita Montero <Bonita.Montero@gmail.com>: Jun 14 05:16AM +0200

>>> developers". It's still only you.
 
>> There are no good reasons to do it in a different way.
 
> Just because you can't think of any doesn't mean there isn't.
 
You have too vivid an imagination.
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Jun 15 01:09PM -0700


> auto
> for(<type> <var>: <container>)
> C++ threads (for simple threading)
 
I would also add C++ atomics and memory barriers.
 
David Brown <david.brown@hesbynett.no>: Jun 16 01:14PM +0200

On 16/06/2021 11:49, Tim Woodall wrote:
> you end up needing to understand far, far more than you would wish (or
> worse, end up guessing what code really does when you need to modify it
> and get it working by "coincidence" rather than by "understanding")
 
That can certainly be a problem - but it is a general development and
team problem rather than a language specific problem. Even if the
programming language doesn't have unusual and rarely used features (and
which language doesn't? Hands up those C experts who know what the
"static" in "void bar(int a[static 10]);" is doing?), people will find
unusual ways to express things. And it is best solved by reading each
other's code, talking together, and sharing information.
 
Paavo Helde <myfirstname@osa.pri.ee>: Jun 16 09:08PM +0300

>> use for an atomic fetch_and_add (e.g. the ARM64 LDADD instruction or intel
>> LOCK INC).
 
> I'm confused. Either std::atomic is thread safe or it isn't. Which is it?
 
Of course it is thread-safe, that's what the name 'atomic' means. Alas,
there are many definitions of "thread-safe", so one might easily get
confused.
 
In the case of atomics, "thread-safe" at least means that the value of a
particular atomic is well-defined and predictable when the atomic is
accessed from multiple threads. Whether it means anything more depends
on the memory order used and other details.
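For instance, a relaxed fetch_add is perfectly "thread-safe" in the sense that no increment is ever lost, yet it promises nothing about the ordering of surrounding accesses to other variables. A small sketch:

#include <atomic>
#include <cstdio>
#include <thread>
#include <vector>

int main() {
    std::atomic<int> counter{0};
    std::vector<std::thread> threads;
    for (int t = 0; t < 4; ++t)
        threads.emplace_back([&counter] {
            for (int i = 0; i < 100000; ++i)
                // Atomic: increments are never lost. Relaxed: no ordering
                // of *other* memory accesses is implied.
                counter.fetch_add(1, std::memory_order_relaxed);
        });
    for (auto& th : threads) th.join();
    std::printf("%d\n", counter.load());   // always 400000
}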
David Brown <david.brown@hesbynett.no>: Jun 20 09:45AM +0200

On 19/06/2021 23:33, Sam wrote:
 
> The shared_ptr itself gets moved around. This happens every time you
> pass the shared_ptr to a function by value or move it somewhere. Now you
> have to copy two pointers instead of one.
 
Please look at <https://en.cppreference.com/w/cpp/memory/shared_ptr> and
read the implementation notes. It explains things a lot better than I
have done.
"Öö Tiib" <ootiib@hot.ee>: Jun 20 05:19AM -0700

On Sunday, 20 June 2021 at 00:46:43 UTC+3, Sam wrote:
 
> And everyone lives happily ever after. Weak pointers only introduce a little
> bit of extra overhead when destroying an object with weak pointers, when
> they exist, and otherwise carry minimal costs.
 
 
Hmm ... very interesting. Where you use locks, I do an atomic load of the
strong count and then loop until the count is zero or a
compare_exchange_(strong/weak) succeeds in incrementing it by one. That can
in no way increment the strong count when it is already zero, but the weak
exchange can fail in the case where the events 1) and 2) coincide. Whether
compare_exchange_weak or compare_exchange_strong is better (or it does not
matter) depends on the platform.
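A sketch of that increment-only-if-nonzero loop (essentially what a lock-free weak_ptr::lock() has to do; the function name is mine):

#include <atomic>

// Acquire a strong reference only if the strong count is still nonzero.
// Returns false if the object has already expired.
bool try_add_strong_ref(std::atomic<long>& strong) {
    long n = strong.load(std::memory_order_relaxed);
    while (n != 0) {
        // compare_exchange_weak may fail spuriously; on failure n is
        // reloaded with the current count and we simply retry.
        if (strong.compare_exchange_weak(n, n + 1,
                                         std::memory_order_acquire,
                                         std::memory_order_relaxed))
            return true;
    }
    return false;
}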
Paavo Helde <myfirstname@osa.pri.ee>: Jun 20 09:09PM +0300

19.06.2021 16:08 Sam wrote:
 
> And if you make it atomic, you'll get thread safety. If you actually go
> ahead and try to measure the additional overhead, I'd be surprised if
> you'd be able to actually measure anything.
 
Well, I took your word and measured the overheads. I'm primarily
interested in my typical workloads, so that's the scenario I tested: all
cpu cores maxed out, a lot of smart pointer copies intermixed with
memory-heavy calculations. The program output is below (the "result" is
only calculated to ensure all code branches compute the same thing and
nothing is optimized away).
 
hardware_concurrency: 16
Async pointer : result =1056964608, total time = 13.4281 s
Atomic pointer : result =1056964608, total time = 24.774 s
std::shared_ptr : result =1056964608, total time = 25.7377 s
std::make_shared: result =1056964608, total time = 25.1681 s
 
The measured times *include the data processing*, so the actual
synchronized-pointer overhead is not 2x, but actually *many times*
that of an unsynchronized smart pointer.
 
These results suggest that I was right: any thread synchronization
(including an atomic refcount with std::memory_order_relaxed) means heavy
penalties, whereas the extra pointers or extra dynamic allocations involved
with std::shared_ptr only cost peanuts.
 
I guess one had better stop complaining about the shared_ptr design;
it appears to be pretty fine for its intended purpose (safe usage).
 
The code is below, feel free to try it out on your favorite platform. My
numbers are from MSVC++ 2019, x64 Release build.
 
------------------------------------
 
#include <memory>
#include <string>
#include <iostream>
#include <array>
#include <vector>
#include <algorithm>
#include <numeric>
#include <thread>
#include <atomic>
#include <chrono>    // std::chrono::steady_clock
#include <tuple>     // std::tie
#include <utility>   // std::pair
#include <stdexcept> // std::logic_error
 
const size_t kChunkSize = 64;
const size_t kNumPointers = 32*1024;
const size_t kNumCopies = 1024;
const int kCycles = 1;
 
class AsyncBase {
public:
AsyncBase() : refcount(0) {}
virtual ~AsyncBase() {}
void Capture() { ++refcount; }
void Release() { if (--refcount == 0) { delete this; } }
private:
int refcount;
};
 
class AtomicBase {
public:
AtomicBase() : refcount(0) {}
virtual ~AtomicBase() {}
void Capture() { refcount.fetch_add(1, std::memory_order_relaxed); }
void Release() {
if (refcount.fetch_sub(1, std::memory_order_relaxed) == 1) { delete this; }
}
private:
std::atomic<int> refcount;
};
 
class DummyBase {};
 
template<typename T>
struct SmartPtr {
public:
SmartPtr() : p(nullptr) {}
SmartPtr(T* x) { if (x) { x->Capture(); } p = x; }
SmartPtr(const SmartPtr& b) { if (b.p) { b.p->Capture(); } p = b.p; }
SmartPtr(SmartPtr&& b) noexcept : p(b.p) { b.p = nullptr; }
~SmartPtr() { if (p) { T* q = p; p = nullptr; q->Release(); } }
SmartPtr& operator=(const SmartPtr& x) { Assign(x.p); return *this; }
SmartPtr& operator=(SmartPtr&& b) noexcept { Move(b.p); return *this; }
T* operator->() { return p; }
T& operator*() { return *p; }
explicit operator bool() const { return !!p; }
 
private:
void Assign(T* x) {
if (x) {
x->Capture();
}
if (p) {
T* q = p;
p = nullptr;
q->Release();
}
p = x;
}
 
void Move(T*& x) noexcept {
if (p) {
T* q = p;
p = nullptr;
q->Release();
}
p = x;
x = nullptr;
}
 
private:
mutable T* p;
};
 
 
template<class BASE>
class Data: public BASE {
public:
Data(unsigned int seed) {
std::iota(arr.begin(), arr.end(), seed);
}
unsigned int Process(unsigned int sum) {
for (auto& x : arr) {
sum += x;
++x;
}
return sum;
}
private:
std::array<unsigned int, kChunkSize> arr;
};
 
 
 
template<class DATA, class PTR>
unsigned int ProcessSingleData(PTR p) {
// Make some copies of the pointer:
std::vector<PTR> copies(kNumCopies);
std::fill(copies.begin(), copies.end(), p);
unsigned int sum = 0;
unsigned int shift = 0;
for (auto p : copies) {
sum += p->Process(shift);
++shift;
}
return sum;
}
 
template<class DATA, class PTR, bool USEMAKESHARED>
unsigned int Work() {
 
// make some smartpointers.
std::vector<PTR> pointers(kNumPointers);
unsigned int seed = 0;
for (auto& ref : pointers) {
if constexpr (USEMAKESHARED) {
ref = std::make_shared<DATA>(seed);
} else {
ref = PTR(new DATA(seed));
}
++seed;
}
 
unsigned int result = 0;
for (int i = 0; i < kCycles; ++i) {
for (auto p : pointers) {
result += ProcessSingleData<DATA, PTR>(p);
}
}
return result;
}
 
template<class DATA, class PTR, bool USEMAKESHARED>
std::pair<unsigned int, double> TimedWork() {
auto start = std::chrono::steady_clock::now();
unsigned int result = Work<DATA, PTR, USEMAKESHARED>();
auto finish = std::chrono::steady_clock::now();
double lapse = std::chrono::duration<double>(finish - start).count();
return std::make_pair(result, lapse);
}
 
template<class DATA, class PTR, bool USEMAKESHARED=false>
std::pair<unsigned int, double> ThreadedWork(unsigned int numThreads) {
 
std::vector<unsigned int> results(numThreads);
std::vector<double> lapses(numThreads);
std::vector<std::thread> threads;
for (unsigned int i = 0; i < numThreads; ++i) {
threads.emplace_back([i, &results, &lapses]() {
unsigned int result;
double lapse;
std::tie(result, lapse) = TimedWork<DATA, PTR, USEMAKESHARED>();
results[i] = result;
lapses[i] = lapse;
});
}
for (auto& ref : threads) {
ref.join();
}
unsigned int result = results[0];
for (auto x : results) {
if (x != result) {
throw std::logic_error("checksum differs");
}
}
double totalLapse = std::accumulate(lapses.begin(), lapses.end(), 0.0);
return std::make_pair(result, totalLapse);
}
 
using AsyncData = Data<AsyncBase>;
using PAsyncData = SmartPtr<AsyncData>;
 
using AtomicData = Data<AtomicBase>;
using PAtomicData = SmartPtr<AtomicData>;
 
using SharedData = Data<DummyBase>;
using PSharedData = std::shared_ptr<SharedData>;
 
int main() {
unsigned int numThreads = std::thread::hardware_concurrency();
std::cout << "hardware_concurrency: " << numThreads << "\n";
 
unsigned int result;
double lapse;
 
std::tie(result, lapse) = ThreadedWork<AsyncData, PAsyncData>(numThreads);
std::cout << "Async pointer : result =" << result << ", total time = " << lapse << " s\n";
 
std::tie(result, lapse) = ThreadedWork<AtomicData, PAtomicData>(numThreads);
std::cout << "Atomic pointer : result =" << result << ", total time = " << lapse << " s\n";
 
std::tie(result, lapse) = ThreadedWork<SharedData, PSharedData>(numThreads);
std::cout << "std::shared_ptr : result =" << result << ", total time = " << lapse << " s\n";
 
std::tie(result, lapse) = ThreadedWork<SharedData, PSharedData, true>(numThreads);
std::cout << "std::make_shared: result =" << result << ", total time = " << lapse << " s\n";
}
"Öö Tiib" <ootiib@hot.ee>: Jun 20 11:59AM -0700

On Sunday, 20 June 2021 at 21:10:14 UTC+3, Paavo Helde wrote:
> (including std::atomic_ptr with std::memory_order_relaxed) means heavy
> penalties, whereas extra pointers or extra dynamic allocations involved
> with std::shared_ptr only cost peanuts.
 
Yes, it was the same with text processing with those CoW strings. The
atomic operations on the ref count made them sluggish. Whoever wants to
be quick should be embarrassingly parallel and share only immutable
data.
Bonita Montero <Bonita.Montero@gmail.com>: Jun 14 03:00PM +0200

Do you mean something like:
 
namespace fuck
{
void fn();
}
 
void fuck::fn()
{
}
David Brown <david.brown@hesbynett.no>: Jun 14 11:17PM +0200

On 14/06/2021 22:28, Paavo Helde wrote:
> }
 
>  .. OR ..
 
> using N::g;
 
That's a neat suggestion - but it won't work in my particular case
because "g" should be a weak ELF symbol. The reason it should be
outside the module's namespace (module in the old-fashioned sense of a
cpp file with a namespace - I am not yet using real C++ modules) is that
another part of the code could override it by defining its own function
with that name. I don't want the overriding code to have to define the
overriding function inside the module's namespace.
 
At the moment, I am declaring "g" to be extern "C", which is a bit of a
sledgehammer solution but works.
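A sketch of that arrangement with the GCC/Clang-specific weak attribute (identifiers made up): the module provides a weak default definition in the global namespace, and any other translation unit overrides it simply by defining a strong symbol with the same name.

// module header (hypothetical name): overridable hook with C linkage
extern "C" void module_irq_handler();

// module implementation: weak default, used only if no other TU
// provides a strong definition. __attribute__((weak)) is a GCC/Clang
// extension, not standard C++.
extern "C" __attribute__((weak)) void module_irq_handler() {
    // default (empty) behaviour
}

// elsewhere, in user code:
// extern "C" void module_irq_handler() { /* overriding behaviour */ }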
David Brown <david.brown@hesbynett.no>: Jun 14 11:07PM +0200

On 14/06/2021 20:28, Öö Tiib wrote:
>> model. If there is a syntax unknown to me that could be used declare
>> "g" as being outside the namespace, it would be neater in my code structure.
 
> Probably there is no such way.
 
I suspect that is the case. But there is always the hope that I am
missing something!
 
> I only have functions with internal linkage in the global namespace, as
> code bases tend to be huge lately. Why do you need it to be in the
> global namespace?
 
It is in connection with interrupts and weak linkage (again, this isn't
the only way to handle this, but it would be convenient for my code
structure).
 
 
> int N::f(int x) { return x; }
 
> int g(int x) { return x; }
 
> int N::h(int x) { return x; }
 
That would be possible, I suppose, but there are quite a few internal
functions and a fair amount of internal data within the namespace.
These could themselves be put inside a singleton class, but a namespace
does that job just as well.
Nikolaj Lazic <nlazicBEZ_OVOGA@mudrac.ffzg.hr>: Jun 15 12:51AM

> }
 
> Does.
 
> Remember you can close and reopen namespaces unlike classes.
 
That is exactly what the OP did, and he is now asking for a way to avoid it.
Richard Damon <Richard@Damon-Family.org>: Jun 14 09:01PM -0400

On 6/14/21 8:51 PM, Nikolaj Lazic wrote:
 
>> Does.
 
>> Remember you can close and reopen namespaces unlike classes.
 
> That was the exact thing OP did and now asks for the way to avoid it.
 
But if it IS the right way to do it, it is the right way to do it.
 
The other option, which I find messier but might work, would be
(qualified definitions require prior declarations inside the namespace):
 
namespace n {
void hf1();
void nf3();
}
 
void n::hf1() {}
 
void gf2() {}
 
void n::nf3() {}
Bonita Montero <Bonita.Montero@gmail.com>: Jun 14 02:58PM +0200

>> }
 
>> Is the above code the only way to call the lambda ?
 
> You made a templated lambda - you have to give the type /somehow/.
 
No, I just want to return a value of that type.
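For reference, a C++20 templated lambda without a deducible parameter can only be given its type argument through an explicit operator() call; a sketch:

#include <cstdio>

int main() {
    // Templated lambda returning a value of the requested type.
    auto make_default = []<typename T>() { return T{}; };

    // The type cannot be deduced, so it must be spelled out explicitly
    // (in dependent contexts the .template keyword would be required):
    int    i = make_default.operator()<int>();
    double d = make_default.operator()<double>();
    std::printf("%d %f\n", i, d);
}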