Wednesday, January 31, 2024

Digest for comp.lang.c++@googlegroups.com - 4 updates in 3 topics

Bonita Montero <Bonita.Montero@gmail.com>: Jan 31 04:45PM +0100

Am 31.01.2024 um 00:19 schrieb Chris M. Thomasson:
 
> MSVC should define _MSC_VER, not exactly sure why clang-cl would be in
> the mix. Probably due to:
 
clang-cl does define _MSC_VER; clang++ for Windows does not.
clang-cl doesn't define __clang__, but clang++ does.
And clang-cl does define __llvm__.
 
> https://clang.llvm.org/docs/MSVCCompatibility.html
 
Nothing of what I said.
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Jan 31 12:32PM -0800

On 1/31/2024 7:45 AM, Bonita Montero wrote:
> But clang-cl does define __llvm__.
 
>> https://clang.llvm.org/docs/MSVCCompatibility.html
 
> Nothing of what I said.
 
I think it is relevant. Also, what's up with all of those pragmas anyway?
Vir Campestris <vir.campestris@invalid.invalid>: Jan 31 03:56PM

On 28/01/2024 10:19, Bonita Montero wrote:
> I'd first have guessed that the prefetchers between the memory-levels
> are as effective for both directions. So I'd like to see some results
> from you.
 
On my Linux box with an AMD Ryzen 5 3400G it's about 11% slower for the
second number. But that's a very rough figure; the machine is doing
something else right now, and that's the average of several runs, with
the ratio varying between 97% and 130%.
 
Andy
MarioCCCP <NoliMihiFrangereMentulam@libero.it>: Jan 31 01:17AM +0100

On 22/01/24 12:16, Malcolm McLean wrote:
> modules. One of which will be the graphics system, which may
> well have requirements beyond simple points in space, but
> will include such a requirement.
 
Looking at the variety of image formats, I tend to think
that the coordinate system and RAM representation are still
the least of the problems, and that colour-space
representation (bit depths, ordering, endianness, transparency
support) could be even worse.
Also, at some point, every library has to interact with the
OS to load/unload images into the display driver, which is
at best POSIX-compliant, but still depends a lot on the OS
internals.
 
Graphics that won't show up on screen is not very appealing
or useful.
Here a lot of graphics software had to be rewritten when
migrating from X to Wayland.
 
So the RAM representation is really the least hard part. The
worst is how to load/save a variety of formats to/from disk,
and how to display the results on screen; how to move large
blocks of pixels while avoiding flickering, etc.
 
Since external libraries handle all this, it's not that
difficult to also have one's own RAM representation of pixels,
vectors and geometry.
I think that creating a standard for this won't solve the
other main problems.
 
 
--
1) Resistere, resistere, resistere.
2) Se tutti pagano le tasse, le tasse le pagano tutti
MarioCPPP
You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page.
To unsubscribe from this group and stop receiving emails from it send an email to comp.lang.c+++unsubscribe@googlegroups.com.

Tuesday, January 30, 2024

Digest for comp.lang.c++@googlegroups.com - 3 updates in 2 topics

Bonita Montero <Bonita.Montero@gmail.com>: Jan 30 11:44AM +0100

Am 29.01.2024 um 23:45 schrieb Chris M. Thomasson:
> ^^^^^^^^^^^^^
 
> Why use _WIN32 here to disable all of those important warnings?
> _MSC_VER instead?
 
I could change that, but clang-cl is the only compiler that recognizes
_MSC_VER and doesn't complain about the #pragmas.
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Jan 30 03:19PM -0800

On 1/30/2024 2:44 AM, Bonita Montero wrote:
>> _MSC_VER instead?
 
> I could change that, but clang-cl is the only compiler that recognizes
> _MSC_VER and doesn't complain about the #pragmas.
 
MSVC should define _MSC_VER, not exactly sure why clang-cl would be in
the mix. Probably due to:
 
https://clang.llvm.org/docs/MSVCCompatibility.html
 
Back in the day I used several compilers on Windows. Humm... Anyway,
what's up with all of those pragmas anyway? ;^)
Bonita Montero <Bonita.Montero@gmail.com>: Jan 30 11:47AM +0100

Am 29.01.2024 um 23:12 schrieb Chris M. Thomasson:
>> one in the cacheline doesn't matter since physical offset zero would
>> then be occupied by logical offset 63.
 
> You don't want to straddle any cache lines. ...
 
I'm testing all 64 offsets, and for my measurement it doesn't matter if
the beginning of the block is at offset zero inside a cache line, since
the results show equal access times for all offsets.
If there were different results, it might have made sense to use proper
alignment.

Monday, January 29, 2024

Digest for comp.lang.c++@googlegroups.com - 4 updates in 3 topics

"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Jan 29 02:45PM -0800

On 1/27/2024 1:05 PM, Chris M. Thomasson wrote:
>> #include "thread_pool.h"
>> #include "invoke_on_destruct.h"
 
>> #if defined(_WIN32)
^^^^^^^^^^^^^
 
Why use _WIN32 here to disable all of those important warnings?
 
_MSC_VER instead?
 
Also, why disable all of those warnings in MSVC?
 
 
Bonita Montero <Bonita.Montero@gmail.com>: Jan 29 09:56AM +0100

Am 28.01.2024 um 20:18 schrieb Chris M. Thomasson:
 
> Try padding and aligning the blocks. iirc, std::vector works with
> alignas. Actually, it's pretty nice.
 
I'm testing all 64 offsets. If logical offset zero becomes physical
offset one in the cache line, it doesn't matter, since physical offset
zero would then be occupied by logical offset 63.
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Jan 29 02:12PM -0800

On 1/29/2024 12:56 AM, Bonita Montero wrote:
 
> I'm testing all 64 offsets. If logical offset zero becomes physical
> offset one in the cache line, it doesn't matter, since physical offset
> zero would then be occupied by logical offset 63.
 
You don't want to straddle any cache lines. Properly aligning and
padding the blocks gets around that...
Vir Campestris <vir.campestris@invalid.invalid>: Jan 29 09:31PM

I retired last year, and I haven't really written any code since. This
has turned out to be quite a fun little thing of a type I haven't had
time for for YEARS. And oddly I still don't seem to have enough time for
it... It's the garden, and the kid's gardens, and my mum's garden, and
all those holidays :)
 
But some optimisations. You'll remember in Bonita's first version the
bitmap was initialised to 0xaaaa, because it's a waste of time doing the
sieve for 2.
 
I pointed out that we don't need to even store the even numbers.
 
But there's more.
 
If you look at the bitmap when you've sieved for 2 you see
 
12 34 56 78
11 10 10 10...
which is a repeat of 2 after an initialisation word. That's the aaaa.
 
You can do the same with 3
 
123 456 789
111 110 110 110
 
except this time the repeat is 3. And annoyingly that doesn't map well
down onto a byte-based architecture. You end up with an initial word,
then a 3-word repeat. (If your word were 24 or 36 bits it would only be
1 word, but I haven't seen that since the 1970s.)
 
In hex, with lowest byte first, that is
fd b6 6d db b6 6d db
 
(That BTW is the same if you only store the odd numbers - a 3-word repeat.)
 
So rather than start off with your bitmap all set to 1s, you can set it
to this repeating pattern. That replaces all the ANDs for all the
multiples of three with a memory fill with far fewer accesses.
 
You can do the same with 5:
 
12345 6789a bcdef
11111 11110 11110 11110
 
You can then AND your pattern for 3 with the one for 5, and get one with
a repeat length of 15, and set that into your bitmap. You've now
replaced about a fifth of all your AND operations with a flood fill.
 
This can carry on - for a while. Only a short while. You _can_ make a
sequence for lots of primes. But it gets quite long, quite quickly. For
up to 23, and not storing the evens, it's over 1E8 words long!
 
I was implementing a version of that when something else occurred to me.
You can sacrifice speed for store size if you're prepared to do an
integer divide for every prime lookup.
 
Group our numbers into 6s like this
 
012345
6789ab
cdefgh (YKWIM)
ijklmn
 
Mask out the ones divisible by 2 or 3 and you see
 
0123-5
-7---b
-d---h
-i---n
 
Except for the first "page" the pattern is identical. Your algorithm can be:
Divide your candidate by 6.
If the result is zero look up in the page zero table 0123-5
If it is non-zero index into a translation table like this
010002
If the translation is zero the number is not prime.
If it is not zero it is the index of the bit for this page which tells
me if the number is prime. And there need only be 2 bits for 6 numbers.
 
(6 is the product of the first primes 2 and 3)
 
It's still worth masking out the even numbers - they don't need the
divide, a simple AND detects them - and I suspect it's worth using a
much larger number than 6. 3*5*7*11*13 is 15,015, which seems as though
it might be a convenient size. And only about a third of the numbers
aren't yet known to be composite (5,760 of the 15,015, to be exact).
 
I might play with that idea some time.
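The divide-and-look-up scheme described above can be expressed roughly like this (a hedged sketch, not Andy's implementation: the type name is invented, and a naive trial-division fill stands in for a real sieve just to populate the two bits per page):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// For n >= 6, n can only be prime when n % 6 is 1 or 5, so each "page"
// of 6 numbers needs just two stored bits.
struct Mod6Primes
{
    std::vector<bool> bits;   // bits[page*2] covers 6k+1, bits[page*2+1] covers 6k+5

    explicit Mod6Primes(std::uint64_t limit)
    {
        // Naive trial division, only to fill the table for the demo.
        auto is_prime = [](std::uint64_t n) {
            if (n < 2) return false;
            for (std::uint64_t d = 2; d * d <= n; ++d)
                if (n % d == 0) return false;
            return true;
        };
        std::uint64_t pages = limit / 6 + 1;
        bits.resize(pages * 2);
        for (std::uint64_t k = 0; k < pages; ++k) {
            bits[k * 2]     = is_prime(6 * k + 1);
            bits[k * 2 + 1] = is_prime(6 * k + 5);
        }
    }

    bool test(std::uint64_t n) const
    {
        // The "010002" translation table from the post: -1 means the
        // remainder is divisible by 2 or 3, otherwise it is the bit slot.
        static const int slot[6] = { -1, 0, -1, -1, -1, 1 };
        if (n < 6)                            // page zero is special
            return n == 2 || n == 3 || n == 5;
        int s = slot[n % 6];
        if (s < 0) return false;              // divisible by 2 or 3
        return bits[(n / 6) * 2 + s];
    }
};
```
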
 
But a note for the group of course - optimising this to the max has
nothing whatever to do with C++. The only C++ optimising I've found
myself doing is using raw pointers, not vector's operator[] (certainly
not the at function). And also I found myself using emplace_back a lot.
It's a PITA because you can only emplace_back a single item, and it is slow.
 
Andy

Sunday, January 28, 2024

Digest for comp.lang.c++@googlegroups.com - 5 updates in 2 topics

Bonita Montero <Bonita.Montero@gmail.com>: Jan 28 11:19AM +0100

With my thread pool there's some code that scans a deque in reverse
order with find_if and reverse iterators to check if there's a "this"
pointer inside the deque. I could also have scanned the deque in forward
order, but it's more likely to find a fitting element when searching
from the back first.
A deque usually consists of a number of linear parts in memory. This
led me to the question of whether scanning memory is faster forward or
backward. I tried to test this with the below program:
 
#include <iostream>
#include <vector>
#include <atomic>
#include <chrono>

using namespace std;
using namespace chrono;

atomic_char aSum;

int main()
{
    constexpr size_t GB = 1ull << 30;
    vector<char> vc( GB );
    auto sum = []( auto begin, auto end, ptrdiff_t step )
    {
        auto start = high_resolution_clock::now();
        char sum = 0;
        for( auto p = begin; end - p >= step; sum += *p, p += step );
        ::aSum.store( sum, memory_order_relaxed );
        cout << duration_cast<nanoseconds>( high_resolution_clock::now()
            - start ).count() / 1.0e6 << "ms" << endl;
    };
    constexpr size_t STEP = 100;
    sum( vc.begin(), vc.end(), STEP );
    sum( vc.rbegin(), vc.rend(), STEP );
}
 
On my Windows 7950X Zen4 computer scanning memory in both directions
has the same speed. On my Linux 3990X Zen2 computer scanning forward
is 22% faster. On my small Linux PC, an HP EliteDesk Mini PC with a
Skylake Pentium G4400, scanning memory forward is about 38% faster.
I'd first have guessed that the prefetchers between the memory-levels
are as effective for both directions. So I'd like to see some results
from you.
Marcel Mueller <news.5.maazl@spamgourmet.org>: Jan 28 11:32AM +0100

Am 28.01.24 um 11:19 schrieb Bonita Montero:
 
> I'd first have guessed that the prefetchers between the memory-levels
> are as effective for both directions. So I'd like to see some results
> from you.
 
Reverse memory access is typically slower simply because the last data
of a cache line (after a cache miss) arrives last. If you read forward,
processing continues as soon as the first few bytes of the cache line
have arrived; the further data is read in parallel.
 
But details depend on many other factors. First of all the placement of
the memory chunks and the used prefetching technique (if any).
 
 
Marcel
Bonita Montero <Bonita.Montero@gmail.com>: Jan 28 02:07PM +0100

Am 28.01.2024 um 11:32 schrieb Marcel Mueller:
 
> Reverse memory access is typically slower simply because the
> last data of a cache line (after a cache miss) arrives at last.
 
I tested this, and for all offsets within a cache line I get the
same timing on all three of my computers:
 
#include <iostream>
#include <vector>
#include <chrono>
#include <atomic>
 
using namespace std;
using namespace chrono;
 
#if defined(__cpp_lib_hardware_interference_size)
constexpr size_t CL_SIZE = hardware_constructive_interference_size;
#else
constexpr size_t CL_SIZE = 64;
#endif

Saturday, January 27, 2024

Digest for comp.lang.c++@googlegroups.com - 6 updates in 2 topics

"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Jan 26 05:15PM -0800

On 1/25/2024 8:08 PM, Bonita Montero wrote:
>> [...]
 
>> Just make sure to take the time to model it in a race detector.
 
> Idiot ...
 
Sigh. I don't have the time to look over your code and find any
potential issues right now. I will wait for one of your infamous
corrections instead. At least if you said here are some test units and
they pass, well, that would be a good sign, right? :^)
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Jan 26 07:23PM -0800

On 1/25/2024 8:08 PM, Bonita Montero wrote:
>> [...]
 
>> Just make sure to take the time to model it in a race detector.
 
> Idiot ...
 
Don't be ashamed of creating a test unit. If it finds any errors, just
correct them, right? Notice how I formulated my xchg algorithm in a test
unit first!
 
https://groups.google.com/g/comp.lang.c++/c/Skv1PoQsUZo/m/bZoTXWDkAAAJ
 
No shame in that! Right? :^)
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Jan 26 07:24PM -0800

On 1/26/2024 7:23 PM, Chris M. Thomasson wrote:
> unit first!
 
> https://groups.google.com/g/comp.lang.c++/c/Skv1PoQsUZo/m/bZoTXWDkAAAJ
 
> No shame in that! Right? :^)
 
Give it a go? https://github.com/dvyukov/relacy
Bonita Montero <Bonita.Montero@gmail.com>: Jan 27 09:38AM +0100

Am 25.01.2024 um 20:31 schrieb Chris M. Thomasson:
 
>> This is the implementation
> [...]
 
> Just make sure to take the time to model it in a race detector.
 
The synchronization part is trivial.
It's the state the synchronization manages that is complex.
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Jan 27 01:05PM -0800

On 1/25/2024 9:25 AM, Bonita Montero wrote:
> #include <functional>
> #include <chrono>
 
> struct thread_pool
 
[...]
 
>     #pragma clang diagnostic ignored "-Wparentheses"
>     #pragma clang diagnostic ignored "-Wunqualified-std-cast-call"
>

Friday, January 26, 2024

Digest for comp.lang.c++@googlegroups.com - 1 update in 1 topic

Bonita Montero <Bonita.Montero@gmail.com>: Jan 26 05:08AM +0100

Am 25.01.2024 um 20:31 schrieb Chris M. Thomasson:
 
>> This is the implementation
> [...]
 
> Just make sure to take the time to model it in a race detector.
 
Idiot ...

Thursday, January 25, 2024

Digest for comp.lang.c++@googlegroups.com - 2 updates in 1 topic

Bonita Montero <Bonita.Montero@gmail.com>: Jan 25 06:25PM +0100

Once I wrote a thread pool that has an upper limit on the number of
threads and a timeout after which idle threads terminate themselves. If
you have something userspace CPU-bound, you'd specify the number of
hardware threads as the upper limit; if you have many threads doing I/O,
you may go far beyond that, since the hardware threads aren't fully
occupied anyway.
The problem with my initial thread pool class was that there could be
a large number of idle threads which could be used by other pools.
So I wrote a thread pool class where each pool has an upper limit on
the number of executing threads and there are no idle threads within
each pool. Instead the threads go idle in a global singleton pool
and attach to each pool that needs a new thread, thereby minimizing
the total number of threads.
 
This is the implementation
 
// header

#pragma once
#include <thread>
#include <mutex>
#include <condition_variable>
#include <deque>
#include <functional>
#include <chrono>
#include <cstdint> // uint64_t
#include <memory>  // std::shared_ptr
#include <utility> // std::pair

struct thread_pool
{
    using void_fn = std::function<void ()>;
    thread_pool( size_t maxThreads = 0 );
    thread_pool( thread_pool const & ) = delete;
    void operator =( thread_pool const & ) = delete;
    ~thread_pool();
    uint64_t enqueue_task( void_fn &&task );
    void_fn cancel( uint64_t queueId );
    void wait_idle();
    size_t max_threads();
    size_t resize( size_t maxThreads );
    bool clear_queue();
    void_fn idle_callback( void_fn &&fn = {} );
    std::pair<size_t, size_t> processing();
    static std::chrono::milliseconds timeout(
        std::chrono::milliseconds timeout );
private:
    struct idle_node
    {
        idle_node *next;
        bool notify;
    };
    using queue_item = std::pair<uint64_t, void_fn>;
    using task_queue_t = std::deque<queue_item>;
    bool m_quit;
    size_t
        m_maxThreads,
        m_nThreadsExecuting;
    uint64_t
        m_lastIdleQueueId,
        m_nextQueueId;
    task_queue_t m_queue;
    std::condition_variable m_idleCv;
    std::shared_ptr<void_fn> m_idleCallback;
    idle_node *m_idleList;
    inline static struct global_t
    {
        std::mutex m_mtx;
        std::chrono::milliseconds m_timeout = std::chrono::seconds( 1 );
        std::condition_variable
            m_cv,
            m_quitCv;
        bool m_quit;
        size_t
            m_nThreads,
            m_nThreadsActive;
        std::deque<thread_pool *> m_initiate;
        void theThread();
        global_t();
        ~global_t();
    } global;
    void processIdle( std::unique_lock<std::mutex> &lock );
    std::unique_lock<std::mutex> waitIdle();
};
 
// translation unit

#include <cassert>
#include "thread_pool.h"
#include "invoke_on_destruct.h"

#if defined(_WIN32)
#pragma warning(disable: 26110) // Caller failing to hold lock 'lock' before calling function 'func'.
#pragma warning(disable: 26111) // Caller failing to release lock 'lock' before calling function 'func'.
#pragma warning(disable: 26115) // Failing to release lock 'lock' in function 'func'.
#pragma warning(disable: 26117) // Releasing unheld lock 'lock' in function 'func'.
#pragma warning(disable: 26800) // Use of a moved from object: 'object'.

Monday, January 22, 2024

Digest for comp.lang.c++@googlegroups.com - 7 updates in 1 topic

immibis <news@immibis.com>: Jan 22 01:22AM +0100

On 1/19/24 19:17, Malcolm McLean wrote:
> be the same structures or incompatible structures.
> But a simple standardisation would mean the end of pointless editing of
> code just to conform to whatever the host program has chosen.
 
And what should the data type of the coefficients of the vector be?
Why not also have matrices? What is the maximum dimension
supported? Are homogeneous coordinates a built-in feature? No, leave the
graphics stuff to a graphics team.
Malcolm McLean <malcolm.arthur.mclean@gmail.com>: Jan 22 11:16AM

On 22/01/2024 00:22, immibis wrote:
> what should? Why not also have matrices? What is the maximum dimension
> supported? Are homogeneous coordinates a built-in feature? No, leave the
> graphics stuff to a graphics team.
It should take a template, so any type can be used for the coefficients.
Unless you have some weird and wonderful ideas, it will of course be
scalar.
I'd recommend a 2D with x and y and a 3D with x, y and z. Humanity is
not going to be elevated to a higher dimension any time soon. No
homogeneous co-ordinates. No angle/magnitude notation. No need for
matrices, because we already have a natural representation of these,
since C++ supports two-dimensional fixed-size arrays.
Needing to store points in 2D or 3D space is a common requirement, and
code needs to communicate with other modules. One of which will be the
graphics system, which may well have requirements beyond simple points
in space, but will include such a requirement.
--
Check out Basic Algorithms and my other books:
https://www.lulu.com/spotlight/bgy1mm
Malcolm McLean <malcolm.arthur.mclean@gmail.com>: Jan 22 11:22AM

On 21/01/2024 04:06, Kaz Kylheku wrote:
 
> And that's just
 
> [ c -d ] [ a ]   [ ca - db ]
> [ d  c ] [ b ] = [ da + cb ]
 
Yes I know. I did complex numbers at high school.
 
But whilst you could use the Argand plane as your graphics surface and
thus represent all points as complex numbers, I've never actually seen
anyone do so, and the axes are always given different labels. Except of
course in Mandelbrots or other programs concerned with complex numbers
themselves.
--
Check out Basic Algorithms and my other books:
https://www.lulu.com/spotlight/bgy1mm
"Fred. Zwarts" <F.Zwarts@HetNet.nl>: Jan 22 12:34PM +0100

Op 22.jan.2024 om 12:16 schreef Malcolm McLean:
> code needs to communicate with other modules. One of which will be the
> graphics system, which may well have requirements beyond simple points
> in space, but will include such a requirement.
 
According to Einstein, humanity already lives in a four-dimensional
space; time is the fourth dimension.
There are many problems in physics and other fields with even more than
4 dimensions, so it would be short-sighted to limit the library to 3
dimensions.
In addition, one could ask how far the standard library must go. What
operations must be supported? Calculating the length of a vector,
allowing non-Euclidean spaces?
Malcolm McLean <malcolm.arthur.mclean@gmail.com>: Jan 22 12:31PM

On 22/01/2024 11:34, Fred. Zwarts wrote:
> In addition one could ask how far the standard library must go. What
> operations must be supported? Calculate the length of a vector, allowing
> non-Euclidian spaces?
No-one is saying that you can't devise your own structures if you want
to write programs to solve problems in general relativity. The idea is
to have a common standard for the common requirement of representing
points and vectors in 2D and 3D spaces, so that routines written in C++
can communicate with each other without the need for adapter code or
rewriting.
However, having decided on a representation for points, there is also a
very strong case for a standard library of basic operations on those
points, such as taking the length of a vector. But probably not
non-Euclidean spaces. Again, some people will want to write software
that operates in Hilbert space or another non-Euclidean space, but it's
likely to be specialised, and so you can't expect much support from the
standard library.
--
Check out Basic Algorithms and my other books:
https://www.lulu.com/spotlight/bgy1mm
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Jan 22 12:22PM -0800

On 1/22/2024 3:22 AM, Malcolm McLean wrote:
> anyone do so, and the axes are always given different labels. Except of
> course in Mandelbrots or other programs concerned with complex numbers
> themselves.
 
Usually, for a vector, say a 2-ary (x, y), x is the horizontal axis and
y is the vertical axis. This matches the complex number x + yi:
 
+y
|
-x--0--+x
|
-y
 
x is real, y is imaginary. :^)
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Jan 22 12:24PM -0800

On 1/22/2024 3:16 AM, Malcolm McLean wrote:
> code needs to communicate with other modules. One of which will be the
> graphics system, which may well have requirements beyond simple points
> in space, but will include such a requirement.
 
And 4-ary with (x, y, z, w)
 
Again I am quite fond of the GLM library. It's just nice to me.

Saturday, January 20, 2024

Digest for comp.lang.c++@googlegroups.com - 3 updates in 2 topics

wij <wyniijj5@gmail.com>: Jan 21 06:22AM +0800

On Tue, 2024-01-16 at 21:29 +0000, bubu wrote:
> ?
 
> Sorry. I have a lot of problems. I will have others questions after.
 
> Thanks a lot.
 
Check out this site. They are willing to answer questions about Qt.
https://www.qtcentre.org/content/
Malcolm McLean <malcolm.arthur.mclean@gmail.com>: Jan 20 01:59PM

On 19/01/2024 18:35, Kaz Kylheku wrote:
>> representation of a point or a vector. Whilst generally it's just a POD
>> structure with x and y members, the name varies, and sometimes the
 
> For code working with 2D vectors, designers should consider complex numbers.
 
That's a nice idea.
 
But I've never seen code where the horizontal axis is "real" and the
vertical "imaginary", except of course in code designed to demonstrate
complex numbers as such. Mandelbrot is my favourite test program when
getting a new system.
 
--
Check out Basic Algorithms and my other books:
https://www.lulu.com/spotlight/bgy1mm
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Jan 20 11:22AM -0800

On 1/20/2024 5:59 AM, Malcolm McLean wrote:
> vertical "imaginary", except of course in code designed to demonstrate
> complex numbers as such. Mandelbrot is my favourite test program when
> getting a new system.
 
Same here! :^D

Tuesday, January 16, 2024

Digest for comp.lang.c++@googlegroups.com - 5 updates in 2 topics

Lynn McGuire <lynnmcguire5@gmail.com>: Jan 16 04:03PM -0600

"We are doomed"
https://www.carette.xyz/posts/we_are_doomed/
 
"The only system with a good software compatibility that I know is
Windows, and this explains a ton of things keeping very old UI/UX
frameworks, software and APIs to run, for example, Windows 95 compatible
games like "Roller Coaster Tycoon"."
 
"Otherwise, you are doomed."
 
The C++ committee has screwed up and continues to screw up by not
creating a graphics standard for C++.
 
Lynn
bubu <bruno.donati@hotmail.fr>: Jan 16 08:52PM

Hi,

Sorry for my bad English and sorry for my bad level in Qt.

I would like to use a QML program and C++ libraries (with import).

I use Visual Studio on Windows.

I have several problems (and I put an example hereafter):
- how to use C++ with QML
- how to use a library written in C++ in a QML program with an
IMPORT statement
- how to build the executable and library with CMake?

My example is here:
 
main.qml
 
ApplicationWindow {
    visible: true
    width: 400
    height: 300
    title: "Calculator"

    Calculatrice {
        id: calculatrice
    }

    Column {
        anchors.centerIn: parent
        spacing: 10

        TextField {
            id: input1
            placeholderText: "Entrez le premier nombre"
            validator: DoubleValidator {
                bottom: -1000000000.0
                top: 1000000000.0
            }
        }

        TextField {
            id: input2
            placeholderText: "Entrez le deuxième nombre"
            validator: DoubleValidator {
                bottom: -1000000000.0
                top: 1000000000.0
            }
        }

        Row {
            spacing: 10

            Button {
                text: "Additionner"
                onClicked: {
                    resultLabel.text = "Résultat: " +
                        calculator.add(parseFloat(input1.text), parseFloat(input2.text))
                }
            }

            Button {
                text: "Soustraire"
                onClicked: {
                    resultLabel.text = "Résultat: " +
                        calculator.subtract(parseFloat(input1.text), parseFloat(input2.text))
                }
            }

            Button {
                text: "Multiplier"
                onClicked: {
                    resultLabel.text = "Résultat: " +
                        calculator.multiply(parseFloat(input1.text), parseFloat(input2.text))
                }
            }
        }

        Label {
            id: resultLabel
            text: "Résultat: "
        }
    }
}
 
 
 
 
main.cpp
 
#include <QGuiApplication>
#include <QQmlApplicationEngine>
#include <QtCore>
//#include "calculator.cpp"

int main(int argc, char* argv[])
{
    QCoreApplication::setAttribute(Qt::AA_EnableHighDpiScaling);
    QGuiApplication app(argc, argv);
    QQmlApplicationEngine engine;
    //qmlRegisterType<Calculator>("Calculator", 1, 0, "Calculator");
    const QUrl url(QStringLiteral("qrc:/main.qml"));
    QObject::connect(&engine, &QQmlApplicationEngine::objectCreated,
        &app, [url](QObject* obj, const QUrl& objUrl) {
            if (!obj && url == objUrl)
                QCoreApplication::exit(-1);
        }, Qt::QueuedConnection);
    engine.load(url);
    return app.exec();
}
 
 
 
calculator.h
 
 
// calculator.h
#ifndef CALCULATOR_H
#define CALCULATOR_H

class Calculator
{
public:
    double add(double a, double b) const;
    double subtract(double a, double b) const;
    double multiply(double a, double b) const;
};

#endif // CALCULATOR_H

Sunday, January 14, 2024

Digest for comp.lang.c++@googlegroups.com - 1 update in 1 topic

Tim Rentsch <tr.17687@z991.linuxsc.com>: Jan 13 09:31PM -0800


>> Does that all make sense?
 
> Right now, no. But that's me. I'll flag it to read again when I've
> had a better night's sleep.
 
I'm posting to nudge you into looking at this again, if
you haven't already.
 
I have now had a chance to get your source and run some
comparisons. A program along the lines I outlined can run much
faster than the code you posted (as well as needing less memory).
A good target is to find all primes less than 1e11, which needs
less than 4 GB of RAM.

Wednesday, January 10, 2024

Digest for comp.lang.c++@googlegroups.com - 1 update in 1 topic

"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Jan 09 11:33PM -0800

On 1/8/2024 5:14 PM, red floyd wrote:
 
>>> Absolutely not, not with four way associativity.
 
>> Whatever you say; Sigh. I am done with this.
 
> Intel just needs to call Bonita whenever they have an issue.
 
Okay. You just made me laugh so hard I started to cough a bit! Wow.
Cleaned out the pipes, so to speak. Thanks. ROFL! Cough...
 
:^D

Monday, January 8, 2024

Digest for comp.lang.c++@googlegroups.com - 2 updates in 1 topic

Bonita Montero <Bonita.Montero@gmail.com>: Jan 08 06:48AM +0100

Am 07.01.2024 um 21:46 schrieb Chris M. Thomasson:
 
> I know that they had a problem and the provided workaround from Intel
> really did help out. ...
 
Absolutely not, not with four way associativity.
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Jan 08 12:18PM -0800

On 1/7/2024 9:48 PM, Bonita Montero wrote:
 
>> I know that they had a problem and the provided workaround from Intel
>> really did help out. ...
 
> Absolutely not, not with four way associativity.
 
Whatever you say; Sigh. I am done with this.

Saturday, January 6, 2024

Digest for comp.lang.c++@googlegroups.com - 6 updates in 1 topic

"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Jan 05 07:21PM -0800

On 1/3/2024 7:37 PM, Bonita Montero wrote:
 
> The Pentium 4's L1 data cache is between 16 and 32kB, so there
> can't be a 64kB aliasing. And aliasing can be only on a set basis
> and the sets are 4kB or 8kB large.
 
Are you trying to tell me that the aliasing problem on those older Intel
hyperthreaded processors and the workaround (from Intel) was a myth?
lol. ;^)
Bonita Montero <Bonita.Montero@gmail.com>: Jan 06 08:18AM +0100

Am 06.01.2024 um 04:21 schrieb Chris M. Thomasson:
 
> Are you trying to tell me that the aliasing problem on those older Intel
> hyperthreaded processors and the workaround (from Intel) was a myth?
> lol. ;^)
 
Intel just made a nerd-suggestion. With four-way associativity
there's no frequent aliasing problem in the L1 data cache of
the Pentium 4.
Kaz Kylheku <433-929-6894@kylheku.com>: Jan 06 08:31AM


> Intel just made a nerd-suggestion. With four-way associativity
> there's no frequent aliasing problem in the L1 data cache of
> the Pentium 4.
 
I think the L1 cache was 8K on that thing, and the blocks are 32 bytes.
 
I think how it works on the P4 is that the address is structured like
this:
 
 31          11 10         5 4                               0
  |           |  |         |  |                              |
 [ 21 bit tag ] [ 6 bit cache set ] [ 5 bit offset into 32 byte block ]
 
Thus say we have an area of the stack with the address
range nnnnFF80 to nnnnFFFF (128 bytes, 4 x 32 byte cache blocks).
 
These four blocks all map to the same set: they have the same six
bits in the "cache set" part of the address.
 
So if a thread is accessing something in all four blocks, it will
completely use that cache set, all by itself.
 
If any other thread has a similar block in its stack, with the same
cache set ID, it will cause evictions against this thread.
 
Sure, if each of these threads confines itself to working with just one
cacheline-sized aperture of the stack, it looks better.
 
You're forgetting that the sets are very small and that groups of
adjacent four 32 byte blocks map to the same set. Touch four adjacent
cache blocks that are aligned on a 128 byte boundary, and you have
hit full occupancy in the cache set corresponding to that block!
 
(I suspect the references to 64K should not be kilobytes but sets.
The 8K cache has 64 sets.)
 
In memory, an aligned 128-byte block maps to, and precisely covers,
a cache set. If two such blocks have addresses that are equal modulo 8K,
they collide into the same cache set. If one of those blocks is fully
present in the cache, the other must be fully evicted.
 
It's really easy to see how things can go south under hyperthreading.
If two hyperthreads are working with clashing 128 byte areas that each
want to hog the same cache set, and the core is switching between them
on a fine-grained basis, ... you get the picture.
 
It's very easy for the memory-mapping allocations used for thread
stacks to produce addresses such that the delta between them is a
multiple of 8K.
 
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca
NOTE: If you use Google Groups, I don't see you, unless you're whitelisted.
Bonita Montero <Bonita.Montero@gmail.com>: Jan 06 10:30AM +0100

Am 06.01.2024 um 09:31 schrieb Kaz Kylheku:
 
> It's very easy for the memory-mapping allocations used for thread
> stacks to produce addresses such that the delta between them is a
> multiple of 8K.
 
Of course it's easy to intentionally provoke frequent aliasing
with the P4's L1 cache, but actually this doesn't happen often.
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Jan 06 01:15PM -0800

On 1/6/2024 1:30 AM, Bonita Montero wrote:
>> multiple of 8K.
 
> Of course it's easy to intentionally provoke frequent aliasing
> with the P4's L1 cache, but actually this doesn't happen often.
 
Fwiw, some people were complaining about bad performance using
hyperthreading. Turning it off in bios improved performance. Hence the
paper was written to show them how to vastly improve performance when
hyperthreading was turned on. You call it nerd stuff, and I still cannot
figure out why?
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Jan 06 01:19PM -0800

On 1/6/2024 1:15 PM, Chris M. Thomasson wrote:
> paper was written to show them how to vastly improve performance when
> hyperthreading was turned on. You call it nerd stuff, and I still cannot
> figure out why?
 
Humm... I can see it now. Bonita works for Intel and received the
complaints... Bonita says shut up you stupid nerds! Humm... ;^o