Sunday, November 17, 2013

comp.lang.c++ - 26 new messages in 5 topics - digest

comp.lang.c++
http://groups.google.com/group/comp.lang.c++?hl=en

comp.lang.c++@googlegroups.com

Today's topics:

* hash_map and unordered map - 2 messages, 2 authors
http://groups.google.com/group/comp.lang.c++/t/c9f1495ba9d6d1c5?hl=en
* Returning a class instance by value and a naming question - 10 messages, 6
authors
http://groups.google.com/group/comp.lang.c++/t/914a6523b571b0a9?hl=en
* Good way to write integer overflow checks? - 2 messages, 2 authors
http://groups.google.com/group/comp.lang.c++/t/1363de0ff88836cd?hl=en
* including files best practice - 11 messages, 7 authors
http://groups.google.com/group/comp.lang.c++/t/5558ad8b3f50bf41?hl=en
* Available C++ Libraries FAQ - 1 message, 1 author
http://groups.google.com/group/comp.lang.c++/t/1caaa278cbb03e87?hl=en

==============================================================================
TOPIC: hash_map and unordered map
http://groups.google.com/group/comp.lang.c++/t/c9f1495ba9d6d1c5?hl=en
==============================================================================

== 1 of 2 ==
Date: Wed, Nov 13 2013 9:17 am
From: Paavo Helde


jashan4235@gmail.com wrote in
news:5c7409af-cc6b-489c-8762-6ab4da754633@googlegroups.com:

>I actually want to know what is the latest implementation of
> hash_map??

You should use std::unordered_map which comes with your C++ compiler.

Cheers
Paavo




== 2 of 2 ==
Date: Wed, Nov 13 2013 12:20 pm
From: Jorgen Grahn


On Wed, 2013-11-13, Paavo Helde wrote:
> jashan4235@gmail.com wrote in
> news:5c7409af-cc6b-489c-8762-6ab4da754633@googlegroups.com:
>
>>I actually want to know what is the latest implementation of
>> hash_map??
>
> You should use std::unordered_map which comes with your C++ compiler.

Different answer: the latest version of hash_map is probably from the
mid-1990s when HP and SGI fiddled with it. I doubt anyone else made
any important changes to it.

People used hash_map as a faster but non-standard alternative to
std::map over the decades, until it was finally standardized in a
slightly different form called first std::tr1::unordered_map, and then
in C++11 std::unordered_map.
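For anyone following along, a minimal sketch of the standardized replacement, assuming a C++11 compiler (the function and map contents here are purely illustrative):

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// std::unordered_map is the standardized hash table: the same interface
// family as std::map, but hashed rather than ordered by key.
std::unordered_map<std::string, int> make_counts() {
    std::unordered_map<std::string, int> counts;
    counts["apple"] = 3;   // insert or overwrite
    counts["pear"] = 1;
    ++counts["apple"];     // operator[] default-constructs missing keys
    return counts;
}
```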

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .





==============================================================================
TOPIC: Returning a class instance by value and a naming question
http://groups.google.com/group/comp.lang.c++/t/914a6523b571b0a9?hl=en
==============================================================================

== 1 of 10 ==
Date: Wed, Nov 13 2013 3:09 pm
From: DSF


Hello,

I have a class I've been writing where I frequently need to return
the class by value so that class members can be changed without
affecting the original. The problem is all of the copying. First,
the class has to be copied into a local instance of the class. Then,
upon return, the local instance is copied into an unnamed instance
that's created before the function is called, and therefore not
destroyed when the function ends. Finally, after the function ends,
the operator= of the target is called, yet again copying the class
(unnamed). Here is an example:

FStringW FStringW::Start(size_t end) const
{
    FStringW tfs(*this);
    if(end < length)
    {
        tfs.str[end] = 0;
        tfs.length = end;
    }
    tfs.StartIP(end);
    return tfs;
}

StartIP is a member function that alters the string in place.

It returns the first 'end' characters of a wide string.

I tried creating a "global" temp that could be used in place of each
function that used 'tfs'. It was a pointer to FStringW that was
created in each constructor and deleted in the destructor. I managed
to avoid an infinite loop in the constructor, but didn't realize it
created one in the destructor, too. I wasn't able to get around that
one. All the functions that returned FStringW returned a reference to
the global object, so the reference had the lifetime of until another
function uses it. That was acceptable. It would have eliminated one
copy if it worked.

So the question is: Are there any techniques for minimizing the
number of copies that occur when returning by value?

The ideal situation would be if the returned object could become the
unnamed object that's passed to operator=. But I think the compiler
would have to do that. And there's always the chance the function
could return an expression i.e. return tfs1 + tfs2.

My second question relates to the naming of member functions. Most
string-altering functions I've written have two versions, one that
returns by value and leaves the original alone, and one that alters
the string in place. For now, I'm using:

FStringW FStringW::Start(size_t end);
void FStringW::StartIP(size_t end);

But I was thinking maybe something like:

FStringW FStringW::GetStringStart(size_t end);
void FStringW::StringStart(size_t end);

Something like Crop or Cut for StringEnd is more descriptive, but uses
start and end as terms for functions that do very similar things. Any
ideas?

Thanks.
"'Later' is the beginning of what's not to be."
D.S. Fiscus




== 2 of 10 ==
Date: Wed, Nov 13 2013 4:17 pm
From: Jorgen Grahn


On Wed, 2013-11-13, DSF wrote:
> Hello,
>
> I have a class I've been writing where I frequently need to return
> the class by value so that class members can be changed without
> affecting the original. The problem is all of the copying.
[...]

Are you saying that doing it the naive, straightforward way causes
performance problems in your application? Because otherwise I'd stop
worrying and just do it that way.

(I'm asking because I myself have an unfortunate tendency to
micro-optimise things, so perhaps others do, too.)

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .




== 3 of 10 ==
Date: Wed, Nov 13 2013 4:31 pm
From: ram@zedat.fu-berlin.de (Stefan Ram)


DSF <notavalid@address.here> writes:
> I have a class I've been writing where I frequently need to return
>the class by value so that class members can be changed without
>affecting the original.

What you return is called »an instance of the class«
or »an object of the class«. It's not the class.

>upon return, the local instance is copied into an unnamed instance

Return values are rvalues, they are moved, not copied.

> So the question is: Are there any techniques for minimizing the
>number of copies that occur when returning by value?

Move members (move constructors and move assignments) are
already generated by the compiler, unless you defined one of
them or copy members or destructors.

I think, you just have to take care to mark non-obvious
rvalues with ::std::move and to have handles to resources
as members (not large members).
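A small sketch of the idea, assuming C++11; the Buffer type and function names below are illustrative, not from the original poster's code:

```cpp
#include <cassert>
#include <utility>
#include <vector>

// A type whose only member is a handle-like resource (here a vector).
// Since no copy/move members or destructor are user-declared, the
// compiler generates a move constructor that steals the vector's
// heap buffer instead of copying the elements.
struct Buffer {
    std::vector<int> data;
};

Buffer make_buffer() {
    Buffer b;
    b.data.assign(1000, 42);
    return b;              // rvalue: moved (or elided), not copied
}

Buffer take(Buffer b) { return b; }

// A named object is not an obvious rvalue; std::move marks it as one,
// so the resource is transferred rather than duplicated.
Buffer consume(Buffer& src) {
    return take(std::move(src));
}
```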





== 4 of 10 ==
Date: Wed, Nov 13 2013 10:49 pm
From: Paavo Helde


DSF <notavalid@address.here> wrote in
news:g9g789tnbu5lfsjal0pqg6clcpervqqmth@4ax.com:

> Hello,
>
> I have a class I've been writing where I frequently need to return
> the class by value so that class members can be changed without
> affecting the original. The problem is all of the copying.

How do you know? Have you profiled the code and seen this is a
bottleneck?

> First,
> the class has to be copied into a local instance of the class. Then,
> upon return, the local instance is copied into an unnamed instance
> that's created before the function is called, and therefore not
> destroyed when the function ends. Finally, after the function ends,
> the operator= of the target is called, yet again copying the class
> (unnamed). Here is an example:
>
> FStringW FStringW::Start(size_t end) const
> {
>     FStringW tfs(*this);
>     if(end < length)
>     {
>         tfs.str[end] = 0;
>         tfs.length = end;
>     }
>     tfs.StartIP(end);
>     return tfs;
> }
>
> StartIP is a member function that alters the string in place.
>
> It returns the first 'end' characters of a wide string.
>
> I tried creating a "global" temp that could be used in place of each
> function that used 'tfs'.

Whatever you do, don't do this. Global (non-const) variables are evil,
they complicate the data flow, make the optimizations harder and will
create lots of troubles in recursive functions or if you ever want to go
multithreaded.

> It was a pointer to FStringW that was
> created in each constructor and deleted in the destructor. I managed
> to avoid an infinite loop in the constructor, but didn't realize it
> created one in the destructor, too. I wasn't able to get around that
> one. All the functions that returned FStringW returned a reference to
> the global object, so the reference had the lifetime of until another
> function uses it. That was acceptable. It would have eliminated one
> copy if it worked.
>
> So the question is: Are there any techniques for minimizing the
> number of copies that occur when returning by value?

This is mostly a job for the optimizer. In your example, the return value
optimization (RVO) might be used by the compiler, eliminating one copy
without any work on your part.

Writing simple and straightforward code often helps the optimizer to do
its work better.

If the profiler still tells you that there is a bottleneck in copy, then
you have to do something about it. One way would be to keep unchanging
parts of the object in shared use, by adding some reference counters and
so on, but this may also cause troubles in multithreading and is actually
not guaranteed to speed things up.

A better way to increase performance is to review your algorithms. The
above method looks a lot like substr(), but still copies the whole string
even if only a tiny portion is going to be used later. It should allocate
space for and copy over only the needed number of characters.
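The idea can be sketched with std::wstring (left_prefix is an illustrative name, not a standard function): copy only the prefix instead of copying the whole string and then truncating it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Build the result from only the characters needed, rather than
// copying the whole string and then chopping it down.
std::wstring left_prefix(const std::wstring& s, std::size_t n) {
    if (n >= s.size())
        return s;                   // whole string requested
    return std::wstring(s, 0, n);   // allocates and copies n chars only
}
```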

BTW, what is the reason you are using your own string class and not
something based on std::basic_string?

>
> The ideal situation would be if the returned object could become the
> unnamed object that's passed to operator=. But I think the compiler
> would have to do that. And there's always the chance the function
> could return an expression i.e. return tfs1 + tfs2.

> My second question relates to the naming of member functions. Most
> string-altering functions I've written have two versions, one that
> returns by value and leaves the original alone, and one that alters
> the string in place. For now, I'm using:
>
> FStringW FStringW::Start(size_t end);
> void FStringW::StartIP(size_t end);

The first one should be 'const', this also helps a bit to distinguish it.

For names, I have sometimes used a convention like Foo and ApplyFoo.

But in your example, naming the functions something like Left() and
Truncate() would be much better than Start() and StartIP() ;-) And even
better would be to get rid of your class and use std::wstring or
something else based on std::basic_string instead. Selecting good names
is important for shortening the learning curve of any reader (including
yourself in the future), with std::wstring the learning curve ought to be
zero.

hth
Paavo





== 5 of 10 ==
Date: Thurs, Nov 14 2013 5:44 am
From: "Alf P. Steinbach"


On 14.11.2013 00:09, DSF wrote:
>
> I have a class I've been writing where I frequently need to return
> the class by value so that class members can be changed without
> affecting the original. The problem is all of the copying. First,
> the class has to be copied into a local instance of the class. Then,
> upon return, the local instance is copied into an unnamed instance
> that's created before the function is called, and therefore not
> destroyed when the function ends. Finally, after the function ends,
> the operator= of the target is called, yet again copying the class
> (unnamed).

Most of that copying is eliminated by the compiler.

However, for a simple and naively implemented string or array the
compiler can't get rid of the underlying O(n) element copying, since for
such a class that's the definition of the operation.

The programmer, in control of the class design, can do far better.

In your case, returning and assigning a leftmost substring, the whole
thing -- substring operation, returning, assignment -- can be done
in constant time, O(k) (also known as O(1)), and a very efficient
constant time at that, mainly[1] at the cost of making conversion to
zero-terminated string for the worst case an O(n) operation.

However, I gather you're doing this exercise for learning, and in that
case the O(k) design and implementation may be beyond your current level
of expertise, too hard to tackle. So don't fret about it. But it's worth
knowing that it's THERE, that it's obtainable, and that it's therefore
something to aim for (later) and something to compare against.
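For the curious, the constant-time design can be sketched roughly like this, assuming C++11 shared_ptr; SharedStr is illustrative only, not Alf's actual design:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Illustrative sketch of an O(1) substring: the character data is
// immutable and shared, and a "string" is just (buffer, offset, length).
class SharedStr {
    std::shared_ptr<const std::wstring> buf_;  // shared, never mutated
    std::size_t off_ = 0, len_ = 0;
public:
    explicit SharedStr(std::wstring s)
        : buf_(std::make_shared<const std::wstring>(std::move(s))),
          len_(buf_->size()) {}

    std::size_t size() const { return len_; }

    // O(1): copies only the handle, never the characters.
    SharedStr start(std::size_t n) const {
        SharedStr r(*this);
        if (n < r.len_) r.len_ = n;
        return r;
    }

    // O(n) only at the point where a contiguous copy is actually needed,
    // e.g. for conversion to a zero-terminated string.
    std::wstring str() const { return buf_->substr(off_, len_); }
};
```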

What you CAN do right now is to MEASURE.

MEASURE.

And compare to measurement of roughly the same when using e.g. std::string.

If things are too slow you can then, among other things, try to
implement a C++ "move constructor". Or a C++03 equivalent. Just ask
here, but do measure first!


Cheers, & hth.,

- Alf

Notes:
[1] Another possible cost is that the strings then become immutable. But
I see that mainly as an advantage, not a cost. It certainly both speeds
up and simplifies things, in general.




== 6 of 10 ==
Date: Thurs, Nov 14 2013 4:45 pm
From: DSF


On 14 Nov 2013 00:31:03 GMT, ram@zedat.fu-berlin.de (Stefan Ram)
wrote:

Please read *all* my answers before replying. (Or at least the last
one. It explains the context of my answers.)

>DSF <notavalid@address.here> writes:
>> I have a class I've been writing where I frequently need to return
>>the class by value so that class members can be changed without
>>affecting the original.
>
> What you return is called »an instance of the class«
> or »an object of the class«. It's not the class.

I used the term "instance" everywhere else; no need to nitpick.

>>upon return, the local instance is copied into an unnamed instance
>
> Return values are rvalues, they are moved, not copied.

That's an issue of semantics. A copy followed by a delete is often
called a move, but there's still a copy involved.
Return by value:
Copy local object to unnamed object created before function call.
Call local object's destructor.
Return from function.
Copy unnamed object to target.
Call unnamed object's destructor.

Call it two "moves," but there are still two copies made.

>
>> So the question is: Are there any techniques for minimizing the
>>number of copies that occur when returning by value?
>
> Move members (move constructors and move assignments) are
> already generated by the compiler, unless you defined one of
> them or copy members or destructors.

I've never heard of a "Move Constructor." A "Copy Constructor,"
yes. Same with assignments.

> I think, you just have to take care to mark non-obvious
> rvalues with ::std::move and to have handles to resources
> as members (not large members).

What would be a "non-obvious rvalue"?



I looked up std::move. I see it's new to C++11. This probably
applies to your two previous answers.

Usenet text can convey the wrong attitude, so please take this in
the friendly manner it's meant. If you are dealing with relatively new
terms, state their context. As in:

In C++11, return values are rvalues, they are moved, not copied.

And please do not assume that everyone is using the latest
technology.

"'Later' is the beginning of what's not to be."
D.S. Fiscus




== 7 of 10 ==
Date: Thurs, Nov 14 2013 5:34 pm
From: DSF


On 14 Nov 2013 00:17:25 GMT, Jorgen Grahn <grahn+nntp@snipabacken.se>
wrote:

>On Wed, 2013-11-13, DSF wrote:
>> Hello,
>>
>> I have a class I've been writing where I frequently need to return
>> the class by value so that class members can be changed without
>> affecting the original. The problem is all of the copying.
>[...]
>
>Are you saying that doing it the naive, straightforward way causes
>performance problems in your application? Because otherwise I'd stop
>worrying and just do it that way.

I assume you meant "native?" It fits into the sentence better than
"naive."
>
>(I'm asking because I myself have an unfortunate tendency to
>micro-optimise things, so perhaps others do, too.)
>
>/Jorgen

I was raised (programming-wise) at a time when every byte and every
clock cycle counted. So I tend to still operate that way, whether
it's necessary or not.

In this case, if I had an array of 100,000 FStringWs that needed one
of my string manipulations done to each member and saved in another
array, it would be faster to duplicate the entire array and then use
the "in place" version on each of the destination elements. Which
leads me to consider if I even need the versions that return by value.
One can always create an instance of a class if one doesn't want to
alter the original. Typing aloud here (same as thinking aloud) we
would have:

FStringW fs1, fs2;
fs1 = L"I don't want to be changed";
fs2 = fs1;
fs2.StartIP(18);

Vs.

FStringW fs1, fs2;
fs1 = L"I don't want to be changed";
fs2 = fs1.Start(18);

Not counting the initial assignment to fs1, the first example copies
the string once, the second example copies the string three times.

I set up a timed loop for each of the two examples. The first
example averaged three times faster than the second.

I'm leaning toward eliminating the functions that return FStringW by
value. Creating a copy and altering it is much faster, even if it
does require a little more typing.

"'Later' is the beginning of what's not to be."
D.S. Fiscus




== 8 of 10 ==
Date: Thurs, Nov 14 2013 6:44 pm
From: Ian Collins


DSF wrote:
> On 14 Nov 2013 00:31:03 GMT, ram@zedat.fu-berlin.de (Stefan Ram)
> wrote:
>>
>> Return values are rvalues, they are moved, not copied.
>
> That's an issue of semantics. A copy followed by a delete is often
> called a move, but there's still a copy involved.
> Return by value:
> Copy local object to unnamed object created before function call.
> Call local object's destructor.
> Return from function.
> Copy unnamed object to target.
> Call unnamed object's destructor.
>
> Call it two "moves," but there are still two copies made.

Are you familiar with the concept of Return Value Optimisation? If not,
look it up. In most situations, what you think may be an expensive
operation, such as returning a container by value, isn't.
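A small illustrative way to see RVO in action (Tracer is a made-up type that counts copies; with elision, which C++17 guarantees for a returned temporary, the copy constructor never runs):

```cpp
#include <cassert>

// Counts how many times an actual copy happens.
struct Tracer {
    static int copies;
    int value;
    explicit Tracer(int v) : value(v) {}
    Tracer(const Tracer& o) : value(o.value) { ++copies; }
};
int Tracer::copies = 0;

// Returning a prvalue: the object is constructed directly in the
// caller's storage, so no copy constructor is invoked.
Tracer make(int v) { return Tracer(v); }
```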

--
Ian Collins




== 9 of 10 ==
Date: Thurs, Nov 14 2013 9:27 pm
From: DSF


On Thu, 14 Nov 2013 00:49:11 -0600, Paavo Helde
<myfirstname@osa.pri.ee> wrote:

>DSF <notavalid@address.here> wrote in
>news:g9g789tnbu5lfsjal0pqg6clcpervqqmth@4ax.com:
>
>> Hello,
>>
>> I have a class I've been writing where I frequently need to return
>> the class by value so that class members can be changed without
>> affecting the original. The problem is all of the copying.
>
>How do you know? Have you profiled the code and seen this is a
>bottleneck?

Making a copy of the original string object and altering the copy
takes one third the time of the return by value method. The majority
of my coding involves string manipulation, so it will make a
difference. (See my response to Jorgen Grahn for details.)

{snipped my example}
>>
>> I tried creating a "global" temp that could be used in place of each
>> function that used 'tfs'.
>
>Whatever you do, don't do this. Global (non-const) variables are evil,
>they complicate the data flow, make the optimizations harder and will
>create lots of troubles in recursive functions or if you ever want to go
>multithreaded.

In the first place, *nothing* in C/C++ programming is evil! Arguably,
the worst that C/C++ has (had) to offer was gets(). And I wouldn't
call it evil. Poor design? Yes. Written that way because perhaps
back at that time an 80-column terminal also had a one line input
limit? Maybe. Could be interesting research if I have some time
sometime.

Note that "global" is in quotations. It wasn't literally a global,
it was merely a class member so it would persist when member function
Start (or any other return by value function) returned. It would be
"global" storage for return by value functions for that particular
instance of the class that would persist until another RBV function is
called.

One could think of class member variables as local to the class but
global to all its members. One thing I like about C++ is that one can
have variables common to a group of functions without having to pass
them around all of the time or make them global.

I had written a database in C, and while there was nothing wrong
with it, I disliked the fact that I had to pass a pointer to a
structure to almost every function. So I re-wrote it in C++ and the
structure's members became members of the class. Quite a few of the
functions no longer required any parameters.

As far as recursive code goes, C++ shines with it as well. My
database had two recursive functions. Each of them required some data
to be available at any depth. In C, those had to be globals. In C++,
they became members of the class, allowing more than one instance of
the database in a program. I maintained both versions for a while,
but found myself "forgetting" to update the C version more and more
often until I stopped.

A directory-traversing class for Windows I've written uses the same
method and takes it one step further. If it encounters a mount point,
it creates an instance of the class and runs it with the mount point
as the starting directory, making the whole class recursive.

{More snipping}
>>
>> So the question is: Are there any techniques for minimizing the
>> number of copies that occur when returning by value?
>
>This is mostly a job for the optimizer. In your example, the return value
>optimization (RVO) might be used by the compiler, eliminating one copy
>without any work from your part.

My compiler is several thousand years old and does very little
optimization. (There's a post of mine in clc on just how literal it
is at producing assembly code that's very close to the C code that
generated it.)


>Writing simple and straightforward code often helps the optimizer to do
>its work better.

See above.

>If the profiler still tells you that there is a bottleneck in copy, then
>you have to do something about it. One way would be to keep unchanging
>parts of the object in shared use, by adding some reference counters and
>so on, but this may also cause troubles in multithreading and is actually
>not guaranteed to speed things up.

The bottleneck is not in the actual copying of the string, but
rather the number of times return by value needs to copy said string.

>A better way to increase performance is to review your algorithms. The
>above method looks a lot like substr(), but still copies the whole string
>even if only a tiny portion is going to be used later. It should allocate
>space for and copy over only the needed amount of characters.

Hmmmm. I don't seem to have a substr() function in my RTL. Must've
been added in the last 500 years or so. :o) The string class that
comes with the compiler (not part of the STL) does have a SubString
that takes a starting position and length.


>BTW, what is the reason you are using your own string class and not
>something based on std::basic_string?

I learn best by doing. I can read about how to do something over
and over and not get it, yet experiment with it and get the concept in
one or two attempts. My main interest is string manipulation, whilst
books concentrate on mathematical examples. So what better way to
learn the ins and outs of an object than to create a string class?
I've put a lot of time and energy into the string class and I'm
currently working on wrapping the 40 or so C string functions I've
written into the string class. As I mentioned in another posting in
this thread, making a copy of the string and performing the alteration
on the copy is ~three times faster than return by value. So I'm
seriously considering dropping the return by value versions and using
only the in place ones.

Oh! And when I started working on the string class, I hadn't heard
of the STL.
>>
>> The ideal situation would be if the returned object could become the
>> unnamed object that's passed to operator=. But I think the compiler
>> would have to do that. And there's always the chance the function
>> could return an expression i.e. return tfs1 + tfs2.
>
>> My second question relates to the naming of member functions. Most
>> string-altering functions I've written have two versions, one that
>> returns by value and leaves the original alone, and one that alters
>> the string in place. For now, I'm using:
>>
>> FStringW FStringW::Start(size_t end);
>> void FStringW::StartIP(size_t end);
>
>The first one should be 'const', this also helps a bit to distinguish it.

It actually is. No typing. Strictly C&P from the header file:

FStringW Start(size_t end) const;

>For names, I have sometimes used a convention like Foo and ApplyFoo.
>
>But in your example, naming the functions something like Left() and
>Truncate() would be much better than Start() and StartIP() ;-) And even
>better would be to get rid of your class and use std::wstring or
>something else based on std::basic_string instead. Selecting good names
>is important for shortening the learning curve of any reader (including
>yourself in the future), with std::wstring the learning curve ought to be
>zero.

Not much chance of me getting rid of my class. I've put too much
time and energy into it. Not to mention that doing so would require a
rewrite of every program I've written that I'm actively maintaining.

Interestingly, Start was originally called Left, but I changed it
because it sounded too "BASICy". Speaking of selecting good names, I
find quite a few of the names in the STL counterintuitive. You "add"
an item to a vector with "push_back." Why not "add"?

And speaking of "vector," is the definition of the word as it's used
here unique to C++ or programming languages in general? Every
definition I can find relates to the position of one object from
another based on the length of a line drawn between the two objects
and the angle of that line from a baseline drawn through the first
object.

"'Later' is the beginning of what's not to be."
D.S. Fiscus




== 10 of 10 ==
Date: Thurs, Nov 14 2013 10:55 pm
From: Paavo Helde


DSF <notavalid@address.here> wrote in
news:vv6b89p1ampg8sqggvu309hoi9oseabe3v@4ax.com:

> On Thu, 14 Nov 2013 00:49:11 -0600, Paavo Helde
> <myfirstname@osa.pri.ee> wrote:
>
> Note that "global" is in quotations. It wasn't literally a global,
> it was merely a class member so it would persist when member function
> Start (or any other return by value function) returned. It would be
> "global" storage for return by value functions for that particular
> instance of the class that would persist until another RBV function is
> called.

OK, maybe I misunderstood you. Still, this design seems a bit fragile and
complicated.

[snipping long praise to OOP]

> My compiler is several thousand years old and does very little
> optimization. (There's a post of mine in clc on just how literal it
> is at producing assembly code that's very close to the C code that
> generated it.)

So it seems it's high time to ditch it. I do not follow c.l.c so I don't
know the reasons you are sticking to it, but I hope they are good.

>>BTW, what is the reason you are using your own string class and not
>>something based on std::basic_string?
>
> I learn best by doing. I can read about how to do something over
> and over and not get it, yet experiment with it and get the concept in
> one or two attempts. My main interest is string manipulation, whilst
> books concentrate on mathematical examples. So what better way to
> learn the ins and outs of an object than to create a string class?
> I've put a lot of time and energy into the string class and I'm
> currently working on wrapping the 40 or so C string functions I've
> written into the string class. As I mentioned in another posting in
> this thread, making a copy of the string and performing the alteration
> on the copy is ~three times faster than return by value. So I'm
> seriously considering dropping the return by value versions and using
> only the in place ones.
>
> Oh! And when I started working on the string class, I hadn't heard
> of the STL.

So, are you learning how to write a string class, or how to write your
application? If the former, then you certainly should study the design of
std::basic_string. While the interface is monstrous, it actually provides
the functions you seem to need. An example: instead of your

FStringW a, b;
a = ...
b = a.Start(18); // many copies of the whole string and substring (?)

with standard C++ you could write:

std::wstring a, b;
a = ...
b.assign(a, 0, 18); // 1 copy of substring

This is a bit more verbose than return by value, but has the benefit that
no optimizing skills are required from the compiler.

The assign method is documented e.g. at
http://www.cplusplus.com/reference/string/basic_string/assign/
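Paavo's snippet, fleshed out into a self-contained form (first_n is an illustrative wrapper name, not part of his post):

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// assign(str, pos, count) replaces b's contents with a substring of a,
// copying only the requested characters; no temporary full-length
// string is ever created.
std::wstring first_n(const std::wstring& a, std::size_t n) {
    std::wstring b;
    b.assign(a, 0, n);   // one copy of at most n characters
    return b;
}
```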

> Not much chance of me getting rid of my class. I've put too much
> time and energy into it. Not to mention that doing so would require a
> rewrite of every program I've written that I'm actively maintaining.

So add a proper assign() method to it and use that instead of assignment
(which requires temporary objects and a good optimizing compiler).

>
> Interestingly, Start was originally called Left, but I changed it
> because it sounded too "BASICy".

Oh, so your goal seems to be to obfuscate the code ;-) There are many better
ways to do that in C++, for example, you could overload some operators
like + or - to return substrings - much more fun! But "Start" is a good
start, considering that nothing seems to be started here.

> Speaking of selecting good names, I
> find quite a few of the names in the STL counterintuitive. You "add"
> an item to a vector with "push_back." Why not "add"?

Because "add" is too vague. It could easily be misinterpreted as adding
a number to each element of the vector, for example.

>
> And speaking of "vector," is the definition of the word as it's used
> here unique to C++ or programming languages in general? Every

It's for programming in general.

> definition I can find relates to the position of one object from
> another based on the length of a line drawn between the two objects
> and the angle of that line from a baseline drawn through the first
> object.

This is the physics/math term. Basically, it's the same (once one has
invented Cartesian coordinates), only the number of dimensions is
typically much larger in programming. But there are also e.g. Hilbert
spaces with an infinite number of dimensions, meaning the vectors in
such a space would have infinitely many elements.

There is also "attack vector", which is a bit different again.

But are you aware that the word "string" has even more meanings? From
superstring theory in theoretical physics, to violins and underwear?

Cheers
Paavo





==============================================================================
TOPIC: Good way to write integer overflow checks?
http://groups.google.com/group/comp.lang.c++/t/1363de0ff88836cd?hl=en
==============================================================================

== 1 of 2 ==
Date: Thurs, Nov 14 2013 8:53 am
From: David Harmon


On Mon, 11 Nov 2013 18:47:23 +0000 in comp.lang.c++, Leigh Johnston
<leigh@i42.co.uk> wrote,
>The idiot "plink'd" you which is idiot-speak for blacklisting your
>posts. He has also "plink'd" me as the guy can't handle criticism.

A man's got to know his limitations.
-- Dirty Harry






== 2 of 2 ==
Date: Thurs, Nov 14 2013 10:41 pm
From: Öö Tiib


On Thursday, 14 November 2013 18:53:54 UTC+2, David Harmon wrote:
> On Mon, 11 Nov 2013 18:47:23 +0000 in comp.lang.c++, Leigh Johnston
> <leigh@i42.co.uk> wrote,
> >The idiot "plink'd" you which is idiot-speak for blacklisting your
> >posts. He has also "plink'd" me as the guy can't handle criticism.
>
> A man's got to know his limitations.

Got to know what he does not want to see and filter that out. Rational.





==============================================================================
TOPIC: including files best practice
http://groups.google.com/group/comp.lang.c++/t/5558ad8b3f50bf41?hl=en
==============================================================================

== 1 of 11 ==
Date: Thurs, Nov 14 2013 1:34 pm
From: ouroboros84


Please find here below a simple example. I know that a .hpp file has to include or forward declare everything it uses. I wonder if it is best practice to include all .hpp files that are used in the .cpp file as well. I am asking this because I recently saw, on a new project I am working on, that people tend to avoid including headers in a lot of cpp files.

In this case: is it good to include b.hpp in a.cpp? I know it has already been included by including a.hpp, but I personally feel that if a symbol appears in a .cpp, it should be included (or forward declared if possible).

a.hpp

#pragma once
#include "b.hpp"

class C; //forward declaration of C

class A
{
public:
B get_b();
private:
B _b;
C* _c;
};

a.cpp

#include "a.hpp"
//needed or I can leave it out, since B is included already in a.hpp?
#include "b.hpp"

B A::get_b()
{
return _b;
}





== 2 of 11 ==
Date: Thurs, Nov 14 2013 2:14 pm
From: Jorgen Grahn


On Thu, 2013-11-14, ouroboros84 wrote:
> Please find here below a simple example. I know that a .hpp file has
> to include or forward declare everything it uses

It doesn't /have to/ -- although it can be rather confusing if a file
which just says #include "foo.hpp" doesn't compile.

> I wonder if it is
> best practice to include all .hpp files that are used in the .cpp file
> as well. I am asking this because I recently saw on a new project I am
> working on, that people tend to avoid including headers in a lot of
> cpp files.

No, that's not best practice.

> In this case: is it good to include b.hpp in a.cpp? I know it has
> already been included by including a.hpp, but I personally feel that
> if a symbol appears in a .cpp, it should be included (or forward
> declared if possible)

My personal rule is this: if I have a pair foo.cpp/foo.hpp, I always
do a

#include "foo.hpp"

first in that cpp file. That's an automatic check that foo.hpp is
idempotent. Then you can do almost whatever you want and not end up
with nasty effects like "I included foo.hpp from a new source file,
and had to spend half an hour finding out which other files I needed
to pull in first".

I try not to have a foo.hpp pull in a /lot/ of other headers which it
doesn't really need, but it's not a big disaster if it pulls in a
few too many.

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .




== 3 of 11 ==
Date: Thurs, Nov 14 2013 2:27 pm
From: Victor Bazarov


On 11/14/2013 4:34 PM, ouroboros84 wrote:
> Please find here below a simple example. I know that a .hpp file has
> to include or forward declare everything it uses I wonder if it is best
> practice to include all .hpp files that are used in the .cpp file as
> well. I am asking this because I recently saw on a new project I am
> working on, that people tend to avoid including headers in a lot of cpp
> files.
>
> In this case: is it good to include b.hpp in a.cpp? I know it has
> already been included by including a.hpp, but I personally feel that if
> a symbol appears in a .cpp, it should be included (or forward declared
> if possible)
>
> a.hpp
>
> #pragma once
> #include "b.hpp"
>
> class C; //forward declaration of C
>
> class A
> {
> public:
> B get_b();
> private:
> B _b;
> C* _c;
> };
>
> a.cpp
>
> #include "a.hpp"
> //needed or I can leave it out, since B is included already in a.hpp?
> #include "b.hpp"
>
> B A::get_b()
> {
> return _b;
> }
>

I second this.

I would include it since the 'B' type is used in a.cpp. You cannot (and
should not) rely on 'b.hpp' being pulled in by 'a.hpp'.

Of course, it's a style question, not a necessity. If it's pulled in
twice, it should contain the double inclusion guards, etc. And if it's
not included after somebody edits 'a.hpp' file, an attempt to compile
'a.cpp' will reveal that. But I still include all those things, myself.

V
--
I do not respond to top-posted replies, please don't ask




== 4 of 11 ==
Date: Thurs, Nov 14 2013 2:29 pm
From: ouroboros84


Il giorno giovedì 14 novembre 2013 23:14:02 UTC+1, Jorgen Grahn ha scritto:
> On Thu, 2013-11-14, ouroboros84 wrote:
>
>
> > I wonder if it is
>
> > best practice to include all .hpp files that are used in the .cpp file
>
> > as well. I am asking this because I recently saw on a new project I am
>
> > working on, that people tend to avoid including headers in a lot of
>
> > cpp files.
>
>
>
> No, that's not best practice.
>
>
>

Imagine though, if now in a.cpp you had this, as you suggest:

#include "a.hpp"
//#include "b.hpp" deleted

B A::get_b()
{
return _b;
}

namespace {
B b; //I can instantiate it
// even if I don't have the header,
//thanks to an implicit inclusion from a.hpp
C c = b.get_c(); //I can do it as well because a.hpp -> b.hpp -> c.hpp
}


Wouldn't that cause problems? Imagine if someone deleted the c.hpp inclusion from b.hpp, using a forward declaration instead so that get_c() returns a reference rather than a value.
In that case a.cpp wouldn't compile anymore.

I prefer to be explicit about what I have to include in every file.
For example, it is rare that you actually need to include the <string> header yourself (it is usually pulled in already), but you do it every time you use a string.




== 5 of 11 ==
Date: Thurs, Nov 14 2013 2:33 pm
From: ouroboros84


Il giorno giovedì 14 novembre 2013 23:27:48 UTC+1, Victor Bazarov ha scritto:
> On 11/14/2013 4:34 PM, ouroboros84 wrote:
>
> > Please find here below a simple example. I know that a .hpp file has
>
> to include or forward declare everything it uses I wonder if it is best
>
> practice to include all .hpp files that are used in the .cpp file as
>
> well. I am asking this because I recently saw on a new project I am
>
> working on, that people tend to avoid including headers in a lot of cpp
>
> files.
>
> >
>
> > In this case: is it good to include b.hpp in a.cpp? I know it has
>
> already been included by including a.hpp, but I personally feel that if
>
> a symbol appears in a .cpp, it should be included (or forward declared
>
> if possible)
>
> >
>
> > a.hpp
>
> >
>
> > #pragma once
>
> > #include "b.hpp"
>
> >
>
> > class C; //forward declaration of C
>
> >
>
> > class A
>
> > {
>
> > public:
>
> > B get_b();
>
> > private:
>
> > B _b;
>
> > C* _c;
>
> > };
>
> >
>
> > a.cpp
>
> >
>
> > #include "a.hpp"
>
> > //needed or I can leave it out, since B is included already in a.hpp?
>
> > #include "b.hpp"
>
> >
>
> > B A::get_b()
>
> > {
>
> > return _b;
>
> > }
>
> >
>
>
>
> I second this.
>
>
>
> I would include it since 'B' type is used in a.cpp. Cannot (and should
>
> not) rely on 'b.hpp' being pulled in by 'a.hpp'.
>
>
>
> Of course, it's a style question, not a necessity. If it's pulled in
>
> twice, it should contain the double inclusion guards, etc. And if it's
>
> not included after somebody edits 'a.hpp' file, an attempt to compile
>
> 'a.cpp' will reveal that. But I still include all those things, myself.
>
>
>
> V
>
> --
>
> I do not respond to top-posted replies, please don't ask


thanks, I prefer to include them myself too. I was just wondering if there was a common agreement in the C++ community, but as usual it seems there isn't one :-)





== 6 of 11 ==
Date: Thurs, Nov 14 2013 2:55 pm
From: Jorgen Grahn


On Thu, 2013-11-14, ouroboros84 wrote:
> Il giorno giovedì 14 novembre 2013 23:14:02 UTC+1, Jorgen Grahn ha scritto:
>> On Thu, 2013-11-14, ouroboros84 wrote:
>>
>>
>> > I wonder if it is
>>
>> > best practice to include all .hpp files that are used in the .cpp file
>>
>> > as well. I am asking this because I recently saw on a new project I am
>>
>> > working on, that people tend to avoid including headers in a lot of
>>
>> > cpp files.
>>
>>
>>
>> No, that's not best practice.
>
> Imagine though, if now in a.cpp you had this, as you suggest:

I'm not sure I suggested that. However ...

>
> #include "a.hpp"
> //#include "b.hpp" deleted
>
> B A::get_b()
> {
> return _b;
> }
>
> namespace {
> B b; //I can instanciate it
> // even if I don't have the header
> //thanks to an im[licit inclusion from a.hpp
> C = b.get_c() //I can do it as well because a.hpp -> b.hpp -> c.hpp
> }

> wouldn't that cause problems? imagine if someone deleted c.hpp
> inclusion from b.hpp, using a forward declare for returning a
> reference instead of a value. in that case a.cpp wouldn't compile
> anymore.

I have a hard time following your examples; I can't remember a c.hpp
in your first posting.

But anyway: ok, so someone removes an #include from some header file,
because it's not strictly needed there. And building something else
fails.

For me that's perfectly fine. I have all the sources; I can rearrange
them as needed. I myself was probably that someone who made the
change a few seconds earlier -- because no one would change a header
file and commit without checking that it at least built!

I guess this means I'm assuming you're not developing isolated
components or libraries. Things are different and a lot harder if you
do. But (a) very few people need to do that, and (b) there are
several other worries you have in that case.

It /is/ a real problem for library writers and users though: many
times I've upgraded my compiler or standard library and had code stop
building, because #include <iostream> no longer pulls in <ostream>, or
something. I see no way around this.

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .




== 7 of 11 ==
Date: Thurs, Nov 14 2013 3:46 pm
From: "Alf P. Steinbach"


On 14.11.2013 22:34, ouroboros84 wrote:
[snip]
> a.cpp
>
> #include "a.hpp"
> //needed or I can leave it out, since B is included already in a.hpp?
> #include "b.hpp"
>
> B A::get_b()
> {
> return _b;
> }

In this case the full class B definition is not needed for class A's
public interface. So there is a possibility that class A will be changed
so that [b.hpp] is not needed for [a.hpp]. However, changing [a.hpp]
will usually be accompanied by a corresponding change to [a.cpp], so
there's no problem with THIS file, whatever you decide.

The main problem is with client code. A programmer using [a.hpp] cannot
know whether [b.hpp] is guaranteed made available or is just present as
an implementation aspect that might be changed at any time. That's
because the complete class B definition is not needed for A's /public/
interface, only for its implementation, and so some optimization of
class A (e.g. for build time) might result in [b.hpp]'s removal.

So if I was concerned about this kind of problem then I would document
in some way whether [b.hpp] is present by design or as just an
implementation aspect.


* * *

In some cases you absolutely know that a header will be made available
by some other header. For example, a header that exposes something that
involves Windows API things will, as a rule, be guaranteed to include
<windows.h>. But that does not mean that client code can avoid including
<windows.h>. On the contrary, that Microsoft header, the "mother of all
headers", provides a set of declarations that depends strongly on
various macro symbols, and a great many of the declarations
themselves depend on various macro symbols. Thus, to ensure consistent
declarations and a consistent set of declarations, it is a good idea to
include this header before including headers that depend on it. Other
3rd party headers may be similarly badly designed.

* * *

To ensure that headers are free-standing (that they're "idempotent")
it's a good idea to include each header at the top in /some/ [.cpp]
file. For header only modules I use a dummy [.cpp] file. E.g.,
[blahblah.h] is included by [blahblah.h.dummy.cpp], if it isn't included
by a real implementation file [blahblah.cpp].

With some toolsets (in particular Microsoft's) this necessitates turning
off librarian warnings about unused stuff.

Also note that this practice ties in with the previous section, ensuring
that those headers that need some H do include it.

* * *

Microsoft's precompiled header feature, which is on by default in a
Visual Studio project, can make incorrect code compile and can cause
errors with correct code. It's no big deal to adjust the code but it
means that code compiled with this feature may not necessarily work with
some other compiler. I do not know of any good solution.


Cheers & hth.,

- Alf





== 8 of 11 ==
Date: Thurs, Nov 14 2013 11:24 pm
From: Tobias Müller


"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com> wrote:
> Microsoft's precompiled header feature, which is on by default in a
> Visual Studio project, can make incorrect code compile and can cause
> errors with correct code. It's no big deal to adjust the code but it
> means that code compiled with this feature may not necessarily work with
> some other compiler. I do not know of any good solution.

I know you are not reading my posts but maybe someone else can explain
that.
I always use precompiled headers and never had problems with them.
Given that they are set up correctly of course.

Tobi




== 9 of 11 ==
Date: Fri, Nov 15 2013 12:53 am
From: Paavo Helde


=?UTF-8?Q?Tobias=20M=C3=BCller?= <troplin@bluewin.ch> wrote in
news:1507027537406192681.376648troplin-bluewin.ch@news.eternal-
september.org:

> "Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com> wrote:
>> Microsoft's precompiled header feature, which is on by default in a
>> Visual Studio project, can make incorrect code compile and can cause
>> errors with correct code. It's no big deal to adjust the code but it
>> means that code compiled with this feature may not necessarily work with
>> some other compiler. I do not know of any good solution.
>
> I know you are not reading my posts but maybe someone else can explain
> that.
> I always use precompiled headers and never had problems with them.
> Given that they are set up correctly of course.

There is a bug in MSVC where any code before the #include "precompiled.h"
line is ignored. For ignored #include lines they issue a warning, but other
code is ignored silently. I guess this is what Alf meant.

Cheers
Paavo




== 10 of 11 ==
Date: Fri, Nov 15 2013 2:14 am
From: Juha Nieminen


Jorgen Grahn <grahn+nntp@snipabacken.se> wrote:
> I try not to have a foo.hpp pull in a /lot/ of other headers which it
> doesn't really need, but it's not a big disaster if it pulls in a
> few too many.

It's not a disaster, but in general one ought to minimize the number
of #includes (especially when they refer to files inside the same
project, rather than e.g. system headers).

In small projects it's mostly inconsequential. However, the larger the
project, the more important it becomes to minimize the number of #include
lines. The more inter-dependency there is between source files and
header files, the longer the building process becomes even from very
small changes. If a change in one header file causes 5 source files to
be recompiled, that's a much nicer result than if it causes 50 source
files to be recompiled (especially when you are developing/debugging
the project and you are doing a lot of test builds frequently.)

Of course one shouldn't go overboard with this and start to make the
program more complex or inefficient just for the sake of avoiding a
few #includes, but in cases where it doesn't otherwise affect the
complexity or efficiency of the program, it's something to keep in mind.


--- news://freenews.netfront.net/ - complaints: news@netfront.net ---




== 11 of 11 ==
Date: Fri, Nov 15 2013 4:59 am
From: "Alf P. Steinbach"


On 15.11.2013 09:53, Paavo Helde wrote:
> =?UTF-8?Q?Tobias=20M=C3=BCller?= <troplin@bluewin.ch> wrote in
> news:1507027537406192681.376648troplin-bluewin.ch@news.eternal-
> september.org:
>
>> "Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com> wrote:
>>> Microsoft's precompiled header feature, which is on by default in a
>>> Visual Studio project, can make incorrect code compile and can cause
>>> errors with correct code. It's no big deal to adjust the code but it
>>> means that code compiled with this feature may not necessarily work with
>>> some other compiler. I do not know of any good solution.
>>
>> I know you are not reading my posts but maybe someone else can explain
>> that.
>> I always use precompiled headers and never had problems with them.
>> Given that they are set up correctly of course.
>
> There is a bug in MSVC that any code before #include "precompiled.h" line
> is ignored. For ignored #include lines they issue a warning, but other code
> is ignored silently. I guess this is what Alf meant.

Yep.

Thanks,

- Alf







==============================================================================
TOPIC: Available C++ Libraries FAQ
http://groups.google.com/group/comp.lang.c++/t/1caaa278cbb03e87?hl=en
==============================================================================

== 1 of 1 ==
Date: Thurs, Nov 14 2013 3:23 pm
From: Nikki Locke


Available C++ Libraries FAQ

URL: http://www.trumphurst.com/cpplibs/

This is a searchable list of libraries and utilities (both free
and commercial) available to C++ programmers.

If you know of a library which is not in the list, why not fill
in the form at http://www.trumphurst.com/cpplibs/cppsub.php

Maintainer: Nikki Locke cpplibs@trumphurst.com




==============================================================================

You received this message because you are subscribed to the Google Groups "comp.lang.c++"
group.

To post to this group, visit http://groups.google.com/group/comp.lang.c++?hl=en

To unsubscribe from this group, send email to comp.lang.c+++unsubscribe@googlegroups.com

To change the way you get mail from this group, visit:
http://groups.google.com/group/comp.lang.c++/subscribe?hl=en

To report abuse, send email explaining the problem to abuse@googlegroups.com

==============================================================================
Google Groups: http://groups.google.com/?hl=en

Saturday, November 16, 2013

comp.programming.threads - 26 new messages in 16 topics - digest

comp.programming.threads
http://groups.google.com/group/comp.programming.threads?hl=en

comp.programming.threads@googlegroups.com

Today's topics:

* That's sad... - 1 messages, 1 author
http://groups.google.com/group/comp.programming.threads/t/e33611aba6d1df10?hl=en
* But Amine, why do you want to satisfy many requirements - 1 messages, 1
author
http://groups.google.com/group/comp.programming.threads/t/c1fdb02d2419e0aa?hl=en
* There is a solution to satisfy the energy efficiency requirement - 3
messages, 1 author
http://groups.google.com/group/comp.programming.threads/t/2a12c9e711904820?hl=en
* About FIFO queues - 1 messages, 1 author
http://groups.google.com/group/comp.programming.threads/t/74e8365bbd4a8cdd?hl=en
* Concurrent FIFO queue... - 1 messages, 1 author
http://groups.google.com/group/comp.programming.threads/t/0a68fb555b8fc66e?hl=en
* How do I look inside an .exe file to view the programming - 1 messages, 1
author
http://groups.google.com/group/comp.programming.threads/t/70ea90ac4a00188c?hl=en
* Concurrent FIFO Queue version 1.0 - 1 messages, 1 author
http://groups.google.com/group/comp.programming.threads/t/7a3901f20c6b0023?hl=en
* Lock and Lockfree... - 2 messages, 1 author
http://groups.google.com/group/comp.programming.threads/t/4fc87d01927bc106?hl=en
* Spinlock and lockfree and waitfree... - 3 messages, 2 authors
http://groups.google.com/group/comp.programming.threads/t/624873836709bf90?hl=en
* Threadpool and Threadpool with priorities were updated - 2 messages, 1
author
http://groups.google.com/group/comp.programming.threads/t/6b3b2dc79e93d1f5?hl=en
* Here is again my Proof - 1 messages, 1 author
http://groups.google.com/group/comp.programming.threads/t/1b195ecfbb07dead?hl=en
* ParallelVarFiler version 1.17 - 3 messages, 1 author
http://groups.google.com/group/comp.programming.threads/t/a5a073dcd9c9e9d4?hl=en
* ParallelVarFiler version 1.22 - 3 messages, 1 author
http://groups.google.com/group/comp.programming.threads/t/10ed747cbfc667d1?hl=en
* About my parallel libraries... - 1 messages, 1 author
http://groups.google.com/group/comp.programming.threads/t/e71cb05470c8cd13?hl=en
* Scalable Locks... - 1 messages, 1 author
http://groups.google.com/group/comp.programming.threads/t/a5a22f4fc887c2e7?hl=en
* scalable RWLock - 1 messages, 1 author
http://groups.google.com/group/comp.programming.threads/t/e6a38fcb3c4fb23b?hl=en

==============================================================================
TOPIC: That's sad...
http://groups.google.com/group/comp.programming.threads/t/e33611aba6d1df10?hl=en
==============================================================================

== 1 of 1 ==
Date: Fri, Oct 11 2013 4:47 pm
From: aminer


Hello,

Sorry, there is still a problem, cause spin-wait in lockfree
and waitfree FIFO queues, even with a sleep(0) (when for example there are
no items in the queue), does not satisfy the requirement of
energy efficiency. That's sad I think, cause we have to use
my FIFO fair SemaCondvar or FIFO fair semaphores to satisfy
the requirement of energy efficiency, but SemaCondvar and Semaphores
are slow, that's sad.


Thank you,
Amine Moulay Ramdane.





==============================================================================
TOPIC: But Amine, why do you want to satisfy many requirements
http://groups.google.com/group/comp.programming.threads/t/c1fdb02d2419e0aa?hl=en
==============================================================================

== 1 of 1 ==
Date: Fri, Oct 11 2013 5:04 pm
From: aminer



Hello,

Question:

But Amine, why do you want to satisfy many requirements for your FIFO
queue, such
as minimizing efficiently the cache-coherence traffic,
and the FIFO fairness, and also the energy efficiency ? why
the energy efficiency ?


answer:

Energy efficiency is also important, cause we must not think just
for today, we must think for the future when there will be many
more cores; this requirement of energy efficiency will become much more
important. So as I told you, it's sad that those waitfree and lockfree
FIFO queue algorithms must use spin-wait when there are no items in the
queue and the threads must wait for new items etc. So as I told you, to
satisfy the requirement of energy efficiency we must use my FIFO fair
SemaCondvar or FIFO fair Semaphores inside the FIFO queue, but since my
SemaCondvar
and Semaphores are slow, this will slow the FIFO queue, and that's sad.



Thank you,
Amine Moulay Ramdane.






==============================================================================
TOPIC: There is a solution to satisfy the energy efficiency requirement
http://groups.google.com/group/comp.programming.threads/t/2a12c9e711904820?hl=en
==============================================================================

== 1 of 3 ==
Date: Sat, Oct 12 2013 3:55 pm
From: aminer



Hello,

I wrote:
"Question:
But Amine, why do you want to satisfy many requirements for your FIFO
queue, such
as minimizing efficiently the cache-coherence traffic,
and the FIFO fairness, and also the energy efficiency ? why
the energy efficiency ?
answer:
Energy efficiency is also important, cause we must not think just
for today, we must think for the future when there will be many
more cores, this requirement of energy efficiency will become much more
important, so as I told you it's sad that those waitfree and lockfree
FIFO queue algorithms must use spin-wait when there are no items in the queue
and the threads must wait for new items etc. so as I told you, to
satisfy the requirement of energy efficiency we must use my FIFO fair
SemaCondvar or FIFO fair Semaphores inside the FIFO queue, but since my
SemaCondvar
and Semaphores are slow, this will slow the FIFO queue, and that's sad."



I think there is a solution to satisfy the energy efficiency requirement:
you have to use my manual event object, which is supported by FreePascal
and Delphi and is portable.
So after you push an item into the queue you have to call SetEvent() of
the manual event object, and on the pop() side you have to use something
like this:

--

if self.count <> 0
then continue;

if self.count=0
then
begin
self.event.waitfor(INFINITE);
self.event.resetevent;
end;
end;

---

This method is more energy efficient: when there are no items in the
queue the pop() method will wait on the manual event object, so
it will not use CPU resources like the spin-wait mechanism does,
and I think you have to avoid Semaphores cause semaphores are slow.

Here is the complete source code:

===


unit Lockfree_MPMC;

{$IFDEF FPC}
{$ASMMODE intel}
{$ENDIF}


interface

uses
syncobjs,
{$IFDEF Delphi}expbackoff,sysutils;{$ENDIF}
{$IFDEF FPC}expbackoff,sysutils;{$ENDIF}

{$I defines.inc}

const margin=1000; // limited to 1000 threads...
{$IFDEF CPU64}
INFINITE = qword($FFFFFFFFFFFFFFFF);
{$ENDIF CPU64}
{$IFDEF CPU32}
INFINITE = longword($FFFFFFFF);
{$ENDIF CPU32}
type
{$IFDEF CPU64}
long = qword;
{$ENDIF CPU64}
{$IFDEF CPU32}
long = longword;
{$ENDIF CPU32}


tNodeQueue = tObject;
typecache1 = array[0..15] of longword;

// TLockfree_MPMC = class(TFreelist)
TLockfree_MPMC = class
private
tail:long;
tmp1:typecache1;
head: long;
fMask : long;
fSize : long;
temp:long;
backoff1,backoff2:texpbackoff;
event:TSimpleEvent;
tab : array of tNodeQueue;
procedure setobject(lp : long;const aobject : tNodeQueue);
function getLength:long;
function getSize:long;
function getObject(lp : long):tNodeQueue;
public
{$IFDEF CPU64}
constructor create(aPower : int64 =20); {allocate tab with size
equal 2^aPower, for 20 size is equal 1048576}
{$ENDIF CPU64}
{$IFDEF CPU32}
constructor create(aPower : integer =20); {allocate tab with size
equal 2^aPower, for 20 size is equal 1048576}
{$ENDIF CPU32}


destructor Destroy; override;
function push(tm : tNodeQueue):boolean;
function pop(var obj:tNodeQueue;wait:boolean=false):boolean;
property length : long read getLength;
property count: long read getLength;
property size : long read getSize;

end;


implementation

function LockedIncLong(var Target: long): long;
asm
{$IFDEF CPU32}
// --> EAX Target
// <-- EAX Result
MOV ECX, EAX
MOV EAX, 1
//sfence
LOCK XADD [ECX], EAX
inc eax
{$ENDIF CPU32}
{$IFDEF CPU64}
// --> RCX Target
// <-- EAX Result
MOV rax, 1
//sfence
LOCK XADD [rcx], rax
INC rax
{$ENDIF CPU64}
end;

function CAS(var Target:long;Comp ,Exch : long): boolean;assembler;stdcall;
asm
{$IFDEF CPU64}
mov rax, comp
lock cmpxchg [Target], Exch
setz al
{$ENDIF CPU64}
{$IFDEF CPU32}
mov eax, comp
mov ecx,Target
mov edx,exch
lock cmpxchg [ecx], edx
setz al

{$ENDIF CPU32}

end; { CAS }



{function CAS(var Target: long; Comperand: long;NewValue: long ):
boolean; assembler;stdcall;
asm
mov ecx,Target
mov edx,NewValue
mov eax,Comperand
//sfence
lock cmpxchg [ecx],edx
JNZ @@2
MOV AL,01
JMP @@Exit
@@2:
XOR AL,AL
@@Exit:
end;}
{$IFDEF CPU64}
constructor TLockfree_MPMC.create(aPower : int64 );
{$ENDIF CPU64}
{$IFDEF CPU32}
constructor TLockfree_MPMC.create(aPower : integer );
{$ENDIF CPU32}


begin
if aPower < 10
then
begin
writeln('Constructor argument must be greater or equal to 10');
halt;
end;
if (aPower < 0) or (aPower > high(integer))
then
begin
writeln('Constructor argument is incorrect');
halt;
end;
backoff1:=texpbackoff.create(1,2);
backoff2:=texpbackoff.create(1,2);
{$IFDEF CPU64}
fMask:=not($FFFFFFFFFFFFFFFF shl aPower);{$ENDIF CPU64}
{$IFDEF CPU32}
fMask:=not($FFFFFFFF shl aPower);
{$ENDIF CPU32}

fSize:=(1 shl aPower) - margin;
setLength(tab,1 shl aPower);
tail:=0;
head:=0;
temp:=0;
Event := TSimpleEvent.Create;

end;

destructor TLockfree_MPMC.Destroy;

begin
event.free;
backoff1.free;
backoff2.free;
setLength(tab,0);
inherited Destroy;
end;


procedure TLockfree_MPMC.setObject(lp : long;const aobject : tNodeQueue);
begin
tab[lp and fMask]:=aObject;
end;

function TLockfree_MPMC.getObject(lp : long):tNodeQueue;
begin
result:=tab[lp and fMask];
end;


function TLockfree_MPMC.push(tm : tNodeQueue):boolean;
var lasttail,newtemp:long;
i,j:integer;
begin

if getlength >= fsize
then
begin
result:=false;
exit;
end;
result:=true;

newTemp:=LockedIncLong(temp);

lastTail:=newTemp-1;
//asm mfence end;
setObject(lastTail,tm);

repeat

if CAS(tail,lasttail,newtemp)
then
begin
event.setevent;
exit;
end;
sleep(0);
until false;

end;


function TLockfree_MPMC.pop(var obj:tNodeQueue;wait:boolean=false):boolean;

var lastHead : long;
begin
repeat
lastHead:=head;
if tail<>head
then
begin
obj:=getObject(lastHead);

if CAS(head,lasthead,lasthead+1)
then
begin
result:=true;
exit;
end
else sleep(0); //backoff2.delay;

end
else
begin
if wait=false
then
begin
result:=false;
exit;
end;

if self.count <> 0
then continue;

if self.count=0
then
begin
self.event.waitfor(INFINITE);
self.event.resetevent;
end;
end;
until false;
end;

function TLockfree_MPMC.getLength:long;
var head1,tail1:long;
begin
head1:=head;
tail1:=tail;
if tail1 < head1
then result:= (High(long)-head1)+(1+tail1)
else result:=(tail1-head1);
end;

function TLockfree_MPMC.getSize:long;

begin
result:=fSize;
end;

end.
===




Thank you,
Amine Moulay Ramdane.








== 2 of 3 ==
Date: Sat, Oct 12 2013 3:57 pm
From: aminer



Hello.

So as you have noticed, this method that I am using is
more efficient cause you will not wait on the manual event object unless
self.count = 0.


Thank you,
Amine Moulay Ramdane.

On 10/12/2013 6:55 PM, aminer wrote:
>
> Hello,
>
> I wrote:
> "Question:
> But Amine, why do you want to satisfy many requirements for your FIFO
> queue, such
> as minimizing efficiently the cache-coherence traffic,
> and the FIFO fairness, and also the energy efficiency ? why
> the energy efficiency ?
> answer:
> Energy efficiency is also important, cause we must not think just
> for today, we must think for the future when there will be many
> more cores, this requirement of energy efficiency will become much more
> important, so as I told you it's sad that those waitfree and lockfree
> FIFO queue algorithms must use spin-wait when there are no items in the queue
> and the threads must wait for new items etc. so as I told you, to
> satisfy the requirement of energy efficiency we must use my FIFO fair
> SemaCondvar or FIFO fair Semaphores inside the FIFO queue, but since my
> SemaCondvar
> and Semaphores are slow, this will slow the FIFO queue, and that's sad."
>
>
>
> I think there is a solution to satisfy the energy efficiency requirement:
> you have to use my manual event object, which is supported by FreePascal
> and Delphi and is portable.
> So after you push an item into the queue you have to call SetEvent() of
> the manual event object, and on the pop() side you have to use something
> like this:
>
> --
>
> if self.count <> 0
> then continue;
>
> if self.count=0
> then
> begin
> self.event.waitfor(INFINITE);
> self.event.resetevent;
> end;
> end;
>
> ---
>
> This method is more energy efficient: when there are no items in the
> queue the pop() method will wait on the manual event object, so
> it will not use CPU resources like the spin-wait mechanism does,
> and I think you have to avoid Semaphores cause semaphores are slow.
>
> Here is the complete source code:
>
> ===
>
>
> unit Lockfree_MPMC;
>
> {$IFDEF FPC}
> {$ASMMODE intel}
> {$ENDIF}
>
>
> interface
>
> uses
> syncobjs,
> {$IFDEF Delphi}expbackoff,sysutils;{$ENDIF}
> {$IFDEF FPC}expbackoff,sysutils;{$ENDIF}
>
> {$I defines.inc}
>
> const margin=1000; // limited to 1000 threads...
> {$IFDEF CPU64}
> INFINITE = qword($FFFFFFFFFFFFFFFF);
> {$ENDIF CPU64}
> {$IFDEF CPU32}
> INFINITE = longword($FFFFFFFF);
> {$ENDIF CPU32}
> type
> {$IFDEF CPU64}
> long = qword;
> {$ENDIF CPU64}
> {$IFDEF CPU32}
> long = longword;
> {$ENDIF CPU32}
>
>
> tNodeQueue = tObject;
> typecache1 = array[0..15] of longword;
>
> // TLockfree_MPMC = class(TFreelist)
> TLockfree_MPMC = class
> private
> tail:long;
> tmp1:typecache1;
> head: long;
> fMask : long;
> fSize : long;
> temp:long;
> backoff1,backoff2:texpbackoff;
> event:TSimpleEvent;
> tab : array of tNodeQueue;
> procedure setobject(lp : long;const aobject : tNodeQueue);
> function getLength:long;
> function getSize:long;
> function getObject(lp : long):tNodeQueue;
> public
> {$IFDEF CPU64}
> constructor create(aPower : int64 =20); {allocate tab with size
> equal 2^aPower, for 20 size is equal 1048576}
> {$ENDIF CPU64}
> {$IFDEF CPU32}
> constructor create(aPower : integer =20); {allocate tab with size
> equal 2^aPower, for 20 size is equal 1048576}
> {$ENDIF CPU32}
>
>
> destructor Destroy; override;
> function push(tm : tNodeQueue):boolean;
> function pop(var obj:tNodeQueue;wait:boolean=false):boolean;
> property length : long read getLength;
> property count: long read getLength;
> property size : long read getSize;
>
> end;
>
>
> implementation
>
> function LockedIncLong(var Target: long): long;
> asm
> {$IFDEF CPU32}
> // --> EAX Target
> // <-- EAX Result
> MOV ECX, EAX
> MOV EAX, 1
> //sfence
> LOCK XADD [ECX], EAX
> inc eax
> {$ENDIF CPU32}
> {$IFDEF CPU64}
> // --> RCX Target
> // <-- EAX Result
> MOV rax, 1
> //sfence
> LOCK XADD [rcx], rax
> INC rax
> {$ENDIF CPU64}
> end;
>
> function CAS(var Target:long;Comp ,Exch : long): boolean;assembler;stdcall;
> asm
> {$IFDEF CPU64}
> mov rax, comp
> lock cmpxchg [Target], Exch
> setz al
> {$ENDIF CPU64}
> {$IFDEF CPU32}
> mov eax, comp
> mov ecx,Target
> mov edx,exch
> lock cmpxchg [ecx], edx
> setz al
>
> {$ENDIF CPU32}
>
> end; { CAS }
>
>
>
> {function CAS(var Target: long; Comperand: long;NewValue: long ):
> boolean; assembler;stdcall;
> asm
> mov ecx,Target
> mov edx,NewValue
> mov eax,Comperand
> //sfence
> lock cmpxchg [ecx],edx
> JNZ @@2
> MOV AL,01
> JMP @@Exit
> @@2:
> XOR AL,AL
> @@Exit:
> end;}
> {$IFDEF CPU64}
> constructor TLockfree_MPMC.create(aPower : int64 );
> {$ENDIF CPU64}
> {$IFDEF CPU32}
> constructor TLockfree_MPMC.create(aPower : integer );
> {$ENDIF CPU32}
>
>
> begin
> if aPower < 10
> then
> begin
> writeln('Constructor argument must be greater or equal to 10');
> halt;
> end;
> if (aPower < 0) or (aPower > high(integer))
> then
> begin
> writeln('Constructor argument is incorrect');
> halt;
> end;
> backoff1:=texpbackoff.create(1,2);
> backoff2:=texpbackoff.create(1,2);
> {$IFDEF CPU64}
> fMask:=not($FFFFFFFFFFFFFFFF shl aPower);{$ENDIF CPU64}
> {$IFDEF CPU32}
> fMask:=not($FFFFFFFF shl aPower);
> {$ENDIF CPU32}
>
> fSize:=(1 shl aPower) - margin;
> setLength(tab,1 shl aPower);
> tail:=0;
> head:=0;
> temp:=0;
> Event := TSimpleEvent.Create;
>
> end;
>
> destructor TLockfree_MPMC.Destroy;
>
> begin
> event.free;
> backoff1.free;
> backoff2.free;
> setLength(tab,0);
> inherited Destroy;
> end;
>
>
> procedure TLockfree_MPMC.setObject(lp : long;const aobject : tNodeQueue);
> begin
> tab[lp and fMask]:=aObject;
> end;
>
> function TLockfree_MPMC.getObject(lp : long):tNodeQueue;
> begin
> result:=tab[lp and fMask];
> end;
>
>
> function TLockfree_MPMC.push(tm : tNodeQueue):boolean;
> var lasttail,newtemp:long;
> i,j:integer;
> begin
>
> if getlength >= fsize
> then
> begin
> result:=false;
> exit;
> end;
> result:=true;
>
> newTemp:=LockedIncLong(temp);
>
> lastTail:=newTemp-1;
> //asm mfence end;
> setObject(lastTail,tm);
>
> repeat
>
> if CAS(tail,lasttail,newtemp)
> then
> begin
> event.setevent;
> exit;
> end;
> sleep(0);
> until false;
>
> end;
>
>
> function TLockfree_MPMC.pop(var obj:tNodeQueue;wait:boolean=false):boolean;
>
> var lastHead : long;
> begin
> repeat
> lastHead:=head;
> if tail<>head
> then
> begin
> obj:=getObject(lastHead);
>
> if CAS(head,lasthead,lasthead+1)
> then
> begin
> result:=true;
> exit;
> end
> else sleep(0); //backoff2.delay;
>
> end
> else
> begin
> if wait=false
> then
> begin
> result:=false;
> exit;
> end;
>
> if self.count <> 0
> then continue;
>
> if self.count=0
> then
> begin
> self.event.waitfor(INFINITE);
> self.event.resetevent;
> end;
> end;
> until false;
> end;
>
> function TLockfree_MPMC.getLength:long;
> var head1,tail1:long;
> begin
> head1:=head;
> tail1:=tail;
> if tail1 < head1
> then result:= (High(long)-head1)+(1+tail1)
> else result:=(tail1-head1);
> end;
>
> function TLockfree_MPMC.getSize:long;
>
> begin
> result:=fSize;
> end;
>
> end.
> ===
>
>
>
>
> Thank you,
> Amine Moulay Ramdane.
>
>
>
>





== 3 of 3 ==
Date: Sat, Oct 12 2013 4:28 pm
From: aminer


On 10/12/2013 6:55 PM, aminer wrote:
> if self.count=0
> then
> begin
> self.event.waitfor(INFINITE);
> self.event.resetevent;
> end;
> end;


This method doesn't work. Suppose thread1 is waiting on
"self.event.waitfor(INFINITE)" and thread2 has passed the
"if self.count=0" test but has not yet reached
"self.event.waitfor(INFINITE)". If thread1 receives two SetEvent()
signals from the push() method and then calls "self.event.resetevent",
thread2 will block on "self.event.waitfor(INFINITE)" even though there
are items in the queue, so this is a problem.



I will keep thinking about a method that satisfies the efficiency
requirement without using SemaCondvar or Semaphores, since they are slow.



Thank you,
Amine Moulay Ramdane.












==============================================================================
TOPIC: About FIFO queues
http://groups.google.com/group/comp.programming.threads/t/74e8365bbd4a8cdd?hl=en
==============================================================================

== 1 of 1 ==
Date: Sat, Oct 12 2013 4:41 pm
From: aminer



Hello,

I think SemaCondvar and Semaphores are also a good alternative
for building a concurrent FIFO queue: when there are items in the
queue, my SemaCondvar and Semaphores do not context switch into kernel
mode, so they are not very expensive; threads switch into kernel
mode only when they call pop() and there are no items in the queue.
So my SemaCondvar and Semaphores are a good alternative if you
want to satisfy the FIFO fairness requirement, to minimize the
cache-coherence traffic and to satisfy the energy-efficiency
requirement as well.

But beware: the Windows Semaphore is not FIFO fair, while my
SemaCondvar is FIFO fair.


Thank you,
Amine Moulay Ramdane.





==============================================================================
TOPIC: Concurrent FIFO queue...
http://groups.google.com/group/comp.programming.threads/t/0a68fb555b8fc66e?hl=en
==============================================================================

== 1 of 1 ==
Date: Sun, Oct 13 2013 1:26 pm
From: aminer



Hello,

I have thought about it more, and I have come up with a solution
for a concurrent FIFO queue that satisfies several requirements: it
minimizes the cache-coherence traffic efficiently, it is
FIFO fair (it uses FIFO fair locks), and it is energy efficient on the
pop() side: when there are no items in the queue it does not spin-wait
but waits on a manual-reset event object, which is energy efficient.

I have benchmarked it and it gives 1.65 million pop() operations per
second and 0.8 million push() operations per second.

Here is the FIFO queue; I will soon put it on my website.


{****************************************************************************
 *
 *  FIFO MPMC Queue
 *
 *  Language:          FPC Pascal v2.2.0+ / Delphi 5+
 *  Required switches: none
 *  Authors:           Amine Moulay Ramdane
 *  Version:           1.0
 *
 *  Send bug reports and feedback to aminer @@ videotron @@ ca
 *  You can always get the latest version/revision of this package from
 *  http://pages.videotron.com/aminer/
 *
 *  Description: Algorithm to handle a FIFO MPMC queue
 *
 *  This program is distributed in the hope that it will be useful,
 *  but WITHOUT ANY WARRANTY; without even the implied warranty of
 *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
 *
 *  BEGIN LICENSE BLOCK
 *
 ****************************************************************************}

{ changelog
  v.1.0
}


unit FIFOQUEUE_MPMC;


interface
{$IFDEF FPC}
{$ASMMODE intel}
{$ENDIF}


uses
LW_ALOCK,SpinLock,SemaCondvar,msync,syncobjs,sysutils;

{$I defines.inc}

type

{$IFDEF CPU64}
long = qword;
{$ENDIF CPU64}
{$IFDEF CPU32}
long = longword;
{$ENDIF CPU32}


tNodeQueue = tObject;
typecache1 = array[0..15] of longword;

// TLockfree_MPMC = class(TFreelist)
TFIFOQUEUE_MPMC = class
private
tail:longword;
tmp1:typecache1;
head: long;
fMask : long;
fSize : long;
tab : array of tNodeQueue;
lock1,lock2:TALOCK;
event:TSimpleEvent;
lock3:TSpinlock;
count1:long;
procedure setobject(lp : long;const aobject : tNodeQueue);
function getLength:long;
function getSize:long;
function getObject(lp : long):tNodeQueue;
public
constructor create(aPower : long =20); {allocate tab with size
equal 2^aPower, for 20 size is equal 1048576}
destructor Destroy; override;
function push(tm : tNodeQueue):boolean;
function pop(var obj:tNodeQueue):boolean;
property length : long read getLength;
property count: long read getLength;
property size : long read getSize;

end;


implementation

{$IF defined(CPU64) }
function LockedCompareExchange(CompareVal, NewVal: long; var Target:
long): long; overload;
asm
mov rax, rcx
lock cmpxchg [r8], rdx
end;
{$IFEND}
{$IF defined(CPU32) }
function LockedCompareExchange(CompareVal, NewVal: long; var
Target:long): long; overload;
asm
lock cmpxchg [ecx], edx
end;
{$IFEND}


function CAS(var Target:long;Comp ,Exch : long): boolean;
var ret:long;
begin

ret:=LockedCompareExchange(Comp,Exch,Target);
if ret=comp
then result:=true
else result:=false;

end; { CAS }



function LockedIncLong(var Target: long): long;
asm
{$IFDEF CPU32}
// --> EAX Target
// <-- EAX Result
MOV ECX, EAX
MOV EAX, 1
//sfence
LOCK XADD [ECX], EAX
inc eax
{$ENDIF CPU32}
{$IFDEF CPU64}
// --> RCX Target
// <-- EAX Result
MOV rax, 1
//sfence
LOCK XADD [rcx], rax
INC rax
{$ENDIF CPU64}
end;

function LockedDecLong(var Target: long): long;
asm
{$IFDEF CPU32}
// --> EAX Target
// <-- EAX Result
MOV ECX, EAX
MOV EAX, -1
//sfence
LOCK XADD [ECX], EAX
dec eax
{$ENDIF CPU32}
{$IFDEF CPU64}
// --> RCX Target
// <-- EAX Result
MOV rax, -1
//sfence
LOCK XADD [rcx], rax
dec rax
{$ENDIF CPU64}
end;

constructor TFIFOQUEUE_MPMC.create(aPower : long );
begin
// aPower is unsigned, so the original (aPower < 0) test could never fire;
// bound it to a sensible power-of-two range instead.
if (aPower < 1) or (aPower > 30)
then
begin
writeln('Constructor''s argument incorrect');
halt;
end;

{$IFDEF CPU64}
fMask:=not($FFFFFFFFFFFFFFFF shl aPower);
{$ENDIF CPU64}
{$IFDEF CPU32}
fMask:=not($FFFFFFFF shl aPower);
{$ENDIF CPU32}

fSize:=(1 shl aPower);
setLength(tab,1 shl aPower);
tail:=0;
head:=0;
lock1:=TALOCK.create(100);
lock2:=TALOCK.create(100);
lock3:=TSpinlock.create;
event:=TSimpleEvent.create;
count1:=0;

end;

destructor TFIFOQUEUE_MPMC.Destroy;

begin
lock1.free;
lock2.free;
lock3.free;
event.free;
setLength(tab,0);
inherited Destroy;
end;


procedure TFIFOQUEUE_MPMC.setObject(lp : long;const aobject : tNodeQueue);
begin
tab[lp and fMask]:=aObject;
end;

function TFIFOQUEUE_MPMC.getObject(lp : long):tNodeQueue;
begin
result:=tab[lp and fMask];
end;


function TFIFOQUEUE_MPMC.push(tm : tNodeQueue):boolean;//stdcall;
begin

lock1.enter;
result:=true;
if getlength >= fsize
then
begin
result:=false;
lock1.leave;
exit;
end;

setObject(tail,tm);
tail:=(tail+1);
lock3.enter;
inc(count1);
event.setevent;
lock3.leave;
lock1.leave;
end;


function TFIFOQUEUE_MPMC.pop(var obj:tNodeQueue):boolean;

begin
if self.count=0
then
begin
event.waitfor(INFINITE);
lock3.enter;
if count1=0 then event.resetevent;
lock3.leave;
end;

lock2.enter;

if tail<>head
then
begin
obj:=getObject(head);
head:=(head+1);
result:=true;
lock3.enter;
dec(count1);
lock3.leave;
lock2.leave;
exit;
end
else
begin
result:=false;
lock2.leave;

end;
end;


function TFIFOQUEUE_MPMC.getLength:long;
var head1,tail1:long;
begin
head1:=head;
tail1:=tail;
if tail1 < head1
then result:= (High(long)-head1)+(1+tail1)
else result:=(tail1-head1);
end;

function TFIFOQUEUE_MPMC.getSize:long;

begin
result:=fSize;
end;

end.
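The unit above hand-codes CAS and LOCK XADD in x86 assembly; for
comparison, the same two primitives in portable C++ are one-liners on
std::atomic (the names here are mine, not from the unit):

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

using ulong_t = std::uint64_t;

// Equivalent of the unit's CAS(): on x86 this compiles to LOCK CMPXCHG.
inline bool cas(std::atomic<ulong_t>& target, ulong_t comp, ulong_t exch) {
    return target.compare_exchange_strong(comp, exch);
}

// Equivalent of LockedIncLong(): fetch_add compiles to LOCK XADD, and
// the +1 mirrors the "inc eax" after the XADD (returns the new value).
inline ulong_t locked_inc(std::atomic<ulong_t>& target) {
    return target.fetch_add(1) + 1;
}
```

Using std::atomic also frees the code from the CPU32/CPU64 register
conditionals that the assembly version needs.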






==============================================================================
TOPIC: How do I look inside an .exe file to view the programming
http://groups.google.com/group/comp.programming.threads/t/70ea90ac4a00188c?hl=en
==============================================================================

== 1 of 1 ==
Date: Tues, Oct 15 2013 8:32 am
From: rush8y@googlemail.com


As this is the top link returned when searching for something similar to the OP, you will find this tool very useful:

http://www.jetbrains.com/decompiler/features/





==============================================================================
TOPIC: Concurrent FIFO Queue version 1.0
http://groups.google.com/group/comp.programming.threads/t/7a3901f20c6b0023?hl=en
==============================================================================

== 1 of 1 ==
Date: Tues, Oct 15 2013 2:17 pm
From: aminer



Hello,

I have implemented a concurrent FIFO Queue that satisfies several
requirements: it is FIFO fair, it minimizes the cache-coherence
traffic efficiently, and it is energy efficient on the pop() side:
when there are no items in the queue it does not spin-wait, but waits
on a portable manual-reset event object. It is not as fast as lockfree
algorithms, because if you want FIFO fairness, efficient minimization
of the cache-coherence traffic and energy efficiency, you have to pay
a price: lower speed. But the speed of my concurrent FIFO Queue is
good, I think: I have benchmarked it on a quad-core processor with
each core running at 1.6 GHz, with 4 threads pushing and 4 threads
popping, and it gives 1 million transactions (push and pop) per
second, which is also good.

You can download my concurrent FIFO Queue from:

http://pages.videotron.com/aminer/



Thank you,
Amine Moulay Ramdane.









==============================================================================
TOPIC: Lock and Lockfree...
http://groups.google.com/group/comp.programming.threads/t/4fc87d01927bc106?hl=en
==============================================================================

== 1 of 2 ==
Date: Tues, Oct 15 2013 4:30 pm
From: aminer



Hello,


I have come across an interesting subject: I have compared
a FIFO queue that uses two spinlocks with exponential backoff
against a lockfree FIFO queue that does not use exponential backoff,
and I have come to the conclusion that the spinlock with backoff is
6.5X faster than the lockfree version without backoff. To understand
why, take a look at the Enter() method of a spinlock with exponential
backoff:

---
procedure TSpinLock.Enter;

var t:integer;

begin

backoff.reset;
repeat

if CAS(@FCount2^.FCount2,0,1) then break;
t:=backoff.delay;
sleep(t);
until false;
end;

---

As you will have noticed, if for example 4 threads execute the CAS,
one of them succeeds and the other 3 back off. What is very important
to understand is that the thread that succeeded gets the opportunity
to re-enter the locked region many times, modifying FCount2^.FCount2
in its local cache, and this makes it very fast, in fact 6.5X faster
than the lockfree version without backoff, because in a lockfree
algorithm every thread takes a cache miss and reloads the cache line
from memory, which is slower than a spinlock with backoff. This is why
the spinlock with backoff is much faster and also minimizes the
cache-coherence traffic under contention. So what is the solution?
I think lockfree FIFO queue algorithms must use an exponential backoff
mechanism to be much faster and to minimize the cache-coherence
traffic.
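The pattern described above is a test-and-test-and-set spinlock with
truncated exponential backoff; a C++ sketch (the backoff bounds are my
own choice, not taken from the post):

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// TTAS spinlock with capped exponential backoff: the plain load spins
// in the local cache until the lock looks free, and the growing sleep
// spreads retries out, which keeps the coherence traffic down.
class BackoffSpinLock {
    std::atomic<bool> locked{false};
public:
    void enter() {
        int delay_us = 1;                        // backoff.reset
        for (;;) {
            // test-and-test-and-set: read before attempting the RMW
            if (!locked.load(std::memory_order_relaxed) &&
                !locked.exchange(true, std::memory_order_acquire))
                return;
            std::this_thread::sleep_for(std::chrono::microseconds(delay_us));
            if (delay_us < 1024) delay_us *= 2;  // exponential, capped
        }
    }
    void leave() { locked.store(false, std::memory_order_release); }
};
```

The relaxed read in the loop is what lets waiting threads spin on a
shared cache line instead of hammering it with LOCK-prefixed writes.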



I hope you have understood what I mean in this post.



Thank you,
Amine Moulay Ramdane.















== 2 of 2 ==
Date: Tues, Oct 15 2013 4:38 pm
From: aminer




I mean 6.5X faster under contention, of course.



Thank you,
Amine Moulay Ramdane.


On 10/15/2013 7:30 PM, aminer wrote:
>
> Hello,
>
>
> I have come to an interresting subject, i have compared
> a FIFO queue that uses two spinlocks with an exponentaial backoff
> and a lockfree FIFO queue that does not use an exponential backoff, i
> have come to the conclusion
> that the Spinlock with a backoff is 6.5X times faster than
> the lockfree version without a backoff, and you have to be smart and
> understand why.. i will explain it to you, if you take a look at a
> Spinlock with an exponentaial backoff , you will read this in the
> Enter() method:
>
> ---
> procedure TSpinLock.Enter;
>
> var t:integer;
>
> begin
>
> backoff.reset;
> repeat
>
> if CAS(@FCount2^.FCount2,0,1)then break;
> t:=backoff.delay;
> sleep(t);
> until false;
> end;
>
> ---
>
> As you have noticed if for example 4 threads will execute the CAS
> one of them will succeed and 3 of them will backoff , but what
> is very important to know, is that the thread that have succeeded
> will have the opportunity to reenter many times the locked region
> and modifying FCount2^.FCount2 in its local cache, so this will make
> it very fast, in fact 6.5X times faster than the lockfree version
> without backoff, cause in lockfree algorithms every thread must have a
> cache miss and reload the cache line from the memory and this
> is slower than a Spinlock with a backoff, so this is why the Spinlock
> with a backoff is very much faster and it minimizes also the
> cache-coherence traffic under contention, so what is the solution then?
> i think that the lockfree FIFO queue algorithms must use an exponential
> backoff mechanism to be much faster and to be able to minimize the
> cache-coherence traffic.
>
>
>
> Hope you have undertood what i mean in this post..
>
>
>
> Thank you,
> Amine Moulay Ramdane.
>
>
>
>
>
>
>
>
>
>
>






==============================================================================
TOPIC: Spinlock and lockfree and waitfree...
http://groups.google.com/group/comp.programming.threads/t/624873836709bf90?hl=en
==============================================================================

== 1 of 3 ==
Date: Tues, Oct 15 2013 5:03 pm
From: aminer



Hello,

I have come to an interresting subject, and you have
to be smart to understand it more...

If you take a look at the following FIFO queue that is
waitfree on the push() side and lockfree on the pop()
side, here it's:

http://pastebin.com/f72cc3cc1

as i said before , there is a weaknesses with this FIFO queue, first
the lockfree side on the pop() is not FIFO fair , so it's not
starvation-free on the pop() side, and there is a second weakness:
it doesn't minimizes the cache-coherence traffic ,
but there is also a third weakness with this algorithm: since
you can not use an exponential backoff on the waitfree pop() side
you can not make it 4 times or 6 times faster under contention on the
push() side,
the Spinlock() with an exponential backoff mechanism on the pop()
and the push() is 4 times to 6 times faster than the lockfree
version and the waitfree version without a backoff mechanism and
i have just explain it to you,
but you can add an exponential backoff on the pop() lockfree side of
this FIFO que,
and this will make it 6 times faster under contention on the pop() side.


So in my opinion , since the Spinlock with an exponential backoff is
6 times faster under contention , so the risk of a stavartion will
be lowered, and since it minimizes the cache-coherence traffic, so this
will make the Spinlock with an exponential backoff
a better alternative if you want better speed and to minimize
cache-coherence traffic.



Thank you,
Amine Moulay Ramdane.




== 2 of 3 ==
Date: Tues, Oct 15 2013 6:35 pm
From: aminer




Hello,


The 6X speedup is only for the pop() method; for push() the spinlock
with backoff gives the same performance as the lockfree version, which
is why they both have the same push() performance under contention.



Thank you,
Amine Moulay Ramdane.




== 3 of 3 ==
Date: Thurs, Oct 17 2013 5:10 pm
From: "Chris M. Thomasson"


"aminer" wrote in message news:l3kl4m$ioi$1@news.albasani.net...

[...]

> So in my opinion , since the Spinlock with an exponential backoff is
> 6 times faster under contention , so the risk of a stavartion will
> be lowered, and since it minimizes the cache-coherence traffic, so this
> will make the Spinlock with an exponential backoff
> a better alternative if you want better speed and to minimize
> cache-coherence traffic.

One tiny point:

A lockfree algorithm can make progress on every failure of the condition.

A spinlock cannot.

;^/





==============================================================================
TOPIC: Threadpool and Threadpool with priorities were updated
http://groups.google.com/group/comp.programming.threads/t/6b3b2dc79e93d1f5?hl=en
==============================================================================

== 1 of 2 ==
Date: Fri, Oct 18 2013 10:50 am
From: aminer


Hello,

The Threadpool and the Threadpool with priorities were
updated: I have used my new concurrent FIFO queue
inside them to minimize the cache-coherence traffic efficiently and to
make them FIFO fair, and I have updated all my other libraries,
such as my Parallel archiver etc.
Please download them again if you need them.


You can download my updated libraries from:

http://pages.videotron.com/aminer/



Thank you,
Amine Moulay Ramdane.




== 2 of 2 ==
Date: Fri, Oct 18 2013 10:57 am
From: aminer



Hello,


All my libraries and programs are stable now, so if you need one of my
programs or libraries you can download them from my website:


http://pages.videotron.com/aminer/



Thank you,
Amine Moulay Ramdane.








==============================================================================
TOPIC: Here is again my Proof
http://groups.google.com/group/comp.programming.threads/t/1b195ecfbb07dead?hl=en
==============================================================================

== 1 of 1 ==
Date: Fri, Oct 18 2013 11:58 am
From: blmblm@myrealbox.com


In article <l36kri$3t3$1@news.albasani.net>, aminer <aminer@toto.net> wrote:
>
> I wrote:
> > Hello,
> >
> > I think i know what is my mistake: the serial part in
> > the Amdahl's law is not the critical section, you must not
> > confuse the two.
>

Exactly. The serial part is the part that must be done exactly once
and cannot be somehow split up among multiple threads or processes.
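For concreteness, Amdahl's law with serial fraction s and N workers is
speedup(N) = 1 / (s + (1 - s) / N); a one-line sketch (the function
name is mine):

```cpp
#include <cassert>
#include <cmath>

// Amdahl's law: s is the fraction that must run serially (exactly
// once), and (1 - s) is the part that divides evenly across n workers.
double amdahl_speedup(double s, double n) {
    return 1.0 / (s + (1.0 - s) / n);
}
```

Even a 10% serial part caps the speedup below 10 no matter how many
cores you add, which is why conflating the serial part with time spent
inside a critical section (which depends on contention) goes wrong.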


>
> But, Amine, how can you say that the serial part of the Amdahl's law is
> not
> the critical section ?


Having got it right once (above), why now go off the rails again
(below) ....

(And why start a new thread for each new thought, rather than keeping
the whole discussion in one thread?)



> In my example if you don't have context switches , and you have the same
> number of threads than the number of cores, and the time inside the
> critical
> section is constant and the time inside the parallel part is constant,
> so the Amdahl's law can predict the worst case contention scenario ,
> that means when all the threads are contending for the same critical
> section, hence you can predict the worst case contention scenario by
> just calculating the time inside the critical section and the time
> inside the parallel part and doing the Amdahl calculation, this will
> give you the
> exact result for the worst case contention scenario, so as you have
> noticed the serial part of the Amdahl equation is not only the critical
> section, it's the critical section with a context, so you have to take
> into consideration also the context, that means the context of the worst
> case contention scenario.
>
>
> Thank you,
> Amine Moulay Ramdane.
>
>
>
[ snip ]

--
B. L. Massingill
ObDisclaimer: I don't speak for my employers; they return the favor.





==============================================================================
TOPIC: ParallelVarFiler version 1.17
http://groups.google.com/group/comp.programming.threads/t/a5a073dcd9c9e9d4?hl=en
==============================================================================

== 1 of 3 ==
Date: Sat, Oct 19 2013 8:27 am
From: aminer


Hello,

ParallelVarFiler was updated to version 1.17; from now on you don't
need to call LoadData().

Authors: Amine Moulay Ramdane and Peter Johansson

A parallel variable filer and streamer for Delphi and FreePascal that
can also be used as a parallel hashtable. It uses ParallelHashList
(a parallel hashtable) with O(1) best-case and O(log(n)) worst-case
access, built on lock striping and lightweight
MREWs (multiple-readers-exclusive-writer); this allows multiple
threads to write and read concurrently. ParallelHashList also
maintains an independent counter of the number of entries for each
segment of the hashtable, with a lock per counter, again for better
scalability.
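The lock-striping scheme described above can be sketched as follows in
C++ (an illustration of the idea only, not ParallelHashList's actual
layout):

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <unordered_map>

// The table is split into segments, each with its own mutex and its
// own entry counter, so writers that hash to different segments never
// contend, and counting entries never needs one global lock.
class StripedMap {
    static const size_t kSegments = 16;
    struct Segment {
        std::mutex m;
        std::unordered_map<std::string, int> map;
        size_t entries = 0;              // per-segment counter, as described
    };
    Segment segs[kSegments];
    Segment& segment_for(const std::string& k) {
        return segs[std::hash<std::string>{}(k) % kSegments];
    }
public:
    void put(const std::string& k, int v) {
        Segment& s = segment_for(k);
        std::lock_guard<std::mutex> lk(s.m);
        if (s.map.emplace(k, v).second) ++s.entries;  // new key
        else s.map[k] = v;                            // update in place
    }
    bool get(const std::string& k, int& out) {
        Segment& s = segment_for(k);
        std::lock_guard<std::mutex> lk(s.m);
        auto it = s.map.find(k);
        if (it == s.map.end()) return false;
        out = it->second;
        return true;
    }
    size_t count() {                     // sum the per-segment counters
        size_t n = 0;
        for (auto& s : segs) {
            std::lock_guard<std::mutex> lk(s.m);
            n += s.entries;
        }
        return n;
    }
};
```

Two threads writing keys that land in different segments take
different mutexes, which is the scalability win lock striping buys.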

Description:

Collect different types of variables and save them into one file or
stream. TParallelVarFiler reads and writes on files, streams and
Strings. You can collect Strings, Integers and Dates, in fact everything
that can be stored in a variant. In addition you can collect Streams.

You can also use ParallelVarFiler to send parameters over a network.
It can be used, for example, to share data via any IPC mechanism.

And please look at test.pas and test1.pas a parallel variable filer
examples - compile and execute them...

Now you can use ParallelVarFiler as a small-to-medium database. I have
added the Clean() and DeletedItems() methods and changed the
constructor: you can now pass a file name to the constructor, and the
data will be written, updated and deleted there; if you pass an empty
file name, the data is updated, written and deleted only in memory.
When you use a file name in the constructor, many readers and one
writer can proceed concurrently; when you pass an empty file name,
many writers and many readers can proceed concurrently in memory.
Reading a data item uses only the parallel hashtable in memory; the
hard disk is not touched.

And when you pass a file name to the constructor and delete or update
items, those items are marked deleted in the file. You can use
DeletedItems() to see how many data items are marked deleted in the
file, and then use the Clean() method to remove those data items from
the file completely.

How can you store multiple data items inside a database row and map
them to a key?

You can simply use Parallel Variable Filer for that: stream your
multiple data variants and map them to a key...

As you know, ParallelVarFiler uses ParallelHashList (a parallel
hashtable that scales on multicores), so when you use the GetKeys()
method and want to fetch the data for those keys, don't forget to test
that the variant is not empty with the VarIsEmpty() function; and when
you use GetStream() you can test whether the data exists by checking
the boolean return value of GetStream().

ParallelVarFiler is easy to learn; I have documented all the methods,
so please read about them inside ParallelVarFiler.pas.

Now ParallelVarFiler is fault tolerant to power failures etc. I have
run a simulation of power failures and data-file damage, and
ParallelVarFiler recovers from both. When there has been a power
failure or the program has crashed and you want to recover from damage
to your data file, you only have to call the LoadData() method.

As you know, ParallelVarFiler uses ParallelHashList (a parallel
hashtable), but ParallelHashList is cache-unfriendly since it uses a
hashing mechanism, and both ParallelVarFiler and ParallelHashList are
memory bound. Since ParallelHashList is cache-unfriendly and memory
bound, I don't think ParallelVarFiler or ParallelHashList will scale
very well. To avoid this problem you have to use something like
replication across computers over TCP/IP, running your database on
each computer and load balancing between them; but even if replication
and load balancing make the memory and hard disk truly parallel, they
will not avoid the network-bandwidth limitation. Still,
ParallelVarFiler and ParallelHashList scale to a certain point, they
are fast, and they are still useful.

Please see the test.pas example inside the zip file to see how i am
using it...

But please read the following:

This software is provided on an "as-is" basis, with no warranties,
express or implied. The entire risk and liability of using it is yours.
Any damages resulting from the use or misuse of this software will be
the responsibility of the user.

please look at the test.pas and test1.pas example inside the zip file to
see how to use ParallelVarFiler...


You can download ParallelVarFiler from:

http://pages.videotron.com/aminer/

Here is the methods that have been implemented:

PUBLIC METHODS:

constructor Create(file1:string;size,mrews:integer;casesensitive:boolean);

- Creates a new VarFiler ready to use, with the given size, with mrews
as the number of MREWs (multiple-readers-exclusive-writer) and with
casesensitive selecting case-sensitive keys. The number of MREWs must
be less than or equal to the hashtable size. file1 is the file to
write to; if file1 is an empty string the data is written only to
memory, and you can use SaveToFile() to write it to a file.

destructor Destroy;

- Destroys the ParallelVarFiler object and cleans up.

procedure Clear;

- Deletes all contents.

function Clean:boolean

- Cleans the deleted items from the file.

function DeletedItems:integer

- Returns the number of items marked deleted.

function LoadData:boolean

- Loads the data from the file passed to the constructor.

function Delete(Name : String):boolean;

- Deletes the variable with Name.

function Exists(Name : String) : Boolean;

- Returns True if a variable with Name exists

procedure GetKeys(Strings : TstringList);

- Fills up a TStringList descendant with all the keys.

function Count : Integer;

- Returns the number of variables

function Add(Name : String; Value : Variant ):boolean;

- Adds a new variable , given the Name

function AddStream( Name : String;Stream : TStream ):boolean;

- Adds a new Stream, given the Name

function Update(Name : String; Value : Variant):boolean;

- Updates the new variable , given the Name

function UpdateStream(Name : String; Stream : TStream ):boolean;

- Updates a new Stream , given the Name

function GetStream(Name : String;Stream : TStream):boolean;

- Fills up the stream with data

procedure SaveToStream(Stream : TStream);

- Saves all variables, streams and Graphics to a stream.

procedure SaveToFile(FileName : String);

- Saves all variables, streams and Graphics to a file.

procedure SaveToString(out S : String);

- Saves all variables, streams and Graphics to a string.

function LoadFromStream(Stream : TStream):boolean;

- Loads all variables, streams and Graphics from a stream.

function LoadFromFile(FileName : String):boolean;

- Loads all variables, streams and Graphics from a File.

function LoadFromString(S : String):boolean;

- Loads all variables, streams and Graphics from a string.

function Stream2String(Stream1: TStream;var Mystring:string): boolean;

procedure String2Stream(var String2BeSaved:string; Stream1: TStream);

procedure Variant2Stream(var VR:variant; Stream1: TStream);

function Stream2Variant(Stream1: TStream;var VR:variant): boolean;


PUBLIC PROPERTIES:

Items : Variant

- Gets the value (indexed).


Language: FPC Pascal v2.2.0+ and Lazarus / Delphi 7 to 2007:
http://www.freepascal.org/

Operating Systems: Win , Linux and Mac (x86).

Required FPC switches: -O3 -Sd -dFPC -dWin32 -dFreePascal

-Sd for delphi mode....

Required Delphi switches: -DMSWINDOWS -$H+

For Delphi use -DDelphi


Thank you,
Amine Moulay Ramdane.





== 2 of 3 ==
Date: Sat, Oct 19 2013 8:36 am
From: aminer



Hello,

You have to understand ParallelVarFiler:
it is a parallel hashtable that can be saved automatically
or manually to a file, a stream or a string, and it is
fault tolerant to power failures etc.


Hope you have understood the idea.



Thank you,
Amine Moulay Ramdane.








== 3 of 3 ==
Date: Sat, Oct 19 2013 8:40 am
From: aminer



I will add..

You have to understand ParallelVarFiler:
it is a parallel hashtable that can be saved automatically or manually
to a file, a stream or a string, can be restored from a file, a stream
or a string back into the in-memory hashtable, and is
fault tolerant to power failures etc.


Hope you have understood the idea.



Thank you,
Amine Moulay Ramdane.





==============================================================================
TOPIC: ParallelVarFiler version 1.22
http://groups.google.com/group/comp.programming.threads/t/10ed747cbfc667d1?hl=en
==============================================================================

== 1 of 3 ==
Date: Mon, Oct 21 2013 8:24 am
From: aminer


Hello,


I have updated ParallelVarFiler to version 1.22:
SaveToStream(), SaveToString() and SaveToFile() are now
thread-safe, but you must pass a filename to the constructor
before using them.


You can download ParallelVarFiler version 1.22 from:

http://pages.videotron.com/aminer/


Thank you,
Amine Moulay Ramdane




== 2 of 3 ==
Date: Mon, Oct 21 2013 8:39 am
From: aminer



Hello all,

I wrote ParallelVarFiler before writing the scalable Anderson
array-based lock. You have to understand that under Windows the
critical section is not FIFO-fair, so it is not starvation-free;
in the next version, 1.23, of ParallelVarFiler I will replace
that critical section with the scalable Anderson array-based lock.


Thank you,
Amine Moulay Ramdane.




== 3 of 3 ==
Date: Mon, Oct 21 2013 8:54 am
From: aminer



Hello,


ParallelVarFiler has been updated to version 1.23: it now uses the scalable
Anderson array-based lock, so the lock is now starvation-free.


You can download ParallelVarFiler 1.23 from:

http://pages.videotron.com/aminer/


Thank you,
Amine Moulay Ramdane.








==============================================================================
TOPIC: About my parallel libraries...
http://groups.google.com/group/comp.programming.threads/t/e71cb05470c8cd13?hl=en
==============================================================================

== 1 of 1 ==
Date: Mon, Oct 21 2013 9:18 am
From: aminer


Hello,


You have to understand me: I designed Parallel Archiver, the parallel
compression library, ParallelHashList, ParallelVarFiler, and the
scalable locks and scalable RWLocks because I wanted to contribute to
the FreePascal and Lazarus community. Ludob criticized ParallelVarFiler
and said that there were still some bugs in it, but I think I have
corrected those bugs, and I hope you will enjoy my parallel libraries.
If you ask me what I really love about my Parallel Archiver: it has
O(1) access in Add(), Delete(), Extract() etc., you can also use it as
a hash table, with both compression and AES encryption, and I think it
is stable now; that is why I love it. If you ask me why I designed
those scalable RWLocks and scalable locks: the RWLock I designed is
scalable, and the locks I implemented are scalable and FIFO-fair, so
they are also starvation-free; that is what I love about them.


So I hope you will enjoy my parallel libraries.

You can download all my parallel libraries and programs from:

http://pages.videotron.com/aminer/


Thank you,
Amine Moulay Ramdane.





==============================================================================
TOPIC: Scalable Locks...
http://groups.google.com/group/comp.programming.threads/t/a5a22f4fc887c2e7?hl=en
==============================================================================

== 1 of 1 ==
Date: Mon, Oct 21 2013 10:02 am
From: aminer


Hello all,


Why not just use the Windows critical section? You have to understand
that the Windows critical section is not starvation-free and not
FIFO-fair, and it does not efficiently minimize cache-coherence
traffic, because it uses a backoff mechanism. Since it uses backoff,
the same thread gets the opportunity to re-enter the same critical
section many times in a row, and that is what happens with the Windows
critical section. As you have noticed, this is fast, because the same
thread reuses the same variables from its local cache, but it is
neither FIFO-fair nor starvation-free. That is why I decided to
contribute to the FreePascal and Lazarus community and implemented
scalable locks that efficiently minimize cache-coherence traffic and
that are FIFO-fair and starvation-free.

And you can download them from my website:

http://pages.videotron.com/aminer/


Thank you,
Amine Moulay Ramdane.









==============================================================================
TOPIC: scalable RWLock
http://groups.google.com/group/comp.programming.threads/t/e6a38fcb3c4fb23b?hl=en
==============================================================================

== 1 of 1 ==
Date: Mon, Oct 21 2013 10:26 am
From: aminer



Hello again my dear FreePascal and Lazarus community..


Question:

Amine, what about the scalable RWLock that you have implemented?

Here it is:

http://pages.videotron.com/aminer/rwlock1.html


You have to understand something: take a look at OmniThread from Gabr,
which was implemented for the Delphi community; here it is:

http://otl.17slon.com/


If you download the OmniThread zip file and take a look at the
lock-free RWLock that Gabr wrote, called TMREW, you will notice that
this RWLock uses a CAS on a global variable, which is expensive. So
for TMREW to scale, the time spent in the RWLock's read section must
be bigger than the time to execute those CASes on a global variable
with cache misses, and that is not good. This is why I implemented my
scalable RWLock, which scales even when the RWLock's read section is
small.

You can download my scalable RWLock from:

http://pages.videotron.com/aminer/rwlock1.html


Thank you,
Amine Moulay Ramdane.




==============================================================================

You received this message because you are subscribed to the Google Groups "comp.programming.threads"
group.

To post to this group, visit http://groups.google.com/group/comp.programming.threads?hl=en

To unsubscribe from this group, send email to comp.programming.threads+unsubscribe@googlegroups.com

To change the way you get mail from this group, visit:
http://groups.google.com/group/comp.programming.threads/subscribe?hl=en

To report abuse, send email explaining the problem to abuse@googlegroups.com

==============================================================================
Google Groups: http://groups.google.com/?hl=en