comp.programming.threads
http://groups.google.com/group/comp.programming.threads?hl=en
comp.programming.threads@googlegroups.com
Today's topics:
* Open issues in the specification for fork()? - 8 messages, 3 authors
http://groups.google.com/group/comp.programming.threads/t/e219574ddde7649b?hl=en
* How do I look inside an .exe file to view the programming - 1 message, 1 author
http://groups.google.com/group/comp.programming.threads/t/70ea90ac4a00188c?hl=en
* How to view what commands are run by an exe file as they are running - 1 message, 1 author
http://groups.google.com/group/comp.programming.threads/t/826e0e9502290d02?hl=en
* Data copying on NUMA - 3 messages, 3 authors
http://groups.google.com/group/comp.programming.threads/t/89bfa62ecdef1706?hl=en
* ParallelVarFiler 1.31 - 2 messages, 1 author
http://groups.google.com/group/comp.programming.threads/t/ace1a33c8920491a?hl=en
* Scalable RWLOCK 1.15 - 1 message, 1 author
http://groups.google.com/group/comp.programming.threads/t/e330bec654423093?hl=en
* Queuing theory... - 4 messages, 1 author
http://groups.google.com/group/comp.programming.threads/t/853742404d1ba2a6?hl=en
* Threadpool with priorities version 1.54 - 1 message, 1 author
http://groups.google.com/group/comp.programming.threads/t/3a765b03ab36c1b3?hl=en
* Concurrent FIFO Queue 1.01 - 1 message, 1 author
http://groups.google.com/group/comp.programming.threads/t/73de55ea89ca5de7?hl=en
* An M/M/n queuing model simulation - 3 messages, 1 author
http://groups.google.com/group/comp.programming.threads/t/a4d462b8599c98e1?hl=en
==============================================================================
TOPIC: Open issues in the specification for fork()?
http://groups.google.com/group/comp.programming.threads/t/e219574ddde7649b?hl=en
==============================================================================
== 1 of 8 ==
Date: Mon, Nov 25 2013 7:52 am
From: Rainer Weikusat
Casper H.S. Dik <Casper.Dik@OrSPaMcle.COM> writes:
> Rainer Weikusat <rweikusat@mobileactivedefense.com> writes:
>
>>This is all about the people who wrote the pthreads-specification
>>fighting the other tribe of people who are Violently(!!1) opposed to
>>multi-threading by purposely not defining any sensible semantics for
>>fork in multi-threaded processes. Consequently, the UNIX(*) standard is
>>useless here and best ignored: For any existing platform, the behaviour
>>is necessarily defined and code can be written to work in such an
>>environment.
>
> I think this is completely unfair: those who worked on the
> implementation and those who wrote the standard knew exactly the
> two choices they had; they picked one of the two choices and you
> seem to disagree with that particular choice.
I 'disagree' with this here:
There are two reasons why POSIX programmers call fork(). One
reason is to create a new thread of control within the same
program (which was originally only possible in POSIX by creating
a new process);
[...]
When a programmer is writing a multi-threaded program, the first
described use of fork(), creating new threads in the same
program, is provided by the pthread_create() function. The
fork() function is thus used only to run new programs,
because while this is not an outright lie, it's a carefully
weasel-worded piece of disinformation supposed to lend some appearance
of rationality to the entirely political bias of the author: Fork does
not create 'a new thread of control running the same program' (which has
a process chained to its ankle for some shitty, historical reason nobody
in his right mind could possibly understand) but it creates a new
process running the same program and this includes creating a new thread
of control among some other things, most prominently, a new address
space.
There's also the practical concern that adding a new thread to some
existing code unleashes 'immediate pthread madness' which is supposed to
change the semantics of the working code retroactively: Something which
used to be a perfectly legitimate use of fork is now considered to have
undefined behaviour, regardless of how the interactions between the
existing 'single-threaded process code' and the code running in the new
thread actually happen to work. This means 'using pthreads' is meant to
be an all-or-nothing choice: Either the process remains strictly
single-threaded. Or everything has to be rewritten to be 'strictly
pthreaded' insofar as this is possible (which might not be the case since
the code could rely on properties of fork pthread_create does not have)
and this looks a lot like a useless effort to me, IOW, 'pthreads' trying
to punish me for using 'traditionally written UNIX(*) code'.
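For reference, the one fork() pattern that the standard does leave
well-defined for multi-threaded processes is fork() followed promptly by
exec, with the child restricted to async-signal-safe calls in between.
A minimal C sketch; the /usr/bin/worker path is made up for illustration:

#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();
    if (pid == -1) {
        perror("fork");
        return EXIT_FAILURE;
    }
    if (pid == 0) {
        /* child: only async-signal-safe calls until exec */
        char *args[] = { "worker", NULL };
        execv("/usr/bin/worker", args);
        _exit(127);        /* exec failed; _exit is async-signal-safe */
    }
    waitpid(pid, NULL, 0); /* parent: reap the child */
    return 0;
}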
== 2 of 8 ==
Date: Mon, Nov 25 2013 8:55 am
From: Casper H.S. Dik
Rainer Weikusat <rweikusat@mobileactivedefense.com> writes:
>I 'disagree' with this here:
> There are two reasons why POSIX programmers call fork(). One
> reason is to create a new thread of control within the same
> program (which was originally only possible in POSIX by creating
> a new process);
The changes for the threads didn't change how fork() worked or
didn't work.
Even before threads were common, fork() had many, many issues, such
as fork()ing the whole state of the stdio library, or the atexit()
handlers. Using fork() to create a second thread of control in the
same program was always full of risk.
You almost sound like someone who laments the loss of Linux' clone
model to the pthread model.
It is true, I think, that starting threads from a library is
frowned upon, except if these threads are extremely well-behaved
(and that might include not calling malloc() after control has
returned to the main thread).
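A minimal sketch of the usual countermeasure for library state across
fork(): pthread_atfork() handlers that hold the library's lock over the
fork, so the child doesn't inherit it in a locked state. The lib_lock
mutex here is a made-up example:

#include <pthread.h>

static pthread_mutex_t lib_lock = PTHREAD_MUTEX_INITIALIZER;

static void prepare(void)   { pthread_mutex_lock(&lib_lock); }   /* before fork   */
static void in_parent(void) { pthread_mutex_unlock(&lib_lock); } /* after, parent */
static void in_child(void)  { pthread_mutex_unlock(&lib_lock); } /* after, child  */

/* call once from the library's initialization path */
static void lib_init(void)
{
    pthread_atfork(prepare, in_parent, in_child);
}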
Casper
== 3 of 8 ==
Date: Mon, Nov 25 2013 9:36 am
From: Rainer Weikusat
Casper H.S. Dik <Casper.Dik@OrSPaMcle.COM> writes:
> Rainer Weikusat <rweikusat@mobileactivedefense.com> writes:
>
>
>>I 'disagree' with this here:
>
>> There are two reasons why POSIX programmers call fork(). One
>> reason is to create a new thread of control within the same
>> program (which was originally only possible in POSIX by creating
>> a new process);
>
> The changes for the threads didn't change how fork() worked or
> didn't work.
>
> Even before threads were common, fork() had many, many issues, such
> as fork()ing the whole status of the stdio library; atexit() handlers.
>
> fork() as a second thread in the same program was always full of
> risk.
>
> You almost sound like someone who laments the loss of Linux' clone
> model to the pthread model.
I wrote something specific about the whole 'semantic unit of text' the
first paragraph of which you quoted above and about another problem with
the pthreads 'take no prisoners' approach. I don't see how the contents
of your text relate to that. I'm also not aware of any 'losses'
affecting the Linux clone system call which were caused by adding the
functionality necessary for supporting the pthreads signal handling
model (which, while sensible in itself, is another 'disaster as
designed' because it runs counter to the intuition of a lot of people,
who presumably use dedicated signal-handling threads, thus limiting the
scalability of their code, even though there's no real reason for that
except that they don't understand the purpose of the pthreads signal
handling approach and consider it "scary and dangerous" because of
that).
While I don't consider stdio particularly useful, I wouldn't go so far
as to label it 'risky on UNIX(*)', and the same is true for 'atexit
handlers'. Using binary-only code with unknown behaviour is 'risky', but
even these dangers can be dealt with.
== 4 of 8 ==
Date: Mon, Nov 25 2013 10:46 am
From: Drazen Kacar
Casper H.S. Dik wrote:
> If you didn't link with -lthread or -lpthread, you got stubs for many
> of the thread calls (such as mutex primitives) so a library could be
> written as if it was multi-threaded but would work in a single threaded
> application.
Not really. Things worked that way and it was generally known that stubs
in libc exist[1], but that was neither documented nor supported. So the
only entity who could safely write libraries that way was Sun.
I don't know if anyone else wrote them that way. I decided not to, because
the method was not supported and thus likely to cause trouble if the
implementation changes.
[1] And, of course, there were people who wrote a multi-threaded program,
but forgot to add one of the thread libraries to the link line. Then the
program would link fine (because of stubs in libc), but none of the
thread functions would work. That was a lot of fun for everyone. :-)
--
Yes, I am an agent of Satan, but my duties are largely ceremonial.
dave@fly.srk.fer.hr
== 5 of 8 ==
Date: Mon, Nov 25 2013 11:55 am
From: Casper H.S. Dik
Drazen Kacar <dave@fly.srk.fer.hr> writes:
>Not really. Things worked that way and it was generally known that stubs
>in libc exist[1], but that was neither documented nor supported. So the
>only entity who could safely write libraries that way was Sun.
And they didn't all work either; e.g., pthread_once() didn't work.
>I don't know if anyone else wrote them that way. I decided not to, because
>the method was not supported and thus likely to cause trouble if the
>implementation changes.
>[1] And, of course, there were people who wrote a multi-threaded program,
>but forgot to add one of the thread libraries to the link line. Then the
>program would link fine (because of stubs in libc), but none of the
>thread functions would work. That was a lot of fun for everyone. :-)
Yes; fortunately, that was fixed when Solaris 10 was released (in 2005)
But that couldn't be done until we had created a multi-thread library
which behaved more sanely in the presence of just the one thread;
such a library was first shipped with Solaris 8 and was made the default
in Solaris 9.
Casper
== 6 of 8 ==
Date: Mon, Nov 25 2013 2:16 pm
From: Rainer Weikusat
Casper H.S. Dik <Casper.Dik@OrSPaMcle.COM> writes:
> Drazen Kacar <dave@fly.srk.fer.hr> writes:
[...]
>>[1] And, of course, there were people who wrote a multi-threaded program,
>>but forgot to add one of the thread libraries to the link line. Then the
>>program would link fine (because of stubs in libc), but none of the
>>thread functions would work. That was a lot of fun for everyone. :-)
>
> Yes; fortunately, that was fixed when Solaris 10 was released (in 2005)
>
> But that couldn't be done until we had created a multi-thread library
> which behaved more sanely in the presence of just the one thread;
> such a library was first shipped with Solaris 8 and was made the default
> in Solaris 9.
This is described in some detail in a whitepaper named 'Multithreading
in the Solaris Operating Environment' published around the time of
Solaris 9 which is still worth a read (I actually just reread it). The
main change in order to get a 'sanely behaving threading library' was
switching from M:N threading to 1:1 threading and thus getting rid of
the necessity to use hidden userspace threads in order to try to solve
problems inherent in the M:N approach. This includes the remarkable feat
of accepting that threads are useful because they enable applications to
use multiprocessors AND NOT because they enable university eggheads from
'research assistant' upwards to avoid dealing with state machines in
their code (that's a gross oversimplification, mainly because I really
don't know how to describe this in a more accurate way).
Commercial operating systems tend to oscillate between these two models
(with the occasional idiot reimplementing cooperative userspace
threading for the umpteenth time), cf. the following statement from HP on
'MxN threading in HP-UX 11i':
MxN threads were implemented in response to the perception that
enterprise applications creating large numbers of threads
perform better with MxN threads than with 1x1.
However,
The only real way to know how using the MxN model affects the
performance of your particular application is to benchmark the
two models and compare the results.
Or, to put this into other words "While there's a strong perception that
such applications ought to exist, there's little real evidence that they
actually do".
== 7 of 8 ==
Date: Tues, Nov 26 2013 12:33 am
From: Casper H.S. Dik
Rainer Weikusat <rweikusat@mobileactivedefense.com> writes:
>Commercial operating systems tend to oscillate between these two models
>(with the occasional idiot reimplementing cooperative userspace
>threading for the umpteenth time), cf. the following statement from HP on
>'MxN threading in HP-UX 11i':
> MxN threads were implemented in response to the perception that
> enterprise applications creating large numbers of threads
> perform better with MxN threads than with 1x1.
>However,
> The only real way to know how using the MxN model affects the
> performance of your particular application is to benchmark the
> two models and compare the results.
>Or, to put this into other words "While there's a strong perception that
>such applications ought to exist, there's little real evidence that they
>actually do".
The argument for "M:N is better" was very strong because, theoretically,
it could be faster and synchronization wouldn't need to use the kernel.
A "1:1" implementation, in practice, however, uncontended locks don't need a
trip to the kernel either and the enormous amount of extra work needed in
the MxN implementation (handling SIGWAITING, a second implementation of
context switching, which actually often did require kernel involvement)
It should also be noticed that when a comparison was done, it was
often using the same libthread library but run so that M == N; but
that wasn't a proper comparison, as in that case you still pay
the penalty for the M:N support.
Casper
== 8 of 8 ==
Date: Tues, Nov 26 2013 7:46 am
From: Rainer Weikusat
Casper H.S. Dik <Casper.Dik@OrSPaMcle.COM> writes:
> Rainer Weikusat <rweikusat@mobileactivedefense.com> writes:
[M:N/ 1:1 threading]
> The argument for "M:N is better" was very strong because, theoretically,
> it could be faster and synchronization wouldn't need to use the kernel.
>
> A "1:1" implementation, in practice, however, uncontended locks don't need a
> trip to the kernel either and the enormous amount of extra work needed in
> the MxN implementation (handling SIGWAITING, a second implementation of
> context switching, which actually often did require kernel involvement)
Some people think that 'threads are a way to structure programs', that
is, they're primarily supposed to be helpful for developers. The idea is
typically that 'a thread of execution' can be dedicated to a single task
which progresses intermittently and concurrently with a lot of other
tasks of the same kind: In a single-threaded process, this requires
storing the state of each task in some 'globally accessible' data
structure and continuing to work on some other tasks based on the
recorded state of that whenever the 'current task' can't make progress
anymore. If each task had a dedicated thread, this wouldn't
be necessary because 'the scheduler' would transparently suspend one
thread and switch to another whenever things would otherwise come to a
standstill, thus providing the illusion that each task really runs on a
dedicated computer without having to consider the needs of other tasks.
This, of course, requires creating as many threads as there are
concurrent tasks and for something like a network server, this could
easily mean thousands or even tens of thousands of them. Consequently, each
thread needs to be as cheap as possible, in particular, it shouldn't
have any kernel resources like 'a kernel stack' permanently allocated to
it. Also, in order to make programming easy, actual concurrent
execution of threads should rather be avoided: For as long as a thread
is busy, it just keeps running, and once it invokes 'a blocking call' of
some kind, another thread can start to run. In order to avoid starving
tasks, threads are supposed to be 'nice to each other', that is,
explicitly yield the processor every now and then when busy for a
long time. This means cooperative userspace threading. Because there's
no concurrency, synchronization isn't really needed and because there's
no preemption, no complicated userspace scheduler is needed, either. At
least in theory. In practice, once the process blocks in the kernel, all
threads stop running and hence each and every possible blocking system
call must be wrapped in some library routine which accomplishes the same
end in some non-blocking way and runs other threads in the meantime. The
net effect of this is usually a lot of borderline insane library hackery
and some rather bizarre restrictions as to what a thread can do or must
not do (the GNU pth paper provides a nice example of that).
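A bare-bones C sketch of that thread-per-task style for a network
server: each connection gets its own thread written as straight-line
blocking code, with no explicit state machine (socket setup omitted,
and the echo logic is just a placeholder):

#include <pthread.h>
#include <sys/socket.h>
#include <unistd.h>

/* one task == one thread: plain blocking calls; the scheduler
   switches to other tasks whenever this one blocks */
static void *handle_conn(void *arg)
{
    int fd = (int)(long)arg;
    char buf[512];
    ssize_t n;

    while ((n = read(fd, buf, sizeof buf)) > 0)
        if (write(fd, buf, (size_t)n) != n)
            break;
    close(fd);
    return NULL;
}

void serve(int listen_fd)
{
    for (;;) {
        int fd = accept(listen_fd, NULL, NULL);
        if (fd < 0)
            continue;
        pthread_t t;  /* one thread per concurrent task */
        if (pthread_create(&t, NULL, handle_conn, (void *)(long)fd) == 0)
            pthread_detach(t);
        else
            close(fd);
    }
}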
Some other people think that threads are supposed to be useful for users
in the sense that they enable these to utilize processing resources such
as multiple processors and/or multiple cores which might be available to
them. This implies that 'a thread' runs code which looks pretty much
like that of a single-threaded process with respect to
'multi-tasking'. In order to keep scheduling overhead down, there will
be relatively few threads (somewhat more than available 'processing
things') but they're both supposed to execute in parallel and to work
with 'shared global resources'. This means synchronization is needed and
the kernel scheduler has to deal with them because the kernel is the
privileged piece of code managing the actual hardware and thus the
only software capable of doing things like assigning threads to available
processors for execution. Further, this is supposed to work in an
environment shared with other applications which also seek to use the
available resources and which haven't been written to cooperate;
hence, preemption is necessary. The most natural choice for this is 1:1
threading where each 'application thread' corresponds with a 'kernel
scheduled entity' of some sort.
Trying to satisfy both groups of people leads to M:N threading which
really just combines the disadvantages of both models without providing
the advantages of either. It ends up as an expensive way to screw everyone
alike, but only in practice, not in theory.
I was planning to illustrate this using the pre-Solaris 8/9 'threading
workarounds' as an example, but this text is too long already ...
==============================================================================
TOPIC: How do I look inside an .exe file to view the programming
http://groups.google.com/group/comp.programming.threads/t/70ea90ac4a00188c?hl=en
==============================================================================
== 1 of 1 ==
Date: Tues, Nov 26 2013 4:02 am
From: goldee.milan@gmail.com
Not even one answer. Someone should not write poems here, but simply
copy and paste what the software looks like, line after line.
==============================================================================
TOPIC: How to view what commands are run by an exe file as they are running
http://groups.google.com/group/comp.programming.threads/t/826e0e9502290d02?hl=en
==============================================================================
== 1 of 1 ==
Date: Tues, Nov 26 2013 5:29 am
From: bryan.chisenhall@gmail.com
On Friday, November 15, 2013 2:03:32 PM UTC-6, Lucas Levrel wrote:
> On 15 November 2013, chiz wrote:
>
> > My problem is this: I am trying to install a program that requires me to
> > have a specific version of another program installed before completing
> > the install. When I run the .exe file, it goes through the normal steps
> > until it checks to see if I have a specific version of another program
> > installed. I have this program installed but, for whatever reason, it
> > does not recognize that I have it installed.
>
> The "whatever" here is a big problem I think.
>
> > So, I want to see where it is looking for that file. I don't want to
> > edit the .exe file. I just want to know what commands the .exe is trying
> > to run so I can see where it is looking. Maybe a "command" is a bad way
> > to describe what I'm looking for... not sure. If I can see where it is
> > looking, I can install the program into that directory. I'm hoping that
> > would solve the problem.
>
> Normally one should not make assumptions on where a program resides, so
> your calling program should not have the path to the called program
> hardcoded. Is the called program in your %path%? Run a command window
> (Win+R, type "cmd", Enter), type there "the_called_program_name.exe". Is
> it found by Windows?
>
> But are you sure this is a path issue? The calling program could be
> running a command like "the_called_program /version" and parsing the
> output to know which version is installed.
>
> --
> LL
Thank you for the reply.
I checked the command prompt and it was not found by Windows. Thanks for mentioning that; I'll try to figure out how to get that to work.
The calling program does require a specific version, but I triple-checked that, thinking it must be the issue. I do have the correct version installed though.
Thanks again!
==============================================================================
TOPIC: Data copying on NUMA
http://groups.google.com/group/comp.programming.threads/t/89bfa62ecdef1706?hl=en
==============================================================================
== 1 of 3 ==
Date: Thurs, Nov 28 2013 2:25 pm
From: Paavo Helde
On NUMA, as the acronym says, some memory is more readily accessible
from a certain NUMA node than from others. Now, let's say I have a deep,
dynamically allocated data structure I want to use in a thread running
on another NUMA node; how should I pass it there? Should I perform the
copy in the other thread, so that the new dynamic allocations take place
in the target thread? This probably depends on the memory allocator;
what would be the best choice for Linux/Windows?
This is not a theoretical question; we actually see a large scaling
performance drop on NUMA and have to decide whether to go multi-process
somehow, or whether there are ways to make multi-threaded apps behave
better. As far as I understand there is a hard limit of 64 worker threads
per process on Windows, so we probably have to go multi-process anyway
at some point. Any insights or comments?
TIA
Paavo
== 2 of 3 ==
Date: Sun, Dec 1 2013 5:21 pm
From: Robert Wessel
On Thu, 28 Nov 2013 16:25:12 -0600, Paavo Helde
<myfirstname@osa.pri.ee> wrote:
>
>On NUMA, as the acronym says, some memory is more readily accessible
>from a certain NUMA node than from others. Now, let's say I have a deep,
>dynamically allocated data structure I want to use in a thread running
>on another NUMA node; how should I pass it there? Should I perform the
>copy in the other thread, so that the new dynamic allocations take place
>in the target thread? This probably depends on the memory allocator;
>what would be the best choice for Linux/Windows?
On Windows, for example, you can use the VirtualAllocExNuma API to
allocate storage "near" a given node from any running code. By
default the allocation will be in local memory for the allocating
thread, which would not be optimal if another thread is going to do
all the work on that area.
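A minimal sketch of that API; the size and preferred node number are
made-up example values:

#include <windows.h>
#include <stdio.h>

int main(void)
{
    SIZE_T size = (SIZE_T)64 * 1024 * 1024;  /* 64 MB, example value        */
    DWORD  node = 1;                         /* preferred NUMA node, example */

    /* reserve and commit memory preferentially on the given node */
    void *p = VirtualAllocExNuma(GetCurrentProcess(), NULL, size,
                                 MEM_RESERVE | MEM_COMMIT,
                                 PAGE_READWRITE, node);
    if (p == NULL) {
        fprintf(stderr, "VirtualAllocExNuma failed: %lu\n", GetLastError());
        return 1;
    }
    /* ... hand p to the worker threads running on that node ... */
    VirtualFree(p, 0, MEM_RELEASE);
    return 0;
}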
In general, allocating structures close to the executing node is
certainly a win for some applications. Yes, that's vague, but so is
your problem statement. If a thread (or group of threads) is going
to heavily use a structure that won't fit in cache, running all of those
threads in a single NUMA node, and allocating that structure in memory
local to that node, can significantly improve the available bandwidth
and latency for those threads, as well as consuming less of the global
resources needed by other threads in the system.
>This is not a theoretical question; we actually see a large scaling
>performance drop on NUMA and have to decide whether to go multi-process
>somehow, or whether there are ways to make multi-threaded apps behave
>better. As far as I understand there is a hard limit of 64 worker threads
>per process on Windows, so we probably have to go multi-process anyway
>at some point. Any insights or comments?
There is no 64 threads per process limit in Windows. Perhaps such a
thing existed in Win9x. Win32 has a limit of 32 logical cores per
machine (which has nothing in particular to do with the number of
threads in a process), but that doesn't apply to Win64 (although if
you run Win32 applications on Win64, only the first 32 cores in the
first processor group are used to execute Win32 code).
If you're running multiple processes, then the system can fairly
easily split the workload between nodes, as you're implicitly telling
the system that the data is not shared, so the system will try to keep
a process (and hence its allocations) on a particular node. If
you're running enough threads in a process that you're going to span
more than one node, you're going to have to specify some of that
manually, or structure things so that allocations only happen on the
appropriate node (usually by only doing the required allocations on
the actual threads using the allocated areas).
== 3 of 3 ==
Date: Sat, Dec 7 2013 2:34 am
From: andrew@cucumber.demon.co.uk (Andrew Gabriel)
In article <XnsA287445E6980myfirstnameosapriee@216.196.109.131>,
Paavo Helde <myfirstname@osa.pri.ee> writes:
>
> On NUMA, as the acronym says, some memory is more readily accessible
> from a certain NUMA node than from others. Now, let's say I have a deep,
> dynamically allocated data structure I want to use in a thread running
> on another NUMA node; how should I pass it there? Should I perform the
> copy in the other thread, so that the new dynamic allocations take place
> in the target thread? This probably depends on the memory allocator;
> what would be the best choice for Linux/Windows?
On Solaris, memory is always allocated preferentially local to the
core the thread is running on. When a thread becomes runnable, it
is preferentially scheduled on a core local to the data it has been
accessing. The application itself doesn't need to do anything - this
is a feature of the OS. I'm less clear how Linux/Windows handle this.
> This is not a theoretical question; we actually see a large scaling
> performance drop on NUMA and have to decide whether to go multi-process
Multi-process or multi-threaded doesn't make any difference from the
NUMA point of view, if the data is shared in both cases.
> somehow, or whether there are ways to make multi-threaded apps behave
> better. As far as I understand there is a hard limit of 64 worker threads
> per process on Windows, so we probably have to go multi-process anyway
> at some point. Any insights or comments?
You can optimize the data layout such that different cores are not
competing for the same cache lines, and continually invalidating
each others' local caches. To do this, group data accessed by a
single thread at a time into aligned chunks which are multiples of
64 bytes long.
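In C11 that layout rule can be made explicit with alignas; a small
sketch (the slot count is an arbitrary example):

#include <stdalign.h>
#include <stdint.h>

#define CACHE_LINE 64

/* one slot per thread: alignment plus padding give every counter its
   own cache line, so one core's writes don't invalidate the line
   another core is using */
struct per_thread {
    alignas(CACHE_LINE) uint64_t counter;
    char pad[CACHE_LINE - sizeof(uint64_t)];
};

struct per_thread slots[16];   /* e.g. one slot per worker thread */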
Break up hot locks where possible. Check for tools to identify hot
locks (on Solaris, plockstat).
If you are heavily using malloc/free in a multi-threaded process,
make sure you are linking in a library which contains a
version of these designed for heavy multi-threaded use.
--
Andrew Gabriel
[email address is not usable -- followup in the newsgroup]
==============================================================================
TOPIC: ParallelVarFiler 1.31
http://groups.google.com/group/comp.programming.threads/t/ace1a33c8920491a?hl=en
==============================================================================
== 1 of 2 ==
Date: Wed, Dec 11 2013 11:51 am
From: aminer
Hello,
ParallelVarFiler was updated to version 1.31;
I have corrected some hard-to-find memory leaks,
and now it has no memory leaks...
Author: Amine Moulay Ramdane
Parallel Variable filer and streamer for Delphi and Freepascal that can
be used also as a parallel hashtable and that uses ParallelHashList
(Parallel Hashtable) with O(1) best-case and O(log(n)) worst-case access,
using lock striping and lightweight
MREWs (multiple-readers-exclusive-writer); this allows multiple threads
to write and read concurrently. Also, ParallelHashList maintains an
independent counter, counting the number of entries, for each
segment of the hashtable and uses a lock for each counter; this is also
for better scalability.
Description:
ParallelVarFiler is a Parallel HashTable that can be saved automatically
or manually to a file or to a stream or to a string and it can be
restored from a file or from a stream or from a string back to the
Hashtable in memory, and it's fault tolerant to power failures etc.
You can collect different types of variables and save them into one file
or stream. TParallelVarFiler reads and writes on files, streams and
Strings. You can collect Strings, Integers and Dates, in fact everything
that can be stored in a variant. In addition you can collect Streams.
You can use also ParallelVarFiler to send parameters over network. It
can be used for example to share data by any IPC mechanism.
And please look at the test.pas and test1.pas parallel variable filer
examples - compile and execute them...
Now you can use ParallelVarFiler as a small to medium database. I have
added Clean() and DeletedItems() methods and I have changed the
constructor: now you can pass a file name to the constructor where the
data will be written, updated, and deleted... If you pass an empty file
name, the data will be written, updated, and deleted only in memory. When
you use a file name in the constructor, many readers and one writer can
proceed concurrently. But when you pass an empty file name to the
constructor, many writers and many readers can proceed concurrently in
memory, and when you read a data item only the parallel hashtable
in memory is used; the hard disk is not touched.
And when you pass a file name to the constructor and delete or update
items, those items will be marked deleted in the file; you can
use DeletedItems() to see how many data items are marked deleted in the
file, and after that use the Clean() method to delete those data
items completely from the file.
How can you store multiple data items inside a database row and map them to a key?
You can simply use Parallel Variable Filer for that and stream your
multiple data variants and map them to a key...
As you know, ParallelVarFiler is using ParallelHashList (a parallel
hashtable that scales on multicores), so when you are using the
GetKeys() method and you want to get the data for those keys, don't
forget to test that the variant is not empty by using the VarIsEmpty()
function, and when you are using GetStream() you can test whether the data
exists by testing the boolean value returned from GetStream().
ParallelVarFiler is easy to learn; I have documented all the methods,
and please read about them inside ParallelVarFiler.pas.
Now ParallelVarFiler is fault-tolerant to power failures etc. I have
done a simulation of power failures and data file damage, and
ParallelVarFiler recovers from power failures and damage to the
data file...
As you know, ParallelVarFiler is using ParallelHashList (a parallel
hashtable), but ParallelHashList is cache-unfriendly, since
it's using a hash mechanism etc., and ParallelVarFiler and
ParallelHashList are memory-bound. So, since ParallelHashList is
cache-unfriendly and memory-bound, I don't think ParallelVarFiler or
ParallelHashList will scale very well. To avoid this problem you have to
use something like replication across computers using TCP/IP, using
your database on each computer and load-balancing between the computers;
but even if you are using replication and load balancing, while this can
make the memory and hard disk truly parallel, it will not avoid the
problem of the network bandwidth limitation. Still, ParallelVarFiler and
ParallelHashList do scale to a certain point, and they are fast and
they are still useful.
Please see the test.pas example inside the zip file to see how I am
using it...
But please read the following:
This software is provided on an "as-is" basis, with no warranties,
express or implied. The entire risk and liability of using it is yours.
Any damages resulting from the use or misuse of this software will be
the responsibility of the user.
You can download ParallelVarFiler 1.31 from:
http://pages.videotron.com/aminer/
Here are the methods that have been implemented:
PUBLIC METHODS:
constructor Create(file1:string; size,mrews:integer; casesensitive:boolean);
- Creates a new VarFiler ready to use, with size, with mrews the number
of MREWs (multiple-readers-exclusive-writer), and with casesensitive
for case-sensitive keys. The number of MREWs must be less than or
equal to the Hashtable size, and file1 is the file to write to; if
file1 is an empty string,
the data will be written only to memory, and you can use SaveToFile() to
write to a file.
destructor Destroy;
- Destroys the ParallelVarFiler object and cleans up.
procedure Clear;
- Deletes all contents.
function Clean:boolean
- Cleans the deleted items from the file.
function DeletedItems:integer
- Returns the number of items marked deleted.
function LoadData:boolean
- Loads the data from the file passed to the constructor.
function Delete(Name : String):boolean;
- Deletes the variable with Name.
function Exists(Name : String) : Boolean;
- Returns True if a variable with Name exists
procedure GetKeys(Strings : TstringList);
- Fills up a TStringList descendant with all the keys.
function Count : Integer;
- Returns the number of variables
function Add(Name : String; Value : Variant ):boolean;
- Adds a new variable , given the Name
function AddStream( Name : String;Stream : TStream ):boolean;
- Adds a new Stream, given the Name
function Update(Name : String; Value : Variant):boolean;
- Updates the variable, given the Name
function UpdateStream(Name : String; Stream : TStream ):boolean;
- Updates a Stream, given the Name
function GetStream(Name : String;Stream : TStream):boolean;
- Fills up the stream with data
procedure SaveToStream(Stream : TStream);
- Saves all variables, streams and Graphics to a stream.
procedure SaveToFile(FileName : String);
- Saves all variables, streams and Graphics to a file.
procedure SaveToString(out S : String);
- Saves all variables, streams and Graphics to a string.
procedure SaveToStreamAndDelete(Stream : TStream);
- Saves all variables, streams and Graphics to a stream,
and after that deletes all the data inside the hashtable
and inside the file.
procedure SaveToFileAndDelete(FileName : String);
- Saves all variables, streams and Graphics to a file,
and after that deletes all the data inside the hashtable
and inside the file.
procedure SaveToStringAndDelete(out S : String);
- Saves all variables, streams and Graphics to a string,
and after that deletes all the data inside the hashtable
and inside the file.
function LoadFromStream(Stream : TStream):boolean;
- Loads all variables, streams and Graphics from a stream.
function LoadFromFile(FileName : String):boolean;
- Loads all variables, streams and Graphics from a File.
function LoadFromString(S : String):boolean;
- Loads all variables, streams and Graphics from a string.
function Stream2String(Stream1: TStream;var Mystring:string): boolean;
procedure String2Stream(var String2BeSaved:string; Stream1: TStream);
procedure Variant2Stream(var VR:variant; Stream1: TStream);
function Stream2Variant(Stream1: TStream;var VR:variant): boolean;
PUBLIC PROPERTIES:
Items : Variant
- Gets the value (indexed).
Language: FPC Pascal v2.2.0+ and Lazarus / Delphi 7 to 2007:
http://www.freepascal.org/
Operating Systems: Win , Linux and Mac (x86).
Required FPC switches: -O3 -Sd -dFPC -dWin32 -dFreePascal
-Sd for delphi mode....
Required Delphi switches: -DMSWINDOWS -$H+
For Delphi use -DDelphi
Thank you,
Amine Moulay Ramdane.
== 2 of 2 ==
Date: Wed, Dec 11 2013 12:03 pm
From: aminer
Hello,
You can be confident in my ParallelVarFiler; I have done my best to
debug it thoroughly and I think it is now more stable.
Be happy with ParallelVarFiler...
You can download ParallelVarFiler from:
http://pages.videotron.com/aminer/
Thank you,
Amine Moulay Ramdane.
==============================================================================
TOPIC: Scalable RWLOCK 1.15
http://groups.google.com/group/comp.programming.threads/t/e330bec654423093?hl=en
==============================================================================
== 1 of 1 ==
Date: Sun, Dec 15 2013 2:39 pm
From: aminer
Hello,
I have updated my scalable RWLOCK to version 1.15; I have
used the scalable Anderson lock on the write side so
that it becomes FIFO-fair and so that it reduces the
cache-coherence traffic efficiently.
Scalable RWLock 1.15
Author: Amine Moulay Ramdane.
Description:
A fast, scalable, and lightweight
Multiple-Readers-Exclusive-Writer Lock that is portable, called LW_RWLock,
that works across processes and threads, and a fast and also
scalable Multiple-Readers-Exclusive-Writer Lock that is portable, called
RWLock, that doesn't use spin-wait but uses an event object and also my
SemaMonitor, and that consumes less CPU than the lightweight version.
It now processes the writers in FIFO order, which is also important,
and it works across threads.
A Read/Write Lock is a performance improvement over a standard mutex for
cases where reads outnumber writes. With a Read/Write Lock multiple
simultaneous read locks may be held, but write locks are exclusively held.
The exclusive writing lock ensures that race conditions do not occur,
since if one client is writing to the data no other client may read or
write. Also, the allowance for multiple simultaneous read locks
decreases resource contention since multiple readers can safely use the
shared data. This increases performance over a standard mutex for the
assumed usage pattern of frequent simultaneous reads and infrequent writes.
You can download Scalable RWLock from:
http://pages.videotron.com/aminer/
Please take a look at the test.pas Object Pascal demo inside the zipfile;
compile and run it...
Language: FPC Pascal v2.2.0+ / Delphi 7+: http://www.freepascal.org/
Operating Systems: Windows, Mac OSX , Linux , Unix...
Required FPC switches: -O3 -Sd -dFPC -dFreePascal
-Sd for delphi mode....
{$DEFINE CPU32} and {$DEFINE Windows32} for 32 bit systems
{$DEFINE CPU64} and {$DEFINE Windows64} for 64 bit systems
Thank you,
Amine Moulay Ramdane.
==============================================================================
TOPIC: Queuing theory...
http://groups.google.com/group/comp.programming.threads/t/853742404d1ba2a6?hl=en
==============================================================================
== 1 of 4 ==
Date: Tues, Dec 17 2013 1:48 pm
From: aminer
Hello,
I want to talk about the mathematical modeling of
queuing networks. If you have noticed, in a webserver
that uses a distributed database, the database
has to be modeled as a queuing system with a
hyperexponential service, but this complicates the
mathematical modeling of the webserver, since the
database then has to be modeled as an M/G/1 queuing system.
So we have to simplify the mathematical modeling: if,
for example, the database's write transactions take on average
much more time than the database's read transactions, we can
choose the worst-case scenario, that means we choose
the rate of the database's write transactions as the service rate
of the database's queuing system, so
that the database can be modeled as an M/M/1 queuing system
(M means Markovian), and the webserver can finally be
modeled as two M/M/1 queuing systems in series, one for
the database and one for the internet network.
But there is still a problem if we also want to handle the
worst-case scenario where, for example, thousands of customers
arrive at the same time at the webserver... so the webserver
has to be modeled taking into account this worst-case scenario...
Here is the mathematical modeling of an M/M/1 queuing system:
We already know that to satisfy a Poisson process we must
have that N(t1)-N(t0), N(t2)-N(t1) etc. are independent,
that means the counting increments must be independent.
We have the following relation between the Poisson law
and the Exponential law:
the expected value E(X exponential) = 1 / E(X poisson)
so if the arrival is Poissonian then the interarrivals are
exponential..
Now I will calculate the expected mean waiting time and
mean number of customers in the queuing system:
The Little law says that the waiting time in the queuing system is:
Ws = Ls/lambda
And the waiting time in the queue of the queuing system is:
Wq = Ws - 1/Mu
= Ls/lambda - 1/Mu
And the Little law gives us the expected mean number of customers in the
queue:
Lq = Wq*lambda = Ls - Phi and Phi = lambda/Mu
That implies:
Ls - Lq = Phi
When the system is in a stationary state, the balance equations give:
State rate in = rate out
state 0: Mu * P1 = lambda*P0
state 1: lambda*P0 + Mu*P2 = lambda*P1 + Mu*P1
state 2: lambda*P1 + Mu*P3 = lambda*P2 + Mu*P2
...
state n: lambda*Pn-1 + Mu*Pn+1 = lambda*Pn + Mu*Pn
And that gives us the following balance equations:
lambda * P0 = Mu*P1 <=> P1 = (lambda/Mu)*P0
lambda * P1 = Mu*P2 <=> P2 = (lambda/Mu)^2*P0
lambda * P2 = Mu*P3 <=> P3 = (lambda/Mu)^3*P0
...
lambda * Pn-1 = Mu*Pn <=> Pn = (lambda/Mu)^n*P0 [1]
Note: P0 means the probability of having zero customers in the queuing system,
and ^ means power.
And we have also that:
Sum(n=0 -> infinity) (Pn) = 1, that means the sum of the probabilities equals 1.
And [1] gives us:
Sum(n=0 -> infinity) (Phi^n*P0) = 1
And resolving the sum of a geometric series gives us:
P0 * 1/(1-Phi) = 1 => P0 = 1 - Phi and Phi < 1
And [1] gives us:
Pn = (1-Phi) Phi^n
And that means the expected mean number of customers in the
queuing system is:
Ls = Sum(n=0 -> infinity) (n*Pn)
That implies:
Ls = Sum(n=0 -> infinity) (n*(1-Phi)*Phi^n) = Phi/(1-Phi) and Phi<1
and the expected mean number of customers in the queue of the
queuing system is:
Lq = Ls - Phi = Phi^2/(1-Phi) and Phi<1
The Little law gives us the mean waiting time in the system and in the queue:
Ws = 1/lambda * Phi/(1-Phi) and Phi<1
and
Wq = 1/lambda * Phi^2/(1-Phi) and Phi<1
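For what it's worth, these closed forms drop straight into a few lines
of C; a sketch with made-up example rates:

#include <stdio.h>

/* M/M/1 steady-state metrics from the derivation above */
int main(void)
{
    double lambda = 8.0;         /* arrival rate (made-up example)     */
    double mu     = 10.0;        /* service rate; requires lambda < mu */
    double phi    = lambda / mu; /* utilization, Phi < 1               */

    double Ls = phi / (1.0 - phi);        /* mean number in system */
    double Lq = phi * phi / (1.0 - phi);  /* mean number in queue  */
    double Ws = Ls / lambda;              /* mean time in system   */
    double Wq = Lq / lambda;              /* mean time in queue    */

    printf("Phi=%.2f Ls=%.3f Lq=%.3f Ws=%.3f Wq=%.3f\n",
           phi, Ls, Lq, Ws, Wq);
    return 0;
}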
I have just read the following page, look at the USL
(Universal Law of Computational Scalability) of Dr. Gunther,
he wrote this: (See: http://en.wikipedia.org/wiki/Neil_J._Gunther )
--------------
The relative capacity C(N) of a computational platform is given by:
C(N) = N / (1 + α (N - 1) + ß N (N - 1))
where N represents either the number of physical processors
in the hardware configuration or the number of users driving the
software application. The parameters α and ß represent respectively
the levels of contention (e.g., queueing for shared resources) and
coherency delay (i.e., latency for data to become consistent) in the
system. The ß parameter also quantifies the retrograde throughput
seen in many stress tests but not accounted for in either Amdahl's law
or event-based simulations.
----------
His website: http://www.perfdynamics.com/
If you read carefully, you will see that Dr. Gunther is using this
model to predict scalability after he simulates a relatively small
number of vusers in LoadRunner (because of licensing costs, it's
cost-effective), and after that he finds the coefficients of the
2nd-degree polynomial (quadratic equation) and then transforms
those coefficients back to the USL parameters using α = b - a
and ß = a.
And then he extrapolates with the USL model to higher loads
to predict scalability.
He is also applying the model to webservers with heterogeneous
workloads, like in the following page:
http://perfdynamics.blogspot.com/2009/04/assessing-usl-scalability-wi...
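The USL formula itself is a one-liner to evaluate; a C sketch with
made-up example coefficients such as a fit might produce:

#include <stdio.h>

/* relative capacity C(N) per the USL formula quoted above */
static double usl(double N, double alpha, double beta)
{
    return N / (1.0 + alpha * (N - 1.0) + beta * N * (N - 1.0));
}

int main(void)
{
    double alpha = 0.02, beta = 0.0001;  /* made-up example parameters */
    for (int N = 1; N <= 256; N *= 2)
        printf("N=%3d  C(N)=%.2f\n", N, usl((double)N, alpha, beta));
    return 0;
}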
Now my question follows:
Suppose we have obtained a small number of measured load-points
with LoadRunner or other tools, and we calculated the USL equation
to predict the scalability of a webserver; how can the USL model predict
whether the scalability/performance is limited by the network bandwidth
and not by the server? I think USL cannot predict this.
When we are modeling webservers, we have to include
the network node in our queuing network model (which also includes
the computer server queue), and since the service in the
computer server comprises multiple services (when we
are using HTML pages, databases etc.) the network queue will not
be Markovian in its service, and we would have to model the network
queue as an M/G/1, and this will complicate the mathematical
analytical modeling...
So, I think the best way is to use a webserver stress tool
like http://fwptt.sourceforge.net/
You can even test the webserver with heterogeneous
workloads by starting multiple fwptt processes, and
you should increase the number of threads to 5, and after
that to 10 threads, 15... and so on until the webserver
application stops responding properly (and this will tell you
the maximum number of users that you can have at the same time...),
and as you are stress testing you can even see (by testing/measuring
it) whether the network bandwidth is the bottleneck... and this can
not be done with the USL model.
Now what about a webserver?
I have read the following paper:
http://docs.google.com/viewer?a=v&q=cache:JFYCp_dSPP4J:citeseerx.ist....
And it says that a simple M/M/1/k model with FCFS discipline
can predict webserver performance quite well..
Hence, I think we can model our webserver over the internet
with 3 queues connected as a Jackson Network, like this:
A -> M/M/1 Server Queue -> M/M/1 Network queue -> M/M/1 Client -> A
A: is the arrival rate
and we have the following:
Ni: number of jobs in each queue
Ui: utilization of each queue
Ni = Ui / (1-Ui)
Adding all the Ni in each individual queue will give the
average number of jobs in the entire queuing network.
After that we apply the Little formula:
A: network arrival rate
T: average response time
N = A*T => T = N / A
And after that, from the mathematical analytical equation,
we can simulate the Jackson queuing network that models our
webservers...
Now there is still an important question that I have:
Our analytical Jackson network model can give us insight
into the webserver's behavior.. but the difficulty that I find in
practice is this: suppose we have found the right parameters
that we want to choose, like for example the service rate of
the M/M/1 Server Queue; how, from this service rate, can
we buy the right computer that satisfies the service rate
that we want?
We say in OR that:
"Understanding the behavior of a system is what Queueing Theory
and Little's Law is all about. But, managing a Queue requires not
just understanding the behavior of a system, but also in-depth
knowledge of how to improve a system - improving both aspects
of Queueing will mean a better, more efficient and cost-effective
system and, more importantly, a much better customer experience."
As I wrote above, a simple M/M/1/k model with FCFS discipline
predicts webserver performance quite well, so we can model our
webserver over the internet as the three-queue Jackson network above,
add up the Ni = Ui / (1-Ui) of each queue to get the average number
of jobs in the entire queuing network, and apply the Little formula
N = A*T => T = N / A.
As you have noticed, this mathematical model of
the Jackson network does in fact take into account
the M/M/1 Network queue node; the USL model cannot
do this... and with the performance data from the mathematical
analytical model simulation we can, for example, validate
the performance data of the fwptt webserver stress simulation..
But you have to take into account worst cases and
peak traffic loads...
Say for example we have a webserver hosting HTML pages
and it is receiving 1,000,000 HTTP operations
per day with an average file size of 10 KB.
What would be the network bandwidth required for this website,
considering peak traffic, if the peak traffic load from past
observations was four times greater than the average load?
The required bandwidth is given by the following equation:
HTTP ops/sec x average file size, or:
1,000,000 HTTP ops per day = 1,000,000/24 = 41,667 ops/hour =
41,667/3600 = 11.6 HTTP ops/sec
The needed bandwidth is
11.6 HTTP ops/sec X 10 KB/HTTP op = 116 KB/sec = 928 Kbps.
If we assume a protocol overhead of 20% then the actual throughput
required is 928 Kbps X 1.2 = 1,114 Kbps.
However, if peak loads, as I said before, are as much as
4 times greater, the bandwidth required to handle spikes
would be 4 X 1,114 Kbps = 4.456 Mbps.
So you have to think also about the cost of this line...
I will add the following to what I said above about the Jackson
network model and validating the fwptt stress-test data against it:
And of course, on Capacity Planning for Enterprise Datacenters
and Websites, you can mirror many computer servers and load
balance between them with software... to make the system much
FASTER, and this will be modeled as a Jackson network like this:
(1) A -> M/M/n Server Queue -> M/M/1 Network queue -> M/M/1 Client -> A
A: the arrival rate to the system
But there is still an important thing: as I showed before in my
calculations, the bandwidth required to handle peak spikes is
4 X 1,114 Kbps = 4.456 Mbps, and you have to think about the cost
of this line.
I think that you also have to take into account the knee utilization
of your M/M/n server queue: if for example the number of computer
servers is 8 and the knee is 74%, that means that in our previous
example the bandwidth must equal:
74% X 4.456 Mbps = 3.297 Mbps.
Because, as you know, above this knee of 74% for 8 servers
the waiting-time curve grows quickly..
And we have to take into account the cost of the line...
Let us take another example:
In the network node of the Jackson network, there are three
main parameters that are responsible for its performance:
latency, bandwidth and utilization.
Now, if for example you take a look at my provider
http://www.videotron.com/service/internet-services/internet-access/hi...
my upload network speed is clearly 820 Kbps => 102.5 Kbytes/sec.
Hence, if the average file size on my web site is, for
example, 25 Kbytes and the protocol overhead is 20%, that
means that the file size will be on average:
25 * 120% = 30 Kbytes
And if the Knee of the M/M/1 queuing node in our Jackson network
is 50%, then the bandwidth available will be reduced to
102.5 Kbytes/sec * 50% = 51.25 Kbytes/sec
If the download speed of the client is for example 7.5 Mbps,
that means that the file can be retrieved in no more than
30 / 51.25 = 0.585 second = 585 milliseconds.
But if we assume for example that a typical web visitor
to my web site is spending 585 milliseconds retrieving the
page and 60 seconds reading the HTML page, THEN each client
on average only requires:
0.585 / (0.585 + 60) = about 1% (0.97%) of the bandwidth that is
allocated to it..
but in practice I think we have to consider the worst-case
scenarios..
So, since we must meet a constraint of no more than 5 seconds
on T (the average response time), that means:
585 ms * X = 5000 milliseconds => X = 8.54
Hence, in the worst case, my network node cannot support more
than 8.54 users at the same time...
So, on *average* my network node can support:
(8.54 * 86400) / 5 = 147,571 users per day
Recall the Jackson network (1) above for the mirrored, load-balanced
servers:
A -> M/M/n Server Queue -> M/M/1 Network queue -> M/M/1 Client -> A
A: the arrival rate to the system
We know also from an operational law of queuing theory that:
(2) the rate of jobs leaving any stable node must equal
its arrival rate.
(2) follows from the Equivalence property.
Equivalence property: Assume that a service facility with
s servers and an infinite queue has a Poisson input with
parameter lambda and the same exponential service-time
distribution with parameter Mu for each server (the M/M/s model),
where s * Mu > lambda. Then the steady-state output of
this service facility is also a Poisson process with parameter
lambda.
We know for an M/M/c queue that:
Note: try to add servers with almost the same hardware
configuration...
C(c, U) = Erlang C formula = P(c) / (1 - Utilization)
Note: c is the number of servers..
P(c): the probability of exactly c customers in the system, i.e.,
all c servers busy and nobody waiting yet
P(0): the probability that there is no waiting time in the
queue, which means that AT LEAST one server among the c servers
is not busy...
The average waiting time in the 'queue' =
C(c,U) / (service rate x c x (1 - Utilization)) (3)
It's approximately equal to:
Utilization^c / (service rate x (1 - Utilization^c))
Note: ^ means power
This approximation is exact for the M/M/1 and M/M/2 models,
but 'slightly' lower than the result in (3) if c > 2,
and
Utilization = density of circulation / c (the number of servers)
Response time = the average waiting time in the 'queue' +
(1 / service rate)
average number of users in the system = arrival rate x response time
average number of users in the queue = arrival rate x average waiting
time in the 'queue'
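The approximation above is likewise a couple of lines of C (the rates
here are made-up examples):

#include <math.h>
#include <stdio.h>

/* approximate mean waiting time in the queue of an M/M/c system:
   Wq ~= U^c / (mu * (1 - U^c)), exact for c = 1 and c = 2 */
static double wq_approx(double util, double mu, int c)
{
    double uc = pow(util, (double)c);
    return uc / (mu * (1.0 - uc));
}

int main(void)
{
    double mu   = 10.0;   /* per-server service rate (made up) */
    double util = 0.74;   /* utilization, e.g. the 74% knee    */
    for (int c = 1; c <= 8; c *= 2)
        printf("c=%d  Wq~=%.4f\n", c, wq_approx(util, mu, c));
    return 0;
}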
If we now calculate Ni = Ui / (1-Ui) for each node in
the Jackson network, then from the operational law (2) above
this gives us:
Ns for the M/M/n Server queue:
Ns = (DC/n) / (1 - (DC/n))
and DC = A / Ss => Ns = ((A/Ss)/n) / (1 - ((A/Ss)/n))
Ss: service rate at the server queue.
A: arrival rate to the Jackson network
DC: the density of circulation (traffic intensity)
n: number of servers
And Nn for the M/M/1 Network queue:
the utilization of the M/M/1 network queue is Un = A / Sn,
which implies => Nn = (A/Sn) / (1 - (A/Sn))
Nn: number of jobs in the M/M/1 network queue node
Un: utilization of the network queue node
Sn: service rate at the network queue.
A: arrival rate to the Jackson network
And Nc for the M/M/1 Client queue node:
Uc = A / Sc
which implies => Nc = (A/Sc) / (1 - (A/Sc))
Note: Sc is the client node's service rate in files per second, for
example 1/5 if the client consumes one file of average size F
every 5 seconds.
Nc: number of jobs in the M/M/1 client queue node
Uc: utilization of the M/M/1 client queue node
Sc: service rate at the client queue node.
A: arrival rate to the Jackson network
F: average file size
Adding up the Ni of each individual queue gives
the average number of jobs in the entire
queuing network:
N = Ns + Nn + Nc
= ((A/Ss)/n) / (1 - ((A/Ss)/n)) + (A/Sn) / (1 - (A/Sn))
+ (A/Sc) / (1 - (A/Sc))
After that we apply the Little formula:
A: network arrival rate
T: average response time
N = A*T => T = N / A
which implies that T (the average response time in the Jackson
network) is:
T = N / A
= ( ((A/Ss)/n) / (1 - ((A/Ss)/n)) + (A/Sn) / (1 - (A/Sn))
+ (A/Sc) / (1 - (A/Sc)) ) / A
Now finally we have our mathematical analytic equation
and we can begin to simulate our Enterprise webserver
and also validate the performance data with the fwptt stress
webserver simulation tool...
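Since all the node equations are now in place, here is a direct
transcription into Object Pascal, using the utilizations exactly as
defined above; the parameter values are hypothetical placeholders
chosen so that every utilization stays below 1:
---------------------------
{ Sketch: average number of jobs and response time T of the Jackson
  network, with the node equations exactly as written above. }
function AvgJobs(U: double): double;
begin
  { Ni = Ui / (1 - Ui), valid only while Ui < 1 }
  AvgJobs := U / (1.0 - U);
end;

var
  Ss, Sn, Sc, A, F, Ni, T: double;
  n: integer;
begin
  Ss := 100.0;  { service rate at the server node (hypothetical) }
  Sn := 30.0;   { service rate at the network node (hypothetical) }
  Sc := 3.0;    { service rate at the client node (hypothetical) }
  A  := 40.0;   { arrival rate to the Jackson network }
  F  := 25.0;   { average file size, one file every 5 seconds }
  n  := 4;      { number of mirrored servers }
  Ni := AvgJobs((Ss / A) / n)      { M/M/n server node }
      + AvgJobs(Sn / A)            { M/M/1 network node }
      + AvgJobs(Sc / (F / 5.0));   { M/M/1 client node }
  T := Ni / A;                     { Little's law: N = A*T => T = N/A }
  writeln('N = ', Ni:0:3, '  T = ', T:0:3);
end.
---------------------------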
And don't forget what i have said about USL:
"Suppose we have obtained a small number of measured
load-points with Loadrunner or others tools, and we
calculated the USL equation to predict scalability of
a webserver, how can the USL model predict whether the
scalability/performance is limited by the network bandwidth
and not by the server? I think USL can not predict this."
Also i wrote about USL that:
"As you have noticed , this mathematical model of
this jackson network does in fact take into account
the M/M/1 Network queue node , the USL model can not
do this... and with this performance data from the mathematical
analytical model simulation we can for example validate
the performance data of the fwptt stress webserver simulation.."
If you have noticed, I have also tried to show that one
of the principal objectives is to control and manage
the *Quality* of the system or process; I wrote in my
post the following:
---
this implies that T (the average response time in the Jackson
network) is:
T = Ni / A = (Nn + Ns + Nc) / A
= ((((Ss/A)/n) / (1 -((Ss/A)/n)) + (Sn/A) / (1 -(Sn/A))
+ (Sc/(F/5)) / (1 -(Sc/(F/5)))) / A
Now finally we have our mathematical analytic equation
and we can begin to simulate our Enterprise webserver
and also validate the performance data with the fwptt stress
webserver simulation tool...
---
So, as you have noticed, you can simulate our enterprise
webserver with this mathematical analytical equation and
validate the data collected by the fwptt stress webserver tool,
and try to keep T (the average response time of your HTTP requests)
below or equal to 5 seconds, and see also if your webserver
is scaling; we also call this Quality control using OR
(operational research).
In quality management, quality can also be maintained and
improved through the detection and reduction of process
variation, where statistical techniques such as ANOVA and others
can be used..
The ANOVA method, for example, analyses the variation in
the data set to discover the amount of variation that can
be attributed to each factor.. etc.
There is still an important thing..
As you have noticed, I have finally arrived at the following
mathematical model of our Jackson network that models an Enterprise
website:
T = Ni / A = (Nn + Ns + Nc) / A
= ((((Ss/A)/n) / (1 -((Ss/A)/n)) + (Sn/A) / (1 -(Sn/A))
+ (Sc/(F/5)) / (1 -(Sc/(F/5)))) / A
T is the average response time..
So read it carefully and you will notice that this mathematical
model can also simulate and predict the behavior of an intranet
website or an internet website..
First look at the first factor (((Ss/A)/n) / (1 -((Ss/A)/n))
in our mathematical model: as you have noticed, as
n (the number of servers) grows, the denominator
grows and the factor (((Ss/A)/n) / (1 -((Ss/A)/n)) gets
smaller, and that's good for the response time...
Now let's look at the other factors in this mathematical
model:
IF the system is an Intranet and, for example, the local
area network Sn is faster than the multiserver, that means:
we have Sn >> Ss, and that means => Sn/A >> Ss/A => the
Knee on the server side can be reached more quickly than the
Knee in the network node of our Jackson network, so that
the bottleneck becomes the server node.
On the other hand, if:
Ss >> Sn => Ss/A >> Sn/A => the Knee in the M/M/1 network
node can be reached more quickly, so that the network node in
the Jackson network becomes the bottleneck.
Availability:
The demand for high availability and reliability of computer and
software systems has led to the formal verification of such systems.
There are two principal approaches to formal verification:
model checking and theorem proving.
And I wrote about Petri Nets, how to model parallel
programs, and how to exploit verification tools such as Tina to
verify properties such as the liveness of the system.
Please take a look at my other article about Petri Nets:
http://pages.videotron.com/aminer/PetriNet/formal.htm
And by using Petri Nets and Tina you can make your system
more reliable, and if it's more reliable then the system
will have higher availability.
There are important factors to take into consideration when
building systems, factors such as availability and
efficiency (scalability, etc.).
System availability is also calculated by modeling the system
as an interconnection of parts in series and parallel.
The availability A of a system is calculated like this:
A = MTBF / (MTBF + MTTR)
MTBF: Mean time between failures
MTTR: Mean time to repair
and the MTTR is composed of:
time to respond to the failure + time to find
the problem + time to correct the problem + time
to verify that the system is working correctly.
Now if the interconnected parts are in parallel, the equation
that we use to calculate the availability is:
A = 1 - (U1 * U2 * U3 * ... * Un) (1)
Ui: the non-availability (1 - Ai) of component part i of the system...
The availability of a system composed of interconnected parts in
series is:
A = A1 * A2 * ... * An (2)
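As a quick illustration, here is a small Object Pascal sketch of
equations (1) and (2); the component availabilities in the example
are hypothetical:
---------------------------
{ Sketch of the availability equations (1) and (2) above. }
function SeriesAvailability(const Parts: array of double): double;
var
  i: integer;
  A: double;
begin
  A := 1.0;
  for i := 0 to High(Parts) do
    A := A * Parts[i];              { A = A1 * A2 * ... * An }
  SeriesAvailability := A;
end;

function ParallelAvailability(const Parts: array of double): double;
var
  i: integer;
  U: double;
begin
  U := 1.0;
  for i := 0 to High(Parts) do
    U := U * (1.0 - Parts[i]);      { product of the non-availabilities }
  ParallelAvailability := 1.0 - U;  { A = 1 - (U1 * U2 * ... * Un) }
end;

begin
  { two mirrored servers, each 99% available: 1 - 0.01^2 = 0.9999 }
  writeln(ParallelAvailability([0.99, 0.99]):0:4);
  { the same two parts in series instead: 0.99 * 0.99 = 0.9801 }
  writeln(SeriesAvailability([0.99, 0.99]):0:4);
end.
---------------------------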
Now, as you have noticed, I have modeled an enterprise website
as a Jackson network, and I have said before that:
--
And of course on Capacity Planning for Enterprise Datacenters
and Websites, you can mirror many computer servers and load
balance between them with software... to make the system much
FASTER, and this will be modeled as a Jackson network like this:
A -> M/M/n Server Queue -> M/M/1 Network queue -> M/M/1 Client -> A
A: the arrival rate to the system
--
As you have noticed, I wrote "FASTER" but I didn't speak
about the AVAILABILITY; you will notice that you can not just
make your system more EFFICIENT and FASTER, but you can
also raise the AVAILABILITY of your system by mirroring
many servers...
I also wrote before that:
---
"If you have noticed i have tried to show also that one
of the principal objectives is to control and manage
the *Quality* of the system or process, i wrote in my
previous post the following:
this implies that T (the average response time in the Jackson
network) is:
T = Ni / A = (Nn + Ns + Nc) / A
= ((((Ss/A)/n) / (1 -((Ss/A)/n)) + (Sn/A) / (1 -(Sn/A))
+ (Sc/(F/5)) / (1 -(Sc/(F/5)))) / A
Now finally we have our mathematical analytic equation
and we can begin to simulate our Enterprise webserver
and also validate the performance data with the fwptt stress
webserver simulation tool...
So, as you have noticed, you can simulate our enterprise
webserver with this mathematical analytical equation and
validate the data collected by the fwptt stress webserver tool,
and try to keep T (the average response time of your HTTP requests)
below or equal to 5 seconds, and see also if your webserver
is scaling; we also call this Quality control using OR
(operational research).
In quality management, quality can also be maintained and
improved through the detection and reduction of process
variation, where statistical techniques such as ANOVA and others
can be used.. "
---
We have many mathematical tools that we can use in Quality control,
for example: Correlation, Regression, ANOVA, etc.
ANOVA is a powerful and important mathematical tool in
Quality control: by calculating the p-value in Excel or other
statistical tools, we get important information, such as
whether one or more variables have a significant effect on other
variables. If, for example, the p-value <= 0.05 (or if F > the
critical F value), this implies that the null hypothesis is rejected,
and the factor (or factors) does in fact have an effect..
Regards,
Amine Moulay Ramdane.
== 2 of 4 ==
Date: Tues, Dec 17 2013 2:08 pm
From: aminer
A correction: 'infini' in French means infinity..
On 12/17/2013 4:48 PM, aminer wrote:
>
> Hello,
>
> I want to talk about the mathematical modeling of
> queuing networks , if you have noticed in a webserver
> that uses a distributed database , the database
> have to be modeled as a queuing system with an
> hyperexponential service, but this will complicate the
> mathematical modeling of the webserver since the
> database have to be modeled as an M/G/1 queuing system,
> so we have to simplify the mathematical modeling , so if
> for exemple the database's write transactions takes in average
> much more time than a database's read transaction so we have
> to choose the worst case scenario , that means we have to choose
> the rate of the database's write transactions as the rate
> of the arrival to the queuing system of the database so
> that the database can be modeled as an M/M/1 queuing system
> (M: means markovian), so that the webserver can finally be
> modeled as two M/M/1 queuing system in series, one for
> the database and one for the internet network.
>
> But there is still a problem, if we want to also resolve the
> worst case scenario when for exemple thousands of custumers
> arrive at the same time to the webserver.. so the webserver
> have to be modeled taking into acount this worst case scenario...
>
> Here is the matematical modeling of an M/M/1 queuing system:
>
> We already know that to satisfy a Poisson process we must
> have that N(t1)- N(t0), N(t2)- N(t1) etc. must be independant
> that means the counting increments must be independant.
>
> We have the following relation between the Poisson law
> and Exponential law:
>
> the expected value E(X exponential) = 1 / E(X poisson)
>
> so f the arrival is poissonian then the interarrivals are
> exponential..
>
> Now i will calculate the expected mean waiting time and
> mean number of custumers in the queuing system:
>
> The Little law says that the waiting time in the queuing system:
>
> Ws = Ls/lambda
>
> And the waiting time in the queue of the queuing system is:
>
> Wq = Ws - 1/Mu
> = Ls/lambda - 1/Mu
>
> And the Little law gives us the expected mean number of custumers in the
> queue:
>
> Lq = Wq*lambda = Ls - Phi et Phi = lamda/Mu
>
> That implies:
>
> Ls - Lq = Phi
>
>
> When the system is in a stationary state , the balance equations gives:
>
> State rate in = rate out
>
> state 0: Mu * P1 = lambda*P0
> state 1: lambda*P0 + Mu*P2 = lambda*P1 + Mu*P1
> state 2: lambda*P1 + Mu*P3 = lambda*P2 + Mu*P2
> ...
> state n: lambda*Pn-1 + Mu*Pn+1 = lambda*Pn + Mu*Pn
>
>
> And that gives us the following balance equations:
>
> lambda * P0 = Mu*P1 <=> P1 = (lambda/Mu)*P0
>
> lambda * P1 = Mu*P2 <=> P1 = (lambda/Mu)^2*P0
>
> lambda * P2 = Mu*P3 <=> P1 = (lambda/Mu)^3*P0
>
> ...
>
> lambda * Pn-1 = Mu*Pn <=> P_n = (lambda/Mu)^k*P0 [1]
>
>
>
> Note: P0 means the probality of having zero custumer in the queuing system.
> and ^ means power.
>
>
> And we have also that:
>
> Sum(n=0 -> infini) (Pn) = 1 that means the sum of probabilities equal 1
>
> And [1] gives us:
>
> Sum(n=0 -> infini) (Phi^n*P0) =1
>
> And by resolving the sum if a geometric series that gives us:
>
> P0 * 1/(1-Phi) = 1 => P0 = 1 - Phi and Phi < 1
>
> And [1] gives us:
>
> Pn = (1-Phi) Phi^n
>
> And we have that means the mean expected number iof custumer in the
> queing system is:
>
> Ls = Sum(n=0 -> infini) (n*Pn)
>
> That implies:
>
> Ls = Sum(n=0 -> infini) (n*(1-Phi)*Phi^n) = Phi/1-Phi et Phi<1
>
> and we have the mean expected number of custumer in the queue of the
> queing system is :
>
> Lq = Ls -Phi = Phi^2/ (1-Phi) et Phi<1
>
>
> Little law gives us the mean waiting time in the system and in the queue:
>
> Ws = 1/lambda * Phi/(1-Phi) and Phi<1
>
> et
>
> Wq= 1/lambda * Phi^2/(1-Phi) and Phi<1
>
>
> I have just read the following page, look at the the USL
>
> (Universal Law of Computational Scalability) of Dr. Gunther,
> he wrote this: (See: http://en.wikipedia.org/wiki/Neil_J._Gunther )
>
> --------------
>
> The relative capacity C(N) of a computational platform is given by:
>
> C(N) = N
> ------------------------------------------
> 1 + α (N - 1) + ß N (N - 1)
>
> where N represents either the number of physical processors
> in the hardware configuration or the number of users driving the
> software application. The parameters α and ß represent respectively
> the levels of contention (e.g., queueing for shared resources) and
> coherency delay (i.e., latency for data to become consistent) in the
> system. The ß parameter also quantifies the retrograde throughput
> seen in many stress tests but not accounted for in either Amdahl's law
> or event-based simulations.
>
> ----------
>
> His website: http://www.perfdynamics.com/
>
> If you read carefully , you will see that Dr. Gunther is using this
> model to predict scalability after he simulates a relatively small
> number of vusers in LoadRunner ( because of licensing costs, it's
> cost-effective) and after that he finds the coefficients of the
> 2nd-degree polynomial (quadratic equation) and then transform
> those coefficients back to the USL parameters using the α = b - a
> and ß = a.
>
> And then he is extrapolating with the USL model to higher loads
> to predict scalability.
>
> He is also applying the model to webservers with heterogeneous
> workloads. like in the following page:
>
> http://perfdynamics.blogspot.com/2009/04/assessing-usl-scalability-wi...
>
> Now my question follows:
>
> Suppose we have obtained a small number of measured load-points
> with Loadrunner or others tools, and we calculated the USL equation
> to predict scalability of a webserver , how the USL model can predict
> if the scalability/performance is limited by the network bandwidth
> and not the server ? I think USL can not predict this.
>
> When we are modeling webservers , we have to include
> the network node in our network queuig model (that includes
> the computer server queue) , and since the service in the
> computer server is comprised of multiple services (when we
> are using htmls , databases etc.) the network queue will not
> be markovian in its service , and we have to model the network
> queue as an M/G/1 and this will complicate the mathematical
> analytical modeling...
>
> So, i think the best way is to use a webserver stress tool
> like http://fwptt.sourceforge.net/
>
> You can even test the webserver with an heterogeneous
> workloads by starting multiple fwtpp processes, and
> you should increase the number of threads to 5 and after
> that to 10 threads, 15 ... and so on until the webserver
> applications stops responding propperly(and this will inform
> on the maximum number of users that you can have in the same time...)
> and as you are stress testing you can even see (by testing/measuring
> it) if the network bandwidth is not the bottleneck... and this can
> not be done by the USL model.
>
> We already know that to satisfy a Poisson process we must
> have that N(t1)- N(t0), N(t2)- N(t1) etc. must be independant
> that means the counting increments must be independant.
>
> We have the following relation between the Poisson law
> and Exponential law:
>
> the expected value E(X exponential) = 1 / E(X poisson)
>
> and if the arrival is poissonian then the interarrivals are
> exponential..
>
> Now what about a webserver ?
>
> I have read the following paper:
>
> http://docs.google.com/viewer?a=v&q=cache:JFYCp_dSPP4J:citeseerx.ist....
>
> And it says that a simple model with M/M/1/k with FCFS discipline
> can predict webserver performance quite well..
>
> Hence, i think we can model our webserver over internet
> with 3 queues connected as a Jackson Network like this
>
> A -> M/M/1 Server Queue -> M/M/1 Network queue -> M/M/1 Client -> A
>
> A: is the arrival rate
>
> and we have the following:
>
> Ni: number of jobs in each queue
> Ui: utilization of each queue
>
> Ni = Ui / (1-Ui)
>
> Adding all the Ni in each individual queue will give the
> average number of jobs in the entire queuing network.
>
> After that we apply the Little formula:
>
> A: network arrival rate
> T: average response time
>
> N = A*T => T = N / A
>
> And after that from the mathematical analytical equation
> we can simulate the jackson queuing network that model our
> webservers...
>
> Now there is still an important question that i have:
>
> Our analytical jackson network model can give us insight
> on the webservers behavivior.. but the difficulty that i find in
> practice is that: suppose we have found the right parametters
> that we want to choose, like for example the service rate of
> the M/M/1 Server Queue , how , from this service rate, can
> we buy the right computer that satisfies the service rate
> that we want ?
>
> We say in OR that:
>
> "Understanding the behavior of a system is what Queueing Theory
> and Little's Law is all about. But, managing a Queue requires not
> just understanding the behavior of a system, but also in-depth
> knowledge of how to improve a system - improving both aspects
> of Queueing will mean a better, more efficient and cost-effective
> system and, more importantly, a much better customer experience."
>
> I wrote before that:
>
> ---
>
> "It says that a simple model with M/M/1/k with FCFS discipline
> can predict webserver performance quite well..
> Hence, i think we can model our webserver over internet
> with 3 queues connected as a Jackson Network like this
> A -> M/M/1 Server Queue -> M/M/1 Network queue -> M/M/1 Client -> A
>
> A: the arrival rate
> and we have the following:
> Ni: number of jobs in each queue
> Ui: utilization of each queue
>
> Ni = Ui / (1-Ui)
>
> Adding all the Ni in each individual queue will give the
> average number of jobs in the entire queuing network.
> After that we apply the Little formula:
>
> A: network arrival rate
> T: average response time
>
> N = A*T => T = N / A
>
> And after that from the mathematical analytical equation
> we can simulate the jackson queuing"
>
== 3 of 4 ==
Date: Tues, Dec 17 2013 2:18 pm
From: aminer
Hello,
I will correct the following part because I have just
translated it myself from French and it contains
some mistakes...
Here is the mathematical modeling of an M/M/1 queuing system:
We already know that to satisfy a Poisson process we must
have that N(t1)-N(t0), N(t2)-N(t1), etc. are independent;
that means the counting increments must be independent.
We have the following relation between the Poisson law
and Exponential law:
the expected value E(X exponential) = 1 / E(X poisson)
so if the arrivals are Poissonian then the interarrivals are
exponential..
Now I will calculate the expected mean waiting time and
mean number of customers in the queuing system:
The Little law gives the waiting time in the queuing system:
Ws = Ls/lambda
And the waiting time in the queue of the queuing system is:
Wq = Ws - 1/Mu
= Ls/lambda - 1/Mu
And the Little law gives us the expected mean number of customers in
the queue:
Lq = Wq*lambda = Ls - Phi and Phi = lambda/Mu
That implies:
Ls - Lq = Phi
When the system is in a stationary state, the balance equations give:
State rate in = rate out
state 0: Mu * P1 = lambda*P0
state 1: lambda*P0 + Mu*P2 = lambda*P1 + Mu*P1
state 2: lambda*P1 + Mu*P3 = lambda*P2 + Mu*P2
...
state n: lambda*Pn-1 + Mu*Pn+1 = lambda*Pn + Mu*Pn
And that gives us the following balance equations:
lambda * P0 = Mu*P1 <=> P1 = (lambda/Mu)*P0
lambda * P1 = Mu*P2 <=> P2 = (lambda/Mu)^2*P0
lambda * P2 = Mu*P3 <=> P3 = (lambda/Mu)^3*P0
...
lambda * Pn-1 = Mu*Pn <=> Pn = (lambda/Mu)^n*P0 [1]
Note: P0 means the probability of having zero customers in the queuing
system, and ^ means power.
And we have also that:
Sum(n=0 -> infinity) (Pn) = 1, that means the sum of probabilities equals 1
And [1] gives us:
Sum(n=0 -> infinity) (Phi^n*P0) =1
And by resolving the sum of a geometric series, that gives us:
P0 * 1/(1-Phi) = 1 => P0 = 1 - Phi and Phi < 1
And [1] gives us:
Pn = (1-Phi) Phi^n
And that means the mean expected number of customers in the
queuing system is:
Ls = Sum(n=0 -> infinity) (n*Pn)
That implies:
Ls = Sum(n=0 -> infinity) (n*(1-Phi)*Phi^n) = Phi/(1-Phi) and Phi<1
and we have that the mean expected number of customers in the queue of
the queuing system is:
Lq = Ls - Phi = Phi^2/(1-Phi) and Phi<1
Little law gives us the mean waiting time in the system and in the queue:
Ws = 1/lambda * Phi/(1-Phi) and Phi<1
and
Wq= 1/lambda * Phi^2/(1-Phi) and Phi<1
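These closed-form results are easy to check numerically; here is a
minimal Object Pascal sketch, with the same rates as in the MM1.pas
example later in this digest (lambda = 3, Mu = 4):
---------------------------
{ Sketch: the closed-form M/M/1 results derived above. }
var
  lambda, Mu, Phi, Ls, Lq, Ws, Wq: double;
begin
  lambda := 3.0;                  { arrival rate }
  Mu := 4.0;                      { service rate }
  Phi := lambda / Mu;             { utilization, must be < 1 }
  Ls := Phi / (1.0 - Phi);        { mean number in the system }
  Lq := Phi * Phi / (1.0 - Phi);  { mean number in the queue }
  Ws := Ls / lambda;              { Little: here 1/(Mu - lambda) = 1 }
  Wq := Lq / lambda;              { mean waiting time in the queue }
  writeln('Ls=', Ls:0:2, ' Lq=', Lq:0:2, ' Ws=', Ws:0:2, ' Wq=', Wq:0:2);
end.
---------------------------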
Thank you,
Amine Moulay Ramdane.
== 4 of 4 ==
Date: Tues, Dec 17 2013 2:31 pm
From: aminer
I wrote:
"We have the following relation between the Poisson law
and Exponential law:
the expected value E(X exponential) = 1 / E(X poisson)
so if the arrival is poissonian then the interarrivals are
exponential.. "
So it is easier to do a simulation of an M/M/1 or M/M/n
queue, since the services and the arrivals should be
exponentially distributed, and you will find my simulation
on my website:
http://pages.videotron.com/aminer/
Thank you,
Amine Moulay Ramdane.
==============================================================================
TOPIC: Threadpool with priorities version 1.54
http://groups.google.com/group/comp.programming.threads/t/3a765b03ab36c1b3?hl=en
==============================================================================
== 1 of 1 ==
Date: Sat, Dec 21 2013 9:16 am
From: aminer
Hello,
Threadpool with priorities was updated to version 1.54 (stable version),
and Threadpool was updated to version 1.55. I have changed something
inside the scalable Anderson lock so that the examples work correctly;
Threadpool with priorities is now very stable.
Author: Amine Moulay Ramdane
Language: FPC Pascal v2.2.0+ / Delphi 5+: http://www.freepascal.org/
Operating Systems: Win , Linux and Mac (x86).
Description:
Thread Pool Engine.
The following have been added:
- You can give the following priorities to jobs:
LOW_PRIORITY
NORMAL_PRIORITY
HIGH_PRIORITY
- Uses a FIFO queue that satisfies many requirements: it is FIFO fair,
it efficiently minimizes the cache-coherence traffic, and it is energy
efficient on the pop(): when there are no items in the queue it will
not spin-wait, but will wait on a portable manual event object..
- Enters a wait state when there is no job in the queue, for more
efficiency.
- You can distribute your jobs to the worker threads and call any
method with the threadpool's execute() method (see the usage sketch
after this list).
- Uses O(1) complexity on enqueue and O(3) worst case complexity on
dequeue.
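Here is a hedged usage sketch of execute(): it relies only on the call
shape and the priority constants shown in this digest (see the MM1.pas
fragment quoted in the M/M/n topic below); the job class, the worker
object, and the way TP itself is created are illustrative assumptions,
so look at the test1.pas demo for the real thing:
---------------------------
{ Illustrative sketch only: TMyJob and TMyWorker are assumed names,
  and TP is assumed to be the Thread Pool Engine instance created
  as in the test1.pas demo. }
type
  TMyJob = class
  public
    number: integer;
  end;

  TMyWorker = class
  public
    procedure myjob(data: pointer);
  end;

procedure TMyWorker.myjob(data: pointer);
begin
  writeln('processing job ', TMyJob(data).number);
  TMyJob(data).Free;
end;

var
  worker: TMyWorker;
  obj: TMyJob;
begin
  worker := TMyWorker.Create;
  obj := TMyJob.Create;
  obj.number := 1;
  { hand the job to a worker thread with a chosen priority }
  TP.execute(worker.myjob, pointer(obj), HIGH_PRIORITY);
end.
---------------------------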
Look into defines.inc, there are many options:
CPU32: for 32 bits architecture
CPU64: for 64 bits architecture
Please read the article that I wrote about my Threadpool engine.
Look at the test1.pas demo inside the zip file...
You can download Threadpool with priorities 1.54 from:
http://pages.videotron.com/aminer/
Required FPC switches: -O3 -Sd -dFPC -dWin32 -dFreePascal
-Sd for delphi mode....
Required Delphi switches: -DMSWINDOWS -$H+
For Delphi 5,6,7 use -DDelphi
For Delphi 2005,2006,2007,2009,2010+ use the switch -DDELPHI2005+
{$DEFINE CPU32} and {$DEFINE Win32} for 32 bit systems
{$DEFINE CPU64} and {$DEFINE Win64} for 64 bit systems
Note: testpool.pas is a parallel matrix-by-vector multiply program
that uses SSE+ and requires Delphi 5+. test.pas and test_thread.pas
work with both FreePascal and Delphi.
Thank you,
Amine Moulay Ramdane.
==============================================================================
TOPIC: Concurrent FIFO Queue 1.01
http://groups.google.com/group/comp.programming.threads/t/73de55ea89ca5de7?hl=en
==============================================================================
== 1 of 1 ==
Date: Sat, Dec 21 2013 9:36 am
From: aminer
Hello,
Concurrent FIFO Queue was updated to version 1.01. I have added the
Ticket Spinlock with a proportional backoff so that it becomes FIFO
fair and reduces the cache-coherence traffic. My concurrent FIFO
queue is also energy efficient on the pop(): when there are no items
in the queue it will not spin-wait, but will wait on a portable
manual event object. Now, using the Ticket Spinlock with a
proportional backoff, it gives 3 million push transactions per
second, much better than the scalable Anderson lock, and that's
excellent; you can also use the scalable Anderson lock, but it gives
less throughput than the Ticket Spinlock with a proportional backoff.
If you want to use the Ticket Spinlock, use "Ticket" inside
defines.inc, and if you want to use the scalable Anderson lock, use
"ALOCK" inside defines.inc.
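For illustration, here is a minimal sketch of a ticket spinlock with
proportional backoff; this shows the general idea only and is not the
code shipped in the zipfile (a production version also needs the
proper memory barriers and a finer-grained pause than Sleep):
---------------------------
{ Minimal sketch of a ticket spinlock with proportional backoff;
  illustrative only, not the code shipped in the zipfile. }
uses
  SysUtils;  { for Sleep }

type
  TTicketLock = record
    NextTicket, NowServing: longint;  { both start at 0 }
  end;

procedure Acquire(var L: TTicketLock);
var
  MyTicket, Ahead: longint;
begin
  { atomically take the next ticket number }
  MyTicket := InterlockedIncrement(L.NextTicket) - 1;
  while L.NowServing <> MyTicket do
  begin
    { back off in proportion to the number of threads ahead of us;
      this is what reduces the cache-coherence traffic }
    Ahead := MyTicket - L.NowServing;
    if Ahead > 0 then
      Sleep(Ahead);
  end;
end;

procedure Release(var L: TTicketLock);
begin
  { hand the lock to the next ticket holder, in FIFO order }
  InterlockedIncrement(L.NowServing);
end;
---------------------------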
Author: Amine Moulay Ramdane.
Description:
A concurrent FIFO queue that satisfies many requirements: it is FIFO
fair, it efficiently minimizes the cache-coherence traffic, and it is
energy efficient on the pop(): when there are no items in the queue it
will not spin-wait, but will wait on a portable manual event object.
Please take a look at the test.pas Object Pascal demo inside the
zipfile; compile and run it...
You can download Concurrent FIFO Queue 1.01 from:
http://pages.videotron.com/aminer/
Language: FPC Pascal v2.2.0+ / Delphi 7+: http://www.freepascal.org/
Operating Systems: Windows, Mac OSX , Linux , Unix...
Required FPC switches: -O3 -Sd -dFPC -dFreePascal
-Sd for delphi mode....
{$DEFINE CPU32} and {$DEFINE Windows32} for 32 bit systems
{$DEFINE CPU64} and {$DEFINE Windows64} for 64 bit systems
Thank you,
Amine Moulay Ramdane.
==============================================================================
TOPIC: An M/M/n queuing model simulation
http://groups.google.com/group/comp.programming.threads/t/a4d462b8599c98e1?hl=en
==============================================================================
== 1 of 3 ==
Date: Sat, Dec 21 2013 10:07 am
From: aminer
Hello,
An M/M/n queuing model simulation with Object Pascal and my Thread Pool
Engine - version 1.02
You can download it from:
http://pages.videotron.com/aminer/
Author: Amine Moulay Ramdane
Description:
It's harder and sometimes impossible to get analytical results about
waiting times and queue length for general interarrival and service
distributions; so, it's important to be able to estimate these
quantities by observing the results of simulation.
It's very easy in Object Pascal to simulate a sequence of arrival times
with a given interarrival distribution.
Look at the examples MM1.pas (M/M/1 queuing model) and MMn.pas (M/M/n,
where n is the number of servers) inside the zip file:
---------------------------
InterArrivals := TExponentialDistribution.Create(420623, 1.0/3.0);
ServiceTimes := TExponentialDistribution.Create(220623, 1.0/4.0);
currtime := 0.0;
for i := 1 to simnumber do
begin
  obj := TJob.create;
  obj.simnumber := simnumber;
  obj.number := i;
  { arrival times are the running sum of exponential interarrivals }
  currtime := currtime + InterArrivals.Sample;
  obj.ArrivalTime := currtime;
  obj.Servicetime := ServiceTimes.Sample;
  { myobj.myproc1 is the simulation worker method from MM1.pas }
  TP.execute(myobj.myproc1, pointer(obj), NORMAL_PRIORITY);
end;
-------------------------------------------
Here we have the InterArrivals object and ServiceTimes object and we are
calling InterArrivals.Sample to get our samples from the Exponential
Distribution.
After that we are calling myobj.myproc1 to simulate our M/M/1 queuing
model...
If you look at MM1.pas, you will see that the arrival rate is 3 and
the service rate is 4, so this gives a theoretical value of
1/(4-3) = 1 for one server, and the Object Pascal simulation gave me
1.02 for one server.
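For reference, exponential samples like these can be produced by
inverse-transform sampling; this is a sketch of the idea only (the
TExponentialDistribution class in the zip may be implemented
differently, e.g. with its own seeded generator):
---------------------------
{ Sketch: inverse-transform sampling of an Exponential(rate) variable. }
function SampleExponential(rate: double): double;
begin
  { if U is uniform on [0,1), then -ln(1-U)/rate is Exponential(rate) }
  SampleExponential := -ln(1.0 - Random) / rate;
end;

begin
  Randomize;
  { rate 3.0 gives a mean interarrival time of 1/3, as in MM1.pas }
  writeln(SampleExponential(3.0):0:4);
end.
---------------------------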
Now I want to talk about the mathematical modeling of
queuing networks. If you have noticed, in a webserver
that uses a distributed database, the database
has to be modeled as a queuing system with a
hyperexponential service, but this will complicate the
mathematical modeling of the webserver since the
database then has to be modeled as an M/G/1 queuing system.
So we have to simplify the mathematical modeling: if,
for example, the database's write transactions take on average
much more time than the database's read transactions, we have
to choose the worst case scenario, which means we have to choose
the rate of the database's write transactions as the arrival
rate to the database's queuing system, so
that the database can be modeled as an M/M/1 queuing system
(M means Markovian), and the webserver can finally be
modeled as two M/M/1 queuing systems in series, one for
the database and one for the internet network.
But there is still a problem if we also want to handle the
worst case scenario where, for example, thousands of customers
arrive at the same time at the webserver.. so the webserver
has to be modeled taking into account this worst case scenario...
Here is the mathematical modeling of an M/M/1 queuing system:
We already know that to satisfy a Poisson process we must
have that N(t1)-N(t0), N(t2)-N(t1), etc. are independent;
that means the counting increments must be independent.
We have the following relation between the Poisson law
and Exponential law:
the expected value E(X exponential) = 1 / E(X poisson)
so if the arrivals are Poissonian then the interarrivals are
exponential..
Now I will calculate the expected mean waiting time and
mean number of customers in the queuing system:
The Little law gives the waiting time in the queuing system:
Ws = Ls/lambda
And the waiting time in the queue of the queuing system is:
Wq = Ws - 1/Mu
= Ls/lambda - 1/Mu
And the Little law gives us the expected mean number of customers in
the queue:
Lq = Wq*lambda = Ls - Phi and Phi = lambda/Mu
That implies:
Ls - Lq = Phi
When the system is in a stationary state, the balance equations give:
State rate in = rate out
state 0: Mu * P1 = lambda*P0
state 1: lambda*P0 + Mu*P2 = lambda*P1 + Mu*P1
state 2: lambda*P1 + Mu*P3 = lambda*P2 + Mu*P2
...
state n: lambda*Pn-1 + Mu*Pn+1 = lambda*Pn + Mu*Pn
And that gives us the following balance equations:
lambda * P0 = Mu*P1 <=> P1 = (lambda/Mu)*P0
lambda * P1 = Mu*P2 <=> P2 = (lambda/Mu)^2*P0
lambda * P2 = Mu*P3 <=> P3 = (lambda/Mu)^3*P0
...
lambda * Pn-1 = Mu*Pn <=> Pn = (lambda/Mu)^n*P0 [1]
Note: P0 means the probability of having zero customers in the queuing
system, and ^ means power.
And we have also that:
Sum(n=0 -> infinity) (Pn) = 1, that means the sum of probabilities equals 1
And [1] gives us:
Sum(n=0 -> infinity) (Phi^n*P0) = 1
And by resolving the sum of a geometric series that gives us:
P0 * 1/(1-Phi) = 1 => P0 = 1 - Phi and Phi < 1
And [1] gives us:
Pn = (1-Phi) * Phi^n
And we have that the mean expected number of customers in the queuing
system is:
Ls = Sum(n=0 -> infinity) (n*Pn)
That implies:
Ls = Sum(n=0 -> infinity) (n*(1-Phi)*Phi^n) = Phi/(1-Phi) and Phi<1
and we have that the mean expected number of customers in the queue of
the queuing system is:
Lq = Ls - Phi = Phi^2/(1-Phi) and Phi<1
Little law gives us the mean waiting time in the system and in the queue:
Ws = 1/lambda * Phi/(1-Phi) and Phi<1
and
Wq= 1/lambda * Phi^2/(1-Phi) and Phi<1
I have just read the following page, look at the USL
(Universal Law of Computational Scalability) of Dr. Gunther,
he wrote this: (See: http://en.wikipedia.org/wiki/Neil_J._Gunther )
--------------
The relative capacity C(N) of a computational platform is given by:
C(N) = N
------------------------------------------
1 + α (N - 1) + ß N (N - 1)
where N represents either the number of physical processors
in the hardware configuration or the number of users driving the
software application. The parameters α and ß represent respectively
the levels of contention (e.g., queueing for shared resources) and
coherency delay (i.e., latency for data to become consistent) in the
system. The ß parameter also quantifies the retrograde throughput
seen in many stress tests but not accounted for in either Amdahl's law
or event-based simulations.
----------
His website: http://www.perfdynamics.com/
If you read carefully, you will see that Dr. Gunther is using this
model to predict scalability after he simulates a relatively small
number of vusers in LoadRunner (because of licensing costs, it's
cost-effective); after that he finds the coefficients of the
2nd-degree polynomial (quadratic equation) and then transforms
those coefficients back to the USL parameters using α = b - a
and ß = a.
And then he is extrapolating with the USL model to higher loads
to predict scalability.
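The quoted capacity formula is straightforward to transcribe into
Object Pascal; in this sketch the alpha and beta values are
hypothetical stand-ins for the fitted parameters:
---------------------------
{ Sketch: the USL relative capacity C(N) quoted above. }
function USLCapacity(N: integer; alpha, beta: double): double;
begin
  USLCapacity := N / (1.0 + alpha * (N - 1) + beta * N * (N - 1));
end;

var
  N: integer;
begin
  { retrograde throughput shows up as C(N) decreasing at large N }
  for N := 1 to 64 do
    writeln(N, ' -> ', USLCapacity(N, 0.05, 0.001):0:3);
end.
---------------------------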
He is also applying the model to webservers with heterogeneous
workloads, like in the following page:
http://perfdynamics.blogspot.com/2009/04/assessing-usl-scalability-wi...
Now my question follows:
Suppose we have obtained a small number of measured load-points
with LoadRunner or other tools, and we calculated the USL equation
to predict the scalability of a webserver; how can the USL model
predict whether the scalability/performance is limited by the network
bandwidth and not by the server? I think USL can not predict this.
When we are modeling webservers, we have to include
the network node in our queuing network model (which includes
the computer server queue), and since the service in the
computer server is composed of multiple services (when we
are using HTML pages, databases, etc.) the network queue will not
be Markovian in its service, and we have to model the network
queue as an M/G/1, and this will complicate the mathematical
analytical modeling...
So, i think the best way is to use a webserver stress tool
like http://fwptt.sourceforge.net/
You can even test the webserver with heterogeneous
workloads by starting multiple fwptt processes, and
you should increase the number of threads to 5 and after
that to 10 threads, 15... and so on until the webserver
application stops responding properly (and this will tell you
the maximum number of users that you can have at the same time...),
and as you are stress testing you can even see (by testing/measuring
it) whether the network bandwidth is the bottleneck... and this can
not be done by the USL model.
We already know that to satisfy a Poisson process we must
have that N(t1)-N(t0), N(t2)-N(t1), etc. are independent;
that means the counting increments must be independent.
We have the following relation between the Poisson law
and Exponential law:
the expected value E(X exponential) = 1 / E(X poisson)
and if the arrivals are Poissonian then the interarrivals are
exponential..
Now what about a webserver ?
I have read the following paper:
http://docs.google.com/viewer?a=v&q=cache:JFYCp_dSPP4J:citeseerx.ist....
And it says that a simple model with M/M/1/k with FCFS discipline
can predict webserver performance quite well..
Hence, i think we can model our webserver over internet
with 3 queues connected as a Jackson Network like this
A -> M/M/1 Server Queue -> M/M/1 Network queue -> M/M/1 Client -> A
A: is the arrival rate
and we have the following:
Ni: number of jobs in each queue
Ui: utilization of each queue
Ni = Ui / (1-Ui)
Adding all the Ni in each individual queue will give the
average number of jobs in the entire queuing network.
After that we apply the Little formula:
A: network arrival rate
T: average response time
N = A*T => T = N / A
And after that from the mathematical analytical equation
we can simulate the jackson queuing network that model our
webservers...
Now there is still an important question that i have:
Our analytical Jackson network model can give us insight
into the webserver's behavior.. but the difficulty that I find in
practice is this: suppose we have found the right parameters
that we want to choose, like for example the service rate of
the M/M/1 Server Queue; how, from this service rate, can
we buy the right computer that satisfies the service rate
that we want?
We say in OR that:
"Understanding the behavior of a system is what Queueing Theory
and Little's Law is all about. But, managing a Queue requires not
just understanding the behavior of a system, but also in-depth
knowledge of how to improve a system - improving both aspects
of Queueing will mean a better, more efficient and cost-effective
system and, more importantly, a much better customer experience."
I wrote before that:
---
"It says that a simple model with M/M/1/k with FCFS discipline
can predict webserver performance quite well..
Hence, i think we can model our webserver over internet
with 3 queues connected as a Jackson Network like this
A -> M/M/1 Server Queue -> M/M/1 Network queue -> M/M/1 Client -> A
A: the arrival rate
and we have the following:
Ni: number of jobs in each queue
Ui: utilization of each queue
Ni = Ui / (1-Ui)
Adding all the Ni in each individual queue will give the
average number of jobs in the entire queuing network.
After that we apply the Little formula:
A: network arrival rate
T: average response time
N = A*T => T = N / A
And after that from the mathematical analytical equation
we can simulate the jackson queuing"
--
As you have noticed, this mathematical model of
this Jackson network does in fact take into account
the M/M/1 Network queue node; the USL model can not
do this... and with the performance data from the mathematical
analytical model simulation we can for example validate
the performance data of the fwptt stress webserver simulation..
But you have to take into account worst cases and the
peak traffic loads...
Suppose for example we have a webserver hosting HTML pages
that is receiving 1000000 HTTP operations
per day with an average file size of 10 KB.
What would be the network bandwidth required for this website
considering peak traffic, if the peak traffic load from past
observations was four times greater than average loads?
Required bandwidth is solved by the following equation:
HTTP ops/sec x average file size, or
1000000 HTTP ops per day = 1000000/24 = 41,667 ops/hour =
41,667/3600 = 11.6 HTTP ops/sec
The needed bandwidth is
11.6 HTTP ops/sec X 10 KB/HTTP op = 116 KB/sec = 928 Kbps.
If we assume a protocol overhead of 20%, then the actual throughput
required is 928 Kbps X 1.2 = 1,114 Kbps.
However, if peak loads, as I said before, are as much as
4 times greater, the bandwidth required to handle spikes
would be 4 X 1,114 Kbps = 4.456 Mbps.
So you have to think also about the cost of this line...
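Here is the same worked example as an Object Pascal sketch, so the
steps are easy to replay with other inputs (the text rounds to 11.6
ops/sec before multiplying, which is why its figures differ slightly):
---------------------------
{ Sketch: the bandwidth calculation of the worked example above. }
var
  opsPerSec, avgKbps, actualKbps, peakMbps: double;
begin
  opsPerSec := 1000000.0 / (24 * 3600);   { ~11.6 HTTP ops/sec }
  avgKbps := opsPerSec * 10.0 * 8.0;      { 10 KB per op, KB/s -> Kbps }
  actualKbps := avgKbps * 1.2;            { 20% protocol overhead }
  peakMbps := actualKbps * 4.0 / 1000.0;  { peak = 4 x average }
  writeln('average = ', avgKbps:0:0, ' Kbps');
  writeln('with overhead = ', actualKbps:0:0, ' Kbps');
  writeln('peak = ', peakMbps:0:3, ' Mbps');
end.
---------------------------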
I will add the following:
As you have noticed i said that:
"As you have noticed , this mathematical model of
this jackson network does in fact take into account
the M/M/1 Network queue node , the USL model can not
do this... and with this performance data from the mathematical
analytical model simulation we can for example validate
the performance data of the fwptt stress webserver simulation.."
and i said that:
"Hence, i think we can model our webserver over internet
with 3 queues connected as a Jackson Network like this
An M/M/1 Server Queue -> M/M/1 Network queue -> M/M/1 Client queue"
And of course on Capacity Planning for Enterprise Datacenters
and Websites , you can mirror many computer servers and load
balance between them with a software... to make the system much
FASTER, and this will be modeled as a jackson network like this:
A -> M/M/n Server Queue -> M/M/1 Network queue -> M/M/1 Client -> A
A: the arrival rate to the system
But there is still an important thing, as I have shown before
in my calculations:
"However if peak loads, as i said before, is as much as
4 times greater, the bandwidth required to handle spikes
it would be 4 X 1,114 Kbps = 4.456 Mbps.
So you have to think also about the cost of this line..."
I think that you also have to take into account the knee utilization
of your M/M/n Server Queues: if for example the number of computer
servers is 8 and the Knee is 74%, that means that in our previous
example the bandwidth must equal:
74% X 4.456 Mbps = 3.297 Mbps.
Because, as you know, above this Knee of 74% for 8 servers
the curve of the waiting time grows quickly..
And we have to take into account the cost of the line...
Let us take another example:
In the network node of the Jackson network, there are three
main parameters that are responsible for its performance:
latency, bandwidth and utilization.
Now, if for example you take a look at my provider
http://www.videotron.com/service/internet-services/internet-access/hi...
My upload network speed is clearly 820 Kbps => 102.5 KBytes/sec.
Hence, if the average file size on my web site is for
example 25 KBytes and the protocol overhead is 20%, that
means that the file size will be on average:
25 * 120% = 30 KBytes
And if the Knee of the M/M/1 queuing node in our Jackson network
is 50%, then the bandwidth available will be reduced to
102.5 Kbytes/sec * 50% = 51.25 Kbytes/sec
If the download speed of the client is for example 7.5 Mbps,
that means that the file can be retrieved in no more than
30 / 51.25 = 0.585 seconds = 585 milliseconds.
But if we assume, for example, that a typical web visitor
to my web site spends 585 milliseconds retrieving the
page and 60 seconds reading the HTML page, THEN each client
on average only requires:
0.585 / (0.585 + 60) = 0.97% of the bandwidth that is allocated to
it..
but in practice I think we have to consider the worst case
scenarios..
So, since we must support a constraint of no more than 5 seconds
on T (the average response time), that means:
585 ms * X = 5000 milliseconds => X = 8.54
Hence, in the worst case, my network node can not support more
than 8.54 users at the same time...
So, on *average* my network node can support:
(8.54 * 86400) / 5 = 147571 users per day
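Here is that per-client capacity arithmetic as a small Object Pascal
sketch, using the numbers from this example:
---------------------------
{ Sketch: the network-node capacity arithmetic above. }
var
  fileKB, linkKBs, retrieveSec, maxT: double;
  usersAtOnce, usersPerDay: double;
begin
  fileKB := 25.0 * 1.2;             { 25 KB page + 20% protocol overhead }
  linkKBs := 102.5 * 0.5;           { 820 Kbps upload at a 50% knee }
  retrieveSec := fileKB / linkKBs;  { ~0.585 s to retrieve a page }
  maxT := 5.0;                      { response-time constraint }
  usersAtOnce := maxT / retrieveSec;            { ~8.54 users, worst case }
  usersPerDay := usersAtOnce * 86400.0 / maxT;  { ~147571 users per day }
  writeln(retrieveSec:0:3, ' s, ', usersAtOnce:0:2, ' users, ',
    usersPerDay:0:0, ' users/day');
end.
---------------------------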
I wrote before that:
--
And of course on Capacity Planning for Enterprise Datacenters
and Websites , you can mirror many computer servers and load
balance between them with a software... to make the system much
FASTER, and this will be modeled as a Jackson network like this:
(1)
A -> M/M/n Server Queue -> M/M/1 Network queue ->
-> M/M/1 Client queue -> A
A: the arrival rate to the system"
--
We also know from an operational law of queuing theory that:
(2) the rate of jobs leaving any stable node must equal
its arrival rate.
(2) is from the Equivalence property.
Equivalence property: Assume that a service facility with
s servers and infinite queue has a Poisson input with
parameter lambda and the same exponential service-time
distribution with parameter Mu for each server (the M/M/s model),
where s * Mu > lambda. Then the steady-state output of
this service facility is also a Poisson process with parameter
lambda.
We know for an M/M/c queue that:
Note: try to add servers with almost the same hardware
configuration...
C(c, U) = Erlang formula = P(c) / (1 - Utilization)
Note: c is the number of servers.
P(c): the probability that all the servers are busy.
P(0): the probability that there is no waiting time in the
queue, which also means that AT LEAST one server among the c servers
is not busy...
The average waiting time in the 'queue' =
C(c,U) / (service rate x c x (1 - Utilization)) (3)
It's approximately equal to:
Utilization^c / (service rate x (1 - Utilization^c))
Note: ^ means power
This approximation is exact for the M/M/1 and M/M/2 models,
but 'slightly' lower than the result in (3) if c > 2
and
Utilization = Density of circulation / C (number of servers)
Note: C means the number of servers
Response time = The average waiting time in the 'queue' +
(1 / service rate)
Average number of users in the system = arrival rate x response time
Average number of users in the queue = arrival rate x average waiting
time in the 'queue'
Now as i said before:
--
So the equation for
Ni: number of jobs in each queue
Ui: utilization of each queue
Ni = Ui / (1-Ui)
Adding all the Ni in each individual queue will give the
average number of jobs in the entire queuing network.
After that we apply the Little formula:
A: network arrival rate
T: average response time
N = A*T => T = N / A
And after that from the mathematical analytical equation
we can simulate the Jackson queuing network that models our
webservers...
---
If we try to calculate the Ni = Ui / (1-Ui) in
the Jackson network, and from the operational law (2) above
this will give us:
Ns for the M/M/n Server queue is:
(DC / n) / (1 - (DC/n))
and DC = Ss / A => Ns = ((Ss/A)/n) / (1 -((Ss/A)/n))
Ss: service rate at the queuing server.
A: Arrival rate to the jackson network
DC: is the Density of circulation
n: number of servers
And Nn for the M/M/1 Network queue is:
Utilization in the M/M/1 network queue = Sn / A,
which implies that => Nn = (Sn/A) / (1 -(Sn/A))
Nn: number of jobs in the M/M/1 network queue node
Un: Utilization in the network queue node
Sn: service rate at the network queue node
A: Arrival rate to the Jackson network
And Nc for the M/M/1 Client queue node is:
Uc = Sc / (F/5),
which implies that => Nc = (Sc/(F/5)) / (1 -(Sc/(F/5)))
Note: F/5, if for example the rate is one file every 5 seconds.
Nc: number of jobs in the M/M/1 client queue node
Uc: Utilization in the M/M/1 client queue node
Sc: service rate at the client queue node
A: Arrival rate to the Jackson network
F: Average file size
Adding all the Ni in each individual queue will
give the average number of jobs in the entire
queuing network that is equal to:
Ni = Nn + Ns + Nc
= (((Ss/A)/n) / (1 -((Ss/A)/n)) + (Sn/A) / (1 -(Sn/A))
+ (Sc/(F/5)) / (1 -(Sc/(F/5)))
After that we apply the Little formula:
A: network arrival rate
T: average response time
N = A*T => T = N / A
this implies that T (the average response time in the Jackson
network) is:
T = Ni / A = (Nn + Ns + Nc) / A
= ((((Ss/A)/n) / (1 -((Ss/A)/n)) + (Sn/A) / (1 -(Sn/A))
+ (Sc/(F/5)) / (1 -(Sc/(F/5)))) / A
Now finally we have our mathematical analytic equation
and we can begin to simulate our Enterprise webserver
and also validate the performance data with the fwptt stress
webserver simulation tool...
And don't forget what i have said about USL:
"Suppose we have obtained a small number of measured
load-points with Loadrunner or others tools, and we
calculated the USL equation to predict scalability of
a webserver, how can the USL model predict whether the
scalability/performance is limited by the network bandwidth
and not by the server? I think USL can not predict this."
Also i wrote about USL that:
"As you have noticed , this mathematical model of
this jackson network does in fact take into account
the M/M/1 Network queue node , the USL model can not
do this... and with this performance data from the mathematical
analytical model simulation we can for example validate
the performance data of the fwptt stress webserver simulation.."
If you have noticed, I have also tried to show that one
of the principal objectives is to control and manage
the *Quality* of the system or process; I wrote in my
post the following:
---
this implies that T (the average response time in the Jackson
network) is:
T = Ni / A = (Nn + Ns + Nc) / A
= ((((Ss/A)/n) / (1 -((Ss/A)/n)) + (Sn/A) / (1 -(Sn/A))
+ (Sc/(F/5)) / (1 -(Sc/(F/5)))) / A
Now finally we have our mathematical analytic equation
and we can begin to simulate our Enterprise webserver
and also validate the performance data with the fwptt stress
webserver simulation tool...
---
So, as you have noticed, you can simulate our enterprise
webserver with this mathematical analytical equation and
validate the data collected by the fwptt stress webserver tool,
and try to keep T (the average response time of your HTTP requests)
below or equal to 5 seconds, and see also if your webserver
is scaling; we also call this Quality control using OR
(operational research).
In quality management, quality can also be maintained and
improved through the detection and reduction of process
variation, where statistical techniques such as ANOVA and others
can be used..
The ANOVA method, for example, analyses the variation in
the data set to discover the amount of variation that can
be attributed to each factor.. etc.
There is still an important thing..
As you have noticed, I have finally arrived at the following
mathematical model of our Jackson network that models an Enterprise
website:
T = Ni / A = (Nn + Ns + Nc) / A
= ((((Ss/A)/n) / (1 -((Ss/A)/n)) + (Sn/A) / (1 -(Sn/A))
+ (Sc/(F/5)) / (1 -(Sc/(F/5)))) / A
T is the average response time..
So read it carefully and you will notice that this mathematical
model can also simulate and predict the behavior of an intranet
website or an internet website..
First look at the first factor (((Ss/A)/n) / (1 -((Ss/A)/n))
in our mathematical model: as you have noticed, as
n (the number of servers) grows, the denominator
grows and the factor (((Ss/A)/n) / (1 -((Ss/A)/n)) gets
smaller, and that's good for the response time...
Now let's look at the other factors in this mathematical
model:
IF the system is an Intranet and, for example, the local
area network Sn is faster than the multiserver, that means:
we have Sn >> Ss, and that means => Sn/A >> Ss/A => the
Knee on the server side can be reached more quickly than the
Knee in the network node of our Jackson network, so that
the bottleneck becomes the server node.
On the other hand, if:
Ss >> Sn => Ss/A >> Sn/A => the Knee in the M/M/1 network
node can be reached more quickly, so that the network node in
the Jackson network becomes the bottleneck.
Availability:
The demand for high availability and reliability of computer and
software systems has led to the formal verification of such systems.
There are two principal approaches to formal verification:
model checking and theorem proving.
And I wrote about Petri Nets, how to model parallel
programs, and how to exploit verification tools such as Tina to
verify properties such as the liveness of the system.
Please take a look at my other article about Petri Nets:
http://pages.videotron.com/aminer/PetriNet/formal.htm
And by using Petri Nets and Tina you can make your system
more reliable, and if it's more reliable then the system
will have higher availability.
There are important factors to take into consideration when
building systems, factors such as availability and
efficiency (scalability, etc.).
System availability is also calculated by modeling the system
as an interconnection of parts in series and parallel.
The availability A of a system is calculated like this:
A = MTBF / (MTBF + MTTR)
MTBF: Mean time between failures
MTTR: Mean time to repair
and the MTTR is composed of:
time to respond to the failure + time to find
the problem + time to correct the problem + time
to verify that the system is working correctly.
Now if the interconnected parts are in parallel, the equation
that we use to calculate the availability is:
A = 1 - (U1 * U2 * U3 * ... * Un) (1)
Ui: the non-availability (1 - Ai) of component part i of the system...
The availability of a system composed of interconnected parts in
series is:
A = A1 * A2 * ... * An (2)
Now, as you have noticed, I have modeled an enterprise website
as a Jackson network, and I have said before that:
--
And of course on Capacity Planning for Enterprise Datacenters
and Websites, you can mirror many computer servers and load
balance between them with software... to make the system much
FASTER, and this will be modeled as a Jackson network like this:
A -> M/M/n Server Queue -> M/M/1 Network queue -> M/M/1 Client -> A
A: the arrival rate to the system
--
As you have noticed, I wrote "FASTER" but I didn't speak
about the AVAILABILITY; you will notice that you can not just
make your system more EFFICIENT and FASTER, but you can
also raise the AVAILABILITY of your system by mirroring
many servers...
I also wrote before that:
---
"If you have noticed i have tried to show also that one
of the principal objectives is to control and manage
the *Quality* of the system or process, i wrote in my
previous post the following:
this implies that T (the average response time in the Jackson
network) is:
T = Ni / A = (Nn + Ns + Nc) / A
= ((((Ss/A)/n) / (1 -((Ss/A)/n)) + (Sn/A) / (1 -(Sn/A))
+ (Sc/(F/5)) / (1 -(Sc/(F/5)))) / A
Now finally we have our mathematical analytic equation
and we can begin to simulate our Enterprise webserver
and also validate the performance data with the fwptt stress
webserver simulation tool...
So, as you have noticed, you can simulate our enterprise
webserver with this mathematical analytical equation and
validate the data collected by the fwptt stress webserver tool,
and try to keep T (the average response time of your HTTP requests)
below or equal to 5 seconds, and see also if your webserver
is scaling; we also call this Quality control using OR
(operational research).
In quality management, quality can also be maintained and
improved through the detection and reduction of process
variation, where statistical techniques such as ANOVA and others
can be used.. "
---
We have many mathematical tools that we can use in Quality control,
for example: Correlation, Regression, ANOVA, etc.
ANOVA is a powerful and important mathematical tool in
Quality control: by calculating the p-value in Excel or other
statistical tools, we get important information, such as
whether one or more variables have a significant effect on other
variables. If, for example, the p-value <= 0.05 (or if F > the
critical F value), this implies that the null hypothesis is rejected,
and the factor (or factors) does in fact have an effect..
Regards,
Amine Moulay Ramdane.
== 2 of 3 ==
Date: Sat, Dec 21 2013 10:49 am
From: aminer
Hello,
Don't forget to read all the text inside my post,
where I have also mathematically modeled, using a queuing network,
even a web server with a distributed database... I have corrected some
typos, so please read again...
Thank you,
Amine Moulay Ramdane.
== 3 of 3 ==
Date: Sat, Dec 21 2013 10:52 am
From: aminer
Hello,
Don't forget to read all the text inside my post
about a web server that I have mathematically modeled using a
queuing network,
even a web server with a distributed database... I have corrected some
typos, so please read again...
Thank you,
Amine Moulay Ramdane.
==============================================================================
You received this message because you are subscribed to the Google Groups "comp.programming.threads"
group.
To post to this group, visit http://groups.google.com/group/comp.programming.threads?hl=en
To unsubscribe from this group, send email to comp.programming.threads+unsubscribe@googlegroups.com
To change the way you get mail from this group, visit:
http://groups.google.com/group/comp.programming.threads/subscribe?hl=en
To report abuse, send email explaining the problem to abuse@googlegroups.com
==============================================================================
Google Groups: http://groups.google.com/?hl=en