soft and program: Digest for comp.lang.c++@googlegroups.com

comp.lang.c++@googlegroups.com


Utah C++ Programmers: Asynchronous Messaging with 0MQ - 1 Update
memcpy - 2 Updates
Counter-example to Rice's Theorem ? - 2 Updates
Strange structure initialization problem. - 4 Updates

Utah C++ Programmers: Asynchronous Messaging with 0MQ

legalize+jeeves@mail.xmission.com (Richard): Feb 10 07:33PM

[Please do not mail me a copy of your followup]

Join the Utah C++ User Group tomorrow for our regular monthly meeting!

Seth Hays will give a presentation on Asynchronous Messaging with 0MQ
(ZeroMQ).

Utah C++ Programmers meet on the 2nd Wednesday of every month.

Full event details:
<http://www.meetup.com/Utah-Cpp-Programmers/events/219986869/>
--
"The Direct3D Graphics Pipeline" free book <http://tinyurl.com/d3d-pipeline>
The Computer Graphics Museum <http://computergraphicsmuseum.org>
The Terminals Wiki <http://terminals.classiccmp.org>
Legalize Adulthood! (my blog) <http://legalizeadulthood.wordpress.com>

memcpy

David Brown <david.brown@hesbynett.no>: Feb 10 09:44AM +0100

On 09/02/15 17:17, Christopher Pisz wrote:

> Oh come on, it is well beyond end of life and they know darn well 99% of
> those will upgrade. I have no such pull with my customers. Nor do I have
> so few competitors.

I don't know whether you are joking or ignorant. Statistics vary
depending on who is counting, and what they are counting, but XP is
generally rated at 15-20% share of current desktops, while Win8 + Win8.1
is at 10-15% and Win7 is over 50%.

People don't upgrade their OS - they replace the entire machine. And
those that have been happy with XP for the past couple of years, are
mostly happy to continue with it for a good while longer.

Christopher Pisz <nospam@notanaddress.com>: Feb 10 10:14AM -0600

On 2/10/2015 2:44 AM, David Brown wrote:

> People don't upgrade their OS - they replace the entire machine. And
> those that have been happy with XP for the past couple of years, are
> mostly happy to continue with it for a good while longer.

Ok....so by these statistics, if I were to consider Flibble's comment
and "reconsider my OS", I would drop a customer base that covers more
than 75-85% of the market ....makes sense to me!

At any rate, I don't know what you are trying to argue.

Counter-example to Rice's Theorem ?

"X.Y. Newberry" <newberryxy@gmail.com>: Feb 09 08:45PM -0800

Peter Olcott wrote:
> whether or not it can correctly decide the halting property for every
> element of
> Program x Inputs.

I still wonder if self-referentiality is decidable.

--
X.Y. Newberry

If Jack says 'What I am saying at this very moment is not true', we can
successfully and truly assert that he did not utter a truth: 'What Jack
said is not true'. But it is hardly conceivable that Jack's utterance is
true by virtue of its success in attributing non-truth to itself.

Haim Gaifman

Peter Olcott <OCR4Screen>: Feb 10 06:44AM -0600

On 2/9/2015 10:45 PM, X.Y. Newberry wrote:
>> element of
>> Program x Inputs.

> I still wonder if self-referentiality is decidable.

It is *not* a matter of self-referentiality per se that is being decided.
It seems to be that both self-referentiality and any of its equivalents are
*all* being decided. Also it is not just self-referentiality, yet *only*
those cases where self-referentiality prevents halting from being decided.

Pathological Self Reference (PSR) is the case where actual self-reference
to a TM m by its input data causes neither TRUE nor FALSE to be a correct
return value corresponding to m(p,i).

The equivalent of self-reference would be the case where the input to
TM m (p,i) includes another TM m2 that is equivalent to m, in that for this
specific instance of m and m2, and this specific instance of (p,i) in
Programs x Inputs, m2 would return the same value that m would return.

A TM m2 that is identical to m has the effect of a TM that is equivalent
to m and not the same effect as m itself, when m itself is directly included
in (p,i). When m is directly included in (p,i) then the three return values
of m actually change the literal string pair of (p,i). When m2 is merely
identical to m, then the three return values of m do not change the literal
string pair (p,i).

To refer back to my original terminology, when-so-ever any Turing Machine m
is presented with any essentially (yes/no) question that lacks a correct
answer from the set of (yes/no), such that neither yes nor no returned by m
forms a correct answer to this (yes/no) question, m simply decides that this
question when posed to itself is incorrect and thus returns NEITHER.

The above generally applies to *any* undecidable property of (p,i) and is
thus not limited to merely the halting property of (p,i). Therefore m can
*always* decide the decidability of any nontrivial property of the r.e. sets,
thereby forming a counter-example disproving Rice's Theorem.

Automata and Computability, Dexter Kozen (1997) page 245:
(Rice's Theorem) Every nontrivial property of the r.e. sets is undecidable.

//
// m is a string encoding of a Turing Machine
// m(p,i)(TRUE) means: m(p,i) leaves a single 1 digit on the tape
// m(p,i)(FALSE) means: m(p,i) leaves a single 0 digit on the tape
// m(p,i)(ERROR) means: m(p,i) leaves a single 2 digit on the tape
//
01 There exists a TM, m, such that
02 for every (p, i) in Sigma* x Sigma*
03 m(p, i) halts with
04 m(p, i) in {TRUE, FALSE, ERROR} <and>
05 (halts(p, i) == m(p, i)(TRUE) <xor>
06 halts(p, i) == m(p, i)(FALSE) <xor>
07 (halts(p,i) != m(p,i)(TRUE) <and> halts(p,i) != m(p,i)(FALSE)))

Line 05 specifies the subset of Sigma* x Sigma* where returning
TRUE would be correct.

Line 06 specifies the subset of Sigma* x Sigma* where returning
FALSE would be correct.

Line 07 specifies the subset of Sigma* x Sigma* where returning TRUE
or returning FALSE would *both* be incorrect, thus it returns ERROR.

Strange structure initialization problem.

DSF <notavalid@address.here>: Feb 10 12:56AM -0500

On Sun, 08 Feb 2015 23:13:34 -0500, DSF <notavalid@address.here>
wrote:

Hello.

Info below.
> mov [ebp-0x04], eax

> This is wrong on multiple levels.

> 1. The compiler doesn't use immediate 0s, but two memory locations.

The compiler treats initialization of a structure the same way it
treats initialization of a character array. As in:

void Bar(void)
{
char foo[] = {"This is foo!"};
...

In this case, the text is not constant, but is expected to contain
"This is foo!" after the above line every time Bar is run. Most
compilers, I imagine, will store the literal text string in a static
area and copy it to local "foo" memory at the start of Bar.
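A minimal sketch of that behaviour, assuming a conforming C++ compiler. The function name foo_is_fresh is hypothetical and not from the original post. It checks that the local array holds the literal text on entry, then overwrites the local copy; the next call should still see the original text because the array is re-initialised each time.

```cpp
#include <cstring>

// Hypothetical helper: reports whether the local array starts out holding
// the literal text, then modifies the local copy.
bool foo_is_fresh()
{
    char foo[] = {"This is foo!"};   // writable local copy of the literal
    bool fresh = std::strcmp(foo, "This is foo!") == 0;
    foo[0] = 'X';                    // changes only this call's copy
    return fresh;
}
```

Calling foo_is_fresh() repeatedly should return true every time, since the write to foo[0] never reaches the stored literal.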

The code:

UINT array[20] = {1, 2, 3, 4};

Produces:
_DATA segment dword public use32 'DATA'
align 4
$mkbknmaa label dword
dd 1
dd 2
dd 3
dd 4
db 64 dup(?)
_DATA ends
...
; UINT array[20] = {1, 2, 3, 4};
mov esi,offset $mkbknmaa
lea edi,dword ptr [ebp-116]
mov ecx,20
rep movsd
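Written out by hand, the listing above corresponds roughly to the following sketch. The names array_image and sum_of_initialised_array are hypothetical, and UINT is assumed to be a 32-bit unsigned int. The static array stands in for the compiler-generated $mkbknmaa block, and memcpy plays the role of the rep movsd copy.

```cpp
#include <cstring>

typedef unsigned int UINT;   // assumption: UINT is a 32-bit unsigned int

// Stand-in for the compiler-generated static image ($mkbknmaa in the listing).
// The language zero-fills the 16 elements that have no initializer.
static const UINT array_image[20] = {1, 2, 3, 4};

// Hand-written equivalent of `UINT array[20] = {1, 2, 3, 4};`
// as the listing implements it: copy 20 elements from static storage.
UINT sum_of_initialised_array()
{
    UINT array[20];
    std::memcpy(array, array_image, sizeof array);   // the "rep movsd" step
    UINT sum = 0;
    for (int i = 0; i < 20; ++i)
        sum += array[i];
    return sum;   // 1 + 2 + 3 + 4 plus sixteen zeros
}
```

If something overwrites the static image at run time, every later initialisation copies the damaged values, which matches the symptom described in this thread.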

I believe this is what my compiler is automatically doing with any
structure or array initialization. It stores the literal numeric
values in a static area and then copies them into the structure. If I
change the original to:

IOERRORS ret;

ret.ioerror = 0;
ret.syserror = 0;

It becomes:
; IOERRORS ret; // = {0, 0};
; ret.ioerror = 0;
@6:
xor eax,eax
mov dword ptr [ebp-8],eax
;
; ret.syserror = 0;
xor edx,edx
mov dword ptr [ebp-4],edx

Still stupid code. And it still creates/uses a local ret and copies
it to the LHS pointer on the stack at the end, even though there is
only one exit point. But at least I understand now; it's a
boilerplate for initialization.
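For comparison, here is a sketch of the two forms side by side. The layout of IOERRORS is an assumption based on the two member names and the "two unsigned integer zeros" mentioned in this post; the function names are hypothetical. Both forms must yield the same values. They differ only in whether the zeros come from a static image or from immediate stores.

```cpp
typedef unsigned int UINT;   // assumption: both members are unsigned ints

struct IOERRORS {            // layout assumed from the member names in the post
    UINT ioerror;
    UINT syserror;
};

// Aggregate initialization: this compiler copies the values from static storage.
IOERRORS make_by_initializer()
{
    IOERRORS ret = {0, 0};
    return ret;
}

// Member-wise assignment: the values are stored directly, with no static image
// that could be overwritten by unrelated code.
IOERRORS make_by_assignment()
{
    IOERRORS ret;
    ret.ioerror = 0;
    ret.syserror = 0;
    return ret;
}
```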

> 2. The memory locations are referenced in entirely different ways.

That, as has been mentioned, is in linker territory.

>C source file. (The C file *is* being compiled by the C compiler.)
>The only reason this C file is part of the project and not a library
>call is so that I can debug it properly.

It turns out that FBaseString<wchar_t>::blank + 0x30 resolves to...

...wait for it...

Address 0x42a694! Which explains:
> 4. I have no idea what/where [0x42a698] refers to.
It's the address following FBaseString<wchar_t>::blank + 0x30!

So there we have our static storage of two unsigned integer zeros.

Thanks to Ian Collins for putting me onto setting a breakpoint on
write, which forced me to track down the address of
FBaseString<wchar_t>::blank + 0x30. Provided by going into a function
of said template and placing a watch on &blank. Then setting a write
breakpoint for 8 bytes at address 0x42a694.

As to whether it's a compiler bug or an error of mine is yet to be
determined. But address 0x42a694 lies right in the middle of a buffer
of FBaseString and is overwritten near the end of the program loop
that this code is within. Thus causing all subsequent loop passes to
fail because GetVolumeInfo doesn't return 0, 0.

I've also learned (at least while I'm using this compiler) to avoid
multiple exit points if I'm returning a structure. As a test I added
three return ret; statements to GetVolumeInfo. Each one produced:

mov eax,dword ptr [ebp+8]
mov edx,dword ptr [ebp-8]
mov dword ptr [eax],edx
mov edx,dword ptr [ebp-4]
mov dword ptr [eax+4],edx
mov eax,dword ptr [ebp+8]
jmp @12
Copying the *same* local variable to the stack for return. They may
use different registers to do it, but it's an exit point, negating any
later requirements on the register values.

The last one even has a convenient label. So instead of repeating
the 17-byte sequence each time, they should have dumped every mov
above and changed the last to jmp @8!

@8:
mov ecx,dword ptr [ebp+8]
mov eax,dword ptr [ebp-8]
mov dword ptr [ecx],eax
mov eax,dword ptr [ebp-4]
mov dword ptr [ecx+4],eax
mov eax,dword ptr [ebp+8]
@12:
pop edi
pop esi
pop ebx
mov esp,ebp
pop ebp
ret
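A sketch of the single-exit style this suggests. The function check_steps and its parameters are hypothetical stand-ins, not the real GetVolumeInfo, and the error values are made up for illustration. Each failure path only records its error in ret and falls through to one return statement, so the copy-out sequence should be emitted once even by a compiler that does not merge identical tails.

```cpp
struct IOERRORS {            // layout assumed from the member names in the post
    unsigned ioerror;
    unsigned syserror;
};

// Hypothetical stand-in for a function like GetVolumeInfo.
IOERRORS check_steps(bool first_ok, bool second_ok)
{
    IOERRORS ret;
    ret.ioerror = 0;
    ret.syserror = 0;

    if (!first_ok) {
        ret.ioerror = 1;     // illustrative error code
    } else if (!second_ok) {
        ret.ioerror = 2;     // illustrative error code
        ret.syserror = 5;    // illustrative system error
    }

    return ret;              // the only exit point
}
```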

Enough of this off topic typing. At least I think I know why some
of the strange code is implemented the way it is. And I know where
the error is. FBaseString<wchar_t>::blank is a const static value, so
it's reasonable that 0x30 farther on (0x42a694) is still in the static
area. The memory allocated for the string starts at 0x42a664. I
don't have an idea off the top of my head how to determine if it's a
bug or corrupt memory data. 42a664 could be a segment of a string,
indicating a string has written over memory manager addresses, except
this is a 16-bit Unicode project and I don't believe there are any
ASCII strings. This will be the next challenge, as initializing ret
members individually (thus avoiding the address conflict) is only
putting a band-aid on a problem that's sure to arise when I least
expect it and byte me in the ass!

Thanks for all your advice and patience!
DSF
"'Later' is the beginning of what's not to be."
D.S. Fiscus

Marcel Mueller <news.5.maazl@spamgourmet.org>: Feb 10 09:21AM +0100

On 09.02.15 23.38, Ian Collins wrote:
> You undoubtedly have either 1) a compiler bug or 2) a memory corrupting
> bug in your code.

> Case 2) would appear more likely,

Normally I would fully agree. But he is using /Borland/ C. That makes 1)
more probable, so maybe it can beat 2).

I have really seen weird things with Borland C/C++ compilers. E.g. a
value is assigned to register X and then read from register Y a few
instructions later, of course without a matching MOV in between. Most of
these bugs were related to some optimization option, but not all of
them. And it was not restricted to one compiler version or platform. I
had this kind of problem with BCOS2 as well as with different Windows
versions.

> to use read only literals or set a break-point on writes to those
> locations. If you can't do both of those, that's yet another reason to
> upgrade!

Fortunately even very old Borland debuggers can use hardware data
breakpoints. So observing memory should not be a serious problem.

Marcel

David Brown <david.brown@hesbynett.no>: Feb 10 09:54AM +0100

On 09/02/15 22:27, DSF wrote:
> "GetVolumeInfo.c" was saved as shown. On the disk it is now
> "getvolumeinfo.c". Sometimes it does this, sometimes it leaves the
> case alone.

NTFS does not mangle case - it is a case-preserving filesystem. If you
save a file as "GetVolumeInfo.c", that's what you get on the disk.

Of course, applications (such as a compiler IDE) are free to screw up
filenames as much as they like. For example, rather than simply
overwriting an old file of similar name (same letters, different cases),
the IDE /could/ do so by opening the old file, writing the new contents,
and closing it again - then your newly saved file will (I think) take
the case of the old file.

> Also, for any item in the project list you can look up (and also
> change) what is used to process the file. For "GetVolumeInfo.c" it's
> CCompile, not CPPCompile.

Okay, so this is not an issue here. But aim for consistency and /try/
to save your C files as ".c". Maybe one day you will be using a serious
compiler and working on a serious OS.

> mid '90s. I don't have time right now to learn the operation of a new
> compiler, let alone the probability of having to alter every piece of
> source code I've written.

Yet you have time to deal with the endless problems you have with such
an old tool? If you are waiting until you have a few weeks to spare,
you will wait forever - changing to a newer and better development tool
is an investment that will save you time in the future.

(I am not a fan of upgrading just for the sake of getting the latest and
greatest, and I fully understand the benefits of sticking to an old but
known tool - but there are limits!)

> I got caught up in trying various permutations of the code and
> discovering it did something else unexpected. It will be posted soon
> after this.

Fair enough. There is a good chance that you will figure out the
problem yourself in this process - the exercise of "finding a minimal
compilable sample that demonstrates the problem" is as important to the
person with the problem as to those trying to help.

Marcel Mueller <news.5.maazl@spamgourmet.org>: Feb 10 10:11AM +0100

On 10.02.15 06.56, DSF wrote:
[init local copy]
> I believe this is what my compiler is automatically doing with any
> structure or array initialization. It stores the literal numeric
> values in a static area and then copies them into the structure.

Exactly.

> xor edx,edx
> mov dword ptr [ebp-4],edx

> Still stupid code.

Turn global register optimizations on (or whatever Borland called this)
and it will likely look prettier. But your debugger will
dislike the result.

> And it still creates/uses a local ret and copies
> it to the LHS pointer on the stack at the end, even though there is
> only one exit point.

The code in between usually needs the registers for something else or
calls functions that do not guarantee to preserve their values.

Furthermore the debugger cannot access variable values that have no
memory representation. So using a register is not an option as long as
you have debugging enabled.

> boilerplate for initialization.

>> 2. The memory locations are referenced in entirely different ways.

> That, as has been mentioned, is in linker territory.

It is up to the implementation to use one or another method.
Independently of your code.

However, there should also be an option to place compile-time constants
in the code segment. This is not as natural as it sounds.
Firstly, because on some platforms read and execute access require
different permissions. Think of the NX feature, although x86 learned
this quite late.
Secondly, in old C, string literals like your "This is foo!" are of
type char* rather than const char*. This has the effect that when one is
passed to a function as a char* argument, the function is allowed to change
the value of this constant. That is the reason why they are normally
placed in the DATA segment rather than TEXT. This behavior is no longer
valid, but your compiler might still be compatible with it.
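A small sketch of the modern rule, assuming a C++11 or later compiler. The names blank_out and demo_writable_copy are hypothetical. A function that modifies its argument is given a writable array initialised from the literal, never the literal itself.

```cpp
#include <cstddef>

// Hypothetical function that modifies its argument in place.
void blank_out(char* s)
{
    for (std::size_t i = 0; s[i] != '\0'; ++i)
        s[i] = ' ';
}

bool demo_writable_copy()
{
    const char* literal = "This is foo!";  // the literal itself is read-only
    char copy[] = "This is foo!";          // writable array initialised from it
    blank_out(copy);                       // fine: modifies only the array
    // blank_out(literal);                 // rejected: const char* does not convert to char*
    return copy[0] == ' ' && literal[0] == 'T';
}
```

On a compiler that still treats literals as char*, the commented-out call would compile and could silently change the stored constant, or fault if the constant lives in a read-only segment.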

>> 4. I have no idea what/where [0x42a698] refers to.
> It's the address following FBaseString<wchar_t>::blank + 0x30!

> So there we have our static storage of two unsigned integer zeros.

Probably. The compiler simply did not create a debugger symbol for its
internal constant. And the debugger uses the next best symbol with an
offset. The linker is not involved in this. The constant is simply the
first item in the data segment of your compilation unit (.obj), and the
debugger uses the last symbol from the previous compilation unit,
unaware of compilation units.

Activate the assembler output of the compiler and you will see a
reasonable reference.

> determined. But address 0x42a694 lies right in the middle of a buffer
> of FBaseString and is overwritten near the end of the program loop
> that this code is within.

Maybe you called free on memory that was never allocated, and the memory
manager then reused that memory.

Normally I would recommend to run your program with a memory analyzer
like valgrind. But with that old platform you may not have any option
like this.

> Copying the *same* local variable to the stack for return. They may
> use different registers to do it, but it's an exit point, negating any
> later requirements on the register values.

- Turn on optimizations. For debugging purposes the compiler can neither
share nor interleave code between different source lines.
- Use a recent compiler. Many things happened in between.

> The last one even has a convenient label. So instead of repeating
> the 17-byte sequence each time, they should have dumped every mov
> above and changed the last to jmp @8!

This is the common subexpression optimization. It has to be turned on
and the code has to fit into the analysis window.

> area. The memory allocated for the string starts at 0x42a664. I
> don't have an idea off the top of my head how to determine if it's a
> bug or corrupt memory data.

I would guess some of your code has undefined behavior and the string
buffer never should point to that area. I have no idea what FBaseString
is and whether it allows to assign a buffer from outside. This might
explain everything.
Maybe you set some string class instance to blank, retrieved its address
as C compatible char* and then used this as strcpy target. Bad idea!
I'm just guessing, of course.

Again, turn on the option to place constants in the TEXT segment and you
will get a CPU exception when an instruction tries to write to the
constant. This won't prevent you from doing other weird things with
pointers, but it will catch at least a few cases.

As a rule of thumb: do not use the type char* anywhere in your C++ code.
Use your string class or const char* only. This will save you from
many problems.
I did not cover wchar_t in my post. Everything that applies to
char applies to wchar_t as well.
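To illustrate the rule of thumb, here is a sketch using std::wstring from the standard library in place of FBaseString, whose interface I do not know. The function make_label is hypothetical. Input arrives as const wchar_t* or as a string object, and the result is built in a string object, so no writable wchar_t* into someone else's storage ever appears.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds "<drive>: <name>" without any raw writable buffer.
std::wstring make_label(const wchar_t* drive, const std::wstring& name)
{
    assert(drive != nullptr);     // precondition: a valid, read-only C string
    std::wstring result(drive);   // copies the characters into owned storage
    result += L": ";
    result += name;
    return result;
}
```

Because the string object owns and grows its own storage, there is no way for this code to write past a fixed buffer or into a shared constant such as a static blank string.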

Marcel



Tuesday, February 10, 2015

Digest for comp.lang.c++@googlegroups.com - 9 updates in 4 topics
