soft and program: Digest for comp.lang.c++@googlegroups.com

Union type punning in C++ redux - 3 Updates
Alignment hack... - 2 Updates
"Doing UTF-8 in Windows" by Mircea Neacsu - 12 Updates
transactional memory idea - 2 Updates

Pavel <pauldontspamtolk@removeyourself.dontspam.yahoo>: Feb 21 02:03AM -0500

Öö Tiib wrote:
> | —(4.1) they are the same object, or
> | —(4.2) one is a standard-layout union object and the other is a non-static data member of that object, or
> | —(4.3) one is a standard-layout class object and the other is the first non-static data member of that object, or, if the object has no non-static data members, the first
BTW. 2020 draft n4849 changes "first" to "any" here
> | —(4.4) there exists an object c such that a and c are pointer-interconvertible, and c and b are pointer-interconvertible.
> |
> | If two objects are pointer-interconvertible, then they have the same address, and it is possible to obtain a pointer to one from a pointer to the other via a reinterpret_cast
Thanks, I missed this language. I think the only non-trivial addition to my
interpretation of the standard it brings is 4.3 (the guarantees given by 4.2 can
be taken advantage of with unions and no reinterpret_cast and the other 2 are
trivial); but I don't think it helps the legality of the latest OP's code
because it is not obvious to me which of these 4 rules can be applied to
conclude that &V::data_ and the results of its reinterpret-casting to const A*
or const B* are pointer-interconvertible.
>> the object of its original type. The above code relies on de-referenced result
>> of reinterpret_cast so I think its behavior is unspecified.

> I am quite certain that the reinterpret_cast is not so useless as you put it.
Well, most of the legal and useful uses I can remember were casting from void* to T*
and back; but your citation (specifically 4.3) does add one more potentially
useful legal use I do not remember using (but now that you mentioned it I think
I recall seeing it somewhere), namely the casts between a pointer to a
standard-layout class object and the pointer to its first member.

(It also says that for an "empty" standard-layout class object we can
reinterpret_cast between that object and any of its base objects (obviously,
also empty) -- but that is not very useful as it is both shorter and IMHO more
readable to just use static_cast for that).
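
A minimal sketch of the 4.3 case discussed above, assuming two made-up types (Header and Packet are hypothetical names, not from the thread). It only illustrates the cast between a standard-layout class object and its first non-static data member; it says nothing about the legality of the OP's code.

```cpp
#include <type_traits>

// Hypothetical types, used only for illustration.
struct Header { int tag; };
struct Packet { Header header; double payload; };

// Rule 4.3 applies only to standard-layout classes.
static_assert(std::is_standard_layout<Packet>::value,
              "Packet must be standard-layout for rule 4.3 to apply");

// Pointer to the class object -> pointer to its first non-static data member.
inline Header* first_member_of(Packet* p)
{
    return reinterpret_cast<Header*>(p);
}

// Pointer to the first member -> pointer to the enclosing object.
// Only valid when h really points at the header member of a live Packet.
inline Packet* packet_from_header(Header* h)
{
    return reinterpret_cast<Packet*>(h);
}
```

Given a `Packet p`, `first_member_of(&p)` compares equal to `&p.header`, and `packet_from_header(&p.header)` compares equal to `&p`.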

Daniel <danielaparker@gmail.com>: Feb 21 06:05AM -0800

On Friday, February 21, 2020 at 2:03:16 AM UTC-5, Pavel wrote:
> mentioned it I think I am recalling seeing it somewhere), namely the casts
> between a pointer to a standard-layout class object and the pointer to its
> first member.

Hardly normative, but by my count, boost 1_71 has 4414 occurrences of
reinterpret_cast in 268 header files, including many uses with aligned
storage.

"Öö Tiib" <ootiib@hot.ee>: Feb 21 10:26AM -0800

On Friday, 21 February 2020 09:03:16 UTC+2, Pavel wrote:
> because it is not obvious to me which of these 4 rules can be applied to
> conclude that &V::data_ and the results of its reinterpret-casting to const A*
> or const B* are pointer-interconvertible.

You totally misinterpreted the extent of my example. It was only given
to show that "I could not find in the Standard any additional guarantee
about reinterpret_cast results conditioned on the involved types' being
having standard-layout" does not matter since I could. Search a bit more.

> reinterpret_cast between that object and any of its base objects (obviously,
> also empty) -- but that is not very useful as it is both shorter and IMHO more
> readable to just use static_cast for that).

Wrong place, red herring. That is what is fun about the standard: there are
additional guarantees about bytes (IOW chars) and also about
std::aligned_storage. ;)
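
A minimal sketch of the two byte-level guarantees alluded to above, assuming a made-up Point type and a hypothetical PointSlot holder. An alignas-qualified unsigned char array stands in for std::aligned_storage (which, for what it is worth, was later deprecated in C++23): such an array may provide storage for an object created with placement new, and any object's representation may be inspected through unsigned char*.

```cpp
#include <cstddef>
#include <new>

// Hypothetical type, used only for illustration.
struct Point { int x; int y; };

// Hypothetical holder: raw, suitably aligned bytes that can host one Point.
struct PointSlot
{
    alignas(Point) unsigned char bytes[sizeof(Point)];

    // Creates a Point inside the byte array. Use the pointer returned by
    // placement new; no reinterpret_cast of `bytes` is needed.
    Point* construct(int x, int y)
    {
        return ::new (static_cast<void*>(bytes)) Point{x, y};
    }
};

// Sums the bytes of the object representation. Reading an object through
// unsigned char* is explicitly permitted.
inline unsigned byte_sum(const Point& p)
{
    const unsigned char* raw = reinterpret_cast<const unsigned char*>(&p);
    unsigned sum = 0;
    for (std::size_t i = 0; i < sizeof(Point); ++i) { sum += raw[i]; }
    return sum;
}
```

Point is trivially destructible, so the sketch skips the explicit destructor call that a non-trivial type would need before the storage is reused.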

Alignment hack...

"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>: Feb 21 01:38AM -0800

Fwiw, this is some old C code I just cobbled up to work with C++; used
it as a region allocator in the past:

https://groups.google.com/forum/#!original/comp.lang.c/7oaJFWKVCTw/sSWYU9BUS_QJ

Well, "work" with C++ or even C is very loose here. It's a total hack to
force align objects on large boundaries. This is very useful wrt
designing different exotic algorithms. However, I think it's forever
doomed wrt UB. I am not sure how to ever make it work in a 100% portable
way. When I say a large boundary, I mean, say, 2048 bytes or much
bigger. Well, here is some code, can you even get it to run without
tripping an assert or getting a throw?
______________________
#include <iostream>
#include <new>
#include <cassert>
#include <cstdlib>
#include <cstddef>
#include <cstdint>

// Doctor Hackinstein!
// Rounds mp_ptr up to the next multiple of mp_align (a power of two).
#define CT_RALLOC_ALIGN_UP(mp_ptr, mp_align) \
    ((unsigned char*)( \
        (((std::uintptr_t)(mp_ptr)) + ((mp_align) - 1)) \
        & ~(((mp_align) - 1)) \
    ))

#define CT_RALLOC_ALIGN_ASSERT(mp_ptr, mp_align) \
    (((unsigned char*)(mp_ptr)) == CT_RALLOC_ALIGN_UP(mp_ptr, mp_align))

// Hackish indeed!
template<std::size_t T_size>
struct ct_local_mem
{
    unsigned char m_bytes[T_size];

    template<typename T>
    unsigned char* align_mem()
    {
        return align_mem<T>(alignof(T));
    }

    template<typename T>
    unsigned char* align_mem(unsigned long align)
    {
        if (!align) align = alignof(T);

        unsigned char* base = m_bytes;
        unsigned char* aligned = CT_RALLOC_ALIGN_UP(base, align);

        assert(CT_RALLOC_ALIGN_ASSERT(aligned, align));

        std::size_t size = aligned - m_bytes;

        if (size + sizeof(T) + align > T_size)
        {
            // A bare `throw;` with no active exception calls std::terminate,
            // so throw a real exception object instead.
            throw std::bad_alloc();
        }

        return aligned;
    }
};

// A test program...
struct foo
{
    int m_a;
    int m_b;

    foo(int a, int b) : m_a(a), m_b(b)
    {
        std::cout << this << "->foo::foo.m_a = " << m_a << "\n";
        std::cout << this << "->foo::foo.m_b = " << m_b << "\n";
    }

    ~foo()
    {
        std::cout << this << "->foo::~foo.m_a = " << m_a << "\n";
        std::cout << this << "->foo::~foo.m_b = " << m_b << "\n";
    }
};

int main()
{
    {
        // create some memory on the stack
        ct_local_mem<4096> local = { '\0' };

        // create a foo f
        std::cout << "Naturally aligned...\n";
        foo* f = new (local.align_mem<foo>(alignof(foo))) foo(1, 2);

        // destroy f
        f->~foo();

        // create a foo f aligned on a large byte boundary
        std::size_t alignment = 2048;
        std::cout << "\n\nForced aligned on a " << alignment << " byte boundary...\n";

        // ensure the alignment of foo is okay with the boundary
        assert((alignment % alignof(foo)) == 0);

        f = new (local.align_mem<foo>(alignment)) foo(3, 4);

        assert(CT_RALLOC_ALIGN_ASSERT(f, alignment));

        // destroy f
        f->~foo();
    }

    {
        std::cout << "\n\nFin\n";
        std::cout.flush();
        std::cin.get();
    }

    return 0;
}
______________________

Here is some output, notice the addresses on the large boundary:
______________________
Naturally aligned...

0x7fffb1f56ba0->foo::foo.m_a = 1

0x7fffb1f56ba0->foo::foo.m_b = 2

0x7fffb1f56ba0->foo::~foo.m_a = 1

0x7fffb1f56ba0->foo::~foo.m_b = 2

Forced aligned on a 2048 byte boundary...

0x7fffb1f57000->foo::foo.m_a = 3

0x7fffb1f57000->foo::foo.m_b = 4

0x7fffb1f57000->foo::~foo.m_a = 3

0x7fffb1f57000->foo::~foo.m_b = 4
______________________

Notice how the latter pointer values have zeros at the end? There are
many fun things we can do here, but I am afraid it's all UB. ;^o

"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Feb 21 12:50PM +0100

On 21.02.2020 10:38, Chris M. Thomasson wrote:
> ______________________

> Notice how the latter pointer values have zeros at the end? There are
> many fun things we can do here, but I am afraid it all UB. ;^o

Not sure, I just cooked this up, but I believe the following is
standard-compliant and does what you want:

#include <assert.h> // assert
#include <limits.h> // CHAR_BIT

#include <bitset> // std::bitset
#include <exception> // std::terminate
#include <iostream> // std::(cin, cout)
#include <memory> // std::align
#include <stddef.h> // size_t
#include <new> // std::bad_alloc
using std::align, std::bad_alloc, std::terminate, std::bitset,
std::cout, std::cin;

using Byte = unsigned char;
const int bits_per_byte = CHAR_BIT;
template< class T > constexpr int bits_per_ = sizeof( T )*bits_per_byte;
template< class T > using Type_ = T;

struct Buffer_view{ void* p_start; size_t size; };

template< class Int >
auto pop_count( const Int value )
    -> int
{ return static_cast<int>( bitset<bits_per_<Int>>( value ).count() ); }

auto operator new( const size_t size, Buffer_view& buffer, const size_t alignment )
    -> void*
{
    assert( pop_count( alignment ) == 1 );
    if( auto p = align( alignment, size, buffer.p_start, buffer.size ) ) {
        return p;
    }
    throw bad_alloc();
}

auto operator new( const size_t size, Buffer_view&& buffer, const size_t alignment )
    -> void*
{ return operator new( size, buffer, alignment ); }

// Called if constructor throws.
void operator delete( const Type_<void*>, Buffer_view&, const size_t )
{
    terminate();    // Clean-up can be supported by more info in Buffer_view.
}

void operator delete( const Type_<void*> p, Buffer_view&& b, const size_t a )
{
    operator delete( p, b, a );
}

// A test program...
struct foo
{
    int m_a;
    int m_b;

    foo( const int a, const int b ):
        m_a( a ), m_b( b )
    {
        cout << this << "->foo::foo.m_a = " << m_a << "\n";
        cout << this << "->foo::foo.m_b = " << m_b << "\n";
    }

    ~foo()
    {
        cout << this << "->foo::~foo.m_a = " << m_a << "\n";
        cout << this << "->foo::~foo.m_b = " << m_b << "\n";
    }
};

void test()
{
    // create some memory on the stack
    Byte local[4096] = {};
    const auto buffer_view = [&]{ return Buffer_view{ &local, sizeof( local ) }; };

    // create a foo f
    cout << "Naturally aligned...\n";
    foo* f = new( buffer_view(), alignof( foo ) ) foo( 1, 2 );
    f->~foo();

    // create a foo f aligned on a large byte boundary
    size_t alignment = 2048;
    cout << "\n\nForced aligned on a " << alignment << " byte boundary...\n";

    // ensure the alignment of foo is okay with the boundary
    assert( alignment % alignof( foo ) == 0 );

    foo* f2 = new( buffer_view(), alignment ) foo( 3, 4 );
    f2->~foo();
}

auto main()
    -> int
{
    test();
    cout << "\n\nFin\n";
    cin.get();
}

- Alf
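
One detail worth spelling out about std::align: it moves the pointer up to the requested boundary and shrinks the remaining space by the padding, but it does not reserve the block itself. A minimal sketch of a bump-style arena that steps past each block, assuming a hypothetical Arena type that is not part of either post:

```cpp
#include <cstddef>
#include <memory>   // std::align
#include <new>      // std::bad_alloc

// Hypothetical bump arena over caller-provided bytes (illustration only).
struct Arena
{
    void*       cursor;     // next free byte
    std::size_t remaining;  // bytes left from cursor to the end of the buffer

    // alignment must be a power of two, as std::align requires.
    void* allocate(std::size_t size, std::size_t alignment)
    {
        // On success std::align advances cursor to the boundary and reduces
        // remaining by the padding; on failure it returns nullptr and
        // leaves both unchanged.
        void* p = std::align(alignment, size, cursor, remaining);
        if (!p) { throw std::bad_alloc(); }

        // Step past the block so the next allocation does not overlap it.
        cursor = static_cast<unsigned char*>(p) + size;
        remaining -= size;
        return p;
    }
};
```

With `unsigned char buf[4096]; Arena a{ buf, sizeof buf };`, successive calls such as `a.allocate(sizeof(foo), 2048)` return distinct, non-overlapping blocks until the buffer runs out.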

"Doing UTF-8 in Windows" by Mircea Neacsu

Lynn McGuire <lynnmcguire5@gmail.com>: Feb 20 08:38PM -0600

On 2/20/2020 2:48 PM, David Brown wrote:
> (some were pre-Unix), and many will have moved to Linux. The middle
> group will mostly be from a Windows-dominated age, while newer
> programmers are on Linux again.

I started writing Fortran IV code on a Univac 1108 in 1975.

Lynn

"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Feb 21 03:40AM +0100

On 19.02.2020 22:35, Lynn McGuire wrote:
> whole document to get indoctrinated ☺."
> http://utf8everywhere.org/

> We are finally moving our software to UTF-8. It is horrendous so far.

Uhm, half a year after Windows finally got support for UTF-8 as process
ANSI codepage, you start rewriting your code base to replace calls of
the ordinary ASCII based functions with ditto UTF-8 ones.

That's sort of counter-productive, in-the-wrong-place-at-the-wrong-time.

- Alf

Cholo Lennon <chololennon@hotmail.com>: Feb 21 12:35AM -0300

On 2/20/20 4:45 PM, Bart wrote:
> where one would buy an actual Linux desktop PC. (Tablets and phones are
> a completely different market, not really suitable for the kind of
> software we wrote.)

There are many places, here is one of them:
https://slimbook.es/en/

> Even if Linux PCs were everywhere - each one is running a different
> version of Linux. Each could have a different processor. How do you even
> distribute binaries on such systems? That is also a different world.

Well, most systems (especially desktops) are x86/x86_64; for those you
have many alternatives like static compilation, Docker, Flatpak,
AppImage, Snap, etc.

I worked several years in the telecom industry (before docker started to
simplify our lives, and Linux took the crown against Windows/Solaris on
that domain). Our (very) complex server side applications ran on Windows
(32/64 bits), Linux (different versions of RHEL 32/64 bits) and Solaris
Intel/Sparc. It was a great effort to have (design/build/maintain) a C++
cross-platform code base, and also to deploy those applications (using
home-made deployer scripts; so we distributed our binaries using special
scripts), but it wasn't impossible. Yeah, it wasn't impossible, but
costly, so costly in terms of time and money that I spent my final years
in that industry writing new software in Java (Business decisions) and
maintaining "legacy" C++ code.

--
Cholo Lennon
Bs.As.
ARG

Cholo Lennon <chololennon@hotmail.com>: Feb 21 12:56AM -0300

On 2/19/20 7:26 PM, Lynn McGuire wrote:

>> /Jorgen

> ASCII. Our Windows user interface has 450,000 lines of code in C++. Our
> Calculation Engine has 700,000 lines of F77 and 10,000+ lines of C and C++.

Wow, I used F77 (a Microsoft variant) at university in 1993. It was
ancient at that time, with poor control statements. I have some scientist
friends who still use it for calculations/simulations, but I can't believe
that it is still present in commercial applications :-D

--
Cholo Lennon
Bs.As.
ARG

Lynn McGuire <lynnmcguire5@gmail.com>: Feb 20 10:36PM -0600

On 2/20/2020 9:56 PM, Cholo Lennon wrote:
> Cholo Lennon
> Bs.As.
> ARG

Many commercial applications. Our software was first released in 1969.
Any calculational software of that vintage is Fortran.

Lynn

Lynn McGuire <lynnmcguire5@gmail.com>: Feb 20 10:38PM -0600

On 2/20/2020 8:40 PM, Alf P. Steinbach wrote:
> the ordinary ASCII based functions with ditto UTF-8 ones.

> That's sort of counter-productive, in-the-wrong-place-at-the-wrong-time.

> - Alf

Nope. We are converting our data storage to UTF-8. Our Win32 calls are
being converted to UTF-16. There is no UTF-8 Win32 API.

Lynn
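
A minimal, portable sketch of the boundary conversion described above (UTF-8 in storage, UTF-16 for the wide Win32 calls). The helper name utf8_to_utf16 is hypothetical; on Windows one would more likely call MultiByteToWideChar with CP_UTF8, so treat this only as an illustration of what the conversion involves.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: decodes UTF-8 and re-encodes it as UTF-16.
// Throws std::invalid_argument on malformed input.
std::u16string utf8_to_utf16(const std::string& in)
{
    // Smallest code point that legitimately needs 1, 2, 3 or 4 bytes.
    static const char32_t min_cp[] = { 0x0, 0x80, 0x800, 0x10000 };

    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected

        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else { throw std::invalid_argument("invalid UTF-8 lead byte"); }

        if (extra > in.size() - i - 1) {
            throw std::invalid_argument("truncated UTF-8 sequence");
        }
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80) {
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            }
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;

        // Reject overlong forms, surrogate code points and out-of-range values.
        if (cp < min_cp[extra] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            throw std::invalid_argument("invalid code point in UTF-8 input");
        }

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

The result holds UTF-16 code units; on Windows, where wchar_t is 16 bits wide, those units could be copied into a std::wstring before being handed to the W-suffixed API functions.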

"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Feb 21 06:12AM +0100

On 21.02.2020 05:38, Lynn McGuire wrote:

>> - Alf

> Nope. We are converting our data storage to UTF-8. Our Win32 calls are
> being converted to UTF-16.

Oh that's good. I misunderstood what you're doing, it wasn't all that clear.

> There is no UTF-8 Win32 API.

If you mean your company is not using it, that's one thing.

If you mean there's no such, that's incorrect.

When you set the process codepage to UTF-8, then the ANSI API is a
UTF-8 based API. The most notable function that has no ANSI wrapper, the
command line parsing function, isn't needed because the `main` arguments
are then UTF-8 (with the compilers I tried, VC and g++). And from what
I've seen use of the API for command line parsing is exceedingly rare.

- Alf

Cholo Lennon <chololennon@hotmail.com>: Feb 21 02:17AM -0300

On 2/21/20 1:36 AM, Lynn McGuire wrote:
>> Bs.As.
>> ARG

> Many commercial applications. Our software was first released in 1969.

Wow again. I received support tickets for my own creations after more
than 10 years in production or other tickets after more than 15 years in
production, but 1969! that year is way ahead of my "personal records",
actually I wasn't born :-O

Clearly software will survive us :-P

--
Cholo Lennon
Bs.As.
ARG

Christian Gollwitzer <auriocus@gmx.de>: Feb 21 07:20AM +0100

Am 21.02.20 um 06:12 schrieb Alf P. Steinbach:
> command line parsing function, isn't needed because the `main` arguments
> are then UTF-8 (with the compilers I tried, VC and g++). And from what
> I've seen use of the API for command line parsing is exceedingly rare.

Hold on. Does it mean, that there is a single line like this

#ifdef WINDOWS
magic_set_locale_toUTF8();

Friday, February 21, 2020

Digest for comp.lang.c++@googlegroups.com - 19 updates in 4 topics
