- In the end, reason will come - 4 Updates
- An example: code to list exports of a Windows DLL. - 4 Updates
- [niubbo] convert a string representing a valid date to some "binary" date - 3 Updates
- [niubbo] convert a string representing a valid date to some "binary" date - 1 Update
- Geodetic Development Tool - 1 Update
"Öö Tiib" <ootiib@hot.ee>: Aug 01 07:53PM -0700

On Friday, 2 August 2019 00:26:16 UTC+3, David Brown wrote:
> > tricky to indicate to compiler that it may do such optimizations
> > in conforming mode.
> Such optimisations won't be tricky - they will be completely impossible.

Why? Making the compiler warn that "(x + z) > (y + z)" is maybe sub-optimal code, and having some attribute or pragma to suppress that warning when wrap *was* the reason to write it like that, is not impossible, just tricky.

The other likely outcome is that iterator- or range-based loops will simply be a tiny bit more efficient than loops with an int i as index. That can even be good. In Rust it is so, and all are happy.

> > off wrapping signed integers will be different in any way.
> It will be a completely different thing. And I think that is so obvious
> it doesn't need more explanation.

Ok. :/

> wrapping behaviour, it is no longer a mistake. How is the compiler
> supposed to guess what was intentional behaviour and what was likely to
> be a programmer error?

I don't know of anyone who disagrees that signed integer overflow is likely a bug and worth warning about. If someone deliberately writes wrapping code (for example, to test retroactively whether overflow happened), then they can suppress that warning locally with some pragma or attribute.

> >> correct.
> > That option I would also love like I said above.
> That cannot be an option if wrapping is the defined behaviour.

Note that both g++ and clang have -fwrapv and -ftrapv; it is just that neither of those works reliably. Requiring at least a working -fwrapv would be a bit better than nothing. I could more cheaply (or at least more readably) write my own throwing or saturating integers on top of wrapping integers. I like trapping best, but I understand that it is the most expensive option on the majority of platforms.

> enabling or disabling particular features, such as exceptions or RTTI
> (features which are typically only disabled for niche uses, such as
> resource limited embedded systems).

Yes, you have a good point here that switching between different well-defined behaviours is evil. It is maybe a niche ... but it is a niche that we *own* right now. I have spent about a third of my career (plus sometimes as a hobby) participating in such projects. There are literally tons of electronics made all around, and languages other than C and C++ are only sometimes experimentally tried there. Also, the optimizations often matter mostly on such limited systems anyway.

> that there is /some/ way to get it. If unsigned arithmetic were not
> defined with wrapping in C and C++, there would have to be another way
> to do it for these occasional uses.

Yes, some of the code that avoids that undefined behaviour with signed integers uses unsigned right now (and assumes two's complement). Well-defined behaviour cannot make wrong code give correct answers. The wrong answers will just come more consistently and portably with wrap, so I can trust unit tests for my embedded system's math, run on a PC, a bit more.

> It was not ordered by importance (or what I feel is most important).
> Correctness always trumps efficiency, and aids to correctness are
> therefore high on my list of important features.

I have the same views, especially about arithmetic being fast. For massive amounts of algebra it is better to use libraries that utilize the GPU or the like anyway. I still do not understand how undefined behaviour is supposed to be safer and more reliable than defined behaviour ... but perhaps it is just me. |
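The approach described above — doing the arithmetic in unsigned, where wrap is defined, and testing retroactively whether the signed operation would have overflowed — can be sketched as follows. This is a minimal illustration, not code from the thread; the helper name checked_add is made up, and converting an out-of-range unsigned value back to int is only guaranteed to be modular since C++20 (implementation-defined before, though gcc and clang have long behaved this way):

```cpp
#include <cassert>
#include <climits>

// Adds two ints via unsigned arithmetic (defined to wrap modulo 2^N, never
// UB), then checks retroactively whether the signed addition would have
// overflowed. Stores the sum in *out and returns true only when it did not.
bool checked_add(int a, int b, int* out)
{
    const unsigned ua = static_cast<unsigned>(a);
    const unsigned ub = static_cast<unsigned>(b);
    const unsigned usum = ua + ub;           // wraps, well defined
    const int sum = static_cast<int>(usum);  // modular conversion (C++20)

    // Overflow happened iff both operands have the same sign and the
    // result's sign differs from theirs.
    const bool overflow = ((a >= 0) == (b >= 0)) && ((sum >= 0) != (a >= 0));
    if (!overflow) { *out = sum; }
    return !overflow;
}
```

Note that the check runs *after* the addition — exactly the "retroactively test if overflow did happen" pattern, which is only legal because the wrap occurred in unsigned arithmetic.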
Christian Gollwitzer <auriocus@gmx.de>: Aug 02 07:41AM +0200

On 01.08.19 at 23:25, David Brown wrote:
> It was not ordered by importance (or what I feel is most important).
> Correctness always trumps efficiency, and aids to correctness are
> therefore high on my list of important features.

Then the right thing to do would be mathematical signed integers, which cannot overflow at all (like those in Python, short of memory exhaustion), for the type "int"; wrapping integers for int8_t, uint8_t etc.; and no general unsigned int, unless it throws an exception on negative numbers. The only reason no one wants to suggest this is performance.

Christian |
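For the fixed-width unsigned types mentioned here, wrapping is already the defined behaviour in C and C++ — a tiny illustration (the helper name wrapping_inc is made up for this example):

```cpp
#include <cassert>
#include <cstdint>

// Unsigned arithmetic is defined to wrap modulo 2^N, so uint8_t behaves
// exactly as described: 255 + 1 yields 0, with no undefined behaviour.
std::uint8_t wrapping_inc(std::uint8_t x)
{
    // x + 1 is computed in int due to integer promotion; the wrap happens
    // on the conversion back to uint8_t, which is well defined.
    return static_cast<std::uint8_t>(x + 1);
}
```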
Bonita Montero <Bonita.Montero@gmail.com>: Aug 02 08:10AM +0200

> Specifying wrapping behavior for signed integers could be a *big*
> problem. In particular, this:
> int n = INT_MAX + 1;

That's not the least problem, because no one writes such code. |
David Brown <david.brown@hesbynett.no>: Aug 02 08:46PM +0200

On 02/08/2019 04:53, Öö Tiib wrote:
> Why? Making compiler to warn that "(x + z) > (y + z)" is maybe sub-optimal
> code and to have some attribute or pragma for to suppress it when wrap
> *was* reason to write it like that is not impossible just tricky.

It is impossible, because the code would no longer be correct (unless the compiler has other information about the ranges). If one of (x + z) or (y + z) wraps and the other does not, optimising to "x > y" would not be valid. Try this with gcc:

bool foo(int x, int y, int z) {
    return (x + z) > (y + z);
}

#pragma GCC optimize "-fwrapv"

bool foo2(int x, int y, int z) {
    return (x + z) > (y + z);
}

(from the usual <https://godbolt.org>):

foo:
        cmp     edi, esi
        setg    al
        ret
foo2:
        add     edi, edx
        add     edx, esi
        cmp     edi, edx
        setg    al
        ret

When signed integer overflow has two's complement wrapping, you can't simplify code using as many normal mathematical integer identities. You can still do some re-arrangements and simplifications - more than if overflow is defined as trapping - but you lose some cases, especially those involving relations and inequalities.

> Other likely outcome is that iterator or range based loops will simply
> be tiny bit more efficient than that int i as index.
> That can be even good. In Rust it is so and all are happy.

When the compiler has more information about the possible ranges of the numbers involved, it can do more optimisation. That is the case for iterators. But it is not a good thing that some types of code become less efficient - the fact that other types of code might not be affected does not suddenly make it good. A change that makes current, correct, working code slower is never a good thing by itself. It is only worth having if there are significant advantages. Turning broken code with arbitrary bad behaviour into broken code with predictable bad behaviour is not particularly useful.

Let people who think that wrapping integers are somehow good or safe use other languages. There is no need to weaken C++ with the same mistake.

> deliberately wrapping (for example retroactively to test if overflow
> did happen) then they can suppress that warning locally with some
> pragma or attribute.

So you think this feature - wrapping overflows - is so useful and important that it should be added to the language and forced upon all compilers, and yet it also is so unlikely to be correct code that compilers should warn about it whenever possible and require specific settings to disable the warning? Isn't that a little inconsistent?

I'd be much happier to see some standardised pragmas like:

#pragma STDC_OVERFLOW_WRAP
#pragma STDC_OVERFLOW_TRAP
#pragma STDC_OVERFLOW_UNDEFINED

(or whatever variant is preferred) where undefined behaviour is the standard, but people can choose specific defined behaviour if they want, using a standardised method. Basically, allow #pragma GCC optimize "-fwrapv" in a common form. Surely that would be enough for those that want wrapping behaviour, without bothering anyone else?

>> That cannot be an option if wrapping is the defined behaviour.
> Note that both g++ and clang have -fwrapv and -ftrapv just neither
> of those work reliably.

"-fwrapv" does exactly what it says on the tin, as far as I know. Do you know of any problems with it, or any way in which it does not work reliably? Neither gcc nor clang is bug-free, of course, and given that this is an option that is used rarely, it will not receive as much heavy testing as other aspects of the tools. But the intention is that this is a working and maintained option that gives you specific new semantics in the compiler.

"-ftrapv" has always been a bit limited, and somewhat poorly defined. It is not clear how much rearrangement is done before the trapping operations are used, and support varies from target to target. AFAIUI, you are recommended to use -fsanitize=signed-integer-overflow instead.

> Requiring at least working -fwrapv would be bit better
> than nothing.

Again, what do you think does not work with -fwrapv?

> I could bit cheaper (or at least bit more readably) write
> my own throwing or saturating integers using wrapping integers.

For gcc and clang, you are better off using the overflow arithmetic builtin functions.

> I like trapping best but I understand that it is most expensive on
> majority of platforms.

It is expensive on /all/ platforms, because it disables a good many re-arrangements and simplifications of expressions. Beyond that, it usually boils down to adding "trap if overflow" or "branch if overflow" instructions after arithmetic operations - and the cost of that does vary between targets. But it is probably cheaper than saturating arithmetic in many cases, which also disables many re-arrangements and requires extra code for targets that don't have saturating arithmetic instructions.

> around and other languages but C and C++ are only sometimes
> experimentally tried there. Also the optimizations often matter
> only on such limited systems somewhat.

My career has been dominated by programming on platforms which are small enough that you typically disable exceptions and RTTI when using C++. (Things might change in the future with the newer ideas for cheaper C++ exceptions.) And yes, it is a very important niche, especially for C.

>> to do it for these occasional uses.
> Yes, some of the code to avoid that undefined behavior with signed
> integers uses unsigned right now (and assumes twos complement).

Indeed - and that is often fine, though perhaps of limited portability. (Making two's complement representation a requirement will fix this.)

> correct answers. The wrong answers will just come more consistently
> and portably with wrap and so I can trust unit tests for my
> math of embedded system ran on PC bit more.

If your unit tests rely on wrapping for overflow, then those unit tests are broken. "More consistent wrong answers" is /not/ a phrase you want to hear regarding test code! You want your embedded code to run quickly and efficiently - but there is no reason not to have -fsanitize=signed-integer-overflow for your PC-based unit tests and simulations.

> utilize GPU or the like anyway. I still do not understand how
> undefined behavior is supposed to be more safe and reliable than
> defined behavior ... but perhaps it is just me.

Undefined behaviour is something that your tools know is wrong. That means that there is at least a chance that the tools can spot mistakes. They can't do it all the time, and sometimes there are significant costs in run-time tests to find mistakes, but it is possible. Look at the sanitize options at <https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html> (or the clang page if you prefer) to see some of the mistakes that can be caught at compile time. They can only be caught because they are mistakes - they have no defined behaviour. If something is given defined behaviour by the language, then the compiler must assume that when it occurs, it is intentional. At best, any compile-time or run-time checking for this must be an optional feature with a high risk of false positives.

Additionally, adding this new semantics to the language would likely confuse people and make them think it was always the case. There are far too many programmers already who think signed arithmetic wraps in C and C++ - it would be worse if they see it documented for some standards and not others.

If there were enough benefit from the additional behaviour, that would be fair enough. But there isn't any benefit of significance - correct code remains correct after this change, and broken code remains broken. |
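The overflow builtins recommended above (available in both gcc and clang) report whether the mathematically correct result fitted, without ever executing undefined behaviour. As a sketch, saturating addition can be built directly on top of them — the helper name sat_add is made up for this example, and the code requires gcc or clang:

```cpp
#include <cassert>
#include <climits>

// Saturating signed addition built on __builtin_add_overflow, which
// computes the sum with wrapping semantics and returns true when the
// mathematically correct result did not fit in an int.
int sat_add(int a, int b)
{
    int result;
    if (!__builtin_add_overflow(a, b, &result)) {
        return result;                      // no overflow: exact sum
    }
    // Overflow implies a and b have the same sign, so the sign of a
    // tells us which way to clamp.
    return (a > 0) ? INT_MAX : INT_MIN;
}
```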
"Alf P. Steinbach" <alf.p.steinbach+usenet@gmail.com>: Aug 02 03:36AM +0200

This just-now-it-compiled-and-appeared-to-work code is for whoever needs or desires a tool to list DLL exports (e.g. because of the large amount of irrelevant output from MS `dumpbin`, or because one doesn't want to install Visual Studio but instead just use MinGW g++), or for those interested in the internal structure of a Windows PE format executable. For the latter, if it's still available on the web, one could do worse than reading Matt Pietrek's two- or three-part article series.

I once posted an exports listing program like this on Stack Overflow, back then only for 32-bit DLLs, and as I recall, by loading the DLL and using the pointers in the image directly as pointers. This code instead reads the file.

As I recall from the SO posting, the function names are stored internally with UTF-8 encoding, but the documentation says ASCII; yet, on the third hand, the documentation of encodings used in Windows is known to be wildly inaccurate & misleading. I've yet to test that. But apparently the documentation of `GetProcAddress` (the Windows way of getting a pointer to a specified function in a loaded DLL) now says ASCII, instead of previously Windows ANSI.

The $ things in the following code are macros from the specified header-only library on GitHub. E.g. `$use_std` expands to a `using` declaration that prepends `std::` to each specified name. If use of the library is undesired then just manually expand the macro invocations. The "winapi-header-wrappers" micro-library is not on GitHub or anywhere, but they're just simple wrappers, mostly just ensuring that `<windows.h>` is included, because MS headers are not self-sufficient.

------------------------------------------------------------------------
#include <cppx-core/all.hpp>        // <url: https://github.com/alf-p-steinbach/cppx-core>
#include <winapi-header-wrappers/windows-h.hpp>     // Just a <windows.h> wrapper with UNICODE defined.
#include <winapi-header-wrappers/shellapi-h.hpp>    // CommandLineToArgvW

namespace win_util {
    $use_std( exchange );
    $use_cppx( Wide_c_str, Mutable_wide_c_str, hopefully, P_ );

    class Command_line_args
    {
        P_<Mutable_wide_c_str>  m_parts;
        int                     m_n_parts;

        Command_line_args( const Command_line_args& ) = delete;
        auto operator=( const Command_line_args& ) -> Command_line_args& = delete;

    public:
        auto count() const -> int { return m_n_parts - 1; }
        auto operator[]( const int i ) const -> Wide_c_str { return m_parts[i + 1]; }
        auto invocation() const -> Wide_c_str { return m_parts[0]; }

        Command_line_args():
            m_parts( CommandLineToArgvW( GetCommandLine(), &m_n_parts ) )
        {
            hopefully( m_parts != nullptr ) or $fail( "CommandLineToArgvW failed" );
        }

        Command_line_args( Command_line_args&& other ):
            m_parts( exchange( other.m_parts, nullptr ) ),
            m_n_parts( exchange( other.m_n_parts, 0 ) )
        {}

        ~Command_line_args()
        {
            if( m_parts != nullptr ) { LocalFree( m_parts ); }
        }
    };
}  // namespace win_util

namespace app {
    $use_std( cout, clog, endl, invoke, runtime_error, string, vector );
    $use_cppx(
        hopefully, fail_, Is_zero, Byte, fs_util::C_file, Size, Index, C_str,
        fs_util::read, fs_util::read_, fs_util::read_sequence, fs_util::read_sequence_,
        fs_util::peek_, is_in, P_, to_hex, up_to
        );
    namespace fs = std::filesystem;
    using namespace cppx::basic_string_building;    // operator<<, operator""s

    // A class to serve simple failure messages to the user, via exceptions. These
    // exceptions are thrown without origin info, and are presented as just strings.
    // Don't do this in any commercial code.
    class Ui_exception:
        public runtime_error
    {
        using runtime_error::runtime_error;
    };

    using Uix = Ui_exception;

    struct Pe32_types
    {
        using Optional_header = IMAGE_OPTIONAL_HEADER32;
        static constexpr int address_width = 32;
    };

    struct Pe64_types
    {
        using Optional_header = IMAGE_OPTIONAL_HEADER64;
        static constexpr int address_width = 64;
    };

    template< class Type >
    auto from_bytes_( const P_<const Byte> p_first )
        -> Type
    {
        Type result;
        memcpy( &result, p_first, sizeof( Type ) );
        return result;
    }

    template< class Type >
    auto sequence_from_bytes_( const P_<const Byte> p_first, const Size n )
        -> vector<Type>
    {
        vector<Type> result;
        if( n <= 0 ) { return result; }
        result.reserve( n );
        for( const Index i: up_to( n ) ) {
            result.push_back( from_bytes_<Type>( p_first + i*sizeof( Type ) ) );
        }
        return result;
    }

    // When this function is called the file position is at start of the optional header.
    template< class Pe_types >
    void list_exports(
        const string&               u8_path,
        const C_file&               f,
        const IMAGE_FILE_HEADER&    pe_header
        )
    {
        cout << Pe_types::address_width << "-bit DLL." << endl;

        using Optional_header = typename Pe_types::Optional_header;
        const auto pe_header_opt = read_<Optional_header>( f );
        hopefully( IMAGE_DIRECTORY_ENTRY_EXPORT < pe_header_opt.NumberOfRvaAndSizes )
            or fail_<Uix>( ""s << "No exports found in '" << u8_path << "'." );

        const auto section_headers = invoke( [&]() -> vector<IMAGE_SECTION_HEADER>
        {
            vector<IMAGE_SECTION_HEADER> headers;
            for( int _: up_to( pe_header.NumberOfSections ) ) {
                (void) _;
                headers.push_back( read_<IMAGE_SECTION_HEADER>( f ) );
            }
            return headers;
        } );

        const auto& dir_info = pe_header_opt.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT];
        hopefully( dir_info.Size >= sizeof( IMAGE_EXPORT_DIRECTORY ) )
            or fail_<Uix>(
                "Ungood file: claimed size of export dir header is too small."
                );

        const IMAGE_SECTION_HEADER& section = invoke( [&]()
        {
            const auto dir_addr = dir_info.VirtualAddress;
            const auto beyond_dir_addr = dir_addr + dir_info.Size;
            for( const auto& s: section_headers ) {
                const auto s_addr = s.VirtualAddress;
                const auto beyond_s_addr = s_addr + s.SizeOfRawData;
                if( s_addr <= dir_addr and beyond_dir_addr <= beyond_s_addr ) {
                    return s;
                }
            }
            fail_<Uix>( "Ungood file: no section (fully) contains the export table." );
        } );

        hopefully( section.SizeOfRawData > 0 )
            or fail_<Uix>( "Ungood file: section with export table, is of length zero." );

        const auto addr_to_pos = section.PointerToRawData - section.VirtualAddress;
        const auto dir_position = dir_info.VirtualAddress + addr_to_pos;
        fseek( f, dir_position, SEEK_SET ) >> Is_zero()
            or fail_<Uix>( "Ungood file: a seek to the exports table section failed." );
        const auto dir = read_<IMAGE_EXPORT_DIRECTORY>( f );

        if( dir.NumberOfFunctions == 0 ) {
            cout << "No functions are exported";
        } else if( dir.NumberOfFunctions == 1 ) {
            cout << "1 function is exported, at ordinal 0";
        } else if( dir.NumberOfFunctions > 1 ) {
            cout << dir.NumberOfFunctions << " functions are exported"
                << ", at ordinals 0 ... " << dir.NumberOfFunctions - 1;
        }
        cout << "." << endl;
        if( dir.NumberOfFunctions == 0 ) { return; }

        fseek( f, dir.AddressOfNames + addr_to_pos, SEEK_SET ) >> Is_zero()
            or fail_<Uix>( "Ungood file: a seek to the name addresses table failed." );
        const vector<DWORD> name_positions = read_sequence_<DWORD>( f, dir.NumberOfNames );

        vector<string> names;
        names.reserve( name_positions.size() );
        for( const DWORD name_addr: name_positions ) {
            string name;
            int ch;
            fseek( f, name_addr + addr_to_pos, SEEK_SET ) >> Is_zero()
                or fail_<Uix>( "Ungood file: a seek to an export name failed." );
            while( (ch = fgetc( f )) != EOF and ch != 0 ) {
                name += char( ch );
            }
            names.push_back( name );
        }

        fseek( f, dir.AddressOfNameOrdinals + addr_to_pos, SEEK_SET ) >> Is_zero()
            or fail_<Uix>(
                "Ungood file: a seek to the ordinals table failed."
                );
        const vector<WORD> ordinals = read_sequence_<WORD>( f, dir.NumberOfNames );

        cout << string( 72, '-' ) << endl;
        for( const int i: up_to( dir.NumberOfNames ) ) {
            cout << names[i] << " @" << ordinals[i] << endl;
        }
    }

    void run()
    {
        const auto args = win_util::Command_line_args();
        hopefully( args.count() == 1 )
            or fail_<Uix>( "Specify one argument: the DLL filename or path." );

        const fs::path dll_path = args[0];
        const string u8_path = cppx::fs_util::utf8_from( dll_path );
        const auto f = C_file( tag::Read(), dll_path );

        const auto dos_header = read_<IMAGE_DOS_HEADER>( f );
        hopefully( dos_header.e_magic == IMAGE_DOS_SIGNATURE )      // 0x5A4D, 'MZ' multichar.
            or fail_<Uix>( ""s << "No MZ magic number at start of '" << u8_path << "'." );

        fseek( f, dos_header.e_lfanew, SEEK_SET ) >> Is_zero()
            or fail_<Uix>( "fseek to PE header failed" );
        const auto pe_signature = read_<DWORD>( f );
        hopefully( pe_signature == IMAGE_NT_SIGNATURE )             // 0x4550, 'PE' multichar.
            or fail_<Uix>( ""s << "No PE magic number in PE header of '" << u8_path << "'." );

        const auto pe_header = read_<IMAGE_FILE_HEADER>( f );
        const auto image_kind_spec = peek_<WORD>( f );
        switch( image_kind_spec ) {
            case IMAGE_NT_OPTIONAL_HDR32_MAGIC: {       // 0x10B
                list_exports<Pe32_types>( u8_path, f, pe_header );
                break;
            }
            case IMAGE_NT_OPTIONAL_HDR64_MAGIC: {       // 0x20B
                list_exports<Pe64_types>( u8_path, f, pe_header );
                break;
            }
            default: {      // E.g. 0x107 a.k.a. IMAGE_ROM_OPTIONAL_HDR_MAGIC
                fail_<Uix>( "Not a PE32 (32-bit) or PE32+ (64-bit) file." );
            }
        };
    }
}  // namespace app

auto main() -> int
{
    $use_std( exception, cerr, endl, clog, ios_base );
    $use_cppx( monospaced_bullet_block, description_lines_from );

    #ifdef NDEBUG
    clog.setstate( ios_base::failbit );     // Suppress trace output.
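The up-front validation the program performs — the 'MZ' magic at offset 0, the e_lfanew field at offset 0x3C, then the 'PE\0\0' signature at that position — can also be sketched without the cppx-core or winapi wrapper libraries, parsing from an in-memory byte buffer instead of a FILE*. The helper looks_like_pe below is a hypothetical illustration, not part of the program above; the memcpy reads assume a little-endian host, matching the PE format's own little-endian layout:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Returns true if the buffer starts with a DOS 'MZ' header whose e_lfanew
// field points at a valid 'PE\0\0' signature inside the buffer.
// Offsets are fixed by the PE format: e_magic at 0, e_lfanew at 0x3C.
bool looks_like_pe(const std::vector<std::uint8_t>& image)
{
    if (image.size() < 0x40) { return false; }      // DOS header is 64 bytes

    std::uint16_t mz = 0;
    std::memcpy(&mz, image.data(), sizeof mz);      // little-endian read
    if (mz != 0x5A4D) { return false; }             // 'MZ'

    std::uint32_t e_lfanew = 0;
    std::memcpy(&e_lfanew, image.data() + 0x3C, sizeof e_lfanew);
    // Widen before adding so the bounds check cannot itself wrap.
    if (std::uint64_t{e_lfanew} + 4 > image.size()) { return false; }

    std::uint32_t pe_sig = 0;
    std::memcpy(&pe_sig, image.data() + e_lfanew, sizeof pe_sig);
    return pe_sig == 0x00004550;                    // 'PE\0\0'
}
```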