- How to write wide char string literals? - 4 Updates
- [ OT ] C - Open Standards - 8 Updates
- Trying to understand pointers. Why does this give unexpected results? - 1 Update
| Juha Nieminen <nospam@thanks.invalid>: Jul 08 08:11AM >> from the one you wanted. You essentially get garbage. > You got precisely what you specified - if it's not what you wanted, you > need to change your specification. No, I didn't. I wanted a way to specify wide string literals, and that solution was incorrect. |
| Juha Nieminen <nospam@thanks.invalid>: Jul 08 08:13AM > Add the solution for the readability is to just write the code as native > literals, but NOT as the actual C++ file, and have a filter stage that > translates this file into the actual C++ code with the escapes. Clearly you have never written unit tests. |
| James Kuyper <jameskuyper@alumni.caltech.edu>: Jul 08 05:52AM -0400 On 7/8/21 4:11 AM, Juha Nieminen wrote: >> need to change your specification. > No, I didn't. I wanted a way to specify wide string literals, and that > solution was incorrect. Paavo Helde's solution of using "\xC2\xA9" was correct for narrow string literals (on systems with CHAR_BIT==8, a requirement that he didn't bother mentioning). He was relying upon a UTF-8 => UTF-16 conversion routine of his own creation to get the corresponding wide string. You asked whether L"\xC2\xA9" would work, and the answer is "No", because it specifies two wide characters when only one is desired. You were aware that it wouldn't work, but seemed to be suggesting that there's a potentially faulty UTF-8=>UTF-16 conversion involved in its failure to be correct. There is no such conversion. L"\xC2\xA9" specifies directly a wchar_t array of length 3 initialized with {0xC2, 0xA9, 0}, which is not what you wanted. I initially didn't address that point properly because I hadn't realized that only one character was desired. However, u"\xA9" or U"\xA9" would work fine; L"\xA9" should produce the desired result on systems where wchar_t uses UCS2 or UCS4 (==UTF-32) encoding. |
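The point about L"\xC2\xA9" can be checked with a short sketch. This is only an illustration of the message above, not code from the thread; the helper names (`two_wide_chars`, `copyright_utf16`, `copyright_utf32`, `check_literals`) are made up for the example, and C++11 or later is assumed.

```cpp
#include <cassert>
#include <string>

// Hypothetical helpers, for illustration only.

// Each \x escape in a wide literal produces one wchar_t, so this string
// holds the two elements 0xC2 and 0xA9. No UTF-8 decoding takes place.
inline std::wstring two_wide_chars() { return L"\xC2\xA9"; }

// With the u and U prefixes the encoding is UTF-16 and UTF-32 respectively,
// so a single code unit with value 0xA9 is U+00A9 (the copyright sign).
inline std::u16string copyright_utf16() { return u"\xA9"; }
inline std::u32string copyright_utf32() { return U"\xA9"; }

inline void check_literals() {
    const std::wstring w = two_wide_chars();
    assert(w.size() == 2);                 // two characters, not one
    assert(w[0] == L'\xC2' && w[1] == L'\xA9');

    assert(copyright_utf16().size() == 1); // exactly one UTF-16 code unit
    assert(copyright_utf16()[0] == 0x00A9);
    assert(copyright_utf32().size() == 1); // exactly one UTF-32 code unit
    assert(copyright_utf32()[0] == 0x00A9);
}
```

Calling `check_literals()` from `main` should pass silently. Whether L"\xA9" also denotes U+00A9 depends on the wchar_t encoding of the implementation, as the message notes, so the sketch does not assert anything about that case.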
| James Kuyper <jameskuyper@alumni.caltech.edu>: Jul 08 03:56PM -0400 On 7/3/21 10:28 AM, Alf P. Steinbach wrote: >> source character set would prevent those escapes from working is not. > As far as I know nobody's argued that the source encoding assumption > would prevent any escapes from working. You said "It gets the wrong characters in the wide string literal, period.", and other parts of the discussion implicated source encoding assumptions as the reason why. The use of "period" implies no exceptions, and there's a very large set of exceptions: at least two, as many as four, fully portable working escape sequences for every single Unicode code point. > [<<] > Which it decidedly does. > It's trivial to just try it out and see; QED. I did try it: as he said, it can get the wrong character if the string type isn't Unicode encoded, and as I pointed out, it can also get the wrong character if the wrong escape sequence is used (which seems trivially obvious). But it's perfectly capable of giving the right characters when the right escape sequence is used with a prefix that mandates a Unicode encoding. By saying "... it gets the wrong characters ... period.", you were denying that it's ever possible for it to get the right characters, which is demonstrably false. I've tried out the sequences I specified in the message you quoted above. They all work on my systems, and according to my understanding of the standard, they're required to work on all fully conforming implementations, regardless of source encoding assumptions - if that's not the case, I want to know how the exceptions can be justified. ... > sequences (including universal character designators) are affected by > the source encoding, but to me it has been about whether Juha's example > yields the desired string, as he correctly surmised that it didn't. Yes, but that's because it was the wrong escape sequence, not because there's any inherent problem with using correct escape sequences for that purpose. |
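As a rough illustration of the "at least two, as many as four" remark, the sketch below spells U+00A9 four ways in a u"" literal: hex escape, octal escape, and the short and long universal character names. Reading the remark as referring to these four spellings is an assumption made for this example, and the function names are hypothetical. C++11 or later is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: true if all four spellings give the same string.
inline bool all_spellings_agree() {
    const std::u16string hex   = u"\xA9";        // hexadecimal escape
    const std::u16string octal = u"\251";        // octal escape (0251 == 0xA9)
    const std::u16string ucn4  = u"\u00A9";      // 4-digit universal character name
    const std::u16string ucn8  = u"\U000000A9";  // 8-digit universal character name
    return hex.size() == 1 && hex[0] == 0x00A9 &&
           hex == octal && octal == ucn4 && ucn4 == ucn8;
}

// In a u8 literal the same universal character name is encoded as the
// two UTF-8 bytes 0xC2 0xA9; sizeof also counts the terminating null.
inline std::size_t utf8_length_of_copyright() {
    return sizeof(u8"\u00A9") - 1;
}

inline void check_spellings() {
    assert(all_spellings_agree());
    assert(utf8_length_of_copyright() == 2);
}
```

None of these spellings puts a non-ASCII character in the source file, which is why the source encoding should not affect the result.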
| Real Troll <real.troll@trolls.com>: Jul 08 01:00AM +0100 I have managed to find direct links to the official standard and they are here: <http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1336.pdf> <http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf> <http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf> I am not sure if there are any official standards after n1336.pdf. Perhaps there are or perhaps there aren't unless you pay for them. Let me know if there are any for free use. Microsoft has defined what Open Standard Means: |
| Real Troll <real.troll@trolls.com>: Jul 08 01:20AM +0100 On 08/07/2021 01:00, Real Troll wrote: I have now found the official download link to "ISO/IEC 9899:2018". The link is here: <http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2310.pdf> |
| "Alf P. Steinbach" <alf.p.steinbach@gmail.com>: Jul 08 03:53AM +0200 On 8 Jul 2021 02:00, Real Troll wrote: >> parties and operate on a consensus basis. _*An open standard is >> publicly available*_, and developed, approved and maintained via a >> collaborative and consensus driven process. N1256 (in your list) is the amalgamated C99 + TC1 + TC2 + TC3 document, very nice. I believe N1570 (not in your list) was the last draft of C11. - Alf |
| Real Troll <real.troll@trolls.com>: Jul 08 02:30AM On 08/07/2021 02:53, Alf P. Steinbach wrote: > N1256 (in your list) is the amalgamated C99 + TC1 + TC2 + TC3 > document, very nice. > I believe N1570 (not in your list) was the last draft of C11. OK Thanks for informing about N1570. I have found the official download link so the complete list is as follows: <http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2310.pdf> <http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf> <http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1336.pdf> <http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf> <http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf> Please let us know if anything else is missing from the list. The next standard is 23xx and it won't be approved until 2023 at the latest unless something drastic happens in the interim. |
| David Brown <david.brown@hesbynett.no>: Jul 08 08:46AM +0200 On 08/07/2021 02:00, Real Troll wrote: > I am not sure if there are any official standards after n1336.pdf. > Perhaps there are or perhaps there aren't unless you pay for them. Let > me know if there are any for free use. That's a useful list - thanks. >> parties and operate on a consensus basis. _*An open standard is >> publicly available*_, and developed, approved and maintained via a >> collaborative and consensus driven process. As usual, Microsoft has a somewhat different definition from other people... "Open standard" usually means that the standard is /available/ to anyone who wants it - but not necessarily for free. There are a great many open standards that are only available for a fee, or if you join the relevant group. "Open" in this context means that anyone can get the standards - there are no restrictions by country, company, contract, etc. This also applies to the C and C++ standards, which are published by ISO - anyone can get the standards, but you have to pay for them. What is unusual (but /very/ nice) is that the ISO working groups here publish their drafts at zero cost. |
| Juha Nieminen <nospam@thanks.invalid>: Jul 08 08:21AM > <http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1336.pdf> > <http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf> > <http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf> Note that something being directly available for download, even if it's hosted at the IP owners' own servers, doesn't make it somehow automatically legal to download if the documents are under a commercial license. Making something available without technical barriers is not in itself any sort of implicit free license. (I don't know if those documents are commercial. Merely pointing out that fact.) |
| Philipp Klaus Krause <pkk@spth.de>: Jul 08 10:31AM +0200 On 08.07.21 at 10:21, Juha Nieminen wrote: > any sort of implicit free license. > (I don't know if those documents are commercial. Merely pointing out > that fact.) I understand that the WG14 / ISO copyright situation can be somewhat complicated (and in the past ISO expressed some dislike about the existence of that WG14 website). On the other hand, what you write would hold for any text, website, etc., which is kind of impractical (how, then, do I know that I am allowed to read the message of yours that I'm replying to here?). Anyway, those N documents are not meant to be hidden by WG14. There is a list of them (http://www.open-std.org/jtc1/sc22/wg14/www/wg14_document_log.htm), which is linked from the WG14 website (http://www.open-std.org/jtc1/sc22/wg14/). AFAIK, it is disputed who owns the copyright to the individual N documents there, and it might even differ by jurisdiction (in particular there might be US vs. EU law differences). |
| Bo Persson <bo@bo-persson.se>: Jul 08 12:19PM +0200 On 2021-07-08 at 10:21, Juha Nieminen wrote: > any sort of implicit free license. > (I don't know if those documents are commercial. Merely pointing out > that fact.) I am not a lawyer :-), but these papers are not official ISO documents, so no commercial license. Especially humorous is n1256. ISO official documents are the C99 official standard, plus three separate corrigenda - TC1, TC2, and TC3. ISO never published a "corrected" standard, just these four separate documents. In preparation for the C11 work, the committee then produced a "working draft" with the TCs applied to the C99 standard. You need to have a base document, right? And arguably a lot better than the official one, as the bugs have been removed. However, ISO never published this intermediate version, only the completed C11 standard. |
| Juha Nieminen <nospam@thanks.invalid>: Jul 08 08:17AM > ptrA = &a; > ptrB = &b; > ptrC = &c; I'm genuinely wondering why you are writing it like that, instead of the simpler: int *ptrA = &a; float *ptrB = &b; char *ptrC = &c; > cout << "value of c: " << c << "; address of c: " << ptrC << endl; operator<< is overloaded for char* to print the null-terminated string pointed to by that pointer, not the address. Since ptrC points to a single char with no terminator, that will severely malfunction. You can cast it to void* instead: std::cout << static_cast<void*>(ptrC) << std::endl; |
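The difference between the two stream overloads can be seen in a small sketch. It is not code from the thread; the helper names `text_via_char_pointer` and `address_text` are invented for the example, and a string stream is used in place of std::cout so the results can be compared.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: streams a const char* as-is. The stream treats it as
// a null-terminated string, so p must point to one.
inline std::string text_via_char_pointer(const char* p) {
    std::ostringstream out;
    out << p;
    return out.str();
}

// Hypothetical helper: streams the pointer value itself. Casting to
// const void* selects the overload that prints an address, and never
// reads through the pointer.
inline std::string address_text(const char* p) {
    std::ostringstream out;
    out << static_cast<const void*>(p);
    return out.str();
}

inline void check_pointer_printing() {
    // With a real string, the char* overload prints the characters.
    assert(text_via_char_pointer("hi") == "hi");

    // With a lone char, only the void* form is safe to use.
    char c = 'x';
    std::ostringstream expected;
    expected << static_cast<const void*>(&c);
    assert(address_text(&c) == expected.str());
    assert(!address_text(&c).empty());
}
```

The textual form of the address is implementation-defined, so the sketch only checks that the helper matches a direct void* insertion rather than any particular format.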