- Order of type conversions - 2 Updates
- Return a transient sequence of results similar to LINQ iterator blocks - 5 Updates
- Machine code!!! \o/ - 5 Updates
- Advantage or Not? - 8 Updates
- Big problem with templates - 1 Update
- A "better" C++ - 1 Update
Richard Hartman <rmhartman@gmail.com>: Sep 10 03:37PM -0700 We have a 16-bit signed int, with a value of -1. If it gets cast to a 32 bit unsigned int, does it go:
a) convert to 16 bit unsigned (0xFFFF)
b) convert to 32 bit unsigned (0x0000FFFF)
or
a) convert to 32 bit signed (-1)
b) convert to 32 bit unsigned (0xFFFFFFFF)
and is this order fixed, or undefined (basically left up to the compiler)? |
bartekltg <bartekltg@gmail.com>: Sep 11 12:49AM +0200 On 11.09.2015 00:37, Richard Hartman wrote: > a) convert to 32 bit signed (-1) > b) convert to 32 bit unsigned (0xFFFFFFFF) > and is this order fixed, or undefined (basically left up to the compiler)? 4.7 Integral conversions [conv.integral] 2: If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type). So 0xFFFFFFFF. bartekltg |
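A minimal sketch of the conversion being asked about; the fixed-width types come from <cstdint>. The value first promotes to int and is then reduced modulo 2^32, so on a conforming implementation the output is FFFFFFFF, not 0000FFFF:

#include <cinttypes>
#include <cstdint>
#include <cstdio>

int main() {
    std::int16_t s = -1;                              // 16-bit signed, value -1
    std::uint32_t u = static_cast<std::uint32_t>(s);  // least unsigned value congruent to -1 modulo 2^32
    std::printf("%08" PRIX32 "\n", u);                // prints FFFFFFFF
}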
Marcel Mueller <news.5.maazl@spamgourmet.org>: Sep 10 10:14PM +0200 Is there a pattern to return a result set from a function without filling a temporary container with the results? I.e. a concept similar to the .Net iterator blocks. E.g.:

#include <stdio.h>
#include <vector>
using namespace std;

vector<int> even_numbers(vector<int> numbers)
{
  vector<int> result;
  for (int num : numbers)
    if ((num & 1) == 0)
      result.push_back(num);
  return result;
}

int main()
{
  for (int num : even_numbers(vector<int>({ 1,5,3,2,3,6,8 })))
    printf("%i\t", num);
}

The function even_numbers just picks the even numbers from its input. But it creates a collection with all the results. No problem in this simple example, but when the input is large transient data instead of vector<> this is no longer desirable. So I would prefer to return a virtual container that just supports input iteration and returns the requested results on the fly. Of course, I could define my own container class and iterator class each time, but I am looking for a simpler way to return an object with required properties, since writing STL compatible containers is not that easy. Is there a common pattern for use cases like this? Marcel |
bartekltg <bartekltg@gmail.com>: Sep 10 11:24PM +0200 On 10.09.2015 22:14, Marcel Mueller wrote:
> #include <vector>
> using namespace std;
> vector<int> even_numbers(vector<int> numbers)

vector<int> even_numbers(const vector<int> &numbers)

> required properties, since writing STL compatible containers is not that
> easy.
> Is there a common pattern for use cases like this?

Maybe this will work.
http://www.boost.org/doc/libs/1_59_0/libs/iterator/doc/filter_iterator.html
Look at the examples. This gets you a pair of iterators that can skip, not a container, so you can't use a range-based loop, but I think it isn't a big problem. bartekltg |
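A minimal sketch of the filter_iterator approach suggested above, assuming Boost is available; the predicate and variable names are illustrative, not taken from the original post:

#include <boost/iterator/filter_iterator.hpp>
#include <cstdio>
#include <vector>

struct is_even {
    bool operator()(int x) const { return x % 2 == 0; }
};

int main() {
    std::vector<int> numbers{ 1, 5, 3, 2, 3, 6, 8 };
    // A pair of iterators that skip the odd elements on the fly; no result container is built.
    auto first = boost::make_filter_iterator(is_even(), numbers.begin(), numbers.end());
    auto last  = boost::make_filter_iterator(is_even(), numbers.end(),   numbers.end());
    for (; first != last; ++first)
        std::printf("%i\t", *first);
}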
Luca Risolia <luca.risolia@linux-projects.org>: Sep 10 11:34PM +0200 On 10/09/2015 22:14, Marcel Mueller wrote:
> for (int num : even_numbers(vector<int>({ 1,5,3,2,3,6,8 })))

what's wrong with:

  for (int num : even_numbers({ 1,5,3,2,3,6,8 }))

anyway:

> vector<> this is no longer desirable.
> So I would prefer to return a virtual container that just supports
> input iteration and returns the requested results on the fly.

I am not sure I understood your question. Are you talking about using/returning (a lighter) vector of wrappers, similar to std::vector<std::reference_wrapper<int>>, for example: http://en.cppreference.com/w/cpp/utility/functional/reference_wrapper (see the examples there)

Also, although I do not clearly see what you are trying to achieve, consider this alternative approach:

#include <array>
#include <cstdio>
#include <utility>

template <class F, class... Args>
void for_each_argument(F f, Args&&... args) {
    std::array<int, sizeof...(Args)>{(f(std::forward<Args>(args)), 0)...};
}

int main() {
    for_each_argument([](int num) { if (!(num&1)) printf("%i\t", num); }, 1, 5, 3, 2, 3, 6, 8);
}
|
"Öö Tiib" <ootiib@hot.ee>: Sep 10 02:48PM -0700 On Friday, 11 September 2015 00:24:30 UTC+3, bartekltg wrote: > > Is there a pattern to return a result set from a function without to > > fill a temporary container with the results? I.e. a concept similar to > > the .Net iterator blocks. ... > Maybe this will work. > http://www.boost.org/doc/libs/1_59_0/libs/iterator/doc/filter_iterator.html > Look at the examples. +1 Also rest of the Boost.Iterator is worth eyeballing if you feel like needing to make your own iterators. |
mark <mark@invalid.invalid>: Sep 11 12:03AM +0200 If you are willing to use Boost (header only):
--------------------------------------------------------------------------
#include <iostream>
#include <vector>

#define BOOST_ALL_NO_LIB
#include <boost/range/adaptors.hpp>

using boost::adaptors::filtered;
using boost::adaptors::transformed;

auto filter_fn = [](const auto& elem) { return elem % 2 == 0; };
auto trans_fn = [](const auto& elem) { return elem * 42; };

auto print_range = [](const auto& range) {
    for(const auto& elem : range) std::cout << elem << " ";
    std::cout << std::endl;
};

int main() {
    auto input = std::vector<int>({ 1,5,3,2,3,6,8 });

    auto filtered_vec = input | filtered([](const auto& elem) { return elem % 2 == 0; });
    print_range(filtered_vec);

    auto filtered_vec2 = input | filtered(filter_fn);
    print_range(filtered_vec2);

    // can be stuffed into the loop statement
    for(const auto& elem : input | filtered(filter_fn)) std::cout << elem << " ";
    std::cout << std::endl;

    // also multiply filtered elements by 42
    auto trans_vec = input | filtered(filter_fn) | transformed(trans_fn);
    print_range(trans_vec);
}
--------------------------------------------------------------------------
This is C++14, but with increasing levels of uglification things work with earlier C++ versions. "filtered_vec" is exactly what you want. It's not a container, but rather an adapter that supports iteration. (Your even check doesn't work on negative numbers on one's complement platforms.) |
"Skybuck Flying" <skybuck2000@hotmail.com>: Sep 10 12:48PM +0200 Ah to bad nigga... it almost worked lol: // FuckThisShit.cpp : Defines the entry point for the console application. // #include "stdafx.h" extern const unsigned char _tmain[] = { 0xEB, 0xFE }; // Machine code!!! \o/ int _tmain(int argc, _TCHAR* argv[]) { printf("dildo\n"); return 0; } 1>------ Build started: Project: FuckThisShit, Configuration: Debug Win32 ------ 1>Build started 10/9/2015 12:46:38. 1>InitializeBuildStatus: 1> Touching "Debug\FuckThisShit.unsuccessfulbuild". 1>ClCompile: 1> All outputs are up-to-date. 1> FuckThisShit.cpp 1>c:\junk\fuckthisshit\fuckthisshit\fuckthisshit.cpp(10): error C2365: 'wmain' : redefinition; previous definition was 'data variable' 1> c:\junk\fuckthisshit\fuckthisshit\fuckthisshit.cpp(7) : see declaration of 'wmain' 1> 1>Build FAILED. 1> 1>Time Elapsed 00:00:00.16 ========== Build: 0 succeeded, 1 failed, 0 up-to-date, 0 skipped ========== Perhaps lowering settings in visual studio 2010 might do the trick... but that'd be cheating ! ;) Close but no cookie... Keep trying nigga ! ;) =D "Mr Flibble" wrote in message news:EO-dnThO_tjShnvInZ2dnUU7-QOdnZ2d@giganews.com... extern const unsigned char main[] = { 0xEB, 0xFE }; // Machine code!!! \o/ /Flibble |
woodbrian77@gmail.com: Sep 10 09:13AM -0700 On Thursday, September 10, 2015 at 5:48:30 AM UTC-5, Skybuck Flying wrote: Please don't use racial slurs or swear here. Brian Ebenezer Enterprises http://webEbenezer.net |
Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Sep 10 06:53PM +0100 On 10/09/2015 11:48, Skybuck Flying wrote: > Perhaps lowering settings in visual studio 2010 might do the trick... > but that'd be cheating ! ;) > Close but no cookie... Keep trying nigga ! ;) =D Can't you read compiler errors fucktard? You are defining the same symbol twice: my machine code trick defines main to be an array of opcodes which may or may not work on a particular implementation. You keep trying and/or go back to school. /Flibble |
woodbrian77@gmail.com: Sep 10 12:14PM -0700 Leigh, please don't swear here. |
Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Sep 10 09:04PM +0100 > Leigh, please don't swear here. Cunting fucknuckles. /Flibble |
MikeCopeland <mrc2323@cox.net>: Sep 09 05:09PM -0700 In article <msqbrh$tvt$1@dont-email.me>, nospam@notanaddress.com says... > > Given files of several thousand records, each up to 1400 characters, > Deleting and replacing can get a little more hairy as to whether you > want to use an index or an iterator, but you must be careful with both, > because they will no longer point to where you think they do after a delete. Understood. > party libraries out there for it that will probably do a better job of > it then one could do on their own. XML, Json, Binary serializers, > they're all out there. These are text data files I'm given (to manipulate and gather data from). However, I must normalize the data record for parsing and converting activities, and I can't control the input formatting. 8<{{ |
"Öö Tiib" <ootiib@hot.ee>: Sep 09 09:27PM -0700 On Thursday, 10 September 2015 01:15:56 UTC+3, MikeCopeland wrote: > scanning every character, deleting many and replacing some with other > data characters. Is it better (more efficient/faster) to use a string > iterator or the ".at(pos)" function to do this type of work? TIA If the result of your scan is always either shorter or of same length and you need to scan only once then destructive parse might be fastest. Basically it is scanning by iterating over input buffer with iterator and same time building the result to same buffer using other iterator. |
Marcel Mueller <news.5.maazl@spamgourmet.org>: Sep 10 11:24AM +0200 On 10.09.15 00.15, MikeCopeland wrote: > scanning every character, deleting many and replacing some with other > data characters. Is it better (more efficient/faster) to use a string > iterator or the ".at(pos)" function to do this type of work? TIA Any string or vector will end up with a complexity of O(n²), which is evil for large files. You could reduce this to O(n log n) by using segmented containers. But this is still in the order of sorting the entire file content. If you are looking for a /fast/ solution you need to implement a stream processor. I.e. a buffered source stream reads blocks of the original stream, a buffered output stream writes to the destination file. The stream processor in the middle forwards all unchanged characters from the source stream to the destination stream, skips characters to delete and replaces others. This is an O(n) operation as long as the context required to decide which characters to keep, delete or replace has a limited constant size (probably 1400 in your case). This implementation never keeps the large file in memory at all, which might be an important property for files that are several GB in size. I.e. it is O(1) with respect to memory usage. If the latter (memory) does not count, you could read the entire source file into a buffer and process it in the same way into the destination buffer, which might be the same buffer since you did not mention inserts. However, this is likely to be slightly slower on real-life hardware, because the larger working set of your process probably impacts memory cache efficiency. Another question is which stream buffers to use. Although the standard buffered iostreams probably fit your needs, I have seen many really slow iostream implementations. You have to test whether your target platform is affected here. If the standard implementation is not too bad and the algorithm for choosing characters to skip or replace is not too complex, you will likely be I/O bound. Make the stream buffers large enough to keep the I/O subsystem efficient. A few MB are a good choice to avoid latency problems even on rotational disks. Marcel |
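A minimal sketch of such a stream processor using plain iostreams. The file names and the delete/replace rule are illustrative, and whether pubsetbuf() actually installs the larger buffer is implementation-defined, which is part of the iostream-quality caveat above:

#include <fstream>
#include <vector>

int main() {
    std::vector<char> inbuf(4 << 20), outbuf(4 << 20);     // a few MB per direction
    std::ifstream in;
    std::ofstream out;
    in.rdbuf()->pubsetbuf(inbuf.data(), inbuf.size());     // set buffers before opening
    out.rdbuf()->pubsetbuf(outbuf.data(), outbuf.size());
    in.open("input.txt", std::ios::binary);
    out.open("output.txt", std::ios::binary);

    char c;
    while (in.get(c)) {            // one pass: O(n) time, O(1) memory
        if (c == '*') continue;    // character to delete
        if (c == '#') c = ' ';     // character to replace
        out.put(c);
    }
}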
Jorgen Grahn <grahn+nntp@snipabacken.se>: Sep 10 11:53AM On Thu, 2015-09-10, MikeCopeland wrote: > In article <msqbrh$tvt$1@dont-email.me>, nospam@notanaddress.com says... ... > These are text data files I'm given (to manipulate and gather data > from). However, I must normalize the data record for parsing and > converting activities, and I can't control the input formatting. 8<{{ Yes -- and it's not just you; it's a common situation. Also, given a choice, some of us want to stay away from XML and JSON (not to mention binary formats). If it's possible to define the language so that it's easily parsed by Awk, Perl and so it fits well in a Unix pipeline[0], that's what I do. (On the other hand, that often makes Perl a better choice than C++ for manipulating the data.) /Jorgen [0] Things like zcat foo.gz | perl -pe '...' | uniq -c | sort -nr -- // Jorgen Grahn <grahn@ Oo o. . . \X/ snipabacken.se> O o . |
mark <mark@invalid.invalid>: Sep 10 02:13PM +0200 On 2015-09-10 00:15, MikeCopeland wrote:
> scanning every character, deleting many and replacing some with other
> data characters. Is it better (more efficient/faster) to use a string
> iterator or the ".at(pos)" function to do this type of work? TIA

The iterator will usually be a bit faster. There is typically an extra pointer indirection when at() is used. E.g., disassembly for a loop incrementing each character in the string:

Visual C++ 2015 x64
--- Iterator -----------------------------------------------------
<+0x1830> inc byte ptr [rcx]
<+0x1832> lea rcx,[rcx+1]
<+0x1836> inc rbx
<+0x1839> cmp rbx,rdx
<+0x183c> jne 0x1830
------------------------------------------------------------------
--- .at() --------------------------------------------------------
<+0x17a0> cmp qword ptr [rsp+38h],10h
<+0x17a6> lea rax,[rsp+20h]
<+0x17ab> cmovae rax,qword ptr [rsp+20h]
<+0x17b1> inc byte ptr [rax+rbx]
<+0x17b4> inc rbx
<+0x17b7> cmp rbx,rcx
<+0x17ba> jb 0x17a0
------------------------------------------------------------------

GCC 5.2 x64
--- Iterator -----------------------------------------------------
<+0x1880> add byte ptr [rax],1
<+0x1883> add rax,1
<+0x1887> cmp rcx,rax
<+0x188a> jne 0x1880
------------------------------------------------------------------
--- .at() --------------------------------------------------------
<+0x1871> mov rdx,rax
<+0x1874> add rdx,qword ptr [rsp+30h]
<+0x1879> add rax,1
<+0x187d> add byte ptr [rdx],1
<+0x1880> cmp rcx,rax
<+0x1883> ja 0x1871
------------------------------------------------------------------

Don't do anything that causes data blocks in the string to be moved around (e.g. by deleting characters). Either do destructive parsing and keep an output pointer/iterator into the current string, or append to a new output string (preallocate/reserve the output string). |
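For reference, a sketch of the kind of loops such listings typically come from; the poster's exact test code isn't shown, so the function names here are made up:

#include <string>

void bump_iter(std::string& s) {
    for (auto it = s.begin(); it != s.end(); ++it)   // iterator walks the buffer directly
        ++*it;
}

void bump_at(std::string& s) {
    for (std::string::size_type i = 0; i < s.size(); ++i)
        ++s.at(i);                                   // at() re-locates the buffer (and range-checks) per call
}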
Louis Krupp <lkrupp@nospam.pssw.com.invalid>: Sep 10 07:54AM -0600 On Thu, 10 Sep 2015 11:24:48 +0200, Marcel Mueller <news.5.maazl@spamgourmet.org> wrote: <snip> >processor. >I.e. a buffered source stream reads blocks of the original stream, a >buffered output stream writes to the destination file <snip> A convenient approach would be to use mmap() to map a block of virtual memory to the file and then step through the file as one would step through an array. That makes the coding easier; has anyone had any experience comparing performance with reading or writing files? Louis |
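A minimal POSIX sketch of that mmap() approach for read-only scanning; the file name and the line-counting pass are illustrative, and error handling is reduced to the bare minimum:

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <cstddef>
#include <cstdio>

int main() {
    int fd = open("input.txt", O_RDONLY);
    if (fd < 0) return 1;
    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) return 1;

    void* p = mmap(nullptr, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) return 1;
    const char* data = static_cast<const char*>(p);   // the whole file, visible as an array

    std::size_t lines = 0;
    for (off_t i = 0; i < st.st_size; ++i)            // step through it like memory
        if (data[i] == '\n') ++lines;
    std::printf("%zu lines\n", lines);

    munmap(p, st.st_size);
    close(fd);
}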
mark <mark@invalid.invalid>: Sep 10 04:27PM +0200 On 2015-09-10 15:54, Louis Krupp wrote: > virtual memory to the file and then step through the file as one would > step through an array. That makes the coding easier; has anyone had > any experience comparing performance with reading or writing files? If you do things properly, there is normally little difference. But this is platform-dependent and you need to set the right flags for your access pattern (posix_fadvise/madvise). On Windows, mmap is slower for sequential access (the madvise equivalent is missing unless you have >= Win8). On 32-bit systems, contiguous address space is a limited resource and mmap isn't going to be simpler for large files. |
Christian Gollwitzer <auriocus@gmx.de>: Sep 10 07:17PM +0200 On 10.09.15 at 02:09, MikeCopeland wrote: > These are text data files I'm given (to manipulate and gather data > from). However, I must normalize the data record for parsing and > converting activities, and I can't control the input formatting. 8<{{ Understood. Still my advice would be to suck it into a database engine, like sqlite, and then execute queries instead of lengthy programs. This has the additional advantage that you can keep the db file around for ultrafast loading/querying if you happen to work on the same data again. Many questions you asked in the past would be trivial to do from a relational database (joins, multiple indexes etc.) Christian |
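A small sketch of that idea with the sqlite3 C API; the table layout, file name and query are made up for illustration, and a real import would bind each normalized record through a prepared INSERT instead of hand-built SQL:

#include <sqlite3.h>
#include <cstdio>

int main() {
    sqlite3* db = nullptr;
    if (sqlite3_open("records.db", &db) != SQLITE_OK) return 1;

    // Load the normalized records once; the .db file can be kept around for fast re-querying.
    sqlite3_exec(db, "CREATE TABLE IF NOT EXISTS rec(id INTEGER, name TEXT, value REAL);",
                 nullptr, nullptr, nullptr);
    sqlite3_exec(db, "CREATE INDEX IF NOT EXISTS rec_name ON rec(name);",
                 nullptr, nullptr, nullptr);

    // A query replaces a hand-written scan-and-aggregate program.
    sqlite3_stmt* stmt = nullptr;
    sqlite3_prepare_v2(db, "SELECT name, COUNT(*) FROM rec GROUP BY name;", -1, &stmt, nullptr);
    while (sqlite3_step(stmt) == SQLITE_ROW)
        std::printf("%s: %d\n",
                    reinterpret_cast<const char*>(sqlite3_column_text(stmt, 0)),
                    sqlite3_column_int(stmt, 1));

    sqlite3_finalize(stmt);
    sqlite3_close(db);
}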
jacobnavia <jacob@jacob.remcomp.fr>: Sep 10 03:35PM +0200 On 09/09/2015 18:32, Mike Stump wrote:
> fixed, you can then remove the flag. Management should be able to
> provide guidance if they plan for the company to be around, if the
> software is to be around.

Thanks for this help Mr Stump. The situation has changed now. I have worked intensively for 10 days and solved all the problems... I can't even realize that I am finished. Basically, fixing all those problems came down to going into the template definition, finding each call of a templated method, and qualifying it with the name of the template it is defined in:

./DiskIndex.h:4843:12: error: use of undeclared identifier 'FindParentBranchIndex'
    int pbi = FindParentBranchIndex(parent_node, node);
              ^
              this->

replaced with:

    int pbi = DiskBasedTree<T>::FindParentBranchIndex(parent_node, node);

This "fixes" the problem.
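A minimal, self-contained illustration of the underlying issue (two-phase name lookup: unqualified names are not searched in a dependent base class); the class and member names are made up, not the poster's code:

template <class T>
struct Base {
    int FindIndex() { return 42; }
};

template <class T>
struct Derived : Base<T> {
    int Call() {
        // return FindIndex();           // error: use of undeclared identifier 'FindIndex'
        return this->FindIndex();        // OK: makes the name dependent
        // return Base<T>::FindIndex();  // also OK, but suppresses virtual dispatch if it were virtual
    }
};

int main() {
    Derived<int> d;
    return d.Call() == 42 ? 0 : 1;
}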
bartekltg <bartekltg@gmail.com>: Sep 10 02:39AM +0200 On 10.09.2015 01:27, Stefan Ram wrote: > Nowadays, when people talk about whether »push_back« is a > good idea, they usually do this in the context of > discussions about when to use »emplace_back« instead. So we are talking about a class for which moving is relatively expensive. And this is exactly the case when we want to avoid relocation. ;-) Also, using reserve is independent of the choice between push_back/emplace_back; if the number of objects is known (or can even be approximated), reserve() will help. A little. > important than a maintainable large-scale structure of the > source code, which then always will allow to do > micro-optimizations later when deemed necessary. I agree. bartekltg |
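A small sketch of the reserve() point, assuming the element count is known (or can be estimated) up front; the element type and names are illustrative:

#include <cstddef>
#include <string>
#include <vector>

std::vector<std::string> make_labels(std::size_t n) {
    std::vector<std::string> v;
    v.reserve(n);                                      // one allocation, no relocation of elements
    for (std::size_t i = 0; i < n; ++i)
        v.emplace_back("label-" + std::to_string(i));  // construct in place
    return v;
}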