- What is "TP_NUM_C_BUFS"? - 8 Updates
- Xn invites review, tryout or simply any idea. - 2 Updates
Paavo Helde <myfirstname@osa.pri.ee>: Apr 11 09:45PM +0300 On 11.04.2016 21:24, Robbie Hatley wrote: > Replaces 343 lines of code with... well... 0. I can erase my entire > homebrew regex module, seeing as how C++ 2011 std lib already > has easy-to-use regex match, search, replace tools. Cool. Good to hear! Yes of course it is better to use "builtin" C++ regex whenever possible. And I am still pretty sure the actual bug was in some of those deleted 343 lines (and deleting them was a good way to get rid of the bug!) > debugging my buggy C regex module later as an exercise. But for > any serious usage I think I'll use C++ regex instead. > I *still* wonder what the heck "TP_NUM_C_BUFS" is, though. :-) According to Google, this is a constant defined in some cygtls.h header file (e.g. https://sourceforge.net/u/earnie/msys2/msys2/ci/01e8fa7372b33b9766bb5dd3e059cfab695bbf90/tree/winsup/cygwin/cygtls.h). "tls" means thread local storage, a known source of problems and confusion on Windows. But I believe your issue had nothing to do with the thread local storage, it's just that this code bothered to do some consistency checks and noticed the memory corruption. Cheers Paavo |
legalize+jeeves@mail.xmission.com (Richard): Apr 11 08:39PM [Please do not mail me a copy of your followup] Robbie Hatley <see.my.sig@for.my.address> spake the secret code > << endl; > std::string Text2 = Text; // CATASTROPHIC PROGRAM CRASH!!! > [...] I'm going to take a step back here to add a viewpoint that hasn't yet been expressed in this thread. I agree that it is unlikely that the GNU C Library Regex facility requires that it's values be allocated on the heap in order to function properly. That would be unusual for the any Standard Library facility. They're designed intentionally so that you "only pay for what you use". TL;DR: Monster methods are a source of bugs. Refactor them into smaller units that your brain can reason about with confidence. If this code is easy to test in isolation, I should first write a unit test for it that reproduces the problem. What you've written suggests that this is a "monster method". Writing an isolated test for it is likely to be difficult because it almost certainly doesn't have "hundreds of lines of code" that execute in a linear fashion. That means lots of branching and conditionally executed code. So writing a test is likely to be difficult. If not, certainly write one that reproduces the problem. That "hundreds of lines of code" is really bugging me. Chances are that there is another "hundreds of lines of code" after the part you're showing us. Compose Method is your friend here.[1] (See below for an alternative: Replace Method with Method Object.) Most likely your "hundreds of lines of code" is introducing lots and lots of local variables throughout. Almost certainly it was decided that reusing local variables was preferable to each chunk of code using only it's own variables. Code like this makes it difficult to apply Compose Method because there is lots of shared state (the variables) between various chunks of code. They aren't coupled through the values of the variables because each chunk overwrites the value with it's own local initialization, but they are coupled through the names of the variables that are reused. In order to break this up, you'll want to refactor each isolated use of a reused variable into a use of a distinct variable. Unfortunately there isn't any good tool for C++ that can do this automatically and reliably, so you'll need to do it by hand. Here's where you should lean on your version control system really heavily. Each time you decouple a reused variable you'll want to decouple a single reuse instance and then make a commit into your version control system for it. Yes, this may result in 50+ commits for these several hundred lines of code, but so what? Commits are free. From the way you describe it, I assume this code has 0% unit test coverage. Small, slow steps are what you need in order to bring some sanity to this monster method. When you transform code in small steps anyone can follow the version control history and see what was done and how it was done. When a huge method like this is refactored in one gigantic step, then looking at the diffs becomes pointless. The diff viewer will say "several hundred lines of code were removed and replaced with several hundred other lines of code", making for an unintelligible change. When looking at the diff, the average person will be hard pressed to determine if the change was for the better or not in terms of bugs and preserved functionality. Once we've reduced the coupling of various chunks of code, we can start to Extract Method on the various chunks in order to achieve Compose Method. Here, there are good tools to help: CLion for Linux/Mac and VS2015/ReSharper for C++ for Windows. They all have pretty good automated Extract Method functionality that handles the dataflow needed to properly deduce input arguments, return types and their proper signatures. (Sorry, VisualAssist fairs pretty poorly here and I can't recomment it as it causes too much manual fixup after the fact.) If they don't deduce the types to best practices (passing std::string input args by value instead of const reference), then applying Change Signature on the extracted function gets you where you want to be. Why do Compose Method? We do it because large methods with lots of local variables are nearly impossible to reason about with confidence. Breaking a monster method up into a series of calls to smaller methods that we named with intention revealing names[2] let's us see the forest instead of the trees. Once things are broken down into small chunks we can reason about easily, we often spot the small error that caused the bug in the first place. It was "hiding in plain sight" inside that monster method. If you end up with very large argument lists because all these extracted chunks are communicating large amounts of state between them, then Replace Method with Method Object[3] may be a better alternative to Compose Method. With a Method Object, almost everything that was local state in the monster method becomes instance variables in the new object. This instance state doesn't need to be passed around between methods. We apply the same process as in Compose Method to break the monster method into smaller chunks. There's something nice that can happen when we use a Method Object instead of Compose Method on an existing object. The new Method Object can make all of it's extracted methods public, increasing the visible surface area that we can unit test from the outside. The Method Object is visible to us internally for testing purposes, but isn't exposed to clients as part of our public API anymore than the local variables in the monster method were exposed publicly. [1] <https://www.industriallogic.com/xp/refactoring/composeMethod.html> [2] <http://c2.com/cgi/wiki?IntentionRevealingNames> [3] <http://refactoring.com/catalog/replaceMethodWithMethodObject.html> -- "The Direct3D Graphics Pipeline" free book <http://tinyurl.com/d3d-pipeline> The Computer Graphics Museum <http://computergraphicsmuseum.org> The Terminals Wiki <http://terminals.classiccmp.org> Legalize Adulthood! (my blog) <http://legalizeadulthood.wordpress.com> |
Robbie Hatley <see.my.sig@for.my.address>: Apr 12 09:16AM -0700 For comparison purposes, here are the C++ and C versions of my "Substitute" function, as stand-alone C++ programs, dependent only on std libraries (and the GNU C regex library in the case of the C-style version). First, the C++ version. Total code lines: 38. Total time spent writing & debugging: 60 minutes. #include <iostream> #include <regex> using std::cout; using std::cerr; using std::endl; std::string Substitute ( std::string const & Pattern, std::string const & Replacement, std::string const & Text, std::string const & Flags // If Flags contains 'g', do global replace; ) // otherwise, replace first occurrence only. { std::regex_constants::match_flag_type RegexFlags {}; if (std::string::npos == Flags.find_first_of("g")) { RegexFlags = std::regex_constants::format_first_only; } std::regex RegEx {Pattern}; std::string Result = std::regex_replace(Text, RegEx, Replacement, RegexFlags); return Result; } // end function rhregex::Substitute() int main (int Beren, char ** Luthien) { std::string Pattern {Luthien[1]}; std::string Replacement {Luthien[2]}; std::string Text {Luthien[3]}; std::string Flags {Luthien[4]}; if (5 != Beren) { cerr << "Error in regex-c-test.cpp: this program takes 4 arguments," << endl << "but you only typed " << Beren << ". Aborting program." << endl; exit(666); } std::string Result = Substitute(Pattern, Replacement, Text, Flags); std::cout << Result << std::endl; } Next, the C-style version. Total code lines: 322. Total time spent writing & debugging: many hours #include <iostream> #include <iomanip> #include <string> #include <sstream> #include <regex.h> #undef NDEBUG #include <assert.h> #include <errno.h> #undef BLAT_ENABLE #ifdef BLAT_ENABLE #define BLAT(X) std::cerr << X << std::endl; #else #define BLAT(X)
Subscribe to:
Post Comments (Atom)
|
No comments:
Post a Comment