Thursday, June 25, 2015

Digest for comp.lang.c++@googlegroups.com - 25 updates in 8 topics

noreply@example.com: Jun 25 04:39PM

I'm new to C++, and took on writing a VI-like editor as my "break the seal"
project.
 
After 6k lines of code, I have an editor that stumbles along, and handles most
things; it's not pretty, and I'll be rewriting it again to incorporate all the
stuff I've learned since March.
 
Early on in the design, I made the choice to use lists rather than vectors
to hold the working set of lines. That is, in a ten line file, there are
ten list elements, each effectively holding a std::string.
 
I based this decision on the run-time characteristics of the VI command
"g/^/.m0" -- tag each line and move it to the beginning of the file,
effectively reversing the order of the lines in the file.
 
Stroustrup's talk (mentioned in another thread somewhere in this newsgroup)
made me want to revisit the decision to use lists.
 
I'm wondering if in this specific domain (an editor), lists are in fact the
better choice over vectors as the holder-of-all-lines?
 
Thanks in advance,
-RK
 
--
Robert Krten
 
Visit me at http://www.ironkrten.com
legalize+jeeves@mail.xmission.com (Richard): Jun 25 05:18PM

[Please do not mail me a copy of your followup]
 
noreply@example.com spake the secret code
>made me want to revisit the decision to use lists.
 
>I'm wondering if in this specific domain (an editor), lists are in fact the
>better choice over vectors as the holder-of-all-lines?
 
There are two kinds of forces at work here: design and performance.
 
The design forces revolve around the readability of your code. Is
there any benefit to comprehension in choosing one container over
another? For std::list<std::string> vs. std::vector<std::string>,
there probably isn't going to be much difference in comprehension in
choosing one over the other. This is particularly true if your code is
making use of iterators and a typedef for the container. In other
words, if you have code like:
 
typedef std::list<std::string> line_container_t;
 
void munge_lines(line_container_t::iterator begin,
line_container_t::iterator end)
{
// do stuff with lines through iterators
}
 
Then switching from list to vector is just a matter of changing the
typedef and possibly adjusting a few places where you directly interact
with list specific members.
 
The performance forces are how fast the code runs. This can only be
determined by running your code against some sort of test case and
measuring the results. The surprising thing about all these talks in
the past few years at C++ conferences over "list vs. vector" or "map
vs. vector" is that the performance analysis shows that vector does
much, much better than most people expect. This appears to be true for
both short lists and long lists, short vectors and long vectors.
 
The costs for traversing up the computer memory hierarchy have been
getting bigger over time. Whereas before a cache miss was undesirable
but tolerable, today it can absolutely kill your performance.
<https://en.wikipedia.org/wiki/Memory_hierarchy>
--
"The Direct3D Graphics Pipeline" free book <http://tinyurl.com/d3d-pipeline>
The Computer Graphics Museum <http://computergraphicsmuseum.org>
The Terminals Wiki <http://terminals.classiccmp.org>
Legalize Adulthood! (my blog) <http://legalizeadulthood.wordpress.com>
noreply@example.com: Jun 25 05:35PM

> {
> // do stuff with lines through iterators
> }
 
Yes, this is exactly how the code is structured. The one interesting
tradeoff is that I can no longer really use "line numbers" in the editor
(when using lists) because there's no direct index into the list, whereas
with a vector it would be directly supported.
 
> getting bigger over time. Whereas before a cache miss was undesirable
> but tolerable, today it can absolutely kill your performance.
> <https://en.wikipedia.org/wiki/Memory_hierarchy>
 
I will try both, then, and see how they perform. Vectors would certainly
be easier for maintaining "dot" (the current line in ED/EX/VI parlance),
whereas today "dot" is maintained as an iterator (effectively like a pointer).
 
Thanks,
-RK
 
--
Robert Krten
 
Visit me at http://www.ironkrten.com
"Chris M. Thomasson" <nospam@nospam.nospam>: Jun 25 11:24AM -0700

> things; it's not pretty, and I'll be rewriting it again to incorporate all
> the
> stuff I've learned since March.
 
[...]
 
FWIW, you just might be interested in:
 
https://en.wikipedia.org/wiki/Rope_(data_structure)
 
;^)
Christian Gollwitzer <auriocus@gmx.de>: Jun 25 08:27PM +0200


>[...]
> I'm wondering if in this specific domain (an editor), lists are in fact the
> better choice over vectors as the holder-of-all-lines?
 
Look up a gap buffer. AFAIK that is a data structure optimized for a
speed/size tradeoff used in text editors.
 
If you ever need an editing component in a real project, use a library
like Scintilla to do that. Real editors consist of many man-years of
effort to make them fast, memory-efficient and pleasant to use.
 
Christian
"Chris M. Thomasson" <nospam@nospam.nospam>: Jun 25 11:27AM -0700

> > things; it's not pretty, and I'll be rewriting it again to incorporate
> > all the
> > stuff I've learned since March.
 
[...]
 
> FWIW, you just might be interested in:
 
> https://en.wikipedia.org/wiki/Rope_(data_structure)
 
Here is a paper on the idea:
 
http://www.cs.rit.edu/usr/local/pub/jeh/courses/QUARTERS/FP/Labs/CedarRope/rope-paper.pdf
 
Hope is it of interest to you.
 
:^)
noreply@example.com: Jun 25 06:39PM


> Here is a paper on the idea:
 
> http://www.cs.rit.edu/usr/local/pub/jeh/courses/QUARTERS/FP/Labs/CedarRope/rope-paper.pdf
 
> Hope is it of interest to you.
 
Thanks! So much to learn :-)
 
I'm thinking that the rope concept can be used for the lists of lines, rather than
just the contents of each line.
 
Cheers,
-RK
 
--
Robert Krten
 
Visit me at http://www.ironkrten.com
noreply@example.com: Jun 25 06:41PM

>> better choice over vectors as the holder-of-all-lines?
 
> Look up a gap buffer. AFAIK that is a data structure optimized for a
> speed/size tradeoff used in text editors.
 
Yes, but this (gap buffer) is at the *string* level; my main data structure problems
(so far) are at the *line* level -- i.e., how to organize the collection of
std::strings that make up the content of the file being editted.
 
> If you ever need an editing component in a real project, use a library
> like Scintilla to do that. Real editors consist of many man-years of
> effort to make them fast, memory-efficient and pleasant to use.
 
For sure. This is purely as a learning exercise :-)
 
Cheers,
-RK
--
Robert Krten
 
Visit me at http://www.ironkrten.com
Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Jun 25 07:51PM +0100

> made me want to revisit the decision to use lists.
 
> I'm wondering if in this specific domain (an editor), lists are in fact the
> better choice over vectors as the holder-of-all-lines?
 
Hi,
 
A good data structure for an editor is something like my container
called a "segmented array". A segmented array has relative fast (O(lg
N)) performance for random inserts/erases which is essential for an editor.
 
For more information go to http://www.i42.co.uk/stuff/segmented_array.htm
 
neolib::segmented_array<char> would be ideal.
 
/Flibble
scott@slp53.sl.home (Scott Lurndal): Jun 25 04:57PM


>> Almost anything is tolerable. The question was, what's the advantage?
 
>The advantage is that function's name is put before rest of the clutter. It=
> is same what lot of language neutral programming tools do.=20
 
Please.
 
int
main(int argc, const char **argv, const char **envp)
 
accomplishes the same thing, is more readable, and
allows '/^main' in vi to jump directly to the function
definition.
 
That works with C++ functions as well
 
inline uint64_t
CpuCoreState::get_gpr(Register::Registers register)
{
returns _registers[register];
}
"Öö Tiib" <ootiib@hot.ee>: Jun 25 11:31AM -0700

On Thursday, 25 June 2015 19:57:21 UTC+3, Scott Lurndal wrote:
 
> accomplishes the same thing, is more readable, and
> allows '/^main' in vi to jump directly to the function
> definition.
 
More readable is matter of taste, '/^auto main' takes
half a second longer to type indeed.
 
When I make list of all lines where overloads of function
are defined then it annoys me that these contain no return
types.
 
> {
> returns _registers[register];
> }
 
The member functions of class often return class-local types and typedefs.
That results with stutter:
 
SomeContainer::const_iterator SomeContainer::middle()
{
// ...
}
 
Two lines or one ... still stutter. With return type after ... no stutter:
 
auto SomeContainer::middle() -> const_iterator
{
// ...
}
legalize+jeeves@mail.xmission.com (Richard): Jun 25 05:06PM

[Please do not mail me a copy of your followup]
 
Rosario19 <Ros@invalid.invalid> spake the secret code
>>- 'using namespace fmeh;' at file scope
 
>as someone said, namespace should be the name of file where there is
>the function in cpu instructions.
 
I don't think you quite understood what I was saying, so I'll elaborate.
 
I agree it is good advice to define all of your own code inside a
namespace. What you name that namespace is up to you; it is a subjective
decision to say that it "should" be the name of the file containing the
source code. There are other naming policies that are just as good.
So while I say you should put your code in a namespace, I refrain from
telling you what your namespace should be named.
 
When I say that I've done 'using namespace fmeh;' at file scope, I am
not referring to the namespace enclosing the definitions for that file.
I am using it similarly to an import statement in Java: a way to get
names from some other namespace visible in my source file without
having to be fully qualified.
 
When you put all your own code inside a namespace, you should never do
a file-scope using directive for the namespace where your code resides.
This makes the compiler do more work to resolve the names you're
attempting to define. Instead your code should be explicitly
surrounded by a named namespace definition.
 
>#define printf namedll.printf
>or
>#define P namedll.printf
 
I don't know what you're trying to propose with this, but such abuse of
the preprocessor is a really bad idea IMO.
--
"The Direct3D Graphics Pipeline" free book <http://tinyurl.com/d3d-pipeline>
The Computer Graphics Museum <http://computergraphicsmuseum.org>
The Terminals Wiki <http://terminals.classiccmp.org>
Legalize Adulthood! (my blog) <http://legalizeadulthood.wordpress.com>
Lynn McGuire <lmc@winsim.com>: Jun 25 01:06PM -0500

On 6/25/2015 5:23 AM, Öö Tiib wrote:
 
> Why someday? Just do it right away when you have any name
> clashes. Namespaces *are* meant as tool for avoiding name
> clashes.
 
No bugs?
 
Amazing.
 
Lynn
Rosario19 <Ros@invalid.invalid>: Jun 25 08:20PM +0200

On Thu, 25 Jun 2015 17:06:54 +0000 (UTC), Richard wrote:
 
>I don't think you quite understood what I was saying, so I'll elaborate.
 
>I agree it is good advice to define all of your own code inside a
>namespace.
 
there is no need of C++ namespaces or other namespaces...
namespaces are just the file names where functions are stored
where file names are path+namefile where is the function
 
for example: directorya\directoryb\namefile.dll.namefunction
give access to one and only one function
 
because in a file can not be the same name for something
have the same type [or function type]
 
and in 2 files even if there is the same name and same type it has a
different global name consider the file name that has to be different
 
>This makes the compiler do more work to resolve the names you're
>attempting to define. Instead your code should be explicitly
>surrounded by a named namespace definition.
 
a computer has files
in files there are .dll library and executable
so the only good namespace is file name [+ path]
 
>>#define P namedll.printf
 
>I don't know what you're trying to propose with this, but such abuse of
>the preprocessor is a really bad idea IMO.
 
but with that i can call functions that are in .dll using assembly and
pheraps in C or C++ too
 
so one can call all function [and C++ operator too] computer has
without problem
the info is only path+file name and name function
ram@zedat.fu-berlin.de (Stefan Ram): Jun 25 04:54PM

>I'm new to C++, and took on writing a VI-like editor as my "break the seal"
>project.
 
You need a terminal library to address issues like moving
the cursor, or otherwise resort to a line-editor like »ex«.
Of course, the most famous library for that is curses.
 
I have written a console text editor for my C++ course
as an example in the lesson about reading input from
the console.
 
#include <iostream>
#include <ostream>
#include <istream>
#include <cstdlib>
 
int arg( ::std::string & s ){ return atoi( s.substr( 1 ).c_str() ); }
 
int main()
{ ::std::cout << "i1 insert \"1\"\n+1 move cursor right\n-1 "
" move cursor left\nd1 erase 1 character\n";
::std::string s{ "alpha" }, command; int pos = 0; while( 1 )
{ ::std::cout << s << '\n' << ::std::string( pos, ' ' ) << '^' << '\n';
::std::cin >> command; switch( command.at( 0 ))
{ case '+': pos += arg( command ); break;
case '-': pos -= arg( command ); break;
case 'i': s.insert( pos, command.substr( 1 )); break;
case 'e': s.erase( pos, arg( command )); break; }}}
 
A session transcript (the user input is: »+2«, »i__«, »-1«,
and »e2«, i.e., »move right by 2«, »insert "__"«, »move left«,
and »erase two characters«):
 
i1 insert "1"
+1 move cursor right
-1 move cursor left
d1 erase 1 character
alpha
^
+2
alpha
^
i__
al__pha
^
-1
al__pha
^
e2
a_pha
^
 
>Early on in the design, I made the choice to use lists rather than vectors
>to hold the working set of lines. That is, in a ten line file, there are
>ten list elements, each effectively holding a std::string.
 
I like to hide such decisions behind an interface. The
editor UI uses an editor engine. The editor engine has a
storage engine. Only the storage engine is aware of that
decision. It is possible to have several storage engines
(like MySQL) and to select one at compile-time or even at
run-time.
 
When you have such a design you can start with the slowest
storage engine and always change this later.
 
There are special data structures for editors like ropes
or gap buffers.
 
But as long as one has not observed major problems due to lack
of speed, I would not care about such things.
 
I known an editor that is superb and fast. But for years or
even decades the manufacturer tells us that it will never support
Unicode. It seems that ASCII is engraved so hard in this editor
that they are unable to reverse that decision. Possibly,
you should keep Unicode-compatibility in mind.
ram@zedat.fu-berlin.de (Stefan Ram): Jun 25 05:16PM

Richard writes:
>I agree it is good advice to define all of your own code inside a
>namespace. What you name that namespace is up to you; it is a subjective
 
First of all: Should you write a reply, then please do not
send me an additional copy via E-mail. I read the newsgroup,
and I believe that this suffices.
 
Next:
I'd not define »main« in a namespace different from the
global namespace.
I'd not name a global namespace for application code »std«.
 
A policy that supports global publication could be
 
namespace com { namespace example { namespace PROJECT { ... }}}
 
for the owner of the domain »example.com« (Java inspired).
 
But common in C++ is:
 
namespace MANUFACTURER_NAME { namespace PROJECT_NAME { ... }}
 
.
legalize+jeeves@mail.xmission.com (Richard): Jun 25 05:10PM

[Please do not mail me a copy of your followup]
 
David Brown <david.brown@hesbynett.no> spake the secret code
>IDEs, such as NetBeans or CLion, if it looked like they would be
>significantly better for me - but it doesn't sound like there would be
>much difference.
 
I'd love it if some Eclipse user could evaluate my refactoring test
suite and tell me how good Eclipse is at refactoring:
<https://github.com/LegalizeAdulthood/refactor-test-suite>
--
"The Direct3D Graphics Pipeline" free book <http://tinyurl.com/d3d-pipeline>
The Computer Graphics Museum <http://computergraphicsmuseum.org>
The Terminals Wiki <http://terminals.classiccmp.org>
Legalize Adulthood! (my blog) <http://legalizeadulthood.wordpress.com>
Ian Collins <ian-news@hotmail.com>: Jun 21 12:58PM +1200

Öö Tiib wrote:
 
>> Those are glaring defects in the Standard. Good that you've found them.
 
> Yes "template class" does not make sense but "non-template class" makes
> sense to me. Or ... how else to call the latter?
 
A class? Or if you must be specific, a concrete class
 
--
Ian Collins
Victor Bazarov <v.bazarov@comcast.invalid>: Jun 20 07:32PM -0400

On 6/20/2015 8:23 AM, JiiPee wrote:
 
> just to clarify,
> "it" - are you refering here to the definition of the "const member
> function"?
 
"It": your inability to return a ref to non-const part of a const object.
 
 
> ok, so we can think that p_ (after declaring the function as const
> function) becomes like:
> const int p_;
 
Exactly.
 
> int foo() const { return p_; }
> it would mean that we are doing a copy for the return value:
> int return = p_;
 
Yep.
 
> But with reference return value it would be like:
> int& return = p_;
> which is not working because non-const ref cannot refer to a const. Right?
 
Yep.
 
 
> If I see it like this then its easy to understand....
 
Precisely. Everything is easy if you just think about it.
 
V
--
I do not respond to top-posted replies, please don't ask
"Öö Tiib" <ootiib@hot.ee>: Jun 25 05:00AM -0700

On Thursday, 25 June 2015 13:03:22 UTC+3, Paul wrote:
> Checking if a linked list is circular is a standard problem.
 
Linked list is *rather* rarely used container in practice.
 
> My attempted solution is below. Could anyone advise
> on a smart pointer approach using unique_ptr or shared_ptr?
 
Navigation in containers goes with *iterators* in C++.
Do *not* use smart pointer for navigating around in
container, regardless if it is based on smart pointers.
Smart pointer is terrible for navigation.

> Also, please let me know if there are any problems with
> this approach.
 
I don't see any problems in the algorithm.
 
> I know it leaks memory but I'm looking to a
> smart-pointer implementation to fix that.
 
You can use raw pointers like crappy substitution
to iterators (assuming list is made with smart pointers)
No much rewrite needed to your algorithm:
 
Node* slowIterator = head.get();
...
slowIterator = slowIterator->next.get();
...
if ( /* ... */ fastIterator->next.get() == slowIterator )
...
Paul <pepstein5@gmail.com>: Jun 25 05:11AM -0700

On Thursday, June 25, 2015 at 1:00:50 PM UTC+1, Öö Tiib wrote:
> ...
> if ( /* ... */ fastIterator->next.get() == slowIterator )
> ...
 
Thanks Oo. But isn't it a problem to have new without delete? I thought that was a memory leak? Thanks.
 
Paul
"Öö Tiib" <ootiib@hot.ee>: Jun 25 06:28AM -0700

On Thursday, 25 June 2015 15:11:29 UTC+3, Paul wrote:
> > if ( /* ... */ fastIterator->next.get() == slowIterator )
> > ...
 
> Thanks Oo. But isn't it a problem to have new without delete? I thought that was a memory leak? Thanks.
 
Leaks were in your 'smallTests' function. It was indeed crap.
That you can get fixed by using smart pointer for 'next'.
 
I meant your 'isCycle' that did neither 'new' nor 'delete' and
so you should keep. Iterators should not manage the object
they navigate. Smart pointers (that automatically manage)
are therefore very bad iterators.
"Öö Tiib" <ootiib@hot.ee>: Jun 25 07:06AM -0700

On Thursday, 25 June 2015 16:28:35 UTC+3, Öö Tiib wrote:
 
> > Thanks Oo. But isn't it a problem to have new without delete? I thought that was a memory leak? Thanks.
 
> Leaks were in your 'smallTests' function. It was indeed crap.
> That you can get fixed by using smart pointer for 'next'.
 
Additional warning ... making reference cycle with smart
pointers prevents these from managing the memory properly.

It is because 'unique_ptr' denotes private ownership and
'shared_ptr' denotes shared ownership. Circular ownership
does not make sense and so these are not designed to
work in such a situation. So if you make cycles with those
then you should also break the cycles manually (but better
just avoid it).
Paul <pepstein5@gmail.com>: Jun 25 07:17AM -0700

On Thursday, June 25, 2015 at 2:28:35 PM UTC+1, Öö Tiib wrote:
> so you should keep. Iterators should not manage the object
> they navigate. Smart pointers (that automatically manage)
> are therefore very bad iterators.
 
I'm sorry but I'm not sure how to proceed to get the whole thing (including the test) to work. Should I simply rewrite the Node class so that next is a std::shared_ptr<Node>?
 
Many thanks
Paul
"Öö Tiib" <ootiib@hot.ee>: Jun 25 08:49AM -0700

On Thursday, 25 June 2015 17:17:44 UTC+3, Paul wrote:
 
> I'm sorry but I'm not sure how to proceed to get the whole
> thing (including the test) to work. Should I simply rewrite
> the Node class so that next is a std::shared_ptr<Node>?
 
It is lot better idea to add constructors and destructors to
'Node' to see when those are called.
Then rewrite your 'smallTests' in a way that it works and
does not leak 'Node's (nor does delete 'Node's twice) with
raw pointers.
 
Then save it and then try to achieve same with smart pointer
as 'next' in 'Node'. Cycles are slightly tricky to deal with smart
pointers and so you will find out what is better ... if to use
fully manual management with raw pointers or to use smart
pointers and compensate the shortcoming.
You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page.
To unsubscribe from this group and stop receiving emails from it send an email to comp.lang.c+++unsubscribe@googlegroups.com.

No comments: