soft and program: Digest for comp.lang.c++@googlegroups.com

comp.lang.c++@googlegroups.com

Google Groups

std::rotate substitute - 2 Updates
Concatenating strings efficiently - 12 Updates
Multi-compare operators - 1 Update
CAlive will introduce pre and post blocks - 5 Updates

bitrex <user@example.net>: Oct 18 07:23PM -0400

I need an implementation of rotate for a microprocessor platform where I
have C++11 compiler but really nothing else available to work with. I'd
like to be able to rotate the elements of a short array of characters of
arbitrary length e.g. ABCD -> BCDA -> CDAB etc.

Something like the following should be good enough, but what's good to
use for "swap"? "algorithm" is a non-starter here, too.

#include <algorithm>

string rotate(string s) {
for (int i = 1; i < s.size(); i++)
swap(s[i-1], s[i]);
return s;
}

bitrex <user@example.net>: Oct 18 07:27PM -0400

On 10/18/2018 07:23 PM, bitrex wrote:
> swap(s[i-1], s[i]);
> return s;
> }

Perhaps the old "hack":

#define SWAP(a, b) (((a) ^= (b)), ((b) ^= (a)), ((a) ^= (b)))

?

Concatenating strings efficiently

bitrex <user@example.net>: Oct 17 10:49PM -0400

On 10/17/2018 05:43 AM, Paul wrote:
> A popular coding-interview question is copy-pasted below.

> String Compression: Implement a method to perform basic string compression using the counts of repeated characters. For example, the string aabcccccaaa would become a2blc5a3. You can assume the string has only uppercase and lowercase letters (a - z).

Interviewer has to be a big JERK to ask to store "b1" instead of just b"
and call it a "compression" algorithm, lol. or "a2" for that matter...

this is what I would call an "ultra-naive" solution I suppose:

#include <deque>
#include <stack>
#include <iostream>

std::string encode(const std::string& long_string) {
if (long_string.empty()) {
return long_string;
}

std::deque<std::stack<char>> stack_queue;
stack_queue.emplace_back(std::stack<char>{});
stack_queue.back().push(long_string.at(0)); // push first character

for (auto foo_it = long_string.begin() + 1; foo_it != long_string.end();
++foo_it) {
if (stack_queue.back().top() == *foo_it) {
stack_queue.back().push(*foo_it);
} else {
stack_queue.emplace_back(std::stack<char>{});
stack_queue.back().push(*foo_it);
}
}

std::string result;

for (const auto& stack : stack_queue) {
switch (stack.size()) {
case 2:
result += stack.top();
case 1:
result += stack.top();
break;

default:
result += stack.top();
result += std::to_string(stack.size());
}
}

return result;
}

int main() {
std::string foostring = "abbcccddddeeeeeffffff";
std::cout << encode(foostring) << std::endl;
}

Juha Nieminen <nospam@thanks.invalid>: Oct 18 08:03AM

>> String Compression: Implement a method to perform basic string compression using the counts of repeated characters. For example, the string aabcccccaaa would become a2blc5a3. You can assume the string has only uppercase and lowercase letters (a - z).

> Interviewer has to be a big JERK to ask to store "b1" instead of just b"
> and call it a "compression" algorithm, lol. or "a2" for that matter...

Where do you see a "b1"?

Juha Nieminen <nospam@thanks.invalid>: Oct 18 08:05AM

> way of approaching this problem (which will be tackled by users of other
> languages, too) assumes that operations like += are inefficient because
> s+="12" involves moving everything in the string.

If you were to create this kind of "compression" format, surely you can
spare the first few characters to store the length of the original string?

If not, you can traverse the "compressed" string once and resolve the
original length that way. (Of course you would have to make measurements
to see if it actually makes it faster.)

Ian Collins <ian-news@hotmail.com>: Oct 18 09:09PM +1300

On 18/10/18 15:49, bitrex wrote:

>> String Compression: Implement a method to perform basic string compression using the counts of repeated characters. For example, the string aabcccccaaa would become a2blc5a3. You can assume the string has only uppercase and lowercase letters (a - z).

> Interviewer has to be a big JERK to ask to store "b1" instead of just b"
> and call it a "compression" algorithm, lol. or "a2" for that matter...

https://en.wikipedia.org/wiki/Run-length_encoding

--
Ian.

"Öö Tiib" <ootiib@hot.ee>: Oct 18 06:55AM -0700

On Wednesday, 17 October 2018 18:19:34 UTC+3, Manfred wrote:
> > by factor of 2 is what that "packing" usually
> > achieves.

> In fact that is the worst case length of the encoded string.

That is perfect, then it is sure that there won't be reallocations.
For example if to compress the text of current post with that
algorithm then it will result in bit shorter than 2 times longer
size but the difference would be 9 bytes or so. If that matters
then we can use result.shrink_to_fit() at end.

scott@slp53.sl.home (Scott Lurndal): Oct 18 02:51PM

>> Interviewer has to be a big JERK to ask to store "b1" instead of just b"
>> and call it a "compression" algorithm, lol. or "a2" for that matter...

>Where do you see a "b1"?

probably depends on the font the NNTP client is using. Looks like bee ell
with the font -adobe-courier-medium-r-normal--11-80-100-100-m-60-iso8859-1.

"Öö Tiib" <ootiib@hot.ee>: Oct 18 07:56AM -0700

On Thursday, 18 October 2018 05:49:50 UTC+3, bitrex wrote:

> > String Compression: Implement a method to perform basic string compression using the counts of repeated characters. For example, the string aabcccccaaa would become a2blc5a3. You can assume the string has only uppercase and lowercase letters (a - z).

> Interviewer has to be a big JERK to ask to store "b1" instead of just b"
> and call it a "compression" algorithm, lol. or "a2" for that matter...

I also think that packing one character as "b1" is a waste.
However indicating 2 can be profitable. Majority of char strings that
we ever face are UTF8 encoded. Then "☠☠" could be compressed
as "☠2". That is shorter by two bytes (because "☠" takes three
bytes in UTF8).

What could still cause increase of result would be number characters
in input text. These should be escaped with something, for example
with '\xFE' or '\xFF' that are only used by its BOM that uses both and
is usually missing anyway.

bitrex <user@example.net>: Oct 18 02:38PM -0400

On 10/18/2018 10:56 AM, Öö Tiib wrote:
> in input text. These should be escaped with something, for example
> with '\xFE' or '\xFF' that are only used by its BOM that uses both and
> is usually missing anyway.

Interview questions can be little tricky, I think it's fine to ask
"write a program that encodes a string this way: a2b1c5a3" without
further commentary on some hypothetical real-world purpose, as a question.

Yes interviewer can say "this is a string-compression algorithm you're
writing" yeah a SHITTY string-compression algorithm. One doesn't really
want to get into a debate over run-length encoding strategies with an
interviewer, however I don't think. So one is left to guess whether they
really mean what they say.

If I'm ever in the interviewer-position I think I'll try to keep it
abstract and not add a lot of my own verbage as to how the test might
relate to some real world task it cuts down on chance for
mis-interpretation.

I also think interviewers who e.g. knock off points for a solution that
say, uses a loop to sum the first ten integers rather than the
closed-form summation formula is lame. Quick-and-correct solution in
interview situation should beat clever-and-possibly-error-prone
solution. If interviewer wants clever/efficient then interviewer should
ask for that, specifically. i.e. in the OP's post the subject says
"efficient" but the text of the post says nothing about efficiency
requirement.

Maybe this seems a little nitpicky and it may be but damn people lose
out on good jobs they want over stuff like this!

jameskuyper@alumni.caltech.edu: Oct 18 12:11PM -0700

On Thursday, October 18, 2018 at 4:03:48 AM UTC-4, Juha Nieminen wrote:
> bitrex <user@example.net> wrote:
> >> String Compression: Implement a method to perform basic string compression using the counts of repeated characters. For example, the string aabcccccaaa would become a2blc5a3. You can assume the string has only uppercase and
lowercase letters (a - z).

> > Interviewer has to be a big JERK to ask to store "b1" instead of just b"
> > and call it a "compression" algorithm, lol. or "a2" for that matter...

> Where do you see a "b1"?

The input string doesn't contain any 'l' characters, so using 'l' rather than '1' was probably a typo.

"Chris M. Thomasson" <invalid_chris_thomasson@invalid.invalid>: Oct 18 01:42PM -0700

On 10/17/2018 7:49 PM, bitrex wrote:
>> has only uppercase and lowercase letters (a - z).

> Interviewer has to be a big JERK to ask to store "b1" instead of just b"
> and call it a "compression" algorithm, lol. or "a2" for that matter...

;^)

The "a2" just makes the decoder "slower". Imho, "aa" versus "a2" equal
no savings, and just makes the logic more complicated. However, "aaa"
versus "a3" _does_ make things a little smaller...

[...]

bitrex <user@example.net>: Oct 18 06:02PM -0400

On 10/18/2018 04:42 PM, Chris M. Thomasson wrote:
> no savings, and just makes the logic more complicated. However, "aaa"
> versus "a3" _does_ make things a little smaller...

> [...]

It was probably done that way in like, friggin', 1967 as someone pointed
out with a wikipedia entry which mentioned run-length-encoding of analog
B&W television signals because the computer doing it had 12 bytes of RAM
and anything more sophisticated needed a government UNIVAC machine to
implement.

why am I implementing an encoding scheme for black and white television
on this test?!?!??!!?

bitrex <user@example.net>: Oct 18 06:08PM -0400

On 10/18/2018 04:42 PM, Chris M. Thomasson wrote:
> no savings, and just makes the logic more complicated. However, "aaa"
> versus "a3" _does_ make things a little smaller...

> [...]

I like avoiding solutions for any problem that involve raw integer
counters being incremented and/or decremented like in the original
solution. Raw counters are the flippin' devil!

Multi-compare operators

Jorgen Grahn <grahn+nntp@snipabacken.se>: Oct 18 08:03PM

On Sat, 2018-09-22, Öö Tiib wrote:
> On Saturday, 22 September 2018 15:50:44 UTC+3, David Brown wrote:
...
> FORTRAN that received and keeps receiving major curses to this day.
> People want to feel clever and I have no right to blame them since
> I want sometimes to feel clever too. ;)

Yes -- the standard fate of C++ features seems to me 5--10 years of
misuse[0], before we can agree what they're good for.

/Jorgen

[0] Plus 15 years of maintenance of code bases containing misuse.

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

CAlive will introduce pre and post blocks

"Rick C. Hodgin" <rick.c.hodgin@gmail.com>: Oct 17 08:39PM -0400

In an effort to augment existing software, and to better
document new software, and to allow a path to upgrade and
workaround existing implementation limitations that must
be dealt with for legacy purposes, CAlive will introduce
the ability to have pre { } and post { } blocks to code.

These can work on classes or structures, introducing new
storage to existing fixed structs which must pass through
legacy software in some way. And, they can be tagged to
any flow control / logic block to associate the code with
the block, but not fundamentally alter its operation. It
allows the prologue and epilogue code related to preparation
and clean- up to some block to be associated with it:

pre {
// Do some initialization here
// Allocate some things for the loop

} for (i = 0; i < 10; ++i) {
// Do the loop code

} post {
// Cleanup
}

Those three sections of code are now linked together for auto-
matic documentation purposes. Name the any of those blocks,
and an automatic outline is created.

-----
For classes or structs, it allows new fields to be added to
an existing design, to allow the new design to be passed and
referenced by its existing definition by legacy software that
does not know about its new features, and it allows the new
software that is aware of the new fields to use them as they
are silently passed through by allowing the extra data payload
to be transmitted using the traditional structure name.

For pointer math, using the standard ++ptr will cause it to
move forward / backward only by the sizeof(SWhatever), which
will cause it to fail if sequential data is stored using this
new format in a legacy app, but for single-instance pointers
that are transmitted through a legacy API and returned via a
callback, it will provide the larger payload unaware, while
maintaining the full original functionality.

The general format is:

// Traditional declaration
struct SWhatever
{
int c;
int d;
int e;
};

Assuming 32-bit integers, accesses are relative to the standard
SWhatever* ptr definition and natural address:

// With CAlive's pre and post declarations:
pre {
int a; // Accessed internally at at -8
int b; // Accessed internally at at -4

} struct SWhatever {
int c; // Accessed internally at at 0
int d; // Accessed internally at at 4
int e; // Accessed internally at at 8

} post {
int f; // Accessed internally at at 12
int g; // Accessed internally at at 16
int h; // Accessed internally at at 20
};

You can also use another syntax with labels (in any order,
and with multiple of each as needed):

struct SWhatever
{
pre:
int a;
int b;

data:
int c;
int d;
int e;

post:
int f;
int g;
int h;
};

When referencing sizeof(SWhatever), it shows only the size
of the three objects c, d, and e. When using either
presizeof(SWhatever) or postsizeof(SWhatever) it shows the
sizeof(SWhatever) plus the sizeof(SWhatever.pre) or the
sizeof(SWhatever.post).

If you need the new size of the entire structure, which
includes the pre and post blocks, then use allsizeof().
To allocate memory for an SWhatever structure, but using
the full pre/post block memory size, use allnew and the
alldelete to release it.

-----
For new pointer math, you can also use the underscore-bang
_! operator to move forward or backward from an "SWhatever*
ptr" declaration by the full quantity (including any pre/post
block bytes).

By placing the underscore before, and the bang after, it en-
capsulates the ptr variable, allowing pre- and post-blocks
to be navigated with pointer math:

SWhatever* ptr = allnew(SWhatever);

++ptr; // Would only skip forward by sizeof(SWhatever)
++_ptr!; // Skips forward by allsizeof(SWhatever)

alldelete ptr;

Mechanically, it's the equivalent of:

char* p = malloc(sizeof(SWhatever.pre) +
sizeof(SWhatever) +
sizeof(SWhatever.post));
SWhatever* ptr = (SWhatever*)(p + sizeof(SWhatever.pre));
// ptr now is pointing to the c member, and you could
// access data relative to it as needed into the pre and
// post block data areas.

--
Rick C. Hodgin

Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Oct 18 05:07PM +0100

On 18/10/2018 01:39, Rick C. Hodgin wrote:
[terrible idea snipped]

What has this terrible idea got to do with C++ exactly?

/Flibble

--
"Suppose it's all true, and you walk up to the pearly gates, and are
confronted by God," Bryne asked on his show The Meaning of Life. "What
will Stephen Fry say to him, her, or it?"
"I'd say, bone cancer in children? What's that about?" Fry replied.
"How dare you? How dare you create a world to which there is such misery
that is not our fault. It's not right, it's utterly, utterly evil."
"Why should I respect a capricious, mean-minded, stupid God who creates a
world that is so full of injustice and pain. That's what I would say."

"Rick C. Hodgin" <rick.c.hodgin@gmail.com>: Oct 18 10:09AM -0700

On Thursday, October 18, 2018 at 12:07:51 PM UTC-4, Mr Flibble wrote:
> On 18/10/2018 01:39, Rick C. Hodgin wrote:
> What has this terrible idea got to do with C++ exactly?

I believe you are significantly biased against me, Leigh. And
as such, if you don't already see the correlation, I doubt I'll
be able to explain it to you. But, here goes:

CAlive seeks to gain an audience from C and C++ developers since
it also has exception handling, the basic class, and features
familiar to C and C++ authors. What it does not have is the com-
plexity of C++. It essentially makes C a better C, introducing
a host of new features that others are welcome to take and incor-
porate into C or C++.

It is under development now and is slated for release in 2019 or
2020, preferably around Christmas 2019.

These posts are designed to spawn thought in individuals toward
new possibilities, and to get existing compiler authors to possibly
add those new features to existing compilers, and not necessarily
based on my ideas, but based on the thought which ensues helping
propel them to improve and increase their craft further.

> --
> [8 line signature snipped]

If you refer to this RFC 1855 document from October 1995:

Netiquette Guidelines
https://tools.ietf.org/html/rfc1855

And search for this portion, you'll find a reference to how long
signature line should be:

"If you include a signature keep it short. Rule of thumb
is no longer than 4 lines. Remember that many people pay
for connectivity by the minute, and the longer your message
is, the more they pay."

You seem to be an enforcer of netiquette, but are you also a hyp-
ocrite? Since you don't follow God and take your cues from Him,
do you just then glom on to whatever anti-Christ idea generates
in your own mind, and run with that as acceptable behavior? You
are essentially making an anti-God statement with every post,
and while it is in your signature line ... it's an ongoing state-
ment with regards to "religion" (from your point of view).

For someone who doesn't believe in God, you sure have a lot to
say about the subject. I wonder why?

The Bible speaks about people doing what's right in their own
eyes without regard to a higher authority:

https://www.biblegateway.com/passage/?search=Judges+17%3A6&version=KJV

6 In those days there was no king in Israel, but every man
did that which was right in his own eyes.

You have no unified King and are obeying what you feel is right
in your own eyes, Leigh. You are at war with others in the group,
as well as God. Such a position did not bode well for Israel at
any point in their history, and it will not bode well for you in
the end.

-----
Think about it, sir. You are better than complacency and ignor-
ance. You have an incredibly gifted mind and can rise above all
manner of falseness ... if you give yourself the opportunity.

--
Rick C. Hodgin

Mr Flibble <flibbleREMOVETHISBIT@i42.co.uk>: Oct 18 06:37PM +0100

On 18/10/2018 18:09, Rick C. Hodgin wrote:
> plexity of C++. It essentially makes C a better C, introducing
> a host of new features that others are welcome to take and incor-
> porate into C or C++.

Just because you want C++ developers to see your folly it doesn't make it
on topic here as it has nothing to do with C++ and you are deluded if you
think any of its egregious features (such as the one this thread is about)
will make it into the C++ language.

[dross snipped]
/Flibble

--
"Suppose it's all true, and you walk up to the pearly gates, and are
confronted by God," Bryne asked on his show The Meaning of Life. "What
will Stephen Fry say to him, her, or it?"
"I'd say, bone cancer in children? What's that about?" Fry replied.
"How dare you? How dare you create a world to which there is such misery
that is not our fault. It's not right, it's utterly, utterly evil."
"Why should I respect a capricious, mean-minded, stupid God who creates a
world that is so full of injustice and pain. That's what I would say."

"Rick C. Hodgin" <rick.c.hodgin@gmail.com>: Oct 18 11:04AM -0700

On Thursday, October 18, 2018 at 1:37:49 PM UTC-4, Mr Flibble wrote:
> ... your folly ... you are deluded ... egregious features ...

Well it's nice to know I was right ... you are significantly
biased against me. :-)

What happened to your GigaNews account? I thought you worked
out a deal with them to keep your account.

--
Rick C. Hodgin

You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page.
To unsubscribe from this group and stop receiving emails from it send an email to comp.lang.c+++unsubscribe@googlegroups.com.

soft and program

Thursday, October 18, 2018

Digest for comp.lang.c++@googlegroups.com - 20 updates in 4 topics

No comments:

Blog Archive

About Me