Saturday, September 19, 2015

Digest for comp.lang.c++@googlegroups.com - 2 updates in 1 topic

Paul <pepstein5@gmail.com>: Sep 19 01:56PM -0700

The following question was posted at http://www.careercup.com/question?id=14424684 --
 
Given API:
int Read4096(char* buf);
It reads data from a file and records the position so that the next time when it is called it read the next 4k chars (or the rest of the file, whichever is smaller) from the file.
The return is the number of chars read.
 
Todo: Use above API to Implement API
"int Read(char* buf, int n)" which reads any number of chars from the file.
 
Below is a solution offered by someone posting at that site. I have questions about this as follows. Some of these cases seem anomalous, namely num_read == 0 total != n-remain num_read < remain. Do these cases correspond to failed attempts to read the file? I think I spot some errors. Clearly read4096 should be Read4096. Also I think read4096(readbuf) should be Read4096(buf). I would also make num_copy simply remain instead of min(num_read, remain).
 
Is the basic idea correct though? Any other comments on this code?
 
Many thanks for your help,
 
Paul
 
 
CODE FOLLOWS BELOW ******************************
int read(char* buf, int n){
int num_it = n / 4096;
int remain = n % 4096;
int total = 0;
 
while(num_it--){
int num_read = read4096(buf);
if(num_read == 0) break;
buf += num_read;
total += num_read;
}
 
if(total != n-remain) return total;
 
char readbuf[4096];
int num_read = read4096(readbuf);
int num_copy = min(num_read, remain);
copy(readbuf, readbuf + num_copy, buf);
total += num_copy;
 
return total;
}
jt@toerring.de (Jens Thoms Toerring): Sep 19 10:49PM

> to failed attempts to read the file? I think I spot some errors. Clearly
> read4096 should be Read4096. Also I think read4096(readbuf) should be
> Read4096(buf).
 
No, definitely not. There's a large chance that 'buf' has not
enough space left to accept another 4096 bytes at this stage
but just 'remain' bytes. To avoid that you need a buffer that
can accept the full 096 bytes.
 
> I would also make num_copy simply remain instead of
> min(num_read, remain).
 
No, because you just want to append to 'buf' as much as was
requested, not everything that may have been read (which
might be more than still fits into 'buf').
 
Both your proposed changes could result in writing past the
end of 'buf' which must be avoided at all costs! That's the
raw material for buffer-overflow attacks, i.e. it not only
makes your program behave in unpredictable ways but may even
allow the "bad guys" to wreak havoc by carefully crafted files
they may get you (or some user of your progam) to read in.
 
> buf += num_read;
> total += num_read;
> }
 
This tries to read as many 4096-byte chunks as can be
read from the file (modulo some typing errors like
'read4096' versus 'Read4096').
 
> if ( total != n - remain )
> return total;
 
Here things start to get a bit strange. If 'n' is an integer
multiple of 4096 (thus 'remain' is 0 ) and everything went
well one probably should also stop here and not attempt
to read anymore from the file. The code only seems to catch
cases of read errors (or less in the file than requested).
But that could be fixed easily by using '<=' instead of '!='
in the if condition.
 
The case that 'n' is smaller than 4096 though is handled
correctly - in that case the loop never gets run, thus
'total' is still 0 and so is 'n-remain' - thus the fol-
lowing code will get executed.
 
> int num_copy = min( num_read, remain );
> copy( readbuf, readbuf + num_copy, buf );
> total += num_copy;
 
I don't see anything broken here (assuming that somewhere
before there was a line with 'using std;').
 
> return total;
> }
 
What's missing from the specifications is if it should
be possible to call this function several times. The
way it's written it can only be called once - or data
from the file may get lost. If it's to be called again
and again (e.g. with small values of 'n' for a large
file) the stuff read into 'readbuf' would have to be
stored (e.g. by making 'readbuf' a static variable) and
it would also have to be recorded in another static va-
riable how much from the last read in the last call has
not yet been returned to caller, and dealing with that on
entry into the function.
 
Since Read4096() is obviously meant to be called several
times to read in a file in chunks of that size I would
tend to assume that the same is also intended with this
function. But that won't work when bytes read from the
file but not yet returned to the caller are simply dis-
carded. So the function, as it is, satisfies the requi-
rements by the letter, but probably isn't what was ex-
pected.
Regards, Jens
--
\ Jens Thoms Toerring ___ jt@toerring.de
\__________________________ http://toerring.de
You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page.
To unsubscribe from this group and stop receiving emails from it send an email to comp.lang.c+++unsubscribe@googlegroups.com.

No comments: