Java BufferedReader back to the top of a text file?

By | July 12, 2018
Questions:

I currently have 2 BufferedReaders initialized on the same text file. When I’m done reading the text file with the first BufferedReader, I use the second one to make another pass through the file from the top. Multiple passes through the same file are necessary.

I know about reset(), but it must be preceded by a call to mark(), and mark() needs to know the size of the file, something I don’t think I should have to bother with.

Ideas? Packages? Libs? Code?

Thanks
TJ

Answers:

What’s the disadvantage of just creating a new BufferedReader to read from the top? I’d expect the operating system to cache the file if it’s small enough.

If you’re concerned about performance, have you proved it to be a bottleneck? I’d just do the simplest thing and not worry about it until you have a specific reason to. I mean, you could just read the whole thing into memory and then do the two passes on the result, but again that’s going to be more complicated than just reading from the start again with a new reader.
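
As a minimal sketch of the "just open a new reader" approach: each pass below opens its own BufferedReader, so it always starts at the top of the file. The class name TwoPasses, the two helper methods, and the temp file contents are assumptions made up for illustration, not anything from the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class TwoPasses {
    // First pass: count the lines. A fresh reader starts at the top of the file.
    static long countLines(Path file) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            long count = 0;
            while (reader.readLine() != null) {
                count++;
            }
            return count;
        }
    }

    // Second pass: find the longest line, again with a brand-new reader.
    static int longestLine(Path file) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            int longest = 0;
            String line;
            while ((line = reader.readLine()) != null) {
                longest = Math.max(longest, line.length());
            }
            return longest;
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample data written to a temp file so the sketch is self-contained.
        Path file = Files.createTempFile("two-passes", ".txt");
        Files.write(file, Arrays.asList("alpha", "be", "gamma!"), StandardCharsets.UTF_8);
        System.out.println("lines: " + countLines(file));
        System.out.println("longest: " + longestLine(file));
        Files.delete(file);
    }
}
```

If the file comfortably fits in memory, Files.readAllLines(file) followed by two loops over the resulting list is the in-memory variant mentioned above.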

Answers:

BufferedReaders are meant to read a file sequentially. What you are looking for is java.io.RandomAccessFile; its seek() method takes you to any position in the file.

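A RandomAccessFile can be used like so:

import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.RandomAccessFile;

String fileName = "c:/myraffile.txt";
// "r" opens the file read-only; try-with-resources closes it when done.
try (RandomAccessFile raf = new RandomAccessFile(fileName, "r")) {
     raf.readLine();  // read something (readLine() maps each byte to one char)
     raf.seek(0);     // jump back to the first byte of the file
} catch (FileNotFoundException e) {
     // the file does not exist or cannot be opened
     e.printStackTrace();
} catch (IOException e) {
     // reading or seeking failed
     e.printStackTrace();
}

The "r" is the access mode (read-only); "rw" opens the file for reading and writing and creates it if it does not exist. The modes are detailed in the RandomAccessFile Javadoc.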

The sequential readers are set up this way so that they can manage their own buffers and so that nothing changes beneath their feet. For example, the FileReader given to a BufferedReader should only be operated on by that BufferedReader. If something else could also advance it, you would get inconsistent behavior: one reader moves the position in the FileReader while the other expects it to stay put, and when you switch to the other reader it is at an undetermined location.
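
To make that inconsistency concrete, here is a small sketch in which two BufferedReaders wrap the same underlying reader. A StringReader stands in for the FileReader so the example is self-contained; the class and method names are assumptions for illustration only.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class SharedSourceDemo {
    // Reads one line through each of two BufferedReaders that share one source.
    static String[] readFromBoth(String content) throws IOException {
        Reader shared = new StringReader(content);
        BufferedReader a = new BufferedReader(shared);
        BufferedReader b = new BufferedReader(shared);

        // a fills its own buffer, pulling far more than one line from the shared source.
        String fromA = a.readLine();
        // b only sees what a left behind; for short content that is nothing at all.
        String fromB = b.readLine();
        return new String[] { fromA, fromB };
    }

    public static void main(String[] args) throws IOException {
        String[] result = readFromBoth("first\nsecond\n");
        System.out.println("a read: " + result[0]);
        System.out.println("b read: " + result[1]);
    }
}
```

With content this short, the first reader buffers everything, so the second reader hits end of stream immediately even though only one line has been returned to the caller.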

Answers:

The best way to proceed is to change your algorithm so that it does NOT need the second pass. I have used this approach a couple of times when I had to deal with huge (but not terrible, i.e. a few GB) files that didn’t fit in the available memory.

It might be hard, but the performance gain is usually worth the effort.
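
The answer does not say which algorithm was reworked, so the following is only an illustration of the general idea, under made-up assumptions: a file with one number per line, for which we want the mean and the variance. The textbook method reads the data twice (mean first, then deviations); Welford's running update gets both in one pass. The class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class OnePassStats {
    // Returns {mean, population variance} of the numbers (one per line) in a single pass.
    static double[] meanAndVariance(BufferedReader reader) throws IOException {
        long n = 0;
        double mean = 0.0;
        double m2 = 0.0; // running sum of squared deviations from the current mean
        String line;
        while ((line = reader.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            double x = Double.parseDouble(line);
            n++;
            double delta = x - mean;
            mean += delta / n;
            m2 += delta * (x - mean); // uses the updated mean
        }
        double variance = n > 0 ? m2 / n : 0.0;
        return new double[] { mean, variance };
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for the file so the sketch runs on its own.
        BufferedReader reader = new BufferedReader(new StringReader("2\n4\n4\n4\n5\n5\n7\n9\n"));
        double[] stats = meanAndVariance(reader);
        System.out.println("mean=" + stats[0] + " variance=" + stats[1]);
    }
}
```

Whether a single-pass reformulation exists depends entirely on what the two passes compute, so treat this as one example of the pattern rather than a general recipe.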

Answers:

About mark/reset:

The mark method in BufferedReader takes a readAheadLimit parameter, which limits how far you can read after a mark before reset becomes impossible. Resetting doesn’t actually perform a file system seek(0); it just moves back to the marked position inside the buffer. To quote the Javadoc:

readAheadLimit – Limit on the number of characters that may be read while still preserving the mark. After reading this many characters, attempting to reset the stream may fail. A limit value larger than the size of the input buffer will cause a new buffer to be allocated whose size is no smaller than limit. Therefore large values should be used with care.
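
A minimal sketch of what that means in practice: to make two full passes with mark/reset, the limit has to cover the whole input, and the reader then holds all of it in its buffer. The class and method names are assumptions for illustration, and a StringReader stands in for the file; the limit is set to one more than the content length here, since a reset after reading exactly readAheadLimit characters is not guaranteed to succeed.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class MarkResetDemo {
    // Counts the lines twice, rewinding with reset() between the two passes.
    static int[] countLinesTwice(BufferedReader reader, int readAheadLimit) throws IOException {
        reader.mark(readAheadLimit); // remember the current position (the top)

        int first = 0;
        while (reader.readLine() != null) {
            first++;
        }

        reader.reset(); // back to the mark; throws IOException if the mark was invalidated

        int second = 0;
        while (reader.readLine() != null) {
            second++;
        }
        return new int[] { first, second };
    }

    public static void main(String[] args) throws IOException {
        String content = "hello\nworld\n";
        BufferedReader reader = new BufferedReader(new StringReader(content));
        // The limit must be larger than everything read between mark() and reset().
        int[] counts = countLinesTwice(reader, content.length() + 1);
        System.out.println(counts[0] + " lines, then " + counts[1] + " lines again");
    }
}
```

For a real file this amounts to buffering the entire file in memory, which is why the Javadoc warns that large values should be used with care.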

Answers:

“The whole business about mark() and reset() in BufferedReader smacks of poor design.”

Why don’t you extend this class, have it call mark() in the constructor, and then call reset() in a topOfFile() method? (BufferedReader has no seek(); reset() is what returns to the mark.)
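
A rough sketch of such a subclass, assuming reset() is the intended way back to the top since BufferedReader has no seek(). The class name RewindableReader and the constructor's readAheadLimit parameter are hypothetical; the caller still has to supply a limit larger than the number of characters in the input, so this does not remove the need to know the file size.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class RewindableReader extends BufferedReader {
    // Marks the very first position so that topOfFile() can return to it later.
    RewindableReader(Reader in, int readAheadLimit) throws IOException {
        super(in);
        mark(readAheadLimit);
    }

    // Rewinds to the mark set in the constructor; fails if more than the limit was read.
    void topOfFile() throws IOException {
        reset();
    }

    public static void main(String[] args) throws IOException {
        String content = "one\ntwo\n";
        // A StringReader stands in for a FileReader to keep the sketch self-contained.
        try (RewindableReader reader =
                 new RewindableReader(new StringReader(content), content.length() + 1)) {
            System.out.println(reader.readLine());
            System.out.println(reader.readLine());
            reader.topOfFile();
            System.out.println(reader.readLine());
        }
    }
}
```

Because the mark keeps everything read so far in the buffer, this has the same memory cost as reading the whole file up front.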

BR,
~A
