Package org.biojavax.bio.seq.io
Class RichStreamReader
java.lang.Object
org.biojavax.bio.seq.io.RichStreamReader
- All Implemented Interfaces:
SequenceIterator, BioEntryIterator, RichSequenceIterator
Parses a stream into sequences.
This object implements SequenceIterator, so you can loop over each sequence
produced. It consumes a stream, and uses a SequenceFormat to extract each
sequence from the stream.
It is assumed that the stream contains only sequences that can be handled by a
single format, and that they are not separated other than by delimiters that
the format can handle.
Sequences are instantiated when they are requested by nextSequence, not
before, so it is safe to use this object to parse a gigabyte FASTA file and
process it sequence by sequence: RichStreamReader will not require you to keep
any of the sequences in memory.
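The lazy, one-record-at-a-time behaviour described above can be illustrated with a minimal sketch. It uses only the Java standard library and does not call any BioJava class; the names LazyFastaSketch, RecordReader and Record are hypothetical, and the FASTA handling is deliberately simplified (a line starting with ">" opens a record, every following line up to the next ">" is treated as residues). It is meant to show the hasNext/next pattern under those assumptions, not how RichStreamReader is implemented.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.NoSuchElementException;

// Hypothetical illustration only: not part of BioJava.
final class LazyFastaSketch {

    /** One parsed record: header text (without '>') and the concatenated residues. */
    static final class Record {
        final String header;
        final String residues;

        Record(String header, String residues) {
            this.header = header;
            this.residues = residues;
        }
    }

    /** Builds each record only when next() is called; buffers just the pending header line. */
    static final class RecordReader {
        private final BufferedReader in;
        private String pendingHeader; // next header line already read, or null at end of input

        RecordReader(BufferedReader in) {
            this.in = in;
            // Skip anything that precedes the first header line.
            this.pendingHeader = readUntilHeader(null);
        }

        boolean hasNext() {
            return pendingHeader != null;
        }

        Record next() {
            if (pendingHeader == null) {
                throw new NoSuchElementException("no more records in the stream");
            }
            String header = pendingHeader.substring(1).trim();
            StringBuilder residues = new StringBuilder();
            pendingHeader = readUntilHeader(residues);
            return new Record(header, residues.toString());
        }

        /** Reads lines until the next header or end of input, appending them to sink if given. */
        private String readUntilHeader(StringBuilder sink) {
            try {
                String line;
                while ((line = in.readLine()) != null && !line.startsWith(">")) {
                    if (sink != null) {
                        sink.append(line.trim());
                    }
                }
                return line;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    /** Sums residue counts; each record becomes unreachable once its length has been added. */
    static long totalResidues(BufferedReader in) {
        RecordReader reader = new RecordReader(in);
        long total = 0;
        while (reader.hasNext()) {
            total += reader.next().residues.length();
        }
        return total;
    }

    public static void main(String[] args) {
        String fasta = ">seq1\nACGT\nAC\n>seq2\nGG\n";
        System.out.println(totalResidues(new BufferedReader(new StringReader(fasta))));
    }
}
```

Because the loop holds a reference to at most one record at a time, memory use depends on the largest single record rather than on the size of the whole file. The same loop shape applies to RichStreamReader, with hasNext guarding each call to nextSequence.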
- Since:
- 1.5
- Author:
- Matthew Pocock, Thomas Down, Richard Holland
-
Constructor Summary
Constructors
- RichStreamReader(BufferedReader reader, RichSequenceFormat format, SymbolTokenization symParser, RichSequenceBuilderFactory sf, Namespace ns)
Creates a new stream reader on the given reader, which will attempt to read sequences in the given format, having symbols from the given tokenization, and pass them to the given factory to be transformed into RichSequence objects in the given namespace.
- RichStreamReader(InputStream is, RichSequenceFormat format, SymbolTokenization symParser, RichSequenceBuilderFactory sf, Namespace ns)
Creates a new stream reader on the given input stream, which will attempt to read sequences in the given format, having symbols from the given tokenization, and pass them to the given factory to be transformed into RichSequence objects in the given namespace.
-
Method Summary
Methods
- boolean hasNext()
Returns whether there are more sequences to iterate over.
- nextSequence()
Returns the next sequence in the iterator.
-
Constructor Details
-
RichStreamReader
public RichStreamReader(InputStream is, RichSequenceFormat format, SymbolTokenization symParser, RichSequenceBuilderFactory sf, Namespace ns)
Creates a new stream reader on the given input stream, which will attempt to read sequences in the given format, having symbols from the given tokenization, and pass them to the given factory to be transformed into RichSequence objects in the given namespace.
- Parameters:
is - the input stream to read from
format - the input file format
symParser - the tokenizer that understands the sequence symbols in the file
sf - the factory that will build the sequences
ns - the namespace the sequences will be loaded into
-
RichStreamReader
public RichStreamReader(BufferedReader reader, RichSequenceFormat format, SymbolTokenization symParser, RichSequenceBuilderFactory sf, Namespace ns)
Creates a new stream reader on the given reader, which will attempt to read sequences in the given format, having symbols from the given tokenization, and pass them to the given factory to be transformed into RichSequence objects in the given namespace.
- Parameters:
reader - the reader to read from
format - the input file format
symParser - the tokenizer that understands the sequence symbols in the file
sf - the factory that will build the sequences
ns - the namespace the sequences will be loaded into
-
-
Method Details
-
nextSequence
Returns the next sequence in the iterator.
- Specified by:
nextSequence in interface SequenceIterator
- Returns:
the next Sequence
- Throws:
NoSuchElementException - if you call nextSequence when hasNext returns false
BioException - if for any reason the sequence could not be retrieved
-
nextBioEntry
- Specified by:
nextBioEntry in interface BioEntryIterator
- Throws:
NoSuchElementException
BioException
-
nextRichSequence
- Specified by:
nextRichSequence in interface RichSequenceIterator
- Throws:
NoSuchElementException
BioException
-
hasNext
Returns whether there are more sequences to iterate over.
- Specified by:
hasNext in interface BioEntryIterator
- Specified by:
hasNext in interface SequenceIterator
- Returns:
true if there are more sequences to get and false otherwise
-