Class UTF8Reader

  • All Implemented Interfaces:
    java.io.Closeable, java.lang.AutoCloseable, java.lang.Readable

    public final class UTF8Reader
    extends java.io.Reader
    Class for reading characters from streams encoded in the modified UTF-8 format.

    Note that we often operate on a special Derby stream. A Derby stream may differ from a "normal" stream in two ways: an encoded length is inserted at the head of the stream, and, if the encoded length is 0, a Derby-specific end-of-stream marker is appended to the data.

    If the underlying stream is capable of repositioning itself on request, this class supports multiple readers on the same source stream in such a way that the readers do not interfere with each other (except that access is serialized). Each reader instance keeps its own pointer into the stream and requests that the stream reposition itself before calling read/skip on the stream.
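
    As a rough illustration of the modified UTF-8 format, the sketch below uses the JDK's own DataOutputStream.writeUTF and DataInputStream.readUTF. The JDK prefixes the data with a 2-byte length, which is similar in spirit to the encoded length described above, but it is an assumption that this resembles the Derby header, and the Derby end-of-stream marker is not modelled at all. The class name ModifiedUtf8Demo and its helpers are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class; not part of Derby.
final class ModifiedUtf8Demo {

    // Encodes with the JDK format: 2-byte length, then modified UTF-8 bytes.
    static byte[] encode(String s) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Decodes bytes produced by encode(String).
    static String decode(byte[] encoded) {
        try (DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(encoded))) {
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Formats bytes as space-separated lowercase hex pairs.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < data.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", data[i] & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 'A' takes one byte, U+0000 takes two (c0 80), U+00E9 takes two.
        byte[] encoded = encode("A\0\u00e9");
        System.out.println(toHex(encoded)); // prints "00 05 41 c0 80 c3 a9"
        System.out.println(decode(encoded).length()); // prints "3"
    }
}
```

    The point to note is that U+0000 is written as the two bytes c0 80 rather than a single zero byte, which is what distinguishes modified UTF-8 from standard UTF-8 for characters in the Basic Multilingual Plane.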

    See Also:
    PositionedStoreStream
    • Field Summary

      Fields 
      Modifier and Type Field Description
      private char[] buffer
      Internal character buffer storing characters read from the stream.
      private int charactersInBuffer
      The number of characters in the internal buffer.
      private CharacterStreamDescriptor csd
      Descriptor containing information about the stream.
      private java.io.InputStream in
      The underlying data stream.
      private static int MAXIMUM_BUFFER_SIZE
      Maximum size in number of chars for the internal character buffer.
      private boolean noMoreReads
      Tells if this reader has been closed.
      private ConnectionChild parent
      A reference to the parent object of the stream.
      private PositionedStream positionedIn
      Stream that can reposition itself on request (may be null).
      private long rawStreamPos
      Stores the last visited position in the store stream, if the stream is capable of repositioning itself (positionedIn != null).
      private static java.lang.String READER_CLOSED  
      private long readerCharCount
      Number of characters read from the stream.
      private int readPositionInBuffer
      The position of the next character to read in the internal buffer.
      private long utfCount
      Number of bytes read from the stream, including any header bytes.
      • Fields inherited from class java.io.Reader

        lock
    • Method Summary

      Modifier and Type Method Description
      private int calculateBufferSize​(CharacterStreamDescriptor csd)
      Calculates an optimized buffer size.
      void close()
      Close the reader, disallowing further reads.
      private void closeIn()
      Close the underlying stream if it is open.
      private boolean fillBuffer()
      Fills the internal character buffer by decoding bytes from the stream.
      private void persistentSkip​(long toSkip)
      Skips the requested number of characters.
      int read()
      Reads a single character from the stream.
      int read​(char[] cbuf, int off, int len)
      Reads characters into an array.
      (package private) int readAsciiInto​(byte[] abuf, int off, int len)
      Reads characters into an array as ASCII characters.
      int readInto​(java.lang.StringBuffer sb, int len)
      Reads characters from the stream.
      (package private) void reposition​(long requestedCharPos)
      Repositions the stream so that the next character read will be the character at the requested position.
      private void resetUTF8Reader()
      Resets the reader.
      long skip​(long len)
      Skips characters.
      private java.io.IOException utfFormatException​(java.lang.String s)
      Convenience method generating a UTFDataFormatException and cleaning up the reader state.
      • Methods inherited from class java.io.Reader

        mark, markSupported, nullReader, read, read, ready, reset, transferTo
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • MAXIMUM_BUFFER_SIZE

        private static final int MAXIMUM_BUFFER_SIZE
        Maximum size in number of chars for the internal character buffer.
        See Also:
        Constant Field Values
      • in

        private java.io.InputStream in
        The underlying data stream.
      • positionedIn

        private final PositionedStream positionedIn
        Stream that can reposition itself on request (may be null).
      • rawStreamPos

        private long rawStreamPos
        Stores the last visited position in the store stream, if the stream is capable of repositioning itself (positionedIn != null).
      • utfCount

        private long utfCount
        Number of bytes read from the stream, including any header bytes.
      • readerCharCount

        private long readerCharCount
        Number of characters read from the stream.
      • buffer

        private final char[] buffer
        Internal character buffer storing characters read from the stream.
      • charactersInBuffer

        private int charactersInBuffer
        The number of characters in the internal buffer.
      • readPositionInBuffer

        private int readPositionInBuffer
        The position of the next character to read in the internal buffer.
      • noMoreReads

        private boolean noMoreReads
        Tells if this reader has been closed.
      • parent

        private ConnectionChild parent
        A reference to the parent object of the stream.

        The reference is kept so that the parent object can't get garbage collected until we are done with the stream.

      • csd

        private final CharacterStreamDescriptor csd
        Descriptor containing information about the stream. Except for the current positions, the information in this object is considered permanent and valid for the life-time of the stream.
    • Constructor Detail

      • UTF8Reader

        public UTF8Reader​(CharacterStreamDescriptor csd,
                          ConnectionChild conChild,
                          java.lang.Object sync)
                   throws java.io.IOException
        Constructs a reader on top of the source UTF-8 encoded stream.
        Parameters:
        csd - a description of and reference to the source stream
        conChild - the parent object / connection child
        sync - synchronization object used when accessing the underlying data stream
        Throws:
        java.io.IOException - if reading from the underlying stream fails
    • Method Detail

      • read

        public int read()
                 throws java.io.IOException
        Reads a single character from the stream.
        Overrides:
        read in class java.io.Reader
        Returns:
        A character or -1 if end of stream has been reached.
        Throws:
        java.io.IOException - if the stream has been closed, or an exception is raised while reading from the underlying stream
      • read

        public int read​(char[] cbuf,
                        int off,
                        int len)
                 throws java.io.IOException
        Reads characters into an array.
        Specified by:
        read in class java.io.Reader
        Returns:
        The number of characters read, or -1 if the end of the stream has been reached.
        Throws:
        java.io.IOException
      • skip

        public long skip​(long len)
                  throws java.io.IOException
        Skips characters.
        Overrides:
        skip in class java.io.Reader
        Parameters:
        len - the number of characters to skip
        Returns:
        The number of characters actually skipped.
        Throws:
        java.lang.IllegalArgumentException - if the number of characters to skip is negative
        java.io.IOException - if accessing the underlying stream fails
      • close

        public void close()
        Close the reader, disallowing further reads.
        Specified by:
        close in interface java.lang.AutoCloseable
        Specified by:
        close in interface java.io.Closeable
        Specified by:
        close in class java.io.Reader
      • readInto

        public int readInto​(java.lang.StringBuffer sb,
                            int len)
                     throws java.io.IOException
        Reads characters from the stream.

        Due to internal buffering a smaller number of characters than what is requested might be returned. To ensure that the request is fulfilled, call this method in a loop until the requested number of characters is read or -1 is returned.

        Parameters:
        sb - the destination buffer
        len - maximum number of characters to read
        Returns:
        The number of characters read, or -1 if the end of the stream is reached.
        Throws:
        java.io.IOException
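
        The read loop recommended above might look like the following sketch. Because constructing a UTF8Reader requires Derby-internal objects, the example uses a plain java.io.Reader as a stand-in; ReadLoopDemo, ShortReadReader and readUpTo are hypothetical names, and the 3-character limit only serves to mimic short reads caused by internal buffering.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical demo class; not part of Derby.
final class ReadLoopDemo {

    // Stand-in reader that returns at most 3 chars per call (short reads).
    static final class ShortReadReader extends Reader {
        private final Reader delegate;

        ShortReadReader(Reader delegate) {
            this.delegate = delegate;
        }

        @Override
        public int read(char[] cbuf, int off, int len) throws IOException {
            return delegate.read(cbuf, off, Math.min(len, 3));
        }

        @Override
        public void close() throws IOException {
            delegate.close();
        }
    }

    // Appends up to len chars to sb, looping until len chars are read or
    // the reader reports end of stream. Returns the number of chars
    // appended (0 if the stream was already exhausted).
    static int readUpTo(Reader in, StringBuffer sb, int len) {
        char[] chunk = new char[Math.max(1, Math.min(len, 1024))];
        int total = 0;
        try {
            while (total < len) {
                int n = in.read(chunk, 0, Math.min(chunk.length, len - total));
                if (n == -1) {
                    break; // end of stream
                }
                sb.append(chunk, 0, n);
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        StringBuffer sb = new StringBuffer();
        Reader in = new ShortReadReader(new StringReader("hello world"));
        int read = readUpTo(in, sb, 8);
        System.out.println(read + ": " + sb); // prints "8: hello wo"
    }
}
```

        A single call on the stand-in reader would deliver only three characters here; the loop is what guarantees that the caller receives the full eight.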
      • readAsciiInto

        int readAsciiInto​(byte[] abuf,
                          int off,
                          int len)
                   throws java.io.IOException
        Reads characters into an array as ASCII characters.

        Due to internal buffering a smaller number of characters than what is requested might be returned. To ensure that the request is fulfilled, call this method in a loop until the requested number of characters is read or -1 is returned.

        Characters outside the ASCII range are replaced with an out of range marker.

        Parameters:
        abuf - the buffer to read into
        off - the offset into the destination buffer
        len - maximum number of characters to read
        Returns:
        The number of characters read, or -1 if the end of the stream is reached.
        Throws:
        java.io.IOException
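
        The replacement of non-ASCII characters can be pictured with the sketch below. It is not the Derby implementation: the class AsciiIntoDemo, the method copyAsAscii and the choice of '?' as the out of range marker are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Derby.
final class AsciiIntoDemo {

    // Assumed marker value; the actual marker used by Derby may differ.
    static final byte OUT_OF_RANGE_MARKER = (byte) '?';

    // Copies up to len chars from src into abuf starting at off, writing
    // the marker for every char above 0x7f. Returns the number of bytes
    // written.
    static int copyAsAscii(char[] src, int srcLen, byte[] abuf, int off, int len) {
        int n = Math.min(srcLen, len);
        for (int i = 0; i < n; i++) {
            char c = src[i];
            abuf[off + i] = (c <= 0x7f) ? (byte) c : OUT_OF_RANGE_MARKER;
        }
        return n;
    }

    public static void main(String[] args) {
        char[] src = "caf\u00e9".toCharArray();
        byte[] dest = new byte[src.length];
        int n = copyAsAscii(src, src.length, dest, 0, dest.length);
        System.out.println(new String(dest, 0, n, StandardCharsets.US_ASCII)); // prints "caf?"
    }
}
```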
      • closeIn

        private void closeIn()
        Close the underlying stream if it is open.
      • utfFormatException

        private java.io.IOException utfFormatException​(java.lang.String s)
        Convenience method generating a UTFDataFormatException and cleaning up the reader state.
      • fillBuffer

        private boolean fillBuffer()
                            throws java.io.IOException
        Fills the internal character buffer by decoding bytes from the stream.
        Returns:
        true if the end of the stream is reached, false if there is apparently more data to be read.
        Throws:
        java.io.IOException
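
        The decoding step can be sketched as follows for the one-, two- and three-byte forms of modified UTF-8. This is a simplified illustration only: it works on a complete byte array rather than a stream, it ignores the Derby length header and end-of-stream marker, and the names DecodeSketch, decode and decodeToString are hypothetical.

```java
import java.io.UTFDataFormatException;
import java.io.UncheckedIOException;

// Hypothetical demo class; not part of Derby.
final class DecodeSketch {

    // Decodes len bytes of modified UTF-8 from in into out (which must
    // hold at least len chars). Returns the number of chars produced.
    static int decode(byte[] in, int len, char[] out) throws UTFDataFormatException {
        int i = 0;
        int o = 0;
        while (i < len) {
            int b1 = in[i] & 0xff;
            if ((b1 & 0x80) == 0) {
                // 0xxxxxxx: one-byte form
                out[o++] = (char) b1;
                i += 1;
            } else if ((b1 & 0xe0) == 0xc0) {
                // 110xxxxx 10xxxxxx: two-byte form
                if (i + 1 >= len) {
                    throw new UTFDataFormatException("truncated 2-byte sequence");
                }
                int b2 = in[i + 1] & 0xff;
                if ((b2 & 0xc0) != 0x80) {
                    throw new UTFDataFormatException("bad continuation byte");
                }
                out[o++] = (char) (((b1 & 0x1f) << 6) | (b2 & 0x3f));
                i += 2;
            } else if ((b1 & 0xf0) == 0xe0) {
                // 1110xxxx 10xxxxxx 10xxxxxx: three-byte form
                if (i + 2 >= len) {
                    throw new UTFDataFormatException("truncated 3-byte sequence");
                }
                int b2 = in[i + 1] & 0xff;
                int b3 = in[i + 2] & 0xff;
                if ((b2 & 0xc0) != 0x80 || (b3 & 0xc0) != 0x80) {
                    throw new UTFDataFormatException("bad continuation byte");
                }
                out[o++] = (char) (((b1 & 0x0f) << 12) | ((b2 & 0x3f) << 6) | (b3 & 0x3f));
                i += 3;
            } else {
                throw new UTFDataFormatException("invalid leading byte");
            }
        }
        return o;
    }

    // Convenience wrapper that converts the checked exception.
    static String decodeToString(byte[] in) {
        char[] out = new char[in.length];
        try {
            return new String(out, 0, decode(in, in.length, out));
        } catch (UTFDataFormatException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

        A streaming implementation additionally has to cope with a multi-byte sequence that straddles two reads from the underlying stream, which this sketch avoids by requiring all bytes up front.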
      • resetUTF8Reader

        private void resetUTF8Reader()
                              throws java.io.IOException,
                                     StandardException
        Resets the reader.

        This method is used internally to achieve better performance.

        Throws:
        java.io.IOException - if resetting or reading from the stream fails
        StandardException - if resetting the stream fails
        See Also:
        reposition(long)
      • reposition

        void reposition​(long requestedCharPos)
                 throws java.io.IOException,
                        StandardException
        Repositions the stream so that the next character read will be the character at the requested position.

        There are three types of repositioning, ordered by increasing cost:

        1. Reposition within current character buffer (small hops forwards and potentially backwards, in the range of 1 char to MAXIMUM_BUFFER_SIZE chars)
        2. Forward stream from current position (hops forwards)
        3. Reset stream and skip data (hops backwards)
        Parameters:
        requestedCharPos - 1-based requested character position
        Throws:
        java.io.IOException - if resetting or reading from the stream fails
        StandardException - if resetting the stream fails
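
        The choice between the three kinds of repositioning can be sketched as a simple classification. The state tracked here (the 1-based position of the first buffered character and the number of buffered characters) is an assumption about how such a decision could be made, and RepositionSketch, Strategy and choose are hypothetical names rather than Derby code.

```java
// Hypothetical demo class; not part of Derby.
final class RepositionSketch {

    // The three kinds of repositioning, cheapest first.
    enum Strategy { WITHIN_BUFFER, SKIP_FORWARD, RESET_AND_SKIP }

    // bufferStartPos: 1-based position of the first char in the buffer.
    // charsInBuffer: number of valid chars currently buffered.
    // requestedCharPos: 1-based position the caller wants to read next.
    static Strategy choose(long bufferStartPos, int charsInBuffer, long requestedCharPos) {
        if (requestedCharPos < 1) {
            throw new IllegalArgumentException("position must be >= 1");
        }
        long bufferEndPos = bufferStartPos + charsInBuffer; // exclusive
        if (requestedCharPos >= bufferStartPos && requestedCharPos < bufferEndPos) {
            // Only the read position inside the buffer has to change.
            return Strategy.WITHIN_BUFFER;
        }
        if (requestedCharPos >= bufferEndPos) {
            // Target lies ahead: keep reading and discard characters.
            return Strategy.SKIP_FORWARD;
        }
        // Target lies before the buffer: start over, then skip forward.
        return Strategy.RESET_AND_SKIP;
    }

    public static void main(String[] args) {
        // Buffer currently holds characters 101..150.
        System.out.println(choose(101, 50, 120)); // prints "WITHIN_BUFFER"
        System.out.println(choose(101, 50, 151)); // prints "SKIP_FORWARD"
        System.out.println(choose(101, 50, 5));   // prints "RESET_AND_SKIP"
    }
}
```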
      • calculateBufferSize

        private final int calculateBufferSize​(CharacterStreamDescriptor csd)
        Calculates an optimized buffer size.

        The maximum size allowed is returned if the specified values don't give enough information to say a smaller buffer size is preferable.

        Parameters:
        csd - stream descriptor
        Returns:
        A (sub)optimal buffer size.
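
        One way to picture the calculation is the sketch below, which falls back to the maximum whenever the length is unknown. The inputs consulted by the real method are not listed on this page, so the single knownCharLength parameter (0 or negative meaning unknown), the value 8192 for the maximum and the class name BufferSizeSketch are all assumptions.

```java
// Hypothetical demo class; not part of Derby.
final class BufferSizeSketch {

    // Assumed value; the actual constant is not stated on this page.
    static final int MAXIMUM_BUFFER_SIZE = 8 * 1024;

    // Returns a buffer size no larger than the maximum. A smaller size
    // is chosen only when the stream is known to be shorter than that.
    static int calculateBufferSize(long knownCharLength) {
        if (knownCharLength > 0 && knownCharLength < MAXIMUM_BUFFER_SIZE) {
            return (int) knownCharLength;
        }
        return MAXIMUM_BUFFER_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(calculateBufferSize(100));       // prints "100"
        System.out.println(calculateBufferSize(0));         // prints "8192"
        System.out.println(calculateBufferSize(1_000_000)); // prints "8192"
    }
}
```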
      • persistentSkip

        private final void persistentSkip​(long toSkip)
                                   throws java.io.IOException
        Skips the requested number of characters.
        Parameters:
        toSkip - number of characters to skip
        Throws:
        java.io.EOFException - if there are too few characters in the stream
        java.io.IOException - if reading from the stream fails