Class Matcher

java.lang.Object
org.biojava.utils.regex.Matcher

public class Matcher extends Object
This class is analogous to java.util.regex.Matcher except that it works on SymbolLists instead of Strings. All coordinates are in the 1-based coordinate system used by SymbolLists.
Since:
1.4
Author:
David Huen
  • Method Summary

    Modifier and Type
    Method
    Description
    int
    end()
    Returns the index of the last Symbol matched, plus one.
    int
    end(int group)
    Returns the index of the last Symbol, plus one, of the subsequence captured by the given group during the previous match operation.
    boolean
    find()
    Attempts to find the next subsequence of the input sequence that matches the pattern.
    boolean
    find(int start)
    Resets this matcher and then attempts to find the next subsequence of the input sequence that matches the pattern, starting at the specified index.
    SymbolList
    group()
    Returns the input subsequence matched by the previous match.
    SymbolList
    group(int group)
    Returns the input subsequence captured by the given group during the previous match operation.
    int
    groupCount()
    Returns the number of capturing groups in this matcher's pattern.
    boolean
    lookingAt()
    Attempts to match the input SymbolList, starting at the beginning, against the pattern.
    boolean
    matches()
    Attempts to match the entire input sequence against the pattern.
    Pattern
    pattern()
    Returns the Pattern object that compiled this Matcher.
    Matcher
    reset()
    Resets this matcher.
    Matcher
    reset(SymbolList sl)
    Resets this matcher with a new input SymbolList.
    int
    start()
    Returns the start index of the previous match.
    int
    start(int group)
    Returns the start index of the subsequence captured by the given group during the previous match operation.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Method Details

    • end

      public int end()
      Returns the index of the last Symbol matched, plus one.
      Returns:
      The index of the last Symbol matched, plus one.
    • end

      public int end(int group) throws IndexOutOfBoundsException
      Returns the index of the last Symbol, plus one, of the subsequence captured by the given group during the previous match operation.

      Capturing groups are indexed from left to right, starting at one. Group zero denotes the entire pattern, so the expression m.end(0) is equivalent to m.end().

      Parameters:
      group - The index of a capturing group in this matcher's pattern.
      Returns:
      The index of the last Symbol captured by the group, plus one, or -1 if the match was successful but the group itself did not match anything.
      Throws:
      IndexOutOfBoundsException
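
The group-indexing rules above can be illustrated with a minimal sketch. It is an assumption-laden stand-in: it uses java.util.regex.Matcher on a String, the class this Matcher is described as analogous to, so indices are 0-based rather than the 1-based SymbolList coordinates. The class name GroupEndSketch and the pattern are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-in sketch: java.util.regex on a String (0-based indices),
// not org.biojava.utils.regex on a SymbolList (1-based coordinates).
final class GroupEndSketch {
    // Returns {end(), end(0), end(1), end(2)} for the first match of
    // "(a+)|(t+)" in input, or an empty array if nothing matches.
    static int[] ends(String input) {
        Matcher m = Pattern.compile("(a+)|(t+)").matcher(input);
        if (!m.find()) {
            return new int[0];
        }
        // Only one alternative takes part in a match, so the other
        // group reports -1.
        return new int[] { m.end(), m.end(0), m.end(1), m.end(2) };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(ends("ggaat")));
    }
}
```

For the input "ggaat" the first match is "aa", so end(), end(0) and end(1) all agree, while end(2) is -1 because the second group did not match anything.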
    • find

      public boolean find()
      Attempts to find the next subsequence of the input sequence that matches the pattern.

      This method starts at the beginning of the input sequence or, if a previous invocation of the method was successful and the matcher has not since been reset, at the first Symbol not matched by the previous match. If the match succeeds then more information can be obtained via the start, end, and group methods.

      Returns:
      true if, and only if, a subsequence of the input sequence matches this matcher's pattern.
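
Repeated calls to find() walk through successive matches, which might be sketched as follows. This is a stand-in using java.util.regex.Matcher on a String, so the reported indices are 0-based; the SymbolList version would report 1-based coordinates and return SymbolLists from group(). The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-in sketch: java.util.regex on a String (0-based indices),
// not org.biojava.utils.regex on a SymbolList (1-based coordinates).
final class FindLoopSketch {
    // Collects every match as "start-end:text".
    static List<String> allMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> hits = new ArrayList<>();
        // Each successful find() resumes after the previous match.
        while (m.find()) {
            hits.add(m.start() + "-" + m.end() + ":" + m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(allMatches("at+", "gattaccat"));
    }
}
```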
    • find

      public boolean find(int start) throws IndexOutOfBoundsException
      Resets this matcher and then attempts to find the next subsequence of the input sequence that matches the pattern, starting at the specified index.

      If the match succeeds then more information can be obtained via the start, end, and group methods, and subsequent invocations of the find() method will start at the first Symbol not matched by this match.

      Returns:
      true if, and only if, a subsequence of the input sequence starting at the given index matches this matcher's pattern.
      Throws:
      IndexOutOfBoundsException
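
The interaction between find(int start) and a following find() can be sketched as below. As an assumption, the sketch uses java.util.regex.Matcher on a String as a stand-in, so the start index and the results are 0-based; the class and helper names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-in sketch: java.util.regex on a String (0-based indices),
// not org.biojava.utils.regex on a SymbolList (1-based coordinates).
final class FindFromSketch {
    // Returns the start of the current match, or -1 if the attempt failed.
    private static int startOrMinusOne(Matcher m, boolean found) {
        return found ? m.start() : -1;
    }

    // Returns the starts of: the first match, the first match at or after
    // index 4, and the match that follows it. The input must be at least
    // four characters long, otherwise find(4) throws.
    static int[] starts(String input) {
        Matcher m = Pattern.compile("cg").matcher(input);
        int first = startOrMinusOne(m, m.find());
        int fromFour = startOrMinusOne(m, m.find(4)); // resets, then searches from 4
        int next = startOrMinusOne(m, m.find());      // continues after that match
        return new int[] { first, fromFour, next };
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(starts("cgacgtcgcg")));
    }
}
```

With the input "cgacgtcgcg" the occurrence at index 3 is skipped, because find(4) restarts the search at index 4 rather than after the first match.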
    • group

      public SymbolList group()
      Returns the input subsequence matched by the previous match.

      For a matcher m with input sequence s, the expressions m.group() and s.substring(m.start(), m.end()) are equivalent. Note that some patterns, for example a*, match the empty SymbolList. This method will return the empty SymbolList when the pattern successfully matches an empty subsequence of the input.

      Returns:
      The (possibly empty) subsequence matched by the previous match, in SymbolList form.
    • group

      public SymbolList group(int group) throws IndexOutOfBoundsException
      Returns the input subsequence captured by the given group during the previous match operation.

      For a matcher m, input sequence s, and group index g, the expressions m.group(g) and s.substring(m.start(g), m.end(g)) are equivalent. Capturing groups are indexed from left to right, starting at one. Group zero denotes the entire pattern, so the expression m.group(0) is equivalent to m.group(). If the match was successful but the group specified failed to match any part of the input sequence, then null is returned. Note that some groups, for example (a*), match the empty SymbolList. This method will return the empty SymbolList when such a group successfully matches an empty subsequence of the input.

      Returns:
      The (possibly empty) subsequence captured by the group during the previous match, or null if the group failed to match part of the input.
      Throws:
      IndexOutOfBoundsException
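
The difference between a group that matched nothing (null) and a group that matched an empty subsequence can be sketched as follows. This stand-in uses java.util.regex.Matcher on a String, so group() returns Strings rather than SymbolLists; the class name and pattern are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-in sketch: java.util.regex on a String,
// not org.biojava.utils.regex on a SymbolList.
final class GroupNullSketch {
    // Returns {group(), group(1), group(2)} for the first match of
    // "(a*)(t)?" in input, or an empty array if nothing matches.
    static String[] groups(String input) {
        Matcher m = Pattern.compile("(a*)(t)?").matcher(input);
        if (!m.find()) {
            return new String[0];
        }
        // (a*) can succeed on an empty subsequence and yields "";
        // (t)? may not take part in the match at all and yields null.
        return new String[] { m.group(), m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(groups("cc")));
        System.out.println(java.util.Arrays.toString(groups("aat")));
    }
}
```

Callers that post-process captured groups should therefore check for null before checking for emptiness.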
    • groupCount

      public int groupCount()
      Returns the number of capturing groups in this matcher's pattern.

      Group zero denotes the entire pattern and is not included in this count. Any non-negative integer smaller than or equal to the value returned by this method is guaranteed to be a valid group index for this matcher.

      Returns:
      The number of capturing groups in this matcher's pattern.
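
How groupCount() relates to valid group indices can be checked with a small sketch. It assumes the behaviour of java.util.regex.Matcher, used here on a String as a stand-in, where group zero is not counted and indices 0 through groupCount() are accepted; whether the SymbolList version behaves identically is an assumption. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-in sketch: java.util.regex on a String,
// not org.biojava.utils.regex on a SymbolList.
final class GroupCountSketch {
    // Number of capturing groups in the pattern (group zero excluded).
    static int count(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    // True if group(index) is accepted after a successful find().
    static boolean isValidIndex(String regex, String input, int index) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            throw new IllegalArgumentException("input does not match");
        }
        try {
            m.group(index);
            return true;
        } catch (IndexOutOfBoundsException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(count("(a)(c(g))"));
        System.out.println(isValidIndex("(a)(c(g))", "acg", 3));
        System.out.println(isValidIndex("(a)(c(g))", "acg", 4));
    }
}
```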
    • lookingAt

      public boolean lookingAt()
      Attempts to match the input SymbolList, starting at the beginning, against the pattern.

      Like the matches method, this method always starts at the beginning of the input sequence; unlike that method, it does not require that the entire input sequence be matched. If the match succeeds then more information can be obtained via the start, end, and group methods.

      Returns:
      true if, and only if, a prefix of the input sequence matches this matcher's pattern.
    • matches

      public boolean matches()
      Attempts to match the entire input sequence against the pattern.

      If the match succeeds then more information can be obtained via the start, end, and group methods.

      Returns:
      true if, and only if, the entire input sequence matches this matcher's pattern.
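
The contrast between lookingAt() and matches() can be sketched as below, using java.util.regex.Matcher on a String as a stand-in for the SymbolList version. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-in sketch: java.util.regex on a String,
// not org.biojava.utils.regex on a SymbolList.
final class PrefixVsFullSketch {
    // Returns {lookingAt(), matches()} for the given pattern and input.
    static boolean[] check(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        boolean prefix = m.lookingAt(); // anchored at the start only
        boolean whole = m.matches();    // anchored at both ends
        return new boolean[] { prefix, whole };
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(check("atg", "atgcc")));
        System.out.println(java.util.Arrays.toString(check("atg", "atg")));
        System.out.println(java.util.Arrays.toString(check("atg", "catg")));
    }
}
```

A pattern that matches only a prefix passes lookingAt() but fails matches(); a pattern that occurs later in the input fails both, and find() is the method to use for that case.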
    • pattern

      public Pattern pattern()
      Returns the Pattern object that compiled this Matcher.
    • reset

      public Matcher reset()
      Resets this matcher.

      Resetting a matcher discards all of its explicit state information and sets its append position to zero.

      Returns:
      this matcher.
    • reset

      public Matcher reset(SymbolList sl)
      Resets this matcher with a new input SymbolList.

      Resetting a matcher discards all of its explicit state information and sets its append position to zero.

      Returns:
      this matcher.
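
Both reset variants return the matcher itself, so they can be chained with a following find(). A minimal sketch, assuming the behaviour of java.util.regex.Matcher on Strings as a stand-in (0-based indices; the SymbolList version would take a SymbolList and report 1-based coordinates). The class name is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-in sketch: java.util.regex on a String (0-based indices),
// not org.biojava.utils.regex on a SymbolList (1-based coordinates).
final class ResetSketch {
    // Returns {matches found before reset, start after reset(),
    // start after reset(newInput)}; -1 marks a failed find().
    static int[] run() {
        Matcher m = Pattern.compile("a").matcher("aa");
        int count = 0;
        while (m.find()) {
            count++; // exhausts the input
        }
        // reset() rewinds to the beginning of the same input.
        int afterReset = m.reset().find() ? m.start() : -1;
        // reset(newInput) swaps the input and rewinds.
        int afterNewInput = m.reset("gga").find() ? m.start() : -1;
        return new int[] { count, afterReset, afterNewInput };
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(run()));
    }
}
```

Reusing one matcher across many inputs this way avoids creating a new matcher per sequence.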
    • start

      public int start()
      Returns the start index of the previous match.
      Returns:
      The index of the first Symbol matched.
    • start

      public int start(int group)
      Returns the start index of the subsequence captured by the given group during the previous match operation.

      Capturing groups are indexed from left to right, starting at one. Group zero denotes the entire pattern, so the expression m.start(0) is equivalent to m.start().

      Parameters:
      group - The index of a capturing group in this matcher's pattern.
      Returns:
      The index of the first Symbol captured by the group, or -1 if the match was successful but the group itself did not match anything.