module Kramdown::Parser::Html::Parser

Contains the parsing methods. This module can be mixed into any parser to get HTML parsing functionality. The only things the including class must provide are the instance variables @stack, for storing the needed state, and @src (an instance of StringScanner), for the actual parsing.

Public Instance Methods

handle_html_start_tag(line = nil) { |el, closed, handle_body| ... }

Processes the HTML start tag that has already been scanned/checked via @src.

Performs the common processing steps and then yields to the caller for further processing. The first block parameter is the created element. The second is true if the HTML element is already closed, i.e. contains no body. The third specifies whether the body and the end tag still need to be handled when closed is false.

    # File lib/kramdown/parser/html.rb
    def handle_html_start_tag(line = nil) # :yields: el, closed, handle_body
      name = @src[1]
      name.downcase! if HTML_ELEMENT[name.downcase]
      closed = !@src[4].nil?
      attrs = parse_html_attributes(@src[2], line, HTML_ELEMENT[name])

      el = Element.new(:html_element, name, attrs, :category => :block)
      el.options[:location] = line if line
      @tree.children << el

      # Elements that never have a body are treated as already closed.
      if !closed && HTML_ELEMENTS_WITHOUT_BODY.include?(el.value)
        closed = true
      end
      # The bodies of script and style elements are consumed as raw text here,
      # so the caller is told not to handle the body.
      if name == 'script' || name == 'style'
        handle_raw_html_tag(name)
        yield(el, false, false)
      else
        yield(el, closed, true)
      end
    end
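
The block protocol described above can be illustrated with a small, self-contained sketch. It does not use kramdown itself: TOY_TAG_RE, TOY_WITHOUT_BODY, toy_start_tag and classify are hypothetical names invented for this example, and the regular expression is a much simpler stand-in for the real one. The sketch only shows how a caller might branch on the closed and handle_body parameters.

```ruby
require 'strscan'

# Hypothetical, simplified start tag pattern (not kramdown's):
# group 1 = tag name, group 2 = optional trailing slash.
TOY_TAG_RE = /<(\w+)\s*(\/)?>/

# Hypothetical list of elements that never have a body.
TOY_WITHOUT_BODY = %w[br hr img].freeze

# Yields (name, closed, handle_body), mirroring the three block parameters.
def toy_start_tag(src)
  name = src[1].downcase
  closed = !src[2].nil? || TOY_WITHOUT_BODY.include?(name)
  if name == 'script' || name == 'style'
    # The real method has consumed the raw body at this point;
    # this toy version skips that step and only reports the flags.
    yield(name, false, false)
  else
    yield(name, closed, true)
  end
end

# Scans all start tags in +html+ and records what a caller would do next.
def classify(html)
  src = StringScanner.new(html)
  result = []
  while src.scan_until(TOY_TAG_RE)
    toy_start_tag(src) do |name, closed, handle_body|
      action =
        if !handle_body
          :body_already_handled
        elsif closed
          :no_body
        else
          :parse_body
        end
      result << [name, action]
    end
  end
  result
end

p classify('<DIV><br/><hr><script>')
# → [["div", :parse_body], ["br", :no_body], ["hr", :no_body], ["script", :body_already_handled]]
```

Only the :parse_body case would require the caller to go on and parse child content up to the matching end tag.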
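
handle_raw_html_tag(name)

Handles the raw HTML tag (script or style) at the current position. Everything up to the matching end tag is added as raw text to the last child of the current tree element. If no end tag is found, the rest of the source is consumed and a warning is issued.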

    # File lib/kramdown/parser/html.rb
    def handle_raw_html_tag(name)
      curpos = @src.pos
      if @src.scan_until(/(?=<\/#{name}\s*>)/mi)
        add_text(extract_string(curpos...@src.pos, @src), @tree.children.last, :raw)
        @src.scan(HTML_TAG_CLOSE_RE)
      else
        add_text(@src.rest, @tree.children.last, :raw)
        @src.terminate
        warning("Found no end tag for '#{name}' - auto-closing it")
      end
    end
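
The lookahead-based scan used above can be tried in isolation with StringScanner from the standard library. The helper name toy_raw_body and its return value are assumptions made for this sketch. Unlike the original, it takes the raw text directly from the return value of scan_until instead of slicing the source by position.

```ruby
require 'strscan'

# Returns [raw_text, end_tag_found] for the raw body of the element +name+,
# starting at the scanner's current position. Hypothetical helper.
def toy_raw_body(src, name)
  # The lookahead matches with zero width, so the scanner stops right
  # before the end tag and the returned string excludes it.
  if (raw = src.scan_until(/(?=<\/#{name}\s*>)/mi))
    src.scan(/<\/#{name}\s*>/mi) # consume the end tag itself
    [raw, true]
  else
    # No end tag: take everything that is left, as the original does.
    raw = src.rest
    src.terminate
    [raw, false]
  end
end

src = StringScanner.new('var a = "<b>";</SCRIPT >tail')
p toy_raw_body(src, 'script')
p src.rest
```

Because the pattern is case-insensitive and allows whitespace before the closing bracket, an end tag written as `</SCRIPT >` still terminates the raw body, while markup-like text inside the body is left untouched.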

parse_html_attributes(str, line = nil, in_html_tag = true)

Parses the given string for HTML attributes and returns the resulting hash.

If the optional line parameter is supplied, it is used in warning messages.

If the optional in_html_tag parameter is set to false, attributes are not modified to contain only lowercase letters.

    # File lib/kramdown/parser/html.rb
    def parse_html_attributes(str, line = nil, in_html_tag = true)
      attrs = Utils::OrderedHash.new
      str.scan(HTML_ATTRIBUTE_RE).each do |attr, sep, val|
        attr.downcase! if in_html_tag
        if attrs.has_key?(attr)
          warning("Duplicate HTML attribute '#{attr}' on line #{line || '?'} - overwriting previous one")
        end
        attrs[attr] = val || ""
      end
      attrs
    end
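
The behavior described above (lowercasing, duplicate detection, empty string for value-less attributes) can be sketched without kramdown. TOY_ATTRIBUTE_RE and toy_parse_attributes are hypothetical, the pattern is a simplified assumption rather than the real HTML_ATTRIBUTE_RE, and a plain Hash stands in for Utils::OrderedHash. Warnings are collected in an array instead of being reported through the parser.

```ruby
# Hypothetical simplified attribute pattern:
# group 1 = name, group 2 = quote character, group 3 = value.
TOY_ATTRIBUTE_RE = /\s*([\w:\-]+)(?:\s*=\s*(["'])(.*?)\2)?/m

# Returns [attributes, warnings] for the attribute string +str+.
def toy_parse_attributes(str, lowercase = true)
  attrs = {}
  warnings = []
  str.scan(TOY_ATTRIBUTE_RE) do |attr, _quote, val|
    attr = attr.downcase if lowercase
    warnings << "Duplicate HTML attribute '#{attr}'" if attrs.key?(attr)
    attrs[attr] = val || '' # attributes without a value map to ""
  end
  [attrs, warnings]
end

attrs, warnings = toy_parse_attributes(%q{ ID="a" class='x y' disabled id="b"})
puts attrs['id']       # prints "b", the later duplicate wins
puts attrs.keys.inspect
puts warnings.inspect
```

With lowercase set to false, `ID` and `id` are kept as separate keys and no duplicate warning is produced.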

parse_raw_html(el, &block)

Parses raw HTML from the current source position, storing the found elements in el. Parsing continues until one of the following criteria is fulfilled:

  • The end of the document is reached.

  • The matching end tag for the element el is found (only used if el is an HTML element).

When an HTML start tag is found, processing is deferred to handle_html_start_tag, providing the block given to this method.

    # File lib/kramdown/parser/html.rb
    def parse_raw_html(el, &block)
      # Make el the current tree element; the previous one is restored at the end.
      @stack.push(@tree)
      @tree = el

      done = false
      while !@src.eos? && !done
        if result = @src.scan_until(HTML_RAW_START)
          add_text(result, @tree, :text)
          line = @src.current_line_number
          if result = @src.scan(HTML_COMMENT_RE)
            @tree.children << Element.new(:xml_comment, result, nil, :category => :block, :location => line)
          elsif result = @src.scan(HTML_INSTRUCTION_RE)
            @tree.children << Element.new(:xml_pi, result, nil, :category => :block, :location => line)
          elsif @src.scan(HTML_TAG_RE)
            if method(:handle_html_start_tag).arity.abs >= 1
              handle_html_start_tag(line, &block)
            else
              handle_html_start_tag(&block) # DEPRECATED: method needs to accept line number in 2.0
            end
          elsif @src.scan(HTML_TAG_CLOSE_RE)
            # A closing tag ends parsing only if it matches the current element.
            if @tree.value == (HTML_ELEMENT[@tree.value] ? @src[1].downcase : @src[1])
              done = true
            else
              add_text(@src.matched, @tree, :text)
              warning("Found invalidly used HTML closing tag for '#{@src[1]}' on line #{line} - ignoring it")
            end
          else
            add_text(@src.getch, @tree, :text)
          end
        else
          # Nothing more to scan: the rest is text and an open element is auto-closed.
          add_text(@src.rest, @tree, :text)
          @src.terminate
          warning("Found no end tag for '#{@tree.value}' on line #{@tree.options[:location]} - auto-closing it") if @tree.type == :html_element
          done = true
        end
      end

      @tree = @stack.pop
    end
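
The way @stack and @tree cooperate during nested parsing may be easier to follow in a stripped-down form. The following toy parser is an assumption-laden sketch, not kramdown code: ToyNode, ToyParser and parse_into are invented names, only bare `<name>` and `</name>` tags are recognized, and recursion happens directly instead of through a block passed to handle_html_start_tag.

```ruby
require 'strscan'

# Hypothetical minimal tree node: children are ToyNodes or plain strings.
ToyNode = Struct.new(:name, :children)

class ToyParser
  attr_reader :warnings

  def initialize(source)
    @src = StringScanner.new(source)
    @stack = []
    @tree = ToyNode.new(:root, [])
    @warnings = []
  end

  def parse
    root = @tree
    parse_into(root)
    root
  end

  private

  # Parses into +el+ until its end tag or the end of the input is reached.
  def parse_into(el)
    @stack.push(@tree) # remember the enclosing element
    @tree = el
    done = false
    while !@src.eos? && !done
      if (text = @src.scan_until(/(?=<)/))
        @tree.children << text unless text.empty?
        if @src.scan(/<(\w+)>/)
          child = ToyNode.new(@src[1], [])
          @tree.children << child
          parse_into(child)
        elsif @src.scan(/<\/(\w+)>/)
          if @src[1] == @tree.name
            done = true
          else
            # Stray closing tag: keep it as text and continue.
            @tree.children << @src.matched
            @warnings << "Ignoring stray closing tag '#{@src[1]}'"
          end
        else
          @tree.children << @src.getch
        end
      else
        @tree.children << @src.rest
        @src.terminate
        @warnings << "Auto-closing '#{@tree.name}'" unless @tree.name == :root
        done = true
      end
    end
    @tree = @stack.pop # restore the enclosing element
  end
end

parser = ToyParser.new('a<div>b<p>c</p></x></div>d<em>e')
root = parser.parse
puts root.children.map { |c| c.is_a?(ToyNode) ? c.name : c }.inspect
puts parser.warnings.inspect
```

Each call pushes the current element before descending and pops it on the way out, so text that follows a closed element is attached to the correct parent.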