module Kramdown::Parser::Html::Parser
Contains the parsing methods. This module can be mixed into any parser to get HTML parsing functionality. The including class only needs to provide two instance variables: @stack, for storing the needed state, and @src (an instance of StringScanner), for the actual parsing.
Public Instance Methods
Process the HTML start tag that has already been scanned/checked via @src.
Does the common processing steps and then yields to the caller for further processing. The first parameter is the created element; the second parameter is true if the HTML element is already closed, i.e. contains no body; the third parameter specifies whether the body, and the end tag, need to be handled in case closed=false.
# File lib/kramdown/parser/html.rb
def handle_html_start_tag(line = nil) # :yields: el, closed, handle_body
  name = @src[1]
  name.downcase! if HTML_ELEMENT[name.downcase]
  closed = !@src[4].nil?
  attrs = parse_html_attributes(@src[2], line, HTML_ELEMENT[name])

  el = Element.new(:html_element, name, attrs, :category => :block)
  el.options[:location] = line if line
  @tree.children << el

  if !closed && HTML_ELEMENTS_WITHOUT_BODY.include?(el.value)
    closed = true
  end
  if name == 'script' || name == 'style'
    handle_raw_html_tag(name)
    yield(el, false, false)
  else
    yield(el, closed, true)
  end
end
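The three yielded values can be easier to follow with a small stand-alone sketch. The code below is not kramdown's implementation: SKETCH_TAG_RE, SKETCH_VOID and sketch_start_tag are hypothetical names, the regular expression is a simplified stand-in for HTML_TAG_RE, and the tag name is yielded in place of an Element. It only illustrates how closed and handle_body might be derived.

```ruby
require 'strscan'

# Simplified stand-in for kramdown's HTML_TAG_RE (an assumption, not the real pattern).
# Group 1: tag name, group 2: attribute string, group 3: "/" of a self-closing tag.
SKETCH_TAG_RE = /<([A-Za-z][\w:-]*)((?:\s+[^\s>\/=]+(?:\s*=\s*(?:"[^"]*"|'[^']*'))?)*)\s*(\/)?>/

# Hypothetical subset of the elements that never have a body.
SKETCH_VOID = %w[br hr img input].freeze

# Scans one start tag at the current position and yields
# (name, closed, handle_body), mirroring the order documented above.
def sketch_start_tag(src)
  return nil unless src.scan(SKETCH_TAG_RE)
  name = src[1].downcase
  # Closed either by an explicit "/>" or because the element cannot have a body.
  closed = !src[3].nil? || SKETCH_VOID.include?(name)
  if name == 'script' || name == 'style'
    # Raw-text elements: the body is consumed elsewhere, so the caller must not handle it.
    yield(name, false, false)
  else
    yield(name, closed, true)
  end
end
```

A caller would typically recurse into the body only when closed is false and handle_body is true.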
Handle the raw HTML tag at the current position.
# File lib/kramdown/parser/html.rb
def handle_raw_html_tag(name)
  curpos = @src.pos
  if @src.scan_until(/(?=<\/#{name}\s*>)/mi)
    add_text(extract_string(curpos...@src.pos, @src), @tree.children.last, :raw)
    @src.scan(HTML_TAG_CLOSE_RE)
  else
    add_text(@src.rest, @tree.children.last, :raw)
    @src.terminate
    warning("Found no end tag for '#{name}' - auto-closing it")
  end
end
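The lookahead in the listing above is what keeps the end tag out of the raw text. A minimal sketch of the same idea, using only StringScanner from the standard library, might look as follows. The helper name sketch_raw_body and its return value are assumptions made for illustration; the sketch also uses the return value of scan_until directly instead of extract_string.

```ruby
require 'strscan'

# Returns [raw_body, closed] for the element called `name`, starting at the
# scanner's current position (assumed to be just after the start tag).
def sketch_raw_body(src, name)
  end_tag = /<\/#{Regexp.escape(name)}\s*>/i
  # The zero-width lookahead stops the scanner right before the end tag,
  # so scan_until returns only the body text.
  if (body = src.scan_until(/(?=#{end_tag})/m))
    src.scan(end_tag) # now consume the end tag itself
    [body, true]
  else
    # No end tag anywhere: take everything that is left and stop scanning.
    body = src.rest
    src.terminate
    [body, false]
  end
end
```

With the input `if (a < b) { x(); }</script><p>after</p>` the body keeps the `<` character untouched, and the scanner is left positioned at `<p>after</p>`.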
Parses the given string for HTML attributes and returns the resulting hash.
If the optional line parameter is supplied, it is used in warning messages.
If the optional in_html_tag parameter is set to false, attributes are not modified to contain only lowercase letters.
# File lib/kramdown/parser/html.rb
def parse_html_attributes(str, line = nil, in_html_tag = true)
  attrs = Utils::OrderedHash.new
  str.scan(HTML_ATTRIBUTE_RE).each do |attr, sep, val|
    attr.downcase! if in_html_tag
    if attrs.has_key?(attr)
      warning("Duplicate HTML attribute '#{attr}' on line #{line || '?'} - overwriting previous one")
    end
    attrs[attr] = val || ""
  end
  attrs
end
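The behaviour described above (lowercasing, duplicate detection, empty string for value-less attributes) can be tried out without kramdown. In the sketch below, SKETCH_ATTRIBUTE_RE is a simplified, assumed stand-in for HTML_ATTRIBUTE_RE, and sketch_parse_attributes is a hypothetical helper that collects warnings in an array instead of calling warning.

```ruby
# Simplified stand-in for kramdown's HTML_ATTRIBUTE_RE (an assumption): it
# captures the attribute name, the quote character, and the quoted value.
SKETCH_ATTRIBUTE_RE = /([^\s=\/>]+)(?:\s*=\s*(["'])(.*?)\2)?/m

# Returns [attrs, warnings]. Names are lowercased unless downcase_names is false.
def sketch_parse_attributes(str, downcase_names = true)
  attrs = {}    # plain Ruby hashes preserve insertion order
  warnings = []
  str.scan(SKETCH_ATTRIBUTE_RE) do |name, _quote, value|
    name = name.downcase if downcase_names
    warnings << "Duplicate HTML attribute '#{name}'" if attrs.key?(name)
    attrs[name] = value || "" # attributes without a value map to ""
  end
  [attrs, warnings]
end
```

For the string ` CLASS="a" id='main' class="b" disabled`, the second class value overwrites the first and one warning is recorded; with downcase_names set to false, CLASS and class stay separate keys.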
Parse raw HTML from the current source position, storing the found elements in el. Parsing continues until one of the following criteria is fulfilled:
- The end of the document is reached.
- The matching end tag for the element el is found (only used if el is an HTML element).
When an HTML start tag is found, processing is deferred to handle_html_start_tag, providing the block given to this method.
# File lib/kramdown/parser/html.rb
def parse_raw_html(el, &block)
  @stack.push(@tree)
  @tree = el

  done = false
  while !@src.eos? && !done
    if result = @src.scan_until(HTML_RAW_START)
      add_text(result, @tree, :text)
      line = @src.current_line_number
      if result = @src.scan(HTML_COMMENT_RE)
        @tree.children << Element.new(:xml_comment, result, nil, :category => :block, :location => line)
      elsif result = @src.scan(HTML_INSTRUCTION_RE)
        @tree.children << Element.new(:xml_pi, result, nil, :category => :block, :location => line)
      elsif @src.scan(HTML_TAG_RE)
        if method(:handle_html_start_tag).arity.abs >= 1
          handle_html_start_tag(line, &block)
        else
          handle_html_start_tag(&block) # DEPRECATED: method needs to accept line number in 2.0
        end
      elsif @src.scan(HTML_TAG_CLOSE_RE)
        if @tree.value == (HTML_ELEMENT[@tree.value] ? @src[1].downcase : @src[1])
          done = true
        else
          add_text(@src.matched, @tree, :text)
          warning("Found invalidly used HTML closing tag for '#{@src[1]}' on line #{line} - ignoring it")
        end
      else
        add_text(@src.getch, @tree, :text)
      end
    else
      add_text(@src.rest, @tree, :text)
      @src.terminate
      warning("Found no end tag for '#{@tree.value}' on line #{@tree.options[:location]} - auto-closing it") if @tree.type == :html_element
      done = true
    end
  end

  @tree = @stack.pop
end
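The role of @stack and @tree, and the two stopping criteria, may be clearer in a stripped-down form. The following is only a sketch under simplifying assumptions: Node and SketchParser are hypothetical, tags are limited to lowercase names without attributes, comments and processing instructions are ignored, and nested elements are parsed by direct recursion rather than through a caller-supplied block.

```ruby
require 'strscan'

# Hypothetical, minimal tree node: children are Nodes or plain strings.
Node = Struct.new(:name, :children)

class SketchParser
  def initialize(source)
    @src = StringScanner.new(source)
    @stack = []
    @tree = Node.new(:root, [])
  end

  def parse
    parse_raw(@tree)
    @tree
  end

  private

  def parse_raw(el)
    @stack.push(@tree) # remember the element that was being filled
    @tree = el
    done = false
    until @src.eos? || done
      if (text = @src.scan_until(/(?=<)/))
        @tree.children << text unless text.empty?
        if @src.scan(/<([a-z]+)>/)          # start tag: descend into it
          child = Node.new(@src[1], [])
          @tree.children << child
          parse_raw(child)
        elsif @src.scan(/<\/([a-z]+)>/)     # end tag
          if @src[1] == @tree.name
            done = true                     # matching end tag found
          else
            @tree.children << @src.matched  # stray end tag is kept as text
          end
        else
          @tree.children << @src.getch      # a lone "<"
        end
      else
        @tree.children << @src.rest         # no more tags: auto-close
        @src.terminate
        done = true
      end
    end
    @tree = @stack.pop # restore the previous element
  end
end
```

Because the previous @tree is pushed on entry and popped on exit, each nested call appends to its own element and the caller resumes exactly where it left off.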