Class: WebGetToolDecorator
- Inherits:
-
ToolDecorator
- Object
- ToolDecorator
- WebGetToolDecorator
- Defined in:
- app/decorators/web_get_tool_decorator.rb
Overview
Transforms Tools::WebGet responses for LLM consumption by detecting the Content-Type header and applying format-specific conversion.
Content-Type maps to a method name via simple string normalization:
"application/json" → {#application_json}
"text/html" → {#text_html}
"text/plain" → method_missing → passthrough
Adding a new format = adding one method. Unknown types fall through #method_missing and pass through unchanged.
Constant Summary collapse
- RENDER_TAGS =
Tags that never contain readable content — always removed.
%w[script style noscript iframe svg].freeze
- STRUCTURAL_TAGS =
Structural elements stripped only when no semantic content container is found.
%w[nav footer aside form header menu menuitem].freeze
- CONTENT_SELECTORS =
Semantic HTML5 containers in preference order (first match wins).
["main", "article", "[role='main']"].freeze
Constants inherited from ToolDecorator
Instance Method Summary collapse
-
#application_json(body) ⇒ Hash
Compresses JSON using TOON (Token-Optimized Object Notation) for ~40% token savings on typical JSON arrays.
-
#call(result) ⇒ String
LLM-optimized content with conversion metadata tag.
-
#decorate(body, content_type:) ⇒ Hash
Dispatches to the format-specific method derived from Content-Type.
-
#method_missing(_method_name, body) ⇒ Hash
Passthrough for unregistered content types.
- #respond_to_missing? ⇒ Boolean
-
#text_html(body) ⇒ Hash
Strips noise elements and converts semantic HTML to Markdown.
Methods inherited from ToolDecorator
Dynamic Method Handling
This class handles dynamic methods through the method_missing method
#method_missing(_method_name, body) ⇒ Hash
Passthrough for unregistered content types.
83 84 85 |
# File 'app/decorators/web_get_tool_decorator.rb', line 83 def method_missing(_method_name, body, *) {text: body, meta: nil} end |
Instance Method Details
#application_json(body) ⇒ Hash
Compresses JSON using TOON (Token-Optimized Object Notation) for ~40% token savings on typical JSON arrays.
58 59 60 61 62 63 |
# File 'app/decorators/web_get_tool_decorator.rb', line 58 def application_json(body) parsed = JSON.parse(body) {text: Toon.encode(parsed), meta: "[Converted: JSON → TOON]"} rescue JSON::ParserError {text: body, meta: nil} end |
#call(result) ⇒ String
Returns LLM-optimized content with conversion metadata tag.
33 34 35 36 37 38 39 40 41 |
# File 'app/decorators/web_get_tool_decorator.rb', line 33 def call(result) return result.to_s unless result.is_a?(Hash) && result.key?(:body) body = result[:body].to_s content_type = result[:content_type] || "text/plain" decorated = decorate(body, content_type: content_type) assemble(**decorated) end |
#decorate(body, content_type:) ⇒ Hash
Dispatches to the format-specific method derived from Content-Type.
48 49 50 51 |
# File 'app/decorators/web_get_tool_decorator.rb', line 48 def decorate(body, content_type:) method_name = content_type.split(";").first.strip.tr("/", "_").tr("-", "_") public_send(method_name, body) end |
#respond_to_missing? ⇒ Boolean
87 88 89 |
# File 'app/decorators/web_get_tool_decorator.rb', line 87 def respond_to_missing?(*, **) true end |
#text_html(body) ⇒ Hash
Strips noise elements and converts semantic HTML to Markdown. Warns when the extracted content is suspiciously short.
70 71 72 73 74 75 76 77 78 |
# File 'app/decorators/web_get_tool_decorator.rb', line 70 def text_html(body) markdown = html_to_markdown(body) = "[Converted: HTML → Markdown]" char_count = markdown.length if !body.empty? && char_count < Anima::Settings.min_web_content_chars += " [Warning: only #{char_count} chars extracted — content may be incomplete]" end {text: markdown, meta: } end |