Use Render Instruction Set to Scrape Dynamic Pages in Ruby
Learn to scrape dynamic pages using ScraperAPI’s Render Instruction Set in Ruby. Automate form input, clicks, scrolling, and waits to interact with JS-heavy sites.
The Render Instruction Set tells the browser which actions to perform while a page renders. By combining instructions you can carry out complex operations, such as filling in a search form or scrolling through an infinitely scrolling page, which makes it possible to automate interactions with dynamic, JavaScript-heavy content efficiently.
How to use
To send an instruction set to the browser, pass a JSON array of instructions to the API as a header, along with any other necessary parameters, including render=true (set below via the x-sapi-render header).
In the following example, we enter a search term into a form, click the search icon, and then wait for the search results to load.

[
  {
    "type": "input",
    "selector": { "type": "css", "value": "#searchInput" },
    "value": "cowboy boots"
  },
  {
    "type": "click",
    "selector": {
      "type": "css",
      "value": "#search-form button[type=\"submit\"]"
    }
  },
  {
    "type": "wait_for_selector",
    "selector": { "type": "css", "value": "#content" }
  }
]
To send the above instruction set to our API endpoint, it must be formatted as a single string and passed as a header.
API REQUEST
require 'net/http'
require 'json'
# Define parameters including the target URL
params = {
  url: "https://httpbin.org/ip" # Specify the URL you want to scrape
}
# Set the API endpoint URI (HTTPS)
uri = URI('https://api.scraperapi.com/')
uri.query = URI.encode_www_form(params) # Encode parameters into the query string
# Create a new GET request
req = Net::HTTP::Get.new(uri)
# Set Scraper API headers
req['x-sapi-render'] = 'true' # Enable rendering
req['x-sapi-api_key'] = '<YOUR_API_KEY>' # Replace with your actual Scraper API key
req['x-sapi-instruction_set'] = '[{"type": "input", "selector": {"type": "css", "value": "#searchInput"}, "value": "cowboy boots"}, {"type": "click", "selector": {"type": "css", "value": "#search-form button[type=\"submit\"]"}}, {"type": "wait_for_selector", "selector": {"type": "css", "value": "#content"}}]' # Instruction set as a single JSON string
# Create an HTTPS connection
http = Net::HTTP.new(uri.hostname, uri.port)
http.use_ssl = true # Enable SSL/TLS encryption
http.verify_mode = OpenSSL::SSL::VERIFY_PEER # Verify the server's certificate
# Perform the HTTPS request and store the response
website_content = http.request(req)
# Output the response body (the scraped content)
puts website_content.body
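Hand-escaping the quotes inside that header string is easy to get wrong. A simpler alternative (plain Ruby, not an API requirement) is to build the instruction set as an array of hashes and serialize it with to_json before assigning it to the header. A minimal sketch that slots into the request above, relying on the require 'json' already present:

# Build the instruction set as Ruby data, then serialize it to a JSON string
instruction_set = [
  { type: "input",
    selector: { type: "css", value: "#searchInput" },
    value: "cowboy boots" },
  { type: "click",
    selector: { type: "css", value: '#search-form button[type="submit"]' } },
  { type: "wait_for_selector",
    selector: { type: "css", value: "#content" } }
]
# Replaces the hand-written header string in the example above
req['x-sapi-instruction_set'] = instruction_set.to_json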
PROXY MODE
require 'httparty'
# Define headers with the required parameters
headers = {
  "x-sapi-render" => "true",
  "x-sapi-instruction_set" => '[{"type": "input", "selector": {"type": "css", "value": "#searchInput"}, "value": "cowboy boots"}, {"type": "click", "selector": {"type": "css", "value": "#search-form button[type=\"submit\"]"}}, {"type": "wait_for_selector", "selector": {"type": "css", "value": "#content"}}]'
}
# Set default options for HTTParty, disabling SSL verification
HTTParty::Basement.default_options.update(verify: false)
# Make the HTTP request with HTTParty
response = HTTParty.get('http://httpbin.org/ip', {
  http_proxyaddr: "proxy-server.scraperapi.com",
  http_proxyport: "8001",
  http_proxyuser: "scraperapi",
  http_proxypass: "<YOUR_API_KEY>",
  headers: headers
})
# Capture the response body
results = response.body
# Output the response body
puts results
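If you want to guard against failed requests before using the body, a small status check (our addition, not part of the original example) could look like this:

# Use the body only when the request succeeded; otherwise report the status
if response.code == 200
  puts response.body
else
  warn "Request failed with HTTP status #{response.code}"
end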
Supported Instructions
Browser instructions are organized as an array of objects within the instruction set, each with a specific structure. Below are the various instructions and the corresponding data they require:
Click
Click on an element on the page.
Args
type: str = "click" selector: dict type: Enum["xpath", "css", "text"] value: str timeout: int (optional)
Example
[{
  "type": "click",
  "selector": { "type": "css", "value": "#search-form button[type=\"submit\"]" }
}]
Input
Enter a value into an input field on the page.
Args
type: str = "input" selector: dict type: Enum["xpath", "css", "text"] value: str value: str timeout: int (optional)
Example
[{
  "type": "input",
  "selector": { "type": "css", "value": "#searchInput" },
  "value": "cowboy boots"
}]
Loop
Execute a set of instructions a specified number of times by using the loop instruction and placing a sequence of standard instructions in its "instructions" argument.
Note that nesting isn't supported, so you can't put a "loop" instruction inside another "loop" instruction. Loops are effective for automating actions on pages with infinitely scrolling content, for example loading several batches of results by repeatedly scrolling to the bottom of the page and waiting for additional content to load (see the Ruby sketch after the example below).
Args
type: str="loop" for: int instructions: array
Example
[{
  "type": "loop",
  "for": 3,
  "instructions": [
    { "type": "scroll", "direction": "y", "value": "bottom" },
    { "type": "wait", "value": 5 }
  ]
}]
Scroll
Scroll the page in the X (horizontal) or Y (vertical) direction, by a given number of pixels or to the top or bottom of the page. You can also scroll to a given element by adding a selector.
Args
type: str = "scroll"
direction: Enum["x", "y"]
value: int or Enum["bottom", "top"]
selector: dict (optional)
    type: Enum["xpath", "css", "text"]
    value: str
Example
[{ "type": "scroll", "direction": "y", "value": "bottom" }]
[{ "type": "scroll", "selector": {
"type":"css",
"value":"#payment-container" } }]
Wait
Waits for a given number of seconds to elapse.
Args
type: str = "wait" value: int
Example
[{ "type": "wait", "value": 10 }]
Wait_for_event
Waits for an event to occur within the browser.
Args
type: str = "wait_for_event" event: Enum["domcontentloaded", "load", "navigation", "networkidle", "stabilize"]
timeout: int (in seconds, optional) seconds:int (in seconds, optional in combination with stabilize event)
Example
[{ "type": "wait_for_event", "event": "networkidle", "timeout": 10 }] [{ "type": "wait_for_event", "event": "stabilize", "seconds": 10 }]
- domcontentloaded = initial HTML loaded
- load = full page load
- navigation = page navigation
- networkidle = network requests stopped
- stabilize = page reaches a steady state (default is 5 seconds, max. is 30 seconds)
Wait_for_selector
Waits for an element to appear on the page. The optional 'timeout' argument tells the rendering engine how many seconds to wait for the element to appear.
Args
type: str = "wait_for_selector" selector: dict type: Enum["xpath", "css", "text"] value: str timeout: int (optional)
Example
[{ "type": "wait_for_selector", "selector": { "type": "css", "value": "#content"
}, "timeout": 5 }]