LLM Output Formats
Learn how to use ScraperAPI’s output_format parameter in Ruby to generate clean, structured text and markdown responses ideal for training large language models.
Training LLMs well requires large amounts of high-quality, unbiased data. Plenty of relevant public data exists, but it is often too noisy and too large to use as-is. ScraperAPI addresses this by gathering data at scale and cleaning it, removing irrelevant or duplicate content. The result is a structured response that can be used to train LLMs effectively. Simply add the parameter output_format=text or output_format=markdown to the request. Here are some examples:
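As a first, minimal sketch, the request URL for either format can be built and inspected without sending anything. The helper name build_scraperapi_uri, the ALLOWED_FORMATS constant, and the https://example.com/ target are assumptions made for illustration and are not part of ScraperAPI's documentation. The check only covers the two values named above.

```ruby
require 'uri'

# The two formats named in this section; anything else is outside this sketch.
ALLOWED_FORMATS = %w[text markdown].freeze

# Hypothetical helper (not part of any ScraperAPI library): builds the
# request URI for a target page and output format without sending it.
def build_scraperapi_uri(api_key, target_url, output_format)
  unless ALLOWED_FORMATS.include?(output_format)
    raise ArgumentError, "output_format must be one of: #{ALLOWED_FORMATS.join(', ')}"
  end

  uri = URI('https://api.scraperapi.com/')
  # encode_www_form percent-encodes the target URL so it survives as a query value.
  uri.query = URI.encode_www_form(
    api_key: api_key,
    url: target_url,
    output_format: output_format
  )
  uri
end

# Placeholder key and target URL, used only to show the resulting query string.
uri = build_scraperapi_uri('API_KEY', 'https://example.com/', 'markdown')
puts uri
```

Validating the format before building the URL catches a typo such as "markdwon" locally instead of spending a request on it.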
API REQUEST
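require 'net/http'

# Replace API_KEY with your key and OUTPUT_FORMAT with "text" or "markdown".
params = {
  :api_key => "API_KEY",
  :url => "https://www.google.com/search?q=Carglass+GmbH+Halle+%28Saale%29+%28Stadtbezirk+Nord%29&hl=en",
  :output_format => "OUTPUT_FORMAT"
}

# Build the request URL; encode_www_form escapes the target URL for the query string.
uri = URI('https://api.scraperapi.com/')
uri.query = URI.encode_www_form(params)

# Send the GET request and print the returned body.
website_content = Net::HTTP.get(uri)
print(website_content)
ASYNC REQUEST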
require 'net/http'
require 'json'
require 'uri'

url = URI.parse("https://async.scraperapi.com/jobs")
headers = { "Content-Type" => "application/json" }

# Replace API_KEY with your key and OUTPUT_FORMAT with "text" or "markdown".
data = {
  "apiKey" => "API_KEY",
  "apiParams" => { "output_format" => "OUTPUT_FORMAT" },
  "url" => "https://www.google.com/search?q=Carglass+GmbH+Halle+%28Saale%29+%28Stadtbezirk+Nord%29&hl=en"
}

# Submit the job as a JSON POST over HTTPS and print the response body.
http = Net::HTTP.new(url.host, url.port)
http.use_ssl = true
request = Net::HTTP::Post.new(url.path, headers)
request.body = data.to_json
response = http.request(request)
puts response.body
PROXY MODE - COMING SOON!
Sample Response
Markdown:
Text:

