LLM Output Formats

Learn how to use ScraperAPI’s output_format parameter in Node.js to generate clean, structured text and markdown responses ideal for training large language models.

Training LLMs well requires a large amount of high-quality, unbiased data. Plenty of relevant public data exists, but it is often too noisy and too large to use as-is. Our solution gathers data at scale and cleans it by removing irrelevant or duplicate content. The result is a structured response that can be used to train LLMs effectively. To get it, add output_format=text or output_format=markdown to your request. Here are some examples:

  • API REQUEST

import axios from 'axios';

const url = "https://api.scraperapi.com/";
const params = {
    api_key: "API_KEY",
    output_format: "OUTPUT_FORMAT", // "text" or "markdown"
    url: "https://www.google.com/search?q=Carglass+GmbH+Halle+%28Saale%29+%28Stadtbezirk+Nord%29&hl=en"
};

// axios serializes and encodes `params` into the query string
axios.get(url, { params })
    .then(response => {
        console.log(response.data); // Print the cleaned text or markdown
    })
    .catch(error => {
        console.error("Error:", error.message);
    });
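
The request above can also be assembled by hand, which makes it easy to reject an unsupported format before any credits are spent. The following is a minimal sketch, not part of any ScraperAPI SDK: `buildScraperUrl` and `ALLOWED_FORMATS` are hypothetical names, and the allowed list assumes only the two values named on this page (`text` and `markdown`).

```javascript
// Hypothetical helper: builds the synchronous request URL with a validated output_format.
// Assumption: "text" and "markdown" are the only values accepted for output_format.
const ALLOWED_FORMATS = ["text", "markdown"];

function buildScraperUrl(apiKey, targetUrl, outputFormat) {
  if (!ALLOWED_FORMATS.includes(outputFormat)) {
    throw new Error(`output_format must be one of: ${ALLOWED_FORMATS.join(", ")}`);
  }
  const endpoint = new URL("https://api.scraperapi.com/");
  // searchParams.set() encodes each value, so the target URL's own
  // query string is carried inside the `url` parameter intact.
  endpoint.searchParams.set("api_key", apiKey);
  endpoint.searchParams.set("output_format", outputFormat);
  endpoint.searchParams.set("url", targetUrl);
  return endpoint.toString();
}

// Example usage with placeholder values
const requestUrl = buildScraperUrl("API_KEY", "https://example.com/?q=a b", "markdown");
console.log(requestUrl);
```

The resulting string can be passed to `axios.get` directly, without a separate `params` object.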

  • ASYNC REQUEST

import axios from 'axios';

const url = "https://async.scraperapi.com/jobs";
const headers = {
    "Content-Type": "application/json"
};
const data = {
    apiKey: "API_KEY",
    // Scraping options go inside apiParams
    apiParams: {
        output_format: "OUTPUT_FORMAT" // "text" or "markdown"
    },
    // The target URL sits at the top level of the body, next to apiParams
    url: "https://www.google.com/search?q=Carglass+GmbH+Halle+%28Saale%29+%28Stadtbezirk+Nord%29&hl=en"
};

axios.post(url, data, { headers })
    .then(response => {
        console.log(response.data); // Print the response
    })
    .catch(error => {
        console.error("Error:", error.message);
    });

  • PROXY MODE - COMING SOON!

Sample Response

Markdown:

Text:
