LLM Output Formats
Learn how to use ScraperAPI’s output_format parameter in Java to generate clean, structured text and markdown responses ideal for training large language models.
Training LLMs well requires large amounts of high-quality, unbiased data. Plenty of relevant public data exists, but it is often too noisy and too large to use directly. ScraperAPI addresses this by gathering data at scale and cleaning it, removing irrelevant or duplicate content. The result is a structured response that can be used to train LLMs effectively. To enable it, add the parameter output_format=text or output_format=markdown to the request. Here are some examples:
API REQUEST
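import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

// The class name is a placeholder; the original snippet omitted the class and method headers.
public class Main {
    public static void main(String[] args) {
        try {
            String apiKey = "API_KEY";
            // Replace OUTPUT_FORMAT with "text" or "markdown".
            String outputFormat = "OUTPUT_FORMAT";
            // Encode the target URL so its own query parameters (such as &hl=en)
            // are not parsed as ScraperAPI parameters.
            String targetUrl = URLEncoder.encode("https://www.google.com/search?q=Carglass+GmbH+Halle+%28Saale%29+%28Stadtbezirk+Nord%29&hl=en", "UTF-8");
            String url = "https://api.scraperapi.com?api_key=" + apiKey + "&output_format=" + outputFormat + "&url=" + targetUrl;
            URL urlForGetRequest = new URL(url);
            HttpURLConnection connection = (HttpURLConnection) urlForGetRequest.openConnection();
            connection.setRequestMethod("GET");
            int responseCode = connection.getResponseCode();
            if (responseCode == HttpURLConnection.HTTP_OK) {
                StringBuilder response = new StringBuilder();
                // try-with-resources closes the reader even if reading fails.
                try (BufferedReader in = new BufferedReader(new InputStreamReader(connection.getInputStream(), "UTF-8"))) {
                    String readLine;
                    while ((readLine = in.readLine()) != null) {
                        // Keep line breaks: they are significant in text and markdown output.
                        response.append(readLine).append('\n');
                    }
                }
                System.out.println(response.toString());
            } else {
                throw new Exception("Error in API Call: HTTP " + responseCode);
            }
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}
ASYNC REQUEST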
PROXY MODE - COMING SOON!
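Because the target URL usually carries its own query string, it can help to build the request URL in one place and percent-encode the target before appending it. The following is a minimal sketch under stated assumptions, not part of any ScraperAPI SDK: the class name OutputFormatUrlExample and the method buildRequestUrl are hypothetical, the example target URL is a placeholder, and URLEncoder.encode(String, Charset) requires Java 10 or later. The sketch accepts only the two output_format values documented on this page.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class OutputFormatUrlExample {

    // Hypothetical helper (an assumption for illustration, not a ScraperAPI API).
    public static String buildRequestUrl(String apiKey, String outputFormat, String targetUrl) {
        // Only the two values described on this page are accepted here.
        if (!outputFormat.equals("text") && !outputFormat.equals("markdown")) {
            throw new IllegalArgumentException(
                    "output_format must be \"text\" or \"markdown\", got: " + outputFormat);
        }
        // Percent-encode the target so its own "?" and "&" are not read
        // as parameters of the API request itself.
        String encodedTarget = URLEncoder.encode(targetUrl, StandardCharsets.UTF_8);
        return "https://api.scraperapi.com?api_key="
                + URLEncoder.encode(apiKey, StandardCharsets.UTF_8)
                + "&output_format=" + outputFormat
                + "&url=" + encodedTarget;
    }

    public static void main(String[] args) {
        // Placeholder target URL with its own query string.
        String target = "https://www.example.com/search?q=llm+data&hl=en";
        System.out.println(buildRequestUrl("API_KEY", "markdown", target));
        System.out.println(buildRequestUrl("API_KEY", "text", target));
    }
}
```

The returned string can be passed to new URL(...) in the same way as in the API REQUEST example; only the output_format value changes between the markdown and text variants.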
Sample Response
Markdown:
Text:

