How to use

Submit an async job

The following example submits a scraping job and receives a status endpoint URL, which you can poll for the job's status and, once the job finishes, its result:

import requests

# Submit the job; the JSON response includes a statusUrl to poll.
r = requests.post(
    url='https://async.scraperapi.com/jobs',
    json={'apiKey': 'xxxxxx', 'url': 'https://example.com'},
)
print(r.text)

You can also have the Async scraper send a POST request to the target URL by adding the parameter "method": "POST" to the job, with the request data in "body". For example:

import requests

r = requests.post(
    url='https://async.scraperapi.com/jobs',
    json={'apiKey': 'xxxxxx', 'url': 'https://example.com',
          'method': 'POST', 'body': 'var1=value1&var2=value2'},
)
print(r.text)

Response:

{
"id":"0962a8e0-5f1a-4e14-bf8c-5efcc18f0953",
"status":"running",
"statusUrl":"https://async.scraperapi.com/jobs/0962a8e0-5f1a-4e14-bf8c-5efcc18f0953",
"url":"https://example.com"
}
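The `body` string in the POST example above is form-encoded. If the form fields live in a dictionary, the standard library can produce that string. This is a small sketch, assuming the target site expects `application/x-www-form-urlencoded` data; the variable names `form_fields` and `job_payload` are placeholders, not part of the API.

```python
from urllib.parse import urlencode

# Hypothetical form fields destined for the target URL.
form_fields = {"var1": "value1", "var2": "value2"}

# Job description in the same shape as the POST example above.
job_payload = {
    "apiKey": "xxxxxx",
    "url": "https://example.com",
    "method": "POST",
    # urlencode joins the pairs as key=value separated by "&"
    # and percent-encodes any special characters.
    "body": urlencode(form_fields),
}

print(job_payload["body"])
```

The resulting dictionary can be passed as the `json` argument of `requests.post`, as in the examples above.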

Note the statusUrl field in the response. This URL is unique to your job and returns its status and, later, its results. While the job is still in progress, calling that endpoint returns only the status:

import requests

r = requests.get(url='https://async.scraperapi.com/jobs/0962a8e0-5f1a-4e14-bf8c-5efcc18f0953')
print(r.text)

Response:

{
"id":"0962a8e0-5f1a-4e14-bf8c-5efcc18f0953",
"status":"running",
"statusUrl":"https://async.scraperapi.com/jobs/0962a8e0-5f1a-4e14-bf8c-5efcc18f0953",
"url":"https://example.com"
}
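Because the status endpoint keeps reporting "running" until the job completes, a client usually checks it repeatedly. The following is a minimal polling sketch, not an official client: the helper names `fetch_status` and `poll_job`, the 5-second interval, and the attempt limit are illustrative assumptions. It stops as soon as the status is anything other than "running".

```python
import time


def fetch_status(status_url):
    """Fetch the job document from its statusUrl (needs the requests package)."""
    import requests  # imported here so poll_job can be exercised without network access

    r = requests.get(status_url, timeout=30)
    r.raise_for_status()
    return r.json()


def poll_job(status_url, fetch=fetch_status, interval=5.0, max_attempts=60,
             sleep=time.sleep):
    """Poll until the job leaves the "running" state or the attempts run out."""
    for attempt in range(max_attempts):
        job = fetch(status_url)
        if job.get("status") != "running":
            return job
        # Wait before the next check, except after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError(f"job still running after {max_attempts} checks: {status_url}")
```

A call such as `poll_job(status_url)` with the `statusUrl` value from the submit response returns the job document once it is no longer running. The `fetch` and `sleep` parameters exist only so the loop can be tested without real requests or delays.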

You can include a meta object in your request to store custom data (for example, your own request ID); it is returned in the response as well.
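As a sketch of what such a request body might look like: the top-level key `meta` follows the description above, while the helper name `build_job_payload` and the `requestId` field inside the object are hypothetical placeholders for your own data.

```python
import json


def build_job_payload(api_key, url, meta=None):
    """Build the JSON body for an async job, optionally attaching custom data."""
    payload = {"apiKey": api_key, "url": url}
    if meta is not None:
        # Assumption: the object is stored with the job and echoed back as-is.
        payload["meta"] = meta
    return payload


payload = build_job_payload(
    "xxxxxx",
    "https://example.com",
    meta={"requestId": "order-1234"},  # hypothetical identifier of your own
)
print(json.dumps(payload, indent=2))
```

The resulting dictionary can be sent as the `json` argument of `requests.post` to `https://async.scraperapi.com/jobs`, in the same way as the submit example.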

Once your job is finished, the response changes to include the results of your scraping job:

{
"id":"0962a8e0-5f1a-4e14-bf8c-5efcc18f0953",
"status":"finished",
"statusUrl":"https://async.scraperapi.com/jobs/0962a8e0-5f1a-4e14-bf8c-5efcc18f0953",
"url":"https://example.com",
"response":{
"headers":{
"date":"Thu, 14 Apr 2022 11:10:44 GMT",
"content-type":"text/html; charset=utf-8",
"content-length":"1256",
"connection":"close",
"x-powered-by":"Express",
"access-control-allow-origin":"undefined",
"access-control-allow-headers":"Origin, X-Requested-With, Content-Type, Accept",
"access-control-allow-methods":"HEAD,GET,POST,DELETE,OPTIONS,PUT",
"access-control-allow-credentials":"true",
"x-robots-tag":"none",
"sa-final-url":"https://example.com/",
"sa-statuscode":"200",
"etag":"W/\"4e8-Sjzo7hHgkd15I/TYxuW15B7HwEc\"",
"vary":"Accept-Encoding"
},
"body":"<!doctype html>\n<html>\n<head>\n	<title>Example Domain</title>\n\n	<meta charset=\"utf-8\" />\n	<meta http-equiv=\"Content-type\" content=\"text/html; charset=utf-8\" />\n	<meta name=\"viewport\" content=\"width=device-width, initial-scale=1\" />\n	<style type=\"text/css\">\n	body {\n		background-color: #f0f0f2;\n		margin: 0;\n		padding: 0;\n		font-family: -apple-system, system-ui, BlinkMacSystemFont, \"Segoe UI\", \"Open Sans\", \"Helvetica Neue\", Helvetica, Arial, sans-serif;\n		\n	}\n	div {\n		width: 600px;\n		margin: 5em auto;\n		padding: 2em;\n		background-color: #fdfdff;\n		border-radius: 0.5em;\n		box-shadow: 2px 3px 7px 2px rgba(0,0,0,0.02);\n	}\n	a:link, a:visited {\n		color: #38488f;\n		text-decoration: none;\n	}\n	@media (max-width: 700px) {\n		div {\n			margin: 0 auto;\n			width: auto;\n		}\n	}\n	</style>	\n</head>\n\n<body>\n<div>\n	<h1>Example Domain</h1>\n	<p>This domain is for use in illustrative examples in documents. You may use this\n	domain in literature without prior coordination or asking for permission.</p>\n	<p><a href=\"https://www.iana.org/domains/example\">More information...</a></p>\n</div>\n</body>\n</html>\n",
"statusCode":200
}
}
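The scraped page sits under `response.body` in the finished job, next to the target site's `statusCode` and `headers`. A minimal sketch of pulling those fields out of the parsed JSON follows; the helper name `extract_result` is an illustrative choice, and the sample dictionary is a trimmed copy of the response above with a shortened placeholder body.

```python
def extract_result(job):
    """Return (status_code, content_type, body) from a finished job document."""
    if job.get("status") != "finished":
        raise ValueError(
            f"job {job.get('id')} is not finished (status={job.get('status')!r})"
        )
    response = job["response"]
    content_type = response.get("headers", {}).get("content-type", "")
    return response["statusCode"], content_type, response["body"]


# Trimmed copy of the finished response shown above (body shortened).
finished_job = {
    "id": "0962a8e0-5f1a-4e14-bf8c-5efcc18f0953",
    "status": "finished",
    "url": "https://example.com",
    "response": {
        "headers": {"content-type": "text/html; charset=utf-8"},
        "body": "<!doctype html>\n<html>...</html>\n",
        "statusCode": 200,
    },
}

status_code, content_type, body = extract_result(finished_job)
print(status_code, content_type, len(body))
```

In practice the input would be the parsed JSON returned by the status URL, for example `requests.get(status_url).json()`.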

Please note that the response for an Async job is stored for up to 72 hours (24 hours guaranteed) or until you retrieve the results, whichever comes first. If you do not retrieve the response within that window, it is deleted on our side and you will have to submit the job again.
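Because a result may no longer be available once it has been retrieved, it can be worth persisting the finished job document as soon as you fetch it. This is a minimal sketch, assuming a local directory is an acceptable store; the helper name `save_job_result` and the `<id>.json` file naming are illustrative choices, not part of the API.

```python
import json
from pathlib import Path


def save_job_result(job, directory="results"):
    """Write a job document to <directory>/<job id>.json and return the path."""
    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{job['id']}.json"
    # ensure_ascii=False keeps non-ASCII page content readable in the file.
    path.write_text(json.dumps(job, ensure_ascii=False), encoding="utf-8")
    return path
```

Calling `save_job_result(job)` right after a status request returns a finished job keeps a local copy that does not depend on the retention window.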

If callbacks are used and the results are successfully delivered, we automatically delete the results.
