Learn how to make requests using ScraperAPI. Sign up for a free trial to get 5,000 free API credits.
Using ScraperAPI is easy. Just send the URL you would like to scrape to the API along with your API key, and the API will return the HTML response for that URL.
ScraperAPI uses API keys to authenticate requests. To use the API you need to sign up for an account and include your unique API key in every request.
If you haven’t signed up for an account yet, then sign up for a free trial here with 5,000 free API credits!
You can use the API to scrape web pages, API endpoints, images, documents, PDFs, or other files just as you would any other URL. Note: there is a 2MB limit per request.
There are five ways in which you can send GET requests to ScraperAPI:
Via our Async Scraper service http://async.scraperapi.com
Via our API endpoint http://api.scraperapi.com?
Via one of our SDKs (only available for some programming languages)
Via our proxy port http://scraperapi:APIKEY@proxy-server.scraperapi.com:8001
Via our Structured Data service https://api.scraperapi.com/structured/
Choose whichever option best suits your scraping requirements.
Important note: regardless of how you invoke the service, we highly recommend you set a 70-second timeout in your application to get the best possible success rates, especially for hard-to-scrape domains.
ScraperAPI exposes a single API endpoint for you to send GET requests to. Simply send a GET request to http://api.scraperapi.com with two query string parameters and the API will return the HTML response for that URL:
api_key, which contains your API key, and
url, which contains the URL you would like to scrape
You should format your requests to the API endpoint as follows:
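For instance, a minimal request in Python might look like this sketch (APIKEY is a placeholder for your own key, and https://httpbin.org/ip is just a sample target):

```python
import requests

payload = {
    "api_key": "APIKEY",              # your ScraperAPI key
    "url": "https://httpbin.org/ip",  # the page you want to scrape
}

response = requests.get("http://api.scraperapi.com/", params=payload)
print(response.text)  # the HTML returned by the target URL
```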
To enable other API functionality when sending a request to the API endpoint, simply add the appropriate query parameters to the end of the ScraperAPI URL. For example, if you want to enable JavaScript rendering with a request, then add render=true to the request. To use two or more parameters, simply separate them with the “&” sign.
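As a sketch, enabling JavaScript rendering and adding a second parameter just means appending more query parameters to the same request (country_code is shown only as an illustration of combining parameters):

```python
import requests

payload = {
    "api_key": "APIKEY",
    "url": "https://httpbin.org/ip",
    "render": "true",        # enable JavaScript rendering
    "country_code": "us",    # a second parameter, joined with "&" in the URL
}

response = requests.get("http://api.scraperapi.com/", params=payload)
print(response.text)
```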
To ensure a higher level of successful requests when using our scraper, we’ve built a new product, Async Scraper. Rather than making requests to our endpoint and waiting for the response, this endpoint lets you submit a scraping job, and you can later collect the data from it using our status endpoint.
Scraping websites can be a difficult process; it takes numerous steps and significant effort to get through some sites’ protection, which sometimes proves difficult within the timeout constraints of synchronous APIs. The Async Scraper will keep working on your requested URLs until we have achieved a 100% success rate (when applicable), then return the data to you.
Async Scraping is the recommended way to scrape pages when success rate on difficult sites is more important to you than response time (e.g. you need a set of data periodically).
Submit an async job
A simple example showing how to submit a job for scraping and receive a status endpoint URL through which you can poll for the status (and later the result) of your scraping job:
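A minimal sketch in Python, assuming the job-submission endpoint lives at https://async.scraperapi.com/jobs and accepts the API key in an apiKey field (check the exact field names against your dashboard docs):

```python
import requests

# Sketch: submit an async scraping job (endpoint path and field names assumed).
job = requests.post(
    "https://async.scraperapi.com/jobs",
    json={
        "apiKey": "APIKEY",             # your ScraperAPI key
        "url": "https://example.com/",  # the page you want scraped
    },
).json()

print(job["statusUrl"])  # poll this URL for the job status and, later, the result
```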
You can also send POST requests to the Async Scraper by using the parameter “method”: “POST”. Here is an example of how to make a POST request to the Async Scraper:
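A sketch of the same submission, this time asking the Async Scraper to issue a POST request to the target URL (same assumptions about the endpoint and field names as above):

```python
import requests

# Sketch: "method": "POST" makes the scraper send a POST request to the target.
job = requests.post(
    "https://async.scraperapi.com/jobs",
    json={
        "apiKey": "APIKEY",
        "url": "https://example.com/api/endpoint",  # hypothetical POST target
        "method": "POST",
    },
).json()
```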
Response:
Note the statusUrl field in the response. That is a personal URL to retrieve the status and results of your scraping job. Invoking that endpoint provides you with the status first:
Response:
You can include a meta object in your request to store custom data (your own request ID, for example), which will be returned in the response as well.
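As a sketch, the meta object simply rides along with the rest of the job payload (field names other than meta are assumptions carried over from the earlier examples):

```python
import requests

# Sketch: attach custom data via "meta"; it is echoed back in the job's responses.
job = requests.post(
    "https://async.scraperapi.com/jobs",
    json={
        "apiKey": "APIKEY",
        "url": "https://example.com/",
        "meta": {"request_id": "my-internal-id-123"},  # hypothetical custom data
    },
).json()
```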
Once your job is finished, the response will change and will contain the results of your scraping job:
Please note that the response for an Async job is stored for up to 72 hours (24 hours guaranteed) or until you retrieve the results, whichever comes first. If you do not retrieve the results in time, they will be deleted from our side and you will have to send another request for the same job.
If callbacks are used and the results are successfully delivered, we automatically delete the results.
Using a status URL is a great way to test the API or get started quickly, but some customer environments may require more robust solutions, so we implemented callbacks. Currently only webhook callbacks are supported, but we are planning to introduce more over time (e.g. direct database callbacks, AWS S3, etc.).
An example of using a webhook callback:
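A sketch of a submission that registers a webhook callback. The callback object with type and url fields is an assumption based on the description above, and https://yourcompany.com/scraperapi is a placeholder:

```python
import requests

# Sketch: register a webhook so the finished result is pushed to your endpoint
# instead of (or in addition to) being polled via the status URL.
job = requests.post(
    "https://async.scraperapi.com/jobs",
    json={
        "apiKey": "APIKEY",
        "url": "https://example.com/",
        "callback": {
            "type": "webhook",                            # assumed callback type field
            "url": "https://yourcompany.com/scraperapi",  # your publicly reachable endpoint
        },
    },
).json()
```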
Using a callback, you don’t need to use the status URL (although you still can) to fetch the status and results of the job. Once the job is finished, the provided webhook URL will be invoked by our system with the same content the status URL provides.
Just replace the https://yourcompany.com/scraperapi URL with your preferred endpoint. You can even add basic auth to the URL in the following format: https://user:pass@yourcompany.com/scraperapi
By default, we'll call the webhook URL you provide for successful requests. If you'd like to receive data on failed attempts too, you will have to include the expectsUnsuccessReport: true parameter in your request structure.
An example of using callbacks that report on the failed attempts as well:
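A sketch of the same callback setup with failure reporting enabled (same assumptions as the previous example):

```python
import requests

# Sketch: expectsUnsuccessReport=true asks the service to call the webhook even
# when all scraping attempts for the job have failed.
job = requests.post(
    "https://async.scraperapi.com/jobs",
    json={
        "apiKey": "APIKEY",
        "url": "https://example.com/",
        "callback": {
            "type": "webhook",
            "url": "https://yourcompany.com/scraperapi",
        },
        "expectsUnsuccessReport": True,
    },
).json()
```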
Response:
Note: The system will try to invoke the webhook URL 3 times, then it cancels the job. So please make sure that the webhook URL is reachable from the public internet and is capable of handling the traffic that you need.
Hint: Webhook.site is a free online service to test webhooks without requiring you to build a complex infrastructure.
You can use the usual API parameters the same way you’d use them with our synchronous API. These parameters should go into an apiParams object inside the POST data, e.g.:
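A sketch of a job that forwards synchronous-API parameters through apiParams (render and country_code are shown purely as illustrations of parameters you might pass):

```python
import requests

# Sketch: the usual API parameters go into an "apiParams" object in the POST body.
job = requests.post(
    "https://async.scraperapi.com/jobs",
    json={
        "apiKey": "APIKEY",
        "url": "https://example.com/",
        "apiParams": {
            "render": True,         # JavaScript rendering
            "country_code": "us",   # geotargeting, shown as an illustration
        },
    },
).json()
```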
We have created a separate endpoint that accepts an array of URLs instead of just one to initiate scraping of multiple URLs at the same time: https://async.scraperapi.com/batchjobs. The API is almost the same as the single endpoint, but we expect an array of strings in the urls field instead of a string in url.
As a response you’ll also get an array of the same response that you get using our single job endpoint:
We recommend sending a maximum of 50,000 URLs in one batch job.
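A sketch of a batch submission, assuming the same apiKey field as in the single-job examples:

```python
import requests

# Sketch: submit several URLs at once; the response is an array of job objects.
jobs = requests.post(
    "https://async.scraperapi.com/batchjobs",
    json={
        "apiKey": "APIKEY",
        "urls": [
            "https://example.com/page-1",
            "https://example.com/page-2",
        ],
    },
).json()

for job in jobs:
    print(job["statusUrl"])  # one status URL per submitted URL
```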
The responses returned by the Async API for binary requests require you to decode the data, as they are encoded using Base64 encoding. This allows the binary data to be sent as a text string, which can then be decoded back into its original form when you want to use it.
Example request and decoding of the response:
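A sketch of requesting a binary file and decoding the result; the exact location of the Base64-encoded body in the finished-job payload is an assumption here, so adjust the path to match the response you actually receive:

```python
import base64
import requests

# Sketch: scrape a PDF through the Async Scraper, then decode the Base64 body.
job = requests.post(
    "https://async.scraperapi.com/jobs",
    json={"apiKey": "APIKEY", "url": "https://example.com/report.pdf"},
).json()

# ... poll job["statusUrl"] until the job reports it has finished ...
result = requests.get(job["statusUrl"]).json()

# Assumption: the encoded body lives at result["response"]["body"].
pdf_bytes = base64.b64decode(result["response"]["body"])
with open("report.pdf", "wb") as f:
    f.write(pdf_bytes)
```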
To make it even easier to get structured content, we created custom endpoints within our API that provide a shorthand method of retrieving content from supported domains. This method is ideal for users that need to receive structured data in JSON or CSV.
Amazon Endpoints:
Ebay Endpoints:
Google Endpoints:
Redfin Endpoints:
Walmart Endpoints:
This endpoint will retrieve product data from an Amazon product page and transform it into usable JSON. It also provides links to all variants of the product (if any).
API_KEY (required)
User's normal API Key
ASIN (required)
Amazon Standard Identification Number. Please note that ASINs are market specific (TLD). You can usually find the ASIN in the URL of an Amazon product (example: B07FTKQ97Q)
TLD
Amazon market to be scraped.
Valid values include:
com (amazon.com)
co.uk (amazon.co.uk)
ca (amazon.ca)
de (amazon.de)
es (amazon.es)
fr (amazon.fr)
it (amazon.it)
co.jp (amazon.co.jp)
in (amazon.in)
cn (amazon.cn)
com.sg (amazon.com.sg)
com.mx (amazon.com.mx)
ae (amazon.ae)
com.br (amazon.com.br)
nl (amazon.nl)
com.au (amazon.com.au)
com.tr (amazon.com.tr)
sa (amazon.sa)
se (amazon.se)
pl (amazon.pl)
COUNTRY
Valid values are two-letter country codes for which we offer Geo Targeting (e.g. “au”, “es”, “it”, etc.).
Where an Amazon domain needs to be scraped from another country (e.g. scraping amazon.com from Canada to get Canadian shipping information), both TLD and COUNTRY parameters must be specified.
OUTPUT_FORMAT
For structured data methods we offer CSV and JSON output. JSON is the default if the parameter is not added. Options:
csv
json
ZIP Code Targeting
To find out more about ZIP Code targeting, please follow this link
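Putting the parameters above together, a request to the Amazon product endpoint might look like the sketch below. The amazon/product path and the lower-case query parameter names are assumptions based on the https://api.scraperapi.com/structured/ pattern described earlier; B07FTKQ97Q is the sample ASIN from the parameter table.

```python
import requests

payload = {
    "api_key": "APIKEY",
    "asin": "B07FTKQ97Q",     # sample ASIN
    "tld": "com",             # amazon.com
    "country": "us",          # optional geotargeting
    "output_format": "json",  # or "csv"
}

# Assumption: the Amazon product endpoint is .../structured/amazon/product.
response = requests.get(
    "https://api.scraperapi.com/structured/amazon/product", params=payload
)
print(response.json())
```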