For selected domains, we offer an autoparse parameter that parses the page and returns the data as structured JSON.
To enable parsing, add autoparse=true to your request.
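Because an autoparse response is expected to be JSON, client code may want to decode it defensively rather than assume every body parses. The following is a minimal sketch: the helper name `decode_autoparse_body` and the sample payload fields are hypothetical and do not reflect the API's actual response schema.

```python
import json


def decode_autoparse_body(body: str):
    """Return the parsed JSON value, or None if the body is not valid JSON.

    Hypothetical helper for illustration; no response schema is assumed.
    """
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        return None


if __name__ == "__main__":
    # Illustrative payload only; real field names may differ.
    sample = '{"name": "Example product", "pricing": "$10.00"}'
    print(decode_autoparse_body(sample))
    # A non-JSON body (for example raw HTML) yields None instead of raising.
    print(decode_autoparse_body("<html>not parsed</html>"))
```

Returning None keeps the calling code simple: it can fall back to its own HTML parsing when the body was not returned as JSON.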
Available domains and page types:
Google: Search Results, News Results, Job Results, Shopping Results, Google Maps
Amazon: Product Pages, Search Results, Offers, Product Reviews
Walmart: Product Pages, Category Pages, Search Results
eBay: Product Pages, Search Results
Redfin: 'For Sale' Listings
API REQUEST
import requests
payload = {'api_key': 'APIKEY', 'autoparse': 'true', 'url':'https://www.amazon.com/dp/B07V1PHM66'}
r = requests.get('https://api.scraperapi.com', params=payload)
print(r.text)
# Scrapy users can simply replace the URLs in their start_urls and parse function
# ...other scrapy setup code
# Note the '&' before autoparse: it separates the parameter from the target URL.
start_urls = ['https://api.scraperapi.com?api_key=APIKEY&url=' + url + '&autoparse=true']
def parse(self, response):
    # ...your parsing logic here
    yield scrapy.Request('http://api.scraperapi.com/?api_key=APIKEY&url=' + url + '&autoparse=true', self.parse)
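When the request URL is built by string concatenation, a target URL that contains its own query string can collide with the API's parameters. One way to avoid this is to let the standard library encode the parameters. This is a sketch, not the documented method: `build_api_url` and `API_ENDPOINT` are hypothetical names, and only the `api_key`, `url` and `autoparse` parameters shown in the examples above are used.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Endpoint as used in the examples above.
API_ENDPOINT = "https://api.scraperapi.com/"


def build_api_url(api_key: str, target_url: str, autoparse: bool = True) -> str:
    """Build an API request URL with the target URL safely percent-encoded."""
    params = {"api_key": api_key, "url": target_url}
    if autoparse:
        params["autoparse"] = "true"
    # urlencode escapes '&', '?', '=' and ':' inside the target URL.
    return API_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    request_url = build_api_url("APIKEY", "https://www.amazon.com/dp/B07V1PHM66")
    print(request_url)
    # Round trip: the decoded query contains the original target URL.
    print(parse_qs(urlsplit(request_url).query)["url"][0])
```

The resulting string can be placed in start_urls or passed to scrapy.Request in place of the concatenated URL.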
PROXY MODE
import requests
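# requests selects the proxy by the scheme of the target URL,
# so an "https" entry is needed for https:// targets.
proxies = {
    "http": "http://scraperapi.autoparse=true:APIKEY@proxy-server.scraperapi.com:8001",
    "https": "http://scraperapi.autoparse=true:APIKEY@proxy-server.scraperapi.com:8001"
}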
r = requests.get('https://www.amazon.com/dp/B07V1PHM66', proxies=proxies, verify=False)
print(r.text)
# Scrapy users can likewise pass their API key in the proxy URL via meta.
# NB: Scrapy skips SSL verification by default.
# ...other scrapy setup code
start_urls = ['https://www.amazon.com/dp/B07V1PHM66']
meta = {
    "proxy": "http://scraperapi.autoparse=true:APIKEY@proxy-server.scraperapi.com:8001"
}
def parse(self, response):
    # ...your parsing logic here
    yield scrapy.Request(url, callback=self.parse, meta=meta)
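In proxy mode the autoparse flag travels in the proxy username (`scraperapi.autoparse=true`), so it can help to assemble the proxy URL in one place. The sketch below uses hypothetical helper names (`build_proxy_url`, `build_proxies`). The plain `scraperapi` username used when autoparse is off is an assumption, not something shown in the examples above.

```python
# Host and port as used in the proxy examples above.
PROXY_HOST = "proxy-server.scraperapi.com"
PROXY_PORT = 8001


def build_proxy_url(api_key: str, autoparse: bool = True) -> str:
    """Return the proxy URL, with the autoparse flag encoded in the username."""
    # Assumption: without autoparse the username is plain "scraperapi".
    username = "scraperapi.autoparse=true" if autoparse else "scraperapi"
    return f"http://{username}:{api_key}@{PROXY_HOST}:{PROXY_PORT}"


def build_proxies(api_key: str, autoparse: bool = True) -> dict:
    """Return a requests-style proxies mapping covering both URL schemes."""
    proxy = build_proxy_url(api_key, autoparse)
    return {"http": proxy, "https": proxy}


if __name__ == "__main__":
    print(build_proxy_url("APIKEY"))
    print(build_proxies("APIKEY"))
```

The mapping from `build_proxies` can be passed as `proxies=` to requests.get, and the string from `build_proxy_url` can be used as the `proxy` value in a Scrapy request's meta.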
SDK METHOD
from scraperapi_sdk import ScraperAPIClient
client = ScraperAPIClient('APIKEY')
result = client.get(url='https://www.amazon.com/dp/B07V1PHM66', autoparse=True).text
print(result)
# Scrapy users can simply replace the URLs in their start_urls and parse function
# Note for Scrapy: do not use DOWNLOAD_DELAY or RANDOMIZE_DOWNLOAD_DELAY.
# They lower your concurrency and are not needed with our API.
# ...other scrapy setup code
start_urls = [client.scrapyGet(url='https://www.amazon.com/dp/B07V1PHM66', autoparse=True)]
def parse(self, response):
    # ...your parsing logic here
    yield scrapy.Request(client.scrapyGet(url='https://www.amazon.com/dp/B07V1PHM66', autoparse=True), self.parse)
We recommend using our Structured Data Endpoints instead of the autoparse parameter.