Self-Hosted (Local)

Prerequisites

  • ScraperAPI Account

  • Python 3.11+

  • Claude (used in this guide)

  • Docker (optional)

Install the scraperapi-mcp-server using pip:

pip install scraperapi-mcp-server

Setup

If you don't have an account with us yet, head over to scraperapi.com to create one and grab your API key from the Dashboard area. You will need it to authenticate the requests that your LLM client will be making.

Now that you have your API Key, add this JSON block to your client configuration file:

Using Python:

{
  "mcpServers": {
    "ScraperAPI": {
      "command": "python",
      "args": ["-m", "scraperapi_mcp_server"],
      "env": {
        "API_KEY": "<YOUR_SCRAPERAPI_API_KEY>"
      }
    }
  }
}

Using Docker:
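
If you run the server in a container instead, point the client at the docker command. The image name below (scraperapi-mcp-server) is an assumption; substitute whatever image you have built or pulled for the server:

```json
{
  "mcpServers": {
    "ScraperAPI": {
      "command": "docker",
      "args": ["run", "-i", "--rm", "-e", "API_KEY", "scraperapi-mcp-server"],
      "env": {
        "API_KEY": "<YOUR_SCRAPERAPI_API_KEY>"
      }
    }
  }
}
```

Passing -e API_KEY with no value forwards the variable from the env block into the container.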

Configuration for Claude Desktop App:

  1. Open the Claude Desktop application.

  2. Click the settings icon (typically a gear or three dots in the upper-right corner) to open the Settings menu.

  3. Select the "Developer" tab.

  4. Click "Edit Config" and paste the JSON block into the configuration file.

That's it: the MCP server is fully configured! Include the keyword scrape in a prompt, and the LLM will automatically use ScraperAPI to retrieve the data you need.

If you are interested in digging deeper, you can set up the MCP server locally on your machine. Here you will find instructions on how to set up, debug, and customize the server for advanced development.

Parameters

scrape (required)

Tells the LLM to scrape a URL from the internet using ScraperAPI.

url (required)

URL you wish to scrape.

render (optional)

Defaults to False. Set to True if the page requires JavaScript rendering to display its contents.

country_code (optional)

Activate country geotargeting (e.g. “us”, “es”, “uk”, etc.).

premium (optional)

Set to True to use residential IPs with your scrapes.

ultra_premium (optional)

Activates advanced bypass mechanisms when set to True. Cannot be combined with premium.

device_type (optional)

Defaults to desktop. Set to mobile to use mobile user agents with the scrapes.
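Under the hood, these tool parameters map onto query parameters in ScraperAPI's HTTP API. A minimal sketch of that mapping (build_params is a hypothetical helper for illustration, not part of scraperapi-mcp-server):

```python
def build_params(api_key, url, render=False, country_code=None,
                 premium=False, ultra_premium=False, device_type="desktop"):
    """Translate the tool's arguments into ScraperAPI query parameters."""
    if premium and ultra_premium:
        # premium and ultra_premium cannot be combined
        raise ValueError("premium and ultra_premium cannot be combined")
    params = {"api_key": api_key, "url": url}
    if render:
        params["render"] = "true"           # enable JS rendering
    if country_code:
        params["country_code"] = country_code  # geotargeting, e.g. "us"
    if premium:
        params["premium"] = "true"          # residential IPs
    if ultra_premium:
        params["ultra_premium"] = "true"    # advanced bypass mechanisms
    if device_type != "desktop":
        params["device_type"] = device_type  # e.g. "mobile"
    return params

# Example: a JS-rendered scrape geotargeted to the US
params = build_params("KEY", "https://example.com",
                      render=True, country_code="us")
```

Only the non-default options end up in the query string, which keeps the plain request as lightweight as possible.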

Prompt Templates

  • Please scrape this URL <URL>. If you receive a 500 server error, identify the website's geo-targeting and add the corresponding country_code to overcome geo-restrictions. If errors continue, upgrade the request to use premium proxies by adding premium=true. For persistent failures, activate ultra_premium=true to use enhanced anti-blocking measures.

  • Can you scrape URL <URL> to extract <SPECIFIC_DATA>? If the request returns missing or incomplete <SPECIFIC_DATA>, set render=true to enable JS Rendering.
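The escalation order in the first template can be sketched as a simple fallback loop. Here fetch is a stand-in for whatever function actually issues the ScraperAPI request, and "us" stands in for the site's actual country:

```python
def scrape_with_fallbacks(fetch, url):
    """Mirror the prompt template's escalation order:
    plain request -> country_code -> premium -> ultra_premium."""
    attempts = [
        {},                          # plain request, no extra parameters
        {"country_code": "us"},      # geotargeting for the site's country
        {"premium": "true"},         # residential IPs
        {"ultra_premium": "true"},   # advanced bypass mechanisms
    ]
    for extra in attempts:
        body, status = fetch(url, **extra)
        if status == 200:
            return body
    raise RuntimeError("all fallback attempts failed for " + url)
```

In practice the LLM performs this escalation itself when prompted as above; the loop just makes the order explicit.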

Example Prompt (Claude)

"Please scrape this URL https://www.lowes.com/pd/Kozyard-12-ft-x-16-ft-Gazebo-Dark-Brown-Metal-Square-Screened-Gazebo-with-Steel-Roof/5014900669. If you receive a 500 server error, identify the website's geo-targeting and add the corresponding country_code to overcome geo-restrictions. If errors continue, upgrade the request to use premium proxies by adding premium=true. For persistent failures, activate ultra_premium=true to use enhanced anti-blocking measures. Get me the price of the product."

Response

The LLM first attempted to scrape the product page with a flat request (no extra parameters), but it was unable to find the price information requested in the prompt. It then identified that the price was likely being loaded dynamically via JavaScript and determined that enabling JS Rendering (render=true) would be the next step.

The LLM then retried the request, this time with render=true and successfully retrieved the price information.

The full set of price options, product details, and availability were successfully extracted once JavaScript Rendering was applied.

This illustrates why ScraperAPI is essential to the process. Without the smart proxies, user-agent rotation, and JavaScript Rendering, the LLM would not have been able to access the webpage without disruption and capture the price details successfully.
