# July 2025

## Use the Langchain ScraperAPI package

Building your own LLM app and want to use Langchain? Great, then you should use the Langchain ScraperAPI package to access web data without getting blocked.

<a href="https://github.com/scraperapi/langchain-scraperapi" class="button primary" data-icon="hand-back-point-right">Github Repo Langchain ScraperAPI</a>

### <mark style="background-color:blue;">What you can do</mark>

| Tool class                   | Use it to                                   |
| ---------------------------- | ------------------------------------------- |
| `ScraperAPITool`             | Grab the HTML/text/markdown of any web page |
| `ScraperAPIGoogleSearchTool` | Get structured Google Search SERP data      |
| `ScraperAPIAmazonSearchTool` | Get structured Amazon product-search data   |

### <mark style="background-color:blue;">Installation</mark>

```bash
pip install -U langchain-scraperapi
```

### <mark style="background-color:blue;">Setup</mark>

Create an account at <https://www.scraperapi.com/> and get an API key, then set it as an environment variable:

```python
import os
os.environ["SCRAPERAPI_API_KEY"] = "your-api-key"
```
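As a sketch of how the tools in the table above might be wired up (the import path follows the package's GitHub repo and may change between releases; the `output_format` parameter name is an assumption, and the import is guarded so the snippet still parses if the package is not installed):

```python
# Hypothetical usage sketch for langchain-scraperapi; see the repo for the
# authoritative tool schema.
try:
    from langchain_scraperapi.tools import ScraperAPITool
except ImportError:  # package not installed; the sketch still loads
    ScraperAPITool = None


def scrape_page(url: str) -> str:
    """Fetch a page through ScraperAPI; requires SCRAPERAPI_API_KEY to be set."""
    if ScraperAPITool is None:
        raise RuntimeError("pip install -U langchain-scraperapi first")
    tool = ScraperAPITool()
    # "output_format" is an assumed parameter name.
    return tool.invoke({"url": url, "output_format": "markdown"})


# Example call (needs a real API key in the environment):
# print(scrape_page("https://example.com"))
```

The same pattern applies to `ScraperAPIGoogleSearchTool` and `ScraperAPIAmazonSearchTool`; all three can also be passed to a Langchain agent as regular tools.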

Learn more in the official Langchain ScraperAPI GitHub repo, or follow our integration guide:

<a href="https://www.scraperapi.com/integration-tutorials/langchain/" class="button primary" data-icon="hand-back-point-right">Integration Plan</a>

## Introduction to ScraperAPI Crawler v1.0

Want to crawl multiple linked pages without writing your own crawler? Our Crawler handles link discovery, scraping, retries, and webhook delivery for you. Just define the start URL, link pattern, and a crawl budget; we take care of the rest.

### <mark style="background-color:blue;">Getting Started</mark>

No installation needed! Just send a POST request to <https://crawler.scraperapi.com/job>

```bash
POST https://crawler.scraperapi.com/job
```

### <mark style="background-color:blue;">Example Payload</mark>

```json
{
  "api_key": "<YOUR API KEY>",
  "start_url": "https://www.zillow.com/homes/44269_rid/",
  "max_depth": 5,
  "crawl_budget": 50,
  "url_regexp_include": ".*", // Use .* to crawl all pages on the site.
  "url_regexp_exclude": ".*/product/.*", // Optional parameter. Leave empty to exclude nothing.
  "api_params": {
    "country_code": "us"
  },
  "callback": {
    "type": "webhook",
    "url": "<YOUR CALLBACK WEBHOOK URL>"
  },
  "enabled": true,
  "schedule": {
    "name": "NAME_OF_CRAWLER", // Name of the crawler.
    "interval": "once" // once, hourly, daily, weekly, monthly.
  }
}
```

Once the job is running, it streams results to your webhook in real time and sends you a summary when the job completes.
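The request above can be sketched with the Python standard library. The payload mirrors the example (the Zillow start URL and crawler name are the placeholder values from the payload), and nothing is sent until you actually call `urlopen` with a real API key and webhook URL:

```python
import json
import urllib.request


def build_job_request(api_key: str, webhook_url: str) -> urllib.request.Request:
    """Assemble the crawl-job POST, mirroring the example payload."""
    payload = {
        "api_key": api_key,
        "start_url": "https://www.zillow.com/homes/44269_rid/",
        "max_depth": 5,
        "crawl_budget": 50,
        "url_regexp_include": ".*",
        "url_regexp_exclude": ".*/product/.*",
        "api_params": {"country_code": "us"},
        "callback": {"type": "webhook", "url": webhook_url},
        "enabled": True,
        "schedule": {"name": "NAME_OF_CRAWLER", "interval": "once"},
    }
    return urllib.request.Request(
        "https://crawler.scraperapi.com/job",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# To actually submit the job:
# with urllib.request.urlopen(build_job_request("YOUR_KEY", "https://your.app/hook")) as resp:
#     print(resp.read())
```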

Check out the full guide and integration examples below:

<a href="/pages/x7ppoMyIbpskmqaByd91" class="button primary" data-icon="hand-back-point-right">ScraperAPI Crawler docs</a>


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.scraperapi.com/resources/release-notes/july-2025.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
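The only transformation the question needs is URL encoding; with the Python standard library, building such a query URL looks like this:

```python
from urllib.parse import quote

# Base URL of this page, taken from the GET example above.
BASE = "https://docs.scraperapi.com/resources/release-notes/july-2025.md"


def build_ask_url(question: str) -> str:
    """URL-encode a natural-language question for the ?ask= endpoint."""
    return f"{BASE}?ask={quote(question)}"


# e.g. build_ask_url("What parameters does the Crawler job accept?")
```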
