Crawler <--> n8n
This page covers how to use the ScraperAPI Crawler within n8n. Through our community node, you can initiate, monitor, and cancel crawler jobs directly from your workflow.
Note: this integration uses the same community node as the basic ScraperAPI API resource and therefore requires a self‑hosted n8n instance. If you haven't installed the node yet or configured your API key, follow the installation instructions on the n8n Integration page.
How it works
Crawling Workflow
Add a ScraperAPI node to your workflow.
Select the Crawler resource.
Choose the operation you want to run:
Initiate a Crawler Job.
Get Job Status.
Cancel a Crawler Job.
Configure crawler settings:
Start URL (where crawling begins).
Max Depth and Crawl Budget.
URL Regex Include (regex pattern for URLs to include). Tools like regex101 can help you test and debug the pattern.
Callback Webhook URL (where to stream the results).
URL Regex Exclude - Enter a Regex pattern to skip certain URLs. Any URL that matches this pattern will not be crawled.
Schedule Interval - The interval at which the crawler will run: Once, Hourly, Daily, Weekly, Monthly.
Schedule Name - Name of the crawler.
When defining parameters, you can choose the 'Let the model define this parameter' option, and the connected AI model will automatically set the most appropriate value based on your prompt.
Configure any optional parameters (see available Parameters).
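Before starting a job, it can help to check locally which URLs a given pair of URL Regex Include and URL Regex Exclude patterns would select. The Python sketch below is only an illustration: the patterns and the example.com URLs are hypothetical, and it assumes the patterns are applied with search semantics and that exclude takes precedence over include. The crawler's actual matching rules may differ, so treat this as a quick sanity check rather than a description of the service.

```python
import re

# Hypothetical patterns, as they might be entered in the node's
# "URL Regex Include" and "URL Regex Exclude" fields.
URL_REGEX_INCLUDE = r"^https://example\.com/listings/"
URL_REGEX_EXCLUDE = r"\.(?:jpg|png|pdf)$"


def would_crawl(url, include=URL_REGEX_INCLUDE, exclude=URL_REGEX_EXCLUDE):
    """Return True if the URL matches the include pattern and not the exclude pattern.

    Assumption: search semantics, with exclude checked first.
    """
    if exclude and re.search(exclude, url):
        return False
    return re.search(include, url) is not None


sample_urls = [
    "https://example.com/listings/queens/123",
    "https://example.com/listings/photo.jpg",
    "https://example.com/about",
]

for url in sample_urls:
    print(url, would_crawl(url))
```

Once the patterns select only the URLs you expect, paste them into the corresponding fields of the ScraperAPI node.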

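To inspect what arrives at the Callback Webhook URL while testing, a small local receiver can be useful. The following Python sketch uses only the standard library and rests on two assumptions: that results are delivered as HTTP POST requests, and that the body is usually JSON. The payload structure is not described on this page, so the body is stored as-is without interpreting any fields. The names parse_callback_body, CallbackHandler, and serve are illustrative only.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Collected callback payloads, kept in memory for inspection.
received = []


def parse_callback_body(raw: bytes):
    """Decode a callback body as JSON, falling back to the raw text."""
    text = raw.decode("utf-8", errors="replace")
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return {"raw": text}


class CallbackHandler(BaseHTTPRequestHandler):
    # Assumption: the crawler delivers results via POST.
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        received.append(parse_callback_body(self.rfile.read(length)))
        self.send_response(200)
        self.end_headers()


def serve(port=8000):
    """Listen for callbacks on the given port until interrupted."""
    HTTPServer(("", port), CallbackHandler).serve_forever()
```

Calling serve() starts the listener. The receiver must be reachable from the public internet (for example, through a tunnel) for callbacks to arrive. Within n8n itself, a Webhook trigger node can serve the same purpose.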
AI Chat Model Scraping Workflow
Integrating an AI Chat Model into your workflow unlocks prompt-driven crawling, allowing you to initiate crawls using natural language.
Add a Chat Message Received trigger.
Add an AI Agent node.
Connect an AI Chat Model (e.g. OpenAI) node to the Agent (Chat Model input).
Connect a Simple Memory node to the Agent (Memory input).
Connect the ScraperAPI node to the Agent (Tool input).
Add a system prompt to the AI Agent explaining how it should behave.
The example below demonstrates how to use the ScraperAPI Crawler with n8n to crawl real estate website listings and find properties in Queens, New York using natural language.
