Crawler <--> n8n

This page covers how to use the ScraperAPI Crawler within n8n. Through our community node, you can seamlessly initiate and manage crawler jobs directly from your workflow.

How it works

Crawling Workflow

  1. Add a ScraperAPI node to your workflow.

  2. Select the Crawler resource.

  3. Choose the operation you want to run:

  • Initiate a Crawler Job.

  • Get Job Status.

  • Cancel a Crawler Job.

  4. Configure crawler settings:

  • Start URL (where crawling begins).

  • Max Depth and Crawl Budget.

  • URL Regex Include (regex pattern for URLs to include). You can use a tool like regex101 to test and debug your pattern.

  • Callback Webhook URL (where to stream the results).

  5. Configure any optional parameters (see available Parameters).
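The settings above map onto a single job configuration. The sketch below assembles them into one payload the way the node might submit them; the field names (`startUrl`, `maxDepth`, `crawlBudget`, `urlRegexInclude`, `callbackWebhookUrl`) and the `build_crawl_job` helper are illustrative assumptions, not the documented ScraperAPI request format.

```python
# Illustrative sketch only: field names and payload shape are assumptions,
# not the documented ScraperAPI crawler request format.
import json


def build_crawl_job(start_url, max_depth, crawl_budget,
                    url_regex_include=None, webhook_url=None):
    """Assemble the crawler settings from the steps above into one payload."""
    job = {
        "startUrl": start_url,        # where crawling begins
        "maxDepth": max_depth,        # how many links deep to follow
        "crawlBudget": crawl_budget,  # cap on total pages fetched
    }
    if url_regex_include:
        # Only URLs matching this pattern are crawled.
        job["urlRegexInclude"] = url_regex_include
    if webhook_url:
        # Results are streamed to this endpoint as pages are crawled.
        job["callbackWebhookUrl"] = webhook_url
    return job


payload = build_crawl_job(
    start_url="https://example.com/listings",
    max_depth=2,
    crawl_budget=500,
    url_regex_include=r"https://example\.com/listings/.*",
    webhook_url="https://my-n8n-instance.example/webhook/crawler-results",
)
print(json.dumps(payload, indent=2))
```

In n8n these values are entered in the node's parameter fields rather than written by hand, but thinking of them as one payload makes it easier to see which settings constrain the crawl (depth, budget, regex) and which control delivery (the webhook).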

AI Chat Model Scraping Workflow

Integrating an AI Chat Model into your workflow unlocks prompt-driven crawling, allowing you to initiate crawls using natural language.

  • Add a Chat Message Received trigger.

  • Add an AI Agent node.

  • Connect an AI Chat Model (e.g. OpenAI) node to the Agent (Chat Model input).

  • Connect a Simple Memory node to the Agent (Memory input).

  • Connect the ScraperAPI node to the Agent (Tool input).

  • Add a system prompt to the AI Agent explaining how it should behave.
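The wording of the system prompt is up to you; one illustrative example (not an official template) might look like:

```
You are a crawling assistant. When the user asks you to find pages on a
website, call the ScraperAPI Crawler tool. Derive the Start URL from the
site the user names, keep Max Depth at 2 unless told otherwise, and
report the job ID back to the user so they can check the job's status later.
```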

The example below demonstrates how to use the ScraperAPI Crawler with n8n to crawl a real estate website's listings and find properties in Queens, New York, using natural language.