Endpoints and Parameters 🧩

The following section provides a list of the current API endpoints and parameters. Explore these options to tailor your experience and optimize your integration with our DataPipeline.

List of Endpoints

Create a project
POST   <https://datapipeline.scraperapi.com/api/projects>

Get a single project
GET    <https://datapipeline.scraperapi.com/api/projects/:id>

Update a project
PATCH  <https://datapipeline.scraperapi.com/api/projects/:id>

Archive a project
DELETE <https://datapipeline.scraperapi.com/api/projects/:id>   

List projects
GET    <https://datapipeline.scraperapi.com/api/projects>

List the jobs of a project
GET    <https://datapipeline.scraperapi.com/api/projects/:id/jobs>

Cancel a job
DELETE <https://datapipeline.scraperapi.com/api/projects/:id/jobs/:jobId>
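As a quick illustration, the sketch below lists your projects and fetches a single one with Python's requests library. It assumes your ScraperAPI key is sent as an api_key query parameter; the exact authentication mechanism and response shapes are described elsewhere in these docs.

```python
import requests

API_KEY = "YOUR_API_KEY"  # assumption: the key is sent as an api_key query parameter
BASE_URL = "https://datapipeline.scraperapi.com/api"

# List all projects
resp = requests.get(f"{BASE_URL}/projects", params={"api_key": API_KEY})
resp.raise_for_status()
print(resp.json())

# Fetch a single project by id (1234 is a placeholder)
resp = requests.get(f"{BASE_URL}/projects/1234", params={"api_key": API_KEY})
resp.raise_for_status()
print(resp.json())
```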

Fields / Parameters

name

The project name is purely for display purposes.

schedulingEnabled

true / false

Enables or disables automatic rescheduling of a project. If set to false, the project will not be rescheduled after its next run (if applicable).

scrapingInterval

- hourly
- daily
- weekly
- monthly
- a cron expression such as "10 * * * *"

scheduledAt

The next project runtime, specified in ISO 8601 format. Use "now" to start the project at the earliest possible time.
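As a rough sketch of how the scheduling fields fit together, the request below enables scheduling, switches a project to a weekly interval, and runs it as soon as possible. The project id and the api_key query parameter are placeholders/assumptions.

```python
import requests

API_KEY = "YOUR_API_KEY"   # assumption: the key is sent as an api_key query parameter
PROJECT_ID = 1234          # placeholder project id

payload = {
    "schedulingEnabled": True,     # keep rescheduling the project automatically
    "scrapingInterval": "weekly",  # or "hourly", "daily", "monthly", or a cron expression
    "scheduledAt": "now",          # run the next job as soon as possible
}

resp = requests.patch(
    f"https://datapipeline.scraperapi.com/api/projects/{PROJECT_ID}",
    params={"api_key": API_KEY},
    json=payload,
)
resp.raise_for_status()
```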

createdAt

The ISO 8601 representation of the date when the project was created.

projectType

This can be urls for simple URL scraping, or one of a number of other values if you need structured data (JSON, CSV).

Valid projectType parameters:

- urls
- amazon_product
- amazon_offer
- amazon_review
- amazon_search
- google_jobs
- google_news
- google_search
- google_shopping
- walmart_category
- walmart_product
- walmart_search

projectInput

The source of the list of URLs, search terms etc. for the project.

There are two valid variants:

- A simple list (this is the default): {"type": "list", "list": ["https://example.com", "https://httpbin.org"]}
- Webhook input. The address of the webhook has to return a newline-delimited list of search terms, ASINs, etc.: {"type": "webhook_input", "url": "https://the.url.where.your.list.is"}
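Putting projectType and projectInput together, creating a simple URL-scraping project could look roughly like the sketch below. The field names come from this page; the api_key query parameter and the response handling are assumptions.

```python
import requests

API_KEY = "YOUR_API_KEY"  # assumption: the key is sent as an api_key query parameter

payload = {
    "name": "My URL project",
    "projectType": "urls",            # simple URL scraping
    "projectInput": {
        "type": "list",               # the default, inline-list variant
        "list": ["https://example.com", "https://httpbin.org"],
    },
    "scrapingInterval": "daily",
    "scheduledAt": "now",
}

resp = requests.post(
    "https://datapipeline.scraperapi.com/api/projects",
    params={"api_key": API_KEY},
    json=payload,
)
print(resp.json())
```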

webhookOutput

The results are always saved, and you can download them through the results link or through the UI. If you want to get a webhook callback from your jobs, you can add an optional parameter:

"webhookOutput": { "url": "<url>" }

If you want to encode the results as multipart/form-data (for example because you use webhook.site), you can use the extra webhookEncoding parameter like this:

"webhookOutput": { "url": "<url>", "webhookEncoding": "multipart_form_data_encoding" }

To unset a webhookOutput, set its value to null:

"webhookOutput": null
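For example, attaching and later removing a webhook callback might look like this sketch (the project id, the webhook URL, and the api_key query parameter are placeholders/assumptions):

```python
import requests

API_KEY = "YOUR_API_KEY"   # assumption: the key is sent as an api_key query parameter
PROJECT_ID = 1234          # placeholder project id
PROJECT_URL = f"https://datapipeline.scraperapi.com/api/projects/{PROJECT_ID}"

# Add a webhook callback, encoded as multipart/form-data (useful e.g. for webhook.site)
requests.patch(PROJECT_URL, params={"api_key": API_KEY}, json={
    "webhookOutput": {
        "url": "https://webhook.site/your-endpoint",          # placeholder URL
        "webhookEncoding": "multipart_form_data_encoding",
    },
})

# Unset the webhook again: null in JSON is None in Python
requests.patch(PROJECT_URL, params={"api_key": API_KEY}, json={"webhookOutput": None})
```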

notificationConfig

Configure when to send email notifications about finished jobs.

- notifyOnSuccess - valid values: never, with_every_run, daily, weekly

- notifyOnFailure - valid values: with_every_run, daily, weekly
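For instance, to get a daily digest of successful runs but an email for every failure, the update could look like this sketch (the project id and the api_key query parameter are assumptions):

```python
import requests

API_KEY = "YOUR_API_KEY"   # assumption: the key is sent as an api_key query parameter
PROJECT_ID = 1234          # placeholder project id

payload = {
    "notificationConfig": {
        "notifyOnSuccess": "daily",            # summary email once a day
        "notifyOnFailure": "with_every_run",   # email on every failed run
    },
}

requests.patch(
    f"https://datapipeline.scraperapi.com/api/projects/{PROJECT_ID}",
    params={"api_key": API_KEY},
    json=payload,
)
```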

apiParams

output_format

csv / json. Sets the output format you want to get for structured data (SDE) projects.
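As a hedged sketch, a structured data project that returns CSV could be created like this (the api_key query parameter and the exact placement of apiParams in the request body are assumptions based on this page):

```python
import requests

API_KEY = "YOUR_API_KEY"  # assumption: the key is sent as an api_key query parameter

payload = {
    "name": "Amazon search as CSV",
    "projectType": "amazon_search",          # a structured-data (SDE) project type
    "projectInput": {"type": "list", "list": ["wireless headphones"]},
    "apiParams": {"output_format": "csv"},   # ask for CSV instead of JSON
}

resp = requests.post(
    "https://datapipeline.scraperapi.com/api/projects",
    params={"api_key": API_KEY},
    json=payload,
)
print(resp.status_code)
```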
