Pipeliner API

The pipeliner API is the main method for interacting with data pipeliner. It allows users to generate code from a mapping specification as well as validate their mapping specification.

To simplify interactions with the pipeliner API, we have created the pipeliner CLI. It is the recommended approach over calling the API endpoints directly.

Authentication

Authentication to the pipeliner API uses an API key, sent in the X-API-Key request header. You can obtain an API key from the Your Account page. A valid API key is required for all requests to the pipeliner API.
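As a minimal sketch of how a client might attach the key, the Python below reads it from an environment variable and builds the request headers. The helper name auth_headers is hypothetical, and the PIPELINER_API_KEY variable name is an assumption borrowed from the curl example on this page; only the X-API-Key header itself comes from the request format.

```python
import os


def auth_headers(env=os.environ):
    """Build request headers carrying the pipeliner API key."""
    # PIPELINER_API_KEY is an assumed variable name, not a documented one.
    api_key = env.get("PIPELINER_API_KEY", "").strip()
    if not api_key:
        # Fail early rather than sending a request the API would reject.
        raise RuntimeError("PIPELINER_API_KEY is not set")
    return {"Content-Type": "application/json", "X-API-Key": api_key}
```

Reading the key from the environment keeps it out of source control and shell history.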

Command Reference

submit

Description

The submit endpoint is used to auto-generate ETL code from a mapping specification. It will create a pull request in the specified GitHub repository with the ETL code as per the mapping specification.

Request

POST /engine/submit HTTP/1.1
Host: api.datapipeliner.io
Content-Type: application/json
X-API-Key: <pipeliner api key string>

{
  "file_content": "Base64EncodedFileData",
  "base_branch": "main",
  "branch": "pipeliner-code",
  "repo_name": "adventure-works-etl",
  "layer_name": "silver",
  "system_name": "adventure_works",
  "catalog_only": false,
  "github": {
    "organisation": "AdventureWorks",
    "token": "<github token>"
  }
}

Request Body

{
  "file_content": String,
  "base_branch": String,
  "branch": String,
  "repo_name": String,
  "layer_name": String,
  "system_name": String,
  "catalog_only": Boolean,
  "github": GitHubProperties
}

file_content (string) Base64 encoded file content of the mapping specification.

base_branch (string) Name of the branch to create the pull request against.

branch (string) The name of the branch pipeliner should commit the code to.

repo_name (string) Name of the GitHub repository pipeliner should commit the code to.

layer_name (string) The name of the target layer in the data lake that the mapping specification defines tables for, e.g. silver.

system_name (string) The business name of the system the mapping specification is for, e.g. employees.

catalog_only (boolean) If true, only the infrastructure for the table definitions will be produced. If false, all outputs will be produced. Defaults to false.

GitHubProperties

organisation (string) Name of the GitHub organisation which holds the repo.

token (string) A GitHub token with required access to the repo.
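To show how the request body fields fit together, here is a rough Python sketch that assembles the body from a mapping specification's raw bytes. The function name build_submit_payload is hypothetical, and the defaults for base_branch and branch are illustrative values copied from the sample request, not documented API defaults; only catalog_only is documented as defaulting to false.

```python
import base64
import json


def build_submit_payload(spec_bytes, repo_name, layer_name, system_name,
                         organisation, github_token,
                         base_branch="main", branch="pipeliner-code",
                         catalog_only=False):
    """Assemble the JSON body for POST /engine/submit as a dict."""
    return {
        # The mapping specification travels as Base64 text, not raw bytes.
        "file_content": base64.b64encode(spec_bytes).decode("ascii"),
        "base_branch": base_branch,
        "branch": branch,
        "repo_name": repo_name,
        "layer_name": layer_name,
        "system_name": system_name,
        "catalog_only": catalog_only,
        # Nested GitHubProperties object.
        "github": {"organisation": organisation, "token": github_token},
    }


# Placeholder bytes stand in for the contents of a real specification file.
payload = build_submit_payload(
    b"example spec bytes", "adventure-works-etl", "silver",
    "adventure_works", "AdventureWorks", "mygithubtoken",
)
body = json.dumps(payload)
```

In practice spec_bytes would come from reading the specification file in binary mode.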

Response

200 OK
Content-Type: application/json

{
  "pull_request": "https://github.com/AdventureWorks/adventure-works-etl/pull/1",
  "request_id": "18e0d8df-eca3-4a71-a931-185085023431"
}

Response Body

{
  "pull_request": String,
  "request_id": String
}

pull_request (string) URL to the pull request created by pipeliner.

request_id (string) The request ID. You can use this in communications with pipeliner support.
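A client could read the two response fields along these lines. This is only a sketch: the SubmitResult class and parse_submit_response function are hypothetical names, and the sample body is the one shown in the response example.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SubmitResult:
    pull_request: str  # URL of the pull request created by pipeliner
    request_id: str    # identifier of this request


def parse_submit_response(body_text):
    """Turn a 200 OK response body into a SubmitResult."""
    data = json.loads(body_text)
    # A KeyError here means the body did not have the documented shape.
    return SubmitResult(pull_request=data["pull_request"],
                        request_id=data["request_id"])


example = (
    '{"pull_request": "https://github.com/AdventureWorks/adventure-works-etl/pull/1", '
    '"request_id": "18e0d8df-eca3-4a71-a931-185085023431"}'
)
result = parse_submit_response(example)
```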

Errors

500 Internal Server Error
Content-Type: application/json

{
    "message": "[InternalServerError] Description of error.",
    "request_id": "UniqueRequestID"
}

message (string) Error type and description of the error.

request_id (string) The request ID. You can use this in communications with pipeliner support.
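The sample error message has the form "[ErrorType] Description". Assuming that form holds for other errors too (the page only shows the one example), a client might split it as sketched below so that the request ID can be logged or passed to support. PipelinerError and parse_error are hypothetical names.

```python
import json
import re


class PipelinerError(Exception):
    """Carries the parts of an error response body."""

    def __init__(self, error_type, description, request_id):
        super().__init__(f"{error_type}: {description} (request_id={request_id})")
        self.error_type = error_type
        self.description = description
        self.request_id = request_id


# Matches "[ErrorType] Description", the form used in the sample error.
_MESSAGE = re.compile(r"^\[(?P<type>[^\]]+)\]\s*(?P<description>.*)$", re.DOTALL)


def parse_error(body_text):
    """Build a PipelinerError from an error response body."""
    data = json.loads(body_text)
    message = data.get("message", "")
    match = _MESSAGE.match(message)
    if match:
        error_type, description = match["type"], match["description"]
    else:
        # Fall back to the raw message if it lacks the bracketed prefix.
        error_type, description = "Unknown", message
    return PipelinerError(error_type, description, data.get("request_id"))
```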

curl Example

Request:

export SPEC_B64=$(openssl base64 -e -in ./curated_adventure_works.xlsx -A)
export PIPELINER_API_KEY="mypipelinerapikey"
export GITHUB_TOKEN="mygithubtoken"
# Variables are spliced in with double quotes; the shell does not expand them inside single quotes.
curl \
    -H 'Content-Type: application/json' \
    -H "X-API-Key: $PIPELINER_API_KEY" \
    -X POST \
    -d '{"file_content": "'"$SPEC_B64"'", "base_branch": "main", "branch": "pipeliner-code", "repo_name": "adventure-works-etl", "layer_name": "silver", "system_name": "adventure_works", "catalog_only": false, "github": {"organisation": "AdventureWorks", "token": "'"$GITHUB_TOKEN"'"}}' \
    https://api.datapipeliner.io/engine/submit

Response:

{
  "pull_request": "https://github.com/AdventureWorks/adventure-works-etl/pull/1",
  "request_id": "18e0d8df-eca3-4a71-a931-185085023431"
}
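For callers who prefer Python over curl, the same request could be assembled with the standard library as sketched here. The function names make_submit_request and submit are hypothetical, and the 60-second timeout is an arbitrary assumption; the URL, method, and headers follow the request format described on this page.

```python
import json
import urllib.request

SUBMIT_URL = "https://api.datapipeliner.io/engine/submit"


def make_submit_request(payload, api_key):
    """Build (but do not send) the POST request for the submit endpoint."""
    return urllib.request.Request(
        SUBMIT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


def submit(payload, api_key, timeout=60):
    """Send the request and return the decoded JSON response body."""
    request = make_submit_request(payload, api_key)
    # urlopen raises urllib.error.HTTPError for error statuses such as 500.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it possible to inspect the request without calling the live API.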