Extract structured data from documents

Upload one or more documents together with a target JSON schema. A project is created automatically and the full pipeline (text extraction → agent generation → mapping → extraction) runs in the background. Poll GET /projects/{project_id}/status/ until pipeline_running is false and schema_status is success, then fetch results from GET /projects/{project_id}/results/schema/.

On this page

HTTP request
Request Header
Request Body
Responses
Security
Python example
Definitions
- V1ExtractRequestRequest
- V1ExtractResponse

HTTP request

POST https://api.parsewise.ai/api/v1/extract/

Request Header

Name	Required	Type	Description
`X-API-Key`	Yes	string	API key with the `pw_live_` prefix. See Authentication.

Request Body

Supported content types: multipart/form-data.

Name	Required	Type	Description
`files`	Yes	array<string (binary)>	One or more document files to process.
`schema`	Yes	any	A valid JSON Schema (Draft 2020-12) describing the desired output shape.
`project_name`	No	string	Optional name for the created project. Defaults to “API Extraction”.

Responses

Status	Type	Description
`202`	V1ExtractResponse	—

Security

ApiKeyAuth — apiKey — in X-API-Key header. API key with pw_live_ prefix.

Python example

import os
import requests

API_KEY = os.environ["PARSEWISE_API_KEY"]
BASE_URL = "https://api.parsewise.ai/api/v1"

# Send files as multipart/form-data; repeat the "files" field per file.
with open("example.pdf", "rb") as f:
    files = {"files": f}

resp = requests.post(
    f"{BASE_URL}/extract/",
    headers={"X-API-Key": API_KEY},
    files=files,
)
resp.raise_for_status()
print(resp.json() if resp.content else None)

Definitions

V1ExtractRequestRequest

Name	Required	Type	Description
`files`	Yes	array<string (binary)>	One or more document files to process.
`schema`	Yes	any	A valid JSON Schema (Draft 2020-12) describing the desired output shape.
`project_name`	No	string	Optional name for the created project. Defaults to “API Extraction”.

V1ExtractResponse

Name	Required	Type
`project_id`	Yes	string (uuid)
`status_url`	Yes	string
`results_url`	Yes	string