Getting Started
Your first extraction in Parsewise, either from the app or via Python.
Prerequisites
- A Parsewise account. Sign up at parsewise.ai if you don’t have one yet.
- A source document or two to experiment with (PDF, Word, Excel, PowerPoint, or image).
- For the API path: an API key from the Developer page in the app (app.parsewise.ai/developer). API access is an entitlement — if you don’t see the Developer page, email support@parsewise.ai.
Concepts
| Concept | What it is |
|---|---|
| Project | A workspace holding documents, agents, and results for one subject area (e.g. “Q4 leases”). |
| Document | A source file you upload. Parsed on upload so agents can read it. |
| Agent | A reusable definition of what to extract and how. One agent produces one column of values. |
| Dimension | Optional attachment that splits an agent’s output into multiple rows (e.g. one per clause, one per year). |
| Result | The resolved extracted values, with citations back to the source pages. |
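As a mental model only, the concepts in the table can be sketched as plain Python dataclasses. Every class and field name below is hypothetical and chosen for illustration; this is not the Parsewise API schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Agent:
    """What to extract and how; one agent yields one column of values."""
    name: str
    extraction_task: str
    cell_type: str = "string"
    unit: Optional[str] = None
    dimension: Optional[str] = None  # hypothetical: e.g. "clause" or "year"


@dataclass
class Result:
    """A resolved value plus the source pages it was cited from."""
    agent_name: str
    value: object
    cited_pages: List[int] = field(default_factory=list)


@dataclass
class Project:
    """A workspace holding documents, agents, and results."""
    name: str
    documents: List[str] = field(default_factory=list)
    agents: List[Agent] = field(default_factory=list)
    results: List[Result] = field(default_factory=list)

    def columns(self) -> List[str]:
        # One column per agent, in the order the agents were added.
        return [agent.name for agent in self.agents]
```

In this sketch, adding a second agent to a project adds a second column, while attaching a dimension to an agent would multiply its rows rather than its columns.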
In the app
1. Create a project
- Go to app.parsewise.ai and click New Project.
- Choose Blank Project, give it a name (e.g. “Lease tests”), and click Create Project.
2. Upload documents
- On the Documents page, click Upload Files (or drag files onto the page).
- Wait for each document’s status to reach Processed.
3. Create an extraction agent
Two ways to do this:
With Navi (recommended for new users):
- Open Navi from the sidebar.
- Type something like: “Create an agent that extracts the annual rent in USD.”
- Review the proposed configuration and click Create & Launch.
Manually on the Agents page:
- Go to Agents and click Create → Manually.
- Fill in:
  - Name: e.g. Annual rent (USD)
  - Cell type: number
  - Unit: USD
  - Extraction task: “Extract the total annual rent in USD. Return a plain number with no currency symbol or thousands separator.”
- Click Save, then Launch All.
4. View results
- Go to the Results page.
- Switch between Table and By Agent using the toggle in the header.
- Click any value to open the Entity Details page, where you can see the underlying sources, the document pages they came from, and override the resolved value if needed.
- Click the Download Excel button to export.
5. Iterate
- Not quite right? Edit the agent’s extraction task and click Launch All again. Destructive changes, such as editing the extraction task, re-extract that agent against every document.
- New documents arrived? Upload them and click Launch All. Only new or invalidated work is done.
With the API
All API examples send requests to https://api.parsewise.ai/api/v1 and pass your key in the X-API-Key header. Each key is scoped to one organisation.
export PARSEWISE_API_KEY=pw_live_...
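Before sending any request, it can help to fail early with a clear message when the key is missing. The helper below is a minimal sketch: `auth_headers` is a hypothetical name, and the only things taken from this page are the `PARSEWISE_API_KEY` variable and the `X-API-Key` header.

```python
import os


def auth_headers(env=None):
    """Build the request headers from PARSEWISE_API_KEY, or raise if it is unset."""
    env = os.environ if env is None else env
    key = env.get("PARSEWISE_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "PARSEWISE_API_KEY is not set; create a key on the Developer "
            "page and export it first."
        )
    return {"X-API-Key": key}
```

The returned dictionary can be passed as the `headers` argument of each request in place of a hand-built one.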
End-to-end hello world — create a project, upload a document, launch an agent, and print results:
import os
import time

import requests

BASE = "https://api.parsewise.ai/api/v1"
H = {"X-API-Key": os.environ["PARSEWISE_API_KEY"]}

# 1. Create a project and keep its id.
resp = requests.post(f"{BASE}/projects/", headers=H, json={"name": "Demo"})
resp.raise_for_status()
pid = resp.json()["id"]

# 2. Upload a document to the project.
with open("lease.pdf", "rb") as f:
    requests.post(f"{BASE}/projects/{pid}/documents/", headers=H,
                  files={"file": f}).raise_for_status()

# 3. Define an extraction agent.
requests.post(f"{BASE}/projects/{pid}/agents/", headers=H, json={
    "name": "Annual rent (USD)",
    "extraction_instructions": "Extract the annual rent in USD as a number.",
    "value_type": "number",
    "unit": "USD",
}).raise_for_status()

# 4. Launch all agents in the project.
requests.post(f"{BASE}/projects/{pid}/agents/launch/",
              headers=H).raise_for_status()

# 5. Poll every 5 seconds until the pipeline has finished.
while requests.get(f"{BASE}/projects/{pid}/agents/status/",
                   headers=H).json()["pipeline_running"]:
    time.sleep(5)

# 6. Print each agent's resolved value.
for row in requests.get(f"{BASE}/projects/{pid}/results/",
                        headers=H).json()["results"]:
    print(row["agent_name"], "→", row["resolution_result"]["value"])
See the API Reference for the full walkthrough, robust polling with exponential backoff, per-document citations, error handling, and the FAQ.
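The fixed five-second sleep in the hello world is fine for a demo. For longer runs, polling with exponential backoff could look roughly like the sketch below. It is not the implementation from the API Reference: `poll_until_done` and all of its parameters are hypothetical, and the status check is passed in as a callable so the helper makes no assumptions about the endpoint.

```python
import time


def poll_until_done(is_running, initial_delay=1.0, max_delay=30.0,
                    factor=2.0, timeout=600.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call is_running() until it returns False, waiting longer after each check.

    Returns the number of waits performed. Raises TimeoutError if the
    deadline passes while is_running() still returns True.
    """
    deadline = clock() + timeout
    delay = initial_delay
    waits = 0
    while is_running():
        if clock() >= deadline:
            raise TimeoutError("pipeline still running after %.0f s" % timeout)
        sleep(delay)
        waits += 1
        # Grow the delay geometrically, but never beyond max_delay.
        delay = min(delay * factor, max_delay)
    return waits
```

With the names from the hello world, the status check could be passed as `lambda: requests.get(f"{BASE}/projects/{pid}/agents/status/", headers=H).json()["pipeline_running"]`. The `sleep` and `clock` parameters exist only so the helper can be exercised without real waiting.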
Tips for good results
- Be specific in extraction tasks. Describe the value first, then where to find it. Include the expected format (e.g. “ISO 8601 date YYYY-MM-DD”, “number without thousands separators”).
- Start narrow. Prove an agent out on one well-understood document before pointing it at thousands.
- Read the citations, not just the values. A citation confirms the agent found the right span, which matters even when the value happens to look correct.
- Pick the right value type. Use number when you’ll aggregate or compare; otherwise use string and specify the format in the task.
- One concept per agent. If a name needs “and”, split it into two agents so each value has its own column and its own citations.
More guidance in Platform → Agent design best practices.
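When an extraction task asks for a specific format, a small client-side check can flag values that come back in a different shape. The sketch below covers the two formats used as examples in the tips. Both function names are hypothetical, and it assumes the values are available as strings.

```python
import re
from datetime import date


def is_iso_date(text):
    """True if text is a real calendar date written as YYYY-MM-DD."""
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", text):
        return False
    try:
        date.fromisoformat(text)  # rejects impossible dates such as 2024-02-30
    except ValueError:
        return False
    return True


def is_plain_number(text):
    """True for an optionally signed decimal with no separators or symbols."""
    return re.fullmatch(r"-?\d+(\.\d+)?", text) is not None
```

A value that fails such a check is a prompt to read its citation and tighten the extraction task, not proof that the extraction itself is wrong.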
Next steps
- Read the API reference for the full endpoint list and FAQ.
- Explore platform concepts to understand dimensions, per-document mode, and Navi in depth.
- Stuck? Email support@parsewise.ai.