Preface

Choose between UniParser-Tools and UniParser-API, and learn the basic configuration you need before making calls.

Uni-Parser supports two integration methods:

Method	Description	When to use it
UniParser-Tools	A high-level Python library wrapper	Recommended for most users. It wraps task submission, polling, result fetching, and common output conversion for scripts, batch jobs, research workflows, and agent integrations.
UniParser-API	Native HTTP calls	Use this when you need direct control over HTTP requests, authentication, polling, callbacks, output switches, or system integration details.

If you are unsure which path to choose, start with UniParser-Tools. Use UniParser-API only when you need fine-grained control over the HTTP workflow.

Minimal workflow:

1. Get an X-API-Key: authorize in one click through Bohrium sign-in (玻尔登录) on the homepage, or paste an existing key.
2. Submit a file, image snippet, or URL parsing task.
3. Use sync=true to wait synchronously, or submit asynchronously and fetch results later with the token.
4. Fetch content, objects, pages_dict, pages_tree, or other result fields.
5. Choose Markdown, HTML, LaTeX, Plain, or other output formats as needed.

Authentication & pricing

Authentication: Sign in with Bohrium (玻尔登录) on the homepage to authorize in one click; this issues an X-API-Key and automatically grants a trial balance. If you already have a key, paste it under "enter an existing API Key". (The former captcha-based visitor registration has been retired and consolidated into Bohrium sign-in.)

Pricing: Billing is per parsed page at a base rate of ¥0.05 / page. The first 2 molecules on each page are included in the page fee; each additional molecule on that page costs an extra ¥0.05. You are charged only after a parse succeeds; failed or timed-out tasks are not billed. Your balance and an estimated "pages remaining" are shown in the homepage auth panel.
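
As a worked example of the pricing rule, the sketch below estimates the charge for one successfully parsed document. The function name `estimate_cost` and the per-page molecule counts are illustrative and not part of UniParser; only the ¥0.05 rates and the two included molecules per page come from the text above.

```python
from decimal import Decimal

PAGE_FEE = Decimal("0.05")            # base rate per parsed page
EXTRA_MOLECULE_FEE = Decimal("0.05")  # per molecule beyond the included ones
INCLUDED_MOLECULES_PER_PAGE = 2


def estimate_cost(molecules_per_page: list[int]) -> Decimal:
    """Estimate the charge in CNY for a successfully parsed document.

    `molecules_per_page` has one entry per page: the number of molecules
    found on that page.
    """
    total = Decimal("0")
    for count in molecules_per_page:
        extra = max(0, count - INCLUDED_MOLECULES_PER_PAGE)
        total += PAGE_FEE + extra * EXTRA_MOLECULE_FEE
    return total


# Three pages with 0, 2, and 5 molecules: 3 page fees plus 3 extra molecules.
print(estimate_cost([0, 2, 5]))  # 0.30
```

Decimal arithmetic is used here so that small currency amounts do not pick up binary floating-point rounding.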

File-size, page-count, and rate-limit rules are covered in "Reference" and "FAQ".

II.

UniParser-Tools

A Python client wrapper, recommended for most users.

Quick Start

The following is the basic workflow. For complete and latest examples, see the repo README.md and notebooks.

1. Install

UniParser-Tools is a Python client wrapper for UniParser-API, suitable for scripts, notebooks, batch processing, and agent workflows.

git clone https://github.com/dptech-corp/UniParser-Tools.git
cd UniParser-Tools
pip install -r requirements.txt
pip install -e .

2. Initialize the client

Provide an api_key when initializing the client. See User Authentication for authentication and API-Key configuration details.

import os
from uniparser_tools.api.clients import UniParserClient

parser = UniParserClient(
    host="https://uniparser.dp.tech/",
    api_key=os.getenv("UNIPARSER_API_KEY"),
)

3. Submit a PDF parsing task

UniParser-Tools wraps different parsing entry points:

File parsing: trigger_file
Image snippet parsing: trigger_snip
URL parsing: trigger_url

See Request Format for common request formats. UniParser-Tools automatically generates a unique token and submits it with the trigger request, so you usually do not need to provide one manually; pass one explicitly only when you need to reuse an external task index.

Parsing options control which semantic elements are parsed when submitting a task, and whether each element uses fast parsing, high-quality parsing, or returns the original image. ParseMode / ParseModeTextual are only for the submit stage; do not pass them to get_formatted. See Parsing Options SemanticType for details.

from uniparser_tools.common.constant import ParseMode, ParseModeTextual

result = parser.trigger_file(
    file_path="example.pdf",
    textual=ParseModeTextual.OCRHighQuality,  # high quality
    equation=ParseMode.OCRHighQuality,        # high quality
    table=ParseMode.OCRHighQuality,           # high quality
    chart=ParseMode.DumpBase64,               # original image base64
    figure=ParseMode.DumpBase64,              # original image base64
    expression=ParseMode.DumpBase64,          # original image base64
    molecule=ParseMode.OCRFast,               # fast
)

token = result["token"]

Use trigger_snip for image snippets, screenshots, page regions, or single images:

snip_result = parser.trigger_snip(
    snip_path="example.png",
    textual=ParseModeTextual.OCRHighQuality,
    table=ParseMode.OCRHighQuality,
    equation=ParseMode.OCRHighQuality,
)

snip_token = snip_result["token"]

Use trigger_url for publicly accessible PDF URLs:

url_result = parser.trigger_url(
    pdf_url="https://example.com/paper.pdf",
    textual=ParseModeTextual.OCRHighQuality,
    table=ParseMode.OCRHighQuality,
    equation=ParseMode.OCRHighQuality,
)

url_token = url_result["token"]

4. Fetch Markdown output

When fetching results, output switches control what is returned: the full text (content), semantic blocks (objects), page-level structures (pages_dict), or hierarchical structures (pages_tree). This example sets content=True to fetch the formatted full text. FormatFlag applies only to the result-fetching stage; do not pass it to trigger_file. See Output Switches for the switches, and Output FormatFlag for Markdown, HTML, LaTeX, and other formats.

from uniparser_tools.common.constant import FormatFlag

result = parser.get_formatted(
    token,
    content=True,
    textual=FormatFlag.Markdown,
    table=FormatFlag.Markdown,
    equation=FormatFlag.Markdown,
)

print(result["content"])

III.

UniParser-API

Submit parsing tasks and fetch results directly with curl or requests.

The native API lets you submit parsing tasks and fetch results directly over HTTP. The examples below use curl and requests; they do not rely on the UniParser-Tools client.

1. Submit a Parsing Task

Use /trigger-file-async to upload a PDF and set token plus parsing options as form fields. token is required by trigger endpoints; when using curl / requests, you must provide it manually:

# sync=true waits for parsing to finish; 2 means high-quality parsing, 1 means fast parsing.
curl -X POST "https://uniparser.dp.tech/trigger-file-async" \
  -H "X-API-Key: your_api_key" \
  -F "token=example_001" \
  -F "file=@example.pdf" \
  -F "sync=true" \
  -F "textual=2" \
  -F "equation=2" \
  -F "table=2" \
  -F "molecule=1"

Python requests example:

import os
import requests

host = "https://uniparser.dp.tech/"
headers = {"X-API-Key": os.environ["UNIPARSER_API_KEY"]}
token = "example_001"
data = {
    "token": token,
    "sync": "true",  # wait for parsing to finish; use "false" for large files or batches
    "textual": "2",  # 2 = high-quality parsing
    "equation": "2",  # 2 = high-quality parsing
    "table": "2",  # 2 = high-quality parsing
    "molecule": "1",  # 1 = fast parsing
}

with open("example.pdf", "rb") as file:
    response = requests.post(
        f"{host}trigger-file-async",
        headers=headers,
        data=data,
        files={"file": file},
        timeout=60,
    )
response.raise_for_status()

2. Fetch Results

Use the token you provided when submitting the task to fetch formatted results. /get-formatted accepts a JSON body. For Markdown full text:

curl -X POST "https://uniparser.dp.tech/get-formatted" \
  -H "X-API-Key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "token": "example_001",
    "content": true,
    "textual": "markdown",
    "table": "markdown",
    "equation": "markdown"
  }'

Python requests example:

payload = {
    "token": token,
    "content": True,
    "textual": "markdown",
    "table": "markdown",
    "equation": "markdown",
}

response = requests.post(
    f"{host}get-formatted",
    headers=headers,
    json=payload,
    timeout=60,
)
response.raise_for_status()
result = response.json()
print(result["content"])

If you need coordinates, block types, or hierarchy, enable objects, pages_dict, or pages_tree.

IV.

Reference

A single place for the layout types, output formats, and error codes that appear in parsing results.

User Authentication

UniParser-API uses the X-API-Key request header. On the UniParser homepage, sign in with Bohrium (玻尔登录) for one-click authorization to obtain an API-Key (a trial balance is granted on first sign-in), or paste an existing API-Key directly. For higher quotas, broader permissions, or a long-term account, please contact us.

curl -X POST "https://uniparser.dp.tech/trigger-file-async" \
  -H "X-API-Key: your_api_key" \
  -F "token=example_001" \
  -F "file=@example.pdf" \
  -F "sync=true"

Store the key in an environment variable such as UNIPARSER_API_KEY instead of hard-coding it in source code.
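
A minimal sketch of that advice: read the key from the environment and fail early with a clear message when it is missing. The helper names `load_api_key` and `auth_headers` are illustrative and not part of UniParser-Tools; only the `X-API-Key` header and the `UNIPARSER_API_KEY` variable name come from this page.

```python
import os


def load_api_key(env_var: str = "UNIPARSER_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before calling UniParser.")
    return key


def auth_headers(api_key: str) -> dict[str, str]:
    """Build the request header used for authentication."""
    return {"X-API-Key": api_key}
```

Failing at startup is usually easier to diagnose than a 401 response returned later by the first request.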

Request Format

Endpoint	Request format	Notes
`/trigger-file-async`	`multipart/form-data`	Upload a PDF in the `file` field; other parameters are also form fields
`/trigger-snip-async` / `/trigger-url-async`	JSON body	Image snippet or URL parsing requests
`/get-formatted`	JSON body	Formatted result-fetching request

Note: every trigger endpoint requires a token. The token is the unique index for a parsing task, similar to a UUID. UniParser-Tools generates a unique token automatically; when using curl or requests with UniParser-API, you must provide it manually and ensure uniqueness. UUIDs are recommended.
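
One way to follow the UUID recommendation when calling the native API is sketched below. The allowed character set is taken from the Token_Invalid description in the error table; reading "letters" as ASCII letters is an assumption, and `new_token` and `is_valid_token` are hypothetical helper names.

```python
import re
import uuid

# Letters, digits, underscores, and hyphens (assumed ASCII-only).
TOKEN_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def is_valid_token(token: str) -> bool:
    """Check a token against the assumed allowed character set."""
    return TOKEN_PATTERN.fullmatch(token) is not None


def new_token(prefix: str = "") -> str:
    """Return a fresh token: an optional prefix followed by a random UUID."""
    token = f"{prefix}{uuid.uuid4()}"
    if not is_valid_token(token):
        raise ValueError(f"Token contains characters outside [A-Za-z0-9_-]: {token!r}")
    return token


token = new_token("paper_")  # e.g. "paper_" followed by a UUID
```

A readable prefix keeps tokens easy to trace in logs, while the UUID part provides the uniqueness.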

sync controls how the trigger request waits: sync=false is the default asynchronous mode and returns task status plus token immediately, so you fetch results later with the same token; sync=true waits in the submit request until parsing finishes or times out, which is convenient for examples, debugging, and small files. For large files or batch processing, prefer async mode and set timeout according to task size (default: 1800 seconds).

Parsing Options SemanticType

When submitting a parsing task, you can control these semantic element fields:

Stage note: SemanticType and ParseMode / ParseModeTextual belong to the submit stage. They decide what to parse and at what quality; do not use them in /get-formatted or get_formatted.

Field	Meaning	Available modes
`textual`	Prose text, headings, footnotes, etc.	`0` / `1` / `2` / `3`
`equation`	Display mathematical equations	`0` / `1` / `2` / `-1`
`table`	Tables	`0` / `1` / `2` / `-1`
`chart`	Charts	`0` / `1` / `-1`
`figure`	Figures, illustrations, photos	`0` / `1` / `-1`
`expression`	Chemical reaction expressions	`0` / `1` / `-1`
`molecule`	Chemical molecular structures	`0` / `1` / `-1`

Value notes: 0 disables parsing, 1 uses fast parsing, 2 uses high-quality parsing, and -1 returns the original image as Base64. textual also supports 3 for direct text extraction from digital PDFs.

Enum name	Value
`ParseMode.DumpBase64` / `ParseModeTextual.DumpBase64`	`-1`
`ParseMode.OCRFast` / `ParseModeTextual.OCRFast`	`1`
`ParseMode.OCRHighQuality` / `ParseModeTextual.OCRHighQuality`	`2`
`ParseModeTextual.DigitalExported`	`3` (`textual` only)

Note: chart, expression, and molecule currently do not have a high-quality mode.
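
The per-field table can be turned into a small client-side check that rejects unsupported combinations before a request is sent. This is a sketch: `build_parse_fields` is a hypothetical helper, the allowed values mirror the per-field table above, and the string conversion follows the form-field style used in the curl examples.

```python
# Allowed numeric modes per field, copied from the per-field table.
ALLOWED_MODES = {
    "textual": {0, 1, 2, 3},
    "equation": {0, 1, 2, -1},
    "table": {0, 1, 2, -1},
    "chart": {0, 1, -1},
    "figure": {0, 1, -1},
    "expression": {0, 1, -1},
    "molecule": {0, 1, -1},
}


def build_parse_fields(**modes: int) -> dict[str, str]:
    """Validate parse modes and return them as string form fields."""
    fields = {}
    for name, mode in modes.items():
        if name not in ALLOWED_MODES:
            raise ValueError(f"Unknown semantic field: {name}")
        if mode not in ALLOWED_MODES[name]:
            raise ValueError(f"Mode {mode} is not available for {name}")
        fields[name] = str(mode)
    return fields


fields = build_parse_fields(textual=2, table=2, molecule=1)
```

The resulting dictionary can be merged with `token` and `sync` and passed as the `data` argument of `requests.post`.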

Table Mode: fast vs hq

table=1 (fast) returns a combination of a structure template and cell text. You usually need structure, placeholders, and contents together to reconstruct the table: structure describes the table skeleton, placeholders mark cell placeholders, and contents provides the corresponding cell text.
table=2 (hq) usually puts the complete table data directly in structure, such as an HTML / Markdown table with text content. In this mode, you generally do not need to stitch cell content back from placeholders and contents.
Therefore, fast is better when you want to handle the mapping between table skeleton and cell text yourself; hq is better when you want to consume a complete table structure directly. High-quality mode usually takes longer; if the high-quality service is unavailable, UniParser falls back to an available fast table parser.
Table mode only affects table-region structure recognition and cell-content recognition. Table captions, table footnotes, and surrounding body text are mainly controlled by textual; when you want both table text and body text to use higher quality, usually set both table=2 and textual=2.

Output Switches

The task token is used for subsequent result fetching. Use /get-formatted with output switches to choose what to fetch:

Note: result-fetching endpoints use a JSON body. When calling UniParser-API directly, set Content-Type: application/json; omitted fields use default values.

Switch	Default	Returned field type	Description
`content`	`true`	`str`	Formatted full-document content, useful for LLMs, search, and reading
`objects`	`false`	`list[dict]`	Flat semantic block list. Each item includes fields such as `class`, `confidence`, `float_xyxy`, `page`, and `str`; useful when coordinates, block types, and block-level content are needed
`pages_dict`	`false`	`list[list[dict]]`	Flattened page-level layout. Each page is a list of layout blocks, useful for page-level structured processing
`pages_tree`	`false`	`list[list[dict]]`	Page-level tree layout that preserves parent-child hierarchy, useful when hierarchical structure is needed
`molecule_source`	`false`	Control option	Keeps original molecule source information in structured outputs. It defaults to disabled to reduce response size; when disabled, molecule block `source` is cleared
`marginalia`	`true`	Control option	Controls whether marginal / side-bar information is kept in formatted output. It defaults to enabled to avoid dropping source-document information
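
To illustrate how the flat `objects` list might be consumed, the sketch below groups blocks by page and drops low-confidence ones. The sample data is hand-written, not real API output, and it assumes that `page` is an integer, `confidence` is a float between 0 and 1, and `float_xyxy` holds four floats; check these shapes against an actual response.

```python
from collections import defaultdict


def blocks_by_page(objects: list[dict], min_confidence: float = 0.0) -> dict[int, list[dict]]:
    """Group flat semantic blocks by page, dropping low-confidence blocks."""
    pages = defaultdict(list)
    for block in objects:
        if block.get("confidence", 0.0) >= min_confidence:
            pages[block["page"]].append(block)
    return dict(pages)


# Hand-written sample shaped like the fields listed in the table (assumed types).
sample = [
    {"class": "title", "confidence": 0.98, "float_xyxy": [0.1, 0.05, 0.9, 0.10], "page": 0, "str": "Results"},
    {"class": "paragraph", "confidence": 0.42, "float_xyxy": [0.1, 0.12, 0.9, 0.30], "page": 0, "str": "..."},
    {"class": "table", "confidence": 0.91, "float_xyxy": [0.1, 0.20, 0.9, 0.60], "page": 1, "str": "..."},
]
grouped = blocks_by_page(sample, min_confidence=0.5)
```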

Output FormatFlag

/get-formatted can specify output formats for these semantic element fields: textual, table, molecule, chart, figure, expression, and equation.

Stage note: FormatFlag belongs to the result-fetching stage. It only controls the format of returned text; do not use it in trigger requests or trigger_file.

Value	Name	Meaning	Example
`plain`	`Plain`	Plain text, no markers	`ATP inhibits ATR.`
`markup`	`Markup`	Marker-wrapped text, default	`\begin{paragraph}ATP inhibits ATR.\end{paragraph}`
`markdown`	`Markdown`	Markdown, suitable for LLMs and documents	`## Results`, `- IC50: 0.5 nM`
`latex`	`Latex`	LaTeX, especially useful for equations	$E = mc^2$
`html`	`Html`	HTML, especially useful for tables	`<table>...</table>`

FormatFlag does not change the structure of objects, pages_dict, or pages_tree; it mainly controls how text is rendered in content and in the text fields of objects.
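
The output switches and format values can be combined into one JSON body for /get-formatted. The sketch below validates names and values against the tables on this page before building the payload; `build_get_formatted_payload` is a hypothetical helper, not a UniParser-Tools function.

```python
from typing import Optional

FORMAT_VALUES = {"plain", "markup", "markdown", "latex", "html"}
FORMAT_FIELDS = {"textual", "table", "molecule", "chart", "figure", "expression", "equation"}
OUTPUT_SWITCHES = {"content", "objects", "pages_dict", "pages_tree", "molecule_source", "marginalia"}


def build_get_formatted_payload(token: str, formats: Optional[dict] = None, **switches: bool) -> dict:
    """Build a JSON body for /get-formatted, rejecting unknown names and values."""
    payload = {"token": token}
    for field, value in (formats or {}).items():
        if field not in FORMAT_FIELDS:
            raise ValueError(f"Unknown format field: {field}")
        if value not in FORMAT_VALUES:
            raise ValueError(f"Unknown format value: {value}")
        payload[field] = value
    for name, enabled in switches.items():
        if name not in OUTPUT_SWITCHES:
            raise ValueError(f"Unknown output switch: {name}")
        payload[name] = bool(enabled)
    return payload


payload = build_get_formatted_payload(
    "example_001",
    formats={"textual": "markdown", "table": "html", "equation": "latex"},
    content=True,
    objects=True,
)
```

The returned dictionary can be passed as the `json` argument of `requests.post`; fields left out fall back to the server defaults listed above.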

Common Layout Types LayoutType

In objects, pages_dict, and pages_tree, each parsed block has a type field describing what the block represents on the page.

Type	Meaning	Usually belongs to
`documenttitle`	Document title	`textual`
`title`	Section heading	`textual`
`paragraph`	Main body paragraph	`textual`
`abstract`	Abstract	`textual`
`reference`	Reference entry	`textual`
`table`	Table body	`table`
`tablecaption` / `tablefootnote`	Table caption / footnote	`textual`, semantically attached to a table
`equation` / `equationinline`	Display / inline equation	`equation`
`molecule` / `moleculeid`	Molecular structure / molecule label	`molecule` or related text
`figure` / `image`	Figure, illustration, photo, or image region	`figure`
`figurecaption` / `imagecaption`	Figure / image caption	`textual`, semantically attached to an image
`expression` / `expressioncaption`	Chemical reaction expression / caption	`expression` or related text
`chart` / `legend`	Chart / legend	`chart` or related text
`pageheader` / `pagefooter` / `pagenumber`	Header, footer, page number	Marginal information
`group` / `tablegroup` / `figuregroup` / `moleculegroup`	Group node	Structural node, common in `pages_tree`

type is a result label. It is not the same layer as parsing options such as textual, table, or molecule.

API and Frontend Limits

UniParser checks both API account quotas and frontend upload limits. API limits depend on account type, permissions, and approved quota configuration; frontend limits only apply to uploads through the web page and do not represent the final parsing quota for direct API calls.

Limit type	Applies to	Default limit	Notes
API file size / page count	Visitor standard accounts	200 MB, 2000 pages	Requests beyond the quota return file-size or page-count limit errors
API file size / page count	Paid users / accounts with higher quotas	Default 300 MB, 6000 pages	The actual quota follows the API application document or account provisioning configuration and can be adjusted on request
Frontend upload size	Web upload page	300 MB	This only limits files uploaded from the frontend; direct API calls are still governed by the account API quota

Request-rate limits are enforced by account, endpoint, and server-side configuration. When a request is rate-limited, the API returns 429, usually with the specific limit that was hit. For batch jobs, prefer asynchronous submission, control concurrency, and use backoff before retrying. If you need higher request throughput or larger file quotas, include that requirement in the API permission request.
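
The backoff advice can be sketched as a small retry wrapper. It is a generic exponential-backoff helper, not part of UniParser-Tools: `call_with_backoff` is a hypothetical name, and the attempt count and delays are illustrative defaults rather than documented limits.

```python
import time
from typing import Callable


def call_with_backoff(
    send: Callable[[], object],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
):
    """Call `send` until it returns a non-429 response or attempts run out.

    `send` must return an object with a `status_code` attribute, such as a
    `requests.Response`.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        response = send()
        if response.status_code != 429:
            return response
        if attempt < max_attempts - 1:
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
    return response  # still rate-limited after the final attempt
```

With requests, usage might look like `call_with_backoff(lambda: requests.post(url, headers=headers, json=payload, timeout=60))`. Adding random jitter to the delay is a common refinement when many workers retry at once.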

Error Codes ErrorFlag

Category	Common errors	Meaning	What to do
Request parameter errors `400`	`Token_Required`: missing `token` parameter; `Token_Invalid`: `token` contains invalid characters (only letters, numbers, underscores, and hyphens are allowed); `Token_Duplicated`: `token` already exists and the task is not finished; `File_Required`: upload file is missing; `File_Invalid`: invalid PDF or zero-page PDF; `URL_Invalid`: URL is not a valid PDF or download failed; `Request_Invalid`: request parameter format or value is invalid	Required parameters, file content, URL accessibility, or field values are invalid	Fill in required fields; generate a new unique `token` for native API calls; verify the file opens, the URL downloads, and field values match the docs
Parsing status errors `400`	`Status_Not_Found`: no status file found for the `token`; `Status_Decode_Failed`: status JSON is corrupted or cannot be decoded; `Result_Not_Found`: parsing result file does not exist; `Result_Decode_Failed`: result JSON is corrupted or cannot be decoded	Usually means the task does not exist, the result was cleaned up, the task failed, or server-side cache files are abnormal	Make sure you use the same `token` returned at submission; wait for async tasks to finish before fetching; resubmit if the result is no longer available
Language and page errors `400`	`Lang_Unknown`: document language cannot be detected; `Lang_Not_Supported`: the language is not supported by the OCR / parsing flow; `Pages_Required`: page selection is missing for endpoints that require it	Document language, scan quality, or page selection does not meet parsing requirements	Check that the document is readable; provide pages when required; improve scan quality or use a supported-language document
User limit errors `400`	`File_Size_Exceeded`: file size exceeds account limits; `PDF_Pages_Exceeded`: PDF page count exceeds account limits; `File_Type_Not_Allowed`: file type is not allowed; `Domain_Not_Allowed`: URL download domain is not allowed	Related to account permissions, file size, page count, file type, or domain allowlist	Compress or split the file; use an allowed file type or domain; contact us for higher quotas or broader permissions
System and processing errors `400/500`	`OS_Error`: file I/O, path, permission, or system resource error; `Process_Failed`: parsing subprocess failed, possibly caused by the model, input file, or runtime environment	File I/O, model service, or task process failed	Keep the `token` and error response, retry later, and report the file, token, and error details if it keeps failing
Authentication and authorization `401/402/403`	`Authentication required`: missing `Authorization` or `X-API-Key` header; `Authentication failed`: JWT Token or API-Key is invalid / expired; `Insufficient balance`: account balance is not enough; `Administrator privileges required`: endpoint requires admin privileges; `Missing permission`: account lacks specific parsing permissions	Headers, API-Key, balance, or permissions are insufficient	Check that `X-API-Key` is correct, unexpired, and has balance; contact us for administrator or special parsing permissions
Rate limit errors `429`	`Rate limit exceeded: <limit>`: global or endpoint limit exceeded; `Rate limit exceeded, <user_rate_limit>`: user-level rate limit exceeded	Request frequency or concurrency exceeds limits	Reduce concurrency and retry frequency, add backoff, queue batch jobs, or request a higher rate limit
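
For client code that needs to branch on failures, the table can be condensed into a lookup such as the sketch below. The error names and status codes are copied from the table; how the error name is carried in the response body is not described here, so extracting it is left to the caller, and `categorize` is a hypothetical helper.

```python
# Error names grouped by category, copied from the table above.
ERROR_CATEGORIES = {
    "request": {"Token_Required", "Token_Invalid", "Token_Duplicated", "File_Required",
                "File_Invalid", "URL_Invalid", "Request_Invalid"},
    "status": {"Status_Not_Found", "Status_Decode_Failed", "Result_Not_Found",
               "Result_Decode_Failed"},
    "language_or_pages": {"Lang_Unknown", "Lang_Not_Supported", "Pages_Required"},
    "user_limit": {"File_Size_Exceeded", "PDF_Pages_Exceeded", "File_Type_Not_Allowed",
                   "Domain_Not_Allowed"},
    "system": {"OS_Error", "Process_Failed"},
}


def categorize(status_code: int, error_flag: str = "") -> str:
    """Return a coarse category for a failed call.

    The HTTP status decides rate-limit and auth failures; otherwise the
    error name is looked up, because several categories share status 400.
    """
    if status_code == 429:
        return "rate_limit"
    if status_code in (401, 402, 403):
        return "auth"
    for category, flags in ERROR_CATEGORIES.items():
        if error_flag in flags:
            return category
    return "system" if status_code >= 500 else "unknown"
```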

FAQ

Answers to common questions about calling methods, task status, limits and quotas, and result fields.

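How does billing work? Where do I see my balance and pages remaining?

Billing is per parsed page at a base rate of ¥0.05 / page. The first 2 molecules on each page are included in the page fee; each additional molecule costs ¥0.05. You are charged only after a parse succeeds; failed or timed-out tasks are not billed. When the balance is insufficient, parsing endpoints return 402. Your current balance and the estimated "pages remaining" are shown in the homepage auth panel.

Should I use UniParser-Tools or UniParser-API?

Prefer UniParser-Tools. It handles token generation, task submission, polling, and result fetching automatically, and suits scripts, batch jobs, and notebooks. Use the native UniParser-API only when you need direct control over HTTP requests, authentication, callbacks, concurrency, or field switches.

How do I choose between `sync=true` and `sync=false`?

sync=true waits inside the submit request until parsing finishes, which suits small files, debugging, and examples. For large files, batch processing, or high-concurrency workloads, use the default asynchronous mode sync=false: save the returned token after submission, then call the result endpoint later to fetch the content.

What if my file exceeds the limits?

Visitor standard accounts are limited to 200 MB and 2000 pages by default; paid users and accounts with higher quotas are limited to 300 MB and 6000 pages by default, subject to the API application and account configuration. If you exceed the limit, compress or split the PDF, or apply for a higher quota.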

触发 `429` 限速怎么办？

当前 rate limit 使用近似滑动窗口机制。触发限速时接口会返回 429，表示当前用户、接口或全局入口的请求频率超过了服务端配置。

逐用户生效的限速主要覆盖两类 API：

解析请求 API：例如提交 PDF、URL、截图/图片解析的接口，会按照账号配置中的速率 S 限速。
结果获取 API：例如获取状态、格式化结果、段落或视图结果的接口，也会逐用户限速；这类接口的限速通常比解析提交接口宽松很多，便于轮询和批量取结果。

另外，部分管理类、信息获取类或公共入口 API 会有全局限速。全局限速不区分用户，主要用于避免异常流量导致系统后端过载。

客户端侧建议使用令牌桶控制并发请求：令牌以速率 R 放入桶中，桶容量为 B；每次请求先消耗一个令牌，没有令牌时排队或等待。这样可以平滑批量请求，避免瞬时并发打满服务端限速。

        令牌生成器（速率 R）
              |
              v   （不断放入令牌）
        +-------------+
        | 令牌桶       |  最大容量 B
        | o o o       |
        +-------------+
              |
              v   （请求到达，消耗令牌）
        [ 待发送请求 ] --------> [ 允许通过 ]

如果仍然频繁触发 429，请降低提交频率和并发数，并在重试时增加退避等待。批量任务建议排队异步提交；如果业务需要更高吞吐，可以申请更高请求频率限制。

`content`、`objects`、`pages_dict`、`pages_tree` 有什么区别？

content 返回格式化后的全文，适合阅读、检索和 LLM 输入。objects 返回扁平语义块列表，适合按块处理坐标、类型和文本。pages_dict 按页返回扁平布局。pages_tree 按页返回树形结构，适合需要父子层级关系的场景。

表格结果应该看哪个字段？

如果只需要可读文本，通常直接获取 content，并把 table 输出格式设为 markdown 或 html。如果需要结构化表格对象，请查看 objects、pages_dict 或 pages_tree 中的表格块。table=1（fast）通常需要结合 structure、placeholders、contents 还原表格；table=2（hq）通常在 structure 中直接包含完整表格数据。

`textual` 和 `table` 都要设置吗？

需要根据目标分别设置。table 控制表格区域的结构识别和单元格内容识别；表题、表注、正文段落等主要受 textual 控制。如果希望正文和表格都尽量高质量，通常同时设置 textual=2 和 table=2。

What is `token`, and why must it be unique?

token is the unique index for a parsing task and is used to query status and fetch results. Native API callers must ensure uniqueness, preferably with UUIDs. Reusing a token for an unfinished task may return Token_Duplicated.
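
As a minimal sketch of generating a unique token per task, the snippet below uses Python's standard `uuid` module. The helper name `new_task_token` is made up for this example and is not part of UniParser.

```python
import uuid


def new_task_token() -> str:
    """Return a fresh random token (UUID version 4) for one parsing task."""
    return str(uuid.uuid4())


# Generate one token per task and never reuse it while the task is unfinished.
tokens = [new_task_token() for _ in range(3)]
```

Store each token next to the file it was submitted with, since the token is the only handle for querying status and fetching results later.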

How do I get an API Key? Is Bohrium sign-in supported?

Sign in with Bohrium (玻尔登录) on the UniParser homepage for one-click authorization to obtain an X-API-Key; a trial balance is granted on first sign-in. If you already have a key, paste it under "enter an existing API Key". The former captcha-based visitor registration has been retired in favor of Bohrium sign-in. Send the key on every request via the X-API-Key: <your key> header.
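
A small sketch of building the request header described above. The environment variable name `UNIPARSER_API_KEY` and the helper `auth_headers` are assumptions made for this example; only the `X-API-Key` header name comes from this documentation.

```python
import os


def auth_headers(api_key: str) -> dict[str, str]:
    """Build the headers that carry the API key on every request."""
    key = api_key.strip()
    if not key:
        raise ValueError("X-API-Key is empty; obtain a key on the homepage first")
    return {"X-API-Key": key}


# UNIPARSER_API_KEY is a hypothetical variable name; "demo-key" is a placeholder.
api_key = os.environ.get("UNIPARSER_API_KEY", "demo-key")
headers = auth_headers(api_key)
```

Reading the key from the environment keeps it out of source control; pass `headers` to whichever HTTP client you use.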

How is usage billed, and how do I check my balance?

Billing is per parsed page at a base rate of ¥0.05 / page. The first 2 molecules on each page are included in the page fee; each additional molecule costs ¥0.05. You are charged only after a parse succeeds; failed or timed-out tasks are not billed. If the balance is insufficient, the parse endpoints return 402. The homepage auth panel shows your current balance and an estimated "pages remaining" (roughly balance ÷ 0.05).
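
The pricing rule can be worked through with a short estimate. This is a sketch under the rates stated above, not the server's billing code; the function names are illustrative, and the actual charge is whatever the service deducts after a successful parse.

```python
from decimal import Decimal

PAGE_PRICE = Decimal("0.05")      # CNY per parsed page
MOLECULE_PRICE = Decimal("0.05")  # CNY per molecule beyond the included ones
FREE_MOLECULES_PER_PAGE = 2       # molecules covered by the page fee


def estimate_cost(molecules_per_page: list[int]) -> Decimal:
    """Estimate the cost of a successful parse; one list entry per page."""
    total = Decimal("0")
    for count in molecules_per_page:
        extra = max(0, count - FREE_MOLECULES_PER_PAGE)
        total += PAGE_PRICE + extra * MOLECULE_PRICE
    return total


def estimated_pages_remaining(balance: Decimal) -> int:
    """Approximate how many molecule-free pages a balance can cover."""
    return int(balance // PAGE_PRICE)


# Three pages with 0, 2, and 5 molecules: 3 page fees plus 3 extra molecules.
example_cost = estimate_cost([0, 2, 5])
pages_left = estimated_pages_remaining(Decimal("10"))
```

`Decimal` is used instead of `float` so that sums of ¥0.05 stay exact.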

Should I use UniParser-Tools or UniParser-API?

Use UniParser-Tools by default. It handles token generation, task submission, polling, and result fetching for you, which is suitable for scripts, batch jobs, and notebooks. Use the native UniParser-API only when you need direct control over HTTP requests, authentication, callbacks, concurrency, or field switches.

How should I choose between `sync=true` and `sync=false`?

With sync=true, the submit request blocks until parsing finishes. This is useful for small files, debugging, and examples. For large files, batch jobs, or high-concurrency workloads, use the default async mode sync=false, store the returned token, and call the result endpoint later to fetch the content.
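
For the async path, a polling loop might look like the sketch below. The `get_status` callable stands in for your own call to the status endpoint, and the status labels `"pending"` and `"success"` are placeholders invented for this example; check the API reference for the real values.

```python
import time
from typing import Callable


def wait_for_task(get_status: Callable[[str], str], token: str,
                  interval: float = 2.0, timeout: float = 600.0,
                  sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll get_status(token) until the task leaves the pending state."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status(token)
        if status != "pending":  # hypothetical status label
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {token} still pending after {timeout} s")
        sleep(interval)


# Demo with a fake status source instead of a real HTTP call.
_fake_states = iter(["pending", "pending", "success"])
polls: list[str] = []


def fake_get_status(token: str) -> str:
    polls.append(token)
    return next(_fake_states)


final_status = wait_for_task(fake_get_status, "demo-token", sleep=lambda s: None)
```

Keep the polling interval modest: result endpoints are rate limited per user as well, even if more leniently than submission endpoints.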

What should I do if my file exceeds the limit?

Standard visitor accounts are limited to 200 MB and 2000 pages by default. Paid users or accounts with higher quotas default to 300 MB and 6000 pages; the actual quota depends on your API application and account configuration. If a file exceeds the limit, compress or split the PDF, or request a higher quota.
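
Before submitting, a client can check a file against its quota and plan how to split it. The sketch below only computes page ranges; it does not split the PDF itself, and it assumes 1 MB means 1024 × 1024 bytes, which this documentation does not specify. Both function names are illustrative.

```python
def exceeds_limits(size_bytes: int, pages: int, max_mb: int, max_pages: int) -> bool:
    """Return True if the file is over either the size or the page limit."""
    # Assumption: 1 MB = 1024 * 1024 bytes.
    return size_bytes > max_mb * 1024 * 1024 or pages > max_pages


def split_page_ranges(total_pages: int, max_pages: int) -> list[tuple[int, int]]:
    """Return 1-based inclusive (first, last) page ranges within the limit."""
    if total_pages < 1 or max_pages < 1:
        raise ValueError("total_pages and max_pages must be positive")
    return [(start, min(start + max_pages - 1, total_pages))
            for start in range(1, total_pages + 1, max_pages)]


# A 4500-page document under the default 2000-page visitor limit.
ranges = split_page_ranges(4500, 2000)
```

Each range can then be exported to its own PDF with the tool of your choice and submitted as a separate task with its own token.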

What should I do after a `429` rate-limit response?

The current rate limit is implemented as an approximate sliding-window mechanism. A 429 response means the request rate has exceeded the server-side limit for the current user, endpoint, or global entry point.

User-level rate limits mainly apply to two groups of APIs:

Parsing request APIs: endpoints that submit PDF, URL, snip, or image parsing tasks. These are limited according to the account rate S.
Result retrieval APIs: endpoints that fetch task status, formatted results, paragraphs, or view data. These are also limited per user, but their limits are usually much more lenient than those for parsing submission, so clients can poll and fetch results in batches.

Some management, information retrieval, or public-entry APIs also have global rate limits. Global limits are not user-specific; they protect the backend from overload caused by abnormal traffic.

For client-side concurrency control, we recommend using a token-bucket algorithm. Tokens are added to the bucket at rate R, and the bucket capacity is B. Each request consumes one token before it is sent; if no token is available, the request should wait or remain queued. This smooths bursty batch traffic and avoids hitting server-side rate limits.

        Token generator (rate R)
              |
              v   (adds tokens continuously)
        +--------------+
        | Token bucket |  max capacity B
        | o o o        |
        +--------------+
              |
              v   (incoming request consumes a token)
        [ Pending request ] --------> [ Allowed ]
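
The token bucket described above can be sketched in a few lines of Python. This is a client-side illustration, not code shipped with UniParser; the class name and the injectable `clock` and `sleep` parameters are choices made for this example so the behavior can be tested without real waiting.

```python
import threading
import time


class TokenBucket:
    """Token bucket: refills at `rate` tokens per second up to `capacity`."""

    def __init__(self, rate: float, capacity: float,
                 clock=time.monotonic, sleep=time.sleep):
        if rate <= 0 or capacity < 1:
            raise ValueError("rate must be > 0 and capacity must be >= 1")
        self.rate = rate
        self.capacity = capacity
        self._tokens = float(capacity)  # start with a full bucket
        self._clock = clock
        self._sleep = sleep
        self._last = clock()
        self._lock = threading.Lock()

    def _refill(self) -> None:
        # Add the tokens earned since the last check, capped at capacity.
        now = self._clock()
        earned = (now - self._last) * self.rate
        self._tokens = min(self.capacity, self._tokens + earned)
        self._last = now

    def try_acquire(self) -> bool:
        """Take one token if available; return False instead of waiting."""
        with self._lock:
            self._refill()
            if self._tokens >= 1:
                self._tokens -= 1
                return True
            return False

    def acquire(self) -> None:
        """Block until one token is available, then take it."""
        while True:
            with self._lock:
                self._refill()
                if self._tokens >= 1:
                    self._tokens -= 1
                    return
                wait = (1 - self._tokens) / self.rate
            self._sleep(wait)  # sleep outside the lock


# Demo with a fake clock: R = 2 tokens per second, B = 3.
_now = [0.0]
bucket = TokenBucket(rate=2.0, capacity=3, clock=lambda: _now[0])
burst = [bucket.try_acquire() for _ in range(4)]  # fourth call finds an empty bucket
_now[0] += 0.5                                    # half a second earns one token
after_wait = [bucket.try_acquire(), bucket.try_acquire()]
```

In a real client, call `bucket.acquire()` immediately before each request. Choose R at or below your account's rate S and keep B small so that bursts stay within the server-side window.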

If you still hit 429 frequently, reduce submission frequency and concurrency, and add backoff before retrying. For batch jobs, queue tasks and submit them asynchronously. If your business requires higher throughput, request a higher rate limit.
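
One common way to add backoff is exponential delay with random jitter, sketched below. The `send` callable is a placeholder for your own request function and is assumed to return a `(status_code, body)` tuple; the delay values are illustrative defaults, not limits published by the service.

```python
import random
import time
from typing import Callable, Tuple


def send_with_backoff(send: Callable[[], Tuple[int, str]],
                      max_attempts: int = 5,
                      base_delay: float = 1.0,
                      max_delay: float = 30.0,
                      sleep: Callable[[float], None] = time.sleep) -> Tuple[int, str]:
    """Call send() and retry on 429 with exponentially growing, jittered waits."""
    status, body = send()
    for attempt in range(1, max_attempts):
        if status != 429:
            break
        # The cap doubles per retry: base_delay, 2 * base_delay, 4 * base_delay, ...
        cap = min(max_delay, base_delay * 2 ** (attempt - 1))
        sleep(random.uniform(0, cap))  # random wait so clients do not retry in lockstep
        status, body = send()
    return status, body  # may still be 429 if every attempt was limited


# Demo with a fake sender: two 429 responses, then success.
_responses = iter([(429, "Rate limit exceeded"), (429, "Rate limit exceeded"), (200, "ok")])
waits: list[float] = []
result = send_with_backoff(lambda: next(_responses), sleep=waits.append)
```

Backoff complements the token bucket: the bucket keeps the steady rate under the limit, while backoff handles the occasional 429 that still gets through.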

What is the difference between `content`, `objects`, `pages_dict`, and `pages_tree`?

content returns the formatted full text of the document, suitable for reading, search, and LLM input. objects returns a flat list of semantic blocks, suitable for block-level processing of coordinates, types, and text. pages_dict returns a flat layout for each page. pages_tree returns a tree structure for each page, suitable when you need parent-child hierarchy.

Which field should I use for table results?

If you only need readable text, fetch content and set the table output format to markdown or html. If you need structured table objects, inspect table blocks in objects, pages_dict, or pages_tree. With table=1 (fast), you usually reconstruct tables from structure, placeholders, and contents; with table=2 (hq), structure usually contains the complete table data directly.

Should I set both `textual` and `table`?

Set each one according to what you need. table controls structure recognition and cell-content recognition within table regions; table captions, table footnotes, and body paragraphs are mainly controlled by textual. If you want the highest quality for both body text and tables, set both textual=2 and table=2.

Preface