Extract data from PDFs and create a searchable collection
Learn to create extraction templates, process PDF documents, and enable AI-powered queries on extracted data.
What you will achieve
By the end of this tutorial you will have a complete flow: upload a PDF, extract structured data with AI, and allow an agent to answer questions based on that information. Estimated time: 20 minutes.
Prerequisites
- An active Rela AI account with dashboard access
- A test PDF file (e.g., equipment data sheet, purchase order, or certificate)
- A configured agent (optional, for the query step)
Step 1: Create an extraction template
- In the sidebar, go to Data > Templates.
- Click New Template.
- Define the fields you want to extract:
| Field | Type | Description |
|---|---|---|
| serial_number | text | Equipment serial number |
| manufacturer | text | Manufacturer name |
| manufacturing_date | date | Manufacturing date |
| power_kw | number | Rated power in kW |
| voltage | number | Operating voltage |
- Assign a name to the template: Equipment Data Sheet.
- Click Save.
You should see the template listed in the table with the 5 configured fields.
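The field types in the table decide how each extracted value is interpreted. The following is a minimal local sketch of that idea, not Rela AI's implementation: the `TEMPLATE` dictionary, `coerce`, and `try_coerce` are hypothetical names, and the parsing rules (for example, ISO dates) are assumptions made for illustration.

```python
from datetime import date

# Hypothetical local mirror of the template defined above.
# The type names (text, number, date) come from the tutorial;
# the parsing rules below are illustrative assumptions.
TEMPLATE = {
    "serial_number": "text",
    "manufacturer": "text",
    "manufacturing_date": "date",
    "power_kw": "number",
    "voltage": "number",
}


def coerce(field_type: str, raw: str):
    """Convert a raw string to a Python value for the given field type.

    Raises ValueError when the string does not fit the type.
    """
    if field_type == "text":
        return raw
    if field_type == "number":
        return float(raw)  # float("N/A") raises ValueError
    if field_type == "date":
        return date.fromisoformat(raw)  # assumes YYYY-MM-DD input
    raise ValueError(f"unknown field type: {field_type}")


def try_coerce(field_type: str, raw: str):
    """Return (value, None) on success or (None, error message) on failure."""
    try:
        return coerce(field_type, raw), None
    except ValueError as exc:
        return None, str(exc)
```

In this sketch, `try_coerce("number", "N/A")` returns an error while `try_coerce("text", "N/A")` succeeds, which is why a text field is the safer choice for values that are not always numeric.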
If a field is defined as number but the PDF contains text like "N/A", the extraction will fail for that field. Use text if the value may not be numeric.
Step 2: Upload a PDF document
- Go to Data > Extractions.
- Click New Extraction.
- Select the template Equipment Data Sheet.
- Drag your PDF file to the upload area or click to select it.
- Click Start Extraction.
You should see a progress indicator while the AI analyzes the document.
Step 3: The AI extracts the fields
- Wait for the status to change to Completed (usually 10-30 seconds).
- Review the results in the preview:
{
  "serial_number": "SN-2024-00847",
  "manufacturer": "Siemens",
  "manufacturing_date": "2024-03-15",
  "power_kw": 75,
  "voltage": 480
}
You should see each extracted field with its value and a confidence indicator.
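If you copy the preview JSON out of the dashboard, a quick local check can flag obvious problems before you approve the record. This is a hedged sketch: `check_record` and `EXPECTED_FIELDS` are hypothetical helpers, and it assumes the record has exactly the shape shown in the preview, with an ISO-formatted date.

```python
import json
from datetime import date

# Field names taken from the Equipment Data Sheet template in Step 1.
EXPECTED_FIELDS = {
    "serial_number",
    "manufacturer",
    "manufacturing_date",
    "power_kw",
    "voltage",
}

# The preview JSON from this step, pasted as a string.
PREVIEW = """
{
  "serial_number": "SN-2024-00847",
  "manufacturer": "Siemens",
  "manufacturing_date": "2024-03-15",
  "power_kw": 75,
  "voltage": 480
}
"""


def check_record(raw_json: str) -> list[str]:
    """Return a list of problems found; an empty list means nothing was flagged."""
    record = json.loads(raw_json)
    problems = []
    for name in sorted(EXPECTED_FIELDS - set(record)):
        problems.append(f"missing field: {name}")
    for name in sorted(set(record) - EXPECTED_FIELDS):
        problems.append(f"unexpected field: {name}")
    # Number fields must be JSON numbers, not strings or booleans.
    for name in ("power_kw", "voltage"):
        if name in record and type(record[name]) not in (int, float):
            problems.append(f"{name} is not numeric: {record[name]!r}")
    # Assumption: dates are exported as YYYY-MM-DD, as in the preview.
    if "manufacturing_date" in record:
        value = record["manufacturing_date"]
        try:
            date.fromisoformat(value)
        except (TypeError, ValueError):
            problems.append(f"manufacturing_date is not an ISO date: {value!r}")
    return problems


print(check_record(PREVIEW))
```

A check like this does not replace the manual review in the next step; it only catches structural issues such as a missing field or a string where a number is expected.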
Step 4: Verify and save the data
- Review each extracted field and correct any errors.
- If everything looks correct, click Approve & Save.
- The data is stored as a record within a collection.
You should see the saved record in Data > Collections inside the collection associated with the template.
Step 5: Create a query tool
- Go to Tools > New Tool.
- Select the type Collection Query.
- Configure:
  - Name: Query Equipment Sheets
  - Collection: select the collection for the Equipment Data Sheet template
  - Description for AI: Use this tool to look up technical equipment information such as serial number, manufacturer, power, and voltage.
- Click Create.
You should see the tool listed and available for assignment to agents.
Step 6: The agent answers with extracted data
- Go to Agents and select your agent.
- In the Tools section, add Query Equipment Sheets.
- Save the changes.
- Send a message to the agent: "What is the power rating for equipment SN-2024-00847?"
You should see a response like: "Equipment SN-2024-00847 manufactured by Siemens has a rated power of 75 kW."
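Conceptually, the agent's answer comes from looking up a record in the collection and phrasing the result. The sketch below imitates that lookup in memory so you can see the data flow. It is an illustration only: `COLLECTION`, `find_by_serial`, and `describe_power` are hypothetical names, and the real Collection Query tool may match and rank records differently.

```python
# Hypothetical in-memory stand-in for the collection saved in Step 4.
COLLECTION = [
    {
        "serial_number": "SN-2024-00847",
        "manufacturer": "Siemens",
        "manufacturing_date": "2024-03-15",
        "power_kw": 75,
        "voltage": 480,
    },
]


def find_by_serial(records, serial_number):
    """Return the first record whose serial number matches, or None."""
    wanted = serial_number.strip().upper()
    for record in records:
        if record["serial_number"].upper() == wanted:
            return record
    return None


def describe_power(records, serial_number):
    """Build a one-sentence answer about rated power from the stored record."""
    record = find_by_serial(records, serial_number)
    if record is None:
        return f"No record found for {serial_number}."
    return (
        f"Equipment {record['serial_number']} manufactured by "
        f"{record['manufacturer']} has a rated power of {record['power_kw']} kW."
    )


print(describe_power(COLLECTION, "SN-2024-00847"))
```

The point of the sketch is that the answer contains only values present in the stored record, which is why reviewing the extraction in Step 4 matters.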
Summary
| Step | Action | Result |
|---|---|---|
| 1 | Create template | 5-field structure defined |
| 2 | Upload PDF | Document loaded for extraction |
| 3 | AI extraction | Fields extracted automatically |
| 4 | Verify and save | Record stored in collection |
| 5 | Create tool | Collection query available |
| 6 | Query via agent | Response based on real data |
Next steps
- Process multiple PDFs in batch to load large volumes of documents
- Configure automatic extractions with processing rules
- Generate reports from extracted data