Tutorial September 2025

AI-Powered Document Auto-Tagging

Use AI assistants to automatically tag documents with custom metadata

The Manual Tagging Problem

Organizations deal with thousands of documents: contracts, support tickets, research papers, customer communications. Properly tagging these documents is essential for search and organization, but doing it manually is time-consuming and inconsistent.

SINAS's Solution

SINAS combines three systems to enable AI-powered auto-tagging:

  • Tag Definitions: Define structured metadata schemas
  • AI Assistants: Configure AI models to analyze content
  • Tagger Rules: Automatically apply tags to documents

Step-by-Step Tutorial

1. Define Your Tags

First, create tag definitions that describe what metadata you want to capture:

POST /api/v1/tags/definitions
{
  "name": "priority",
  "display_name": "Priority Level",
  "value_type": "enum",
  "applies_to": ["document"],
  "allowed_values": ["low", "medium", "high", "critical"]
}

POST /api/v1/tags/definitions
{
  "name": "category",
  "value_type": "enum",
  "applies_to": ["document"],
  "allowed_values": ["technical", "business", "legal", "financial"]
}

POST /api/v1/tags/definitions
{
  "name": "year",
  "value_type": "string",
  "applies_to": ["document"]
}
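The three requests above can also be prepared from a script. The following is a minimal Python sketch using only the standard library. The host name in BASE_URL and the helper build_definition_request are assumptions for illustration, and authentication is omitted because this tutorial does not describe it.

```python
import json
from urllib.request import Request

# Hypothetical host; replace with the address of your own SINAS instance.
BASE_URL = "https://sinas.example.com"

def build_definition_request(definition, base_url=BASE_URL):
    """Build (but do not send) the POST request for one tag definition."""
    return Request(
        url=f"{base_url}/api/v1/tags/definitions",
        data=json.dumps(definition).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# The same three definitions shown in the request bodies above.
definitions = [
    {"name": "priority", "display_name": "Priority Level", "value_type": "enum",
     "applies_to": ["document"],
     "allowed_values": ["low", "medium", "high", "critical"]},
    {"name": "category", "value_type": "enum", "applies_to": ["document"],
     "allowed_values": ["technical", "business", "legal", "financial"]},
    {"name": "year", "value_type": "string", "applies_to": ["document"]},
]

requests_to_send = [build_definition_request(d) for d in definitions]
```

Each prepared request can then be sent with urllib.request.urlopen once the host and any required credentials are filled in.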

2. Create a Tagging Assistant

Configure an AI assistant specifically for tagging:

POST /api/v1/assistants
{
  "name": "Document Tagger",
  "provider": "openai",
  "model": "gpt-4",
  "system_prompt": "You are a document classification assistant. Analyze documents and return structured tags based on content, urgency, and category. Return only valid JSON with the specified tag schema.",
  "output_schema": {
    "type": "object",
    "properties": {
      "priority": {"type": "string", "enum": ["low", "medium", "high", "critical"]},
      "category": {"type": "string", "enum": ["technical", "business", "legal", "financial"]},
      "year": {"type": "string"}
    }
  }
}
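The enum lists in output_schema repeat the allowed_values from step 1, so the two can drift apart when a definition changes. One way to avoid that, sketched here in Python, is to generate the schema from the tag definitions. build_output_schema is a hypothetical helper written for this tutorial, not a SINAS function.

```python
def build_output_schema(definitions):
    """Build a JSON Schema object with one string property per tag definition."""
    properties = {}
    for definition in definitions:
        prop = {"type": "string"}
        # Enum tags carry their allowed values into the schema unchanged.
        if definition.get("value_type") == "enum":
            prop["enum"] = list(definition["allowed_values"])
        properties[definition["name"]] = prop
    return {"type": "object", "properties": properties}

tag_definitions = [
    {"name": "priority", "value_type": "enum",
     "allowed_values": ["low", "medium", "high", "critical"]},
    {"name": "category", "value_type": "enum",
     "allowed_values": ["technical", "business", "legal", "financial"]},
    {"name": "year", "value_type": "string"},
]

output_schema = build_output_schema(tag_definitions)
```

The resulting dictionary has the same shape as the output_schema in the request above and can be placed directly into the assistant payload.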

3. Create a Tagger Rule

Link your tag definitions and assistant together with a tagger rule:

POST /api/v1/tags/tagger-rules
{
  "name": "Contract Auto-Tagger",
  "description": "Automatically tag contracts with priority, category, and year",
  "scope_type": "folder",
  "folder_id": "contracts_folder_uuid",
  "assistant_id": "tagger_assistant_uuid",
  "tag_definition_ids": [
    "priority_tag_uuid",
    "category_tag_uuid",
    "year_tag_uuid"
  ],
  "auto_trigger": true
}

4. Upload Documents

When you upload documents to the folder, they're automatically tagged:

POST /api/v1/documents
{
  "name": "Enterprise Agreement 2025",
  "content": "This agreement dated January 15, 2025...",
  "folder_id": "contracts_folder_uuid"
}

The tagger rule automatically:

  • Sends the document content to the AI assistant
  • Receives structured tag suggestions
  • Validates the suggestions against the tag definitions
  • Applies the tags to the document
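The validation step in the list above can be pictured with a small Python sketch. This is only an illustration of the idea under simple assumptions (every tag value is a string, enum tags must match an allowed value); validate_suggestions is a hypothetical function, and SINAS's own validation may behave differently.

```python
import json

# Local copy of the tag definitions from step 1, keyed by tag name.
DEFINITIONS = {
    "priority": {"value_type": "enum",
                 "allowed_values": ["low", "medium", "high", "critical"]},
    "category": {"value_type": "enum",
                 "allowed_values": ["technical", "business", "legal", "financial"]},
    "year": {"value_type": "string"},
}

def validate_suggestions(raw_output, definitions=DEFINITIONS):
    """Split an assistant's JSON reply into accepted and rejected tags."""
    suggestions = json.loads(raw_output)
    if not isinstance(suggestions, dict):
        raise ValueError("expected a JSON object of tag names to values")
    accepted, rejected = {}, {}
    for name, value in suggestions.items():
        definition = definitions.get(name)
        ok = (
            definition is not None          # tag must be defined
            and isinstance(value, str)      # values are assumed to be strings
            and (definition["value_type"] != "enum"
                 or value in definition["allowed_values"])
        )
        (accepted if ok else rejected)[name] = value
    return accepted, rejected

reply = '{"priority": "urgent", "category": "legal", "year": "2025"}'
accepted, rejected = validate_suggestions(reply)
```

Here "urgent" is not an allowed priority, so only category and year would be applied under these assumptions.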

Running the Tagger on Demand

You can also trigger a tagger rule manually for a specific document:

POST /api/v1/tags/document/{document_id}/run-tagger
{
  "tagger_rule_id": "rule_uuid"
}

Bulk Tagging

Tag all documents in a folder at once:

POST /api/v1/tags/tagger-rules/{rule_id}/run-bulk
{
  "folder_id": "contracts_folder_uuid",
  "force_retag": false
}

Bulk runs are useful for:

  • Tagging existing document collections
  • Re-tagging after adding new tag definitions
  • Fixing incorrect tags from a previous run (set force_retag to true so existing tags are overwritten)

Querying Tagged Documents

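Once tagged, documents can be filtered by tag. The tags parameter is a JSON array, shown here unencoded and wrapped across lines for readability; URL-encode it in real requests: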

GET /api/v1/documents?tags=[{"key":"priority","value":"high"}]&tag_match=AND

GET /api/v1/documents?tags=[
  {"key":"category","value":"legal"},
  {"key":"year","value":"2025"}
]&tag_match=AND
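Because the tags parameter contains brackets, braces, and quotes, it has to be URL-encoded before it goes on the wire. A minimal Python sketch using the standard library follows; build_tag_query is a hypothetical helper name, and the parameter names tags and tag_match are taken from the examples above.

```python
import json
from urllib.parse import urlencode

def build_tag_query(tags, tag_match="AND"):
    """Return a URL-encoded query string for the documents endpoint."""
    # Compact JSON (no spaces) matches the array format used in the examples.
    compact = json.dumps(
        [{"key": key, "value": value} for key, value in tags.items()],
        separators=(",", ":"),
    )
    return urlencode({"tags": compact, "tag_match": tag_match})

query = build_tag_query({"category": "legal", "year": "2025"})
url = "/api/v1/documents?" + query
```

Decoding the query string on the server side yields the same JSON array as in the unencoded examples.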

Advanced Features

Tag Value Counts

See the distribution of values for a tag:

GET /api/v1/tags/values/priority?resource_type=document

Response:
{
  "tag_name": "priority",
  "values": [
    {"value": "high", "count": 45},
    {"value": "medium", "count": 32},
    {"value": "low", "count": 18}
  ]
}
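A response like this is easy to post-process. The Python sketch below, using the sample response above as input, turns the counts into shares and lists allowed values that no document currently carries. Both helper functions are hypothetical and only assume the response shape shown here.

```python
def value_shares(response):
    """Map each tag value to its fraction of all counted documents."""
    total = sum(item["count"] for item in response["values"])
    if total == 0:
        return {}
    return {item["value"]: item["count"] / total for item in response["values"]}

def unused_values(response, allowed_values):
    """Return allowed values that do not appear in the response at all."""
    seen = {item["value"] for item in response["values"]}
    return [value for value in allowed_values if value not in seen]

sample = {
    "tag_name": "priority",
    "values": [
        {"value": "high", "count": 45},
        {"value": "medium", "count": 32},
        {"value": "low", "count": 18},
    ],
}

shares = value_shares(sample)
missing = unused_values(sample, ["low", "medium", "high", "critical"])
```

In the sample, "critical" never appears, which may mean no document is critical or that the prompt never leads the assistant to choose it.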

Multiple Tagger Rules

Apply different tagging strategies to different folders. For example:

  • Legal documents get compliance tags
  • Support tickets get sentiment and urgency tags
  • Research papers get topic and methodology tags

Email Auto-Tagging

The same system works for incoming emails:

POST /api/v1/tags/tagger-rules
{
  "scope_type": "inbox",
  "inbox_id": "support_inbox_uuid",
  "assistant_id": "email_tagger_uuid",
  "tag_definition_ids": ["sentiment_tag", "urgency_tag"]
}

Best Practices

1. Start with Clear Tag Definitions

Use enums for categorical tags to ensure consistency. Provide clear descriptions that help the AI understand when to apply each value.

2. Tune Your Assistant Prompt

Give examples in the system prompt of how to classify edge cases. The better your prompt, the more accurate the tagging.
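One way to keep such examples manageable is to store them as data and assemble the system prompt from them. The Python sketch below shows the idea. The layout of the prompt and the sample documents are assumptions made for illustration, not a format SINAS requires.

```python
import json

def build_system_prompt(base_prompt, examples):
    """Append worked classification examples to a base system prompt."""
    lines = [base_prompt, "", "Examples:"]
    for text, tags in examples:
        lines.append(f"Document: {text}")
        # sort_keys keeps the example output stable between runs.
        lines.append(f"Tags: {json.dumps(tags, sort_keys=True)}")
    return "\n".join(lines)

BASE_PROMPT = (
    "You are a document classification assistant. "
    "Return only valid JSON with the specified tag schema."
)

# Hypothetical edge cases chosen for illustration.
EDGE_CASES = [
    ("Notice of termination effective in 10 days...",
     {"category": "legal", "priority": "critical"}),
    ("Quarterly invoice summary, no action required...",
     {"category": "financial", "priority": "low"}),
]

system_prompt = build_system_prompt(BASE_PROMPT, EDGE_CASES)
```

The resulting string can be used as the system_prompt field when creating or updating the assistant.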

3. Validate and Iterate

Review a sample of auto-tagged documents periodically. When you find an incorrect tag, add that document and its correct tags to the system prompt as an example.
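A periodic review is easier to act on when it produces a number per tag. The Python sketch below assumes you have collected pairs of predicted and human-corrected tags for a sample of documents. The data layout and the function per_tag_accuracy are hypothetical.

```python
from collections import Counter

def per_tag_accuracy(reviewed):
    """Compute the fraction of correct predictions for each tag name.

    reviewed is a list of (predicted_tags, corrected_tags) dictionary pairs.
    """
    correct, total = Counter(), Counter()
    for predicted, corrected in reviewed:
        for name, truth in corrected.items():
            total[name] += 1
            if predicted.get(name) == truth:
                correct[name] += 1
    return {name: correct[name] / total[name] for name in total}

# Hypothetical review sample: the second priority prediction was wrong.
reviewed_sample = [
    ({"priority": "high", "category": "legal"},
     {"priority": "high", "category": "legal"}),
    ({"priority": "low", "category": "legal"},
     {"priority": "critical", "category": "legal"}),
]

accuracy = per_tag_accuracy(reviewed_sample)
```

Tags with low accuracy are the ones whose prompt guidance most needs attention.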

4. Use force_retag Carefully

When running bulk operations, setting force_retag to true overwrites existing tags. Use it only after you've improved your tagging logic and want to reprocess everything.
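The effect of the flag can be pictured with a short Python sketch. It assumes that a bulk run with force_retag set to false skips documents that already have tags, which this tutorial implies but does not state outright, so treat the selection logic as an illustration rather than SINAS's actual behavior.

```python
def select_for_tagging(documents, force_retag=False):
    """Return the documents a bulk run would process under the assumed rules."""
    if force_retag:
        # Every document is reprocessed and its existing tags are overwritten.
        return list(documents)
    # Assumption: already-tagged documents are left untouched.
    return [doc for doc in documents if not doc.get("tags")]

# Hypothetical folder contents.
folder = [
    {"id": "a", "tags": {"priority": "high"}},
    {"id": "b", "tags": {}},
    {"id": "c"},
]

untagged_only = [doc["id"] for doc in select_for_tagging(folder)]
everything = [doc["id"] for doc in select_for_tagging(folder, force_retag=True)]
```

Under these assumptions a default run touches only documents b and c, while a forced run also replaces the tags on document a.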

Real-World Example

A law firm used SINAS to automatically tag contracts:

  • 10,000+ contracts tagged in under an hour
  • 95%+ accuracy on contract type classification
  • Searchable archive with multi-tag filtering
  • Compliance tracking with automatic expiration date extraction

Conclusion

AI-powered auto-tagging turns document management from a manual chore into an automated process. With SINAS, you can tag thousands of documents with structured metadata in minutes, making your archive searchable and consistently organized.