Upload Security

Every document uploaded to Virza goes through a multi-layer security process before it enters your workspace. This page explains the architecture designed to protect your research data and your team.

How uploads work

Virza uses a three-phase presigned upload flow. Your document data never passes through Virza’s API server; it goes directly from your browser to encrypted cloud storage.


Your browser                    Virza API              Cloud storage (R2)
     │                              │                        │
     │  1. Request upload URL  ───→ │                        │
     │  ←── Presigned PUT URL ────  │                        │
     │                              │                        │
     │  2. Upload file directly ──────────────────────────→  │ (quarantine bucket)
     │                              │                        │
     │  3. Confirm upload ────────→ │                        │
     │                              │  ── Virus scan ──→     │
     │                              │  ←── Clean ──────      │
     │                              │ ─ Move to production ─→│
     │  ←── Processing started ──── │                        │

Phase 1: Request upload URL

Your browser requests a presigned upload URL from the Virza API. The API validates:

File type is supported (PDF, DOCX, or TXT)
File size is within your plan’s limits
Your workspace has capacity for another document

The API returns a time-limited presigned URL (expires in 5 minutes) pointing to a quarantine bucket: a separate storage zone isolated from your workspace data.
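
The Phase 1 checks can be sketched as a single validation function. This is a minimal illustration under stated assumptions, not Virza's actual implementation: the names `PlanLimits`, `UploadRequest`, and `validate_upload_request` are hypothetical, and the limits passed in are placeholders for whatever your plan allows.

```python
from dataclasses import dataclass
from typing import List

# MIME types this page lists as accepted.
ALLOWED_MIME_TYPES = {
    "application/pdf",
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    "text/plain",
}


@dataclass(frozen=True)
class PlanLimits:
    """Hypothetical per-plan limits; real values depend on your plan."""
    max_file_bytes: int
    max_documents: int


@dataclass(frozen=True)
class UploadRequest:
    """Hypothetical shape of a Phase 1 request."""
    mime_type: str
    size_bytes: int


def validate_upload_request(
    req: UploadRequest, limits: PlanLimits, current_document_count: int
) -> List[str]:
    """Return a list of errors; an empty list means the request may proceed."""
    errors: List[str] = []
    if req.mime_type not in ALLOWED_MIME_TYPES:
        errors.append("unsupported file type")
    if req.size_bytes <= 0 or req.size_bytes > limits.max_file_bytes:
        errors.append("file size outside plan limits")
    if current_document_count >= limits.max_documents:
        errors.append("workspace is at capacity")
    return errors
```

Collecting every failure, rather than stopping at the first, lets a client show all problems with a file in one response.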

Phase 2: Direct upload

Your browser uploads the file directly to cloud storage using the presigned URL. The file data never touches the Virza API server. This means:

No interception risk at the API: the API server never sits between your browser and storage, so it cannot read or alter file contents
No server memory exposure of file contents
Faster uploads: direct to storage infrastructure
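
In Virza the direct upload is performed by your browser. For illustration only, the equivalent HTTP request can be sketched from a script using Python's standard library. The function names are hypothetical, and whether the presigned URL requires a matching `Content-Type` header is an assumption.

```python
import urllib.request


def build_put_request(
    presigned_url: str, data: bytes, content_type: str
) -> urllib.request.Request:
    """Build the HTTP PUT that sends file bytes straight to storage."""
    return urllib.request.Request(
        presigned_url,
        data=data,
        method="PUT",
        # Assumption: the storage service expects the declared content type.
        headers={"Content-Type": content_type},
    )


def upload_file(
    presigned_url: str, data: bytes, content_type: str, timeout: float = 60.0
) -> int:
    """Send the PUT and return the HTTP status code (HTTP errors raise)."""
    request = build_put_request(presigned_url, data, content_type)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

The request targets the storage URL itself, so the file bytes travel to storage without passing through the API server.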

Phase 3: Confirmation and virus scan

After upload completes, your browser notifies the API. The API then:

Verifies the file exists in the quarantine bucket
Runs a virus/malware scan on the file
If clean: moves the file to the production storage bucket and starts processing
If infected: marks the document as Infected, deletes the file, and notifies you
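
The confirmation steps above can be sketched as a small function. This is a simplified model, not Virza's implementation: the two buckets are stand-in dictionaries, `is_clean` is a placeholder for a real malware scanner, and the `Processing` status name is hypothetical (only `Infected` appears on this page).

```python
from enum import Enum
from typing import Callable, Dict


class Status(Enum):
    PROCESSING = "Processing"  # hypothetical name for the post-scan state
    INFECTED = "Infected"


def confirm_upload(
    key: str,
    quarantine: Dict[str, bytes],
    production: Dict[str, bytes],
    is_clean: Callable[[bytes], bool],
) -> Status:
    """Scan a quarantined file, then either promote it or delete it."""
    # Step 1: verify the file actually arrived in quarantine.
    if key not in quarantine:
        raise FileNotFoundError(f"{key} was not found in the quarantine bucket")

    # Step 2: scan before anything else touches the file.
    if is_clean(quarantine[key]):
        # Step 3a: clean files move to production and processing begins.
        production[key] = quarantine.pop(key)
        return Status.PROCESSING

    # Step 3b: infected files are deleted and never reach production.
    del quarantine[key]
    return Status.INFECTED
```

Scanning before removing the file from quarantine means a scanner failure leaves the file where it was, rather than losing it or promoting it unscanned.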

Virus scanning

Every file is scanned before any processing begins. Files that fail the scan are:

Kept out of production storage (they never leave the quarantine bucket)
Permanently deleted from storage
Marked with an Infected status in your library
Never processed or indexed

If you see an “Infected” status, the scanner detected malware or a known threat signature in the file. Do not re-upload the same file; scan the original with local antivirus software first.

File validation

Before processing, every file goes through a preflight validation check:

Check	What it validates
MIME type	Only `application/pdf`, `application/vnd.openxmlformats-officedocument.wordprocessingml.document`, and `text/plain` are accepted
File size	Must be within your plan’s maximum (10 MB–500 MB depending on plan)
Page count	PDF page count must be within your plan’s maximum (50–5,000 depending on plan)
Encryption	Password-protected and DRM-encrypted PDFs are rejected
File integrity	SHA-256 hash is computed for duplicate detection and integrity verification

Blocked file types include executables (.exe, .bat), archives (.zip, .tar), and other non-document formats.
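
The SHA-256 check in the table can be illustrated with Python's `hashlib`. This is a minimal sketch of hash-based duplicate detection in general; the function names and the in-memory set of known hashes are assumptions, not Virza's storage model.

```python
import hashlib
from typing import Iterable, Set


def sha256_hex(chunks: Iterable[bytes]) -> str:
    """Hash content chunk by chunk so large files need not fit in memory."""
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def is_duplicate(content_hash: str, known_hashes: Set[str]) -> bool:
    """Report whether this hash was already seen; record it if not."""
    if content_hash in known_hashes:
        return True
    known_hashes.add(content_hash)
    return False
```

Because the hash depends only on the bytes, the same document uploaded under two different file names produces the same digest, and any change to the content produces a different one.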

Storage encryption

All stored files are encrypted at rest using AES-256 encryption. Files are stored in workspace-scoped paths that enforce tenant isolation. There is no shared storage between workspaces.

For enterprise security teams

No file content in API logs: file data is never logged or cached by the API server
Presigned URLs expire in 5 minutes: a leaked or shared URL stops working once it expires
Quarantine isolation: infected files never reach the production storage bucket
SHA-256 integrity verification: file integrity is validated end-to-end
Workspace-scoped storage paths: workspaces/{workspace_id}/documents/{document_id}/original.{ext}
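
The workspace-scoped path pattern can be sketched as a key builder that rejects unsafe identifiers. The path template comes from this page; the function name, the allowed ID characters, and the extension list are assumptions made for illustration.

```python
import re

# Assumption: IDs are limited to letters, digits, underscores, and hyphens.
_SAFE_ID = re.compile(r"[A-Za-z0-9_-]+")
# Assumption: extensions mirror the supported types (PDF, DOCX, TXT).
_ALLOWED_EXTENSIONS = {"pdf", "docx", "txt"}


def storage_key(workspace_id: str, document_id: str, ext: str) -> str:
    """Build workspaces/{workspace_id}/documents/{document_id}/original.{ext}."""
    for name, value in (("workspace_id", workspace_id), ("document_id", document_id)):
        # Reject slashes, dots, and empty values so a key cannot
        # escape its own workspace prefix.
        if not _SAFE_ID.fullmatch(value):
            raise ValueError(f"invalid {name}: {value!r}")
    normalized_ext = ext.lower().lstrip(".")
    if normalized_ext not in _ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported extension: {ext!r}")
    return f"workspaces/{workspace_id}/documents/{document_id}/original.{normalized_ext}"
```

Validating identifiers before building the key is one way to keep every object under its own workspace prefix, which is what makes prefix-based tenant isolation enforceable.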