Upload Security
Every document uploaded to Virza goes through a multi-layer security process before it enters your workspace. This page explains the architecture designed to protect your research data and your team.
How uploads work
Virza uses a 3-phase presigned upload flow. Your document data never passes through Virza’s API server. It goes directly from your browser to encrypted cloud storage.
Your browser              Virza API                  Cloud storage (R2)
│                            │                          │
│ 1. Request upload URL ───→ │                          │
│ ←── Presigned PUT URL ──── │                          │
│                            │                          │
│ 2. Upload file directly ────────────────────────────→ │ (quarantine bucket)
│                            │                          │
│ 3. Confirm upload ───────→ │                          │
│                            │ ── Virus scan ─────────→ │
│                            │ ←── Clean ────────────── │
│                            │ ── Move to production ─→ │
│ ←── Processing started ─── │                          │
Phase 1: Request upload URL
Your browser requests a presigned upload URL from the Virza API. The API validates:
- File type is supported (PDF, DOCX, or TXT)
- File size is within your plan’s limits
- Your workspace has capacity for another document
The API returns a time-limited presigned URL (expires in 5 minutes) pointing to a quarantine bucket: a separate storage zone isolated from your workspace data.
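The Phase 1 checks can be pictured with a minimal sketch. It is not Virza's actual implementation: `PlanLimits`, `validate_upload_request`, and the example limits are hypothetical names and values chosen for illustration. Only the three accepted types and the 5-minute expiry come from this page.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# MIME types for PDF, DOCX, and TXT, as listed on this page.
SUPPORTED_TYPES = {
    "application/pdf",
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    "text/plain",
}
URL_TTL = timedelta(minutes=5)  # presigned URL lifetime stated above


@dataclass
class PlanLimits:
    """Hypothetical container for per-plan limits."""
    max_file_bytes: int
    max_documents: int


def validate_upload_request(mime_type, size_bytes, documents_in_workspace, limits):
    """Return a list of rejection reasons; an empty list means the request passes."""
    errors = []
    if mime_type not in SUPPORTED_TYPES:
        errors.append("unsupported file type")
    if size_bytes > limits.max_file_bytes:
        errors.append("file exceeds plan size limit")
    if documents_in_workspace >= limits.max_documents:
        errors.append("workspace is at document capacity")
    return errors


def url_expiry(issued_at):
    """Time after which a presigned URL issued at `issued_at` stops working."""
    return issued_at + URL_TTL
```

Returning every failed check, instead of stopping at the first one, lets a client show all problems with a file in a single round trip.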
Phase 2: Direct upload
Your browser uploads the file directly to cloud storage using the presigned URL. The file data never touches the Virza API server. This means:
- The API server is never in the path of your file, so it cannot be used to intercept file contents
- File contents are never held in API server memory
- Faster uploads: direct to storage infrastructure
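From the client's point of view, the three phases can be sketched as follows. The transport functions and the `upload_url` and `document_id` field names are assumptions made for illustration, not Virza's published API. The point of the sketch is that the file bytes are handed only to the storage call.

```python
def upload_document(data, mime_type, request_url, put_to_storage, confirm):
    """Run the three upload phases using caller-supplied transport functions.

    request_url, put_to_storage, and confirm are hypothetical callables that
    stand in for the HTTP requests a real client would make.
    """
    # Phase 1: the API sees only metadata (type and size), never the bytes.
    ticket = request_url(mime_type=mime_type, size_bytes=len(data))

    # Phase 2: the bytes go straight to storage via the presigned URL.
    put_to_storage(ticket["upload_url"], data)

    # Phase 3: tell the API the upload finished so it can scan and process.
    return confirm(ticket["document_id"])
```

Passing the transports in as arguments keeps the sketch free of network code and makes it easy to check that neither API call receives the file contents.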
Phase 3: Confirmation and virus scan
After upload completes, your browser notifies the API. The API then:
- Verifies the file exists in the quarantine bucket
- Runs a virus/malware scan on the file
- If clean: moves the file to the production storage bucket and starts processing
- If infected: marks the document as Infected, deletes the file, and notifies you
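The confirmation steps above amount to a small decision procedure. The sketch below models the two buckets as dictionaries and the scanner as any callable. These stand-ins, and the `MISSING` status, are assumptions for illustration, not a description of Virza's internals.

```python
from enum import Enum


class Status(Enum):
    PROCESSING = "Processing"
    INFECTED = "Infected"
    MISSING = "Missing"  # hypothetical status for a file that never arrived


def confirm_upload(key, quarantine, production, scan):
    """Decide a file's fate after the browser confirms the upload.

    quarantine and production are dicts standing in for storage buckets;
    scan is any callable that returns True when the bytes are clean.
    """
    if key not in quarantine:
        return Status.MISSING

    # The file leaves quarantine whichever way the scan goes.
    data = quarantine.pop(key)
    if not scan(data):
        return Status.INFECTED  # bytes are discarded, never promoted

    production[key] = data
    return Status.PROCESSING
```

Because the only write to `production` sits after a passing scan, an infected file has no code path into the production bucket.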
Virus scanning
Every file is scanned before any processing begins. Files that fail the scan are:
- Kept out of production storage (they never leave the quarantine bucket)
- Permanently deleted from storage
- Marked with an Infected status in your library
- Never processed or indexed
If you see an “Infected” status, the file contained detected malware or a known threat signature. Do not re-upload the same file. Scan the original with local antivirus software.
File validation
Before processing, every file goes through a preflight validation check:
| Check | What it validates |
|---|---|
| MIME type | Only application/pdf, application/vnd.openxmlformats-officedocument.wordprocessingml.document, and text/plain are accepted |
| File size | Must be within your plan’s maximum (10 MB–500 MB depending on plan) |
| Page count | PDF page count must be within your plan’s maximum (50–5,000 depending on plan) |
| Encryption | Password-protected and DRM-encrypted PDFs are rejected |
| File integrity | SHA-256 hash is computed for duplicate detection and integrity verification |
Blocked file types include executables (.exe, .bat), archives (.zip, .tar), and other non-document formats.
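As an illustration of the file integrity row, here is a minimal sketch of how a SHA-256 hash can serve both duplicate detection and integrity verification. The function names and the `known_hashes` set are hypothetical. The hashing itself uses Python's standard `hashlib`.

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size=1024 * 1024):
    """Hash a file-like object in chunks so large files are not loaded whole."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def check_upload(stream, known_hashes, expected_hash=None):
    """Return (hash, verdict), where verdict is 'corrupt', 'duplicate', or 'new'.

    expected_hash is an optional hash computed by the sender before upload;
    a mismatch means the bytes changed in transit.
    """
    actual = sha256_of_stream(stream)
    if expected_hash is not None and actual != expected_hash:
        return actual, "corrupt"
    if actual in known_hashes:
        return actual, "duplicate"
    return actual, "new"
```

For example, `check_upload(io.BytesIO(b"hello"), set())` reports the file as new, and passing the returned hash back in `known_hashes` on a second call reports it as a duplicate.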
Storage encryption
All stored files are encrypted at rest using AES-256 encryption. Files are stored in workspace-scoped paths that enforce tenant isolation. There is no shared storage between workspaces.
For enterprise security teams
- No file content in API logs: file data is never logged or cached by the API server
- Presigned URLs expire after 5 minutes: an expired URL cannot be reused or shared
- Quarantine isolation: infected files never reach the production storage bucket
- SHA-256 integrity verification: file integrity is validated end-to-end
- Workspace-scoped storage paths:
workspaces/{workspace_id}/documents/{document_id}/original.{ext}
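A short sketch of how a path in this format could be built and checked follows. The allowed ID alphabet and both function names are assumptions for illustration. Only the path template comes from this page.

```python
import re

# Assumed ID alphabet; rejecting anything else keeps "/" and ".." out of keys.
_SAFE_PART = re.compile(r"[A-Za-z0-9_-]+")


def storage_key(workspace_id, document_id, ext):
    """Build the workspace-scoped key for a document's original file."""
    for part in (workspace_id, document_id, ext):
        if not _SAFE_PART.fullmatch(part):
            raise ValueError(f"unsafe path component: {part!r}")
    return f"workspaces/{workspace_id}/documents/{document_id}/original.{ext}"


def belongs_to_workspace(key, workspace_id):
    """True when the key lives under the given workspace's prefix."""
    # The trailing slash matters: without it, "ws_1" would also match "ws_10".
    return key.startswith(f"workspaces/{workspace_id}/")
```

Validating each component before building the key means a crafted ID cannot escape its workspace prefix.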