Appendix: Minimal Data Structures (Manifest/Chunk/State)

This appendix provides minimal data-structure recommendations for “parallel chunking + resumable transfer + idempotent retries”. The goal is to enable high-throughput concurrency and reliable recovery after failures, without requiring complex transactions.

General principles

  • Large objects vs. small state: chunks/manifest are large objects stored in object storage; state is small and stored in a state store.
  • Unique keys and idempotency: a chunk MUST be uniquely addressable by (transferId, fileId, chunkIndex) for idempotent retransmissions.
  • Recoverability: state MUST express the “completed set” so recovery can skip completed parts.
  • No sensitive-data leakage: object keys and state SHOULD NOT embed sensitive fields such as email addresses or original filenames (whether such fields are stored encrypted elsewhere depends on your privacy policy).

A.1 Manifest minimal fields

The manifest is the receiver’s “single entry point” for download and reassembly. Minimal recommendations:

  • manifestVersion: MUST. Used for compatibility upgrades (e.g., 1).
  • transferId: MUST. Unique transfer session identifier.
  • createdAt / expiresAt: SHOULD. For lifecycle handling and UI hints.
  • policy: SHOULD. A policy summary (download-count limits, password requirement, whether sharing-before-complete is allowed, etc.).
  • files[]: MUST. File list description (at minimum file sizes and chunk ranges).
  • chunking: MUST. Includes chunkSize and how chunkCount is computed, or per-file chunkCount.
  • objectKeyRule: MUST. Derive object keys from (transferId, fileId, chunkIndex), or provide an explicit mapping table.
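
As an illustration only, the top-level fields above might be laid out as follows. All values are placeholders, and the nested field names under policy, the template syntax used for objectKeyRule, and the missing_required_fields helper are assumptions made for this sketch, not part of the recommendation.

```python
import json

# MUST-level top-level fields listed in A.1.
REQUIRED_MANIFEST_FIELDS = (
    "manifestVersion", "transferId", "files", "chunking", "objectKeyRule",
)

# Hypothetical example manifest; every value here is a placeholder.
manifest = {
    "manifestVersion": 1,
    "transferId": "t-3f9a",
    "createdAt": "2024-01-01T00:00:00Z",
    "expiresAt": "2024-01-08T00:00:00Z",
    "policy": {"maxDownloads": 5, "passwordRequired": True},
    "files": [{"fileId": "f-0", "size": 10_485_761, "chunkCount": 3}],
    "chunking": {"chunkSize": 4_194_304},
    "objectKeyRule": "/transfers/{transferId}/chunks/{fileId}/{chunkIndex}",
}


def missing_required_fields(doc: dict) -> list[str]:
    """Return the MUST-level fields that are absent from a manifest dict."""
    return [name for name in REQUIRED_MANIFEST_FIELDS if name not in doc]


print(json.dumps(manifest, indent=2))
print(missing_required_fields(manifest))  # []
```

A receiver could run a check like this before starting any download, and refuse a manifest that lacks a MUST-level field.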

A.1.1 Minimal files[] fields

  • fileId: MUST. Unique file identifier (do not use filename as the unique key).
  • size: MUST. Size in bytes.
  • mime: MAY. For display and download suggestions.
  • name: MAY. If you want strict zero-knowledge / minimal leakage, omit name or store an encrypted name.
  • chunkCount: MUST. Chunk count for this file.
  • chunkOffset: MAY. If multiple files share a global chunkIndex, an offset is needed; otherwise it can be omitted.
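
The relationship between size, chunkSize, chunkCount, and chunkOffset can be sketched as below. This assumes fixed-size chunks with a shorter final chunk, and a global chunkIndex that simply runs across files in list order; the function names are hypothetical.

```python
def chunk_count(size: int, chunk_size: int) -> int:
    """Ceiling division: number of chunks needed to hold `size` bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return (size + chunk_size - 1) // chunk_size


def assign_chunk_offsets(files: list[dict], chunk_size: int) -> list[dict]:
    """Return copies of the file entries with chunkCount and chunkOffset set."""
    offset = 0
    result = []
    for entry in files:
        count = chunk_count(entry["size"], chunk_size)
        result.append({**entry, "chunkCount": count, "chunkOffset": offset})
        offset += count  # the next file starts after this file's chunks
    return result


def global_chunk_index(entry: dict, local_index: int) -> int:
    """Map a per-file chunk index to the shared global chunkIndex."""
    if not 0 <= local_index < entry["chunkCount"]:
        raise IndexError("local_index out of range for this file")
    return entry["chunkOffset"] + local_index


files = assign_chunk_offsets(
    [{"fileId": "f-0", "size": 10}, {"fileId": "f-1", "size": 4}],
    chunk_size=4,
)
# f-0 needs 3 chunks (4 + 4 + 2 bytes) and starts at offset 0;
# f-1 needs 1 chunk and starts at offset 3.
print(files)
print(global_chunk_index(files[1], 0))  # 3
```

Note that under this formula a zero-byte file has a chunkCount of 0; whether such files get an empty chunk object is a design decision this sketch does not settle.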

A.1.2 Integrity and verification fields (optional but recommended)

  • chunkHashes[]: SHOULD. Per-chunk verification (one or a combination of hash/length/etag).
  • fileHash: MAY. Whole-file checksum (verify after download).
  • ciphertextLength: MAY. Ciphertext-level length consistency checks (no plaintext needed).
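
A minimal sketch of per-chunk verification on the receiver side, combining a length check with a hash check. SHA-256 and the entry layout ({"length", "sha256"}) are assumptions for this example; the appendix does not prescribe a specific hash or encoding.

```python
import hashlib
import hmac


def chunk_digest(ciphertext: bytes) -> str:
    """Hex SHA-256 of the chunk ciphertext (hash choice is an assumption)."""
    return hashlib.sha256(ciphertext).hexdigest()


def make_chunk_entry(ciphertext: bytes) -> dict:
    """Build one hypothetical chunkHashes[] entry for a chunk."""
    return {"length": len(ciphertext), "sha256": chunk_digest(ciphertext)}


def verify_chunk(ciphertext: bytes, expected: dict) -> bool:
    """Check the cheap property (length) first, then the hash."""
    if len(ciphertext) != expected["length"]:
        return False
    # Constant-time comparison of the two hex digests.
    return hmac.compare_digest(chunk_digest(ciphertext), expected["sha256"])


entry = make_chunk_entry(b"example ciphertext")
print(verify_chunk(b"example ciphertext", entry))  # True
print(verify_chunk(b"tampered ciphertext", entry))  # False
```

Both checks operate on ciphertext only, so they can run without access to any plaintext or key material.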

A.2 Chunk objects: minimal constraints

  • Unique addressing: a chunk MUST be uniquely addressable by (transferId, fileId, chunkIndex).
  • Idempotent upload: re-uploading the same chunk MUST NOT break the final state (it may overwrite or be rejected, but behavior must be consistent).
  • Minimal metadata: the server MAY record only ciphertext length, write time, etag/version, etc., for observability and troubleshooting.
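
One consistent way to satisfy the idempotent-upload constraint is sketched below: an identical re-upload is accepted as a no-op, and a re-upload with different content is rejected. The in-memory dictionary stands in for object storage, and the class, method, and return values are hypothetical; an overwrite-always policy would be equally valid as long as it is applied consistently.

```python
class InMemoryChunkStore:
    """Toy stand-in for object storage, keyed by (transferId, fileId, chunkIndex)."""

    def __init__(self) -> None:
        self._objects: dict[tuple[str, str, int], bytes] = {}

    def put_chunk(self, transfer_id: str, file_id: str,
                  chunk_index: int, data: bytes) -> str:
        key = (transfer_id, file_id, chunk_index)
        existing = self._objects.get(key)
        if existing is None:
            self._objects[key] = data
            return "stored"
        if existing == data:
            # Retry of an identical chunk: nothing changes.
            return "duplicate"
        # Same address, different bytes: reject rather than silently overwrite.
        raise ValueError(f"conflicting content for chunk {key}")


store = InMemoryChunkStore()
print(store.put_chunk("t-1", "f-0", 0, b"chunk-bytes"))  # stored
print(store.put_chunk("t-1", "f-0", 0, b"chunk-bytes"))  # duplicate
```

With this policy a client can blindly retry after a timeout without knowing whether the first attempt succeeded.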

A.2.1 Object key rule (example)

  • /transfers/{transferId}/chunks/{fileId}/{chunkIndex}
  • /transfers/{transferId}/manifest

Constraint: object keys MUST support bulk cleanup by transferId prefix; object keys SHOULD avoid carrying sensitive business information.
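
The example key rule and the prefix-cleanup constraint can be sketched as follows. The function names are hypothetical, and the list of keys stands in for whatever listing operation your object storage provides.

```python
def chunk_key(transfer_id: str, file_id: str, chunk_index: int) -> str:
    """Object key for one chunk, following the example rule in A.2.1."""
    return f"/transfers/{transfer_id}/chunks/{file_id}/{chunk_index}"


def manifest_key(transfer_id: str) -> str:
    """Object key for the manifest of a transfer."""
    return f"/transfers/{transfer_id}/manifest"


def transfer_prefix(transfer_id: str) -> str:
    """Prefix covering every object of one transfer.

    The trailing slash matters: without it, the prefix for "t-1"
    would also match objects that belong to "t-10".
    """
    return f"/transfers/{transfer_id}/"


all_keys = [
    chunk_key("t-1", "f-0", 0),
    manifest_key("t-1"),
    chunk_key("t-10", "f-0", 0),
]
to_delete = [k for k in all_keys if k.startswith(transfer_prefix("t-1"))]
print(to_delete)
# ['/transfers/t-1/chunks/f-0/0', '/transfers/t-1/manifest']
```

Because the keys contain only opaque identifiers, nothing about the file name or the sender is visible in storage listings or access logs.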

A.3 State records: minimal fields

State answers questions about upload/download progress and recovery. It should be small and fast to read/write.

  • transferId: MUST. The transfer session identifier this state record belongs to.
  • status: MUST. Suggested values: UPLOADING / READY / DELETED / EXPIRED.
  • uploadedSet: MUST. Completed chunk set; SHOULD be compressed with a bitmap / range-set.
  • uploadedBytes: SHOULD. For progress display and quota checks (can be derived from chunks, but maintaining it is faster).
  • downloadCount: MAY. If you enforce download-count limits, the count must be recorded and updated atomically (the mechanism depends on your state store).
  • expiresAt: SHOULD. For expiry enforcement and background cleanup.
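
A minimal sketch of a state record whose uploadedSet is stored as a bitmap, one bit per chunk. It assumes the bitmap is indexed by a global chunk index, and the class and method names are hypothetical. Status transitions are deliberately left out, since when a transfer becomes READY is a policy decision.

```python
from dataclasses import dataclass, field


@dataclass
class TransferState:
    transfer_id: str
    total_chunks: int
    status: str = "UPLOADING"
    uploaded_bytes: int = 0
    # One bit per chunk; sized in __post_init__.
    uploaded_bitmap: bytearray = field(init=False, default_factory=bytearray)

    def __post_init__(self) -> None:
        self.uploaded_bitmap = bytearray((self.total_chunks + 7) // 8)

    def is_uploaded(self, index: int) -> bool:
        if not 0 <= index < self.total_chunks:
            raise IndexError("chunk index out of range")
        return bool(self.uploaded_bitmap[index // 8] & (1 << (index % 8)))

    def mark_uploaded(self, index: int, length: int) -> bool:
        """Mark a chunk complete; return False if it was already marked."""
        if self.is_uploaded(index):
            return False  # idempotent: a retry does not double-count bytes
        self.uploaded_bitmap[index // 8] |= 1 << (index % 8)
        self.uploaded_bytes += length
        return True

    def is_complete(self) -> bool:
        return all(self.is_uploaded(i) for i in range(self.total_chunks))


state = TransferState("t-1", total_chunks=10)
state.mark_uploaded(3, 4096)
state.mark_uploaded(3, 4096)  # retry, ignored
print(len(state.uploaded_bitmap), state.uploaded_bytes)  # 2 4096
```

Ten chunks fit in two bytes here; a transfer with a million chunks would need about 122 KiB, which is where a range-set can be more compact for mostly contiguous uploads.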

A.3.1 Minimal output for a recovery API

For resumable transfers, the server SHOULD be able to return:

  • Transfer status (UPLOADING/READY/DELETED/EXPIRED)
  • uploadedSet (completed chunk set)
  • Policy summary (e.g., over quota, expired, whether continuing upload is allowed)
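
On the client side, a recovery response like the one above can be turned into a resume plan. The sketch below assumes a response shaped as {"status", "uploadedSet", "policy": {"continueAllowed"}} with uploadedSet given as a list of chunk indexes; these field names and the function names are assumptions for illustration.

```python
def missing_ranges(uploaded: set[int], total_chunks: int) -> list[tuple[int, int]]:
    """Inclusive (start, end) ranges of chunk indexes not yet uploaded."""
    ranges = []
    start = None
    for i in range(total_chunks):
        if i not in uploaded:
            if start is None:
                start = i  # a new gap begins here
        elif start is not None:
            ranges.append((start, i - 1))  # the gap ended at the previous index
            start = None
    if start is not None:
        ranges.append((start, total_chunks - 1))
    return ranges


def plan_resume(recovery: dict, total_chunks: int) -> list[tuple[int, int]]:
    """Return the chunk ranges to upload, or [] if uploading must not continue."""
    if recovery["status"] != "UPLOADING":
        return []
    if not recovery.get("policy", {}).get("continueAllowed", False):
        return []
    return missing_ranges(set(recovery["uploadedSet"]), total_chunks)


recovery = {
    "status": "UPLOADING",
    "uploadedSet": [0, 1, 2, 5],
    "policy": {"continueAllowed": True},
}
print(plan_resume(recovery, total_chunks=8))  # [(3, 4), (6, 7)]
```

The client skips chunks 0 to 2 and 5, and uploads only the two remaining gaps.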

Note: boundary vs. the Security & Privacy Whitepaper

  • This appendix only describes the “structures and constraints” needed for transfer/storage.
  • For encryption fields, how key material is carried/derived, and fail-closed behavior on authentication failures, refer to the Security & Privacy Whitepaper.