October 29–November 2, 2007, Alexandria, Virginia, USA | Giuseppe Ateniese, Randal Burns, Reza Curtmola, Joseph Herring, Lea Kissner, Zachary Peterson, Dawn Song
This paper introduces a model for provable data possession (PDP) that allows a client to verify that an untrusted server possesses the original data without retrieving it. The model generates probabilistic proofs by sampling random blocks from the server, reducing I/O costs. The client maintains constant metadata to verify the proof, and the challenge/response protocol transmits a small, constant amount of data, minimizing network communication. This PDP model supports large data sets in distributed storage systems.
Two provably-secure PDP schemes are presented, more efficient than previous solutions. The server's overhead is low, and the schemes use constant bandwidth. Experiments show that PDP is bounded by disk I/O, not cryptographic computation. The schemes use homomorphic verifiable tags, allowing tags for multiple blocks to be combined into a single value. The client pre-computes tags for each block and stores the file and tags with the server. The client can verify the server's possession by generating a random challenge against a randomly selected set of blocks. The server generates a proof using the queried blocks and their tags, convincing the client of data possession without retrieving the blocks.
The efficient PDP scheme is fundamental for an archival introspection system for long-term preservation of astronomy data. The system ensures that remote data checking does not burden remote storage sites. The more efficient scheme (E-PDP) uses a single modular exponentiation at the server, providing weaker guarantees. Both schemes support public verifiability, allowing anyone to challenge the server for data possession.
The schemes are secure under the RSA and KEA1-r assumptions. They provide data format independence, allowing files to be stored without encryption. The schemes are efficient, requiring constant modular exponentiations and I/O complexity. They are suitable for large data systems. The schemes are implemented and evaluated, showing that probabilistic guarantees make it practical to verify large data sets. E-PDP is 185 times faster than previous secure protocols on 768 KB files. The schemes are applicable to public databases and have no restrictions on the number of challenges.This paper introduces a model for provable data possession (PDP) that allows a client to verify that an untrusted server possesses the original data without retrieving it. The model generates probabilistic proofs by sampling random blocks from the server, reducing I/O costs. The client maintains constant metadata to verify the proof, and the challenge/response protocol transmits a small, constant amount of data, minimizing network communication. This PDP model supports large data sets in distributed storage systems.
Two provably-secure PDP schemes are presented, more efficient than previous solutions. The server's overhead is low, and the schemes use constant bandwidth. Experiments show that PDP is bounded by disk I/O, not cryptographic computation. The schemes use homomorphic verifiable tags, allowing tags for multiple blocks to be combined into a single value. The client pre-computes tags for each block and stores the file and tags with the server. The client can verify the server's possession by generating a random challenge against a randomly selected set of blocks. The server generates a proof using the queried blocks and their tags, convincing the client of data possession without retrieving the blocks.
The efficient PDP scheme is fundamental for an archival introspection system for long-term preservation of astronomy data. The system ensures that remote data checking does not burden remote storage sites. The more efficient scheme (E-PDP) uses a single modular exponentiation at the server, providing weaker guarantees. Both schemes support public verifiability, allowing anyone to challenge the server for data possession.
The schemes are secure under the RSA and KEA1-r assumptions. They provide data format independence, allowing files to be stored without encryption. The schemes are efficient, requiring constant modular exponentiations and I/O complexity. They are suitable for large data systems. The schemes are implemented and evaluated, showing that probabilistic guarantees make it practical to verify large data sets. E-PDP is 185 times faster than previous secure protocols on 768 KB files. The schemes are applicable to public databases and have no restrictions on the number of challenges.