Verkle Trie 从 0 到 1
video: https://youtu.be/yfQ0CUU4zik
docs: https://qiwihui.notion.site/Verkle-trie-8fa545dff5014191bfb6af2765b42e6e?pvs=4
Problems
-
How to store multiple files remotely and know that those files haven’t been changed?
-
Given a starting 𝑥, compute 𝑥↦𝑥^3+5, and repeat that 1 million times. How to prove to someone I computed this, and did so correctly - without he having to re-run the whole thing.
Suppose our starting number is 𝑥=2. - x^2 = 4 - x^3 = x^2 * x = 4 * 2 = 8 - X^3 + 5 = 13 So our trace is {2, 4, 8, 13, ...} we will produce 3,000,001 numbers in computing the circuit.
→ How can we verify integrity of a vector of elements?
Solution 1: Single file hashing
For single file, we can use secure hash functions:
So a simple scheme for verifying file integrity: hash each file and save the store the hash locally.
Problem: has to store n hashes → we need constant-sized digest
Solution 2: Merkle Trees
Merkle tree
-
the root is the digest, it is constant sized
-
use merkle proof to verify if the files have been changed.
Performance
Problem: Many small files ⇒ Merkle proofs too large
Solution 3: q-ary Merkle Trees
problem: Proof size is even bigger
proof size:
Solution 4: polynomial commitment
What is polynomial commitments?
-
将长度为 的 vector 转换为多项式的点值 →
-
将唯一对应的 的多项式 ,生成为Commitment→ 拉格朗日插值
-
Lagrange Interpolation
Polynomial
- Degree
Encoding data into Polynomial using Lagrange Interpolation
Given , build a polynomial such that and degree is
Example
- Given (0, 3), (1, 6), we have
(2, 9), (3, 12), (4, 15). Suppose, given (1,6) and (3,12)
n encode to m (m > n), n-of-m data can recover the polynomial exactly!
-
-
Open 其中的一个点,提供一个 Proof 证明点值符合多项式
STUART → (1, 83), (2, 84), …, (6, 84) → f(x) → choose (4.5, 69.5) as commitment
KZG polynomial commitment
Knowledge -> Point-Values -> Coefficients -> Commitment -> Open&Prove&Verify
FFT MSM
^
|
Trusted Setup
FFT: Fast Furious Transform
MSM: multi-scalar multiplication
-
KZG Commitment 是 Polynomial Commitment 的一种算法实现
-
Elliptic curves + discrete logarithm problem
Encoding Polynomial in a finite field , q is prime:
Polynomial on an elliptic curve
where
- can be computed very fast
- , given and , it is very hard to find (it is called discrete logarithm algorithm)
- mod 7:
- 1 mod 7, 8 mod 7, 15 mod 7,….
- [n] mod 7 = 1 mod 7?
-
Trusted setup
Now we have secret such that
- Nobody knows (private key of the “god”)
- , is known to everybody (”god”’s public key)
Then, we have the commitment as
Finding another such that is almost impossible
-
Elliptic curves pairings
Find two elliptic curves, such that
Given , want to prove ,
3x+3 given data points( 1, 6), (4,2)
where is the proof (48 bytes as a point on an elliptic curve)
-
-
Polynomial Commitment 的其他实现
- KZG:PLONK、Marlin
- FRI:zkSTARK
- IPA:Bulletproof
- IPA + Halo-style aggregation:Halo 2
-
KZG Commitment的优缺点
- 缺点:需要Trusted Setup
- 优点:proof 长度短且恒定
Solution 5: Verkle trie
Replace Hash Functions in q-ary Merkle tree with Vector commitment Schemes → Verkle Trie
Performance comparison:
Verkle Trees let us trade off proof-size vs. construction time.
Verkle tree structure in Ethereum
MPT(Merkle Patricia Trie) problem
Ethereum has a total of four trees:
- the World State Trie
- Receipts Trie
- Transaction Trie
- Account Storage Trie
MPT is 2-layer structure (Tree-inside-a-tree)
- Complexity
- Imbalance
- Difficulty in understanding interactions between mechanisms such as state expiration
Vitalik has proposed a single-layer structure.
maps data to a 32-byte single key at all locations within the state:
eg. (address, storage_slot)
, (address, NONCE)
, (address, balance)
,…
values sharing the first 31 bytes of the key are included in the same bottom-layer commitment.
Tree key
-
32 bytes
-
consisting of a 31-byte stem and a 1-byte suffix. The suffix allows for distinguishing the state information (account header data, code, storage) stored by the Tree Key.
-
31-byte stem: pedersen_hash
def get_tree_key(address: Address32, tree_index: int, sub_index: int): # Asssumes VERKLE_NODE_WIDTH = 256 return ( pedersen_hash(address + tree_index.to_bytes(32, 'little'))[:31] + bytes([sub_index]) )
verkle tree structure:
Inner Node & Suffix Node(extension node)
Suffix Node
suffix node structure:
- 1: A marker for the suffix node, which is 1 on the elliptic curve but does not literally mean the number 1.
- Stem: The stem refers to the stem in the tree key.
- C1, C2: Are Pedersen Commitments.
C = Commit(1, C1, Stem, C2)
C1 and C2 commitment take the data form:
- The reason for this division is that the creation of Pedersen Commitment is limited to committing up to 256 values of maximum 253-bit size, and for 256-bit values, data loss occurs.
- Process of storing 32-byte data under a tree key:
-
Depending on the suffix, the data become v0, v1… v255
-
v0~v127 are included in C1, and v128~v255 are included in C2 to calculate the leaf node’s commitment
-
For C1, each 32-byte value of v0~v127 is divided into the upper 16 bytes (v1,0) and the lower 16 bytes (v1, 1) to serve as coefficients in a polynomial.
→ each coefficient’s data being 16 bytes (128-bit)
-
256-degree polynomial is committed:
C1 = commit([(v0,0), (v0,1), (v1,0), (v1,1)…(v127,0),(v127,1)])
C2 = commit([(v128,0), (v128,1), (v129,0), (v129,1) … (v255,0),(v255,1)])
-
C = Commit(1, C1, Stem, C2)
→ commitment for the leaf node
-
Inner Node
- holds the stem value of the tree key and stores 256 pointers to sub-nodes
- C0, C1 … C255 represent the commitments of sub-nodes, and the inner node contains these commitments.
An example of verkle tree containing 4 tree keys:
- 0x00..20
- 0xdefe…64
- 0xde03a8..02
- 0xde03a8..ff
Summary:
- The Verkle Trie consists of two types of nodes: leaf nodes and inner nodes.
- A tree key contains a stem and a suffix.
- The same stem corresponds to the same leaf node.
- Data is stored differentiated by the suffix of the tree key.
- The tree key is encoded byte by byte along the path from the root to the leaf node.
- Data is included in the commitment of the leaf node.