Verkle Trie 从 0 到 1
video: https://youtu.be/yfQ0CUU4zik
docs: https://qiwihui.notion.site/Verkle-trie-8fa545dff5014191bfb6af2765b42e6e?pvs=4
Problems
- 
How to store multiple files remotely and know that those files haven’t been changed?
 - 
Given a starting 𝑥, compute 𝑥↦𝑥^3+5, and repeat that 1 million times. How to prove to someone I computed this, and did so correctly - without he having to re-run the whole thing.
Suppose our starting number is 𝑥=2. - x^2 = 4 - x^3 = x^2 * x = 4 * 2 = 8 - X^3 + 5 = 13 So our trace is {2, 4, 8, 13, ...} we will produce 3,000,001 numbers in computing the circuit. 
→ How can we verify integrity of a vector of elements?
Solution 1: Single file hashing
For single file, we can use secure hash functions:

So a simple scheme for verifying file integrity: hash each file and save the store the hash locally.

Problem: has to store n hashes → we need constant-sized digest
Solution 2: Merkle Trees
Merkle tree
- 
the root is the digest, it is constant sized

 - 
use merkle proof to verify if the files have been changed.

 
Performance

Problem: Many small files ⇒ Merkle proofs too large
Solution 3: q-ary Merkle Trees

problem: Proof size is even bigger

proof size:
Solution 4: polynomial commitment

What is polynomial commitments?
- 
将长度为 的 vector 转换为多项式的点值 →
 - 
将唯一对应的 的多项式 ,生成为Commitment→ 拉格朗日插值
- 
Lagrange Interpolation
Polynomial
- Degree
 
Encoding data into Polynomial using Lagrange Interpolation
Given , build a polynomial such that and degree is
Example
- Given (0, 3), (1, 6), we have
 
(2, 9), (3, 12), (4, 15). Suppose, given (1,6) and (3,12)
n encode to m (m > n), n-of-m data can recover the polynomial exactly!
 
 - 
 - 
Open 其中的一个点,提供一个 Proof 证明点值符合多项式
 

STUART → (1, 83), (2, 84), …, (6, 84) → f(x) → choose (4.5, 69.5) as commitment
KZG polynomial commitment
Knowledge -> Point-Values -> Coefficients -> Commitment -> Open&Prove&Verify
                         FFT             MSM
                                          ^
                                          |
                                    Trusted Setup
FFT: Fast Furious Transform
MSM: multi-scalar multiplication
- 
KZG Commitment 是 Polynomial Commitment 的一种算法实现
- 
Elliptic curves + discrete logarithm problem
Encoding Polynomial in a finite field , q is prime:
Polynomial on an elliptic curve
where
- can be computed very fast
 - , given and , it is very hard to find (it is called discrete logarithm algorithm)
 - mod 7:
- 1 mod 7, 8 mod 7, 15 mod 7,….
 - [n] mod 7 = 1 mod 7?
 
 
 - 
Trusted setup

Now we have secret such that
- Nobody knows (private key of the “god”)
 - , is known to everybody (”god”’s public key)
 
Then, we have the commitment as
Finding another such that is almost impossible
 - 
Elliptic curves pairings
Find two elliptic curves, such that

Given , want to prove ,
3x+3 given data points( 1, 6), (4,2)
where is the proof (48 bytes as a point on an elliptic curve)

 
 - 
 - 
Polynomial Commitment 的其他实现
- KZG:PLONK、Marlin
 - FRI:zkSTARK
 - IPA:Bulletproof
 - IPA + Halo-style aggregation:Halo 2
 

 - 
KZG Commitment的优缺点
- 缺点:需要Trusted Setup
 - 优点:proof 长度短且恒定
 
 
Solution 5: Verkle trie
Replace Hash Functions in q-ary Merkle tree with Vector commitment Schemes → Verkle Trie

Performance comparison:

Verkle Trees let us trade off proof-size vs. construction time.
Verkle tree structure in Ethereum
MPT(Merkle Patricia Trie) problem
Ethereum has a total of four trees:
- the World State Trie
 - Receipts Trie
 - Transaction Trie
 - Account Storage Trie
 


MPT is 2-layer structure (Tree-inside-a-tree)
- Complexity
 - Imbalance
 - Difficulty in understanding interactions between mechanisms such as state expiration
 
Vitalik has proposed a single-layer structure.

maps data to a 32-byte single key at all locations within the state:
eg. (address, storage_slot), (address, NONCE), (address, balance),…
values sharing the first 31 bytes of the key are included in the same bottom-layer commitment.

Tree key
- 
32 bytes
 - 
consisting of a 31-byte stem and a 1-byte suffix. The suffix allows for distinguishing the state information (account header data, code, storage) stored by the Tree Key.
 - 
31-byte stem: pedersen_hash
def get_tree_key(address: Address32, tree_index: int, sub_index: int): # Asssumes VERKLE_NODE_WIDTH = 256 return ( pedersen_hash(address + tree_index.to_bytes(32, 'little'))[:31] + bytes([sub_index]) ) 

verkle tree structure:

Inner Node & Suffix Node(extension node)
Suffix Node
suffix node structure:

- 1: A marker for the suffix node, which is 1 on the elliptic curve but does not literally mean the number 1.
 - Stem: The stem refers to the stem in the tree key.
 - C1, C2: Are Pedersen Commitments.
 
C = Commit(1, C1, Stem, C2)
C1 and C2 commitment take the data form:

- The reason for this division is that the creation of Pedersen Commitment is limited to committing up to 256 values of maximum 253-bit size, and for 256-bit values, data loss occurs.
 - Process of storing 32-byte data under a tree key:
- 
Depending on the suffix, the data become v0, v1… v255
 - 
v0~v127 are included in C1, and v128~v255 are included in C2 to calculate the leaf node’s commitment
 - 
For C1, each 32-byte value of v0~v127 is divided into the upper 16 bytes (v1,0) and the lower 16 bytes (v1, 1) to serve as coefficients in a polynomial.
→ each coefficient’s data being 16 bytes (128-bit)
 - 
256-degree polynomial is committed:
C1 = commit([(v0,0), (v0,1), (v1,0), (v1,1)…(v127,0),(v127,1)])C2 = commit([(v128,0), (v128,1), (v129,0), (v129,1) … (v255,0),(v255,1)])
 - 
C = Commit(1, C1, Stem, C2)→ commitment for the leaf node 
 - 
 
Inner Node

- holds the stem value of the tree key and stores 256 pointers to sub-nodes
 - C0, C1 … C255 represent the commitments of sub-nodes, and the inner node contains these commitments.
 
An example of verkle tree containing 4 tree keys:
- 0x00..20
 - 0xdefe…64
 - 0xde03a8..02
 - 0xde03a8..ff
 

Summary:
- The Verkle Trie consists of two types of nodes: leaf nodes and inner nodes.
 - A tree key contains a stem and a suffix.
 - The same stem corresponds to the same leaf node.
 - Data is stored differentiated by the suffix of the tree key.
 - The tree key is encoded byte by byte along the path from the root to the leaf node.
 - Data is included in the commitment of the leaf node.