Lookup (Type 2)
#
Recap of types
#
Type | Description | Recap | This |
---|
lookup1 | | Each element of array is in (or another small set). | |
lookup2 | | Each element of array is in a disclosed table of values . | ✅ |
Relation
#
Intuition
#
The prover () holds an array of integers from : . It will produce a succinct (independent of ) proof that each value in is the element of a public table . The prover will encode and into polynomials: and . Assume is equal to , then the prover can complete the proof by simply performing the product check or the permutation check between and . However, if is not equal to (this is the case we want to solve), the prover needs to reveal the points where equals to , which makes zero knowledge impossible. Thus, the prover needs to construct auxiliary polynomial(s) to prove the constraints between and are desired.
There are different approaches to achieving the lookup argument, e.g., halo2 and Plookup. We will discuss these two approaches as follows.
halo2
#
The lookup argument in halo2 requires the prover to construct two auxiliary vectors, and , where is the permutation of and sorted (ascending or descending does not matter), is the permutation of and sorted by . For any two sets such that , we say is sorted by when the elements in have the same order as they do in . Let us demonstrate the halo2 lookup argument with a concrete example, . The prover constructs and such that
We can observe that is equal to or . Since does not exist when , we have to enforce the other rule, . With these two constraints and the proof that is the permutation of and is the permutation of , the prover can prove each element in exists in .
Plookup
#
We start with an unoptimized version of Plookup that is conceptually simpler than the final version and is still fully succinct. The inputs are: an array containing the values of the lookup table of size ; and an array of length (typically longer than but not necessarily) that will be proven to contain only values from somewhere in .
The prover will create 5 helper arrays ( to ) to demonstrate this overall property ( for all ). The helper arrays will each be of size with the exception of which is . They are as follows:
- where
It is probably worth an example at this point.
The first array is the concatenation for and . The intuition for this is as follows. Every element of is in (what is being proven) but the converse is not true, not every element of is necessarily in . By constructing , the prover has an array where every element of appears at least once. This will be convenient later. Additionally, if the prover can show every element in is in , it implies every element in the original is in so it can now move forward with .
How does compare to ? They both have the same elements but has a bunch of extra duplicates of values. Also the appearance of elements in are in a different (arbitrary) order. Next the prover will sort the values of , grouping all duplicates together, and having them appear in the same order as the original (which does not have to be sorted).
Now how does compare to ? They have the same elements in the same order (including 3 which is not in the original ) however has a bunch of duplicates of some of the elements. Next we want to “flag” all the elements that are duplicates. The way we do this is to take the difference between each neighbouring elements. If the neighbouring elements are the same (duplicates), this is will place a 0 in that position. If they are not, a non-zero number will appear instead.
This is straight-forward until we hit the last element in and which has no “next” element in the array. We will leave it for now as an arbitrary integer .
We have marked the duplicates elements but have some other integers when neighbouring elements are not the same. The idea is do the same thing with and we create , which we then pad with zeros.
If is a permutation of and everything else is correctly constructed, then all elements in are from .
The prover will show that is constructed correctly with the gadget. It will not prove that is sorted correctly (if it does not sort it correctly, the protocol will not work) but it will prove that is a permutation of using . It will prove is constructed correctly by and is . Last, it will prove that is correctly formed with and finally, that it is a permutation of with .
to the end of does not change anything about the argument
In fact, the only difference between and itself is that there are several duplicates
With these arrays, the following constraints are demonstrated:
- Uses gadget
- Uses
- Uses
- Uses
- Uses and then
Unlike halo2, Plookup requires only one auxiliary vector, , where is the union set of and , and sorted by . The prover encodes , , and into polynomials: , , and , and computes and with some random challenges. The theorem tells us the two products are equal if and only if: , and and sorted by .
Protocol Details
#
halo2
#
Array Level
#
- and are given a public table
- holds an array of integers ()
- computes an array of integers () such that:
- is a permutation of and sorted in ascending or descending
- computes an array such that:
- is a permutation of and sorted by
- and have the following relations:
Polynomial Level
#
We assume arrays , , , and are encoded as the y-coordinates into a univariant polynomial where the x-coordinates (called the domain ) are chosen as the multiplicative group of order with generator (see Background for more). In short, is the first element and is the last element of . If is larger than the length of the array, the array can be padded with elements of value 1 (which will not change the product).
Recall the four constraints we want to prove:
- is a permutation of
- is a permutation of
- The first value in equals to the first value in
- The rest of the values in are of the form
In polynomial form, the constraints are ( are challenges from ):
- For :
- For all :
We take care of the “for ” conditions by zeroing out the rest of the polynomial that is not zero. See the gadget zero1 for more on why this works.
Instead of proving and are vanishing through two permutation checks, we can construct an accumulator to make it more efficient (the domain needs to be expanded to , denoted by ):
Now we have the new and :
These equations are true for every value of (but not necessarily true outside of these values). To show this, we divide each polynomial by , which is a minimal vanishing polynomial for that does not require interpolation to create. If the quotients are polynomials (and not rational functions), then , , , and must be vanishing on too. Specifically, the prover computes,
Instead of proving the four polynomials are zero polynomials one by one, we can linearly combine the four polynomials with a random challenge sent by to compute:
When are vanishing on the domain , is also vanishing with high probability. Again, if and only if is vanishing over the field , exists.
By rearranging, we can get as a true zero polynomial (zero at every value both in and outside of it):
Ultimately the halo2 lookup argument will satisfy the following constraints at the Commitment Level:
- Show exists
- Show is correctly constructed from , , , , and
- Show is a zero polynomial
Commitment Level
#
The verifier will never see the arrays or polynomials themselves. They are undisclosed because they either (i) contain private data or (ii) are too large to examine and maintain a succinct proof system. Instead, the prover will use commitments.
The prover will write the following commitments to the transcript:
The prover will generate a random challenge evaluation point (using strong Fiat-Shamir) on the polynomial that is outside of . Call this point . It will be used by the prover to create polynomial (see above) and the prover will write to the transcript:
The prover will generate a second random challenge evaluation point (using strong Fiat-Shamir) on the polynomial that is outside of . Call this point . The prover will write the three points and opening proofs to the transcript:
To check the proof, the verifier uses the transcript to construct the value as follows:
Finally, if the constraint system is true, the following constraint will be true (and will be false otherwise with overwhelming probability, due to the Schwartz-Zippel lemma on and ) :
Plookup
#
Array Level
#
- and are given a public table of integers ()
- holds an array of integers ()
- constructs an array of integers () such that:
- is a union set of and
- is sorted by
- , , and have the following relations:
- For each , there exists a such that
- Let be the set of those indices, and let . For each , there exists a such that
Polynomial Level
#
We assume arrays , , and are encoded as the y-coordinates into a univariant polynomial where the x-coordinates (called the domain ) are chosen as the multiplicative group of order with generator (see Background for more). In short, is the first element and is the last element of . If is larger than the length of the array, the array can be padded with elements of value 1 (which will not change the product).
Recall the two constraints we want to prove:
- For each , there exists a such that
- Let be the set of those indices, and let . For each , there exists a such that
In polynomial form, the constraints are ( are challenges from ):
To efficiently prove the above polynomial holds, we can use a similar trick in halo2 lookup by constructing an accumulator :
However, the above accumulator does not exist because the degree of the denominator, , is different from and . Specifically, there are elements in , elements in , but elements in . Thus we have to decompose the denominator to make it have the same iteration as and do. It is worth noting for the numerator, the left term contains while the right term contains and . Therefore, it will be convenient to assume . The order of is . To traverse in steps, we can halve it to and , and prove . Then we can compute the accumulator such that:
Similarly, we take care of the “for ” conditions by zeroing out the rest of the polynomial that is not zero. See the gadget zero1 for more on why this works.
These equations are true for every value of (but not necessarily true outside of these values). To show this, we divide each polynomial by , which is a minimal vanishing polynomial for that does not require interpolation to create. If the quotients are polynomials (and not rational functions), then , , and must be vanishing on too. Specifically, the prover computes,
Instead of proving the three polynomials are zero polynomials one by one, we can linearly combine the three polynomials with a random challenge sent by to compute:
When are vanishing on the domain , is also vanishing with high probability. Again, if and only if is vanishing over the field , exists.
By rearranging, we can get as a true zero polynomial (zero at every value both in and outside of it):
Ultimately the Plookup will satisfy the following constraints at the Commitment Level:
- Show exists
- Show is correctly constructed from , , , , and
- Show is a zero polynomial
Commitment Level
#
The verifier will never see the arrays or polynomials themselves. They are undisclosed because they either (i) contain private data or (ii) are too large to examine and maintain a succinct proof system. Instead, the prover will use commitments.
The prover will write the following commitments to the transcript:
The prover will generate a random challenge evaluation point (using strong Fiat-Shamir) on the polynomial that is outside of . Call this point . It will be used by the prover to create polynomial (see above) and the prover will write to the transcript:
The prover will generate a second random challenge evaluation point (using strong Fiat-Shamir) on the polynomial that is outside of . Call this point . The prover will write the three points and opening proofs to the transcript:
To check the proof, the verifier uses the transcript to construct the value as follows:
Finally, if the constraint system is true, the following constraint will be true (and will be false otherwise with overwhelming probability, due to the Schwartz-Zippel lemma on and ) :
Security Proof
#
halo2
#
Completeness
#
Any honest prover can do the computations explained above and create an accepting proof.
Soundness
#
We prove knowledge soundness in the Algebraic Group Model (AGM). To do so, we must prove that there exists an efficient extractor such that for any algebraic adversary the probability of winning the following game is .
- Given , outputs commitments to , , , ,
- , given access to ’s outputs from the previous step, outputs , , , ,
- plays the part of the prover in showing that is zero at a random challenge
- wins if
- accepts at the end of the protocol
- For some ,
Our proof is as follows:
To make a zero polynomial, has to prove the four vanishing polynomials are correct. For and , the permutation check tells us the probability that accepts the proof is negligible if some elements in are not in . By the Schwartz-Zippel lemma, we know the probability that is vanishing is negligible if . Therefore, to win the KS game, has to prove is correct with the winning condition (some elements in do not appear in ). Assume , to make such exist, has to equal to . And because , has to equal to . Thus, to make the winning condition hold, must hold, which contradicts to the condition of , .
Zero-Knowledge
#
Before we prove the above protocol is zero-knowledge, it is worth noting the protocol is different from the lookup argument of halo2 in the real world. Specifically, the real halo2 lookup optimizes the two permutation checks into one and fills the table with some random numbers for the PLONK-based proof system. We refer to the official halo2 handbook to see the details. To prove the above protocol is zero-knowledge, we do so by constructing a probabilistic polynomial time simulator which, for every (possibly malicious) verifier , can output a view of the execution of the protocol that is indistinguishable from the view produced by the real execution of the program.
The simulator generates an array by randomly filling it with elements from , then follows the same steps a prover would prove the lookup argument. computes and interpolates the four arrays into their respective polynomials, , , , and . It computes and finally outputs the commitments to each of these polynomials (and writes the commitments to the transcript). Then, it generates the random challenge (once again this is by strong Fiat-Shamir). It creates opening proofs for , , and , and writes these to the transcript as well. Since knows each of the above polynomials, it can honestly compute this step and the proof will be accepted by . The transcript it generates this way will be indistinguishable from a transcript generated from a real execution, since has the property of Indistinguishability of Commitments due to the randomization by .
Plookup
#
Completeness
#
Any honest prover can do the computations explained above and create an accepting proof.
Soundness
#
We prove knowledge soundness in the Algebraic Group Model (AGM). To do so, we must prove that there exists an efficient extractor such that for any algebraic adversary the probability of winning the following game is .
- Given , outputs commitments to , , ,
- , given access to ’s outputs from the previous step, outputs , , ,
- plays the part of the prover in showing that is zero at a random challenge
- wins if
- accepts at the end of the protocol
- For some ,
Our proof is as follows:
To make a zero polynomial, has to prove the three vanishing polynomials are correct. It is easy to observe that and can be constructed even if some elements in do not appear in , so we focus on the . By the soundness of the product check, we know must be the union set of and if we want to the product of equals to the product of and . Recall the equation needs to prove:
Thus, for each , there exists a factor in the right-hand side of the equation equal to , which means . We can get and for . Similarly, we can get for . Since should be equal to or less than , when , must hold at the same time, which contradicts to the winning assumption. Therefore, the protocol is sound.
Zero-Knowledge
#
To prove the above protocol is zero-knowledge, we do so by constructing a probabilistic polynomial time simulator which, for every (possibly malicious) verifier , can output a view of the execution of the protocol that is indistinguishable from the view produced by the real execution of the program.
The simulator generates an array by randomly filling it with elements from , then follows the same steps a prover would prove the lookup argument. computes , , and and interpolates the five arrays into their respective polynomials, , , , , and . It computes and finally outputs the commitments to each of these polynomials (and writes the commitments to the transcript). Then, it generates the random challenge (once again this is by strong Fiat-Shamir). It creates opening proofs for , and , and writes these to the transcript as well. Since knows each of the above polynomials, it can honestly compute this step and the proof will be accepted by . The transcript it generates this way will be indistinguishable from a transcript generated from a real execution since has the property of Indistinguishability of Commitments due to the randomization by .