- Start Date: 2020-08-10
- Design PR: ZcashFoundation/zebra#868
- Zebra Issue: ZcashFoundation/zebra#964
Summary
This RFC describes an architecture for asynchronous script verification and its interaction with the state layer. This architecture imposes constraints on the ordering of operations in the state layer.
Motivation
As in the rest of Zebra, we want to express our work as a collection of work-items with explicit dependencies, then execute these items concurrently and in parallel on a thread pool.
Definitions
- UTXO: unspent transparent transaction output.
Transparent transaction outputs are modeled in
zebra-chain
by thetransparent::Output
structure. - outpoint: a reference to an unspent transparent transaction output, including a transaction hash and output index.
Outpoints are modeled in
zebra-chain
by thetransparent::OutPoint
structure. - transparent input: a previous transparent output consumed by a later transaction (the one it is an input to).
Modeled in
zebra-chain
by thetransparent::Input::PrevOut
enum variant. - coinbase transaction: the first transaction in each block, which creates new coins.
- lock script: the script that defines the conditions under which some UTXO can be spent.
Stored in the
transparent::Output::lock_script
field. - unlock script: a script satisfying the conditions of the lock script, allowing a UTXO to be spent.
Stored in the
transparent::Input::PrevOut::lock_script
field.
Guide-level explanation
Zcash's transparent address system is inherited from Bitcoin. Transactions spend unspent transparent transaction outputs (UTXOs) from previous transactions. These UTXOs are encumbered by locking scripts that define the conditions under which they can be spent, e.g., requiring a signature from a certain key. Transactions wishing to spend UTXOs supply an unlocking script that should satisfy the conditions of the locking script for each input they wish to spend.
This means that script verification requires access to data about previous UTXOs, in order to determine the conditions under which those UTXOs can be spent. In Zebra, we aim to run operations asynchronously and out-of-order to the greatest extent possible. For instance, we may begin verification of a block before all of its ancestors have been verified or even downloaded. So we need to design a mechanism that allows script verification to declare its data dependencies and execute as soon as all required data is available.
It's not necessary for this mechanism to ensure that the transaction outputs remain unspent, only to give enough information to perform script verification. Checking that all transaction inputs are actually unspent is done later, at the point that its containing block is committed to the chain.
At a high level, this adds a new request/response pair to the state service:
Request::AwaitSpendableUtxo { output: OutPoint, ..conditions }
requests a spendabletransparent::Output
, looked up usingOutPoint
.Response::SpendableUtxo(Utxo)
supplies the requestedtransparent::Output
as part of a newUtxo
type, if the output is spendable based onconditions
;
Note that this request is named differently from the other requests,
AwaitSpendableUtxo
rather than GetUtxo
or similar. This is because the
request has rather different behavior:
- the request does not complete until the state service learns about a UTXO matching the request, which could be never. For instance, if the transaction output was already spent, the service is not required to return a response.
- the request does not complete until the output is spendable, based on the
conditions
in the request.
The state service does not cancel long-running UTXO requests. Instead, the caller is responsible for deciding when a request is unlikely to complete. (For example, using a timeout layer.)
This allows a script verifier to asynchronously obtain information about previous transaction outputs and start verifying scripts as soon as the data is available. For instance, if we begin parallel download and verification of 500 blocks, we should be able to begin script verification of all scripts referencing outputs from existing blocks in parallel, and begin verification of scripts referencing outputs from new blocks as soon as they are committed to the chain.
Because spending outputs from older blocks is more common than spending outputs from recent blocks, this should allow a significant amount of parallelism.
Reference-level explanation
Data structures
We add the following request and response to the state protocol:
#![allow(unused)] fn main() { enum Request::AwaitSpendableUtxo { outpoint: OutPoint, spend_height: Height, spend_restriction: SpendRestriction, } /// Consensus rule: /// "A transaction with one or more transparent inputs from coinbase transactions /// MUST have no transparent outputs (i.e.tx_out_count MUST be 0)." enum SpendRestriction { /// The UTXO is spent in a transaction with transparent outputs SomeTransparentOutputs, /// The UTXO is spent in a transaction with all shielded outputs AllShieldedOutputs, } }
As described above, the request name is intended to indicate the request's behavior. The request does not resolve until:
- the state layer learns of a UTXO described by the request, and
- the output is spendable at
height
withspend_restriction
.
The new Utxo
type adds a coinbase flag and height to transparent::Output
s
that we look up in the state, or get from newly committed blocks:
#![allow(unused)] fn main() { enum Response::SpendableUtxo(Utxo) pub struct Utxo { /// The output itself. pub output: transparent::Output, /// The height at which the output was created. pub height: block::Height, /// Whether the output originated in a coinbase transaction. pub from_coinbase: bool, } }
Transparent coinbase consensus rules
Specifically, if the UTXO is a transparent coinbase output, the service is not required to return a response if:
spend_height
is less thanMIN_TRANSPARENT_COINBASE_MATURITY
(100) blocks after theUtxo.height
, orspend_restriction
isSomeTransparentOutputs
.
This implements the following consensus rules:
A transaction MUST NOT spend a transparent output of a coinbase transaction from a block less than 100 blocks prior to the spend.
Note that transparent outputs of coinbase transactions include Founders’ Reward outputs and transparent funding stream outputs.
A transaction with one or more transparent inputs from coinbase transactions MUST have no transparent outputs (i.e.tx_out_count MUST be 0).
Inputs from coinbase transactions include Founders’ Reward outputs and funding stream outputs.
https://zips.z.cash/protocol/protocol.pdf#txnencodingandconsensus
Parallel coinbase checks
We can perform these coinbase checks asynchronously, in the presence of multiple chain forks, as long as the following conditions both hold:
-
We don't mistakenly accept or reject spends to the transparent pool.
-
We don't mistakenly accept or reject mature spends.
Parallel coinbase justification
There are two parts to a spend restriction:
- the
from_coinbase
flag, and - if the
from_coinbase
flag is true, the coinbaseheight
.
If a particular transaction hash h
always has the same from_coinbase
value,
and h
exists in multiple chains, then regardless of which Utxo
arrives first,
the outputs of h
always get the same from_coinbase
value during validation.
So spends can not be mistakenly accepted or rejected due to a different coinbase flag.
Similarly, if a particular coinbase transaction hash h
always has the same height
value,
and h
exists in multiple chains, then regardless of which Utxo
arrives first,
the outputs of h
always get the same height
value during validation.
So coinbase spends can not be mistakenly accepted or rejected due to a different height
value.
(The heights of non-coinbase outputs are irrelevant, because they are never checked.)
These conditions hold as long as the following multi-chain properties are satisfied:
from_coinbase
: across all chains, the set of coinbase transaction hashes is disjoint from the set of non-coinbase transaction hashes, and- coinbase
height
: across all chains, duplicate coinbase transaction hashes can only occur at exactly the same height.
Parallel coinbase consensus rules
These multi-chain properties can be derived from the following consensus rules:
Transaction versions 1-4:
[Pre-Sapling ] If effectiveVersion = 1 or nJoinSplit = 0, then both tx_in_count and tx_out_count MUST be nonzero. ... [Sapling onward] If effectiveVersion < 5, then at least one of tx_in_count, nSpendsSapling, and nJoinSplit MUST be nonzero.
A coinbase transaction for a block at block height greater than 0 MUST have a script that, as its first item, encodes the block height height as follows.
For height in the range {1 .. 16}, the encoding is a single byte of value 0x50 + height.
Otherwise, let heightBytes be the signed little-endian representation of height, using the minimum nonzero number of bytes such that the most significant byte is < 0x80. The length of heightBytes MUST be in the range {1 .. 8}. Then the encoding is the length of heightBytes encoded as one byte, followed by heightBytes itself.
https://zips.z.cash/protocol/protocol.pdf#txnencodingandconsensus
The transaction ID of a version 4 or earlier transaction is the SHA-256d hash of the transaction encoding in the pre-v5 format described above.
https://zips.z.cash/protocol/protocol.pdf#txnidentifiers
Transaction version 5:
[NU5 onward] If effectiveVersion ≥ 5, then this condition must hold: tx_in_count > 0 or nSpendsSapling > 0 or (nActionsOrchard > 0 and enableSpendsOrchard = 1). ... [NU5 onward] The nExpiryHeight field of a coinbase transaction MUST be equal to its block height.
https://zips.z.cash/protocol/protocol.pdf#txnencodingandconsensus
non-malleable transaction identifiers ... commit to all transaction data except for attestations to transaction validity ... A new transaction digest algorithm is defined that constructs the identifier for a transaction from a tree of hashes ... A BLAKE2b-256 hash of the following values: ... T.1e: expiry_height (4-byte little-endian block height)
https://zips.z.cash/zip-0244#t-1-header-digest
Since:
- coinbase transaction hashes commit to the block
Height
, - non-coinbase transaction hashes commit to their inputs, and
- double-spends are not allowed;
Therefore:
- coinbase transaction hashes are unique for distinct heights in any chain,
- coinbase transaction hashes are unique in a single chain, and
- non-coinbase transaction hashes are unique in a single chain, because they recursively commit to unique inputs.
So the required parallel verification conditions are satisfied.
Script verification
To verify scripts, a script verifier requests the relevant UTXOs from the
state service and waits for all of them to resolve, or fails verification
with a timeout error. Currently, we outsource script verification to
zcash_consensus
, which does FFI into the same C++ code as zcashd
uses.
We need to ensure this code is thread-safe.
Database implementation
Implementing the state request correctly requires considering two sets of behaviors:
- behaviors related to the state's external API (a
Buffer
edtower::Service
); - behaviors related to the state's internal implementation (using
rocksdb
).
Making this distinction helps us to ensure we don't accidentally leak
"internal" behaviors into "external" behaviors, which would violate
encapsulation and make it more difficult to replace rocksdb
.
In the first category, our state is presented to the rest of the application
as a Buffer
ed tower::Service
. The Buffer
wrapper allows shared access
to a service using an actor model, moving the service to be shared into a
worker task and passing messages to it over an multi-producer single-consumer
(mpsc) channel. The worker task receives messages and makes Service::call
s.
The Service::call
method returns a Future
, and the service is allowed to
decide how much work it wants to do synchronously (in call
) and how much
work it wants to do asynchronously (in the Future
it returns).
This means that our external API ensures that the state service sees a linearized sequence of state requests, although the exact ordering is unpredictable when there are multiple senders making requests.
Because the state service has exclusive access to the rocksdb database, and the
state service sees a linearized sequence of state requests, we have an easy
way to opt in to asynchronous database access. We can perform rocksdb operations
synchronously in the Service::call
, waiting for them to complete, and be
sure that all future requests will see the resulting rocksdb state. Or, we can
perform rocksdb operations asynchronously in the future returned by
Service::call
.
If we perform all writes synchronously and allow reads to be either synchronous or asynchronous, we ensure that writes cannot race each other. Asynchronous reads are guaranteed to read at least the state present at the time the request was processed, or a later state.
Lookup states
Now, returning to the UTXO lookup problem, we can map out the possible states with this restriction in mind. This description assumes that UTXO storage is split into disjoint sets, one in-memory (e.g., blocks after the reorg limit) and the other in rocksdb (e.g., blocks after the reorg limit). The details of this storage are not important for this design, only that the two sets are disjoint.
When the state service processes a Request::AwaitSpendableUtxo
referencing
some UTXO u
, there are three disjoint possibilities:
u
is already contained in an in-memory block storage;u
is already contained in the rocksdb UTXO set;u
is not yet known to the state service.
In case 3, we need to queue u
and scan all future blocks to see whether
they contain u
. However, if we have a mechanism to queue u
, we can
perform check 2 asynchronously, because restricting to synchronous writes
means that any async read will return the current or later state. If u
was
in the rocksdb UTXO set when the request was processed, the only way that an
async read would not return u
is if the UTXO were spent, in which case the
service is not required to return a response.
Lookup implementation
This behavior can be encapsulated into a PendingUtxos
structure described below.
#![allow(unused)] fn main() { // sketch #[derive(Default, Debug)] struct PendingUtxos(HashMap<OutPoint, oneshot::Sender<Utxo>>); impl PendingUtxos { // adds the outpoint and returns (wrapped) rx end of oneshot // checks the spend height and restriction before sending the utxo response // return can be converted to `Service::Future` pub fn queue( &mut self, outpoint: OutPoint, spend_height: Height, spend_restriction: SpendRestriction, ) -> impl Future<Output=Result<Response, ...>>; // if outpoint is a hashmap key, remove the entry and send output on the channel pub fn respond(&mut self, outpoint: OutPoint, output: transparent::Output); /// check the list of pending UTXO requests against the supplied `utxos` pub fn check_against(&mut self, utxos: &HashMap<transparent::OutPoint, Utxo>); // scans the hashmap and removes any entries with closed senders pub fn prune(&mut self); } }
The state service should maintain an Arc<Mutex<PendingUtxos>>
, used as follows:
- In
Service::call(Request::AwaitSpendableUtxo { outpoint: u, .. }
, the service should:
- call
PendingUtxos::queue(u)
to get a futuref
to return to the caller; - spawn a task that does a rocksdb lookup for
u
, callingPendingUtxos::respond(u, output)
if present; - check the in-memory storage for
u
, callingPendingUtxos::respond(u, output)
if present; - return
f
to the caller (it may already be ready). The common case is thatu
references an old spendable UTXO, so spawning the lookup task first means that we don't wait to check in-memory storage foru
before starting the rocksdb lookup.
- In
f
, the future returned byPendingUtxos::queue(u)
, the service should check that theUtxo
is spendable before returning it:
- if
Utxo.from_coinbase
is false, return the utxo; - if
Utxo.from_coinbase
is true, check that:spend_restriction
isAllShieldedOutputs
, andspend_height
is greater than or equal toMIN_TRANSPARENT_COINBASE_MATURITY
plus theUtxo.height
,- if both checks pass, return the utxo.
- if any check fails, drop the utxo, and let the request timeout.
- In
Service::call(Request::CommitBlock(block, ..))
, the service should:
- check for double-spends of each UTXO in the block, and
- do any other transactional checks before committing a block as normal.
Because the
AwaitSpendableUtxo
request is informational, there's no need to do the transactional checks before matching against pending UTXO requests, and doing so upfront can run expensive verification earlier than needed.
- In
Service::poll_ready()
, the service should callPendingUtxos::prune()
at least some of the time. This is required because when a consumer uses a timeout layer, the cancelled requests should be flushed from the queue to avoid a resource leak. However, doing this on every call will result in us spending a bunch of time iterating over the hashmap.
Drawbacks
One drawback of this design is that we may have to wait on a lock. However, the critical section basically amounts to a hash lookup and a channel send, so I don't think that we're likely to run into problems with long contended periods, and it's unlikely that we would get a deadlock.
Rationale and alternatives
High-level design rationale is inline with the design sketch. One low-level
option would be to avoid encapsulating behavior in the PendingUtxos
and
just have an Arc<Hashmap<..>>
, so that the lock only protects the hashmap
lookup and not sending through the channel. But I think the current design is
cleaner and the cost is probably not too large.
Unresolved questions
- We need to pick a timeout for UTXO lookup. This should be long enough to account for the fact that we may start verifying blocks before all of their ancestors are downloaded.