Protecting validator keys is difficult due to their dual requirements of security and accessibility. This article discusses various methods to increase both security and accessibility of validator keys.
A previous article discussed the protection of Ethereum 2 withdrawal keys. It also provided some explanation about Ethereum 2 keys in general, and is recommended reading before proceeding with this article. Likewise, a basic understanding of validators is expected: what they are, why they require keys, and what they do in their standard operation.
The validator key is an Ethereum 2 key used to certify messages sent by validators. A validator must send these messages to be granted rewards (and avoid suffering penalties).
Measuring the risk for validator keys is difficult: validator keys provide no access to funds themselves, so if an attacker obtains your validator key they cannot profit from it in a direct sense. Indirect attacks such as blackmail[1] are possible, as are spoiling attacks where the attacker's goal is for you to lose funds rather than for the attacker to gain them[2]. As such, it is generally considered that the security requirements for validator keys are very high.
However, unlike withdrawal keys, validator keys are required to be accessible at all times: a validator is required to sign multiple messages every epoch (~6.4 minutes). As such, the access requirements for validator keys are also very high.
Thus we find ourselves in a dilemma. Traditionally, security and accessibility sit at opposite ends of a single scale: increasing one decreases the other. Validator keys, however, need both.
This dual requirement for both security and accessibility requires implementation of more advanced measures that can provide high levels of both without significant compromise. This can be achieved through layering. The purpose of this article is to examine some of the features that each layer provides and to understand which may or may not be applicable to a given validator setup.
Ethereum 2 keys come in a pair of public and private keys. Note that due to the focus of this article, from here on the term "key" refers to what is often called the secret or private key; when discussing the public key this is stated explicitly.
First, we must define the goals of both the attacker and the user. When we know the reasons for the protection of the validator key, we can consider how much each feature contributes to that purpose. The first goal is that of the attacker, which can be defined as follows:
The attacker's goal is to be able to sign an arbitrary message considered valid against the validator public key.
Of course, if this were the only purpose then destroying the validator key would prevent the attacker from achieving its goal. However, this would be of no help to the user. The second goal, then, is that of the user, which can be defined as follows:
The user's goal is to be able to sign all desirable messages, and no undesirable messages, for the validator public key.
For the purposes of this article a desirable message is one which will gain a reward, and an undesirable message is one that will create a slashing event[3].
Note that for the attacker to achieve its goal, it only needs to sign a single arbitrary message, whereas the user's goal is on-going. This asymmetry is typical of security systems where the attacker must only win once, but the target must win every time.
To this end, we need a system of layers, or individual features, that provide enhanced security or enhanced accessibility, and can be combined to provide high levels of both. A good security model will have multiple layers, each providing (either individually or in combination with other layers) some specific protection or backup of other layers, with enough layers providing accessibility to ensure the user's goal can be met. Note that this article focuses on technical protections: other aspects of security, such as operational and social, are outside the scope of this article but are a critical part of any security model and should be addressed accordingly.
To investigate how we protect a validator key we start with a simple representation:
A validator key, like all keys in Ethereum 2, is nothing more than a number[4]. If this key is stored without any protection, it can easily be obtained, and the attacker's goal met, by (for example):
This also fails the user's goal of not signing undesirable messages. So let's take the first step to securing the validator key by encrypting it:
There are various ways in which the validator key can be encrypted, for example the EIP-2335 standard[5]. Encrypting the validator key results in the key being unreadable to an attacker who does not have the passphrase required to decrypt the data[6]; this also protects the key "at rest" (on the storage of the validator client, on data backups, etc.).
At first glance this appears to remove the methods the attacker can use to meet their goal. However, it also fails the user's goal as they can no longer sign desirable messages: an encrypted key is of as little use to the user as to the attacker. For the user to be able to sign messages the key must be available decrypted, and as such the validator process requires access to the decryption passphrase. But if the validator process can access the passphrase, so too can the attacker, particularly if the passphrase is stored on the validator client, which can then be attacked.
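The trade-off above can be made concrete with a minimal sketch of passphrase-based encryption at rest. It is not an EIP-2335 implementation: the function names (`encrypt_key`, `decrypt_key`) and the store layout are hypothetical, the scrypt parameters are deliberately small so the example runs quickly, and the keystream is a SHA-256 counter-mode stand-in for a real cipher such as AES, so it should not be used to protect real keys.

```python
import hashlib
import secrets


def _derive(passphrase: str, salt: bytes) -> bytes:
    # Stretch the passphrase into 32 bytes of key material (small parameters, demo only).
    return hashlib.scrypt(passphrase.encode("utf-8"), salt=salt, n=2**14, r=8, p=1, dklen=32)


def _keystream(key: bytes, length: int) -> bytes:
    # Stand-in stream cipher: SHA-256 in counter mode. An assumption for illustration only.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def encrypt_key(secret_key: int, passphrase: str) -> dict:
    """Return an encrypted store that is safe to leave on disk or in a backup."""
    salt = secrets.token_bytes(32)
    derived = _derive(passphrase, salt)
    plaintext = secret_key.to_bytes(32, "big")
    ciphertext = bytes(a ^ b for a, b in zip(plaintext, _keystream(derived[:16], 32)))
    # The checksum lets decryption detect a wrong passphrase without revealing the key.
    checksum = hashlib.sha256(derived[16:] + ciphertext).hexdigest()
    return {"salt": salt.hex(), "ciphertext": ciphertext.hex(), "checksum": checksum}


def decrypt_key(store: dict, passphrase: str) -> int:
    """Recover the key; only possible for whoever holds the passphrase."""
    derived = _derive(passphrase, bytes.fromhex(store["salt"]))
    ciphertext = bytes.fromhex(store["ciphertext"])
    if hashlib.sha256(derived[16:] + ciphertext).hexdigest() != store["checksum"]:
        raise ValueError("incorrect passphrase")
    plaintext = bytes(a ^ b for a, b in zip(ciphertext, _keystream(derived[:16], 32)))
    return int.from_bytes(plaintext, "big")
```

Note that `decrypt_key` needs the passphrase at signing time, which is exactly the exposure described above: whatever process can call it can also leak the key.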
So, by itself this seems to be a relatively bad change. However, when combined with the next layer it provides much stronger protection:
If the decryption passphrase is stored remotely, encrypting the validator key provides significantly enhanced protection[7]. Because the passphrase no longer resides on the validator client's storage there is no way an unencrypted key can be obtained from the on-disk data. Instead, this would require an attacker to carry out far more sophisticated attacks such as pulling the decrypted key from memory or imitating the validator client process to be sent the decryption key.
However, although more difficult it is still possible for a sophisticated attack to obtain the validator key. In addition, the user is still able to inadvertently sign undesirable messages. Improved security and accessibility for the user can thus be obtained with an additional remote signer layer:
A remote signer separates the core features of the validator client: working out what to put in a message, signing the message, and providing the message to the Ethereum 2 network. The first and last features stay with the validator client, but the middle feature moves to the remote signer. The remote signer also introduces slashing protection, as it can decide on which messages are desirable and which are not, and sign or reject them accordingly.
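A minimal sketch of how a remote signer might decide which messages to sign is given below. The class `RemoteSigner` and the helper `fake_sign` are hypothetical names, and the signature is a hash stand-in rather than a real BLS signature. The rule used is deliberately conservative: a block is signed only if its slot is higher than any slot signed before, and an attestation only if its source epoch does not go backwards and its target epoch advances. This may refuse some harmless messages, but it never signs a double proposal, a double vote or a surround vote.

```python
import hashlib
from dataclasses import dataclass


def fake_sign(pubkey: bytes, message: bytes) -> bytes:
    # Stand-in for a real BLS signature; an assumption for illustration only.
    return hashlib.sha256(pubkey + message).digest()


@dataclass
class SigningHistory:
    last_block_slot: int = -1
    last_source_epoch: int = -1
    last_target_epoch: int = -1


class RemoteSigner:
    """Signs on behalf of validator clients, rejecting slashable requests."""

    def __init__(self, sign_fn=fake_sign):
        self._sign = sign_fn
        self._history = {}  # public key -> SigningHistory

    def sign_block(self, pubkey: bytes, slot: int, message: bytes):
        history = self._history.setdefault(pubkey, SigningHistory())
        if slot <= history.last_block_slot:
            return None  # possible double proposal: refuse
        history.last_block_slot = slot  # record before signing
        return self._sign(pubkey, message)

    def sign_attestation(self, pubkey: bytes, source_epoch: int, target_epoch: int, message: bytes):
        history = self._history.setdefault(pubkey, SigningHistory())
        if source_epoch < history.last_source_epoch:
            return None  # could surround an earlier vote: refuse
        if target_epoch <= history.last_target_epoch:
            return None  # double vote or surrounded vote: refuse
        history.last_source_epoch = source_epoch
        history.last_target_epoch = target_epoch
        return self._sign(pubkey, message)
```

Because the history lives in the signer rather than in each validator client, two clients sending conflicting requests for the same key will have only the first one signed.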
What stops the attacker simply shifting focus and attacking the remote signer instead of the validator client? Firstly, the signer can have much higher levels of security than the validator client. The validator client has to carry out a number of tasks, including talking to other elements on the Ethereum 2 network such as beacon nodes. These communications can provide attackers with information and attack vectors. As the signer only talks to the validator clients, it has a much more constrained set of activities, which allows for higher security on both servers.
Secondly, a remote signer brings additional benefits for the user. It is now possible to have multiple validator clients talking to the same remote signer, which allows high availability validator client infrastructures to be built; the remote signer ensures that no undesirable messages are signed by the validator clients.
However, the fact remains that the remote signer is a single point of failure: if it is attacked, or merely suffers a failure, it results in the user being unable to sign the desirable messages. Is there a way in which the remote signer can be made more resilient without losing either of the two benefits outlined above?
Threshold signing is a layer that provides both additional security and accessibility, again building on the layers beneath it. The validator key is put through a process called Shamir secret sharing, where a number of keys are generated from the validator private key and distributed to remote signers.
Now, each remote signer holds a key derived from the validating key such that a subset of the remote signers can create a valid signature, known as a threshold signature. A sample situation where two of the three remote signers provide individual signatures to form a single aggregate signature is shown below.
Threshold signatures are often written as m-of-n, which means that any m of the n possible signatures are sufficient to form a valid signature. If the threshold is 2-of-3, for example, a valid signature can be formed from any two responses.
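The splitting step can be sketched with Shamir secret sharing over the field in which Ethereum 2 keys live. The function names are illustrative, and the sketch recovers the key from two shares only to show that the arithmetic works; in a real threshold signing setup the signers combine their individual signatures and the full key is never reassembled.

```python
import secrets

# Order of the BLS12-381 subgroup: valid keys are the integers 1..R-1.
R = 52435875175126190479447740508185965837690552500527637822603658699938581184513


def evaluate(coeffs, x):
    """Evaluate a polynomial (lowest-order coefficient first) at x, modulo R."""
    y = 0
    for c in reversed(coeffs):
        y = (y * x + c) % R
    return y


def split(secret, threshold, total):
    """Split secret into `total` shares, any `threshold` of which recover it."""
    # The secret is the constant term; the other coefficients are random.
    coeffs = [secret] + [secrets.randbelow(R) for _ in range(threshold - 1)]
    return {x: evaluate(coeffs, x) for x in range(1, total + 1)}


def recover(shares):
    """Lagrange-interpolate the polynomial at x = 0 from {x: y} shares."""
    secret = 0
    for xi, yi in shares.items():
        num, den = 1, 1
        for xj in shares:
            if xj != xi:
                num = (num * -xj) % R
                den = (den * (xi - xj)) % R
        secret = (secret + yi * num * pow(den, -1, R)) % R
    return secret


key = secrets.randbelow(R - 1) + 1
shares = split(key, threshold=2, total=3)
# Any two of the three shares are enough; which two does not matter.
assert recover({1: shares[1], 3: shares[3]}) == key
```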
With the introduction of threshold signing, the loss of a single server[8] will no longer remove the user's ability to sign desirable messages, and the use of thresholds means that there is no increase in the user's ability to inadvertently sign an undesirable message. Given that hardware, software and operational failures do occur, threshold signing is a highly desirable feature of a resilient validating infrastructure.
Although a threshold signature scheme provides additional security, it has two major weaknesses. Firstly, distributing the key does not prevent long-term attacks: in a 2-of-3 environment, once one signer has been compromised there is a perpetual weakness such that if a second signer is compromised then the attacker can meet their goal. Secondly, the distribution process starts with a key to distribute, which means that there is a risk of an attacker obtaining the key prior to it being distributed and making all of the work carried out to protect the key irrelevant. Both of these issues can be addressed by using distributed key generation:
Distributed key generation is a relatively complicated topic, and the details of how it works are outside the scope of this article. We will, however, provide a brief functional explanation to show how it can overcome the limitations of simple threshold signing.
Distributed key generation begins with the user deciding the threshold they wish to obtain, for example 2-of-3. They then select three remote signers from those available, and initiate the generation process. Each of the three remote keymanagers[9] generates its own key, which it does not share with the user or any other signer, plus some additional public information. The public information from all three signers is combined to create a composite public key.
When a signature is required it can be obtained from just two of the three signers, similar to simple threshold signing.
If one signer is unavailable, for example it is down for maintenance or under attack, an alternate set of two signers can be used to obtain the same signature.
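The functional description above can be sketched as follows, with all three signers simulated in one process. This is a simplified illustration under stated assumptions: `run_dkg` is a hypothetical name, the public commitments that real protocols use to verify each contribution and to build the composite public key are omitted, and the polynomials are returned only so that the result can be checked. The point it shows is that each signer ends up with a share of a composite key that no single party ever held.

```python
import secrets

# Order of the BLS12-381 subgroup: valid keys are the integers 1..R-1.
R = 52435875175126190479447740508185965837690552500527637822603658699938581184513


def evaluate(coeffs, x):
    """Evaluate a polynomial (lowest-order coefficient first) at x, modulo R."""
    y = 0
    for c in reversed(coeffs):
        y = (y * x + c) % R
    return y


def recover(shares):
    """Lagrange-interpolate at x = 0 from {x: y} shares (used here for checking only)."""
    secret = 0
    for xi, yi in shares.items():
        num, den = 1, 1
        for xj in shares:
            if xj != xi:
                num = (num * -xj) % R
                den = (den * (xi - xj)) % R
        secret = (secret + yi * num * pow(den, -1, R)) % R
    return secret


def run_dkg(threshold, total):
    """Return (final_shares, polynomials) for `total` signers with the given threshold."""
    # Each signer privately picks its own random polynomial; its constant
    # term is that signer's contribution to the composite key.
    polynomials = {
        i: [secrets.randbelow(R) for _ in range(threshold)]
        for i in range(1, total + 1)
    }
    # Signer i sends evaluate(polynomial_i, j) to signer j, and signer j
    # adds up everything it receives to obtain its final share.
    final_shares = {
        j: sum(evaluate(poly, j) for poly in polynomials.values()) % R
        for j in range(1, total + 1)
    }
    return final_shares, polynomials


shares, polynomials = run_dkg(threshold=2, total=3)
composite = sum(poly[0] for poly in polynomials.values()) % R
# Signers 1 and 2 jointly control the composite key, and so do signers 2 and 3.
assert recover({1: shares[1], 2: shares[2]}) == composite
assert recover({2: shares[2], 3: shares[3]}) == composite
```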
It can be seen that distributed key generation provides the same benefit as simple threshold signing, and the removal of the requirement to start from a single key removes one of its weaknesses. However, the perpetual weakness that arises if one of the servers is compromised remains. How can distributed key generation help here?
The answer is a process called rekeying. Rekeying is the act of destroying the existing secret key and generating a new secret key on each signer that retains the properties of the previous key, i.e. any two signatures can still be combined to provide a valid signature for the public key. The new keys are unrelated to the old keys, in that the new ones cannot be calculated from the old ones.
If one of the signers is compromised, the signers can undergo the rekeying process. Old and new keys cannot be combined to generate a valid signature, rendering the stolen key held by the attacker useless.
The rekeying process can be carried out multiple times, in the case of multiple signers being compromised over time.
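One common way to achieve this effect is a proactive share refresh, sketched below; it is an assumption that this matches any particular keymanager's rekeying mechanism. Each signer contributes a random polynomial whose constant term is zero, and every signer adds the values it receives to its existing share. The composite key is unchanged, but a share stolen before the refresh no longer combines with shares issued after it. The function names are illustrative.

```python
import secrets

# Order of the BLS12-381 subgroup: valid keys are the integers 1..R-1.
R = 52435875175126190479447740508185965837690552500527637822603658699938581184513


def evaluate(coeffs, x):
    """Evaluate a polynomial (lowest-order coefficient first) at x, modulo R."""
    y = 0
    for c in reversed(coeffs):
        y = (y * x + c) % R
    return y


def split(secret, threshold, total):
    """Create initial shares (stands in for whatever process produced them)."""
    coeffs = [secret] + [secrets.randbelow(R) for _ in range(threshold - 1)]
    return {x: evaluate(coeffs, x) for x in range(1, total + 1)}


def recover(shares):
    """Lagrange-interpolate at x = 0 from {x: y} shares (used here for checking only)."""
    secret = 0
    for xi, yi in shares.items():
        num, den = 1, 1
        for xj in shares:
            if xj != xi:
                num = (num * -xj) % R
                den = (den * (xi - xj)) % R
        secret = (secret + yi * num * pow(den, -1, R)) % R
    return secret


def refresh(shares, threshold):
    """Return new shares of the same secret that do not combine with the old ones."""
    new_shares = dict(shares)
    for _ in shares:  # one contribution from each signer
        # A zero constant term means the shared secret does not change.
        zero_poly = [0] + [secrets.randbelow(R) for _ in range(threshold - 1)]
        for x in new_shares:
            new_shares[x] = (new_shares[x] + evaluate(zero_poly, x)) % R
    return new_shares


key = secrets.randbelow(R - 1) + 1
old = split(key, threshold=2, total=3)
new = refresh(old, threshold=2)
assert recover({1: new[1], 2: new[2]}) == key  # refreshed shares still work
assert recover({1: old[1], 2: new[2]}) != key  # a stolen old share is useless
```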
Security is a very broad area, and the above information is provided to explain the features and benefits of various security layers, rather than define any sort of perfect, or even best, solution. There are many other points to consider when protecting validator keys, some of which are touched on here.
It is expected that hardware wallets supporting BLS12-381 will become available in the near future. A hardware wallet could replace the simple on-disk key storage system at the lower layers, but doing so may prevent advanced techniques such as distributed key generation.
As with any situation involving security, it is possible to spend unlimited amounts of money chasing a little bit more security. Each user will need to decide the attacks against which they wish to protect themselves, and how much they are willing to spend to achieve that protection.
The remote storage of passphrases has been discussed above, but is there any additional benefit from storing validator keys remotely as well (or instead)? Although there is no security benefit, it can allow faster recovery of validator clients in case of hardware failure, which is of benefit to the user. It does, however, come with additional risk, as the remote storage must be configured correctly to ensure that access to the keys is granted only to those authorized.
Although the article focuses on protecting validator keys from attack, the reality is that most losses of validator keys will occur due to more prosaic reasons, most commonly loss of the hardware on which the key resides. Users will need a back-up strategy, bearing in mind that if an attacker can reach the backed up keys they will have achieved their goal of being able to sign an arbitrary message considered valid against the validator public key. Suitable measures should be taken to ensure that backed up validator keys are as inaccessible as possible, ideally being entirely off-line and physically secure.
If you have multiple validator keys should they have some sort of relationship, to either each other or to their respective withdrawal keys? Doing so brings no security benefit for active validating, although it can bring ease of use benefits if, for example, backing up the validator keys involves just a single seed rather than many individual keys. Users should also consider if they may wish to separate control of the validator keys at any stage, and plan accordingly.
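As one illustration of such a relationship, the EIP-2334 convention derives both kinds of key from a single seed, with each validator key sitting one level below its withdrawal key. The helper names below are hypothetical and the sketch only builds the derivation paths; it does not perform the key derivation itself.

```python
def withdrawal_path(index: int) -> str:
    """EIP-2334 path of the withdrawal key for validator number `index`."""
    return f"m/12381/3600/{index}/0"


def validator_path(index: int) -> str:
    """EIP-2334 path of the validator (signing) key: a child of the withdrawal key."""
    return f"{withdrawal_path(index)}/0"


# Paths for the first three validators derived from one seed.
for i in range(3):
    print(i, withdrawal_path(i), validator_path(i))
```

With this layout, backing up the seed is enough to regenerate every key, which is the ease-of-use benefit mentioned above. It also means that anyone holding the seed holds the withdrawal keys too, which matters if control of the validator keys is ever to be separated.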
This article provides a number of methods to help protect validator keys while ensuring they remain accessible to carry out their attesting duties.
Each user should consider the level of security they wish to provide, along with the steps they should take to provide it. It should be remembered that validator keys do not control funds, so an attacker who steals a validator key cannot obtain any sort of direct reward from it[10].
This is a purely technical examination of protecting validator keys. There are other areas, such as operational and social security requirements, that should be considered. To fully protect a validator key, a security provider must take all of these areas, and more, into account.
Alternatively, you could use a staking service that provides such features within its infrastructure. Attestant is building the hardware, software and operations to provide an institutional-grade staking service where you retain control of your funds at all times, combining technical, risk, and treasury management expertise.
[1] "Give us some money or we'll slash your validators."
[2] Although this can have an indirect impact on their own wealth with a "Goldfinger attack" or combining an attack of suitable magnitude with suitable derivatives on Ether.
[3] A slashing event is one that could cause harm to the Ethereum 2 network, and is punished with a large fine and long-term lock up of the remaining funds.
[4] Between 1 and 52,435,875,175,126,190,479,447,740,508,185,965,837,690,552,500,527,637,822,603,658,699,938,581,184,512, inclusive.
[5] At time of writing this standard is still in draft form, but it is being used by many Ethereum 2 key generators.
[6] For the purposes of this article we assume that a strong passphrase is used such that it is not susceptible to a brute-force attack.
[7] This is a good example of how two layers can work in combination to provide higher levels of security than either could manage by themselves.
[8] Other variants of threshold signing can increase this level of redundancy.
[9] We use the term keymanager as the remote servers now carry out more operations than just signing.
[10] It should also be noted that there is no guarantee a potential attacker is aware of this fact.