Many crypto ecosystems are born in times of adversity, and Ethereum is no exception. This article offers some words of hope during these difficult times.
In light of the trial of Sam Bankman-Fried and the associated collapse of the FTX crypto exchange and the trading firm Alameda Research, it's worth taking a minute to ponder how the events of 2022 were able to happen in the first place. For all the tumult and the billions of dollars' worth of value lost, the problems weren't the result of Decentralised Finance (DeFi) per se, or of its underlying infrastructural code; they resided instead in age-old human error, miscalculation and outright fraud within Centralised Finance (CeFi) platforms.
Cryptocurrency innovation, its potential applications and its rapid rise in market valuation all clouded people's judgement as to whether the established rules, standards and procedures that had governed mainstream finance for many decades were still relevant. It is the firm belief of the author that they are, especially in the area of systems management. The infrastructure and the number of intermediaries may have changed, but human nature has not. Basic concepts such as segregation of duties, standardised processes and proper corporate governance are as important today as they ever were.
When service providers operating in any digital asset field - whether an exchange, a lender or a staking platform - start to scale, these "boring" (albeit critical) processes and procedures must be adopted. In the early days of the crypto revolution, very clever individuals could build platforms implementing new ideas relatively quickly, and in many cases were highly successful. However, as the number of people using these systems grew and the sum of assets under management reached dizzying heights, the "move fast and break things" approach couldn't continue: the stakes (excuse the pun) had become too significant. In the traditional world of large corporate financial institutions, such approaches are rightfully labelled amateurish and cowboy-like. This isn't to say that developers shouldn't continue to write code and come up with new ideas - it's what makes our world go round - but before any of this work sees the light of day in a production environment, it should be reviewed, tested, signed off and, most importantly, thoroughly checked by independent team members.
Working with large corporate development teams for many years, the author saw first-hand how quickly developers could alter code and generate a constant stream of changes to the development code base. The inevitable issue was keeping track of those changes; without proper processes, procedures and systems in place, the end result could be a non-working system. To mitigate this, a Software Development Life Cycle (SDLC) was adopted, forcing every change through a set of rigorous stages with gatekeepers. Those stages are outlined in the appendix at the end of this article.
In general, nothing outlined in that appendix should be considered unusual, esoteric or even over the top in a Tier 1 financial institution with heavy investment in its systems. Failing to test systems properly can leave vulnerabilities open to exploitation, with devastating consequences that can fatally undermine collective confidence in the team and the system. The lesson was learned many years ago that the cost of fixing an issue rises almost exponentially the further along the SDLC it is discovered: a bug caught during unit testing might cost an hour of a developer's time, while the same bug found in Production can mean days of incident response and lasting reputational damage.
While reliability and cyber-security are fundamental to ensuring that trust is maintained, assurance and independent auditing have an equal role to play in establishing credibility. Proper auditing, ISO 27001 accreditation and regular penetration tests should be worn as badges of honour by any business looking to take itself seriously. By properly auditing and reviewing all the logs, compliance teams can analyse patterns and spot when something isn't right. Operational issues such as server overload and disk capacity management are highly relevant for digital asset service providers; as the industry grows up, the days of a few people managing a handful of servers to mine BTC and ETH are long gone.
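As a flavour of how unglamorous but necessary these operational checks are, here is a minimal sketch in Python of the kind of disk-capacity monitor a provider might run on a schedule. The mount points and thresholds are hypothetical assumptions; a real deployment would load them from configuration and page an on-call engineer rather than just write a log line.

```python
import logging
import shutil

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")

# Hypothetical mount points mapped to alert thresholds (fraction used).
WATCHED_MOUNTS = {"/": 0.80, "/var/lib/node-data": 0.70}

def check_disk_capacity() -> None:
    for mount, threshold in WATCHED_MOUNTS.items():
        try:
            usage = shutil.disk_usage(mount)
        except FileNotFoundError:
            logging.error("Mount point %s not found", mount)
            continue
        used = usage.used / usage.total
        if used >= threshold:
            # In production this would raise an alert, not just log.
            logging.warning("%s is %.0f%% full (threshold %.0f%%)",
                            mount, used * 100, threshold * 100)
        else:
            logging.info("%s is %.0f%% full - OK", mount, used * 100)

if __name__ == "__main__":
    check_disk_capacity()
```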
As with any new and exciting development, it's easy to lose sight of the fact that old habits die hard and that technology is neutral; it is human nature and human choices that decide how it's applied or misused. Developers should remember this and take into account that the most important guidelines for any system design are often the most uninteresting, and that factors outside the applications themselves matter just as much. Even though the blockchain, with its associated cryptography, is highly guarded and secure, the environments in which crypto applications run must themselves be secure. Servers must be protected so that nobody can access them without multiple layers of authentication, such as biometrics combined with FIDO keys. The operating system and everything else deployed to the server must not be compromised and should be minimised as far as practical. The same security mentality applied to the applications has to be used in the wider "systems" context: it is no use creating a super-secure server environment if all your client data resides on a developer's laptop shared with their family.
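To make the "minimise what is deployed" point concrete, the sketch below audits a server's installed packages against an approved baseline. It assumes a Debian-based system (`dpkg-query` is a real tool, but the allowlist path and the idea of keeping it at that location are illustrative assumptions).

```python
import subprocess
from pathlib import Path

# Hypothetical allowlist: one approved package name per line, kept
# under change control like any other production artefact.
ALLOWLIST = Path("/etc/approved-packages.txt")

def installed_packages() -> set[str]:
    # Debian/Ubuntu example; other distributions have equivalent tools.
    result = subprocess.run(
        ["dpkg-query", "-W", "-f=${Package}\n"],
        capture_output=True, text=True, check=True,
    )
    return set(result.stdout.split())

def audit_baseline() -> None:
    approved = set(ALLOWLIST.read_text().split())
    unexpected = sorted(installed_packages() - approved)
    if unexpected:
        print("Unapproved packages found:", ", ".join(unexpected))
    else:
        print("Server matches the approved baseline.")

if __name__ == "__main__":
    audit_baseline()
```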
In summary, as the crypto industry starts to mature, it is inevitable that some long-established processes and techniques will become just as vitally important as the innovative crypto applications themselves. Ultimately, they are all part of the same larger system.
Appendix: the SDLC stages

No code change could be submitted to Development until it had been through a set of basic tests, and any new functionality had to be covered by a new set of unit tests. (Minimal sketches of how some of these gates can be automated follow the stages below.)
The development code base was rebuilt four times a day from source. If you can't build your application from scratch, you are in deep trouble.
Each successful build went through extensive unit/integration tests. If the build failed, the build stream was stopped and no new code could be applied until the issue was resolved.
The last successful build of the day was used for automated overnight regression/differential tests.
On a regular basis, a successful Development build was promoted into the "Test and Regression" environment, where it was subjected to simulations of real-world activities.
Priority was always given to ensuring that the build in the Test environment ran to completion; any bug fixes were merged back into the Development stream.
A successful Test environment build was then moved to a User Acceptance Testing (UAT) environment, where Production users were allowed to use real-world data as part of their own tests. Again, bugs found in this environment took precedence over those in the previous two environments.
Finally, only full sign-off from the UAT environment allowed the build to be promoted into Production.
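As a flavour of the gatekeeping described in the first few stages, here is a minimal sketch in Python. The build and test commands are illustrative assumptions; the point is the logic: every stage must pass, and a failure halts the stream so no further code is applied until it is resolved.

```python
import subprocess
import sys

# Illustrative stage commands; a real pipeline would use whatever
# build system and test runners the project has standardised on.
STAGES = [
    ("clean build from source", ["make", "clean", "all"]),
    ("unit tests",              ["pytest", "tests/unit"]),
    ("integration tests",       ["pytest", "tests/integration"]),
]

def run_pipeline() -> None:
    for name, command in STAGES:
        print(f"Running stage: {name}")
        result = subprocess.run(command)
        if result.returncode != 0:
            # Gatekeeper: stop the build stream; no new code is
            # applied until this stage is fixed.
            sys.exit(f"Stage '{name}' failed - build stream halted.")
    print("All stages passed; build is eligible for overnight regression.")

if __name__ == "__main__":
    run_pipeline()
```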
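The overnight regression/differential step can be sketched just as simply: compare the results produced by the day's last successful build against a known-good baseline and flag any divergence. The file names and JSON result format here are hypothetical.

```python
import json
from pathlib import Path

def differential_test(baseline_path: Path, candidate_path: Path) -> list[str]:
    """Report the test keys whose results differ from the known-good baseline."""
    baseline = json.loads(baseline_path.read_text())
    candidate = json.loads(candidate_path.read_text())
    return [key for key in baseline if candidate.get(key) != baseline[key]]

if __name__ == "__main__":
    # Hypothetical output files: the previously accepted results and
    # those produced by the last successful build of the day.
    diffs = differential_test(Path("baseline_results.json"),
                              Path("tonight_results.json"))
    if diffs:
        print("Regression detected in:", ", ".join(diffs))
    else:
        print("No differences against baseline - regression run clean.")
```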
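Finally, the promotion chain itself is a gate. The sketch below (environment names taken from the stages above, everything else hypothetical) encodes the rule that a build advances one environment at a time and reaches Production only with full UAT sign-off.

```python
# Ordered environments; a build may be promoted one step past each
# environment that has signed it off, and no further.
ENVIRONMENTS = ["Development", "Test", "UAT", "Production"]

def promotion_target(signoffs: dict[str, bool]) -> str:
    """Return the furthest environment this build may currently reach."""
    target = ENVIRONMENTS[0]
    for env in ENVIRONMENTS[:-1]:
        if not signoffs.get(env, False):
            break  # first missing sign-off stops the chain
        target = ENVIRONMENTS[ENVIRONMENTS.index(env) + 1]
    return target

if __name__ == "__main__":
    # Signed off in Development and Test, but UAT sign-off is
    # outstanding, so the build may go no further than UAT.
    print(promotion_target({"Development": True, "Test": True}))
    # Full sign-off, culminating in UAT, allows promotion to Production.
    print(promotion_target({"Development": True, "Test": True, "UAT": True}))
```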