#7: Ready to risk

From Junior to CTO Weekly Issue

Jun 25, 2023


Everything is going according to plan. Period. How often have you seen a project run without any deviation from the original idea? I guess such cases are extremely rare. In this issue, we will talk about reality and how to work when you have only partial control.

Let’s talk about daily work. First of all, let’s separate the things that are fully clear from the things that carry some level of uncertainty. What is the problem with the second category? Right! Something might go wrong. The ETA might slip, and resources might not be enough. For example, you need to ship a feature, but you found a bug that users are unlikely to hit yet whose impact is critical. At the same time, your QA found another problem that many users will see but whose real negative impact is about zero. So should you be able to measure risks in order to prioritize these two problems? For sure! But first, let’s clarify the mental model behind the concept of risk. Here is my understanding.

An actor performs an action. Any action introduces risks that might affect the environment in which the action is performed, the resources that the action requires, and the outcomes (for example, quality) of the action. To guard against overspent resources, low-quality results, or environmental impact, we can insure ourselves to limit the loss or, better, try to avoid the impact by preparing in advance. Working with the risks of a complex activity might itself be too complex, so you can decompose the action to make mitigation more efficient. For example, we break down complex epics into smaller tasks so that we move in predictable steps, avoid wasting time and other resources, and keep better control over the quality of the final result after every milestone. To make sure that you do not miss anything, especially when your activity is critical in some key aspect (limited resources, potential environmental impact, or a need for top result quality), try to build a risk mitigation plan.

Depending on your organization and industry, different practices might apply, but in general the steps of building a risk mitigation plan are:

Risk Identification - name risks
Risk Assessment - quantify the risks
Risk Rating - prioritize risks, segregate acceptable and not acceptable ones
Risk Tracking - formulate risk metrics
Risk Monitoring - watch for risk metrics being exceeded
Implementation - formulate what to do if an unwanted event happens
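The steps above can be sketched as a tiny risk register. This is only an illustration under stated assumptions: the `Risk` class, the `rate` and `monitor` helpers, the probability-times-impact score, the scales, and every number are hypothetical choices, not a prescribed method.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Risk:
    name: str           # Identification: name the risk
    probability: float  # Assessment: chance of the event, 0..1 (assumed scale)
    impact: float       # Assessment: size of the loss, 0..10 (assumed scale)
    metric: str         # Tracking: what we measure
    threshold: float    # Tracking: metric value at which we react
    response: str       # Implementation: what to do if the event happens

    @property
    def score(self) -> float:
        # One common convention: expected loss = probability * impact.
        return self.probability * self.impact


def rate(risks: list[Risk], acceptable_score: float) -> tuple[list[Risk], list[Risk]]:
    """Rating: order by score and split into unacceptable and acceptable."""
    ordered = sorted(risks, key=lambda r: r.score, reverse=True)
    unacceptable = [r for r in ordered if r.score > acceptable_score]
    acceptable = [r for r in ordered if r.score <= acceptable_score]
    return unacceptable, acceptable


def monitor(risk: Risk, observed: float) -> str | None:
    """Monitoring: return the planned response once the metric is exceeded."""
    return risk.response if observed > risk.threshold else None


# Hypothetical entries for the two worries mentioned earlier: ETA and resources.
risks = [
    Risk("ETA slips", 0.6, 5.0, "days behind schedule", 3.0,
         "cut scope with the product owner"),
    Risk("resources run short", 0.2, 3.0, "budget spent, %", 90.0,
         "request extra budget"),
]

unacceptable, acceptable = rate(risks, acceptable_score=1.0)
print([r.name for r in unacceptable])
print(monitor(risks[0], observed=5.0))
```

With these made-up estimates only the ETA risk lands in the unacceptable group, and being five days behind triggers its planned response. Different estimates can change the picture, which is a good reason to write them down.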

Practically, you might not need a document that formulates all these things; a discussion with your team is enough in some cases.

For example,

A service might stop working after deployment because, on startup, it starts data migrations in the database, and migrations might fail in the middle.
We do implement breaking changes in migrations in favor of delivery frequency, so there is a high probability of facing this issue.
The service’s outage is not a show-stopper because this is not a service a client interacts with.
Releases with data migrations introduce this risk.
If something goes wrong, we might:
- restore the database
- roll back the service to the previous version (if possible)
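The response part of this example could be wired into a deployment script roughly as follows. This is a minimal sketch, not a real deployment tool: `deploy_with_migrations` and its three callables are hypothetical names, and restoring the database before rolling back the service is an assumed order.

```python
from typing import Callable


def deploy_with_migrations(
    run_migrations: Callable[[], None],
    restore_database: Callable[[], None],
    rollback_service: Callable[[], None],
) -> str:
    """Run migrations; on failure, apply the two planned responses."""
    try:
        run_migrations()
    except Exception as error:
        # A migration failed in the middle, so the data may be half-migrated:
        # restore the database first, then return to the previous version.
        print(f"migration failed: {error}")
        restore_database()
        rollback_service()
        return "rolled back"
    return "deployed"


# Usage with stand-in callables that only record what was called.
calls = []


def failing_migration() -> None:
    calls.append("migrate")
    raise RuntimeError("simulated failure in the middle of a migration")


result = deploy_with_migrations(
    run_migrations=failing_migration,
    restore_database=lambda: calls.append("restore database"),
    rollback_service=lambda: calls.append("roll back service"),
)
print(result, calls)
```

Passing the steps in as callables keeps the sketch independent of any particular database or deployment tooling; in a real pipeline each of them would wrap whatever backup and release commands your team already uses.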

Conclusion

Risks are around us even if we ignore them. Working with risks helps you build products with a drastically higher level of service. Even basic risk-first thinking helps improve your delivery.

DevTower
