Right-Sized Risk Mitigation

I’ve spent a lot of time in power plants and have seen many odd equipment arrangements (I’m trying to be nice). One of these had a throttle valve located off the edge of a platform in a forest of piping. Fortunately, the valve only required adjustment during off-normal plant conditions. The area was so cluttered that access was only possible from the platform. Several large pumps in a small room made it very loud, requiring double hearing protection. The entire area was extremely dirty, requiring personnel to wear coveralls and face masks with air filters. The area was not air-conditioned either, and this being a power plant, it was hot! To reach the valve, operators had to move through a jungle of piping on the platform, lie down on their stomachs near the edge, and shimmy through dirt and grime to the point on the platform edge closest to the valve. This was all done while wearing a fall protection harness! The conditions were miserable for the operators.


One day an operator turned the valve wheel in the wrong direction. Unfortunately, because of the noise, there was no easy way to alert the operator to the error. The error was slow to be corrected, and that got management’s attention. In response, a step-by-step, continuous-use procedure was written and a peer-check was required. The procedure gave the operators guidance for identifying the valve, unlocking and operating it, and applying the appropriate tags, as necessary. It also specified how many turns, and in what direction, to rotate the valve wheel for the amount of throttling that could be required.



The operators (plural because of the peer checker) would reach the valve covered in filth, drenched in sweat, and usually a little perturbed, and would then begin following the procedure step by step. The mis-operations began to pile up. After several more mistakes, operators began to be disciplined. Initially, operators were required to go to training for procedure adherence and valve operation. After that, they were given several days off without pay. Many of the operators were given time off after only one mis-operation of the valve. Eventually, the plant manager threatened to fire the next operator to get the valve operation wrong. All of this surprised me: such a common error should have been evaluated for a common cause rather than answered by disciplining the operators!


On the same day as the firing threat, one of the most senior operators turned the valve in the wrong direction!


The operations shift lead took control before any further discipline could be administered, probably before anyone in management had been alerted to the problem. The shift lead rebriefed the operator who had just made the error, took the procedure away, and sent the operator, alone, to fix the valve’s position.


Voila! The operator adjusted the valve perfectly. Once the whole story of the day’s events came out, a root cause evaluation team was assembled to determine what the real issues were. Why had the operator made an error with all of the risk-mitigating tools in place, yet operated the valve correctly once those tools were taken away?

The underlying problem we identified was that the operators were overwhelmed by distractions from the environment, interruptions during the task, and the added pressure of threats to their employment. Our procedural efforts to mitigate the risk of mis-operation had actually removed a layer of protection by piling still more distractions onto the operator. They made an already error-likely situation worse: instead of using their own knowledge to operate the valve properly, operators questioned their judgment, trying to recall whether the procedure’s direction of rotation was meant relative to them or relative to the valve (from the operator’s perspective, closing the valve required rotating the wheel counter-clockwise).

What this means to the organization

One of my common experiences, no matter the industry, is the failure of an organization to identify the level of risk mitigation required for an activity. Many times, the wrong risk level is incorporated into the writing of a procedure. Events like the mis-operation of the backward, upside-down valve can be prevented by identifying the level of risk and the human factors involved in the activity. This valve did not have major consequences associated with its mis-operation (minimal impact from error, thus low risk). It was in a horrible area that was not amenable to paper procedures or peer checks. Requiring a continuous-use procedure with a peer-check was the opposite of what was needed. A laminated checklist or just a pre-job brief would have been acceptable. In this instance, the organization failed to recognize the consequence of an error and to make the correct changes before so many events had occurred, and much angst was created by the mismanagement of the problem.

Meet the author:  Dave Cates



Procedures can be extremely powerful tools. They can help us safely operate almost anything on the planet, from airplanes to nuclear reactors to locking doors at the close of business. Do we really need to use the same step-by-step, circle-slash, peer-checked, independently verified procedure for each of those tasks? Some tasks are complex or have very bad consequences if an error occurs. Those tasks require step-by-step guidance with strict use of human performance tools. Other tasks are simple and have very low consequences of failure. These tasks can use simple, less specific procedures or even a short checklist. Trust me, if these operators had been given a checklist, they would have thanked you!