The latest ransomware attack on critical infrastructure
- Dec 4, 2021 1:32 pm GMT
I was surprised to read this story from ZDNet on Friday, describing yet another devastating cyberattack on a critical infrastructure organization, this time an electric utility. Of course, even though the utility, Delta-Montrose Electric Association (DMEA) in Colorado, never used the word “ransomware” in its announcement of the attack, everyone interviewed for the article seemed to think it was a ransomware attack.
But even if it wasn't, what I'm most interested in is the fact that, by the utility's own reckoning, 90% of its internal systems (which I interpret as "IT network") were down. Yet the utility says their electric operations weren't affected at all. This simply shows that the utility followed a cardinal principle for critical infrastructure: Complete separation of the IT and OT networks, so there is no direct logical path by which an infected IT system might infect the OT network.
I know of two other devastating ransomware attacks on critical infrastructure in the US. In both of these, the IT network was pretty much destroyed by the ransomware, while the OT network wasn’t touched by it. Yet in both of these cases, the OT network had to be completely shut down in the wake of the attack, along with the IT network. How could that happen, assuming the networks really were separated?
The more recent of these two attacks was, of course, Colonial Pipeline. In that attack, after most or all of the IT network was brought down, the OT network (and therefore the pipelines themselves) was also brought down. Colonial said at the time that they did this out of the usual "abundance of caution". However, a WaPo editorial pointed out that, with Colonial’s billing system (which was on the IT network, a normal practice even in critical infrastructure) being down, Colonial couldn't invoice for gas deliveries.
Even more importantly (and this was a fact I learned from somebody who commented on one of my posts on Colonial – my longtime friend Unknown), since Colonial is a common carrier and doesn't own the gas it delivers, they would literally have been on the hook for the entire cost of all the gas they delivered but didn't invoice, if they'd continued to run their pipeline network. So the OT network had to come down as well, even though it wasn't directly impacted by the ransomware.
The previous attack was in 2018, when a very large US electric utility was hit with a devastating ransomware attack. As with Colonial and DMEA, the IT network was completely down but the OT network hadn't been directly affected by the ransomware. The IT department decided they had to wipe over 10,000 systems on their IT network and rebuild them from backups. According to two independent sources, the original plan was to leave the two grid control centers (part of the OT network) running during the approximately 24 hours it would take to do this.
However, the utility then decided that if they left the OT network running, they would run the risk that even a single system in the control center might have been infected, and then might re-infect the IT network as soon as the latter was brought back up – meaning they’d have to repeat the entire process of wiping and rebuilding. So they decided they had no choice but to wipe and rebuild the control center systems (about 2,000), including their VOIP phone system.
The result was that for 24 hours, the power grid in a multi-state area was run by operators using cell phones. It was pure luck that a serious incident didn't occur during this time, because power system events usually happen too quickly for humans to react properly; the event would probably have been over long before the control center staff would have been able to diagnose the problem and the solution through phone calls.[i]
In both of these previous attacks, the OT network was logically separated from the IT network, but from a larger point of view it wasn’t truly independent. In Colonial’s case, the problem was that a system on the IT network (the billing system) had to be up in order for operations to continue. How could the OT shutdown have been prevented? Clearly, the billing system should either have been on the OT network or (since that might itself have caused problems) on a network segment of its own. The ransomware would never have reached it, and it would presumably have continued to operate after the attack. Thus, operations wouldn’t have had to be shut down.
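Putting the billing system on its own segment amounts to a default-deny policy with a short list of explicitly permitted cross-zone flows. A minimal sketch of that idea in Python; the zone names and the specific flows listed here are illustrative assumptions, not details of Colonial’s actual network:

```python
# Illustrative default-deny zone policy. The zone names and the flows
# listed are hypothetical examples, not Colonial's actual architecture.
ALLOWED_FLOWS = {
    ("OT", "BILLING"),   # delivery/metering data feeds invoicing
    ("BILLING", "IT"),   # one-way: billing pushes invoices out to IT
}

def flow_allowed(src_zone: str, dst_zone: str) -> bool:
    """Default-deny: a cross-zone flow is permitted only if explicitly listed."""
    return (src_zone, dst_zone) in ALLOWED_FLOWS
```

Because no flow *from* the IT zone *into* the billing zone is listed, ransomware loose on the IT network would have no permitted path to the billing hosts, and billing (and therefore pipeline operations) could have kept running.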
And what could the utility in the second attack have done differently, to prevent having to shut down their control centers? It seems to me that the root cause of that shutdown was that the utility didn’t trust its own controls, the very controls put in place to block exactly the kind of cross-network traffic they were worried might get through.
In case you’re wondering, these were high impact Control Centers under NERC CIP-002 R1. The CIP requirements that govern separation of networks, CIP-005 R1 and R2, should have been sufficient to prevent spread of malware between the networks. However, it’s also very easy to violate those requirements, and my guess is that somebody in IT didn’t want to take the chance that someone had slipped up in maintaining the required controls, no matter how unlikely it was that they had.
I guess the moral of this second story is that, if you’re ever in doubt about whether your IT and OT networks are really separated, you should take further steps to remove those doubts. That way, if an incident like this happens to your organization (utility, pipeline, oil refinery, etc.), you’ll be able to leave the OT network running without suffering a nervous breakdown.
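One concrete way to remove those doubts is to routinely probe known OT addresses from a host on the IT network and confirm that nothing answers. A minimal sketch, assuming you maintain a list of OT hosts and ports to test; the 10.10.x.x addresses below are made-up examples, and 502 is the standard Modbus/TCP port:

```python
import socket

def reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Attempt a TCP connection; True means the IT-side host can reach the target."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def separation_holds(probe_results: dict) -> bool:
    """Given {(host, port): reachable} results from IT-side probes of OT
    addresses, separation holds only if nothing answered."""
    return not any(probe_results.values())

# Hypothetical OT targets to probe from an IT workstation
targets = [("10.10.1.5", 502), ("10.10.1.6", 502)]
results = {(h, p): reachable(h, p) for h, p in targets}
```

Run from a representative IT workstation on a schedule, a single reachable result is hard evidence that the networks aren’t as separated as everyone believed, before an attacker (or an incident responder) discovers it for you.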
Thus, the fact that DMEA was able to continue delivering electric power to their customers (although with delayed billing) shows they not only had the required separation between their IT and OT networks, but they also
- Didn't have any system dependencies linking their IT and OT networks (the kind of dependency that forced Colonial’s shutdown), and
- Had enough controls in place that they didn’t doubt that the two networks were really logically separated.
Good for them!
Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. Nor are they shared by CISA’s Software Component Transparency Initiative, for which I volunteer as co-leader of the Energy SBOM Proof of Concept. If you would like to comment on what you have read here, I would love to hear from you. Please email me at firstname.lastname@example.org.
[i] This incident was reported in the press at the time, but the utility’s announcement said that operations weren’t “affected”. They weren’t affected in the narrow sense that there was no outage. However, the loss of visibility in their control center itself was an “effect” on the grid, and the utility should have reported it to DoE as such.