

Use Machine Learning to Automate Measurement Point Identification Throughout Your Enterprise

image credit: Royalty free by Ryzhi, Shutterstock
Allison Salke
Senior Product Marketing Manager, Oracle Utilities

Allison began her career in financial services developing some of the industry's earliest real-time clearing systems. As a consultant Allison has helped many established and early stage...

  • Mar 26, 2020 8:59 pm GMT

Often, we think of innovation as an invention that creates a brand-new class of devices, like the iPhone, or a “disruptive” app that changes our daily life, like Google or Facebook. Sometimes, though, innovation happens in a business model through the incremental application of a technology such as machine learning (ML) over time. For example, a recent Harvard Business Review article noted: “you need only a computer system to be able to perform tasks traditionally handled by people.”[i] In a previous post we discussed using ML and open source tools to automate load prediction. Our present ML innovation project involves automating measurement management.


As Grid complexity increases daily, so does the workload of the engineers who constantly create, update, and load measurement files into each of their Operational Technology (OT) systems. One issue that makes this work complex: each SCADA, OMS, and GIS system has its own naming convention for data points. When a new device is added to the network, each OT system needs to be updated as well. Typically, this is a highly manual process, as each system manages its own name space. In many system architectures, engineers register new devices in the GIS/OMS system, generate new SCADA data point files, and send the points to the SCADA system. This often involves engineers running custom-developed scripts to manage the differences in metadata (name spaces and device models) across these various OT systems.

Figure 1 In many system architectures, a manual process is required to keep OT systems’ measurement point tables up to date.

In some instances, our utility customers benefit from our automated Asset ID Management (AIM) solution, which automatically keeps the measurement fields of Operational Technology (OT) systems up to date with new devices and avoids the manual registration of these devices.

Figure 2 In some instances, automated systems update measurement point tables via hard-coding and configuration files.

Creating the configuration files and scripts that automate the AIM process requires considerable manual code development by experienced engineers, which makes it the perfect place for ML to increase productivity.

Automated Measurement Point Identification

In organizations that encode OMS measurement lookups into their SCADA data point names, an extract, transform, load (ETL) process can be employed. Specifically, the system extracts data point names, transforms the points to OMS measurements, and loads the data into the OMS systems.
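As a rough illustration of that extract, transform, load flow, here is a minimal Python sketch. The point names, the suffix lookup table, and the in-memory list standing in for the OMS table are all assumptions made up for this example; they do not reflect any real SCADA or OMS schema.

```python
# Hypothetical illustration only: the point names, suffix table, and
# in-memory "OMS table" are assumptions, not a real SCADA/OMS schema.

# Assumed lookup from a point-name suffix to an OMS measurement type.
SUFFIX_TO_MEASUREMENT = {
    "KV": "voltage",
    "AMP": "current",
    "STAT": "status",
}


def extract(scada_export):
    """Extract: pull the non-empty data point names out of a raw export."""
    return [line.strip() for line in scada_export.splitlines() if line.strip()]


def transform(point_names):
    """Transform: map each point name to an OMS measurement record.

    Names whose suffix is not in the lookup table are collected separately
    so that an engineer can review them.
    """
    records, unmapped = [], []
    for name in point_names:
        device, _, suffix = name.rpartition("_")
        measurement = SUFFIX_TO_MEASUREMENT.get(suffix)
        if device and measurement:
            records.append(
                {"device": device, "measurement": measurement, "scada_point": name}
            )
        else:
            unmapped.append(name)
    return records, unmapped


def load(records, oms_table):
    """Load: append the records to the (here, in-memory) OMS table."""
    oms_table.extend(records)
    return len(records)


raw_export = "SUB1_FDR12_KV\nSUB1_FDR12_AMP\nSUB2_BKR7_STAT\nSUB2_XFMR1_TEMP\n"
oms_measurements = []
records, unmapped = transform(extract(raw_export))
loaded = load(records, oms_measurements)
```

In this toy run, three points are loaded and one (the name with an unknown suffix) is set aside for review. A hand-maintained lookup table like this is exactly the kind of mapping that becomes hard to manage at scale.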

This all sounds great, but with tens of thousands of data points, creating the mappings can be time consuming, prone to errors, and reliant on institutional knowledge. This seemed like a perfect example of a task performed by people where ML could help and bring innovation to a process that is repeated throughout the power industry.

Applying Machine Learning to OMS SCADA Points Table Creation

Typically, SCADA systems’ data point naming conventions are complex and not uniformly applied. The task with which we thought ML could help was to automatically determine the type of measurement associated with a particular SCADA data point so that we could automatically create a measurement table in an Outage Management System. With our automated mapping solution in place, we created an ML algorithm that generates prediction rules for data points in the OMS measurement tables, eliminating the manual process of creating configuration files.

Figure 3 Our first model involved the use case of adding devices to a SCADA system and automatically updating the OMS measurement tables.

As shown in the above diagram:

  1. The automated process starts with measurement point discovery.
    1. The data point name is retrieved from the SCADA system via ICCP.
    2. The attribute part is extracted from the ICCP name. The AttributeName extractor performs a search in the SCADA table.
    3. The first AttributeName match is returned. If no match is found, an exception report is created and the search process for this ICCP name stops. 
  2. The base name is extracted and tokenized.
    1. A BaseName Extractor removes the attribute name substring from the ICCP name.
    2. The BaseName is now available for further processing.
    3. A tokenizer splits the BaseName into tokens using a user-defined separator. After tokenization, production rules based on the relative association between SCADA tokens and OMS tokens are generated in a format that is more legible for users.
    4. A few examples of SCADA point to OMS name mapping are used as a supervised dataset to train the neural network. From these mappings, the previously mentioned production rules are generated, which then are the labels for the model to learn.
    5. The ordered list of extracted tokens is available for further processing.
  3. The ML algorithm processes the tokens and predicts a production rule.
    1. The OmsName generator tries to match the token set against a list of regular expressions ordered by predicted probability. For a match with a probability above the user-defined cutoff, it applies the corresponding production rule to generate the OmsName candidate, which it verifies against the SCADA_POINTS Table.
    2. If the candidate does not match, the generator iterates using the next regular expression.
    3. If no match is found, an entry is added to the exception report.
  4. The OmsName is available for the configuration file.
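The steps above can be sketched in Python roughly as follows. This is a simplified illustration under stated assumptions, not our production code: the attribute list, the ranked rule list (which in practice would come from the trained model), the "." join in the generated name, and the contents of the verification table are all hypothetical. A production rule is represented here as a tuple of token positions.

```python
import re

# All names and values below are hypothetical stand-ins for illustration.
KNOWN_ATTRIBUTES = ["_STATUS", "_KV", "_AMPS"]   # assumed attribute table
SCADA_POINTS = {"BKR3.SUB1", "FDR12.SUB1"}       # assumed verification table
SEPARATOR = "_"                                  # user-defined separator
CUTOFF = 0.50                                    # user-defined cutoff

# (probability, regex over the joined tokens, production rule), already
# sorted by descending probability. A production rule is written here as a
# tuple of token indices giving the order of tokens in the OMS name.
RANKED_RULES = [
    (0.80, re.compile(r"^SUB\d+_BKR\d+$"), (1, 0)),
    (0.65, re.compile(r"^SUB\d+_FDR\d+$"), (1, 0)),
    (0.30, re.compile(r"^.+$"), (0,)),
]


def extract_attribute(iccp_name):
    """Step 1: return the first known attribute found in the name, or None."""
    for attribute in KNOWN_ATTRIBUTES:
        if attribute in iccp_name:
            return attribute
    return None


def generate_oms_name(iccp_name, exceptions):
    """Run steps 1 to 4 for one ICCP name; log failures to `exceptions`."""
    attribute = extract_attribute(iccp_name)
    if attribute is None:
        exceptions.append((iccp_name, "no attribute match"))
        return None
    base_name = iccp_name.replace(attribute, "", 1)  # step 2: BaseName extraction
    tokens = base_name.split(SEPARATOR)              # step 2: tokenization
    joined = SEPARATOR.join(tokens)
    for probability, pattern, rule in RANKED_RULES:  # step 3: ranked matching
        if probability <= CUTOFF:
            break                                    # remaining rules rank lower
        if not pattern.match(joined):
            continue
        if max(rule) >= len(tokens):
            continue                                 # rule needs more tokens
        candidate = ".".join(tokens[i] for i in rule)
        if candidate in SCADA_POINTS:                # verification step
            return candidate                         # step 4: name is available
    exceptions.append((iccp_name, "no rule produced a verified name"))
    return None


exception_report = []
names = ["SUB1_BKR3_STATUS", "SUB1_FDR12_KV", "SUB9_BKR1_AMPS", "SUB1_BKR3_TEMP"]
results = [generate_oms_name(name, exception_report) for name in names]
```

In this sketch the first two names resolve to verified OMS names, while the last two end up in the exception report: one because its candidate fails verification and no higher-ranked rule remains above the cutoff, the other because no attribute is recognized.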

Preliminary Results

Our preliminary results indicate 96.51% accuracy for OMS namespace generation with limited training examples. As we continue with this project, we expect to see the accuracy improve with additional data points for learning.

Lessons Learned

Generalization was our biggest challenge. Ultimately, we wanted to make parts of the process user configurable. This meant the automated mapping process needed to learn the different types of mappings in a way that let us give control and feedback to the user. We considered different approaches based on Natural Language Processing, Reinforcement Learning, and ML to solve the problem. Based on the observed nature of the different mappings, we chose a classification approach based on Neural Networks.
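To make the classification framing more concrete, the sketch below shows one way a production rule could be derived as a label from an example mapping, and how a classifier could then rank rules for a new name. Note the substitution: a simple frequency count over coarse token shapes stands in for the neural network, purely to keep the example short and dependency free. The example names, the `token_shape` feature, and the rule encoding (a tuple of token positions) are all assumptions.

```python
from collections import Counter, defaultdict
import re


def token_shape(token):
    """Collapse a token to a coarse shape, e.g. 'FDR12' -> 'A9'."""
    shape = re.sub(r"[A-Za-z]+", "A", token)
    return re.sub(r"\d+", "9", shape)


def derive_rule(scada_tokens, oms_tokens):
    """Label for one example: where each OMS token sits in the SCADA tokens.

    Raises ValueError if an OMS token does not appear among the SCADA tokens.
    """
    return tuple(scada_tokens.index(token) for token in oms_tokens)


# Hypothetical training examples: (SCADA tokens, OMS tokens).
EXAMPLES = [
    (["SUB1", "BKR3"], ["BKR3", "SUB1"]),
    (["SUB2", "BKR7"], ["BKR7", "SUB2"]),
    (["SUB4", "FDR12", "A"], ["FDR12", "SUB4"]),
]

# "Training": count which rule each feature (sequence of token shapes) led to.
rule_counts = defaultdict(Counter)
for scada_tokens, oms_tokens in EXAMPLES:
    feature = tuple(token_shape(t) for t in scada_tokens)
    rule_counts[feature][derive_rule(scada_tokens, oms_tokens)] += 1


def predict_rules(scada_tokens):
    """Return (rule, probability) pairs, most likely first."""
    feature = tuple(token_shape(t) for t in scada_tokens)
    counts = rule_counts.get(feature)
    if not counts:
        return []  # unseen pattern: nothing to predict
    total = sum(counts.values())
    return [(rule, n / total) for rule, n in counts.most_common()]
```

Because each rule is a discrete label, the output is a ranked list of rules with scores, which is what makes it possible to expose a cutoff and a choice between candidates to the user.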

Another challenge was simulating a real-life scenario: the model had to learn the production rules from a very limited set of examples. Initially, we generated a single OMSName based on the top prediction. In cases of ambiguity, however, we wanted to let the user decide between two OMSNames. This could happen if the probabilistic score given by the model was similar for two or more candidates. Additionally, we decided to add a cutoff level. This creates a human-assisted ML process: with each iteration, the learning process can improve and resolve previously known ambiguities.
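A minimal sketch of this human-assisted selection logic might look like the following. The function name, the default cutoff, and especially the `margin` parameter (how close two scores must be to count as "similar") are assumptions for illustration; the actual thresholds would be user configurable.

```python
def select_candidates(scored_names, cutoff=0.5, margin=0.05):
    """Decide what to do with scored OMS name candidates.

    scored_names: list of (oms_name, probability) pairs.
    Returns ("accept", [name]) when one candidate clearly wins,
    ("ask_user", [names...]) when top scores are within `margin`,
    or ("exception", []) when nothing clears the cutoff.
    """
    ranked = sorted(scored_names, key=lambda pair: pair[1], reverse=True)
    eligible = [(name, p) for name, p in ranked if p > cutoff]
    if not eligible:
        return "exception", []
    top_score = eligible[0][1]
    close = [name for name, p in eligible if top_score - p <= margin]
    if len(close) > 1:
        return "ask_user", close
    return "accept", close


clear_case = select_candidates([("BKR3.SUB1", 0.91), ("SUB1.BKR3", 0.06)])
ambiguous_case = select_candidates([("A.B", 0.48), ("B.A", 0.47)], cutoff=0.4)
weak_case = select_candidates([("A.B", 0.30)])
```

The user's choice in the ambiguous case can then be fed back as a new labeled example, which is how each iteration can resolve a previously known ambiguity.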

Next Steps

In addition to pursuing better accuracy, we are leveraging open-source Python libraries like Pandas and NumPy to create an ML pipeline for the mapping, thus increasing computational efficiency and reducing load time. Future work will entail Named Entity Recognition for tokenization. This means we will seek to identify the different components of a name, such as a device ID and attribute, within unstructured text.

This is another example of how we applied ML and open source libraries to eliminate manual, repetitive tasks for operations engineers. On the scale of innovative disruption, it may not become a verb the way we “Google” a topic to get information. However, as a practice over time, the incremental application of ML is a path to automation and productivity. It reduces monotonous tasks for engineers and increases the time they have available to ensure Grid reliability and optimization. Internally, it enables our Professional Services engineers to deliver increased value to our customers by greatly decreasing the time required to integrate name spaces across OT systems.


[i] Iansiti, Marco and Lakhani, Karim R. “Competing in the Age of AI,” HBR (Jan-Feb 2020)

Matt Chester on Mar 26, 2020

Really interesting, thanks for sharing Allison. Do you have any utilities involved as stakeholders or for pilot/rollout? I'd be curious about how they're utilizing or looking at these opportunities. 

Allison Salke on Mar 27, 2020

Thank you Matt. This work is in development with real-life data. Internally the ML has been helpful as has been the process of looking for areas where we can apply ML. 

Jim Horstman on Mar 27, 2020

Hi Allison, great article. It's a little abstract for me so seeing some actual 'data' examples would help. This concept would be great to incorporate into EPRI's Grid Model Data Management project. As an Oracle partner I hope you will be able to participate with them on the project.
