Real-Time Business Intelligence on Transactional System Databases

Richard Brooks, Co-Founder and Lead Software Engineer, Reliable Energy Analytics LLC

Inventor of patent pending (16/933161) technology: METHODS FOR VERIFICATION OF SOFTWARE OBJECT AUTHENTICITY AND INTEGRITY and the Software Assurance Guardian™ (SAG ™) Point Man™ (SAG-PM™)...

Dec 7, 2021

This item is part of the Special Issue - 2021-12 - Data Analytics & Intelligence.

Traditional Business Intelligence (BI) solutions employ a series of data collection methods, data warehouses, operational data stores, data lakes, and other ways of storing data for use by business users. These data stores are often populated using traditional Extract, Transform, and Load (ETL) tools and/or change data capture methods that copy data from a source location into its eventual resting place in one or more of these stores. This traditional approach works well for many data analytics and data engineering use cases, but in some cases having real-time, in-place access to transactional source data from a BI tool can be very useful. This article describes one approach to implementing “real-time BI” capabilities over a transactional system used for settlement data in energy markets, which I’ll refer to as a “Virtual Star” design, following the “Star Schema” concepts originally proposed by Ralph Kimball.
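The traditional ETL copy path described above can be sketched in a few lines. This is a minimal illustration using SQLite with hypothetical table names (`readings`, `wh_readings`), not a production pipeline:

```python
# Minimal ETL sketch: extract rows from a (hypothetical) transactional
# source database, apply a small transform, and load them into a
# separate warehouse database.
import sqlite3

source = sqlite3.connect(":memory:")
warehouse = sqlite3.connect(":memory:")

source.execute("CREATE TABLE readings (ts TEXT, temp_c REAL)")
source.execute("INSERT INTO readings VALUES ('2021-11-24 16:00:00', 4.4)")

warehouse.execute("CREATE TABLE wh_readings (ts TEXT, temp_f REAL)")

# Extract + Transform (Celsius -> Fahrenheit) + Load.
for ts, temp_c in source.execute("SELECT ts, temp_c FROM readings"):
    warehouse.execute("INSERT INTO wh_readings VALUES (?, ?)",
                      (ts, round(temp_c * 9 / 5 + 32, 1)))
```

Real change data capture tools track inserts, updates, and deletes incrementally; the point here is only the source-to-warehouse copy that separates the transactional and analytic databases.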


A traditional Star Schema database used in Business Intelligence solutions contains a set of dimension tables, e.g., Time, Location, and other defining characteristics of a corpus of data, along with fact tables of what is referred to as “measure data”, used to conduct various types of analysis. A simple star schema might consist of two dimensions, Time and Location, used to report on ambient temperature: on 11/24/2021 at 16:00:00 UTC, the temperature in Boston, MA was 40 degrees Fahrenheit. The date and time are identified in a Time dimension, Boston is stored in a Location dimension, and a fact table records the temperatures (measures) observed at each location for each time interval. This data is collected by a real-time transactional system that captures weather data and is eventually stored in a data warehouse for further analysis, e.g., measuring the effect of temperature on solar panel energy production in Boston. Data scientists use the data in the warehouse for all types of analytic and statistical functions, and this activity can place a heavy load on a BI system and its database. This is one reason why many BI systems and data warehouses reside alongside, and independent from, transactional systems and their databases rather than being implemented on the transactional database itself. This separation prevents a BI workload from overwhelming the transactional system, which could degrade the performance of the transactional system and database.
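The Time/Location/temperature example above maps directly onto a tiny star schema. Here is a minimal sketch using SQLite, with hypothetical table and column names chosen for illustration: two dimension tables and one fact table holding the measure, joined back together for a human-readable result:

```python
# Minimal star-schema sketch: dim_time and dim_location are dimensions,
# fact_temperature holds the measure keyed to both.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_time (
        time_key INTEGER PRIMARY KEY,
        utc_ts   TEXT                 -- e.g. '2021-11-24 16:00:00'
    );
    CREATE TABLE dim_location (
        location_key INTEGER PRIMARY KEY,
        city         TEXT,
        state        TEXT
    );
    CREATE TABLE fact_temperature (
        time_key     INTEGER REFERENCES dim_time(time_key),
        location_key INTEGER REFERENCES dim_location(location_key),
        temp_f       REAL             -- the measure
    );
    INSERT INTO dim_time VALUES (1, '2021-11-24 16:00:00');
    INSERT INTO dim_location VALUES (1, 'Boston', 'MA');
    INSERT INTO fact_temperature VALUES (1, 1, 40.0);
""")

# Analytic query: join the fact table to its dimensions.
row = conn.execute("""
    SELECT t.utc_ts, l.city, f.temp_f
    FROM fact_temperature f
    JOIN dim_time t     ON f.time_key = t.time_key
    JOIN dim_location l ON f.location_key = l.location_key
""").fetchone()
print(row)  # ('2021-11-24 16:00:00', 'Boston', 40.0)
```

The join pattern, one fact table surrounded by descriptive dimensions, is the “star” shape the Virtual Star design later recreates virtually over transactional tables.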

But there are times when business users need real-time analytical capabilities on a transactional system’s data. In that case, is it “ok” to have a BI system operating directly against the transactional system database? The answer is yes, but it requires strict design controls to prevent BI analysis from degrading the performance of the transactional database. This is where the “Virtual Star” design can be applied safely to a transactional database. But wait: transactional databases are designed for high-performance transaction processing and aren’t very conducive to analytic functions, which benefit from “human friendly” data content rather than the more performant numeric references to reference data; e.g., the number 77 might refer to cloudy weather conditions, which isn’t obvious to a human reader. A Virtual Star schema addresses these aspects of a transactional data model by forming “virtual dimensions” based on material contained in the transactional system database, or, when necessary, constructed and stored in the BI system database to provide human-readable content. Ideally, these virtual dimensions are created dynamically so that new values are immediately accessible to the BI data scientist using the data. Create a virtual dimension in the BI database only as a last resort, when a dynamically created virtual dimension is not viable. A Time dimension is frequently kept in the BI database used against transactional data, as many transactional systems do not offer the rich, robust Time dimension needed for analytic functions.
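A dynamically created virtual dimension is typically just a database view over the transactional reference tables. The sketch below (SQLite, with hypothetical table names) shows the “code 77” decoding idea: the view exposes human-readable labels at query time, so newly added codes appear automatically without any ETL:

```python
# Virtual dimension sketch: a view over a transactional reference table
# decodes compact numeric codes into human-friendly descriptions.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Transactional reference table: compact numeric codes.
    CREATE TABLE weather_condition_ref (code INTEGER PRIMARY KEY, descr TEXT);
    INSERT INTO weather_condition_ref VALUES (77, 'partly cloudy'), (10, 'clear');

    -- Transactional observations store only the code.
    CREATE TABLE observations (obs_ts TEXT, condition_code INTEGER);
    INSERT INTO observations VALUES ('2021-11-24 16:00:00', 77);

    -- The virtual dimension: created once, evaluated dynamically, so any
    -- new code inserted into the reference table is visible immediately.
    CREATE VIEW v_dim_weather_condition AS
        SELECT code AS condition_key, descr AS condition_name
        FROM weather_condition_ref;
""")

row = conn.execute("""
    SELECT o.obs_ts, d.condition_name
    FROM observations o
    JOIN v_dim_weather_condition d ON o.condition_code = d.condition_key
""").fetchone()
print(row)  # ('2021-11-24 16:00:00', 'partly cloudy')
```

Because the view carries no copied data, it adds no storage or refresh burden; the trade-off is that every BI query re-reads the transactional tables, which is why the design controls discussed below matter.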

The information provided in this article is based on actual experience designing and implementing a virtual star, real-time BI implementation over a database used in an Energy Settlements transactional system within an ISO/RTO organization. A Settlements transactional system is a very complex set of applications that calculates payments to electricity generators and transmission owners and charges to Load-Serving Entities and local utilities. The system operates on a cadence measured in hours, which was conducive to a real-time BI virtual star implementation. More active transactional systems, e.g., those processing thousands of transactions per second, are unlikely to be a good fit for the virtual star, real-time BI approach described in this article.

If you are considering a virtual star approach to accessing real-time transactional data, there are a few things to keep in mind:

  1. Transactional system and database performance is the highest priority; do not introduce any functionality in your virtual star that would jeopardize this requirement, e.g., constrain query execution time to one minute or less.
  2. Identify the transactional data elements (measures) you wish to report on in real time, e.g., temperature and humidity for a given time and location.
  3. Identify the dimensional elements in the transactional database that can be created dynamically at runtime (e.g., as a database view) to serve as virtual dimensions that filter, search, group, and describe measure data in a human-friendly manner.
  4. Identify any virtual dimensions that must be constructed in the BI system database; this may require some ETL work and should be used only as a last resort, as dynamic dimensions are preferred when feasible. A Time dimension is frequently kept within the BI system database.
  5. Ensure that virtual dimensions contain human-friendly data element content; for example, a weather conditions dimension should carry descriptive text, e.g., “Cloud Conditions” code 77 = “partly cloudy”.
  6. Depending on the real-time requirement, caching or materialized views may be an option.
  7. Slowly or rarely changing dimensions that require complex hierarchy logic can be stored as indexed dimension tables, which places less load on the BI database at user run time. Be careful not to expose any foreign keys into the transactional database that could cause unexpected results.
  8. Choose carefully which data you make available as measures; tables with large volumes of data may not be appropriate for a virtual star implementation unless high-performance indexing limits the amount of data retrieved. Limit the span of time a query can report on, e.g., restrict queries that include time dimension data to windows such as one hour or one day, and consider additional constraints such as query time limits, row limits, session limits, and temp tablespace limits.
  9. Transactional systems can change rapidly; constant monitoring and performance tuning may be needed on the BI system and database to keep pace with the capabilities of the transactional system and database.
  10. Transactional system databases can occasionally contain invalid data, which is typically caught and corrected over time. Inform your data scientists of this possibility so that they will be on the lookout for “suspect data”.
  11. Some data in a transactional system is not conducive to real-time analysis in a BI virtual star implementation, e.g., linked data tables and fine-grained temporal data such as Phasor Measurement Unit readings recorded 30 times per second.
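The guardrails in items 1 and 8 (query time limits, row limits, and a bounded reporting window) can be sketched as a small wrapper around query execution. This is a hypothetical helper shown with SQLite, whose progress handler can interrupt a running statement; most server databases expose statement timeouts and resource limits directly:

```python
# Sketch of query guardrails for a virtual star: cap the reporting
# window, cap returned rows, and abort statements that run too long,
# so BI queries cannot degrade the transactional database.
import sqlite3
import time
from datetime import datetime, timedelta

MAX_WINDOW = timedelta(days=1)   # maximum time-dimension span per query
MAX_ROWS = 10_000                # row limit per query
MAX_SECONDS = 60                 # query execution time limit

def run_guarded_query(conn, sql, start_ts, end_ts):
    """Run a windowed BI query under window, row, and time limits."""
    if end_ts - start_ts > MAX_WINDOW:
        raise ValueError("requested window exceeds the 1-day limit")
    deadline = time.monotonic() + MAX_SECONDS
    # SQLite progress handler: a nonzero return value interrupts
    # the running statement once the deadline has passed.
    conn.set_progress_handler(
        lambda: 1 if time.monotonic() > deadline else 0, 1000)
    try:
        cur = conn.execute(sql, (start_ts.isoformat(sep=" "),
                                 end_ts.isoformat(sep=" ")))
        return cur.fetchmany(MAX_ROWS)   # never pull more than MAX_ROWS
    finally:
        conn.set_progress_handler(None, 0)

# Demo against a tiny (hypothetical) fact table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_temperature (ts TEXT, temp_f REAL)")
conn.execute(
    "INSERT INTO fact_temperature VALUES ('2021-11-24 16:00:00', 40.0)")
rows = run_guarded_query(
    conn,
    "SELECT ts, temp_f FROM fact_temperature WHERE ts BETWEEN ? AND ?",
    datetime(2021, 11, 24, 0, 0, 0),
    datetime(2021, 11, 24, 23, 0, 0),
)
```

The same shape applies regardless of database engine: validate the window before executing, enforce a deadline during execution, and truncate the result set on the way out.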

In summary, this article describes the knowledge and information gained by implementing a real-time BI Virtual Star in production over a production transactional system and database used to calculate wholesale energy market payments and charges, referred to as a Settlement system. Recommendations are provided to help ensure a successful implementation.
