
Validating Process Safety Assumptions using Operations Data

Taylor Schuler
Business Development, Software
taylor.schuler@aesolns.com
aeSolutions, Dallas, Texas, USA
Jim Garrison
Principal Specialist
jim.garrison@aesolns.com
aeSolutions, Greenville, South Carolina, USA

Taylor’s Bio

Taylor Schuler has more than 15 years of experience in software product management for the Oil and Gas industry. Currently, Taylor is the Product Manager for aeSolutions aeFacilitator™ and aeShield™ process safety applications.

Taylor’s experience with numerous customers provides a unique foundation for gathering and prioritizing requirements, converting them into consumable and testable features for software development professionals, and ultimately deploying them to customers once complete. His experience with hundreds of facilities across five continents makes Taylor an effective product manager for aeSolutions. Taylor holds BS degrees in Nuclear Engineering from the University of Tennessee and Physics from Roanoke College. In addition, he holds a Certification in Maintenance and Reliability from the University of Tennessee.

Jim’s Bio

Jim Garrison, a recent addition to aeSolutions, is a key member of the process safety engineering team in Greenville, SC. He is a graduate of Georgia Tech with a BS in Electrical Engineering. He has over 8 years of experience designing instrumentation systems for use in hazardous areas and performing HAZOP studies and SIL selection and verification. Jim is a licensed PE in four states and is an ISA Certified Automation Professional (CAP) and ISA 84 SIS Fundamentals Specialist (ISA84 SFS).

Abstract

As facilities assess risk, make recommendations for gap closure, and design safety instrumented functions (SIFs), assumptions are made to facilitate calculations in the design phase of the protection layers used to reduce the likelihood of hazards occurring. Each of these assumptions is made based on design standards, process safety experience, and data supplied by manufacturers concerning operability and reliability. The purpose of this white paper is to identify key assumptions and replace them with real-world operations data to show that the actual risk may be greater than the perception based on design. This case study focuses on comparing real functional test intervals against those applied in the safety integrity level (SIL) calculations. It also compares unsafe bypasses against the probability of failure on demand (PFD), and the count of initiating causes against the frequencies documented in the layer of protection analysis (LOPA).

Overview

As stated in the abstract, the purpose of this white paper is to use real-world data to replace assumptions made during the safety instrumented systems (SIS) lifecycle. Real-world daily operations data can be extracted from applications such as historians, asset management systems, and/or other tooling that captures relevant data regarding a SIF's performance.

This paper focuses on three assumptions that are made either during a risk assessment or designing a SIF. The three assumptions are:

  • Test Intervals: the frequency at which the safety devices need to be tested in order to achieve the risk reduction factor (RRF) established in design.
  • Cause Tracking: the frequency at which the LOPA team expects each initiating cause to occur.
  • Unsafe Bypass: periods in which the SIF is in bypass while the process continues to operate.

Placing this information in the hands of the subject matter experts enables better decisions and a safer facility, whether in a risk assessment, during SIS design, or during operations and maintenance (O&M).

Case Study and Assumptions

A key assumption made in this paper is that the SIS engineers have a data map that relates the operations data to the SIS model, from the hazard down to the tagnames required to minimize the risk. The tools used in this case study are the software products offered by aeSolutions[1], a MS Excel® spreadsheet containing data from a common historian, and a spreadsheet that contains the testing dates of critical safety devices as stored in a common asset management system.
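
A minimal sketch of such a data map is shown below as a simple Python dictionary. Apart from PT-123, which appears later in this paper, the tagnames and the structure are hypothetical illustrations, not the schema used by the aeSolutions tooling.

    # Hypothetical data map relating the SIS model to historian soft-tags
    # and asset management records. Only PT-123 appears in this paper;
    # the other tagnames are invented for illustration.
    data_map = {
        "Case Study SIF-01": {
            "hazard": "Reactor high pressure",
            "sensors": {  # 2oo3 voted pressure transmitters
                "PT-123": {"trip_tag": "PT-123_TRIP", "bypass_tag": "PT-123_BYP"},
                "PT-124": {"trip_tag": "PT-124_TRIP", "bypass_tag": "PT-124_BYP"},
                "PT-125": {"trip_tag": "PT-125_TRIP", "bypass_tag": "PT-125_BYP"},
            },
            "final_elements": ["XV-201", "XV-202"],  # 1oo2 voted ball valves
        }
    }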

The case study was based on data from a common SIF (Case Study SIF-01) at an unnamed company and facility. The SIF has an IL Rating = 2 and has been added to a reactor to ensure the vessel returns to a safe state in the event its pressure becomes too high.


Figure 001 – Reactor Protected by Case Study SIF-01

The operations data reviewed covered a 5-year period, September 1, 2009 to August 31, 2014. The SIF has three pressure transmitters as sensors with 2oo3 voting and two ball valves as final elements with 1oo2 voting, as seen in the figure below.

Figure 002 – Case Study SIF-01 Architecture
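
For reference, a commonly used set of simplified average-PFD equations for periodically tested voted groups, neglecting common cause failures, diagnostics, and repair time (all of which a full SIL calculation would model), is:

    \mathrm{PFD}_{avg}^{1oo1} \approx \frac{\lambda_{DU}\,TI}{2}, \qquad
    \mathrm{PFD}_{avg}^{2oo3} \approx \left(\lambda_{DU}\,TI\right)^{2}, \qquad
    \mathrm{PFD}_{avg}^{1oo2} \approx \frac{\left(\lambda_{DU}\,TI\right)^{2}}{3}

where λDU is the dangerous undetected failure rate of a single device and TI is the test interval. These approximations make explicit how the achieved RRF depends on the test interval.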

When reviewing the operating data, the historian events and test plans were based on the sensors only. All naming conventions were generalized to mask the identity of the equipment and to simplify the analysis performed in this white paper.

Extended Test Intervals

As hazards with unacceptable risks are identified, the LOPA team may recommend designing a SIF to close the gap to an acceptable level. As SIS engineers design and investigate multiple what-if scenarios, the test interval, in months, for each safety device is established to achieve the desired RRF. If that test interval is extended, the RRF calculated during design is no longer valid. In this case study, the devices on Case Study SIF-01 require testing every 18 months. Due to prioritization issues, the facility decided to wait until the next turnaround of the equipment under control, which doubled the assumed test interval for each device (see Figure 003).


Figure 003 – Sensor PT-123 Design vs Actual Test Interval

Each of the three sensors was tested at the same time, as were the final elements. Adjusting the test intervals and re-running the SIL calculation produced the following results:

Figure 004 – Case Study SIF-01 SIL Calculation Results: Design vs Actual Test Intervals

To narrate the results displayed in Figure 004, the SIF was required to have an IL Rating = 2 and was slightly overdesigned (RRF = 119). However, updating the SIL calculation with the real-world test intervals dropped the RRF to 90 (IL Rating = 1), introducing ~10% additional risk, which represents a gap. Is 10% acceptable? Of course, the answer could vary depending on the organization and the severity of the hazard the SIF is protecting against; however, the example is evidence of how things can change over time as difficult decisions are made.
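
The sketch below illustrates the mechanics of such a recalculation using the simplified voted-group equations above with hypothetical failure rates. It will not reproduce the RRF values in Figure 004, because the actual SIL calculation includes contributions (logic solver, common cause, repair time) that do not scale with the test interval.

    # Minimal sketch: effect of doubling the test interval on a 2oo3 sensor
    # group and 1oo2 valve group, using the simplified equations above.
    # The failure rates are hypothetical, so the RRFs are illustrative only.
    HOURS_PER_YEAR = 8760.0

    def pfd_2oo3(lam_du, ti_hours):
        return (lam_du * ti_hours) ** 2          # simplified 2oo3 voted group

    def pfd_1oo2(lam_du, ti_hours):
        return (lam_du * ti_hours) ** 2 / 3.0    # simplified 1oo2 voted group

    def sif_rrf(ti_months, lam_sensor=6.0e-6, lam_valve=8.0e-6):
        ti_hours = ti_months / 12.0 * HOURS_PER_YEAR
        pfd = pfd_2oo3(lam_sensor, ti_hours) + pfd_1oo2(lam_valve, ti_hours)
        return 1.0 / pfd

    print(f"Design TI (18 mo): RRF = {sif_rrf(18):.0f}")   # ~101 with these rates
    print(f"Actual TI (36 mo): RRF = {sif_rrf(36):.0f}")   # ~25; PFD scales with TI^2 here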

To recap the workflow:

  • LOPA recommendation following the identification of a gap
  • SIF was designed with a required test interval and SIL calculation finalized
  • Data retrieved from asset management system with timestamps to identify real world test intervals
  • SIL calculation performed with actual test intervals
  • Analysis to determine tolerance of change in risk level

Periodic Review of Historian Data

Moving on to the other two assumption replacements discussed in this white paper, data from a common historian was required. To effectively analyze and annotate historized events, the following workflow is required. Many of the steps can be automated; however, manual steps are needed to validate the data and to classify it so it can be associated with the appropriate parts of the process safety data model. The manual steps may vary depending on the tooling available.

  • Identify the types of events that need to be tracked. When reviewing data from the historian, it will be in the soft-tag format of [tagname]&[suffix]. For simplification purposes, this paper focuses on two generic event types (a minimal parsing sketch follows this list):
      • Cause Tracking: suffix = _TRIP
      • Unsafe Bypass: suffix = _BYP
  • Create a data map between soft-tags and the sensor tags in the SIF (see Figure 005)

Figure 005 – Data map from safety model to historian soft-tags

  • Retrieve the data for the distinct list of soft-tags in the data map over a time period
  • The SIS engineer reviews the results (shorter review intervals are recommended to minimize the level of effort) and documents events against the architecture and voting
  • Identify initiating causes on SIF demands
  • Group events and focus on unsafe bypasses and their durations
  • Aggregate the data and perform analysis to determine tolerance levels
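
The classification step can be sketched as follows, assuming the historian events have been exported to a CSV file; the file name and column names are hypothetical, as real exports vary by historian.

    import csv
    from collections import defaultdict

    def classify(soft_tag):
        """Map a historian soft-tag to one of the two generic event types."""
        if soft_tag.endswith("_TRIP"):
            return "cause_tracking"
        if soft_tag.endswith("_BYP"):
            return "unsafe_bypass"
        return "other"

    # Hypothetical export with columns: soft_tag, start, end
    events = defaultdict(list)
    with open("historian_export.csv", newline="") as f:
        for row in csv.DictReader(f):
            events[classify(row["soft_tag"])].append(row)

    print({event_type: len(rows) for event_type, rows in events.items()})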

The following table presents data pulled from a historian and annotated by the SIS engineer. The data was limited to soft-tags associated with the sensors on Case Study SIF-01 over a period of 5 years, a typical duration between revalidations.

Figure 006 – Historian data used to analyze cause tracking assumptions and unsafe bypasses

Again, the SIS engineer's review and documentation effort is reduced if tooling is available to relate the protection layer to the safeguard and to the cause-consequence pair, creating a refined pick-list for the initiating cause column.

Cause Tracking

The data in Figure 006 enables counting the events related to each individual initiating cause. The data shows that there are two initiating causes creating demands on Case Study SIF-01 over the 5-year period. The LOPA team identified anticipated frequencies of the causes occurring on an annual basis. Figure 007 shows the results of the analysis.

Figure 007 – Cause Tracking Analysis on Case Study SIF-01

The green symbol indicates that the historian captured a demand count less than the assumed frequency, while the red indicates that the demand count is higher than the assumed frequency. Is this tolerable? Again, the answer depends on the organization and circumstances, but the data can certainly be useful in a cause review/assessment session.
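
A minimal sketch of this comparison is shown below; the cause names, LOPA frequencies, and observed counts are placeholders rather than the case-study values.

    # Compare observed demand counts against LOPA-assumed frequencies
    # over the review period. All values below are placeholders.
    YEARS = 5.0
    lopa_frequency = {"Cause A": 0.1, "Cause B": 1.0}   # assumed events per year
    observed_counts = {"Cause A": 1, "Cause B": 3}      # from annotated historian data

    for cause, freq in lopa_frequency.items():
        expected = freq * YEARS
        actual = observed_counts.get(cause, 0)
        status = "green" if actual <= expected else "red"
        print(f"{cause}: expected <= {expected:.1f}, observed {actual} -> {status}")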

Unsafe Bypass

The data in Figure 006 enables aggregating the durations during which a SIF is in an unsafe bypass state. The total duration can then be compared to the number of acceptable hours, calculated by multiplying the PFD by the number of hours in the period. Figure 008 shows the comparison against the SIF target and achieved PFDAVG values as well as the real-world value.

Figure 008 – Unsafe Bypass Analysis on Case Study SIF-01

The real-world PFDAVG values mirror those in Figure 004. The green text indicates that Case Study SIF-01 did not exceed the acceptable hours in any of the scenarios; therefore, no warning is needed.
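
The comparison can be sketched as follows; the bypass durations are placeholders, the target PFDAVG is taken as the SIL 2 minimum (RRF = 100), and the achieved and real-world values use the RRFs of 119 and 90 from the test interval analysis.

    # Compare total unsafe-bypass hours against the acceptable hours
    # implied by each PFDavg (acceptable = PFDavg * hours in period).
    HOURS_IN_PERIOD = 5 * 8760                   # 5-year review window

    bypass_hours = [12.5, 4.0, 30.0]             # placeholder _BYP event durations
    total_bypass = sum(bypass_hours)

    scenarios = {"target": 1 / 100, "achieved": 1 / 119, "real-world": 1 / 90}
    for label, pfd_avg in scenarios.items():
        acceptable = pfd_avg * HOURS_IN_PERIOD
        status = "green" if total_bypass <= acceptable else "red"
        print(f"{label}: {total_bypass:.1f} h in bypass vs {acceptable:.0f} h acceptable -> {status}")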

Summary

In closing, the assumptions made at the front end of the process safety lifecycle are educated, but they are still assumptions. Facilities already collect a large amount of data that can ultimately be tied to safety functions. Using tooling and managing data mappings enables facilities to place more emphasis on actual exposures to risk and to save money in areas where process safety professionals are overly conservative.

In this white paper, a single SIF was explored as a case study. For this SIF, operations data replaced the assumed test intervals in SIL calculations, actual frequencies of initiating causes were compared to LOPA figures, and unsafe bypass durations were compared to PFDAVG. Figure 009 begins to show the power of expanding this analysis to an entire facility, assessing all initiating causes and all SIFs.

Figure 009 – Facility scorecard regarding process safety assumptions

To reiterate, placing this information in the hands of the subject matter experts enables better decisions and a safer facility, whether in a risk assessment, during SIS design, or during operations and maintenance (O&M).

Disclaimer

This paper is provided for educational purposes. While the authors have made reasonable efforts in the preparation of this document, aeSolutions makes no warranty of any kind and shall not be liable in any event for incidental or consequential damages in connection with the application of this document.