Data Pipeline & Executive Dashboard

During my internship at Darlington Nuclear Generating Station, I designed and delivered the station's first centralized obsolescence tracking system. The project consolidated over 2.3 million equipment records from three disparate enterprise data sources into a unified relational database, then surfaced the results through two interactive Power BI dashboards that replaced fragmented Excel workflows and gave senior management real-time visibility into obsolescence risk across all four reactor units.

The Problem

Before this project, the obsolescence management team at Darlington had no single source of truth. Data was scattered across OPG's enterprise asset management system (AssetSuite 7), SharePoint lists maintained by the modifications team, network-hosted Excel files with supplementary tracking data, and individual spreadsheets kept by different team members with overlapping but inconsistent information.

This fragmentation was not the result of negligence. Enterprise software in nuclear facilities is rigid by design. Adding a single field to AssetSuite requires IT tickets, change management processes, and potentially months of waiting. When an engineer needs to track something new, like a vendor discontinuation date or a COVID impact flag, they create an Excel column. Over years and across multiple teams, this produces parallel data systems that nobody fully understands.

The consequences were significant. Senior management had no real-time visibility into obsolescence status. Generating a report required manually compiling data from multiple sources. The same equipment could be tracked differently in different systems. Decision-making for capital projects relied on incomplete, potentially outdated information. Data quality issues, including duplicates, mismatched identifiers, and missing records, were widespread.

Data Sources

The data pipeline pulled from three primary source systems, each with its own structure, identifiers, and conventions. Integrating them was the core technical challenge of the project.

AssetSuite 7 (Enterprise Asset Management)

AssetSuite 7 is OPG's core system for managing physical assets. Data was extracted from multiple modules within the platform:

SWMS (Station Work Management System) provided 225,268 work order records, including parts on hold (Hold Code "PH"), work order types (OA, PM, OR, OD, and others), EP Scores for equipment priority, and I2P Scores for work prioritization. Equipment tags linked each work order to physical equipment in the plant.

POMS (Parts Obsolescence Management System) contained 68,891 records tracking obsolescence status per Catalog ID. Each record was classified into one of four categories: NO (confirmed not obsolete), YES (confirmed obsolete), INCOMPLETE (assessment not finished), or UNIDENTIFIABLE (cannot be determined).

EC (Engineering Changes) held 203,321 records covering the full lifecycle of engineering changes. EC types included MOD (152K), NICR (42K), DCR (5K), DCN (2.5K), and several others. Status tracking ranged from REVISED (72K) and CLOSED (47K) through APPROVED (40K), ACTIVE (11.5K), and CANCELED (11.4K).

Equipment Tags (ETAG/SPV) provided 280,289 records with equipment component tags, system codes, SPV (Supervised Plant/Vector) classifications, and four-tier critical equipment ratings across 42 unique operating systems.

CID Tables (Catalog ID Cross-Reference) linked catalog identifiers to purchase order information (31,583 records), stock and inventory data (62,631 records with on-hand, demand, available, and reorder point fields), and POMS status records (68,891 entries).

SharePoint Lists

SharePoint hosted live tracking lists maintained by the modifications team. The MMOD Master List contained 341 records across 59 columns, covering project identification, engineering details, scheduling (NIMS Schedule, Projected Quarter, Actual In-Service Date), personnel assignments (MTL, DTL, Design SM, SRE), work tracking (Work Orders, Purchase Orders, Contractors), and status fields including COVID impact flags.

The TMOD (Temporary Modification) Master List tracked 62 records across 46 columns, including installation dates, removal tracking, overdue status, and COVID extension data. A TMOD Burn Down Curve table tracked monthly removal progress, and a TMOD Overall Summary provided quarterly forecasting with initial forecasts, revised plans, and actual remaining counts.

Network Excel Files

Manually maintained spreadsheets on the corporate network provided data that enterprise systems did not capture. The most important was a date dimension table (730 rows covering two years, with 10 columns including work week identifiers, quarter mappings, and day classifications) that served as the calendar reference for the entire data model. Enterprise systems did not provide a unified date dimension, so this Excel file filled the gap.

Data Model Architecture

The Main Dashboard's relational schema comprised 25 tables connected by 21 defined relationships. The design challenge was linking records that were never designed to be joined. Different source systems used different identifiers, data types, and coding conventions for the same equipment.

The CID-QL Composite Key Strategy

The core join strategy used a CID-QL (Catalog ID + Quality Level) composite key to create reliable many-to-many relationships between the major tables. The CID Purchase Order table, CID POMS Status table, CID Stock Info table, and the EVAL table all joined to the central All Unit Main Table through this composite key. This approach was necessary because Catalog ID alone was not unique; the same part could exist at different quality levels with different obsolescence implications.

Equipment tag joins provided a second axis of integration. The EC table, Work Order ETAG Info table, ETAG SPV table, and SWMS Parts Hold data all connected through the EQ_COMPONENT_TAG field, linking the data model to physical equipment locations in the plant. An additional 13 auto-generated date relationships connected local date tables to datetime columns across the SWMS and Work Order tables.

Calculated Columns and DAX Measures

The MMOD Dashboard required custom DAX (Data Analysis Expressions) logic for several business rules. Binary flag columns classified whether each modification's design was complete, whether its mod outline was approved, and whether the modification had been put in service. A text-based classification column used the SEARCH function to parse comments and categorize modifications as either "emergent" or "planned."

Running total measures powered the burndown tracking visualizations. These used CALCULATE with FILTER and ALLSELECTED to compute cumulative counts across ordered records, enabling the TMOD burndown curve and modification completion progress tracking.

Data Quality Challenges

Data quality was the real bottleneck of this project. I spent more time understanding and reconciling data than building visualizations. The plant-wide table alone contained 382,916 records, and missing data was pervasive. Reorder points and target maximums were null for 91% of items. Modification numbers were missing for 76.5% of records. Inventory data (available, on-hand, demand) existed for only about 24% of items. Part numbers were absent for 66% of records, and obsolescence status was unknown for 15%.

Cross-table inconsistencies compounded the problem. The Catalog ID format in the plant-wide table had zero direct overlap with the Catalog ID format in the SWMS parts table; 570 unique CIDs in one system versus 151 in the other with no shared values in their raw form. Unit coding was inconsistent: the EC table used "0", "034", "012", and "SW" while the plant-wide table used numeric values 0, 1, 2, 3, 4, 12, and 34. POMS status had 36,077 items classified as "INCOMPLETE" or "UNIDENTIFIABLE," leaving their obsolescence unclear. Quality level was stored as text ("1", "3", "4", "F") in the CID tables and as floats (1.0, 2.0, 3.0, 4.0) in the plant-wide table.

Resolving these issues required composite keys (CID-QL) to enable reliable joins, data type normalization across systems, careful handling of null values in visualizations, cross-referencing equipment tags as a secondary join key, and reconciliation of different coding systems for the same entities. I designed the dashboard to surface data gaps rather than hide them, so management could see where data quality improvement was needed.

What Was Built

The project delivered two major Power BI dashboards backed by the unified relational database. Together they contained over 167 interactive visuals across 17 report pages.

Main Dashboard: 5 Pages, 50+ Visuals

The Main Dashboard (33 MB) served as the primary executive view for plant-wide obsolescence tracking across 2.3 million records.

Page 1: Obsolescence Metrics for Darlington Station was the primary executive landing page with 17 visuals. Three column charts broke down equipment by commodity category code, manufacturer part number, and model number, all color-coded by POMS obsolescence status (green for not obsolete, orange for obsolete). A donut chart showed the overall POMS status distribution. A pivot table displayed CID and ETAG counts with percentages. A 13-column detail table provided drill-down access to individual records. Seven interactive slicers (operating system, system code, unit, SPV, critical equipment, repairable indicator, POMS status) and two free-text search filters enabled executives to slice the data in any direction.

Page 2: Operating Systems provided system-level drill-downs with 14 visuals, including equipment by unit (stacked by POMS status), by system code, and by manufacturer part number, with a multi-column detail table and eight slicers and filters.

Page 3: SWMS Parts Hold analyzed work orders and parts on hold with 17 visuals. A timeline chart showed parts hold volume by early start date, stacked by POMS status. Additional charts broke down parts by catalog ID, work order type, and work order source, with detailed data tables and eight slicers.

Page 4: Priority Obsolescence List (POL) was a template page set up for the team to configure as the priority list matured.

Page 5: OPG Overwrite provided a specialized comparison view with a pivot table cross-tabulating POMS obsolescence status by catalog ID and purchase order issue date, alongside a waterfall chart showing obsolescence status changes over time.

MMOD Dashboard: 12 Pages, 117 Visuals

The MMOD Dashboard tracked Major Modifications and Temporary Modifications across the station's lifecycle.

Metrics (the main overview page) displayed 9 visuals: a pie chart for planned vs. emergent modifications, a column chart by discipline, a bar chart by project title (2020 Capital Project, IPG, DNRU2, and others), a gauge showing completion progress, a card with the count of completed modifications, and a 13-column detail table.

Year-Specific Dashboards (2019, 2020, 2021) each contained 17 visuals with gauges showing completion progress, pie charts for status distribution, bar charts for project breakdowns, detailed tracking tables, and year-specific page filters.

TMOD Dashboard was the most feature-rich page at 21 visuals. It included a gauge, donut chart, column chart, bar chart, a detailed TMOD data table, and three interactive slicers for removal condition, overdue status, and TMOD type.

Extended TMOD Burndown Curve used a line clustered column combo chart (a unique visual type) to show TMOD removal trends over time, plotting both actual removals and remaining counts on a shared axis.

Additional pages covered Work Readiness (upcoming modification details), Upcoming Modifications (near-term filtered view), Outages (modifications associated with planned outages), and MODs Affected by COVID-19 (pandemic impact tracking with dedicated tables and filters).

Key Findings

Once the data was consolidated and the dashboards were live, several findings became immediately visible for the first time. These had been buried in fragmented spreadsheets and siloed systems; the dashboard made them undeniable.

Plant-Wide Obsolescence Status

Status	Count	Percentage
OBSOLETE	180,200	47.1%
READY	115,897	30.3%
NOPURCH (No Purchase)	70,625	18.4%
H/USER (Hold/User)	11,091	2.9%
H/PEEVAL (Hold/PE Eval)	2,278	0.6%
H/REVIEW	1,732	0.5%
BOMONLY	622	0.2%
NEWITEM	41	<0.1%

Nearly half of all tracked components (47%) were flagged as obsolete. This single statistic reframed the urgency of capital planning discussions. It was no longer a question of whether obsolescence was a concern; the data showed it was the dominant condition across the plant's equipment base.

POMS Obsolescence Verification

Status	Count	Meaning
NO	137,203	Confirmed not obsolete
UNIDENTIFIABLE	79,904	Cannot be determined
YES	58,393	Confirmed obsolete
INCOMPLETE	48,853	Assessment not finished

Only 42% of items had a definitive obsolescence status (YES or NO). The remaining 58% were either unidentifiable or had incomplete assessments, highlighting a significant data quality gap that the dashboard made visible for the first time.

Quality Level Distribution

Q-Level	Count	Significance
Q1 (Nuclear Safety)	203,912	53.3%, highest safety classification
Q3	95,326	24.9%
Q4	74,722	19.5%
Q2	8,526	2.2%

Over half of tracked components (53%) were Q-Level 1, the highest nuclear safety classification. When combined with the 47% obsolescence rate, this meant a substantial portion of nuclear safety-critical equipment was approaching or had reached end-of-life, a finding with direct implications for capital project prioritization and regulatory compliance.

Parts on Hold (SWMS)

The SWMS data revealed 225,268 parts on hold across the station, all with Hold Code "PH" (Parts Hold). Of these, 208,231 (92%) were Q-Level 1, meaning nearly all held parts were nuclear safety-critical. Units 4 (89K) and 2 (76K) had the most parts on hold.

Cross-referencing with POMS data showed that 141,109 of the held parts (63%) were confirmed not obsolete, while 10,529 (5%) were confirmed obsolete, and 61,752 (27%) had unknown obsolescence status. This 27% unknown figure was actionable intelligence: it told the team exactly which held parts needed obsolescence assessments prioritized.

Inventory Insights

For items with inventory data, the averages were 17.8 units on-hand, 17.5 units in demand, and 8.0 units available. Some items showed negative availability (down to -12), indicating backorder or shortage conditions. Perhaps most telling, reorder points were set to zero for 98.7% of items, meaning there was virtually no automated reordering in place. Missing reorder thresholds were not just a data gap; they indicated a systemic process gap that the dashboard brought into sharp focus.

Business Impact

Before	After
Data in fragmented Excel files shared ad-hoc	Single centralized dashboard
Manual report compilation required	Real-time interactive filtering
No visibility into cross-system status	Unified view across all data sources
Senior management relied on verbal updates	Self-service drill-down analysis
Obsolescence status unclear for many items	2.3M+ records with clear categorization
No historical tracking or trending	Year-over-year dashboards (2019 to 2021)
COVID impact tracking ad-hoc	Dedicated COVID impact page

The dashboards enabled data-driven capital planning by making it clear which systems and components needed replacement investment most urgently. The risk identification was immediate: 47% of tracked components flagged as obsolete, with 53% being nuclear safety-critical (Q1). The SWMS Parts Hold view showed 225K parts on hold, enabling targeted procurement efforts. For regulatory compliance, the MMOD and TMOD tracking provided auditable lifecycle records for nuclear safety modifications, including burndown curves and overdue status tracking.

By the end of my internship, both dashboards were in active use by the obsolescence management team and senior management for daily operations and capital planning decisions.

Technical Skills Demonstrated

Data Engineering: Designed the complete ETL pipeline, extracting data from three disparate source systems and transforming it into a unified schema. Built a 25-table relational model with 21 defined relationships. Designed the CID-QL composite key strategy to resolve join ambiguities. Performed data type normalization across systems with incompatible representations.

SQL and Query Development: Designed and executed queries joining multiple tables on compound keys. Wrote aggregation queries (COUNT, SUM, MIN, COUNTROWS) across large datasets. Created filtered views using WHERE clauses with IN and NOT NULL conditions. Built cross-table percentage calculations using scoped evaluation.

DAX (Data Analysis Expressions): Wrote calculated columns for binary classification (complete vs. incomplete, approved vs. not, planned vs. emergent). Created running total measures using CALCULATE with FILTER and ALLSELECTED. Used the SEARCH function for text-based classification of freeform comment fields.

Data Visualization: Designed over 167 interactive visuals across 17 pages. Implemented drill-down filtering with interconnected slicers. Applied conditional formatting with color-coding by obsolescence status. Created combo charts, waterfall charts, and burndown curves. Integrated custom visuals including text filter controls for free-text equipment search.

Data Quality Management: Identified and resolved data fragmentation across three source systems. Handled missing data with null rates reaching 91% in some columns. Reconciled cross-system ID mismatches where raw identifiers had zero overlap. Designed the dashboard to surface data gaps transparently rather than masking them, turning missing data into actionable information for the team.