The Five Things That Surprised Me About Industrial Data

If you’ve spent years in manufacturing IT, you know that industrial data isn’t just “data.” It’s messy, stubborn, and full of surprises. When I first started connecting machines and systems back in the SAP xMII days, I thought the challenge was mostly technical. Turns out, the real lessons are a lot more human — and humbling. Here are five things that genuinely surprised me about working with industrial data, based on what I’ve seen, fixed, and sometimes failed at, across dozens of plants.

1. Data Quality Is Never as Good as You Think

I used to assume that the data coming out of a PLC, SCADA, or historian was pretty reliable. After all, these systems run critical processes, right? But in reality, manufacturing data is often full of gaps, duplicates, wrong timestamps, or just plain nonsense values. I’ve seen temperature tags stuck at the same value for weeks, sensors that report in the wrong units, and operators who “fix” bad readings by entering whatever gets the system to stop beeping.

One specific project at a large process plant stands out. We were building a real-time OEE dashboard, but the line downtime numbers just didn’t add up. After digging in, we found that some machines weren’t logging events at all, while others were double-counting. The fix wasn’t just technical — it required walking the floor, talking to operators, and tracing cables. That’s when I learned: data quality is a people problem as much as a tech one.

My honest opinion: if you don’t invest in data quality checks and cleansing up front, you’ll pay for it ten times over later. No fancy analytics or AI will save you from garbage in, garbage out.
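Two of the symptoms above — flatlined sensors and gaps in logging — are easy to screen for programmatically. Here’s a minimal sketch of what those checks might look like; the thresholds (`min_run`, `max_gap`) are assumptions you’d tune per tag, not values from any real plant:

```python
from datetime import datetime, timedelta

def find_stuck_runs(readings, min_run=5):
    """Flag runs where a sensor repeats the exact same value --
    often a frozen transmitter rather than a genuinely stable process.
    `readings` is a list of (timestamp, value) pairs for one tag."""
    runs, start = [], 0
    for i in range(1, len(readings) + 1):
        if i == len(readings) or readings[i][1] != readings[start][1]:
            if i - start >= min_run:
                runs.append((readings[start][0], readings[i - 1][0]))
            start = i
    return runs

def find_gaps(readings, max_gap=timedelta(minutes=5)):
    """Flag pauses in logging longer than the expected scan interval."""
    return [(a[0], b[0]) for a, b in zip(readings, readings[1:])
            if b[0] - a[0] > max_gap]
```

Checks like these won’t catch wrong-units sensors or operator workarounds — those still need someone walking the floor — but they surface the mechanical failures cheaply.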

2. Naming Conventions Are a Hidden Nightmare

I never expected that something as simple as “what do we call this tag?” would eat up so much project time. Every plant, team, and system seems to have its own naming logic (or lack of logic). I’ve seen the same pump called “PMP_101,” “MainPump,” “001-P,” and “PumpA” — sometimes in the same facility. When you try to build a Unified Namespace or integrate MES, historian, and ERP, these inconsistencies become a real headache.

In one multi-site rollout, we spent weeks just mapping tag names across MES, historian, and DCS systems. It sounds trivial, but without a common language, everything from dashboards to root cause analysis gets harder. The good news is that standards are improving, but it’s still a battle to get everyone aligned.
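The core of that mapping exercise is mundane: an alias table that points every name each system uses at one canonical asset ID. A hedged sketch, using the pump aliases above as hypothetical examples:

```python
import re

# Hypothetical alias table built during the mapping exercise:
# every name each system uses points at one canonical asset path.
ALIASES = {
    "PMP_101": "Site1/Line2/Pump101",
    "MainPump": "Site1/Line2/Pump101",
    "001-P": "Site1/Line2/Pump101",
    "PumpA": "Site1/Line2/Pump101",
}

def _fold(name):
    # Collapse case and punctuation so "pmp-101" matches "PMP_101".
    return re.sub(r"[^A-Za-z0-9]", "", name).upper()

def normalize(raw_tag):
    """Resolve a system-specific tag name to its canonical asset path."""
    key = raw_tag.strip()
    if key in ALIASES:
        return ALIASES[key]
    folded = _fold(key)
    for alias, canonical in ALIASES.items():
        if _fold(alias) == folded:
            return canonical
    return None  # unmapped -- route to a human for review
```

The `None` branch matters as much as the lookup: unmapped names should land in front of a person, not silently pass through as new "assets."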

3. There’s Always a Data Silo You Didn’t Know About

No matter how much you plan for “shop floor to top floor” integration, there’s always a hidden data silo lurking somewhere. Sometimes it’s a legacy historian that nobody remembers, or a lab system that exports data to Excel files on a shared drive. Other times, it’s a “shadow IT” solution built by a clever engineer to solve a real problem — and now it’s mission-critical.

At one food & beverage site, we discovered a whole set of environmental monitoring data was being logged to a local PC in a back office. Nobody had included it in the integration plan, but it turned out to be essential for compliance reporting. We had to scramble to bring it into the official data pipeline, which meant extra validation, documentation, and (of course) delays.

Lesson learned: never assume you know where all the data lives. Always ask, “What are we missing?” and expect to be surprised.
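One cheap way to start asking that question is to inventory the shared drives and local PCs themselves. A sketch along those lines — the file extensions are assumptions about where shadow data tends to hide, not an exhaustive list:

```python
import os

def inventory_stray_files(root, extensions=(".csv", ".xls", ".xlsx", ".mdb")):
    """Walk a directory tree and list data files that may be an
    unofficial silo -- logged locally, outside the sanctioned pipeline.
    Returns (path, last-modified time) pairs, most recent first."""
    found = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.lower().endswith(extensions):
                path = os.path.join(dirpath, name)
                found.append((path, os.path.getmtime(path)))
    # Recently touched files first: still-active silos matter most.
    return sorted(found, key=lambda item: item[1], reverse=True)
```

A scan like this found nothing at that food & beverage site, of course — the environmental data was on a PC nobody thought to scan. Tools help; asking people helps more.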

4. Real-Time Isn’t Always Real

Manufacturers love to talk about “real-time visibility.” But what “real-time” actually means is all over the map. For some, it’s sub-second updates; for others, it’s “I get my report by 7am.” I’ve seen systems that claim to be real-time but only refresh every 10 minutes, and others that overload the network trying to push every sensor reading instantly.

One time, we hit a hard technical limit: the system could only handle 1,000 tags in real time. When you have tens of thousands of tags per site, you have to make tough choices about what’s truly critical. And when a dashboard lags or drops data, it’s almost always because someone’s expectation of “real-time” didn’t match reality.

Here’s my advice: define what “real-time” means for each use case, and be honest about the trade-offs. Sometimes “fast enough” is better than “as fast as possible.”
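When you do hit a hard tag limit like the 1,000-tag one above, the trade-off can be made explicit in code. A minimal sketch, assuming each tag has been assigned a priority per use case (lower number = more critical) — the budgeting scheme here is illustrative, not how any particular product works:

```python
def split_by_budget(tags, realtime_budget=1000):
    """Given (tag_name, priority) pairs, keep the most critical tags
    inside the real-time budget and push the rest to slower,
    batched collection."""
    ranked = sorted(tags, key=lambda t: t[1])
    realtime = [name for name, _p in ranked[:realtime_budget]]
    batched = [name for name, _p in ranked[realtime_budget:]]
    return realtime, batched
```

The hard part isn’t the sort; it’s getting the plant to agree on the priorities in the first place.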

5. Integration Is Harder Than Anyone Wants to Admit

Connecting machines, MES, historians, and cloud platforms sounds straightforward on paper. In practice, it’s a maze of protocols (OPC DA/UA, MQTT, REST), middleware (Kepware, Ignition, HighByte, SAP PCo), and security hoops. Even with modern tools, there are always edge cases: devices that don’t speak the right protocol, systems that can’t be upgraded, or data that can’t leave the plant due to compliance.
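Even a small piece of that plumbing repays care. For example, when publishing to an MQTT broker in a Unified Namespace, topic paths are usually built from an ISA-95-style hierarchy — and one unsanitized tag name can break every subscription downstream. A hedged sketch (the segment names are made up for illustration):

```python
def uns_topic(enterprise, site, area, line, tag):
    """Build an ISA-95-style topic path for an MQTT-based namespace.
    MQTT reserves '+' and '#' as wildcards and '/' as the separator,
    so each segment is sanitized before joining."""
    def clean(segment):
        out = segment.strip()
        for reserved in ("/", "+", "#"):
            out = out.replace(reserved, "_")
        return out
    return "/".join(clean(s) for s in (enterprise, site, area, line, tag))
```

It’s a trivial function, but it encodes a decision — where sanitization happens, and what the hierarchy is — that otherwise gets made four different ways by four different teams.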

One big surprise for me was how much time gets eaten up by troubleshooting. Issues like unidirectional architectures (data only flows one way), gaps in end-to-end management (like misconfigured connectors), and lack of ownership for standardization all slow things down. And when you add cybersecurity and compliance, the integration puzzle gets even trickier.

I’ve learned to respect the complexity and not promise “seamless integration,” because there’s always something you didn’t plan for.

Final Thoughts

Industrial data is messy, stubborn, and full of surprises. But that’s also what makes this work interesting. The technical problems are real, but most of the time, the root causes are human — habits, shortcuts, or just the way things have always been done. If you want to make plants smarter and more connected, start by listening, walking the floor, and questioning your own assumptions.

And here’s my unpopular opinion: the best industrial data experts aren’t the ones who know the most about technology. They’re the ones who ask the best questions, spot the gaps, and aren’t afraid to admit when they’re surprised.
