What Makes It So Hard to Measure, Manage, and Interpret Good Data in Supply Chain Operations?
Data quality remains a critical and often overlooked challenge that undermines even sophisticated freight-tech solutions, causing discrepancies and inefficiencies in supply chain management.
Welcome to The Logistics Report, a weekly newsletter that discusses *anything* logistics. This is a space where we dissect market trends, chat with industry thought leaders, highlight supply chain innovation, celebrate startups, and share news nuggets.
Freight operations have received a lot of flak over the years for their reluctance to accept change. This resistance is largely rooted in the industry’s skepticism toward adopting technology, thanks to the ‘why fix it if it ain’t broke’ mentality that stakeholders have cultivated over decades in the business.
While cracks had appeared in such hardened stances by the mid-2010s, the pandemic was the catalyst that upended long-held notions of operational inflexibility, pushing companies to invest in technology en masse as their customers demanded visibility into their often-delayed freight. Real-time visibility and transparency became household terms, repeated across boardrooms to the point of cliché.
However, as companies bought and integrated different freight-tech solutions, they quickly realized their problems were far from over. To start with, data recorded across various solution providers was not interoperable, as there was no single standard governing how data was measured or stored.
Then came the issue with ‘real-time’ data, which is seldom that: data is rarely measured, stored, and transmitted the instant an event happens. And even if data streams could be streamlined and relayed in near real time, there was always the challenge of weeding out noise in the data.
Data quality, or the lack of it, is a real problem that does not get discussed enough. Noise is especially frustrating, as it can creep into systems and erode a company’s ability to glean insights. The freight-tech solutions could be stellar, but if it’s garbage in, it’s garbage out.
First off, companies must understand what constitutes good data and what does not. “Good data is useful data. Is good data always entirely 100% correct? No, it doesn’t have to be. Because you can have consistently incorrect data and still use it to find patterns. On the flip side, you can have entirely correct data, but it could be completely useless,” said Genevieve Shattow, the head of analytics at ThroughPut.ai.
Taking this a step further, good data is not just defined by how accurately it gets measured or stored, but also by its relevance in the context of the problem that needs solving. A data set can be considered suitable in some situations and bad in others. “One of the reasons to always press for more data is because more data allows us to figure out the noise and the signal. More data will also allow us to pick and choose the data relevant to the current context,” said Shattow.
For instance, consider a business where product pricing is crucial, but the captured data is all about manufacturing. While the data could be very detailed and accurate, it will not help management understand market rates or the associated customer demand. This does not mean the manufacturing data has no importance; it is just not relevant until there is access to the more crucial pricing data.
“The importance of data is tied to its situation,” contended Shattow. “We often have situations where data logging relies on someone needing to push a button to input it. In one case, we witnessed a person who always delayed it by two hours. We had no clear idea how long it actually took. But because it was so consistently late, we could either ignore the time it took to input that data, or consider the guy an outlier and look at other parallel and reliable data. And so, just because there is inaccurate data in a dataset, it does not mean it cannot be used.”
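To make that point concrete, here is a minimal sketch, with hypothetical column names and made-up values rather than Shattow’s actual dataset, of how a consistently late logger can be handled: estimate each operator’s typical lag against a parallel, reliable timestamp, then either subtract it out or treat that operator’s entries as outliers.

```python
import pandas as pd

# Hypothetical example: manual log entries compared against a parallel, automated
# timestamp (e.g. a scanner feed). Column names are illustrative, not a real schema.
logs = pd.DataFrame({
    "operator":     ["A", "A", "A", "B", "B", "B"],
    "scanner_time": pd.to_datetime([
        "2024-01-02 08:00", "2024-01-02 09:00", "2024-01-02 10:00",
        "2024-01-02 08:00", "2024-01-02 09:00", "2024-01-02 10:00"]),
    "logged_time":  pd.to_datetime([
        "2024-01-02 08:04", "2024-01-02 09:06", "2024-01-02 10:05",
        "2024-01-02 10:02", "2024-01-02 11:01", "2024-01-02 11:58"]),
})

# Each operator's typical lag between the event and the button push.
logs["lag"] = logs["logged_time"] - logs["scanner_time"]
typical_lag = logs.groupby("operator")["lag"].median()

# A consistent lag (operator B's roughly two hours) is a correctable bias, not useless data.
logs["corrected_time"] = logs["logged_time"] - logs["operator"].map(typical_lag)
print(typical_lag)
```

The design choice mirrors the quote: a stable bias can be measured and subtracted, while an unstable one is grounds for leaning on the parallel, reliable data instead.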
This reflects two realities: one, even great datasets can have noise, and two, data that relies on human input lends itself to inconsistencies. In the end, companies are run primarily by humans. But because we are so used to being surrounded by robots and automated processes, it is natural to overlook that human element and expect a robotic level of perfection, which sets business processes up for failure.
This mentality becomes an issue when it infiltrates top management, complicating matters. “They rely on the company to be almost robot-like because they’re not making the mistakes themselves,” pointed out Shattow. “They’re not on the ground, picking packages or pushing buttons. They might see a column that says ‘rate of mistakes,’ but there’s an inability to understand that data comes from humans. Even sensors have issues. But this idea that people, en masse, perform to certain standards is clearly an issue that needs to be overcome.”
The instant gratification woven into everyday life, from expecting same-day delivery to scrolling through social media, has created a yearning for updates that arrive in real time and on demand. Yet companies rarely need updates in real time; they only need them on demand.
“You need data at the cadence at which you will take action. For example, we have clients who check the dashboard once every morning. They say they need it in real-time, but they only check it once a day at 8 AM. As long as I have the data in by 6 AM, it should be fine. People think Amazon updates orders in real-time, but they don’t. And if data isn’t coming in real-time, it’s not going out in real-time either. The upper limit of how often you should update is based on how often and when they check it.”
Even where data does flow in real time, companies are not equipped to handle it at such high frequency. Take the case of a pharma company that wants to measure and control the environment its drugs are stored in, such as a refrigerator. Asking for alerts on exceptions would be ideal, but if the company demands an alert at every instance of the refrigerator door being opened, it could result in alert fatigue.
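A hedged sketch of what exception-based alerting could look like in that refrigerator example (the thresholds and readings are made up for illustration): alert only when a temperature excursion persists, so a brief door opening does not page anyone.

```python
from datetime import datetime, timedelta

# Hypothetical cold-chain readings: (timestamp, temperature in °C), five minutes apart.
readings = [
    (datetime(2024, 1, 2, 8, 0) + timedelta(minutes=5 * i), temp)
    for i, temp in enumerate([4.1, 4.0, 7.5, 4.2, 4.1, 8.0, 8.3, 8.1, 8.4, 4.3])
]

MAX_TEMP = 6.0                      # acceptable storage ceiling (illustrative)
SUSTAINED = timedelta(minutes=10)   # ignore brief door-open spikes shorter than this

alerts = []
excursion_start = None
alerted = False
for ts, temp in readings:
    if temp > MAX_TEMP:
        if excursion_start is None:
            excursion_start = ts
            alerted = False
        # Alert once per excursion, and only when it persists, not on every blip.
        if not alerted and ts - excursion_start >= SUSTAINED:
            alerts.append((excursion_start, ts, temp))
            alerted = True
    else:
        excursion_start = None

for start, ts, temp in alerts:
    print(f"Alert: above {MAX_TEMP}°C since {start} (now {temp}°C at {ts})")
```

In this toy run, the single spike at 08:10 is ignored as a door opening, while the sustained excursion from 08:25 produces exactly one alert.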
“If the product you’re building is not useful, it isn’t a good product. It’s the same with data; if it’s not useful, it’s not good data,” argued Shattow. “It’s important to update things at a cadence that makes sense, not just for the user, but also for the data. If it takes an hour to clean up the data as it comes in, then updating it every hour isn’t going to work. Maybe every two hours, if the company checks it regularly. People often go to a website and find it hasn’t updated yet, and that’s fine. This likely wouldn’t collapse the system.”
The key to getting good data and making it useful inevitably lies in how companies treat the people they have hired to deal with it. “People need to understand that bad data is not the reflection of the data scientists who run it, or the people who generate it,” explained Shattow. “There seems to be an inherent distrust in the people below, or an expectation that they will just fix everything and make it perfect. But mathematically speaking, errors propagate. If you have uncertainty at the beginning, the only way to decrease that uncertainty is with lots of data. You need someone who understands how to combine data and how to propagate it through to its best intent.”
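To put a rough number on “errors propagate,” the standard back-of-envelope formulas (textbook statistics, not anything specific to ThroughPut.ai) look like this:

```latex
% Combining two independent noisy inputs x and y into z = x + y:
% the uncertainties add in quadrature rather than cancelling out.
\sigma_z = \sqrt{\sigma_x^2 + \sigma_y^2}

% Averaging n independent measurements, each with spread \sigma,
% is what shrinks the uncertainty:
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}
% Halving the uncertainty therefore takes roughly four times as much data.
```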
That said, companies need to be mindful to differentiate between the people who design the data systems and the people who actually end up looking at the data. Data engineers may build great pipelines to capture data, but they often never actually look at it. They rarely ask, “What does this data look like? Does the data coming out resemble the data going in, minus the noise?” A good data engineering team is life-changing for an organization, but the people who really understand the data are usually those with domain-specific expertise, and they need support and collaboration.
A good team pairs a data scientist who examines how the data came in and went out with someone who understands the collection process and knows what data to expect at the other end. Together, they can build a good pipeline grounded in domain know-how, which can feed credible data intelligence into operations.
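As an illustration of the “does the data coming out resemble the data going in” check, here is a minimal sketch of the kind of comparison such a team might run after each pipeline execution; the function, `shipment_id`, and `transit_days` are hypothetical names, not a real system.

```python
import pandas as pd

def compare_pipeline_ends(raw: pd.DataFrame, cleaned: pd.DataFrame,
                          key: str = "shipment_id") -> dict:
    """Sanity-check that the cleaned output still resembles the raw input, minus the noise."""
    return {
        # How many records did cleaning drop? A large share hints at over-aggressive filtering.
        "rows_dropped_pct": 100 * (1 - len(cleaned) / max(len(raw), 1)),
        # Did records appear downstream that never existed upstream?
        "unknown_keys": int((~cleaned[key].isin(raw[key])).sum()),
        # Did the overall shape of a key metric shift beyond noise removal?
        "transit_days_mean_raw": float(raw["transit_days"].mean()),
        "transit_days_mean_cleaned": float(cleaned["transit_days"].mean()),
    }

# Usage sketch: run after every pipeline execution and have a domain expert review
# drift in these numbers over time.
# report = compare_pipeline_ends(raw_df, cleaned_df)
```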
However, putting this together takes time and effort. Companies often chase buzzwords under market pressure, which dilutes what actually makes data valuable and impactful for their specific operational needs. For instance, it would have taken compelling intelligence and strong internal backing for operations or supply chain teams to raise their hands and push back on top management’s attempts to scale up based on pandemic demand. That was clearly not the case at most companies, as reflected by the bullwhip effect that has left businesses struggling with bloated inventories post-pandemic.
“There’s a famous saying that ‘all models are wrong.’ The quality of the model can vary greatly, but it’s going to rely on the data, and more importantly, the way the data gets handled,” said Shattow. “It’s not just about the data itself, but about how you manage and interpret it that ultimately determines its effectiveness and accuracy in decision-making processes.”
Stuff I’m Involved In
I wrote a report for Holocene on the impact global trade policy changes and evolving socio-political scenarios have on supply chain health and freight movement. The report discusses complexities that companies face with understanding such changes, and the challenges with finding and leveraging localized data that can give them global supply chain context. Check this link to download the report.
I had an interesting conversation with Paul Travers, the President and CEO of Vuzix, on the In Transit podcast, where we discussed the value that augmented reality (AR) glasses bring to the warehousing segment. Travers explained how AR glasses can speed up daily warehousing tasks like picking and sorting, enhancing operational efficiency and reducing the time it takes to train employees on the job. Check this link for the episode.
The Week in Snippets
Rising interest rates are halting warehouse construction in the US, a shift from the building boom driven by e-commerce growth during the pandemic. This downturn, marked by a significant drop in construction starts and industrial real estate sales, is influenced by higher debt costs and a slowdown in leasing demand, though the sector remains robust with low vacancies and continued e-commerce activity.
The Biden administration's target of achieving 30 GW of offshore wind energy by 2030 faces significant challenges due to rising costs, supply chain issues, and slow permitting processes. Additional hurdles include local content requirements and construction barriers, leading experts to predict a substantial downward revision in offshore wind capacity expectations.
As US warships head to the Gulf of Aden, major shipping lines like OOCL and Maersk are halting or rerouting Red Sea transits due to security concerns, significantly impacting global shipping routes. This strategic shift, influenced by recent maritime security incidents and the involvement of international navies, underscores the escalating tensions in the region and their ripple effect on global trade and maritime operations.
Union Pacific and BNSF railroads are urging U.S. Customs and Border Protection to reopen key rail bridge crossings in Texas, closed due to increased immigration activity. The closures are significantly disrupting cross-border trade, affecting a substantial portion of Union Pacific's business and causing widespread supply chain impacts, including congestion and delays in the movement of critical goods.
Quotable
“My sense is that the next bull market pricing cycle—which will happen sometime—will be much more gradual than the prior two cycles that started in Q3 2017 and June/July 2020. As such, don’t expect the market to change overnight and, instead, plan for market dynamics to evolve more slowly as we head further into 2024.”
- Jason Miller, professor of supply chain at Michigan State University, commenting on where the dry van truckload market is headed next year
Like what you read? Do consider subscribing! Have something you’d like me to cover? Reach out at vishnu@storskip.com