FRAMINGHAM, 23 SEPTEMBER 2011 - There’s nothing more damaging to a fragile IT-business bond than the slow-motion meltdown of a new mission-critical application—nothing, that is, except the inability of IT to stop it.
That’s the predicament Trever Scott faced. Four years ago, as North American director of IT for Dole Fresh Fruit, he led an effort to stabilize a promising new tracking and distribution application called TranShip.
The problems started almost immediately after implementation and persisted. Response time was slow during peak hours. TranShip and the scanners that used it timed out several times a day. The system’s database crashed, and at one point the application was unavailable for three days.
The business operations and logistics staff who relied on TranShip to manage data from farms, packaging, distribution and ripening facilities were frustrated with the performance degradation. Users at Dole’s port facilities and warehouses had to track products with pen and paper. Day-to-day order planning was extremely difficult and distribution efficiency—vital for moving fresh produce—plummeted. When the system came back online, users had to re-enter data into multiple systems, resulting in errors and inconsistencies that affected operations, sales and finance. By mid-2008, Scott says, confidence in IT had hit an all-time low.
Scott’s applications group and infrastructure-support team worked overtime trying different fixes. Nothing worked. “Although our hearts were in the right place, we did not have confidence we fully understood what was causing stability issues,” Scott says. “We were working very long hours. We were tired. And we had lost hope that we could really change the direction of the course we were on.”
Finding the Needle in the Haystack
The problem was Dole’s complex architecture—six major applications stitched together with middleware interacting via multiple real-time and batch interfaces 24 hours a day—and IT’s complete inability to see into its various tiers.
“There were so many [potential] points of failure in the system. Problems could be related to communications, infrastructure, data errors, user error, or some other unknown phenomenon,” Scott explains. “We had no real tools to monitor application, server or network health.”
Scott and his team began to see a light at the end of the tunnel last year. A pilot run of Compuware’s (CPWR) Vantage application-performance management tools allowed Dole to monitor TranShip’s database, application and server performance over four weeks and finally pinpoint the source of its problems. System response time was slow because an incorrect network traffic setting had given the mission-critical distribution application a lower bandwidth priority than e-mail and web searches.
IT added memory, optimized its usage, and raised the priority of TranShip transactions. Response time and uptime improved, time-outs and slowdowns decreased, and distribution efficiency and data integrity were restored.