The answer to the problem, says Fisher, is to layer structured data on top of free-form listing data, so the site can determine that all those iPhones are the same. "That allows us to understand pricing and supply and demand, and identify deals and give better recommendations and better search results and make onboarding inventories much easier."
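As a rough illustration of the idea, and not a description of eBay's actual system, the sketch below maps free-form listing titles onto a structured product key so that differently worded listings for the same item can be grouped and priced together. The regular expressions, the `to_product_key` and `median_price_by_product` helpers, and the sample listings are all hypothetical.

```python
import re
from collections import defaultdict
from statistics import median

# Hypothetical patterns for illustration only; a real catalog would need
# far more than two regular expressions.
MODEL_RE = re.compile(r"iphone\s*(\d+s?)(\s*plus)?", re.IGNORECASE)
STORAGE_RE = re.compile(r"(\d+)\s*gb", re.IGNORECASE)


def to_product_key(title):
    """Map a free-form title to a (brand, model, storage_gb) key, or None."""
    model = MODEL_RE.search(title)
    storage = STORAGE_RE.search(title)
    if not model or not storage:
        return None
    name = model.group(1).lower() + (" plus" if model.group(2) else "")
    return ("apple", "iphone " + name, int(storage.group(1)))


def median_price_by_product(listings):
    """Group (title, price) pairs by structured key and take the median price."""
    groups = defaultdict(list)
    for title, price in listings:
        key = to_product_key(title)
        if key is not None:
            groups[key].append(price)
    return {key: median(prices) for key, prices in groups.items()}


# Made-up sample data: the first two titles describe the same product.
listings = [
    ("Apple iPhone 6 16GB Space Gray - Unlocked", 420.0),
    ("iphone6 16 gb GOLD great condition!!", 380.0),
    ("NEW iPhone 6 Plus 64GB sealed", 650.0),
]

if __name__ == "__main__":
    for key, price in median_price_by_product(listings).items():
        print(key, price)
```

Once listings share a key, questions about pricing, supply, and deals reduce to ordinary aggregation over structured records.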
Doing that on top of the diversity of eBay's inventory has been an "interesting technology challenge," says Fisher, to which eBay has applied machine learning and deep learning. How important is this initiative to the business? Important enough that the current generation of search technology required a "multi-hundred-million-dollar investment" to deliver the scalability and reliability that eBay needed. No existing search solution, open source or commercial, came close to filling the bill.
Open source engagement
Not surprisingly, eBay has no inclination to open-source this huge proprietary search effort -- and besides, it's specific to the way eBay works. But like many Internet giants, eBay regularly contributes open source projects to the community.
One recent, powerful example is Apache Kylin, a distributed analytics engine that provides a SQL interface and OLAP (online analytical processing) on top of Hadoop. "We have a ton of data, we do a ton of analytics. We're an extremely data-driven company, and we've been migrating from more traditional data warehousing technologies over to Hadoop -- but we still wanted to be able to leverage existing BI tools," Fisher says. eBay created Kylin for that purpose and ultimately handed the code to the Apache Software Foundation.
Fisher notes that eBay has a very large Hadoop infrastructure. "In the consumer Internet world, little tiny changes can actually make a huge difference, and we do a ton of A/B testing" using Hadoop analytics to interpret the results, he says. The company has been making big investments to move from large batch jobs to near-real-time to help "make sense of this enormous amount of extremely interesting data." eBay has also plunged into the Hadoop ecosystem, leveraging Storm, Kafka, Spark, and more.
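To see why small changes need careful analysis, consider one common way of reading an A/B test: a two-proportion z-test on conversion rates. The article does not say which statistical methods eBay uses, so this is only a generic sketch; the function name and the traffic figures are invented for illustration.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Invented numbers: variant B lifts conversion from 1.0% to 1.1%.
    z, p = two_proportion_z_test(1000, 100_000, 1100, 100_000)
    print(f"z = {z:.2f}, p = {p:.3f}")
```

A tenth of a percentage point is invisible on a small sample, which is one reason tests like this are run over very large volumes of traffic.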
"We take advantage of open source, and we think it makes sense to also contribute to open source to be good members of the technical community," says Fisher. "It helps build our reputation as a technology community, and people all over the world, even our competitors, are using technology that came from us."
Wrestling with OpenStack
eBay's use of one open source technology in particular is the stuff of legend: OpenStack. Nearly four years ago, InfoWorld broke the story that eBay was using OpenStack to manage a high-volume dev, test, and experimentation environment. Today, eBay is one of the largest OpenStack users in the world.