"We did these PoCs very quickly, they were successful," Knott says, adding that the bank has a number of other projects lined up once the initial projects are fully up and running. "We are literally a few weeks away from going live with the five first set of use cases and then hot on the back of that we will say to all the other hundred people or so that have been waiting: 'you can now start deploying'."
From Hadoop to BigQuery and CloudML
The bank had previously run all of its analytics on-premises, progressing over the years from SQL to traditional data warehouses before investing in Hadoop around 2011. "We had built what most people had built: a set of big data and analytics capabilities using various parts of the Hadoop ecosystem." This involved a mixture of open source and commercial technologies, "which we had selected and then integrated together to basically build ourselves data lakes and analytics clusters and all that kind of stuff".
However, the Hadoop systems had limitations in scalability and flexibility.
"We got some value out of that but to be honest we found it hard to keep on top of, just hard to build skills at the pace required to integrate new technologies," Knott says.
"No matter how hard we ran there is always something new coming in that we wanted to get access to, but we couldn't get there quite fast enough to have really finished deploying what we were deploying previously.
"So it was hard to manage, hard to keep on top of, and also hard to scale. We had reasonable success but we were having these challenges."
The bank's aim was to access machine learning capabilities without having to run the systems on-premises.
On Google Cloud, HSBC is using a variety of tools: Bigtable and BigQuery for data analytics, Dataflow and Pub/Sub for data processing and event handling, and a range of Google's machine learning APIs, including one for Data Loss Prevention.
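The Data Loss Prevention API scans text for sensitive values and masks them before the data lands in analytics storage. A minimal, stdlib-only sketch of the underlying idea, with hypothetical patterns and labels (the real API's infoType detectors are far more sophisticated than these regexes):

```python
import re

# Illustrative only: detect sensitive tokens in free text and mask
# them before the text is stored or analysed. The pattern names and
# regexes here are invented for the sketch, not taken from the API.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    # 13-16 digits, optionally separated by spaces or dashes
    "CARD": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
}

def redact(text: str) -> str:
    """Replace each detected sensitive value with its type label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

For example, `redact("Contact jane@example.com now")` returns `"Contact [EMAIL] now"`. In a pipeline like the one described above, a pass of this kind would sit between ingestion (Pub/Sub, Dataflow) and the analytics store (BigQuery).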
"Around last year we started a conversation with all of the cloud providers to say 'show us what you have got'," Knott says, "and after some conversations we decided to work with Google on a series of PoCs, to answer three questions which were: if we bring some big data use cases to you, will they work, can we do the things we are trying to do? [Secondly] are they economic - can we do them at least the same price but hopefully a cheaper price than beforehand. Thirdly, is it easier, basically, which was really the big one."
Knott says that Google delivered on all three counts.