Cloudera has collaborated with Microsoft to reduce the burden on application developers who use Spark. Together with other open source contributors, the two companies have built Livy, a new open source, Apache-licensed, REST-based Spark service that is still in early alpha development.
Livy provides an easy way for applications to interface with Spark, submit jobs and programmatically retrieve results. At its core, Livy is a REST server for submitting, running, and managing Spark jobs and contexts. Its client API enables fine-grained Spark job submission and retrieval of results, either synchronously or asynchronously. Livy exposes Spark as a multi-tenant service with session isolation, security and user impersonation, so clients can consume Spark without having to worry about deployment, configuration or monitoring.
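The difference between synchronous and asynchronous retrieval can be illustrated with a minimal sketch. It is not Livy's client API: the `fetch_status` callable, the helper names and the state names are all hypothetical, standing in for whatever call a real client would make to ask the server about a submitted job.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Dict

# Assumption: these state names are illustrative, not taken from the Livy API.
TERMINAL_STATES = {"available", "error", "cancelled"}


def wait_for_result(
    fetch_status: Callable[[], Dict[str, Any]],
    interval_s: float = 1.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Synchronous retrieval: block until the job reports a terminal state."""
    for _ in range(max_polls):
        status = fetch_status()  # hypothetical call that asks the server for job status
        if status.get("state") in TERMINAL_STATES:
            return status
        sleep(interval_s)
    raise TimeoutError(f"no result after {max_polls} polls")


def wait_for_result_async(
    executor: ThreadPoolExecutor,
    fetch_status: Callable[[], Dict[str, Any]],
    **kwargs: Any,
) -> Future:
    """Asynchronous retrieval: poll on a worker thread and return a Future."""
    return executor.submit(wait_for_result, fetch_status, **kwargs)
```

A caller that needs the result immediately uses `wait_for_result`; a caller that has other work to do takes the `Future` from `wait_for_result_async` and reads `.result()` later.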
Clients do not need to install or configure Spark to get started. All they need is a lightweight client that talks to an HTTP endpoint.
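To give a sense of how little such a client needs, here is a minimal sketch that uses only the Python standard library. The server address, the endpoint paths (`/sessions`, `/sessions/0/statements`) and the payload fields (`kind`, `code`) are assumptions made for illustration; the API of an alpha-stage server may differ, so check them against the server you are actually running.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

# Assumption: placeholder address for a running Livy server.
LIVY_URL = "http://localhost:8998"


def build_request(
    path: str,
    payload: Optional[Dict[str, Any]] = None,
    base_url: str = LIVY_URL,
) -> urllib.request.Request:
    """Build a JSON request: POST when a payload is given, GET otherwise."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST" if payload is not None else "GET",
    )


def send(request: urllib.request.Request) -> Dict[str, Any]:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Assumed endpoints and fields: start a session, then run code in session 0.
create_session = build_request("/sessions", {"kind": "pyspark"})
run_statement = build_request("/sessions/0/statements", {"code": "1 + 1"})
```

Building the requests touches no network; only `send` contacts the server. Because the exchange is plain JSON over HTTP, the same two requests could be issued from any language that has an HTTP library.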
Applications can build on REST-based client APIs in Java, Scala and Python for fine-grained Spark job submission, result retrieval and management of SparkContexts (the Scala and Python client APIs are still under development). This lets applications written in diverse frameworks, such as Django for Python or Play for Scala and Java, invoke Spark. Moreover, because Livy is REST-based, with a little work you can also use it from applications written in languages such as Node.js or Go.
Livy also makes it easy to integrate Spark into service-oriented or microservices-based architectures, whose components primarily interact through REST.
“Microsoft is focused on simplifying big data and advanced analytics to make technologies like Apache Hadoop and Spark available for everybody,” said Tiffany Wissner, senior director of Data Platform Marketing at Microsoft. “The collaboration on Project Livy was able to make interacting with Spark easier for developers through a REST web service and able to make Spark enterprise-ready as a robust back-end for running interactive notebooks.”
“Spark gives you fast big data processing with a general purpose flexible API. We see a natural tendency among our customers and partners to want to leverage Spark’s capabilities from client applications that can easily interface with Spark, and Livy makes that possible,” said Anand Iyer, senior product manager at Cloudera. “Livy will open Spark to new use cases, and we are hoping it attracts a community of developers that will not only build applications on top of Livy, but also contribute to it, help shape its API and enhance its functionality. It is still a very nascent project, and hence any contribution will have tremendous impact.”