Overview
Databricks is the leading platform for Apache Spark. It makes it easy to bring up and manage Apache Spark clusters and run them at scale.
Fire Insights now integrates deeply with Databricks. The goal is to let users build workflows in Fire Insights with its 280+ processors and run them seamlessly on Databricks.
Integration with Databricks enables users to:
View their Databricks File System (DBFS) in Fire Insights (a REST sketch follows this list)
View their Databricks databases and tables in Fire Insights
View their Databricks clusters in Fire Insights, and start and stop them
Build workflows in Fire Insights on their Databricks data and run them on their Databricks clusters
View the results of job execution in Fire Insights
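As a concrete illustration of the first item, the Databricks File System can be listed through the workspace REST API. This is a minimal sketch, not Fire Insights code; the workspace URL and access token are placeholders you would substitute.

```python
import requests

# Placeholder values; substitute your own workspace URL and personal access token.
DATABRICKS_HOST = "https://<your-workspace>.cloud.databricks.com"
HEADERS = {"Authorization": "Bearer <personal-access-token>"}

def list_dbfs(path="/"):
    """List files and directories at a DBFS path via GET /api/2.0/dbfs/list."""
    resp = requests.get(
        f"{DATABRICKS_HOST}/api/2.0/dbfs/list",
        headers=HEADERS,
        params={"path": path},
    )
    resp.raise_for_status()
    return resp.json().get("files", [])

for entry in list_dbfs("/"):
    kind = "dir " if entry["is_dir"] else "file"
    print(kind, entry["path"])
```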
Architecture
Below is the architecture diagram of the Fire Insights integration with Databricks.
Fire Insights interacts with Databricks via REST APIs and over JDBC. The JDBC connection is used to fetch the metadata of the tables in Databricks.
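To illustrate the metadata path, the sketch below uses the databricks-sql-connector Python package as a stand-in for a JDBC driver, since both speak to the same SQL endpoint. The hostname, HTTP path, token, and the customers table are placeholders; this is only an approximation of what Fire Insights does internally.

```python
from databricks import sql  # pip install databricks-sql-connector

# Placeholder connection details; use your workspace hostname, HTTP path, and token.
with sql.connect(
    server_hostname="<your-workspace>.cloud.databricks.com",
    http_path="/sql/1.0/warehouses/<warehouse-id>",
    access_token="<personal-access-token>",
) as conn:
    with conn.cursor() as cursor:
        # List the tables in a database, then describe one of them.
        cursor.execute("SHOW TABLES IN default")
        for db, table, _is_temp in cursor.fetchall():
            print(f"{db}.{table}")

        # "customers" is a hypothetical table used for illustration.
        cursor.execute("DESCRIBE TABLE default.customers")
        for col_name, col_type, _comment in cursor.fetchall():
            print(col_name, col_type)
```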
Viewing Databricks clusters in Fire Insights
Users can create a connection in Fire Insights to their Databricks workspace. Once the connection is set up, they can view their Databricks clusters.
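For reference, listing and starting clusters maps onto two Databricks REST endpoints. The sketch below is illustrative, with placeholder credentials; it is not the Fire Insights connection code.

```python
import requests

DATABRICKS_HOST = "https://<your-workspace>.cloud.databricks.com"  # placeholder
HEADERS = {"Authorization": "Bearer <personal-access-token>"}      # placeholder

def list_clusters():
    """Return the clusters visible to this token, with their current state."""
    resp = requests.get(f"{DATABRICKS_HOST}/api/2.0/clusters/list", headers=HEADERS)
    resp.raise_for_status()
    return resp.json().get("clusters", [])

def start_cluster(cluster_id):
    """Start a terminated cluster (POST /api/2.0/clusters/start)."""
    resp = requests.post(
        f"{DATABRICKS_HOST}/api/2.0/clusters/start",
        headers=HEADERS,
        json={"cluster_id": cluster_id},
    )
    resp.raise_for_status()

for c in list_clusters():
    print(c["cluster_id"], c["cluster_name"], c["state"])
```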
Reading from and Writing to Databricks tables
Below is a simple workflow that reads data from a Databricks table and writes the results to another Databricks table. Any number of transforms can be added between the read and write steps.
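For orientation, a workflow of this shape boils down to a few lines of Spark. The PySpark sketch below is a hand-written equivalent, not the code Fire Insights generates; the table names and the transforms are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# On a Databricks cluster, `spark` is already provided; this line is for standalone runs.
spark = SparkSession.builder.appName("read-transform-write").getOrCreate()

# Hypothetical source table, for illustration only.
df = spark.read.table("sales.orders_raw")

# Any number of transforms can sit between the read and the write.
cleaned = (
    df.filter(F.col("amount") > 0)
      .withColumn("order_date", F.to_date("order_ts"))
)

# Hypothetical target table.
cleaned.write.mode("overwrite").saveAsTable("sales.orders_clean")
```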
Executing the Job on Databricks
When the workflow is executed, Fire Insights submits the job to the Databricks cluster. The results of the execution are then visible in Fire Insights.
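One way to submit such a job programmatically is the Databricks Jobs API's one-time runs/submit endpoint, polled until the run reaches a terminal state. This sketch only approximates the submission flow; the cluster ID and notebook path are placeholders, and the mechanism Fire Insights actually uses may differ.

```python
import time
import requests

DATABRICKS_HOST = "https://<your-workspace>.cloud.databricks.com"  # placeholder
HEADERS = {"Authorization": "Bearer <personal-access-token>"}      # placeholder

# Submit a one-time run against an existing cluster (POST /api/2.1/jobs/runs/submit).
resp = requests.post(
    f"{DATABRICKS_HOST}/api/2.1/jobs/runs/submit",
    headers=HEADERS,
    json={
        "run_name": "fire-insights-style-job",
        "tasks": [{
            "task_key": "workflow",
            "existing_cluster_id": "<cluster-id>",  # placeholder
            "notebook_task": {"notebook_path": "/Workspace/path/to/notebook"},
        }],
    },
)
resp.raise_for_status()
run_id = resp.json()["run_id"]

# Poll the run until it reaches a terminal state, then report the result.
while True:
    state = requests.get(
        f"{DATABRICKS_HOST}/api/2.1/jobs/runs/get",
        headers=HEADERS,
        params={"run_id": run_id},
    ).json()["state"]
    if state["life_cycle_state"] in ("TERMINATED", "SKIPPED", "INTERNAL_ERROR"):
        print("Result:", state.get("result_state"))
        break
    time.sleep(10)
```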
Conclusion
In this blog we looked at the details of the Fire Insights integration with Databricks. More details of the integration are available here: