Connect to a cluster
This recipe uses Databricks Connect to run predefined Python or SQL code on a shared cluster that the user selects through a UI input.
Code snippet
app.py
import os

import streamlit as st
from databricks.connect import DatabricksSession


def connect_to_cluster(cluster_id: str):
    # Open a Databricks Connect session against the given cluster.
    # DATABRICKS_HOST is read from the app environment.
    return DatabricksSession.builder.remote(
        host=os.getenv("DATABRICKS_HOST"),
        cluster_id=cluster_id,
    ).getOrCreate()


cluster_id = st.text_input(
    "Specify cluster id:",
    placeholder="0709-132523-cnhxf2p6",
    help="Copy a shared Compute [cluster ID](https://docs.databricks.com/en/workspace/workspace-details.html#cluster-url-and-id) to connect to.",
)

if cluster_id:
    spark = connect_to_cluster(cluster_id)
    st.success("Successfully connected to Spark:", icon="✅")
    session_info = {
        "App Name": spark.conf.get("spark.app.name", "Unknown"),
        "Master URL": spark.conf.get("spark.master", "Unknown"),
    }
    st.json(session_info)

    # SQL: the apostrophe inside the SQL string literal is escaped with a backslash
    query = r"SELECT 'I\'m a stellar cook!' AS message"
    sql_output = spark.sql(query).toPandas()
    st.dataframe(sql_output)

    # Python: build a 10-row DataFrame on the cluster and display it
    result = spark.range(10).toPandas()
    st.dataframe(result)
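Because the cluster ID arrives as free text, it may be worth checking its shape before opening a session. The following is a minimal sketch, assuming cluster IDs follow the pattern of the placeholder above (four digits, six digits, then an alphanumeric suffix). Both `looks_like_cluster_id` and the pattern itself are hypothetical, inferred from the example ID rather than taken from a documented contract.

```python
import re

# Assumed shape, inferred from the placeholder "0709-132523-cnhxf2p6":
# four digits, six digits, then a lowercase alphanumeric suffix.
CLUSTER_ID_PATTERN = re.compile(r"\d{4}-\d{6}-[a-z0-9]+")


def looks_like_cluster_id(value: str) -> bool:
    """Return True if the stripped input matches the assumed cluster ID shape."""
    return CLUSTER_ID_PATTERN.fullmatch(value.strip()) is not None


if __name__ == "__main__":
    print(looks_like_cluster_id("0709-132523-cnhxf2p6"))
    print(looks_like_cluster_id("my cluster"))
```

In the app, the check could guard the connection step, for example by showing `st.warning(...)` instead of connecting when the input does not match.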
Info: You can also connect to serverless compute using Databricks Connect.
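As a rough illustration of the serverless option, the sketch below falls back to serverless compute when no cluster ID is given. It assumes a `databricks-connect` release that provides `DatabricksSession.builder.serverless()` and that authentication is picked up from the app environment; `get_spark` is a hypothetical helper name, not part of the recipe.

```python
import os


def get_spark(cluster_id: str = ""):
    """Return a Databricks Connect session for a cluster, or serverless if no ID is given."""
    # Imported lazily so this module loads even where databricks-connect is absent.
    from databricks.connect import DatabricksSession

    builder = DatabricksSession.builder
    if cluster_id:
        # Classic compute: the same builder call the recipe uses for a shared cluster.
        return builder.remote(
            host=os.getenv("DATABRICKS_HOST"),
            cluster_id=cluster_id,
        ).getOrCreate()
    # Serverless compute (assumption: the installed version supports serverless()).
    return builder.serverless(True).getOrCreate()
```

Calling `get_spark()` with no argument requests a serverless session, while `get_spark(cluster_id)` keeps the cluster-based behavior.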
Resources
Permissions
Your app service principal needs the following permissions:
- CAN ATTACH TO permission on the cluster
See Compute permissions for more information.
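One way to grant this permission is through the Permissions REST API. The sketch below only builds the request and does not send it. It assumes the `PATCH /api/2.0/permissions/clusters/{cluster_id}` endpoint and that the service principal is identified by its application ID; `build_attach_request` and the all-zero application ID are hypothetical placeholders.

```python
import json


def build_attach_request(cluster_id: str, service_principal_app_id: str) -> tuple[str, str]:
    """Return the (path, JSON body) for granting CAN ATTACH TO on a cluster."""
    # Assumed endpoint for updating cluster permissions.
    path = f"/api/2.0/permissions/clusters/{cluster_id}"
    body = {
        "access_control_list": [
            {
                # Assumption: the API expects the service principal's application ID here.
                "service_principal_name": service_principal_app_id,
                "permission_level": "CAN_ATTACH_TO",
            }
        ]
    }
    return path, json.dumps(body)


if __name__ == "__main__":
    path, body = build_attach_request(
        "0709-132523-cnhxf2p6",
        "00000000-0000-0000-0000-000000000000",  # placeholder application ID
    )
    print("PATCH", path)
    print(body)
```

The same grant can also be made in the workspace UI from the cluster's permissions dialog, which avoids scripting altogether.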
Dependencies
- Databricks Connect - databricks-connect
- Databricks SDK for Python - databricks-sdk
- Streamlit - streamlit
requirements.txt
databricks-connect
databricks-sdk
streamlit