Connect to a cluster

This recipe uses Databricks Connect to run pre-defined Python or SQL code on a shared cluster, with parameters supplied through UI inputs.

Code snippet

app.py
import os
from databricks.connect import DatabricksSession

# ID of the shared cluster to attach to; replace with your own cluster's ID
cluster_id = "0709-132523-cnhxf2p6"

spark = DatabricksSession.builder.remote(
    host=os.getenv("DATABRICKS_HOST"),
    cluster_id=cluster_id,
).getOrCreate()

# SQL operations example
a = "(VALUES (1, 'A1'), (2, 'A2'), (3, 'A3')) AS a(id, value)"
b = "(VALUES (2, 'B1'), (3, 'B2'), (4, 'B3')) AS b(id, value)"

# Inner join example
query = f"SELECT a.id, a.value AS value_a, b.value AS value_b FROM {a} INNER JOIN {b} ON a.id = b.id"
result = spark.sql(query).toPandas()
print(result)

# Generate sequence
result = spark.range(10).toPandas()
print(result)
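
The snippet above runs a fixed inner join. Since the recipe is about driving pre-defined code from UI inputs, one possible way to wire in a UI selection is sketched below. The names `ALLOWED_JOINS` and `build_join_query` are hypothetical and not part of the original recipe. The sketch assumes the UI passes a short string such as "inner" or "left", and it only accepts values from a fixed allowlist so that arbitrary text never reaches the SQL string.

```python
# Hypothetical helper (not from the original recipe): maps a UI selection
# to one of a fixed set of pre-defined join queries.
A = "(VALUES (1, 'A1'), (2, 'A2'), (3, 'A3')) AS a(id, value)"
B = "(VALUES (2, 'B1'), (3, 'B2'), (4, 'B3')) AS b(id, value)"

# Allowlist of join types the UI may request (UI value -> SQL keyword).
ALLOWED_JOINS = {
    "inner": "INNER JOIN",
    "left": "LEFT JOIN",
    "right": "RIGHT JOIN",
    "full": "FULL OUTER JOIN",
}


def build_join_query(join_type: str) -> str:
    """Return the pre-defined join query for a UI-selected join type."""
    try:
        join_sql = ALLOWED_JOINS[join_type]
    except KeyError:
        # Reject anything outside the allowlist instead of interpolating it.
        raise ValueError(f"Unsupported join type: {join_type!r}") from None
    return (
        "SELECT a.id, a.value AS value_a, b.value AS value_b "
        f"FROM {A} {join_sql} {B} ON a.id = b.id"
    )
```

In an app, a callback attached to a dropdown could pass its selected value to `build_join_query` and hand the returned string to `spark.sql(...).toPandas()`, as in the snippet above.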

Resources

Permissions

Your app service principal needs the following permissions:

  • CAN ATTACH TO permission on the cluster

See Compute permissions for more information.
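
The permission can be granted in the cluster's permissions dialog. As a rough sketch of the equivalent REST request, assuming the Databricks Permissions API (`PATCH /api/2.0/permissions/clusters/{cluster_id}`), the function below only builds the request path and JSON body and does not send anything. The function name and the all-zero service principal ID in the usage example are hypothetical placeholders; check the field names against the Permissions API reference before relying on them.

```python
import json


def cluster_permission_request(cluster_id: str, service_principal_id: str) -> tuple[str, str]:
    """Build the path and JSON body for granting CAN ATTACH TO on a cluster.

    Assumption: the request is sent with PATCH, which adds to the existing
    access control list rather than replacing it.
    """
    path = f"/api/2.0/permissions/clusters/{cluster_id}"
    body = {
        "access_control_list": [
            {
                # Assumed to be the application ID of the app service principal.
                "service_principal_name": service_principal_id,
                "permission_level": "CAN_ATTACH_TO",
            }
        ]
    }
    return path, json.dumps(body)


# Usage with placeholder values; the service principal ID is made up.
path, body = cluster_permission_request(
    "0709-132523-cnhxf2p6", "00000000-0000-0000-0000-000000000000"
)
print(path)
print(body)
```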

Dependencies

requirements.txt
databricks-connect
databricks-sdk
dash