  1. Overview - Spark 4.0.1 Documentation

    Spark Connect is a client-server architecture, introduced in Spark 3.4, that decouples client applications from the Spark cluster and allows remote connectivity to Spark clusters.

  2. Apache Spark™ - Unified Engine for large-scale data analytics

    Apache Spark is a multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters.

  3. Application Development with Spark Connect

    In Apache Spark 3.4, Spark Connect introduced a decoupled client-server architecture that allows remote connectivity to Spark clusters using the DataFrame API and unresolved logical plans as the …

  4. Spark Connect | Apache Spark

    This page explains the Spark Connect architecture, the benefits of Spark Connect, and how to upgrade to Spark Connect. Let’s start by exploring the architecture of Spark Connect at a high level.

  5. RDD Programming Guide - Spark 4.0.0 Documentation

    Spark supports two types of shared variables: broadcast variables, which can be used to cache a value in memory on all nodes, and accumulators, which are variables that are only “added” to, such as …

  6. Documentation | Apache Spark

    Apache Spark™ Documentation. Setup instructions, programming guides, and other documentation are available for each stable version of Spark below: Spark …

  7. Spark Connect Overview - Spark 3.5.6 Documentation

    In Apache Spark 3.4, Spark Connect introduced a decoupled client-server architecture that allows remote connectivity to Spark clusters using the DataFrame API and unresolved logical plans as the …

  8. Cluster Mode Overview - Spark 4.0.1 Documentation

    This document gives a short overview of how Spark runs on clusters, to make it easier to understand the components involved. Read through the application submission guide to learn about launching …

  9. Distributed SQL Engine - Spark 4.0.1 Documentation

    Spark SQL can also act as a distributed query engine using its JDBC/ODBC or command-line interface. In this mode, end-users or applications can interact with Spark SQL directly to run SQL queries, …

  10. Structured Streaming Programming Guide - Spark 4.0.1 Documentation

    Structured Streaming is a scalable and fault-tolerant stream processing engine built on the Spark SQL engine. You can express your streaming computation the same way you would express a batch …