Databricks Open-Sources Apache Spark Declarative ETL Framework, Boosts Pipeline Builds by 90%

Alfred Lee 1 mo ago

Databricks has announced that it is open-sourcing its core declarative ETL (extract, transform, load) framework under the name Apache Spark Declarative Pipelines. The announcement came at the company's annual Data + AI Summit in San Francisco on June 11, 2025. The framework promises 90% faster pipeline builds, letting engineers focus on outcomes rather than intricate coding details.

The newly open-sourced framework allows data engineers to define their pipelines using familiar languages like SQL and Python. Instead of manually coding each step, users describe what the pipeline should produce, and Apache Spark handles the execution. This declarative approach simplifies both batch and streaming ETL, which makes pipeline development accessible to a broader range of professionals.
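To make the declarative idea concrete, here is a minimal toy sketch in plain Python. It is not the Apache Spark Declarative Pipelines API; the `Pipeline` class, the `table` decorator, and the dataset names are all hypothetical and exist only for illustration. Each function declares a dataset, its parameter names declare which datasets it reads, and the runner works out the execution order.

```python
import inspect
from graphlib import TopologicalSorter


class Pipeline:
    """Toy declarative pipeline (hypothetical, not the Spark API)."""

    def __init__(self):
        self._defs = {}

    def table(self, fn):
        # Register a dataset definition under the function's name.
        self._defs[fn.__name__] = fn
        return fn

    def run(self):
        # A dataset depends on every dataset named in its parameter list.
        graph = {
            name: set(inspect.signature(fn).parameters)
            for name, fn in self._defs.items()
        }
        results = {}
        # static_order() yields each dataset after all of its dependencies.
        for name in TopologicalSorter(graph).static_order():
            fn = self._defs[name]
            args = [results[dep] for dep in inspect.signature(fn).parameters]
            results[name] = fn(*args)
        return results


pipeline = Pipeline()


# Definitions are deliberately out of order: the runner decides what runs first.
@pipeline.table
def clean_orders(raw_orders):
    return [o for o in raw_orders if o["amount"] > 0]


@pipeline.table
def raw_orders():
    return [
        {"id": 1, "amount": 20.0},
        {"id": 2, "amount": -5.0},
        {"id": 3, "amount": 12.5},
    ]


@pipeline.table
def revenue(clean_orders):
    return sum(o["amount"] for o in clean_orders)


results = pipeline.run()
print(results["revenue"])
```

The author of the pipeline never writes "run this, then that". The order falls out of the declared dependencies, which is the part of the work a declarative framework takes over.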

The technology was previously part of Databricks' proprietary offerings under the name Delta Live Tables (DLT); the company has now contributed it to the Apache Spark open-source project. The move opens up enterprise-grade data workflow automation, giving the global Spark community access to Databricks’ battle-tested tools for more reliable and efficient data processing.

According to Databricks, the framework not only accelerates pipeline development but also enhances automation in areas like data quality, change data capture (CDC), ingestion, and transformation. This aligns with the company’s mission to simplify data engineering on its Data Intelligence Platform, as highlighted during the summit announcements.
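For readers unfamiliar with change data capture, the sketch below shows the kind of bookkeeping it involves. This is a toy illustration in plain Python, not code from the framework; the `apply_changes` function and the event fields (`seq`, `op`, `id`, `row`) are assumptions made up for this example.

```python
def apply_changes(target, events):
    """Apply change events to `target`, a dict keyed by record id.

    Events are applied in ascending `seq` order, so a late-arriving
    older event cannot overwrite a newer one.
    """
    for ev in sorted(events, key=lambda e: e["seq"]):
        if ev["op"] == "delete":
            target.pop(ev["id"], None)
        else:  # treat anything else as an upsert
            target[ev["id"]] = ev["row"]
    return target


# Hypothetical change feed, arriving out of order.
events = [
    {"seq": 2, "op": "upsert", "id": 1, "row": {"name": "Ada L."}},
    {"seq": 1, "op": "upsert", "id": 1, "row": {"name": "Ada"}},
    {"seq": 3, "op": "upsert", "id": 2, "row": {"name": "Bob"}},
    {"seq": 4, "op": "delete", "id": 2},
]

customers = apply_changes({}, events)
print(customers)
```

Handling ordering, upserts, and deletes by hand like this is the sort of repetitive logic that Databricks says the framework automates.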

The open-sourcing of Apache Spark Declarative Pipelines is expected to foster innovation within the data and AI ecosystem, encouraging collaboration and further development by community contributors. Tutorials and resources, such as those provided by Microsoft Learn for Azure Databricks, are already emerging to help users get started with building ETL pipelines using this framework.

For Databricks, the release marks a significant step toward making advanced data tools more widely available. Businesses and individual developers alike can now use the framework to build faster, more efficient data pipelines.

