Databricks vs. Fabric: two solutions for building modern data architectures

Dr. André Kaderli is a Senior Data Scientist and part of a team of data experts at Novalytica. In this role, he builds complex data architectures tailored to a company's needs. A comprehensive data architecture enables companies to make data from different systems usable and to automate data flows, for example as regular reporting for different stakeholders. The first step in setting up a data architecture is to understand the current system landscape and the company's requirements for data usage and reporting, and then to define the future setup with the necessary data processes. In this context, André Kaderli is often asked whether Microsoft Fabric or Databricks is the more suitable solution for the data processing steps to be implemented. In the following interview, he answers the most important questions.

André, can you briefly explain the context in which Databricks and Microsoft Fabric are used and what their fundamental differences are?

Both solutions are used to build modern lakehouse infrastructures that combine the strengths of data lakes and data warehouses to efficiently store, process, manage and use large volumes of data for analyses.

Databricks is an established, developer-oriented platform based on an open lakehouse model with Delta Lake. It offers high flexibility and scalability and runs on several cloud providers, including Microsoft Azure, AWS and Google Cloud. Databricks is particularly suitable for complex, code-based data processes and advanced analytics, including machine learning and AI.

Microsoft Fabric, on the other hand, is a more recent, fully integrated SaaS offering from Microsoft that became generally available at the end of 2023. It is based on the centralized OneLake storage concept and uses the Delta format. Fabric focuses heavily on ease of use and low-code/no-code work, with a user interface inspired by Power BI. This makes it easier for self-service teams to get started, but it offers fewer individual configuration options.

For which initial situation do you recommend that organizations use Fabric or Databricks?

Databricks is suitable for companies looking for a highly configurable and scalable solution for complex data processing and advanced analytics, particularly through code-based workflows. The platform can be seamlessly combined with services such as Azure Data Factory and Power BI and can also be operated cost-effectively for smaller applications if configured efficiently.

Microsoft Fabric is primarily aimed at organizations that are already strongly anchored in the Microsoft ecosystem and focus on simple integration, fast results and self-service BI. The platform supports low- and no-code solutions for data integration and transformation and offers a Power BI-like user interface that makes it easier for business users in particular to get started.

How can the respective solutions be integrated into existing IT landscapes and what challenges might come with the implementation?

Databricks offers a high degree of technological flexibility and can be combined with various cloud providers and on-premises systems. It supports both structured and unstructured data and can connect to existing data lakes, data warehouses and ETL pipelines. The use of open technologies such as Apache Spark and Delta Lake facilitates integration, but requires technical expertise.

Microsoft Fabric stands out for its simple integration into existing Microsoft environments, especially in combination with Power BI. For companies that already rely heavily on Microsoft technologies, the seamless embedding of Fabric offers clear advantages. However, Fabric offers less scope for technical configuration than Databricks, which can be a limitation for highly individual requirements.

What does data management and data governance look like in both solutions?

With Unity Catalog, Databricks offers a comprehensive, integrated solution for data governance. It covers all data assets and enables centralized access control, security management and detailed data lineage.

Microsoft Fabric uses Microsoft Purview for governance. Although Purview offers central metadata management and classification of data resources, the integration in Fabric is not yet fully developed, particularly with regard to the consistent lineage display.

How is the data from the respective architectures prepared in Power BI reports for the end user?

Databricks relies on code-based data processing with Python, SQL, Scala or R. This approach requires technical expertise, but allows maximum flexibility and control, especially for complex data pipelines and analyses. In practice, a medallion architecture is often used, in which data is cleansed, transformed and aggregated step by step. The final data can be integrated via Power BI connectors.
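The step-by-step idea behind the medallion architecture can be illustrated with a minimal sketch. It uses plain Python lists and dictionaries instead of Spark DataFrames and Delta tables, so it runs anywhere. The sample records, the field names and the functions `to_silver` and `to_gold` are hypothetical and serve only to show the layering: raw data (bronze), cleansed data (silver) and aggregated, report-ready data (gold).

```python
from collections import defaultdict

# Bronze layer: raw records as they arrive from a source system
# (hypothetical sample data, all values still strings).
bronze = [
    {"order_id": "1", "region": " North ", "amount": "120.50"},
    {"order_id": "2", "region": "south", "amount": "80.00"},
    {"order_id": "2", "region": "south", "amount": "80.00"},  # duplicate
    {"order_id": "3", "region": "North", "amount": "n/a"},    # invalid amount
]


def to_silver(rows):
    """Cleanse: drop duplicates and invalid rows, normalize types and text."""
    seen = set()
    silver_rows = []
    for row in rows:
        if row["order_id"] in seen:
            continue  # skip repeated order IDs
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows whose amount cannot be parsed
        seen.add(row["order_id"])
        silver_rows.append({
            "order_id": int(row["order_id"]),
            "region": row["region"].strip().title(),
            "amount": amount,
        })
    return silver_rows


def to_gold(rows):
    """Aggregate: total amount per region, ready for a report."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

On a real platform, each layer would typically be persisted as its own table, and only the gold layer would be exposed to the reporting tool.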

Microsoft Fabric follows a low/no-code approach with integrated tools such as Dataflows Gen2, Power Query and Pipelines. This makes it easier for business and Power BI users in particular to get started with data processing. Code-based notebooks are also available for more complex applications. Thanks to the native integration in Power BI, structured data can be visualized directly and without additional pipelines.

How do the costs and operating efficiency differ between the two solutions?

The costs vary depending on the usage scenario. Databricks offers a flexible, usage-based pricing model, which is particularly advantageous when utilization varies. In pay-as-you-go mode, inactive clusters can be stopped automatically, which means that costs are only incurred for actual usage.

Microsoft Fabric, on the other hand, uses a capacity-based pricing model that ensures predictability with even utilization. However, Fabric capacities must be stopped manually or via API, which simultaneously deactivates all content in the workspace – including Power BI reports. This often makes Fabric impractical in the consumption model and more attractive for reserved capacities.
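The difference between the two pricing models can be made concrete with a small calculation. The following sketch compares a usage-based model, where only active hours are billed, with a capacity-based model, where every provisioned hour is billed. All rates are hypothetical placeholders and not actual Databricks or Fabric prices; the function names are likewise chosen for illustration only.

```python
HOURS_PER_MONTH = 730   # approximate number of hours in a month
USAGE_RATE = 4.0        # hypothetical price per active compute hour
CAPACITY_RATE = 1.5     # hypothetical price per provisioned capacity hour


def pay_as_you_go_cost(active_hours, rate_per_hour):
    """Usage-based: only hours with running compute are billed."""
    return active_hours * rate_per_hour


def capacity_cost(provisioned_hours, rate_per_hour):
    """Capacity-based: every provisioned hour is billed, used or not."""
    return provisioned_hours * rate_per_hour


def break_even_hours(provisioned_hours, capacity_rate, usage_rate):
    """Active hours at which both models cost the same."""
    return provisioned_hours * capacity_rate / usage_rate


fixed = capacity_cost(HOURS_PER_MONTH, CAPACITY_RATE)
break_even = break_even_hours(HOURS_PER_MONTH, CAPACITY_RATE, USAGE_RATE)

for active_hours in (50, 250, 500):
    usage = pay_as_you_go_cost(active_hours, USAGE_RATE)
    cheaper = "usage-based" if usage < fixed else "capacity-based"
    print(f"{active_hours} active hours: usage={usage}, capacity={fixed}, cheaper: {cheaper}")

print(f"Break-even at {break_even} active hours per month")
```

Under these assumed rates, the usage-based model is cheaper for low or irregular workloads, while the fixed capacity pays off once utilization is consistently high. The actual break-even point depends entirely on the real prices and workload profile.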

Where do you think the development of Fabric and Databricks is heading and what trends should companies keep an eye on in the longer term?

Databricks is increasingly positioning itself as an end-to-end platform for data, analytics and AI. This makes it interesting for companies that deeply integrate complex data processing and AI development and rely on open, highly configurable infrastructures.

Microsoft Fabric pursues the approach of a standardized data platform within the Microsoft ecosystem. It brings together Power BI, Data Factory, Synapse, Data Lake and other services under a common interface. Fabric is aimed at organizations that are looking for a close integration of data, reporting and operational processes in a familiar ecosystem.

The decision for a platform should not only be based on current functions, but also on the long-term technical orientation, type of organization, complexity of the data landscape and internal know-how. It is advisable to actively observe trends, remain strategically flexible and, if necessary, make use of cross-technology consulting.

Thank you very much for the interview!
