In our previous Microsoft Fabric post (part 1), we delved into the process of selecting the ideal Unified Data Analytics Platform. Now, in part 2, we will explore the Microsoft Fabric platform in greater depth.
Microsoft Fabric is a data analytics platform that helps you store, manage, and analyze all of your data in one place. It is built on a SaaS foundation and offers a number of key features, including:
- OneLake: A unified data lake that allows you to store and manage all of your data in one place, regardless of its format or location.
- Synapse: A suite of data engineering, data warehousing, data science, and real-time analytics services. It provides a single platform for all of your data analytics needs.
- Power BI: A business intelligence tool that allows you to create and share interactive reports and visualizations. It is integrated with Fabric, so you can easily create reports and visualizations from your data in Fabric.
- AI-powered capabilities: Fabric uses AI to automate and optimize many of its tasks, such as data preparation, data modeling, and machine learning.
- Open, governed foundation: Fabric is built on an open and governed foundation, so you can easily integrate it with your existing data systems and tools.
- Cost management: Fabric offers a number of cost management features, such as automatic scaling and pay-as-you-go pricing, so you can control your costs.
Let’s explore some use cases where you can make use of Microsoft Fabric.
Use case 1:
Retail Company: A retail company uses Microsoft Fabric to analyze customer data to identify trends and patterns. The company uses this information to improve product placement, marketing campaigns, and customer service.
Solution: The company uses OneLake to store and manage all of its customer data, including purchase history, demographics, and product reviews. The company then uses Synapse to analyze this data to identify trends and patterns. For example, the company can use Synapse to identify which products are most popular, which products are often purchased together, and which customers are most likely to churn. The company then uses this information to improve product placement, marketing campaigns, and customer service.
Benefits:
- The company is able to get a unified view of all of its customer data, regardless of its format or location.
- The company is able to analyze its customer data in real time to identify trends and patterns.
- The company is able to use this information to improve product placement, marketing campaigns, and customer service.
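In a Fabric notebook this analysis would normally run on Spark against OneLake tables, but the "products purchased together" step can be illustrated with a plain-Python sketch. All data here is hypothetical and the logic is a simplified co-purchase count, not Fabric's actual API:

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets (in Fabric, these would come from OneLake tables)
baskets = [
    ["milk", "bread", "eggs"],
    ["milk", "bread"],
    ["bread", "butter"],
    ["milk", "eggs"],
]

def co_purchase_counts(baskets):
    """Count how often each pair of products appears in the same basket."""
    pairs = Counter()
    for basket in baskets:
        for a, b in combinations(sorted(set(basket)), 2):
            pairs[(a, b)] += 1
    return pairs

counts = co_purchase_counts(baskets)
top_pair, top_count = counts.most_common(1)[0]
print(top_pair, top_count)  # ('bread', 'milk') 2
```

At scale, the same pairwise counting maps naturally onto a Spark groupBy/aggregation, which is why Synapse is the right engine for the full dataset.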
Use case 2:
Finance Company: A financial services company uses Microsoft Fabric to analyze market data to make better investment decisions.
Solution: The company uses OneLake to store and manage all of its market data, including stock prices, economic data, and analyst reports. The company then uses Synapse to analyze this data to identify trends and patterns. For example, the company can use Synapse to identify undervalued stocks, to predict future market trends, and to develop trading strategies.
Benefits:
- The company is able to get a unified view of all of its market data, regardless of its format or location.
- The company is able to analyze its market data in real time to identify trends and patterns.
- The company is able to use this information to make better investment decisions.
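As a minimal illustration of the "identify undervalued stocks" step, here is a toy screen based on price-to-earnings ratios. The tickers, prices, and threshold are all hypothetical; a real strategy would use far richer signals:

```python
# Hypothetical market data: a real pipeline would read this from OneLake.
stocks = {
    "AAA": {"price": 50.0, "eps": 10.0},   # P/E = 5
    "BBB": {"price": 120.0, "eps": 4.0},   # P/E = 30
    "CCC": {"price": 45.0, "eps": 9.0},    # P/E = 5
}

def undervalued(stocks, max_pe=15.0):
    """Return tickers whose price-to-earnings ratio is below max_pe."""
    return sorted(t for t, s in stocks.items()
                  if s["eps"] > 0 and s["price"] / s["eps"] < max_pe)

print(undervalued(stocks))  # ['AAA', 'CCC']
```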
Use case 3:
Healthcare Company: A healthcare company uses Microsoft Fabric to analyze patient data to improve the quality of care.
Solution: The company uses OneLake to store and manage all of its patient data, including medical records, lab results, and imaging data. The company then uses Synapse to analyze this data to identify trends and patterns. For example, the company can use Synapse to identify patients who are at risk of certain diseases, to develop personalized treatment plans, and to monitor the effectiveness of treatments.
Benefits:
- The company is able to get a unified view of all of its patient data, regardless of its format or location.
- The company is able to analyze its patient data in real time to identify trends and patterns.
- The company is able to use this information to improve the quality of care for its patients.
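The "patients at risk" step above can be sketched as a simple rule-based flag. The readings and thresholds below are invented for illustration; real clinical risk models are considerably more sophisticated:

```python
# Hypothetical systolic blood pressure readings per patient.
patients = {
    "p1": [130, 150, 155],
    "p2": [118, 122, 119],
    "p3": [142, 160, 148],
}

def at_risk(patients, threshold=140, min_high=2):
    """Flag patients with at least min_high readings above threshold."""
    return sorted(
        pid for pid, readings in patients.items()
        if sum(1 for r in readings if r > threshold) >= min_high
    )

print(at_risk(patients))  # ['p1', 'p3']
```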
Now, let’s compare Microsoft Fabric with a few commonly used industry applications.
| Feature | Microsoft Fabric | Databricks | Google Cloud Dataproc | Amazon EMR | Snowflake |
|---|---|---|---|---|---|
| Unified platform | Yes | Yes | No | No | No |
| Cloud-based | Yes | Yes | Yes | Yes | Yes |
| Open source | No (built on open formats) | Partially (built on Spark and Delta Lake) | No (managed open-source engines) | No (managed open-source engines) | No |
| Spark support | Yes | Yes | Yes | Yes | Via connectors |
| ML support | Yes | Yes | Yes | Yes | Yes |
| Data lake support | Yes (OneLake) | Yes | Yes | Yes | Yes (external tables) |
| Pricing | Pay-as-you-go or reserved capacity | Pay-as-you-go | Pay-as-you-go | Pay-as-you-go | Pay-as-you-go |
A unified platform is a software solution that provides a single environment for multiple tasks. This can include tasks such as data warehousing, data analytics, machine learning, and artificial intelligence.
Spark support is the ability of a data analytics platform to work with Apache Spark. Spark is a unified analytics engine for large-scale data processing. It provides a variety of features and capabilities for data exploration, analysis, machine learning, and real-time streaming.
Spark support is important for a number of reasons. First, Spark is a very popular and powerful data analytics engine. It is used by organizations of all sizes and in a variety of industries to analyze large volumes of data. Second, Spark is very versatile. It can be used for a wide range of data analytics tasks, including batch processing, streaming processing, and machine learning.
A data lake is a centralized repository that stores all of an organization’s data in its native format. This can include structured, unstructured, and semi-structured data. Data lakes are typically very large and can scale to petabytes or even exabytes of data.
Data lake support is the ability of a platform to store and process these varied data types and to run batch, streaming, and machine learning workloads against them.
Below are the primary principles of a data lake:
- One data lake for the entire organization
- Usable by multiple analytical engines
- Data reused without copying it
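A defining trait of a data lake is that files stay in their native format and are only normalized at read time. The sketch below shows that idea with JSON and CSV content; the sample data is hypothetical:

```python
import csv
import io
import json

# Hypothetical raw files landed in a data lake, each kept in its native format.
raw_json = '{"id": 1, "source": "iot", "temp": 21.5}'
raw_csv = "id,source,temp\n2,logs,19.0\n"

def to_records(raw, fmt):
    """Normalize raw content into dict records without altering the stored files."""
    if fmt == "json":
        return [json.loads(raw)]
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(raw)))
    raise ValueError(f"unsupported format: {fmt}")

records = to_records(raw_json, "json") + to_records(raw_csv, "csv")
print(len(records))  # 2
```

Because normalization happens on read, the same raw files can serve multiple analytical engines without being copied, which is exactly the principle listed above.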
Data lakes are often used in conjunction with data warehouses, but they serve different purposes. Data warehouses are typically used to store and analyze structured data for business intelligence and reporting purposes. Data lakes, on the other hand, are used to store and process all of an organization’s data, regardless of its format or structure. This allows organizations to gain insights from all of their data, not just the structured data that is typically stored in data warehouses.
Data lakes are also often used for machine learning and artificial intelligence (AI) applications. This is because data lakes can store and process large volumes of data from a variety of sources, which is essential for training machine learning models.
They are typically used for storing data from various sources, including logs, IoT devices, social media, and more. Data lake support in analytics software involves several key aspects:
- Data Ingestion: Data lake support involves the ability to efficiently ingest data from diverse sources into the data lake. This can include batch processing, streaming data ingestion, and the ability to handle a wide range of data formats.
- Data Storage: The software should be capable of storing data in the data lake without imposing strict structure or schema requirements. Data is typically stored in its raw form, allowing for flexibility in data exploration and analysis.
- Data Catalog: Effective data lake support often includes a data catalog or metadata management system. This catalog helps users discover, understand, and locate the data stored in the data lake. It provides metadata, data lineage, and data quality information.
- Data Security: Security features are essential for data lake support. This includes encryption of data at rest and in transit, access control, authentication, and authorization mechanisms to protect sensitive data within the lake.
- Data Governance: Data governance capabilities ensure that data stored in the lake complies with regulatory requirements and internal policies. It involves setting and enforcing data quality standards, data retention policies, and auditing capabilities.
- Data Integration: Integration capabilities are necessary to connect and combine data from various sources within the data lake. This often involves ETL (Extract, Transform, Load) processes to clean, transform, and prepare data for analysis.
- Data Processing: Data lake support includes the ability to process and analyze data within the lake. This may involve batch processing using technologies like Apache Spark or real-time processing with tools like Apache Kafka.
- Scalability: Scalability is crucial in data lake support, as data lakes often store vast amounts of data. The software should be able to scale horizontally to handle increasing data volumes and processing demands.
- Query and Analytics: Data stored in the lake should be accessible for querying and analytics. This typically involves SQL-like querying capabilities or integration with analytics tools for data exploration and visualization.
- Data Lifecycle Management: The software should support data lifecycle management, including data archiving, purging, and data versioning, to maintain data lake efficiency and governance.
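The data catalog aspect above can be sketched as a tiny in-memory registry. Real platforms (for example, OneLake with Microsoft Purview) track far richer metadata such as lineage and quality metrics; the names and paths below are hypothetical:

```python
import datetime

# A toy in-memory data catalog mapping dataset names to their metadata.
catalog = {}

def register_dataset(name, path, fmt, schema=None):
    """Record where a dataset lives, its format, and its (optional) schema."""
    catalog[name] = {
        "path": path,
        "format": fmt,
        "schema": schema or {},
        "registered_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    }

register_dataset("sales_raw", "/lake/raw/sales/", "parquet",
                 schema={"order_id": "int", "amount": "float"})
register_dataset("clickstream", "/lake/raw/clicks/", "json")

print(sorted(catalog))  # ['clickstream', 'sales_raw']
```

Even this minimal catalog supports the discovery use case: a user can ask "what datasets exist, where are they, and what shape are they?" without scanning the lake itself.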
Data lake support is particularly relevant in modern data analytics because it enables organizations to collect and store large volumes of data from diverse sources, making it available for analysis, machine learning, and other data-driven activities. An effective data lake support system facilitates data discovery, integration, security, and governance, enabling organizations to derive valuable insights from their data lake investments.
Microsoft Fabric Pricing
Microsoft Fabric offers two pricing models:
- Pay-as-you-go: This pricing model allows you to pay for the resources that you use on an hourly basis. There is no commitment required, and you can scale your resources up or down as needed.
- Reserved capacity: This pricing model allows you to purchase a reserved capacity of resources for a period of one or three years. Reserved capacity offers a discount over the pay-as-you-go pricing, but it requires a commitment to purchase a certain amount of resources for a specified period of time.
The specific pricing for Microsoft Fabric depends on the region where you are using it and the type of resources that you are using. You can use the Microsoft pricing calculator to estimate the cost of using Microsoft Fabric for your specific needs.
You can find the Microsoft pricing calculator on the Azure website.
Please be aware that storage costs are not included in this estimate; OneLake storage incurs an additional charge of roughly $0.023 per gigabyte per month.
Here is an example of how the Microsoft Fabric pricing model works:
- Let’s say that you are a retail company and you want to use Microsoft Fabric to analyze your customer data. You could start by using the pay-as-you-go pricing model. This would allow you to scale your resources up or down as needed, depending on your usage.
- As your business grows and you start to use Microsoft Fabric more heavily, you may want to consider switching to the reserved capacity pricing model. This would save you money on your Microsoft Fabric costs, but it would require you to commit to purchasing a certain amount of resources for a specified period of time.
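The pay-as-you-go versus reserved-capacity trade-off comes down to a break-even calculation. The rates below are purely illustrative (real Fabric capacity prices vary by SKU and region); the sketch shows only the comparison logic:

```python
def monthly_cost_payg(hours, rate_per_hour):
    """Pay-as-you-go: billed only for the hours actually used."""
    return hours * rate_per_hour

def monthly_cost_reserved(flat_monthly):
    """Reserved capacity: a fixed monthly commitment regardless of usage."""
    return flat_monthly

# Hypothetical rates -- not actual Fabric prices.
rate = 0.36        # $/capacity-hour (illustrative)
reserved = 190.0   # $/month (illustrative, discounted)

breakeven_hours = reserved / rate
print(round(breakeven_hours))  # ~528 hours/month
```

Under these made-up numbers, a workload running more than about 528 hours a month would be cheaper on reserved capacity; a lighter workload stays cheaper on pay-as-you-go.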
The best way to choose the right pricing model for your needs is to carefully consider your usage patterns and budget. If you are not sure which pricing model is right for you, you can contact Microsoft for assistance.
In addition to the pricing models described above, Microsoft Fabric also offers a number of other pricing options, such as spot pricing and commitment discounts. You can learn more about these pricing options on the Microsoft website.
User-based licensing (similar to Power BI licensing):
- Free
  - Cannot publish reports
- Pro
  - $10 per user per month (for example, $200 per month for 20 users)
  - Only Pro-licensed users can publish reports within the organization
  - Example: in an organization of 20 employees where only 3 do data analysis, only those 3 need Pro licenses
- Premium Per User (PPU)
  - Suited to larger teams
  - $20 per user per month (for example, $400 per month for 20 users)
  - Adds larger model sizes, more frequent refreshes, XMLA read/write, deployment pipelines, and other enterprise-scale features
Additionally, Microsoft Fabric offers Per Capacity licenses, which let an organization purchase a dedicated set of resources (compute, memory, and storage) for its data analytics needs. They are billed monthly, with a range of SKUs to choose from, and are a good option for organizations with large or complex data analytics workloads.
- Per Capacity licenses
  - $4,995 per capacity per month
  - Report consumers can be added to the plan; there is no per-user limit
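Using the per-user and per-capacity figures from the section above, picking the cheapest model reduces to simple arithmetic. The sketch below ignores feature differences (model size, refresh frequency, and so on) that often decide the choice in practice:

```python
# List prices quoted in the section above.
PRO_PER_USER = 10.0     # $/user/month
PPU_PER_USER = 20.0     # $/user/month
CAPACITY_FLAT = 4995.0  # $/capacity/month

def cheapest_option(users):
    """Pick the lowest-cost license model for a given user count (price only)."""
    costs = {
        "Pro": users * PRO_PER_USER,
        "PPU": users * PPU_PER_USER,
        "Capacity": CAPACITY_FLAT,
    }
    return min(costs, key=costs.get)

print(cheapest_option(20))   # 'Pro'      ($200 vs $400 vs $4,995)
print(cheapest_option(600))  # 'Capacity' ($6,000 on Pro vs $4,995 flat)
```

On price alone, per-capacity licensing starts to win once an organization passes roughly 500 Pro users; below that, per-user licensing is cheaper.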
Pricing is subject to change. For the most up-to-date pricing, please visit the Microsoft website.
In part 3, we will learn how to use Microsoft Fabric to perform data analytics and see some real-world examples of how it is used in different industries.