Q1

A financial services firm is implementing a Microsoft Fabric solution to analyze trade data. They have a requirement to enforce data residency, ensuring that all data processing for their European operations occurs within the EU region. The Fabric capacity is provisioned in the 'West Europe' region. A data engineer creates a shortcut in a Lakehouse pointing to an Azure Data Lake Storage Gen2 account located in the 'East US' region. How will Fabric handle query processing against this shortcut?

Q2

You are designing a data ingestion pipeline for a large retail company. The pipeline must ingest nightly sales data from over 1,000 stores. Each store uploads a CSV file to an Azure Blob Storage container. You need to design a robust orchestration pattern in a Fabric pipeline that processes each file individually, logs its status, and can handle failures for a specific store's file without stopping the entire nightly batch. Which orchestration pattern should you implement?

Q3

You are optimizing a Fabric Warehouse that contains a 5-billion-row fact table named `FactInternetSales`. Queries against this table frequently filter by the `OrderDateKey` column. The data in the table is currently unordered. To improve query performance for time-series analysis, you decide to implement V-Order optimization. Which statement accurately describes the effect of applying V-Order on the `OrderDateKey` column?
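For context: in a Warehouse, V-Order is applied by the engine at write time, while in Spark it can be controlled explicitly. A minimal sketch of the Spark-side controls, run from a Fabric notebook where `spark` is the session, assuming the table is also reachable as a Lakehouse Delta table; the exact configuration key name varies across Fabric Spark runtimes, so verify against your runtime's documentation:

```python
# Enable V-Order for new Parquet writes in this Spark session.
# (The configuration key name differs between Fabric Spark runtimes.)
spark.conf.set("spark.sql.parquet.vorder.enabled", "true")

# Rewrite existing files with V-Order applied; ZORDER BY additionally co-locates
# rows with similar OrderDateKey values to improve data skipping on that filter.
spark.sql("OPTIMIZE FactInternetSales ZORDER BY (OrderDateKey) VORDER")
```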

Q4

A data engineering team is using Git integration with Azure DevOps to manage their Fabric workspace. They have two workspaces: `Sales_Dev` for development and `Sales_Prod` for production. A junior engineer commits a change to a notebook in the `main` branch directly from the `Sales_Dev` workspace. Now, they need to promote this change to the `Sales_Prod` workspace. The team uses a deployment pipeline for promotions. What is the immediate consequence of this action on the deployment pipeline process?

Q5

You are processing real-time IoT data using a Fabric Eventstream. The incoming JSON data needs to be enriched with reference data stored in a Delta table in a Lakehouse. The enrichment must happen in near real-time as events flow through the system. Which Fabric component is the most suitable for performing this stateful stream-enrichment operation?
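Whichever component the question intends, the underlying pattern is a stream-to-static join against the Delta reference table. A minimal Spark Structured Streaming sketch of that pattern, assuming hypothetical table names `iot_events` (a streaming source, e.g. the Eventstream's Lakehouse destination table), `device_reference` (the reference Delta table), and a shared `deviceId` join column:

```python
# Static reference data: a Delta table in the Lakehouse (hypothetical name).
ref_df = spark.read.table("device_reference")

# Streaming source: assumed here to be a Delta table fed by the Eventstream's
# Lakehouse destination; other streaming connectors follow the same pattern.
events_df = spark.readStream.table("iot_events")

# Stream-static join: each micro-batch of events is enriched with reference columns.
enriched_df = events_df.join(ref_df, "deviceId", "left")

# Write the enriched stream back to a Delta table (hypothetical names/paths).
query = (
    enriched_df.writeStream
    .format("delta")
    .option("checkpointLocation", "Files/checkpoints/iot_enriched")
    .toTable("iot_events_enriched")
)
```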

Q6 (Multiple answers)

You are tasked with implementing security for a new Fabric Lakehouse. The requirements are as follows: 1. Data engineers must have full control over the Lakehouse, including the ability to read and write to tables and underlying files. 2. Business analysts must be able to query all data using the SQL endpoint but must be prevented from accessing the underlying files in OneLake. 3. A specific group of junior analysts must only see data for the 'North America' region from the `Sales` table. Which combination of security mechanisms should you use? (Select TWO)

Q7

You are troubleshooting a slow-running PySpark notebook in a Fabric workspace. The notebook reads a large Delta table, performs a series of transformations, and then joins it with a small dimension table. You observe from the Spark UI that one particular stage is taking an exceptionally long time and has a significant amount of data shuffle. The problematic code snippet is: `large_df.join(small_df, 'user_id', 'inner')` Which optimization technique should you apply to mitigate this performance bottleneck?
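When the dimension table is small enough to fit in executor memory, a broadcast hint removes the shuffle for this join entirely. A minimal sketch, assuming `large_df` and `small_df` are already loaded as in the snippet above:

```python
from pyspark.sql.functions import broadcast

# Broadcasting the small dimension table replicates it to every executor,
# so the join runs map-side and the large table is never shuffled.
joined_df = large_df.join(broadcast(small_df), 'user_id', 'inner')

# Alternatively, raise the auto-broadcast threshold (in bytes) so the optimizer
# chooses a broadcast join on its own for tables under the limit.
spark.conf.set("spark.sql.autoBroadcastJoinThreshold", 100 * 1024 * 1024)
```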

Q8

A manufacturing company uses Microsoft Fabric to monitor its production line. An Eventstream ingests sensor data, which is then written to a KQL database for real-time dashboarding. A critical requirement is to detect anomalies where the average temperature of a specific sensor (`sensorId` = 'SN01-T5') exceeds 150 degrees Celsius over a 5-minute period. You need to write a KQL query to create an alert based on this condition. Which KQL query correctly implements this logic?

Q9

True or False: When you use the Mirroring feature in Microsoft Fabric to replicate an Azure SQL Database, the initial setup performs a full snapshot of the source data, and subsequent changes are then replicated in near real-time using change data capture (CDC), without requiring the configuration of a separate data pipeline.

Q10

Case Study: Global E-Commerce Platform

**Company Background**
GlobalCart is a multinational e-commerce company that operates in North America, Europe, and Asia. They are migrating their analytics platform to Microsoft Fabric to create a unified view of their sales, customer, and inventory data. Their primary goal is to empower regional business units with self-service analytics while maintaining centralized governance and data engineering standards.

**Existing Environment**
Data is currently stored in a mix of sources: an on-premises SQL Server 2019 database for inventory, an Azure SQL Database for sales transactions, and Parquet files in an ADLS Gen2 account for historical clickstream data. The company has a single Fabric F64 capacity located in 'Central US'. The data engineering team is proficient in SQL and PySpark.

**Requirements**
1. **Architecture:** Implement a medallion architecture using a central 'Corporate' workspace for Bronze and Silver layer Lakehouses. Each region (NA, EU, AS) must have its own separate workspace containing a Gold layer Lakehouse with data filtered specifically for that region.
2. **Data Ingestion:** Data from the on-premises SQL Server must be ingested nightly with minimal impact on the source system. Sales data from Azure SQL should be replicated in near real-time.
3. **Transformation:** Transformations from Bronze to Silver will be complex and require custom Python libraries. Transformations from Silver to the regional Gold layers will involve filtering and simple aggregations.
4. **Security:** Data analysts in each region must only be able to access the Gold layer Lakehouse in their respective regional workspace. They should not have any access to the 'Corporate' workspace or other regions' workspaces.

**Problem Statement**
You need to design the data flow and transformation strategy to populate the regional Gold layer Lakehouses from the centralized Silver layer Lakehouse. The solution must be efficient and scalable. Which approach best meets the requirements?
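For the Silver-to-Gold step specifically, a minimal PySpark sketch of the filter-and-aggregate pattern, run from a Fabric notebook where `spark` is the session; the table names (`silver.sales` exposed in the regional workspace, a regional Gold table `sales_summary`) and column names are hypothetical:

```python
from pyspark.sql import functions as F

# Read the centralized Silver table (hypothetical name), reachable from the
# regional workspace, e.g. via a OneLake shortcut to the Corporate Lakehouse.
silver_sales = spark.read.table("silver.sales")

# Filter to the region served by this Gold Lakehouse and apply simple aggregations.
eu_gold = (
    silver_sales
    .filter(F.col("region") == "Europe")
    .groupBy("order_date", "product_category")
    .agg(
        F.sum("sales_amount").alias("total_sales"),
        F.countDistinct("order_id").alias("order_count"),
    )
)

# Overwrite the regional Gold table on each nightly run.
eu_gold.write.mode("overwrite").saveAsTable("sales_summary")
```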