DP-100 Microsoft… Free Certification Sample Questions (2026)

Q1

A financial services company is developing a Retrieval-Augmented Generation (RAG) solution to answer questions about internal compliance documents. The documents are a mix of short policy statements (1-2 paragraphs) and long procedural guides (10-20 pages). The goal is to ensure that answers are precise and source attribution is accurate. Which data preparation strategy is most suitable for this scenario?
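
As a study aid, the data preparation step in this scenario can be pictured as splitting each document into chunks that carry their source with them. The sketch below is only an illustration under stated assumptions: the `Chunk` type, the `chunk_document` helper, and the 500-character limit are hypothetical and not part of any Azure API. Short policy statements stay whole, while long guides are split on paragraph boundaries.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    source: str  # document the chunk came from, kept for attribution
    index: int   # position of the chunk within that document
    text: str


def chunk_document(source: str, text: str, max_chars: int = 500) -> list[Chunk]:
    """Pack paragraphs into chunks of roughly max_chars.

    Paragraphs are never split, so a single oversized paragraph is kept whole.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[Chunk] = []
    buffer = ""
    for para in paragraphs:
        candidate = f"{buffer}\n\n{para}" if buffer else para
        if buffer and len(candidate) > max_chars:
            # Adding this paragraph would overflow: close the current chunk.
            chunks.append(Chunk(source, len(chunks), buffer))
            buffer = para
        else:
            buffer = candidate
    if buffer:
        chunks.append(Chunk(source, len(chunks), buffer))
    return chunks
```

Because every chunk records its `source` and `index`, a retrieved passage can be cited back to the exact document and position it came from.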

View answers, explanations and more in the Simulator

Q2

A data science team is using Azure Machine Learning pipelines to orchestrate a complex training workflow. A custom component in the middle of the pipeline frequently fails due to transient network issues when accessing an external data source. The team wants to make the pipeline more resilient without modifying the component's internal code. How should they configure the pipeline job to handle these intermittent failures?
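
The idea of handling transient failures from outside a component, without touching its code, can be sketched generically. The example below is not the Azure Machine Learning pipeline configuration itself; `run_with_retries` and its parameters are hypothetical names used only to show a retry-with-backoff wrapper around an unmodified step.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retries(
    step: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 1.0,
    transient: tuple[type[BaseException], ...] = (ConnectionError, TimeoutError),
) -> T:
    """Call step(); on a transient error, wait and retry up to max_retries times."""
    attempt = 0
    while True:
        try:
            return step()
        except transient:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the last error
            time.sleep(base_delay * (2 ** attempt))  # exponential backoff
            attempt += 1
```

The wrapped step is called as-is, so resilience is a property of how the step is run rather than of the step's own logic.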


Q3 (Multiple answers)

You are designing a secure environment for a multi-team data science project. You need to ensure that each team can manage its own compute resources and data assets but cannot access the resources of other teams. All teams must use a centrally managed set of curated Docker environments and foundation models. Which combination of Azure Machine Learning features should you use? (Select TWO)


Q4

A data scientist is using the Azure Machine Learning SDK v2 to submit a hyperparameter tuning job for a classification model. The goal is to maximize the 'AUC_weighted' metric. The search space is large, and the compute budget is limited. They need to configure the sweep job to efficiently find good parameters by terminating underperforming runs early. Which early termination policy is most appropriate for this goal?
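
To make "terminating underperforming runs early" concrete, here is a minimal sketch of one slack-based rule of the kind Azure Machine Learning documents for sweep jobs: with a maximization goal, a run is stopped when its metric falls below the best metric divided by `1 + slack_factor`. The function name `should_terminate` is hypothetical; in a real sweep the service applies the policy, not user code.

```python
def should_terminate(run_metric: float, best_metric: float, slack_factor: float) -> bool:
    """Maximization goal: stop a run whose metric is below best / (1 + slack_factor)."""
    threshold = best_metric / (1.0 + slack_factor)
    return run_metric < threshold


# Best AUC_weighted so far is 0.90 and the slack factor is 0.1,
# so the cut-off is 0.90 / 1.1, roughly 0.818.
print(should_terminate(0.80, 0.90, 0.1))  # below the cut-off
print(should_terminate(0.85, 0.90, 0.1))  # within the allowed slack
```

A smaller slack factor cuts runs more aggressively and saves more compute, at the risk of stopping a run that would have improved later.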


Q5

You are developing a prompt flow that orchestrates multiple calls to a language model to generate a marketing campaign proposal. You need to ensure that the output of an early step, which generates a target audience description, is correctly passed as input to a later step that writes ad copy. Which prompt flow feature allows you to define this data dependency?


Q6

A hospital is using an automated ML (AutoML) job for tabular data to predict patient readmission risk. The Responsible AI dashboard for the best model reveals that the model has significantly lower prediction accuracy for a minority demographic group than for the majority group, which indicates a potential fairness issue. What is the most appropriate first step to mitigate this bias?


Q7

An MLOps engineer is packaging a trained MLflow model for deployment. To ensure seamless inference, they must include information about the required input data schema directly within the model artifacts. Which file within the MLflow model directory should be modified to include this schema definition?


Q8

True or False: When using an Azure Machine Learning compute cluster for training, you are only billed for the compute time when a job is actively running on the nodes.


Q9

A team has deployed a machine learning model to a managed online endpoint with a blue-green deployment strategy. 80% of the traffic is currently directed to the stable 'blue' deployment, and 20% is directed to the new 'green' deployment for testing. After monitoring, the team confirms the green deployment is performing well and decides to route all traffic to it. Which command should be used to achieve this without causing downtime?
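
The traffic split in this scenario is a mapping from deployment name to a percentage that sums to 100. The sketch below models the shift from 80/20 to 0/100 in plain Python; `shift_all_traffic` and `to_traffic_arg` are hypothetical helpers, and the string they produce matches the `name=percent` form accepted by the `--traffic` argument of `az ml online-endpoint update`, as far as I am aware.

```python
def shift_all_traffic(traffic: dict[str, int], target: str) -> dict[str, int]:
    """Return a new split that routes 100% of traffic to the target deployment."""
    if target not in traffic:
        raise KeyError(f"unknown deployment: {target}")
    return {name: (100 if name == target else 0) for name in traffic}


def to_traffic_arg(traffic: dict[str, int]) -> str:
    """Format the split as space-separated name=percent pairs."""
    return " ".join(f"{name}={pct}" for name, pct in traffic.items())


current = {"blue": 80, "green": 20}
updated = shift_all_traffic(current, "green")
print(to_traffic_arg(updated))  # blue=0 green=100
```

Both deployments stay in place while the split changes, which is what lets the endpoint keep serving requests during the switch.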


Q10

A data scientist is working on a local machine with the Azure Machine Learning SDK and needs to access data stored in an Azure Blob Storage container for interactive analysis in a Jupyter notebook. The workspace is configured with a datastore named `blob_datastore`. Which code snippet correctly accesses a file named `data/customers.csv` from this datastore as a Pandas DataFrame?
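
For background, files in a workspace datastore can be addressed with an `azureml://` datastore URI. The sketch below only builds such a URI; `datastore_uri` is a hypothetical helper, and the subscription, resource group, and workspace values are placeholders. Reading the URI with pandas is shown as a comment because it assumes the `azureml-fsspec` package is installed and the session is authenticated.

```python
def datastore_uri(subscription_id: str, resource_group: str, workspace: str,
                  datastore: str, path: str) -> str:
    """Build an azureml:// URI that points at a path inside a datastore."""
    return (
        f"azureml://subscriptions/{subscription_id}"
        f"/resourcegroups/{resource_group}"
        f"/workspaces/{workspace}"
        f"/datastores/{datastore}/paths/{path}"
    )


# Placeholder identifiers; substitute real values for an actual workspace.
uri = datastore_uri("<subscription-id>", "<resource-group>", "<workspace-name>",
                    "blob_datastore", "data/customers.csv")
print(uri)

# Assumption: with azureml-fsspec installed, pandas can read the URI directly.
# import pandas as pd
# df = pd.read_csv(uri)
```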
