MLS-C01 Amazon A… Free Certification Sample Questions (2026)

Q1

A Machine Learning Specialist is required to build a supervised image-recognition model to identify a cat. The ML Specialist performs some tests and records the following results for a neural network-based image classifier:

Total number of images available = 1,000
Test set images = 100 (constant test set)

The ML Specialist notices that, in over 75% of the misclassified images, the cats were held upside down by their owners. Which techniques can be used by the ML Specialist to improve this specific test error?

View answers, explanations and more in the Simulator

Q2

A Machine Learning Specialist is training a model to identify the make and model of vehicles in images. The Specialist wants to use transfer learning and an existing model trained on images of general objects. The Specialist collated a large custom dataset of pictures containing different vehicle makes and models. What should the Specialist do to initialize the model to re-train it with the custom data?

Q3

A Machine Learning Specialist previously trained a logistic regression model using scikit-learn on a local machine, and the Specialist now wants to deploy it to production for inference only. What steps should be taken to ensure Amazon SageMaker can host a model that was trained locally?

Q4

A large consumer goods manufacturer has the following products on sale:

• 34 different toothpaste variants
• 48 different toothbrush variants
• 43 different mouthwash variants

The entire sales history of all these products is available in Amazon S3. Currently, the company is using custom-built autoregressive integrated moving average (ARIMA) models to forecast demand for these products. The company wants to predict the demand for a new product that will soon be launched. Which solution should a Machine Learning Specialist apply?

Q5

A company wants to classify user behavior as either fraudulent or normal. Based on internal research, a Machine Learning Specialist would like to build a binary classifier based on two features: age of account and transaction month. The class distribution for these features is illustrated in the figure provided. Based on this information, which model would have the HIGHEST accuracy?

Q6 (Multiple answers)

A Data Scientist is developing a machine learning model to classify whether a financial transaction is fraudulent. The labeled data available for training consists of 100,000 non-fraudulent observations and 1,000 fraudulent observations. The Data Scientist applies the XGBoost algorithm to the data, resulting in the following confusion matrix when the trained model is applied to a previously unseen validation dataset. The accuracy of the model is 99.1%, but the Data Scientist has been asked to reduce the number of false negatives.

                Predicted 0    Predicted 1
    Actual 0         99,966             34
    Actual 1            877            123

Which combination of steps should the Data Scientist take to reduce the number of false negative predictions by the model? (Choose two.)
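
As a worked check of the figures in this question, the short sketch below recomputes accuracy, recall, and precision from the confusion matrix. It assumes the matrix reads as 99,966 true negatives, 34 false positives, 877 false negatives, and 123 true positives, a reading that matches the stated 100,000 non-fraudulent and 1,000 fraudulent observations. The variable names are illustrative only.

```python
# Confusion-matrix counts, under the assumed reading described above.
tn, fp = 99_966, 34   # actual 0 (non-fraudulent): predicted 0, predicted 1
fn, tp = 877, 123     # actual 1 (fraudulent):     predicted 0, predicted 1

total = tn + fp + fn + tp

accuracy = (tp + tn) / total   # share of all predictions that are correct
recall = tp / (tp + fn)        # share of fraudulent transactions that are caught
precision = tp / (tp + fp)     # share of fraud predictions that are correct

print(f"accuracy  = {accuracy:.3f}")
print(f"recall    = {recall:.3f}")
print(f"precision = {precision:.3f}")
```

Under that reading, accuracy rounds to the 99.1% quoted in the question, while recall on the fraudulent class is only 123 out of 1,000. This is why accuracy alone says little about false negatives on a dataset this imbalanced.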

Q7

A Machine Learning Specialist is assigned to a Fraud Detection team and must tune an XGBoost model, which is working appropriately for test data. However, with unknown data, it is not working as expected. The existing parameters are provided as follows.

    param = {
        'eta': 0.05,                     # the training step for each iteration
        'silent': 1,                     # logging mode - quiet
        'n_estimators': 2000,
        'max_depth': 30,
        'min_child_weight': 3,
        'gamma': 0,
        'subsample': 0.8,
        'objective': 'multi:softprob',   # error evaluation for multiclass training
        'num_class': 201}                # the number of classes that exist in this dataset
    num_round = 60                       # the number of training iterations

Which parameter tuning guidelines should the Specialist follow to avoid overfitting?

Q8

A Machine Learning Specialist is deciding between building a naive Bayesian model or a full Bayesian network for a classification problem. The Specialist computes the Pearson correlation coefficients between each pair of features and finds that their absolute values range from 0.1 to 0.95. Which model describes the underlying data in this situation?
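
To make the correlation step in this question concrete, here is a minimal sketch of computing absolute Pearson coefficients for every pair of features. The feature names and values are hypothetical toy data, not taken from the question, and the `pearson` helper is written out by hand so the example needs no third-party libraries.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy features (illustrative values only).
features = {
    "f1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "f2": [2.0, 4.0, 6.0, 8.0, 10.0],  # exactly linear in f1
    "f3": [5.0, 3.0, 4.0, 1.0, 2.0],
}

# Absolute correlation for each unordered pair of features.
abs_corr = {
    (a, b): abs(pearson(features[a], features[b]))
    for a, b in combinations(features, 2)
}

for pair, r in abs_corr.items():
    print(pair, round(r, 3))
```

A wide spread of absolute values, as in the question, indicates that some feature pairs are nearly uncorrelated while others are strongly dependent. That is relevant because a naive Bayes model assumes the features are conditionally independent given the class.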

Q9

Which of the following metrics should a Machine Learning Specialist generally use to compare/evaluate machine learning classification models against each other?

Q10

A Machine Learning Specialist is designing a system for improving sales for a company. The objective is to use the large amount of information the company has on users’ behavior and product preferences to predict which products users would like based on the users’ similarity to other users. What should the Specialist do to meet this objective?
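
The idea of predicting preferences from "the users' similarity to other users" can be sketched in a few lines. The example below is one possible illustration, assuming user-based collaborative filtering with cosine similarity; the question itself does not name a technique, and the users, products, ratings, and function names are all hypothetical.

```python
from math import sqrt

# Hypothetical user -> {product: rating} data (illustrative only).
ratings = {
    "alice": {"toothpaste": 5, "mouthwash": 3, "floss": 4},
    "bob": {"toothpaste": 4, "mouthwash": 3, "floss": 5, "brush": 4},
    "carol": {"toothpaste": 1, "brush": 5},
}


def cosine(u, v):
    """Cosine similarity between two sparse rating dictionaries."""
    common = set(u) & set(v)
    dot = sum(u[p] * v[p] for p in common)
    norm_u = sqrt(sum(r * r for r in u.values()))
    norm_v = sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(user, ratings):
    """Rank products the user has not rated, weighting other users' ratings by similarity."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        for product, rating in other_ratings.items():
            if product not in ratings[user]:
                scores[product] = scores.get(product, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)


print(recommend("alice", ratings))
print(recommend("carol", ratings))
```

In this toy data, alice's ratings are much closer to bob's than to carol's, so bob's ratings dominate her recommendations. A production system would work on far larger, sparser data and would typically use a dedicated library or managed service rather than nested dictionaries.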
