Snowflake SnowPro Advanced Data Scientist - SnowPro Advanced Data Scientist DSA-C03 Exam
Total 61 questions
Question #1 (Topic: Exam A)
A Data Science team plotted a scatterplot of the residuals against the predicted values of a linear regression. A team member found that the two are related and that the points form a pattern.
Which conclusion can be made about this linear regression?
A. Since there is a relationship, the model is good.
B. Since there is a relationship, the model is not good.
C. Since there is a relationship, the prediction and residuals need to be recalculated.
D. Another model should be used to validate that the relationship and patterns are accurate.
Answer: B
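The reasoning behind answer B can be illustrated with a minimal sketch. It assumes toy data (not from the exam) in which the true relationship is quadratic, fits a straight line by ordinary least squares, and lists the sign of each residual in order of predicted value. For a well-specified model the signs would look random; here they form a clear U-shaped pattern.

```python
# Toy data (an assumption for illustration): y depends on x quadratically.
xs = [float(x) for x in range(0, 21)]
ys = [x ** 2 for x in xs]

def fit_line(xs, ys):
    """Ordinary least squares for a single predictor; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

slope, intercept = fit_line(xs, ys)
predicted = [intercept + slope * x for x in xs]
residuals = [y - p for y, p in zip(ys, predicted)]

# Sort residuals by predicted value, as on a residuals-vs-predicted plot,
# and record only the sign of each one.
ordered = [r for _, r in sorted(zip(predicted, residuals))]
signs = "".join("+" if r > 0 else "-" for r in ordered)
print(signs)
```

Long runs of the same sign mean the line systematically over- and under-predicts in different ranges, which is the kind of pattern the question describes.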
Question #2 (Topic: Exam A)
A Data Scientist is creating a model that can predict the number of patients suffering from migraines. The following are the confusion matrix values:
True Positive – 100
False Positive – 20
False Negative – 30
True Negative – 150
Which of the model’s key indicators is correct?
A. Accuracy = 0.76
B. Recall = 0.83
C. Precision = 0.83
D. Error rate = 0.24
Answer: C
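The four indicators can be checked directly from the confusion matrix values given in the question. The sketch below uses the standard definitions; the variable names are illustrative only.

```python
# Confusion matrix values from the question.
tp, fp, fn, tn = 100, 20, 30, 150
total = tp + fp + fn + tn

accuracy = (tp + tn) / total    # share of all predictions that are correct
precision = tp / (tp + fp)      # share of predicted positives that are real
recall = tp / (tp + fn)         # share of real positives that were found
error_rate = (fp + fn) / total  # share of all predictions that are wrong

for name, value in [("accuracy", accuracy), ("precision", precision),
                    ("recall", recall), ("error rate", error_rate)]:
    print(f"{name}: {value:.2f}")
# accuracy: 0.83
# precision: 0.83
# recall: 0.77
# error rate: 0.17
```

Only option C (precision = 0.83) matches. Accuracy also comes out at 0.83, but option A states 0.76.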
Question #3 (Topic: Exam A)
After training a model with a stored procedure, a Data Scientist wants the output to have more than one variable to assess the accuracy, precision, and recall values.
What should the output type of the stored procedure be?
A. DataFrame
B. Integer
C. String
D. Variant
Answer: D
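As a rough illustration of why a semi-structured return type fits here, the sketch below shows a hypothetical Python handler that returns several metrics in one dictionary. The function name, the placeholder labels, and the `session` parameter are assumptions; in Snowflake the handler would be attached to a procedure declared with `RETURNS VARIANT`, which is not shown.

```python
def evaluate_model(session=None):
    """Hypothetical stored-procedure handler.

    In Snowflake, `session` would be the Snowpark session passed to the
    handler; it is unused in this standalone sketch.
    """
    # Placeholder labels and predictions standing in for real model output.
    y_true = [1, 0, 1, 1, 0, 1, 0, 0]
    y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # One return value carrying several named metrics.
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }

metrics = evaluate_model()
print(metrics)
```

An Integer or String return would force the metrics to be packed into a single scalar, whereas a key-value structure keeps each metric addressable by name.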
Question #4 (Topic: Exam A)
A Data Scientist wants to build an LLM-based application for categorizing transcripts of customer support calls into predefined categories in Snowflake. A large data set of labeled transcripts is available. The categories are not expected to change over time, but inference will need to be run on a very large number of transcripts. The first priority for the evaluation methodology is to optimize costs, and the second priority is accuracy.
What should the Data Scientist do to meet these requirements?
A. Use the training data set to fine-tune the mistral-7b model and run inference with the SNOWFLAKE.CORTEX.COMPLETE function with the fine-tuned model.
B. Use the training data set to fine-tune the mixtral-8x7b model and run inference with the SNOWFLAKE.CORTEX.COMPLETE function with the fine-tuned model.
C. Use the mixtral-8x7b model to run inference with the SNOWFLAKE.CORTEX.COMPLETE function and include as much training data as possible in the prompt.
D. Use the mistral-7b model to run inference with the SNOWFLAKE.CORTEX.COMPLETE function and include as much training data as possible in the prompt.
Answer: A
Question #5 (Topic: Exam A)
A clothing company wants to add new apparel based on customer requests. In order to prioritize the requests, the company collects customer demographic surveys that include age. The company is not concerned with a customer’s specific age, but instead what age group the customer belongs to (for example, Youth, Junior, Adult).
Which approach will prepare the data for analysis?
A. Use the K-Means algorithm to cluster the data.
B. Group the data into bin ranges aligned with the age group.
C. Group the data into bin ranges aligned to a customer’s specific age.
D. Group the data into bin ranges aligned to each purchased product.
Answer: B
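Binning by age group can be sketched in a few lines. The question names the groups but not their ranges, so the cut points below (13 and 18) are assumptions chosen only for illustration, as are the sample ages.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical cut points: under 13 is Youth, 13 to 17 is Junior, 18+ is Adult.
BIN_EDGES = [13, 18]
BIN_LABELS = ["Youth", "Junior", "Adult"]

def age_group(age):
    """Map an exact age to the label of the bin it falls into."""
    return BIN_LABELS[bisect_right(BIN_EDGES, age)]

ages = [8, 12, 13, 17, 18, 42]  # made-up survey ages
groups = [age_group(a) for a in ages]
counts = Counter(groups)

print(groups)
print(counts)
```

Unlike K-Means, which would discover its own clusters, fixed bin edges guarantee that every customer lands in one of the predefined groups the company cares about.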