Refer to the exhibit.
Answer : A
What is Hadoop?
Answer : A
Consider the example of an analysis for fraud detection on credit card usage. You will need to ensure higher-risk transactions that may indicate fraudulent credit card activity are retained in your data for analysis, and not dropped as outliers during pre-processing. What will be your approach for loading data into the analytical sandbox for this analysis?
Answer : A
Your customer provided you with 2,000 unlabeled records and asked you to separate them into three groups. What is the correct analytical method to use?
Answer : A
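Partitioning unlabeled records into a fixed number of groups is commonly done with k-means clustering. The sketch below is a minimal pure-Python illustration with k = 3, assuming two-dimensional numeric records. The function name `kmeans`, the sample points, and the deterministic choice of initial centroids are all assumptions made for illustration; they are not part of the question.

```python
import math

def kmeans(points, k, iterations=20):
    """Naive k-means for illustration: returns (centroids, labels)."""
    # Assumption: the first k points serve as initial centroids,
    # which keeps the example deterministic.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return centroids, labels

# Hypothetical sample data: three loose groups of 2-D records.
records = [(1.0, 1.0), (5.0, 5.0), (9.0, 1.0),
           (1.2, 0.8), (5.1, 4.9), (8.8, 1.1),
           (0.9, 1.1), (4.9, 5.2), (9.1, 0.9)]
centroids, labels = kmeans(records, k=3)
```

In practice the initial centroids are usually chosen at random and the algorithm is run several times, since the result depends on the starting points.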
You are using MADlib for Linear Regression analysis. Which value does the statement return?
SELECT (linregr(depvar, indepvar)).r2 FROM zeta1;
Answer : A
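In MADlib, the `r2` field of the `linregr` result holds the coefficient of determination (R-squared) of the fitted model. As a rough illustration of what that number measures, the following pure-Python sketch fits a one-variable least-squares line and computes R-squared from the residuals. The function name `r_squared` and the sample values are hypothetical; they are not taken from the `zeta1` table.

```python
def r_squared(x, y):
    """R-squared of a simple least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Least-squares estimates for slope and intercept.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared = 1 - (residual sum of squares / total sum of squares).
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Hypothetical sample data, roughly following y = 2x.
indepvar = [1.0, 2.0, 3.0, 4.0, 5.0]
depvar = [2.1, 3.9, 6.2, 7.8, 10.1]
r2 = r_squared(indepvar, depvar)
```

A value close to 1 means the line explains most of the variance in the dependent variable.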
Which word or phrase completes the statement: “A theater actor is to ‘artistic and expressive’ as a data scientist is to ____________”?
Answer : A
You submit a MapReduce job to a Hadoop cluster and notice that although the job was successfully submitted, it is not completing. What should you do?
Answer : A
The web analytics team uses Hadoop to process access logs. They now want to correlate this data with structured user data residing in their massively parallel database. Which tool should they use to export the structured data from Hadoop?
Answer : A
Refer to the exhibit.
Answer : A
Which word or phrase completes the statement: “A data warehouse is to a centralized database for reporting as an analytic sandbox is to a _______”?
Answer : A
What would be considered "Big Data"?
Answer : B
Refer to the exhibit.
Answer : A
What does the R code nv <- v[v < 1000] do?
Answer : A
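The R expression `v[v < 1000]` uses a logical index: it keeps the elements of `v` that are less than 1000, and `nv <- ...` assigns the result to `nv`. A rough Python analogue, using hypothetical sample values, is a list comprehension:

```python
# Hypothetical sample vector; the values are for illustration only.
v = [250, 1000, 999, 4200, 17]

# Keep only the elements strictly less than 1000, preserving order
# (mirrors R's logical indexing v[v < 1000]).
nv = [x for x in v if x < 1000]
```

The original `v` is left unchanged; only the filtered copy is bound to `nv`.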
A data scientist is given an R data frame, empdata, with the columns Age, Salary, Occupation, Education, and Gender. The data scientist would like to examine only the Salary and Occupation columns for ages greater than 40. Which command extracts the appropriate rows and columns from the data frame?
Answer : A
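In R, this kind of selection is usually written with a logical row index and a vector of column names, for example `empdata[empdata$Age > 40, c("Salary", "Occupation")]`. The sketch below mirrors the same row-and-column selection in plain Python, treating the data frame as a list of dictionaries. The sample rows are hypothetical.

```python
# Hypothetical rows standing in for the empdata data frame.
empdata = [
    {"Age": 35, "Salary": 52000, "Occupation": "Analyst",
     "Education": "BS", "Gender": "F"},
    {"Age": 47, "Salary": 88000, "Occupation": "Engineer",
     "Education": "MS", "Gender": "M"},
    {"Age": 40, "Salary": 61000, "Occupation": "Clerk",
     "Education": "HS", "Gender": "F"},
    {"Age": 58, "Salary": 97000, "Occupation": "Manager",
     "Education": "MBA", "Gender": "F"},
]

# Columns to keep, matching c("Salary", "Occupation") in the R form.
columns = ("Salary", "Occupation")

# Row filter (Age strictly greater than 40) plus column projection.
selected = [
    {col: row[col] for col in columns}
    for row in empdata
    if row["Age"] > 40
]
```

Note that the row with Age equal to 40 is excluded, because the condition is strictly greater than 40.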
What is the reason for using LOESS?
Answer : A
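LOESS (locally weighted regression) is generally used to fit a smooth curve through data without assuming a single global functional form: each fitted value comes from a small weighted regression on nearby points. The sketch below is a simplified pure-Python illustration, assuming a local linear fit with tricube weights. The function name `loess_point`, the `frac` parameter, and the sample data are hypothetical, and this is not the implementation behind R's `loess` function.

```python
def loess_point(x0, xs, ys, frac=0.5):
    """Estimate y at x0 with a locally weighted linear fit (illustrative)."""
    n = len(xs)
    # Number of neighbours that define the local window.
    k = max(2, int(round(frac * n)))
    # Bandwidth: distance from x0 to its k-th nearest neighbour.
    bandwidth = sorted(abs(x - x0) for x in xs)[k - 1]
    if bandwidth == 0:
        bandwidth = 1e-12  # guard against division by zero
    # Tricube weights: near points count most, far points get weight 0.
    weights = [(1 - min(abs(x - x0) / bandwidth, 1.0) ** 3) ** 3 for x in xs]
    sw = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / sw
    mean_y = sum(w * y for w, y in zip(weights, ys)) / sw
    # Weighted least-squares slope around the weighted means.
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mean_x) * (y - mean_y)
              for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return mean_y + slope * (x0 - mean_x)

# Hypothetical data with a non-linear trend (y = x squared).
xs = [float(i) for i in range(10)]
ys = [x * x for x in xs]
smoothed = [loess_point(x, xs, ys) for x in xs]
```

Evaluating `loess_point` over a grid of x values traces out the smoothed curve; a smaller `frac` follows the data more closely, while a larger one smooths more.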