Which statement is true in the context of evaluation metrics for machine learning algorithms?
A. A random classifier has an AUC (area under the ROC curve) of 0.5
B. Using only one evaluation metric is sufficient
C. The F-score is always equal to precision
D. Recall of 1 (100%) is always a good result
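As an illustration of the AUC option above, the following minimal sketch estimates the AUC of a classifier that assigns purely random scores. The helper name `roc_auc` and the synthetic data are assumptions made for this example; the rank-based formula used here assumes there are no tied scores.

```python
import random

def roc_auc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney U) formulation.

    Assumes binary labels (0/1) and no tied scores.
    """
    # Sort sample indices by score, lowest first, and sum the ranks of the positives.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    rank_sum = sum(rank for rank, i in enumerate(order, start=1) if labels[i] == 1)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

rng = random.Random(0)
labels = [rng.randint(0, 1) for _ in range(20000)]
scores = [rng.random() for _ in range(20000)]  # scores carry no information about the labels

auc = roc_auc(labels, scores)
print(f"AUC of random scores: {auc:.3f}")
```

Because the scores are independent of the labels, the estimate should land close to 0.5, while a perfect ranking gives 1.0.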
When should the median be used instead of the mean for imputing missing data?
A. for skewed data
B. for real numbers
C. for normally distributed data
D. for large data sets
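To make the difference between the two imputation choices concrete, here is a small sketch on an invented sample with one extreme value. The numbers and variable names are assumptions for illustration only.

```python
import statistics

# Hypothetical sample with missing entries (None) and one extreme value.
values = [30, 32, None, 35, 38, 40, None, 400]
observed = [v for v in values if v is not None]

mean_fill = statistics.mean(observed)      # pulled upward by the extreme value
median_fill = statistics.median(observed)  # stays near the bulk of the data

imputed_with_median = [median_fill if v is None else v for v in values]

print(f"mean fill:   {mean_fill:.2f}")
print(f"median fill: {median_fill:.2f}")
```

In this sample the single value of 400 drags the mean far above every other observation, whereas the median stays among the typical values.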
Given the following matrix multiplication:
What is the value of P?
A. ?
B. 17
C. 12
D. ?
After a Jupyter notebook and a CSV data file are imported into an IBM Watson Studio project in the IBM Public Cloud, it is discovered that the notebook code can no longer access the CSV file. What is the most likely reason for this problem?
A. CSV files cannot be used as data sources in Watson Studio.
B. The CSV file was converted to a binary blob and must be converted in the notebook code.
C. The CSV file is stored in a Cloud Object Storage.
D. The CSV file is stored in a Watson Machine Learning instance and is only accessible via REST API.
Which test is applied to determine the relationship between two categorical variables?
A. paired t-test
B. chi-squared test
C. z test
D. t-test
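For reference, a test of association between two categorical variables is typically computed from a contingency table of counts. The sketch below computes the Pearson chi-squared statistic by hand; the table values and the function name `chi_squared_statistic` are invented for this example, and no p-value is computed.

```python
def chi_squared_statistic(table):
    """Pearson chi-squared statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Hypothetical counts: rows are one category, columns are the other.
table = [[30, 10],
         [20, 40]]
stat, dof = chi_squared_statistic(table)
print(f"chi-squared = {stat:.3f}, degrees of freedom = {dof}")
```

With one degree of freedom, a statistic above roughly 3.84 would reject independence at the 5% level.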
With only limited labeled data available, how might a neural network use case be realized?
A. by assigning random labels
B. by increasing the depth of the neural network
C. by creating random data
D. by using a customized pre-trained model
What is an example of a supervised machine learning algorithm that can be applied to a continuous numeric response variable?
A. linear regression
B. k-means
C. local outlier factor (LOF)
D. naive Bayes
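As a reminder of what fitting a continuous numeric response looks like in practice, the following sketch fits a straight line by ordinary least squares. The data points and the helper name `fit_line` are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical labeled training data: inputs xs with continuous targets ys.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]

slope, intercept = fit_line(xs, ys)
prediction = slope * 6 + intercept  # predict the response for an unseen input
print(f"slope={slope:.2f}, intercept={intercept:.2f}, prediction at x=6: {prediction:.2f}")
```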
What are two key characteristics of cloud architecture that could benefit AI applications? (Choose two.)
A. constant attention needed for maintenance and support of the cloud platform
B. capable of managing and handling dynamic workloads with automatic recovery from failures
C. hybrid clouds enable the deployment of distributed large neural networks
D. support for Common Business-Oriented Language (COBOL) applications
E. the hardware requirement can be scaled up as per the demand
Which IBM Watson Machine Learning deployment method offers the ultimate flexibility in deploying a machine learning model?
A. Watson Machine Learning Python client
B. Watson Machine Learning FORTRAN client
C. Watson Studio Project
D. Watson Machine Learning REST API
What is a class of machine learning problems where the algorithm builds a mathematical model from a set of data that contains both the inputs and the desired outputs?
A. unsupervised learning
B. mentoring
C. reinforcement learning
D. supervised learning
Consider an ML application deployed using Kubernetes whose output depends on data that is constantly stored in the model. If the system needs to scale based on available CPUs, what feature should be enabled?
A. persistent storage
B. vertical pod autoscaling
C. horizontal pod autoscaling
D. node self-registration mode
What are the components that make up time series data?
A. trend, noise, covariance
B. trend, noise, kurtosis
C. trend, seasonality, causation
D. trend, seasonality, noise
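One way to picture how a time series is built from separate components is to generate a synthetic one additively and then smooth it. In the sketch below, the coefficients, the 12-step period, and the additive model are all assumptions chosen for illustration.

```python
import math
import random

rng = random.Random(42)
period = 12  # assumed season length, e.g. monthly data with a yearly cycle

series = []
for t in range(120):
    trend = 0.5 * t                                        # slow long-term movement
    seasonal = 10 * math.sin(2 * math.pi * t / period)     # repeating pattern
    noise = rng.gauss(0, 1)                                # irregular fluctuation
    series.append(trend + seasonal + noise)

# Averaging over one full period cancels the seasonal term and damps the noise,
# leaving a rough estimate of the trend at the centre of each window.
smoothed = [sum(series[t:t + period]) / period for t in range(len(series) - period + 1)]

print(f"first window average: {smoothed[0]:.2f}")
```

Each window average approximates the trend value at the middle of that window, since the sine term sums to zero over a full period.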
A data scientist is exploring transaction data from a chain of stores with several locations. The data includes store number, date of sale, and purchase amount. If the data scientist wants to compare total monthly sales between stores, which two options would be good ways to aggregate the data? (Choose two.)
A. Find the sum of the transaction prices
B. Select the largest transaction amount by month and store
C. Write a GROUP BY query
D. Plot a time series plot of transaction amounts
E. Generate a pivot table
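For a concrete picture of aggregating such transaction data by store and month, here is a minimal sketch using an in-memory SQLite database followed by a simple reshaping step. The table name `sales`, the column names, and the sample rows are hypothetical.

```python
import sqlite3
from collections import defaultdict

# Hypothetical transactions: (store number, date of sale, purchase amount).
rows = [
    (1, "2023-01-05", 20.0),
    (1, "2023-01-20", 30.0),
    (1, "2023-02-03", 10.0),
    (2, "2023-01-11", 15.0),
    (2, "2023-02-14", 25.0),
    (2, "2023-02-28", 5.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (store INTEGER, sale_date TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

# Total sales per store and calendar month.
monthly = conn.execute(
    "SELECT store, strftime('%Y-%m', sale_date) AS month, SUM(amount) "
    "FROM sales GROUP BY store, month ORDER BY store, month"
).fetchall()

# Reshape into a month-by-store layout, similar to a pivot table.
pivot = defaultdict(dict)
for store, month, total in monthly:
    pivot[month][store] = total

for month in sorted(pivot):
    print(month, pivot[month])
```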
Which algorithm is best suited if a client needs full explainability of the machine learning model?
A. decision tree
B. logistic regression
C. support vector machine (SVM)
D. recurrent neural network
Which is the most important thing to ensure while collecting data?
A. samples collected are skewed with each other
B. samples collected are all strongly correlated with each other
C. samples collected adequately cover the space of all possible scenarios
D. samples collected focus only on the most common cases