
DATABRICKS-CERTIFIED-PROFESSIONAL-DATA-SCIENTIST Online Practice Questions and Answers

Questions 4

Regularization is a very important technique in machine learning to prevent overfitting. Optimizing with an L1 regularization term is harder than with an L2 regularization term because

A. The penalty term is not differentiable

B. The second derivative is not constant

C. The objective function is not convex

D. The constraints are quadratic
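
As a small numerical illustration of the two penalty terms named in this question, the sketch below compares one-sided finite-difference slopes of |w| and w² at w = 0. The function names and the step size `h` are illustrative assumptions, not part of the exam material.

```python
def l1_penalty(w):
    """Absolute-value (L1) penalty for a single weight."""
    return abs(w)


def l2_penalty(w):
    """Squared (L2) penalty for a single weight."""
    return w * w


def one_sided_slopes(f, x, h=1e-6):
    """Return the (left, right) finite-difference slopes of f at x."""
    left = (f(x) - f(x - h)) / h
    right = (f(x + h) - f(x)) / h
    return left, right


l1_left, l1_right = one_sided_slopes(l1_penalty, 0.0)
l2_left, l2_right = one_sided_slopes(l2_penalty, 0.0)

# L1: the left and right slopes disagree in sign, so |w| has no single
# derivative at w = 0.
# L2: both slopes shrink toward 0 as h shrinks, so w^2 is smooth at w = 0.
print("L1 slopes at 0:", l1_left, l1_right)
print("L2 slopes at 0:", l2_left, l2_right)
```

Gradient-based optimizers rely on a well-defined derivative, which is why L1-penalized objectives are usually handled with subgradient or proximal methods instead.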

Questions 5

A. 2.4

B. 24 0

C. .24

D. .48

E. 4.8

Questions 6

Which is an example of supervised learning?

A. PCA

B. k-means clustering

C. SVD

D. EM

E. SVM

Questions 7

A researcher is interested in how variables such as GRE (Graduate Record Exam) scores, GPA (grade point average), and the prestige of the undergraduate institution affect admission into graduate school. The response variable, admit/don't admit, is a binary variable.

Above is an example of:

A. Linear Regression

B. Logistic Regression

C. Recommendation system

D. Maximum likelihood estimation

E. Hierarchical linear models
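
To make the binary-response setup concrete, here is a minimal sketch of how a logistic model turns the three predictors into an admission probability. The coefficient values and the applicant's numbers are hypothetical, chosen only for illustration and not fitted to any data.

```python
import math


def sigmoid(z):
    """Map any real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def admit_probability(gre, gpa, prestige, coef):
    """Probability of admission under a logistic model with the given coefficients."""
    z = (
        coef["intercept"]
        + coef["gre"] * gre
        + coef["gpa"] * gpa
        + coef["prestige"] * prestige
    )
    return sigmoid(z)


# Hypothetical coefficients (assumptions for this sketch, not estimates).
coef = {"intercept": -4.0, "gre": 0.002, "gpa": 0.8, "prestige": -0.5}

p = admit_probability(gre=700, gpa=3.6, prestige=2, coef=coef)

# Thresholding the probability at 0.5 yields the binary admit / don't admit label.
decision = "admit" if p >= 0.5 else "don't admit"
print(round(p, 3), decision)
```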

Questions 8

Suppose A, B, and C are events. The probability of A given B, relative to P(·|C), is the same as the probability of A given B and C (relative to P). That is,

A. P(A,B|C) / P(B|C) = P(A|B,C)

B. P(A,B|C) / P(B|C) = P(B|A,C)

C. P(A,B|C) / P(B|C) = P(C|B,C)

D. P(A,B|C) / P(B|C) = P(A|C,B)
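
The relationship between conditioning on C first and conditioning on B and C together can be checked numerically. The sketch below builds a small joint distribution over three binary events and compares P(A,B|C) / P(B|C) with P(A|B,C) using exact fractions. The outcome weights are arbitrary assumptions made for this example.

```python
from fractions import Fraction
from itertools import product

# Hypothetical joint distribution over (A, B, C), each coded 0 or 1.
# The integer weights are arbitrary; they are normalized to probabilities.
outcomes = list(product([0, 1], repeat=3))
weights = dict(zip(outcomes, [1, 2, 3, 4, 5, 6, 7, 8]))
total = sum(weights.values())
joint = {outcome: Fraction(w, total) for outcome, w in weights.items()}


def prob(event):
    """Probability of the set of outcomes (a, b, c) for which event is true."""
    return sum(p for (a, b, c), p in joint.items() if event(a, b, c))


p_c = prob(lambda a, b, c: c == 1)
p_bc = prob(lambda a, b, c: b == 1 and c == 1)
p_abc = prob(lambda a, b, c: a == 1 and b == 1 and c == 1)

p_ab_given_c = p_abc / p_c   # P(A,B|C)
p_b_given_c = p_bc / p_c     # P(B|C)
p_a_given_bc = p_abc / p_bc  # P(A|B,C)

ratio = p_ab_given_c / p_b_given_c
print(ratio, p_a_given_bc)
```

Because `Fraction` arithmetic is exact, the two printed values can be compared directly without a floating-point tolerance.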

Questions 9

A. Logistic Regression

B. Support Vector Machine

C. Neural Network

D. Hidden Markov Models

E. None of the above

Questions 10

Which of the following skills does a data scientist require?

A. Web design skills to present the results of an algorithm with the best visuals.

B. He should be creative

C. Should possess good programming skills

D. Should be very good at mathematics and statistics

E. He should possess database administration skills.

Questions 11

Clustering is a type of unsupervised learning with the following goals

A. Maximize a utility function

B. Find similarities in the training data

C. Not to maximize a utility function

D. 1 and 2

E. 2 and 3

Questions 12

If you are trying to predict or forecast a discrete target value, which is the correct option?

A. Supervised Learning regression algorithms

B. Supervised Learning classification algorithms

C. Unsupervised Learning

D. Density estimation algorithm

Questions 13

Regularization is a very important technique in machine learning to prevent overfitting. Mathematically speaking, it adds a regularization term that keeps the coefficients from fitting the training data so closely that the model overfits. The difference between L1 and L2 is...

A. L2 is the sum of the square of the weights, while L1 is just the sum of the weights

B. L1 is the sum of the square of the weights, while L2 is just the sum of the weights

C. L1 gives Non-sparse output while L2 gives sparse outputs

D. None of the above
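
For reference, the two penalty terms can be computed directly from a weight vector. This is a minimal sketch that assumes the common textbook definitions, in which L1 sums the absolute values of the weights and L2 sums their squares. The example weights and function names are made up for illustration.

```python
def l1_term(weights):
    """L1 penalty: sum of the absolute values of the weights."""
    return sum(abs(w) for w in weights)


def l2_term(weights):
    """L2 penalty: sum of the squared weights."""
    return sum(w * w for w in weights)


# Hypothetical weight vector.
weights = [0.5, -2.0, 0.0, 3.0]

l1 = l1_term(weights)
l2 = l2_term(weights)
print(l1, l2)
```

In practice each term is multiplied by a regularization strength and added to the training loss. The large weight 3.0 contributes 3.0 to the L1 term but 9.0 to the L2 term, so L2 penalizes large weights more heavily.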

Questions 14

What is the considerable difference between L1 and L2 regularization?

A. L1 regularization has more accuracy of the resulting model

B. Size of the model can be much smaller with L1 regularization than with L2 regularization

C. L2 regularization can be of vital importance when the application is deployed in resource-tight environments such as cell phones.

D. All of the above are correct
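
The effect of each penalty on model size can be seen in a simplified one-coefficient setting. The sketch below assumes each coefficient is penalized independently around its unpenalized estimate z, so that the L1 solution is the soft-threshold of z and the L2 solution is z / (1 + lam). The estimates and the value of `lam` are hypothetical.

```python
def soft_threshold(z, lam):
    """Minimizer of 0.5 * (w - z)**2 + lam * abs(w): the L1-penalized solution."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def ridge_shrink(z, lam):
    """Minimizer of 0.5 * (w - z)**2 + 0.5 * lam * w**2: the L2-penalized solution."""
    return z / (1.0 + lam)


# Hypothetical unpenalized coefficient estimates and penalty strength.
unpenalized = [3.0, -0.4, 0.2, -1.5, 0.05]
lam = 0.5

l1_solution = [soft_threshold(z, lam) for z in unpenalized]
l2_solution = [ridge_shrink(z, lam) for z in unpenalized]

# Count how many coefficients each penalty leaves nonzero.
l1_nonzero = sum(1 for w in l1_solution if w != 0.0)
l2_nonzero = sum(1 for w in l2_solution if w != 0.0)
print(l1_solution, l1_nonzero)
print(l2_solution, l2_nonzero)
```

Under these assumptions the L1 penalty sets small coefficients exactly to zero, while the L2 penalty only scales them down, so every coefficient still has to be stored.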

Questions 15

Time series data may have a trend component that is quadratic in nature. Which pattern of data will indicate that the trend in the time series data is quadratic in nature?

A. Naive Bayesian classifier

B. Decision tree

C. Linear regression

D. K-means clustering
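
As background on quadratic trends, and not as an answer to the question as extracted, one standard check is differencing: for an evenly spaced series with a purely quadratic trend, the first differences change linearly and the second differences are constant. The series below is a made-up example generated from y = 2t² + 3t + 1.

```python
def differences(series):
    """Return the list of successive differences of a series."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


# Hypothetical noise-free series with a quadratic trend, t = 0..5.
series = [2 * t * t + 3 * t + 1 for t in range(6)]

first_diff = differences(series)
second_diff = differences(first_diff)

print(series)       # the raw values
print(first_diff)   # grows by a fixed step each period
print(second_diff)  # constant for a purely quadratic trend
```

With real, noisy data the second differences would only be roughly constant, so this check is a rough diagnostic rather than a formal test.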

Questions 16

Which of the following statements is true with regard to the linear regression model?

A. Ordinary Least Squares can be used to estimate the parameters in a linear model

B. A linear model tries to find multiple lines which can approximate the relationship between the outcome and input variables.

C. Ordinary Least Squares is a sum of the individual distances between each point and the fitted line of the regression model.

D. Ordinary Least Squares is a sum of the squared individual distances between each point and the fitted line of the regression model.
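
To ground the terminology in these options, here is a minimal sketch of ordinary least squares for a single input variable, using the closed-form slope and intercept. The data points and function names are invented for illustration.

```python
def fit_ols(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_residuals(xs, ys, slope, intercept):
    """Sum of squared vertical distances between each point and the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data points.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_ols(xs, ys)
best = sum_squared_residuals(xs, ys, slope, intercept)

# Any other line, such as one with a slightly steeper slope, has a larger
# sum of squared residuals than the fitted line.
worse = sum_squared_residuals(xs, ys, slope + 0.1, intercept)
print(slope, intercept, best, worse)
```

The fit produces one line, and the quantity it minimizes is the sum of squared residuals rather than the sum of raw distances.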

Questions 17

You are creating a regression model with the inputs income, education, and current debt of a customer. What could be the possible output from this model?

A. Customer fit as a good

B. Customer fit as acceptable or average category

C. expressed as a percent, that the customer will default on a loan

D. 1 and 3 are correct

E. 2 and 3 are correct

Questions 18

Which of the following are advantages of Support Vector Machines?

A. Effective in high dimensional spaces.

B. It is memory efficient

C. Possible to specify custom kernels

D. Effective in cases where the number of dimensions is greater than the number of samples

E. When the number of features is much greater than the number of samples, the method still gives good performance

F. SVMs directly provide probability estimates

Exam Name: Databricks Certified Professional Data Scientist Exam
Last Update: Apr 21, 2024
Questions: 138 Q&As
