Which of the following statements are TRUE regarding the use of Data Click to load data into BigInsights? (Choose two.)
A. Big SQL cannot be used to access the data moved in by Data Click because the data is in Hive
B. You must import metadata for all sources and targets that you want to make available for Data Click activities
C. Connections from the relational database source to HDFS are discovered automatically from within Data Click
D. Hive tables are automatically created every time you run an activity that moves data from a relational database into HDFS
E. HBase tables are automatically created every time you run an activity that moves data from a relational database into HDFS
Which statement about the Jaql programming language is TRUE?
A. Jaql always produces a MapReduce job, but Combiner functionality is optional
B. Jaql includes the following operators: filter, extend, groupby, combine, and transform
C. Data that is read from multiple blocks (splits) is always processed in parallel by MapReduce
D. The read operator loads data from different sources and formats, and then converts this data into JSON format for internal processing by the Jaql interpreter
Which of the following are capabilities of the Apache Spark project?
A. Large scale machine learning
B. Large scale graph processing
C. Live data stream processing
D. All of the above
Besides PDF and MS Word, which document formats are available when exporting a redacted document using the Optim Review Tool?
A. TIFF and CSF
B. TIFF and PNG
C. JPEG and PNG
D. Plain Text and CSV
Which of the following statements is TRUE regarding search visualization with Apache Hue?
A. Hue utilizes Java libraries which must be installed on your system
B. The Hue Beeswax application enables you to perform queries on HBase
C. For optimal performance, the Hue Server should be one of the nodes within your Hadoop cluster
D. The Hue Sqoop UI allows transferring data from a relational database to Hadoop, but not from Hadoop to a relational database
Which source operator detects SPSS Collaboration and Deployment Services notification events for a specific SPSS Modeler file and downloads the indicated file version for the refreshed scoring branch?
A. SPSSPublish operator
B. SPSSScoring operator
C. SPSSModeler operator
D. SPSSRepository operator
Which is a benefit of row-oriented table design?
A. When writing a new row, if all of the row data is supplied at the same time, the entire row can be written with a single disk seek
B. When columns of a single row are required at the same time, the entire row can be retrieved with a single disk seek regardless of row size
C. When new values of a column are supplied for all rows at once, that column data can be written efficiently and replace old column data without touching any other columns for the rows
D. When an aggregate needs to be computed over many rows but only a notably smaller subset of all columns of data, reading that smaller subset of data can be faster than reading all data
For what purpose are SPSS models embedded within an InfoSphere Streams application?
A. To provide high availability
B. To score streaming data using existing models
C. To create new models based on streaming data
D. To ingest and parse binary and other complex data types
Which of the following is most commonly used by Hadoop to move data between clusters?
A. Pig
B. FTP
C. JAQL
D. distcp
Which of the following is not a data-processing operation that is supported in Pig Latin?
A. filter
B. joins
C. group by
D. logistic regression
Which ONE of the following statements regarding Sqoop is TRUE?
A. By default, data is compressed with Sqoop
B. Sqoop can only read committed transactions from a source database, not uncommitted ones
C. When performing parallel imports, Sqoop always uses the primary key column in a table as the splitting column
D. When performing parallel imports, each degree of parallelism corresponds to a concurrent database connection
Which of the following must happen before the Big SQL EXPLAIN command can execute?
A. Run the ANALYZE command
B. Set the COMPATIBILITY_MODE global variable
C. Execute the SET HADOOP PROPERTY command
D. Call the SYSPROC.SYSINSTALLOBJECTS procedure
When indexing a Hive Table, which of the following is TRUE?
A. Hive tables do not support indexes
B. It increases query speed without the need for additional disk space
C. It does not increase query speed but makes data insert/delete/update faster
D. It increases query speed but requires additional processing time for data insert/update/delete and needs more disk space
Suppose that you have some log files that you need to load into HBase.
What tool could you use to perform a bulk load of the log files?
A. Import
B. Fastload
C. ImportTsv
D. None of the above
How many Job Trackers can be found in a MapReduce v1 cluster?
A. One per cluster
B. One per data node
C. One for each Mapper
D. One for each Reducer