IBM Big Data Engineer v1.0 (C2090-101)

Page:    1 / 8   
Total 106 questions

Which is a benefit of row oriented table design?

  • A. When writing a new row, if all of the row data is supplied at the same time the entire row can be written with a single disk seek
  • B. When columns of a single row are required at the same time, the entire row can be retrieved with a single disk seek regardless of row size
  • C. When new values of a column are supplied for all rows at once, that column data can be written efficiently and replace old column data without touching any other columns for the rows
  • D. When an aggregate needs to be computed over many rows but only a notably smaller subset of all columns of data, reading that smaller subset of data can be faster than reading all data


Answer : A

Reference:
http://www.ijoart.org/docs/Column-Oriented-Databases-to-Gain-High-Performance-for-Data-Warehouse-System.pdf
(7)
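
The row versus column trade-off in the options above can be sketched with a toy in-memory model. This is only an illustration under stated assumptions: the class names, the column list, and the idea of counting "regions touched" as a stand-in for disk seeks are all hypothetical and do not describe any real storage engine.

```python
# Toy model: a "region" stands in for one contiguous area on disk.
COLUMNS = ["id", "name", "city", "amount"]  # hypothetical schema


class RowStore:
    """Keeps all columns of a row together in one record."""

    def __init__(self):
        self.records = []

    def insert(self, row):
        # The whole row lands in one contiguous record.
        self.records.append(tuple(row[c] for c in COLUMNS))
        return 1  # regions touched

    def sum_column(self, col):
        idx = COLUMNS.index(col)
        total, values_read = 0, 0
        for record in self.records:
            values_read += len(record)  # every column of every row is read
            total += record[idx]
        return total, values_read


class ColumnStore:
    """Keeps each column in its own contiguous array."""

    def __init__(self):
        self.columns = {c: [] for c in COLUMNS}

    def insert(self, row):
        # One append per column array.
        for c in COLUMNS:
            self.columns[c].append(row[c])
        return len(COLUMNS)  # regions touched

    def sum_column(self, col):
        values = self.columns[col]
        return sum(values), len(values)  # only the requested column is read


rows = [
    {"id": 1, "name": "Ann", "city": "Oslo", "amount": 10.0},
    {"id": 2, "name": "Bob", "city": "Rome", "amount": 20.0},
    {"id": 3, "name": "Cy", "city": "Lima", "amount": 30.0},
]
row_store, col_store = RowStore(), ColumnStore()
row_insert_cost = [row_store.insert(r) for r in rows]
col_insert_cost = [col_store.insert(r) for r in rows]
```

In this model a full-row insert touches one region in the row store and one region per column in the column store, while an aggregate over a single column reads far fewer values from the column store.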

What Redaction feature needs to be selected when manually redacting a form through the Optim Review Tool?

  • A. Text Redaction
  • B. Image Redaction
  • C. Region Redaction
  • D. Redact by Information Type


Answer : D

Considering the following properties:
✑ Automated creation of target database schema and bulk extract and load
✑ Real-time replication subscriptions (with CDC)
✑ Managed workload for optimized performance of potentially thousands of artifacts
✑ Ensured governance around both data access as well as for metadata capture (to support data lineage and impact assessment)
Which tool supports all of the above?

  • A. Pig
  • B. JAQL
  • C. Data Click
  • D. BigSheets


Answer : C

Reference:
http://meta7.forsythe.com/_wss/clients/508/news_feed/20151130202508951.pdf

Which of the following statements regarding Sqoop is TRUE?

  • A. Output files are always delimited text files in HDFS
  • B. The import of a single database table results in multiple files in HDFS
  • C. When exporting data back to a database, a target table will be created if it doesn't already exist
  • D. Sqoop processes the entire database table as a single unit rather than row-by-row for enhanced performance


Answer : B

Reference:
https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html
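
A Sqoop import runs as several parallel map tasks, and each task writes its own output file. The sketch below imitates that idea in plain Python. It is a simplification under assumptions: `split_ranges` and `import_table` are hypothetical helpers, the even division of an integer key range is only an approximation of how splits are chosen, and nothing here calls Sqoop itself.

```python
def split_ranges(lo, hi, num_mappers=4):
    """Divide the inclusive key range [lo, hi] into contiguous chunks."""
    total = hi - lo + 1
    num = min(num_mappers, total)  # never more chunks than keys
    base, extra = divmod(total, num)
    ranges, start = [], lo
    for i in range(num):
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def import_table(rows, key, num_mappers=4):
    """Assign each row to the output file of the mapper that owns its key."""
    keys = [r[key] for r in rows]
    files = {}
    bounds = split_ranges(min(keys), max(keys), num_mappers)
    for i, (lo, hi) in enumerate(bounds):
        # Map-only jobs conventionally name their outputs part-m-NNNNN.
        files[f"part-m-{i:05d}"] = [r for r in rows if lo <= r[key] <= hi]
    return files


table = [{"id": i, "value": i * i} for i in range(1, 11)]
files = import_table(table, key="id")
```

With ten rows and four hypothetical mappers, the single table ends up spread across four part files.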

Which of the following statements regarding importing streaming data from InfoSphere Streams into Hadoop is TRUE?

  • A. InfoSphere Streams utilizes Flume to interface to Hadoop
  • B. The HDFSFileSink operator writes files in parallel to a Hadoop Distributed File System
  • C. Buffering techniques are used to process incoming streams from InfoSphere Streams
  • D. When you use the HDFS operators to access GPFS, you must install InfoSphere Streams on an InfoSphere BigInsights data node


Answer : C

Reference:
https://books.google.com.pk/books?id=JWfRAgAAQBAJ&pg=PA147&lpg=PA147&dq=IBm+Buffering+techniques+are+used+to+process+incoming+streams+from+InfoSphere+Streams&source=bl&ots=Z6XhA0-Owk&sig=ACfU3U3T9ydrZHWTMvB31qQOyf6FtoDQgw&hl=en&sa=X&ved=2ahUKEwj6wsnvqvfoAhVUSxUIHWNDAycQ6AEwA3oECBEQAQ#v=onepage&q=IBm%20Buffering%20techniques%20are%20used%20to%20process%20incoming%20streams%20from%20InfoSphere%20Streams&f=false

Which BigInsights components are essential for Big Match operations?

  • A. Social Data Analytics (SDA)
  • B. HDFS, HBase, and BigSheets
  • C. HDFS and HBase
  • D. MapReduce framework, BigInsights cluster management


Answer : D

Reference:
https://www.ibm.com/support/knowledgecenter/SSWSR9_11.3.0/com.ibm.swg.im.mdmhs.pmebi.doc/topics/pme_bi_architecture.html

Bloom Filter in HBase can be used to determine which of the following?

  • A. Whether a record exists in a region server
  • B. Whether a record does not exist in a region server
  • C. Items in the catalog database
  • D. None of the above


Answer : D

Reference:
https://www.ibm.com/support/knowledgecenter/en/SSPT3X_3.0.0/com.ibm.swg.im.infosphere.biginsights.bigsql.doc/doc/bsql_create_hbase_table.html
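
For background on the question above, a Bloom filter is a probabilistic set-membership structure: a "no" answer is definite, while a "yes" answer may be a false positive. The following is a minimal generic sketch, not HBase's implementation. The class name, the bit-array size, the number of hashes, and the use of SHA-256 with a seed prefix are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: no false negatives, possible false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, key):
        # Derive several bit positions by hashing the key with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key):
        # False means the key was definitely never added.
        # True means the key was possibly added.
        return all(self.bits[pos] for pos in self._positions(key))


row_keys = BloomFilter()
for key in ["row-001", "row-002", "row-003"]:
    row_keys.add(key)
```

A reader can skip a lookup entirely whenever `might_contain` returns `False`, which is the usual reason to keep such a filter next to stored data.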

Which statement is TRUE when loading data into Hadoop and creating Big SQL tables?

  • A. It is optional to have INSERT privileges granted to LOAD into a table with the APPEND option
  • B. You can either have INSERT or DELETE privileges granted to LOAD into a table with the OVERWRITE option
  • C. Authentication information is not necessary when you connect to a secured InfoSphere BigInsights cluster
  • D. By using the LOAD HADOOP USING command, you can import data from external data sources into target Big SQL tables


Answer : D

Reference:
https://www.ibm.com/support/knowledgecenter/en/SSCRJT_5.0.1/com.ibm.swg.im.bigsql.db2biga.doc/doc/biga_load_from.html

Which Big SQL statement can be used to store a single row into an HBase table?

  • A. LOAD HIVE DATA
  • B. INSERT INTO HBASE
  • C. CREATE HBASE TABLE
  • D. LOAD USING … INTO HBASE TABLE


Answer : B

Reference:
https://www.ibm.com/support/knowledgecenter/en/SSPT3X_3.0.0/com.ibm.swg.im.infosphere.biginsights.bigsql.doc/doc/bsql_insert.html

For what purpose are SPSS models embedded within an InfoSphere Streams application?

  • A. To provide high availability
  • B. To score streaming data using existing models
  • C. To create new models based on streaming data
  • D. To ingest and parse binary and other complex data types


Answer : B

Reference:
https://www.ibm.com/developerworks/data/tutorials/dm-1109spssscoringinfospherestreams1/dm-1109spssscoringinfospherestreams1-pdf.pdf

Setting HDFS folder permissions most directly relates to which of these PCI compliance requirements?

  • A. Protect stored data
  • B. Install and maintain a firewall
  • C. Encrypt the transmission of sensitive data
  • D. Assign a unique ID to each person with access to data


Answer : A

Which of the following is most commonly used by Hadoop to move data between clusters?

  • A. Pig
  • B. FTP
  • C. JAQL
  • D. distcp


Answer : D

Reference:
https://developer.ibm.com/hadoop/2016/02/05/fast-can-data-transferred-hadoop-clusters-using-distcp/

A large Telecom company wants to store data from multiple databases into Hadoop. They plan to do bulk loads of data into Hadoop and run analytical queries.
Which data store would be ideal for this scenario?

  • A. Hive
  • B. HBase
  • C. BigSheets
  • D. Apache Spark


Answer : A

Reference:
https://developer.ibm.com/recipes/tutorials/big-data-and-hadoop-on-ibm-cloud/

Which of the following is not a data-processing operation supported in Pig Latin?

  • A. filter
  • B. joins
  • C. group by
  • D. logistic regression


Answer : D

Reference:
https://pig.apache.org/docs/r0.15.0/basic.html
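
Filtering, grouping, and joining are relational operations of the kind Pig Latin expresses with its FILTER, GROUP, and JOIN operators. The sketch below shows what each one computes, using plain Python rather than Pig. The helper names and the sample relations are hypothetical, and the code makes no claim about how Pig executes these operators.

```python
from collections import defaultdict

# Hypothetical sample relations.
orders = [
    {"user": "ann", "amount": 30},
    {"user": "bob", "amount": 5},
    {"user": "ann", "amount": 20},
]
users = [
    {"user": "ann", "city": "Oslo"},
    {"user": "bob", "city": "Rome"},
]


def filter_rows(rows, predicate):
    """Keep only the rows for which the predicate holds."""
    return [r for r in rows if predicate(r)]


def group_by(rows, key):
    """Collect rows into one bag per distinct key value."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[key]].append(r)
    return dict(groups)


def inner_join(left, right, key):
    """Pair every left row with every right row that shares the key."""
    index = defaultdict(list)
    for r in right:
        index[r[key]].append(r)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


large_orders = filter_rows(orders, lambda r: r["amount"] >= 10)
orders_by_user = group_by(orders, "user")
orders_with_city = inner_join(orders, users, "user")
```

Each helper is a few lines of relational logic; a statistical model such as logistic regression is a different kind of computation and is not one of these relational operators.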

What are the key elements that IBM Big Match Probabilistic Matching Engine leverages?

  • A. Wildcard and sorting
  • B. Wildcard and phonetics
  • C. Phonetics and searching
  • D. Phonetics and nicknames


Answer : D

Reference:
https://www.ibm.com/support/knowledgecenter/SSWSR9_11.6.0/com.ibm.mdmhs.txn.ref.doc/r_searchPersonProbabilistic.html
