Question 10
Your customer collects diagnostic data from its storage systems that are deployed at customer sites. The customer needs to capture and process this data by country in batches.
Why should the customer choose Hadoop to process this data?
Hadoop processes data on large clusters (10-50 max) on commodity hardware.
Hadoop is a batch data processing architecture.
Hadoop supports centralized computing of large data sets on large clusters.
Node failures can be dealt with by configuring failover with clusterware.
Hadoop processes data serially.
Correct answer: B
Explanation:
Hadoop was designed for batch processing. That means, take a large dataset in input all at once, process it, and write a large output. The very concept of MapReduce is geared towards batch and not real-time. With growing data, Hadoop enables you to horizontally scale your cluster by adding commodity nodes and thus keep up with query. In hadoop Map-reduce does the same job it will take large amount of data and process it in batch. It will not give immediate output. It will take time as per Configuration of system,namenode,task-tracker,job-tracker etc. Incorrect Answers:A: Yahoo! has by far the most number of nodes in its massive Hadoop clusters at over 42,000 nodes as of July 2011. C: Hadoop supports distributed computing of large data sets on large clustersE: Hadoop processes data in parallel.References: https://www.quora.com/What-is-batch-processing-in-hadoop
Hadoop was designed for batch processing. That means, take a large dataset in input all at once, process it, and write a large output. The very concept of MapReduce is geared towards batch and not real-time. With growing data, Hadoop enables you to horizontally scale your cluster by adding commodity nodes and thus keep up with query. In hadoop Map-reduce does the same job it will take large amount of data and process it in batch. It will not give immediate output. It will take time as per Configuration of system,namenode,task-tracker,job-tracker etc.
Incorrect Answers:
A: Yahoo! has by far the most number of nodes in its massive Hadoop clusters at over 42,000 nodes as of July 2011.
C: Hadoop supports distributed computing of large data sets on large clusters
E: Hadoop processes data in parallel.
References: https://www.quora.com/What-is-batch-processing-in-hadoop