Q11. What does the Mapper do?

Ans: Maps are the individual tasks that transform input records into intermediate records. The transformed intermediate records do not need to be of the same type as the input records. A given input pair may map to zero or many output pairs.
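
As a minimal sketch (the class and variable names here are illustrative, not from the original answer), a word-count-style Mapper shows both points: the intermediate (Text, IntWritable) pair type differs from the (LongWritable, Text) input type, and a single input line may emit zero or many output pairs.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Input pair:  (LongWritable byte offset, Text line)
// Output pair: (Text word, IntWritable count) -- a different type
public class TokenizerMapper
    extends Mapper<LongWritable, Text, Text, IntWritable> {

  private static final IntWritable ONE = new IntWritable(1);
  private final Text word = new Text();

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    // One input line may map to zero, one, or many output pairs.
    StringTokenizer itr = new StringTokenizer(value.toString());
    while (itr.hasMoreTokens()) {
      word.set(itr.nextToken());
      context.write(word, ONE);
    }
  }
}
```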

Q12. What is an InputSplit in MapReduce?

Ans: An InputSplit is a logical representation of a unit (a chunk) of input work for a map task; e.g., a filename and a byte range within that file to process, or a set of rows in a text file.
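
As a small illustrative sketch (the class name is made up), a map task can inspect its own InputSplit through the Context; for file-based input it is typically a FileSplit carrying exactly such a filename and byte range.

```java
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

public class SplitAwareMapper extends Mapper<LongWritable, Text, Text, Text> {

  @Override
  protected void setup(Context context) {
    // For file-based InputFormats, the logical split is a FileSplit:
    // a file path plus a byte range within that file.
    FileSplit split = (FileSplit) context.getInputSplit();
    System.out.println("Processing " + split.getPath()
        + " from byte " + split.getStart()
        + " for " + split.getLength() + " bytes");
  }
}
```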

Q13. What is the InputFormat?

Ans: The InputFormat is responsible for enumerating (itemising) the InputSplits and for producing a RecordReader that turns those logical work units into actual physical input records.
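
A minimal sketch of these two responsibilities, assuming the standard file-based APIs (the class name here is hypothetical): FileInputFormat already supplies the getSplits() enumeration, so a custom format only needs to produce the RecordReader.

```java
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.LineRecordReader;

// FileInputFormat already implements getSplits(), the "enumerate the
// InputSplits" half; we only supply the RecordReader that turns a
// logical split into concrete (key, value) records.
public class MyTextInputFormat extends FileInputFormat<LongWritable, Text> {

  @Override
  public RecordReader<LongWritable, Text> createRecordReader(
      InputSplit split, TaskAttemptContext context) {
    return new LineRecordReader();
  }
}
```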

Q14. Where do you specify the Mapper Implementation?

Ans: Generally, the Mapper implementation is specified in the Job itself.
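
For example, in a driver sketch (reusing the hypothetical TokenizerMapper from Q11), the Mapper class is registered on the Job via setMapperClass().

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;

public class JobSetup {
  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(JobSetup.class);

    // The Mapper implementation is specified on the Job itself:
    job.setMapperClass(TokenizerMapper.class);

    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    // Input/output paths and job submission omitted for brevity.
  }
}
```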

Q15. How is the Mapper instantiated in a running job?

Ans: The Mapper itself is instantiated in the running job, and will be passed a MapContext object which it can use to configure itself.

Q16. What are the methods in the Mapper class?

Ans: The Mapper contains the run() method, which calls its own setup() method only once, then calls the map() method for each input record, and finally calls its cleanup() method. You can override all of the above methods in your code.
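
A skeleton sketch of the overridable lifecycle methods (the class name is hypothetical); run() drives them in this order, as Q20 describes.

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class LifecycleMapper extends Mapper<LongWritable, Text, Text, Text> {

  @Override
  protected void setup(Context context) {
    // Called once per task by run(), before any map() call:
    // open connections, load side data, read configuration.
  }

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    // Called once for every input record in the task's InputSplit.
    context.write(new Text(String.valueOf(key.get())), value);
  }

  @Override
  protected void cleanup(Context context) {
    // Called once per task by run(), after the last map() call.
  }
}
```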

Q17. What happens if you don’t override the Mapper methods and keep them as they are?

Ans: If you do not override any methods (leaving even map as-is), it will act as the identity function, emitting each input record as a separate output.
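
For instance, this deliberately empty subclass (a hypothetical name) behaves as an identity mapper; note that the output types must then match the input types, since the inherited map() writes each input pair through unchanged.

```java
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// With no methods overridden, the inherited map() simply writes each
// (key, value) pair through unchanged, so the output types must match
// the input types.
public class IdentityPassThroughMapper
    extends Mapper<LongWritable, Text, LongWritable, Text> {
  // Intentionally empty: behaves as the identity function.
}
```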

Q18. What is the use of the Context object?

Ans: The Context object allows the Mapper to interact with the rest of the Hadoop system. It includes configuration data for the job, as well as interfaces which allow it to emit output.
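
A short sketch of those interactions (the class name and the "demo.separator" configuration key are made up for illustration): reading configuration, updating a counter, and emitting output all go through the Context.

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class ContextDemoMapper extends Mapper<LongWritable, Text, Text, Text> {

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    // Read job configuration through the Context
    // ("demo.separator" is a made-up key for this sketch).
    String sep = context.getConfiguration().get("demo.separator", ",");

    // Report statistics back to the framework via a counter.
    context.getCounter("demo", "records.seen").increment(1);

    // Emit output through the Context.
    String[] parts = value.toString().split(sep, 2);
    if (parts.length == 2) {
      context.write(new Text(parts[0]), new Text(parts[1]));
    }
  }
}
```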

Q19. How can you add arbitrary key-value pairs in your Mapper?

Ans: You can set arbitrary (key, value) pairs of configuration data in your Job, e.g. with Job.getConfiguration().set("myKey", "myVal"), and then retrieve this data in your Mapper with Context.getConfiguration().get("myKey"). Retrieving such data is typically done in the Mapper's setup() method.
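
Putting the two halves together in one sketch (class names are illustrative; "myKey" and "myVal" come from the answer above, and "defaultVal" is an assumed fallback):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;

public class ConfigPassingExample {

  public static class MyMapper extends Mapper<LongWritable, Text, Text, Text> {
    private String myVal;

    @Override
    protected void setup(Context context) {
      // Retrieve the value the driver placed in the configuration
      // ("defaultVal" is an assumed fallback for this sketch).
      myVal = context.getConfiguration().get("myKey", "defaultVal");
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "config passing");
    job.setJarByClass(ConfigPassingExample.class);
    // Set an arbitrary (key, value) pair on the job's configuration.
    job.getConfiguration().set("myKey", "myVal");
    job.setMapperClass(MyMapper.class);
    // Input/output setup and job submission omitted for brevity.
  }
}
```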

Q20. How does the Mapper’s run() method work?

Ans: The Mapper.run() method calls setup() once, then calls map(KeyInType, ValInType, Context) for each key/value pair in the InputSplit for that task, and finally calls cleanup().
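
Overriding run() just to make the default driving loop visible; this sketch mirrors the shape of the base-class implementation in recent Hadoop versions (the class name is hypothetical).

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class RunLoopMapper extends Mapper<LongWritable, Text, Text, Text> {

  @Override
  public void run(Context context) throws IOException, InterruptedException {
    setup(context);                      // once, before any records
    try {
      while (context.nextKeyValue()) {   // iterate the task's InputSplit
        map(context.getCurrentKey(), context.getCurrentValue(), context);
      }
    } finally {
      cleanup(context);                  // once, after the last record
    }
  }
}
```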
