Get all the questions covering the entire syllabus from here (2018-2019). This material is owned by HadoopExam.com. Please don't copy it; it's bad karma.
Question 19: You are working with a giant online retail com.................................. as well as application logs in real time, so they can apply machine learning in real time; they also need this data saved in an S3 bucket. Which of the following solutions are suitable for this requirement?
- Correct Answer
- You will create two scripts, one for the infrastructure logs and one for the application logs, and serve the data over TCP on separate ports. Your client application will then connect to these ports to read the data in real time.
- Correct Answer
- You will write a script containing MapReduce code that tags the application logs and the infrastructure logs separately, then merges both logs and sends them over SQS. Your consumer application can then read the logs from SQS and segregate them based on the tags.
Correct Answer : A, C
Detailed Explanation: The question is simple: they want both kinds of logs, IT infrastructure logs and application logs, in one place. Both logs are already available; the only issue is how to send them optimally in real time so that the consumer application can apply machine learning to them in real time.
When you see real-time data retrieval, start thinking of a streaming solution, which here is a Kinesis data stream. If an option relates to that, it will be an answer. So here the option based on a Kinesis data stream is a correct answer. However, we need to choose two options.
Option-2: It talks about writing a custom solution and then using TCP sockets to read the data in real time. Why do such development when AWS provides an easy solution for this common requirement? Hence, you cannot consider it a correct answer.
Option-4: You can write a script that implements the MapReduce algorithm. We don't think it is required to tag the logs, but yes, you can tag both logs separately. You merge them at the originating site, create a message out of them, and send it to SQS; on the client side, you read the messages and separate the logs based on the tags before the consumer application reads them. This solution is possible, but it is not the right answer from the perspective of good design; it adds a lot of complexity on both sides.
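To see where the complexity in Option-4 comes from, here is a minimal sketch of only the tag, merge, and segregate steps in plain Python. The function names, the JSON message layout, and the sample log lines are assumptions made for illustration; the SQS transport and the MapReduce part are left out.

```python
import json
from collections import defaultdict


def tag_and_merge(infra_lines, app_lines):
    """Producer side: tag every log line, then merge both logs into one list."""
    merged = [json.dumps({"tag": "infra", "line": line}) for line in infra_lines]
    merged += [json.dumps({"tag": "app", "line": line}) for line in app_lines]
    return merged


def segregate(messages):
    """Consumer side: split the merged messages back into one list per tag."""
    by_tag = defaultdict(list)
    for message in messages:
        record = json.loads(message)
        by_tag[record["tag"]].append(record["line"])
    return dict(by_tag)


# Hypothetical sample log lines.
messages = tag_and_merge(["cpu=91%"], ["GET /cart 200", "POST /pay 500"])
result = segregate(messages)
print(result)
```

Even in this small form, both the sender and the receiver must agree on the tag names and the message format, which is the two-sided complexity the explanation refers to.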
Kinesis Data Stream:
- Kinesis can help you collect data as a stream, in real time.
- To do that, you create a Kinesis data processing application, which reads data from a data stream as data records.
- To create a data processing application, you typically use the Kinesis Client Library, and you can run the application on EC2 instances.
- Kinesis Data Streams is part of the Kinesis streaming data platform, which has other products like Kinesis Data Firehose, Kinesis Video Streams, and Kinesis Data Analytics.
- For example, you can collect data such as IT infrastructure logs, application logs, social media logs, market data feeds, and web clickstream data.
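The producer side of the points above can be sketched in Python as follows. This is an illustrative sketch, not the exam's reference solution. The stream name `retail-logs`, the helper names, and the sample log lines are assumptions. `RecordingClient` is a hypothetical stand-in so the sketch runs without AWS credentials; with real credentials you would pass `boto3.client("kinesis")` instead, whose `put_record` call accepts the same `StreamName`, `Data`, and `PartitionKey` arguments.

```python
import json

STREAM_NAME = "retail-logs"  # hypothetical stream name, not from the question


def build_record(source, line):
    """Build the keyword arguments of one Kinesis put_record call."""
    return {
        "StreamName": STREAM_NAME,
        "Data": json.dumps({"source": source, "line": line}).encode("utf-8"),
        # Records sharing a partition key are routed to the same shard.
        "PartitionKey": source,
    }


def send_logs(client, entries):
    """Send each (source, line) pair to the stream as one data record."""
    for source, line in entries:
        client.put_record(**build_record(source, line))


class RecordingClient:
    """Hypothetical stand-in for boto3.client("kinesis"); it only records calls."""

    def __init__(self):
        self.calls = []

    def put_record(self, **kwargs):
        self.calls.append(kwargs)
        # Fake response values, for the dry run only.
        return {"ShardId": "shardId-000000000000", "SequenceNumber": str(len(self.calls))}


client = RecordingClient()
send_logs(client, [("infra", "cpu=91%"), ("app", "GET /cart 200")])
print(len(client.calls), "records sent")
```

Using the log source as the partition key is one possible design choice: it keeps infrastructure and application logs in a single stream while still letting a consumer tell them apart through the `source` field.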