In this Lab, you will set up the environment required for the subsequent Labs. The architecture of the system that you will configure in Lab 1 is as follows.
This system uses a JavaScript-based tool called Kinesis Data Generator to generate logs for analysis. The tool uses a service called Amazon Cognito for the authentication and authorization required to send logs. Kinesis Data Generator then sends logs in the designated format to a log aggregation service called Amazon Kinesis Firehose (hereafter, Firehose). Firehose buffers the logs it receives and writes them to Amazon Elasticsearch Service (hereafter, Amazon ES) at designated intervals. Amazon ES bundles browser-based visualization and analysis software called Kibana, which you will use to visualize and aggregate logs from the browser. Amazon ES also monitors the data and sends alerts to a notification service called Amazon Simple Notification Service (hereafter, Amazon SNS) if any issues occur.
In this section, you will create an Amazon ES domain. In Amazon ES, Elasticsearch clusters are called domains. When you create a domain, a new virtual machine starts up in the backend, and the Elasticsearch cluster is then set up on it.
In this hands-on, Elasticsearch is configured with only one machine because it is just a trial. However, Elasticsearch is designed to handle large volumes of data and achieve high availability by forming a cluster across multiple machines. Therefore, when running Amazon ES in production, prepare multiple master nodes dedicated to managing the cluster as well as multiple data nodes to store the actual data.
A typical Elasticsearch cluster in Amazon ES consists of the following. Distribute nodes across three or more Availability Zones (hereafter, AZ) in an AWS Region so that the cluster remains highly available if a single AZ fails. Setting up your own Elasticsearch cluster with this configuration on EC2, and then upgrading its software and changing its settings, can be a very hard task. With Amazon ES, you can launch such a configuration with only a few clicks.
A master node can also serve as a data node. However, for large clusters or heavy workloads, it is recommended to provide dedicated master nodes that only manage the cluster. It is recommended to use an odd number of master nodes, three or more. In Amazon ES, you can choose 3 or 5 dedicated master nodes. For more information on why even numbers cannot be used, please see the official documentation.
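The recommendation for an odd number of master nodes can be illustrated with simple majority arithmetic. The following is a minimal sketch that assumes the cluster needs a strict majority of master nodes to keep operating. The function names are hypothetical and are not part of any Amazon ES or Elasticsearch API.

```python
def quorum(master_nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return master_nodes // 2 + 1


def tolerated_failures(master_nodes: int) -> int:
    """How many master nodes can fail while a majority still remains."""
    return master_nodes - quorum(master_nodes)


if __name__ == "__main__":
    for n in (2, 3, 4, 5):
        print(f"{n} master nodes: quorum={quorum(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
```

Under this assumption, four master nodes tolerate only one failure, the same as three, so the extra even node adds cost without adding fault tolerance.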
In this section, you will create a Firehose stream that you can use to insert logs into Amazon ES.
IAM stands for Identity and Access Management and is a service for managing access rights to AWS service resources. Permissions are written in the following JSON format. The example below allows you to create an EC2 instance, and to list, read, and write objects in the S3 bucket called my-bucket. Such a set of permissions is called a policy. As mentioned in the Firehose steps above, you need to allow the Firehose stream to write error data to an S3 bucket and insert data into Amazon ES.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Action": [
        "ec2:RunInstances"
      ],
      "Resource": "*"
    },
    {
      "Sid": "",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket",
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::my-bucket",
        "arn:aws:s3:::my-bucket/*"
      ]
    }
  ]
}
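To see how such a policy maps actions to resources, the following sketch checks a request against the example policy. It is a deliberately simplified model, not the actual IAM evaluation logic: it assumes only "Allow" statements, case-sensitive wildcard matching, and no conditions or explicit denies. The helper names `as_list` and `is_allowed` are hypothetical.

```python
import json
from fnmatch import fnmatchcase

# The example policy from this section, loaded as a Python dict.
POLICY = json.loads("""
{
  "Version": "2012-10-17",
  "Statement": [
    {"Sid": "", "Effect": "Allow",
     "Action": ["ec2:RunInstances"], "Resource": "*"},
    {"Sid": "", "Effect": "Allow",
     "Action": ["s3:GetObject", "s3:ListBucket", "s3:PutObject"],
     "Resource": ["arn:aws:s3:::my-bucket", "arn:aws:s3:::my-bucket/*"]}
  ]
}
""")


def as_list(value):
    """Policy fields may hold a single string or a list of strings."""
    return value if isinstance(value, list) else [value]


def is_allowed(policy, action, resource):
    """Return True if any Allow statement matches both action and resource."""
    for statement in policy["Statement"]:
        if statement["Effect"] != "Allow":
            continue
        action_ok = any(fnmatchcase(action, pattern)
                        for pattern in as_list(statement["Action"]))
        resource_ok = any(fnmatchcase(resource, pattern)
                          for pattern in as_list(statement["Resource"]))
        if action_ok and resource_ok:
            return True
    return False


if __name__ == "__main__":
    print(is_allowed(POLICY, "s3:PutObject", "arn:aws:s3:::my-bucket/errors/log.gz"))
    print(is_allowed(POLICY, "s3:DeleteObject", "arn:aws:s3:::my-bucket/errors/log.gz"))
```

The object key `errors/log.gz` is only an illustration. The first call matches the second statement, while the second call is not allowed because `s3:DeleteObject` is not listed in any statement.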
This creates an IAM role with multiple attached policies and enables Firehose to access each service using this IAM role. These are summarized in the diagram below.
You have created the Amazon ES domain and the Firehose stream to insert logs into it. However, Firehose cannot send logs to Amazon ES yet because you have not set the appropriate permissions. In this section, you will use Kibana, the web interface of Amazon ES, to give Firehose permission to insert logs.
Amazon ES is based on Open Distro for Elasticsearch, which is an open-source Elasticsearch distribution. Open Distro has its own permission management model, which you can use with Amazon ES. The permission model of Open Distro is as follows and consists of Roles and Role Mappings.
Role: A named set of Elasticsearch privileges. For example, you can define a development role that is granted permission to manipulate the cluster itself and to add or delete data, and a reader role that can only read specific log data on the cluster. Elasticsearch has several predefined roles.
Role Mappings: Mappings that associate the Elasticsearch Roles defined above with AWS IAM users and IAM roles. This allows specific IAM roles to perform the required Elasticsearch operations.
In this section, you will define a new write role with permission to add logs to Amazon ES, and associate this role with the IAM role for Firehose on AWS. These are summarized in the diagram below. The word "role" is used in both AWS IAM and Open Distro, which can easily cause confusion, but the two are completely different. AWS IAM roles manage permissions on AWS, and Open Distro roles manage permissions on Elasticsearch clusters. Role Mappings in Open Distro are what connect the two.
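The relationship between Open Distro roles, Role Mappings, and IAM roles can be modeled conceptually as two lookup tables. The following is only an illustrative sketch: the role names, permission labels, and IAM role ARN are hypothetical placeholders, not the values used in this Lab or the actual Open Distro data model.

```python
# Open Distro roles: role name -> set of permissions (labels are illustrative).
ROLES = {
    "example_write_role": {"create_index", "write_documents"},
    "example_read_role": {"read_documents"},
}

# Role mappings: Open Distro role name -> IAM ARNs associated with that role.
# The ARN below is a hypothetical placeholder.
ROLE_MAPPINGS = {
    "example_write_role": {
        "arn:aws:iam::123456789012:role/example-firehose-role",
    },
}


def permissions_for(iam_arn):
    """Collect the permissions of every Open Distro role mapped to the ARN."""
    granted = set()
    for role_name, mapped_arns in ROLE_MAPPINGS.items():
        if iam_arn in mapped_arns:
            granted |= ROLES[role_name]
    return granted


if __name__ == "__main__":
    arn = "arn:aws:iam::123456789012:role/example-firehose-role"
    print(sorted(permissions_for(arn)))
```

An IAM role that appears in no mapping receives an empty set of permissions in this model, which mirrors why Firehose cannot insert logs until the mapping is created.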
In this section, you will set up the Kinesis Data Generator. Kinesis Data Generator is a web application developed and provided as a service by AWS to generate logs that flow into Kinesis. To learn more about this service, please read this article.
Choose the [Outputs] tab of the CloudFormation stack you have created. You can open the settings screen of Kinesis Data Generator by clicking the URL displayed for “KinesisDataGeneratorUrl”.
Enter the user name and password you created in the above step in “Username” and “Password” at the top right of the screen, and then log in.
In this step, you will configure the actual log transfer settings. In “Region”, choose [us-east-1] (N. Virginia region), and then, in “Stream/delivery stream”, choose the [workshop-firehose] stream you created earlier.
Enter “5” in “Records per second” (the number of log records generated per second). This means that 5 records are created every second. As a result, 300 records are generated per minute and sent to Firehose.
In “Record template” below, delete the sample format written under “Template 1”, and copy and paste the following code. This specifies the format of the logs sent from IoT sensors. It automatically generates dummy log data using random numbers and similar functions.
{
    "sensorId": {{random.number(50)}},
    "currentTemperature": {{random.number(
        {
            "min":10,
            "max":150
        }
    )}},
    "ipaddress": "{{internet.ip}}",
    "status": "{{random.weightedArrayElement({
        "weights": [0.90,0.02,0.08],
        "data": ["OK","FAIL","WARN"]
    })}}",
    "timestamp": "{{date.utc("YYYY/MM/DD HH:mm:ss")}}"
}
When you click [Test template] at the bottom of the screen, you can check a sample of the logs that will actually be sent. You can see that five records are generated, each in the following form:
{ "sensorId": 42, "currentTemperature": 38, "ipaddress": "29.233.125.31", "status": "OK", "timestamp": "2020/03/03 12:49:12"}
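To get a feel for what the template produces, the following sketch generates records of the same shape locally. It is an approximation, not the Kinesis Data Generator implementation: it assumes that `random.number(50)` yields an integer from 0 to 50 inclusive and that the temperature bounds are inclusive. The function name `make_record` is hypothetical.

```python
import json
import random
from datetime import datetime, timezone


def make_record(rng=random):
    """Build one dummy sensor record shaped like the template output."""
    # Weighted choice mirrors the template weights for OK / FAIL / WARN.
    status = rng.choices(["OK", "FAIL", "WARN"], weights=[0.90, 0.02, 0.08])[0]
    return {
        "sensorId": rng.randint(0, 50),             # assumed inclusive range
        "currentTemperature": rng.randint(10, 150),  # assumed inclusive range
        "ipaddress": ".".join(str(rng.randint(0, 255)) for _ in range(4)),
        "status": status,
        # Same layout as the template's "YYYY/MM/DD HH:mm:ss" in UTC.
        "timestamp": datetime.now(timezone.utc).strftime("%Y/%m/%d %H:%M:%S"),
    }


if __name__ == "__main__":
    # Print five records, like the [Test template] preview.
    for _ in range(5):
        print(json.dumps(make_record()))
```

Because the values are random, each run prints different records, with roughly 90% of them having the status "OK" over a large sample.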
If there are no problems, click the [Send data] button to start sending the logs. Data continues to be sent to Firehose until you click [Stop Sending Data to Kinesis] in the pop-up menu or close the browser tab.
In this section, you will create a topic for Amazon SNS and email delivery settings for notifications.
A topic in SNS is a unit for managing notifications. In Lab 3, Amazon ES will send alert notifications to this topic.
Copy the ARN of the topic (for example, arn:aws:sns:us-east-1:123456789012:amazon_es_alert). You will create IAM roles later and use them in Lab 3.
Now, you will create a subscription. A subscription specifies where notifications published to a topic are delivered. Here, you will register an email address to subscribe to the topic created above.
Finally, you will create an IAM role to send notifications from Amazon ES to SNS topics.
Go to the IAM console in the Management Console, and choose [Roles] > [Create role] from the menu. On the creation page, click [AWS service], click [EC2] under “Common use cases”, and then click the [Next: Permissions] button.
Click the [Create policy] button to open a new browser tab. Next, choose the [JSON] tab there, and overwrite its contents with the following code. Replace “SNS_TOPIC_ARN” in the code below with the ARN of the SNS topic you copied earlier. When completed, click the [Review policy] button at the bottom of the screen.
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": "sns:Publish",
    "Resource": "SNS_TOPIC_ARN"
  }]
}
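If you prefer to prepare the policy text before pasting it into the console, the placeholder substitution can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `render_policy` is hypothetical, and the ARN pattern is a loose shape check for standard (non-FIFO) topics in the `aws` partition, not an official validation rule.

```python
import json
import re

POLICY_TEMPLATE = """{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": "sns:Publish",
    "Resource": "SNS_TOPIC_ARN"
  }]
}"""

# Loose shape check: arn:aws:sns:<region>:<12-digit account id>:<topic name>
SNS_ARN_PATTERN = re.compile(r"^arn:aws:sns:[a-z0-9-]+:\d{12}:[A-Za-z0-9_-]+$")


def render_policy(topic_arn):
    """Replace the SNS_TOPIC_ARN placeholder and confirm the result is valid JSON."""
    if not SNS_ARN_PATTERN.match(topic_arn):
        raise ValueError(f"does not look like an SNS topic ARN: {topic_arn!r}")
    rendered = POLICY_TEMPLATE.replace("SNS_TOPIC_ARN", topic_arn)
    json.loads(rendered)  # raises an error if the text is not valid JSON
    return rendered


if __name__ == "__main__":
    print(render_policy("arn:aws:sns:us-east-1:123456789012:amazon_es_alert"))
```

The check also catches the common mistake of leaving the literal placeholder “SNS_TOPIC_ARN” in the policy.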
Enter “amazones_sns_alert_policy” in “Name”, and click [Create policy]. When completed, close this browser tab, and go back to the role creation screen.
On the role creation screen, click the reload button at the top right, and then enter “amazones_sns_alert_policy” in the filter to narrow down the policies. Select the check box of the policy you have created, and then click [Next: Tags].
Do not make any changes on the tag settings screen, and click the [Next: Review] button.
Enter “amazones_sns_alert_role” in “Role name”, and then click the [Create role] button to complete the role creation.
Go back to the Roles page, and then enter “amazones_sns_alert_role” in the search box to choose the role you have created. When the details screen for the role is displayed, click the [Trust relationships] tab at the bottom, and then click the [Edit trust relationship] button. On the edit screen, replace the existing code with the following code. This setting makes the role available to the Amazon ES service (es.amazonaws.com).
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "es.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
Copy the role ARN (for example, arn:aws:iam::123456789012:role/amazones_sns_alert_role) shown on the IAM role details screen. This string will be used in Lab 3.
You have now completed the Amazon SNS setup.
In Lab 1, you have set up the environment required for the later Labs. With this environment, Firehose aggregates the data generated by Kinesis Data Generator and inserts it into Amazon ES, and you can check the Amazon ES data from Kibana. Please proceed to Lab 2.