Note: Both the deployment template and this documentation are currently a work in progress.
We provide an example of a template that can be used to create a new AWS EMR cluster and start a secured H2O cluster. The template is meant as a starting point and users are expected to customize it.
- Download and install HashiCorp Terraform: https://www.terraform.io/
- In your terminal, navigate to the templates/emr/terraform directory of the H2O-3 repository
- Run terraform init
- Create a new EMR cluster by running terraform apply (type 'yes' when you are asked to confirm the configuration)
  - This command will ask you for your AWS access key and secret key.
  - It will set up the security groups and networking needed to run an H2O cluster (review the configuration in modules/network).
  - It will create a 2-node H2O cluster using m5.xlarge instances (the instance type and the number of instances can be changed in modules/emr/variables.tf).
  - H2O Flow will be exposed on port 54321 of the cluster's master node. Flow is password protected using LDAP authentication: log in with username root and password password.
  - The connection is secured with a self-signed SSL/TLS certificate, which is generated by the bootstrap process.
- You should see output similar to:
  - flow_url = https://ec2-54-162-73-112.compute-1.amazonaws.com:54321/
- Open the given flow_url in your browser and log in using the username and password listed above. You will need to accept the self-signed security certificate.
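
If the flow_url line has scrolled out of view, it can be read back from the Terraform state instead of re-running the deployment. The following is a minimal Python sketch, assuming Terraform is on the PATH and the script is run after terraform apply has finished. The helper names parse_flow_url and read_flow_url are hypothetical and are not part of the template. It relies on terraform output -json, which prints each output as an object with a value field.

```python
import json
import subprocess


def parse_flow_url(outputs_json):
    """Extract flow_url from the JSON printed by `terraform output -json`."""
    outputs = json.loads(outputs_json)
    if "flow_url" not in outputs:
        raise KeyError("flow_url output not found; has terraform apply finished?")
    return outputs["flow_url"]["value"]


def read_flow_url(terraform_dir="."):
    """Run `terraform output -json` in terraform_dir and return flow_url.

    terraform_dir is assumed to be the directory where terraform apply was run.
    """
    result = subprocess.run(
        ["terraform", "output", "-json"],
        cwd=terraform_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_flow_url(result.stdout)


# Offline example using a string shaped like Terraform's JSON output
# (the hostname is the sample value shown in the steps above).
sample = (
    '{"flow_url": {"sensitive": false, "type": "string", '
    '"value": "https://ec2-54-162-73-112.compute-1.amazonaws.com:54321/"}}'
)
print(parse_flow_url(sample))
```

Calling read_flow_url("templates/emr/terraform") from the repository root would run the real command. Parsing is kept in a separate function so it can be checked without a deployed cluster.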
Please keep in mind that this cluster will be publicly accessible and the H2O port will be open to the public.
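
The browser login can also be scripted, for example to confirm that the cluster is up before sharing the URL. The sketch below uses only the Python standard library and rests on several assumptions that are not stated in this document: that the server accepts HTTP Basic credentials for the LDAP login, that the /3/Cloud REST endpoint is served under flow_url, and that skipping certificate verification is acceptable for a quick test because the certificate is self-signed. All function names are hypothetical.

```python
import base64
import json
import ssl
import urllib.request


def build_cloud_request(flow_url, username, password):
    """Build a GET request for <flow_url>/3/Cloud with an HTTP Basic auth header.

    Assumption: the cluster accepts Basic credentials for its LDAP login.
    """
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(flow_url.rstrip("/") + "/3/Cloud")
    request.add_header("Authorization", "Basic " + token)
    return request


def insecure_context():
    """Return a TLS context that does NOT verify the server certificate.

    This mirrors accepting the self-signed certificate in the browser.
    It is only suitable for a short-lived test cluster.
    """
    context = ssl.create_default_context()
    context.check_hostname = False  # must be disabled before verify_mode
    context.verify_mode = ssl.CERT_NONE
    return context


def check_cluster(flow_url, username="root", password="password"):
    """Query the cluster status and return (cloud_size, cloud_healthy).

    The field names are assumed from the H2O REST API; .get() is used so a
    missing field yields None rather than an exception.
    """
    request = build_cloud_request(flow_url, username, password)
    with urllib.request.urlopen(
        request, context=insecure_context(), timeout=30
    ) as response:
        cloud = json.load(response)
    return cloud.get("cloud_size"), cloud.get("cloud_healthy")
```

Calling check_cluster with the flow_url printed by terraform apply performs the actual request. Because verification is disabled, the script cannot tell the real master node from an impostor, so it should not be reused unchanged for anything beyond a quick test.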