SSH communicator broken #498

Closed
hahuang65 opened this issue Aug 6, 2024 · 5 comments

@hahuang65

Overview of the Issue

Unable to build AMIs with amazon-ebs.

When using temporary_iam_instance_profile_policy_document, it complains with:
Retryable error: InvalidParameterValue: Value (packer-66b24927-f1bf-3659-0653-2b0e2181a066) for parameter iamInstanceProfile.name is invalid. Invalid IAM Instance Profile name

It then fails to connect over SSH:
[DEBUG] TCP connection to SSH ip/port failed: dial tcp: lookup localhost on [::1]:53: no such host

When using iam_instance_profile, it doesn't seem to have an error with setting the instance up, but it still tries to connect to localhost or 0.0.0.0 for SSH.

Reproduction Steps

Just run packer build -var "name=grafana" ami.pkr.hcl with my build file.
Change the temporary_iam_instance_profile_policy_document block to iam_instance_profile to test for the other case.

Plugin and Packer version

Packer v1.11.2
packer-plugin-amazon_v1.3.2_x5.0_linux_amd64

Simplified Packer Buildfile

https://gist.github.com/hahuang65/a35654fbf7261a9044e0e98a226ccbe6

Operating system and Environment details

Arch Linux

Log Fragments and crash.log files

https://gist.github.com/hahuang65/6b96e920c0c10b548664370fa0e799f2 (both scenarios are here)

@hahuang65 hahuang65 added the bug label Aug 6, 2024
@lbajolet-hashicorp
Contributor

Hi @hahuang65,

The connection to localhost is because you specified ssh_interface = "session_manager", this will in turn start aws ssm to open a tunnel to your machine.

The following lines from your logs hint at it:

2024/08/06 11:07:43 packer-plugin-amazon_v1.3.2_x5.0_linux_amd64 plugin: 2024/08/06 11:07:43 Found available port: 8169 on IP: 0.0.0.0
2024/08/06 11:07:43 packer-plugin-amazon_v1.3.2_x5.0_linux_amd64 plugin: 2024/08/06 11:07:43 ssm: Starting PortForwarding session to instance i-02693071c9fb626cb

In this case, you have one aws ssm StartSession process running in the background, relaying SSH connections made on localhost:8169 to the SSH port of the instance being built.

In the logs however, this fails with a lookup issue on [::1]:53, which indicates a DNS issue. Is your local DNS resolver (typically bind or systemd-resolved on Linux) running? Would you be able to check what a command like dig localhost returns?

@hahuang65
Author

The connection to localhost is because you specified ssh_interface = "session_manager", this will in turn start aws ssm to open a tunnel to your machine.

Yup, I understand this.

So dig localhost is giving me an NXDOMAIN. systemd-resolved is running... but (and maybe this is a silly question) shouldn't the fact that it's hitting [::1] mean that it's resolved localhost to ::1?

Also, I rebooted my computer and now the error is different. Instead of no such host, I'm getting

[DEBUG] TCP connection to SSH ip/port failed: dial tcp [::1]:8065: connect: connection refused
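
The two errors quoted in this thread fail at different steps of the same dial: "no such host" comes from the name lookup, while "connection refused" means the lookup worked but nothing was listening on the resolved address and port. A minimal Python sketch that separates the two cases follows. It is not part of Packer, and the function name probe and its return labels are made up for illustration.

```python
import socket


def probe(host, port, timeout=2.0):
    """Report whether a TCP dial fails at name lookup or at connect time.

    Hypothetical helper for illustration; the labels are not Packer output.
    """
    try:
        # Step 1: name resolution. A failure here corresponds to the
        # "no such host" style of error.
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "no such host"

    # Step 2: try each resolved address (for localhost this is usually
    # ::1 and/or 127.0.0.1) until one accepts the connection.
    for family, socktype, proto, _canonname, addr in infos:
        try:
            with socket.socket(family, socktype, proto) as s:
                s.settimeout(timeout)
                s.connect(addr)
                return "connected"
        except OSError:
            # Refused, timed out, or address family unavailable: try the next one.
            continue
    return "connection failed"
```

Calling probe("localhost", 8065) while the build is waiting for SSH would show whether the forwarded port is actually listening; the port number here is simply the one from the log line above.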

@hahuang65
Author

hahuang65 commented Aug 13, 2024

Ah, yes, that was because I edited my /etc/hosts file.
I noticed it was empty, so I added

127.0.0.1       localhost
255.255.255.255 broadcasthost
::1             localhost

which changed my error from no such host to connection refused.
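
As a rough way to check a hosts file for the entries above, the following Python sketch lists which addresses the file maps to a given name. The helper addresses_for is hypothetical, and it only inspects the text it is given: real resolvers may also consult other sources such as systemd-resolved or DNS, so an empty result does not by itself prove that lookups will fail.

```python
def addresses_for(hostname, hosts_text):
    """Return the addresses that hosts-file text maps to hostname.

    Hypothetical helper for illustration; it parses text only and does
    not perform any real name resolution.
    """
    wanted = hostname.lower()
    found = []
    for line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # Format: <address> <name> [<alias> ...]
        address, *names = line.split()
        if wanted in (name.lower() for name in names):
            found.append(address)
    return found


if __name__ == "__main__":
    with open("/etc/hosts") as f:
        print(addresses_for("localhost", f.read()))
```

With the three lines added above, this would list both 127.0.0.1 and ::1 for localhost; with an empty file it returns an empty list.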

sshd does look like it's running.

@hahuang65
Author

hahuang65 commented Aug 13, 2024

Right, so temporary_iam_instance_profile_policy_document still gives me the
Retryable error: InvalidParameterValue: Value (packer-66b24927-f1bf-3659-0653-2b0e2181a066) for parameter iamInstanceProfile.name is invalid. Invalid IAM Instance Profile name error.

I am getting

2024/08/13 11:14:17 packer-plugin-amazon_v1.2.8_x5.0_linux_amd64 plugin: 2024/08/13 11:14:17 [INFO] Waiting for SSH, up to timeout: 5m0s
==> grafana.amazon-ebs.ami: Waiting for SSH to become available...
2024/08/13 11:14:17 packer-plugin-amazon_v1.2.8_x5.0_linux_amd64 plugin: 2024/08/13 11:14:17 [DEBUG] TCP connection to SSH ip/port failed: dial tcp [::1]:8503: connect: connection refused
2024/08/13 11:14:17 packer-plugin-amazon_v1.2.8_x5.0_linux_amd64 plugin: 2024/08/13 11:14:17 Retryable error: TargetNotConnected: i-027c2e698ea22d1fa is not connected.
2024/08/13 11:14:18 packer-plugin-amazon_v1.2.8_x5.0_linux_amd64 plugin: 2024/08/13 11:14:18 Retryable error: TargetNotConnected: i-027c2e698ea22d1fa is not connected.
2024/08/13 11:14:18 packer-plugin-amazon_v1.2.8_x5.0_linux_amd64 plugin: 2024/08/13 11:14:18 Retryable error: TargetNotConnected: i-027c2e698ea22d1fa is not connected.
2024/08/13 11:14:19 packer-plugin-amazon_v1.2.8_x5.0_linux_amd64 plugin: 2024/08/13 11:14:19 Retryable error: TargetNotConnected: i-027c2e698ea22d1fa is not connected.
2024/08/13 11:14:22 packer-plugin-amazon_v1.2.8_x5.0_linux_amd64 plugin: 2024/08/13 11:14:22 Retryable error: TargetNotConnected: i-027c2e698ea22d1fa is not connected.
2024/08/13 11:14:22 packer-plugin-amazon_v1.2.8_x5.0_linux_amd64 plugin: 2024/08/13 11:14:22 [DEBUG] TCP connection to SSH ip/port failed: dial tcp [::1]:8503: connect: connection refused
2024/08/13 11:14:25 packer-plugin-amazon_v1.2.8_x5.0_linux_amd64 plugin: 2024/08/13 11:14:25 Retryable error: TargetNotConnected: i-027c2e698ea22d1fa is not connected.
2024/08/13 11:14:27 packer-plugin-amazon_v1.2.8_x5.0_linux_amd64 plugin: 2024/08/13 11:14:27 [DEBUG] TCP connection to SSH ip/port failed: dial tcp [::1]:8503: connect: connection refused
2024/08/13 11:14:32 packer-plugin-amazon_v1.2.8_x5.0_linux_amd64 plugin: 2024/08/13 11:14:32 [DEBUG] TCP connection to SSH ip/port failed: dial tcp [::1]:8503: connect: connection refused

regardless if I use temporary_iam_instance_profile_policy_document or just straight iam_instance_profile.
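
The log fragment above interleaves two different failures: the local TCP dial being refused, and the Systems Manager API reporting TargetNotConnected. A small Python sketch for tallying them in a saved log is shown here. The labels and the summarize function are invented for this example, and the patterns are simply substrings taken from the lines quoted in this thread.

```python
import re
from collections import Counter

# Substrings copied from the log lines in this thread; the label names
# on the left are made up for illustration.
PATTERNS = {
    "ssm_target_not_connected": re.compile(r"TargetNotConnected"),
    "tcp_connection_refused": re.compile(r"connect: connection refused"),
    "dns_no_such_host": re.compile(r"no such host"),
}


def summarize(log_text):
    """Count how many log lines match each known failure pattern."""
    counts = Counter()
    for line in log_text.splitlines():
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
    return counts
```

A log dominated by TargetNotConnected usually suggests looking at the instance side first: that error generally means the instance has not registered with Systems Manager, for example because it has no outbound route to the Systems Manager endpoints or its instance profile lacks the needed permissions. In that situation the port-forwarding session never starts, so the refused connections on the local port are a consequence rather than a separate problem.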

@hahuang65
Author

Looks like this is all user error. Layers of problems:

  1. /etc/hosts didn't have entries for localhost
  2. My NAT instance wasn't running

Sorry for the trouble and wasted time/attention.
