Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docker-compose RA status fails when another docker-compose service is running #1741

Closed
tomofumi-yuki opened this issue Feb 3, 2022 · 7 comments · Fixed by #1775
Closed

Comments

@tomofumi-yuki
Copy link

The docker-compose agent fails to correctly recognize that a service is running when there are multiple services running on the same machine.

The docker_compose_status function:

https://github.com/ClusterLabs/resource-agents/blob/main/heartbeat/docker-compose#L137

uses the output of docker ps and calculates the offset of the STATUS column, and counts the number of Up containers. This number needs to match the number of containers returned by docker-compose ps.

The code at line 143:
https://github.com/ClusterLabs/resource-agents/blob/main/heartbeat/docker-compose#L143

replaces '\n' with '|' so that grep matches on multiple container IDs returned by docker-compose ps (stored as $DKPS). However, since there is a '\n' at the end of docker ps --no-trunc output, which makes the grep run with an regex of the form: "ID1|ID2|...|IDX|". Since this expression ends with '|' it matches the entire input.

This causes the value of $UPNU to be larger than the number of expected containers, making the status check fail by entering the else clause.

The fix I have used is to replace the code at line 143 with the following:
STAT_MSG=$(docker ps --no-trunc | grep -E $(echo "$DKPS" | tr '\n' '|' | sed 's/|$/|STATUS/') | expand)

this should capture only the container IDs in $DKPS and the header line, which is used for calculating the offset of the Up column. This fix would clearly not work if the image name of a container included 'STATUS'.

@oalbrigt
Copy link
Contributor

oalbrigt commented Feb 3, 2022

Thank you for the report. I'll have a look at it next week, and make a PR for it.

@mcsakoff
Copy link

mcsakoff commented Feb 7, 2022

JFYI, my version of the fix:

...
docker_compose_status()
{
	# use docker-compose ps if YML found, otherwise try docker ps and kill containers
	if [ -r "$DIR/$YML" ]; then

		DKPS=$(cd $DIR; $COMMAND -f $YML ps -q)
		if [ -n "$DKPS" ]; then
			# get number of all containers
			PSNU=$(echo "$DKPS" | wc -l)
			# get number of running containers
			while IFS= read -r UUID; do
				UP=$(docker inspect --format='{{.State.Running}}' "$UUID")
				[ "$UP" == true ] && UPNU=$((UPNU+1))
			done <<< "$DKPS"
		else
			PSNU=0
			UPNU=0
		fi

		if [ "${PSNU:-0}" -ne 0 ]; then
			if [ ${UPNU:-0} -eq 0 ]; then
...

@oalbrigt
Copy link
Contributor

...

replaces '\n' with '|' so that grep matches on multiple container IDs returned by docker-compose ps (stored as $DKPS). However, since there is a '\n' at the end of docker ps --no-trunc output, which makes the grep run with an regex of the form: "ID1|ID2|...|IDX|". Since this expression ends with '|' it matches the entire input.

Wouldnt it be better to remove the last "|"? Like sed 's/|$//.

@tomofumi-yuki
Copy link
Author

Wouldnt it be better to remove the last "|"? Like sed 's/|$//.

The first line of the docker ps output is later use at ln. 145:
STATEPOS=$(echo "$STAT_MSG" | head -n1 | egrep -o 'STATUS.*$' | wc -c)

I am replacing the last '|' with '|STATUS' to have the grep also retain the first line.

In any case, I did not intend to propose my "fix" directly as a patch. It is a workaround that worked for me, which I left as a quick fix for others that might encounter this issue. As I have noted, it would not work for all image names, and a more general solution is probably more appropriate to patch this issue.

The fix provided by @mcsakoff seems more general with less reliance on the output format of docker ps, and looks like a better place to start for a patch to this issue.

@kennychennetman
Copy link
Contributor

Sorry Guys, I've not looked after this agent for a while. I did send a PR on Sep 24 last year to fix this issue but still waiting for approval:
#1694
However, I really recognize the fix from @mcsakoff is much better than mine. Thanks a lot!

@oalbrigt
Copy link
Contributor

@mcsakoff I've simplified your patch, and removed the bashisms.

Can you confirm whether this works or if there's any issues with it?
f322ba1

@mcsakoff
Copy link

@oalbrigt, I confirm it works fine and looks much better.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants