K6 is stuck on stage: initialization if the init job fails #290

Closed
JorTurFer opened this issue Sep 20, 2023 · 8 comments
Labels
bug Something isn't working

@JorTurFer
Contributor

JorTurFer commented Sep 20, 2023

Brief summary

The operator creates the init job successfully, but if the pod fails for any reason, the operator doesn't notice and the K6 resource stays stuck on stage: initialization until you manually remove it.

I'm willing to fix it (or at least to try it xD)

k6-operator version or image

latest (sha256:79df77fea27ab5820ce3f25167268d5094be2fc10d182283fce9921e3786fed1)

K6 YAML

Something that produces an error on init job

Other environment details (if applicable)

No response

Steps to reproduce the problem

Deploy a K6 manifest that makes the init pod fail, for example by referencing a file that doesn't exist.

Expected behaviour

The stage of K6 changes

Actual behaviour

The stage of K6 doesn't change
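To make the expected behaviour concrete, here is a minimal sketch of the kind of check that would move the stage forward when the init job fails. It is not the operator's actual code (k6-operator is written in Go); it only inspects a Job object in the shape returned by `kubectl get job -o json`. The function names `init_job_state` and `next_stage` are hypothetical, and the stage names `initialized` and `error` are assumptions; only `initialization` is taken from this report.

```python
from typing import Any, Mapping


def init_job_state(job: Mapping[str, Any]) -> str:
    """Classify a batch/v1 Job dict as 'succeeded', 'failed' or 'pending'."""
    status = job.get("status") or {}
    # Kubernetes records the terminal outcome of a Job in status.conditions,
    # using the condition types "Complete" and "Failed".
    for cond in status.get("conditions") or []:
        if cond.get("status") != "True":
            continue
        if cond.get("type") == "Failed":
            return "failed"
        if cond.get("type") == "Complete":
            return "succeeded"
    return "pending"


def next_stage(current_stage: str, init_job: Mapping[str, Any]) -> str:
    """Return the stage the test should move to, given the init job.

    Assumption: "initialized" and "error" are illustrative stage names.
    """
    if current_stage != "initialization":
        return current_stage  # only the initialization stage depends on the init job
    state = init_job_state(init_job)
    if state == "failed":
        return "error"  # surface the failure instead of waiting forever
    if state == "succeeded":
        return "initialized"
    return current_stage  # job still running: keep waiting
```

With a check like this, a failed init pod leads to a visible stage change instead of leaving the resource in initialization indefinitely.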

@JorTurFer JorTurFer added the bug Something isn't working label Sep 20, 2023
@yorugac
Collaborator

yorugac commented Sep 25, 2023

Hi @JorTurFer! Thanks for working on this. I wouldn't call it a bug but more of an improvement in logic, TBH 😄

And this issue came up in several contexts recently! So linking key issues / PRs:

Quite a lot. I need to grok all of these to figure out what should be merged, changed, etc. It's in my TODO in the next couple of weeks so shouldn't be a long wait 👍 But as a heads up, there are duplicates and almost conflicts between the above.

@JorTurFer
Contributor Author

> I wouldn't call it a bug but more of an improvement in logic

I have to disagree because literally the test is f**ked up. I mean, any kind of error during initialization leaves the test stuck without any useful feedback (or even useless feedback; it doesn't give any feedback at all) 😄
Any kind of automation over the test resource fails due to this, and it's quite annoying. Knowing the root cause, we have added more checks on different resources, but the K6 resource's status isn't usable.
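As an illustration of the automation problem, here is a rough sketch of a client-side guard: poll the stage with a deadline so that a stuck test is reported as a timeout instead of blocking forever. Everything here is hypothetical (`wait_for_stage`, `StageTimeout`, the default timings); `get_stage` stands for whatever your tooling uses to read the stage of the K6 resource.

```python
import time
from typing import Callable, Collection


class StageTimeout(Exception):
    """Raised when the wanted stage is not reached before the deadline."""


def wait_for_stage(
    get_stage: Callable[[], str],
    wanted: Collection[str],
    timeout_s: float = 300.0,
    poll_s: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_stage() until it returns one of `wanted`, or raise StageTimeout."""
    deadline = clock() + timeout_s
    while True:
        stage = get_stage()
        if stage in wanted:
            return stage
        if clock() >= deadline:
            # Report the last observed stage so the failure is actionable.
            raise StageTimeout(f"still in stage {stage!r} after {timeout_s}s")
        sleep(poll_s)
```

The `clock` and `sleep` parameters exist only so the loop can be exercised without real waiting. A timeout is a workaround, not a fix: it tells you the test is stuck, not why.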

@JorTurFer
Contributor Author

Hello!
Any update about this topic?

@yorugac
Collaborator

yorugac commented Feb 8, 2024

Hi @JorTurFer, apologies for such a delay. Yes, actually, it's a good time to make this addition for the next release, given past and future work, but I'll have to ask for an update of your PR. Will comment over there.

@yorugac yorugac added this to the 0.14 milestone Feb 8, 2024
@alifemove

alifemove commented Feb 12, 2024

Is this related to the case where it gets stuck like this:

time="2024-02-12T14:55:22Z" level=debug msg="Runner successfully initialized!"
time="2024-02-12T14:55:22Z" level=debug msg="Parsing CLI flags..."
time="2024-02-12T14:55:22Z" level=debug msg="Consolidating config layers..."
time="2024-02-12T14:55:22Z" level=debug msg="Parsing thresholds and validating config..."
time="2024-02-12T14:55:22Z" level=debug msg="Initializing the execution scheduler..."
time="2024-02-12T14:55:22Z" level=debug msg="Starting 2 outputs..." component=output-manager
time="2024-02-12T14:55:22Z" level=debug msg=Starting... output=InfluxDBv1

Init      [   0% ] Starting outputs
default   [   0% ]

and just does nothing after that?

My script works locally, but when I try to run it in CircleCI this is as far as I get.
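When comparing logs like the ones above across runs, it can help to extract the last step that was logged before the output stalled. The following is a small sketch that parses the `key=value` lines shown above; the helper names `parse_logfmt` and `last_message` are made up for this example, and it assumes quoted values follow shell-like quoting (an unbalanced quote in a line would make `shlex.split` raise `ValueError`).

```python
import shlex
from typing import Dict, Optional


def parse_logfmt(line: str) -> Dict[str, str]:
    """Split a line such as `level=debug msg="Starting 2 outputs..."` into fields."""
    fields: Dict[str, str] = {}
    # shlex.split strips the quotes and keeps quoted values as one token.
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def last_message(log_text: str) -> Optional[str]:
    """Return the msg field of the last structured log line, if any."""
    last = None
    for line in log_text.splitlines():
        if line.startswith("time="):  # skip progress bars and blank lines
            last = parse_logfmt(line).get("msg", last)
    return last
```

For the log excerpt above, the last structured message is the one emitted while starting the InfluxDBv1 output, which points at the output rather than at the initializer job discussed in this issue.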

@JorTurFer
Contributor Author

> Hi @JorTurFer, apologies for such a delay. Yes, actually, it's a good time to make this addition for the next release, given past and future work, but I'll have to ask for an update of your PR. Will comment over there.

Sure, I'll rebase it this week and resolve the conflicts 😄

@yorugac yorugac modified the milestones: 0.14, 0.15 Mar 28, 2024
@yorugac
Collaborator

yorugac commented May 23, 2024

This appears to be fixed now, with PRs #291 and #401. Thanks @JorTurFer and @irumaru!

Some additional notes on the expected behaviour when the initializer fails:

  • With the cleanup: "post" option, k6-operator will delete resources pretty fast. So in order to observe the failure reliably, it is good to have a proper monitoring solution to watch logs and job / pod creation.
  • In cloud output mode, the test run will never get created in GCk6, so it won't appear in the UI. IOW, one has to check logs and metrics on their cluster to troubleshoot the error.

@yorugac yorugac closed this as completed May 23, 2024
@JorTurFer
Contributor Author

Nice!
Sorry for going missing, my last few weeks have been terrible :(
Happy to see that it's solved 😄 Thanks!
