Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix livestatereporter to consider multiple plugins in one app #5457

Merged
merged 26 commits into from
Jan 24, 2025

Conversation

ffjlabo
Copy link
Member

@ffjlabo ffjlabo commented Dec 26, 2024

What this PR does:

Fix to start the livestatereporter when starting pipedv1.

Also, I added a plugin name field for the application model to list apps by it.

I fixed the livestatereporter to consider the multiple plugins situation.

Modify to use DeploySource

  • Added deploy_source to GetLivestateRequest.

Modify to merge Livestate

LivestateState

  • Add plugin name to the ResourceState
  • HealthStatus: Total sum of statuses across all plugins

SyncState

  • Status: Total sum of statuses across all plugins
    • Priority is InvalidConfig > OutOfSync > Sync > Unknown
  • ShortReason:
    • Combine results of each plugin separated by new lines
  • Reason:
    • Combine results of each plugin separated by new lines

Note: the current Web UI for each field above.
PipeCD

Why we need it:

These fixes are needed to start the internal reporter for each plugin.

Since multiple plugins will be supported for one app, each plugin will be able to support its own Livestate.
Therefore, I retrieved all the Livestates and merged the results to display them.

Which issue(s) this PR fixes:

Part of #5363

Does this PR introduce a user-facing change?:

  • How are users affected by this change:
  • Is this breaking change:
  • How to migrate (if breaking change):

Copy link

codecov bot commented Dec 26, 2024

Codecov Report

Attention: Patch coverage is 56.85279% with 85 lines in your changes missing coverage. Please review.

Project coverage is 26.62%. Comparing base (b6fbc71) to head (c5284f3).
Report is 7 commits behind head on master.

Files with missing lines Patch % Lines
...app/pipedv1/livestatereporter/livestatereporter.go 43.57% 67 Missing and 12 partials ⚠️
pkg/model/application.go 90.38% 4 Missing and 1 partial ⚠️
pkg/app/pipedv1/controller/planner.go 50.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5457      +/-   ##
==========================================
+ Coverage   26.29%   26.62%   +0.33%     
==========================================
  Files         464      464              
  Lines       49626    49737     +111     
==========================================
+ Hits        13047    13241     +194     
+ Misses      35546    35437     -109     
- Partials     1033     1059      +26     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ffjlabo ffjlabo marked this pull request as ready for review December 26, 2024 09:33
@@ -45,6 +45,8 @@ message Application {
string platform_provider = 15;
// The names of deploy taget where to deploy this application.
repeated string deploy_targets = 16;
// The name of plugin used to deploy this application.
string plugin = 17;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not a single plugin which deploys an application but it's a list of plugins do

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@khanhtc1202
Thanks, fixed it on 0c4fc0d

@ffjlabo ffjlabo force-pushed the start-livestatereporter-on-pipedv1 branch from c1654db to 0c4fc0d Compare December 27, 2024 01:16
@ffjlabo ffjlabo requested a review from khanhtc1202 December 27, 2024 01:24
@ffjlabo ffjlabo enabled auto-merge (squash) December 27, 2024 01:24
@khanhtc1202
Copy link
Member

@ffjlabo go lint error found, please fix it

@ffjlabo
Copy link
Member Author

ffjlabo commented Dec 27, 2024

Thanks. I don't know it. I'll fix it.

lint job is succeed, but the result of the golangci-lint is in the labeler 🤔
Start_livestatereporter_on_pipedv1_by_ffjlabo_·_Pull_Request__5457_·_pipe-cd_pipecd

@ffjlabo
Copy link
Member Author

ffjlabo commented Dec 27, 2024

@khanhtc1202 Fixed for the lint

Warashi
Warashi previously approved these changes Dec 27, 2024
Copy link
Contributor

@Warashi Warashi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ffjlabo
Copy link
Member Author

ffjlabo commented Jan 6, 2025

📝 We decided to determine the plugin used on the application from app.pipecd.yaml. So I will fix not to add plugin name to the application model.

@ffjlabo ffjlabo mentioned this pull request Jan 7, 2025
@ffjlabo ffjlabo force-pushed the start-livestatereporter-on-pipedv1 branch 2 times, most recently from 0e3e607 to cd2b7f1 Compare January 17, 2025 05:06
Comment on lines 131 to 170
// TODO: Set the limit based on the piped config
limit := runtime.GOMAXPROCS(0) / 2

// Report the application live state to the control plane.
snapshot := &model.ApplicationLiveStateSnapshot{
ApplicationId: app.Id,
PipedId: app.PipedId,
ProjectId: app.ProjectId,
Kind: app.Kind,
ApplicationLiveState: res.GetApplicationLiveState(),
appsPerReporter := len(apps) / limit
if appsPerReporter == 0 {
appsPerReporter = 1
}

appsGrouped := make([][]*model.Application, 0)
for i := 0; i < len(apps); i += appsPerReporter {
end := i + appsPerReporter
if end > len(apps) {
end = len(apps)
}
snapshot.DetermineApplicationHealthStatus()

if _, err := pr.apiClient.ReportApplicationLiveState(ctx, &pipedservice.ReportApplicationLiveStateRequest{
Snapshot: snapshot,
}); err != nil {
pr.logger.Error("failed to report application live state",
zap.String("application-id", app.Id),
zap.Error(err),
)
appsGrouped = append(appsGrouped, apps[i:end])
}

eg, ctx := errgroup.WithContext(ctx)
eg.SetLimit(limit)

r.logger.Info("start flushing snapshots", zap.Int("application-count", len(apps)), zap.Int("concurrency", limit))
for _, apps := range appsGrouped {
eg.Go(func() error {
r.flushAll(ctx, apps, repoMap)
return nil
})
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first, I tried to implement the livestate reporter to process the apps grouped by plugins because the livestatereporter for pipedv0 processes the apps grouped by platform providers.

But it finds the plugin name of the app based on the app config.
It takes some more time and resource to prepare the deploy sources for all of the apps in one time.
So I implemented it in paralell.

Comment on lines +182 to +212
dir, err := os.MkdirTemp(r.workingDir, fmt.Sprintf("app-%s-*", app.Id))
if err != nil {
r.logger.Error("failed to create temporary directory", zap.Error(err))
return err
}

dsp := deploysource.NewProvider(dir, deploysource.NewLocalSourceCloner(repo, "target", "HEAD"), app.GitPath, r.secretDecrypter)
ds, err := dsp.Get(ctx, io.Discard)
if err != nil {
r.logger.Error("failed to get deploy source", zap.String("application-id", app.Id), zap.Error(err))
return err
}

cfg, err := config.DecodeYAML[*config.GenericApplicationSpec](ds.ApplicationConfig)
if err != nil {
r.logger.Error("unable to parse application config", zap.Error(err))
return err
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use the Deploy Source for two reasons.

  1. to use app.pipecd.yaml for the app
  2. to pass it when calling GetLivestate. It expects to use it do do drift detection.

@ffjlabo ffjlabo force-pushed the start-livestatereporter-on-pipedv1 branch from 08a4aa9 to c951ab9 Compare January 17, 2025 05:56
@@ -108,6 +108,8 @@ message ResourceState {
string health_description = 7;
// The target where this resource is deployed.
string deploy_target = 8;
// The plugin name that manages this resource.
string plugin_name = 9;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need it to show livestates by grouping with plugin

Signed-off-by: Yoshiki Fujikane <[email protected]>
Signed-off-by: Yoshiki Fujikane <[email protected]>
Signed-off-by: Yoshiki Fujikane <[email protected]>
Signed-off-by: Yoshiki Fujikane <[email protected]>
Signed-off-by: Yoshiki Fujikane <[email protected]>
Signed-off-by: Yoshiki Fujikane <[email protected]>
@ffjlabo ffjlabo force-pushed the start-livestatereporter-on-pipedv1 branch from e40079e to d14e25d Compare January 22, 2025 09:21
option go_package = "github.com/pipe-cd/pipecd/pkg/plugin/api/v1alpha1/common";

import "validate/validate.proto";
import "pkg/model/common.proto";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we don't use this 👀

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, removed them

@ffjlabo ffjlabo force-pushed the start-livestatereporter-on-pipedv1 branch from d14e25d to 11a9a14 Compare January 22, 2025 10:08
logger *zap.Logger
}
// TODO: Set the limit based on the piped config
limit := runtime.GOMAXPROCS(0) / 2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When runtime.GOMAXPROCS(0) returns 1, limit becomes 0.
This causes the division-by-zero error.
Please set the limit 1 if it is 0.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed on 5ea0571

case <-snapshotTicker.C:
pr.flushSnapshots(ctx)
eg, ctx := errgroup.WithContext(ctx)
eg.SetLimit(limit)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use the appsGrouped to limit concurrency, so we don't need this SetLimit.
Or, we can use only eg.SetLimit and simply call eg.Go with all apps.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, you're right. Also there is no reason to use errgroup, so I fixed to use wait group instead of it.
af57c01

// See the License for the specific language governing permissions and
// limitations under the License.

syntax = "proto3";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding comments on where to use this package?
I imagine this common package to be used in the deployment/livestate package as a dependency, and I want to clarify it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I forgot to save the file. Added it
c7df424

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this long comment?

"// package common defines common messages which are depended on by messages in packages under pkg/plugin/api/v1alpha1."

As Sawada-san mentioned, I want to clarify what common means because common package will be also visible to plugin developers.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed it.
c5284f3

Signed-off-by: Yoshiki Fujikane <[email protected]>
Signed-off-by: Yoshiki Fujikane <[email protected]>
Signed-off-by: Yoshiki Fujikane <[email protected]>
Copy link
Contributor

@Warashi Warashi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I forgot to review the test codes.
I commented on it.

logger: zaptest.NewLogger(b),
}

pr.flushSnapshots(context.Background())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no use of b *testing.B; what is this benchmark to measure?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to measure the time to execute it in parallel for reference.
I left it because It will be helpful to check the performance. WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I got it.
Then you should use testing.B as below.

for range b.N {
	pr.flushSnapshots(context.Background())
}

ref; https://pkg.go.dev/testing#hdr-Benchmarks

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I added it.
3adea7e

logger: zaptest.NewLogger(t),
}

pr.flushSnapshots(context.Background())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are no assertion statements.
Does this test only assert that no panic occurred? If so, it's better to comment on that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I added a comment and a new test for flush to check the logic. (currently added only the success case)
8b8f4c6

Signed-off-by: Yoshiki Fujikane <[email protected]>
Signed-off-by: Yoshiki Fujikane <[email protected]>
@ffjlabo ffjlabo requested a review from Warashi January 23, 2025 04:46
Warashi
Warashi previously approved these changes Jan 23, 2025
Copy link
Contributor

@Warashi Warashi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Signed-off-by: Yoshiki Fujikane <[email protected]>
Copy link
Contributor

@Warashi Warashi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏻

@ffjlabo ffjlabo merged commit f50cc80 into master Jan 24, 2025
25 checks passed
@ffjlabo ffjlabo deleted the start-livestatereporter-on-pipedv1 branch January 24, 2025 05:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants