Implement Log Aggregation & Searching #88

dgarnitz · 2023-10-31T05:14:23Z

VectorFlow has many logs spread over different containers. We need these logs to be aggregated into a searchable form.

One option could be to use Kibana with Elasticsearch. If the logs have metrics in them, we may want to put those into Prometheus.

dgarnitz · 2024-04-04T18:52:07Z

Scope this out. We need this upgrade to solve this issue.

What to Build

Add an additional database table that stores error messages. Do this by adding a new model object. Make sure that the job_id and batch_id are tracked.
For now this can store the entire stack trace, as long as it's not too long.
Alter the code in the API, worker and extractor so that whenever an error is logged, it is also saved to the database. Add a method, save_error(), to a utils file that does this, and use that util method in each file.
Add an endpoint that returns all the errors for a given batch_id. Return a JSON object with a field errors that is an array of stack traces.
Add an endpoint that returns all the errors for a given job_id. Return a JSON object with a field errors that is a dictionary with batch_id as the key and an array of stack traces as the value.
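
A rough sketch of the storage and response shapes described above, in case it helps with scoping. It uses the standard-library sqlite3 module as a stand-in for the project's real database layer and model objects, so the table name errors, the helpers init_db, errors_for_batch and errors_for_job, the save_error signature, and the MAX_STACKTRACE_CHARS cap are all assumptions made for illustration, not existing VectorFlow code.

```python
import sqlite3
import traceback
from collections import defaultdict

# Assumed cap: the issue only says the stack trace should not be "too long".
MAX_STACKTRACE_CHARS = 10_000


def init_db(conn):
    """Create the (hypothetical) errors table, tracking job_id and batch_id."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS errors (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            job_id INTEGER NOT NULL,
            batch_id INTEGER NOT NULL,
            stacktrace TEXT NOT NULL
        )
        """
    )
    conn.commit()


def save_error(conn, job_id, batch_id, exc):
    """Store the formatted stack trace of exc for the given job and batch."""
    stacktrace = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__)
    )
    # Keep the tail of an oversized trace, since the exception message is last.
    stacktrace = stacktrace[-MAX_STACKTRACE_CHARS:]
    conn.execute(
        "INSERT INTO errors (job_id, batch_id, stacktrace) VALUES (?, ?, ?)",
        (job_id, batch_id, stacktrace),
    )
    conn.commit()


def errors_for_batch(conn, batch_id):
    """Response body for the batch endpoint: {"errors": [stack traces]}."""
    rows = conn.execute(
        "SELECT stacktrace FROM errors WHERE batch_id = ? ORDER BY id",
        (batch_id,),
    ).fetchall()
    return {"errors": [row[0] for row in rows]}


def errors_for_job(conn, job_id):
    """Response body for the job endpoint: {"errors": {batch_id: [traces]}}."""
    rows = conn.execute(
        "SELECT batch_id, stacktrace FROM errors WHERE job_id = ? ORDER BY id",
        (job_id,),
    ).fetchall()
    grouped = defaultdict(list)
    for batch_id, stacktrace in rows:
        # JSON object keys are strings, so the batch_id is stringified here.
        grouped[str(batch_id)].append(stacktrace)
    return {"errors": dict(grouped)}
```

In the API, worker and extractor, the idea would be to call save_error(conn, job_id, batch_id, e) inside the same except block that already logs the error. The two endpoints could then serialize the dictionaries returned by errors_for_batch and errors_for_job directly.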

dgarnitz · 2024-04-08T20:05:31Z

No need for Kibana or Prometheus, just store in the DB.

dgarnitz mentioned this issue Apr 4, 2024

Bug: Locally error's are not returned to user #47

Open