Add benchmarking dataset with labelled anomalies for scoring performance of detector algorithms #12
Comments
@ecomodeller I found some datasets with labelled anomalies here: https://github.com/numenta/NAB |
@Rhadhi Have you checked the license for that repo? It seems to be quite strict and copyleft, so as far as I can tell, if we want to use material from the numenta/NAB repo we need to change our license to the same one (AGPL-3.0). What do you think? If I am right, making our repo AGPL would imply that anyone using our repo would also have to license their work under AGPL... maybe not what we want? |
I don't know of any open datasets at DHI that we can use. We have to ask around and see if someone has an annotated dataset they are willing to share. There is a lot of data, but not much of it is labelled, and probably even less is public, unfortunately. |
I will try to ask around on DHI's Yammer for labelled datasets with anomalies. @ecomodeller Do you have labels for the DMI dataset we have in the repo? Otherwise I will try to label the obvious ones with the algorithms, e.g. |
@laurafroelich @ecomodeller @akfDHI What do you think of posting this message on Yammer: We are trying to establish best practices and automated ways of identifying anomalies/outliers in time series data.
Currently we are working on algorithms ranging from simple range checks to machine learning models. Check out, and potentially contribute to, our open source anomaly detection Python package on DHI's GitHub here: https://github.com/DHI/anomalydetection |
Sounds good to me :) |
Can we make an interactive application to assist the labelling process?
|
Sounds good to me too. Which Yammer channel?
|
@ecomodeller There is one open source tool here: https://trainset.geocene.com/ |
@ecomodeller Is this relevant: http://www.marineinsitu.eu/dashboard ? |
We got a labelled dataset from an actual DHI case based on groundwater measurements. Unfortunately, the dataset cannot be published on GitHub. |
Please note that we now have an interactive application for labelling outliers and training a detector. |
Do you know of any (open source) datasets at DHI that have labelled anomalies that we can use for testing? @ecomodeller @laurafroelich @akfDHI
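For context, here is a rough sketch of how a labelled dataset could be used to score a detector. It is not taken from the anomalydetection package: `range_check`, `score`, and `Score` are hypothetical names, the values and labels are made up for illustration, and point-wise precision/recall/F1 is just one possible choice of metric.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Sequence


@dataclass
class Score:
    precision: float
    recall: float
    f1: float


def range_check(values: Sequence[float], lower: float, upper: float) -> list[bool]:
    """Hypothetical detector: flag every value outside [lower, upper]."""
    return [not (lower <= v <= upper) for v in values]


def score(predicted: Sequence[bool], labels: Sequence[bool]) -> Score:
    """Compare point-wise predictions with ground-truth anomaly labels."""
    if len(predicted) != len(labels):
        raise ValueError("predicted and labels must have the same length")

    # Count true positives, false positives and false negatives.
    tp = sum(p and l for p, l in zip(predicted, labels))
    fp = sum(p and not l for p, l in zip(predicted, labels))
    fn = sum(not p and l for p, l in zip(predicted, labels))

    # Guard against division by zero when nothing is flagged or labelled.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return Score(precision, recall, f1)


# Made-up measurements and labels, for illustration only.
values = [1.0, 1.2, 9.5, 0.9, -4.0, 1.1, 3.2]
labels = [False, False, True, False, True, False, True]

result = score(range_check(values, lower=0.0, upper=5.0), labels)
print(result)
```

In this toy example the range check finds the two out-of-range points but misses the labelled anomaly at 3.2, which lies inside the allowed range. That kind of miss is what a benchmark dataset with labels would let us quantify for each detector.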