Features
Daniele Branchini edited this page Aug 11, 2015
- All arkimet functionality besides metadata extraction and dataset recovery is file format agnostic.
- Data is treated as an opaque, read-only binary string that is never modified, which guarantees its integrity.
- Data files in the archive are only accessed using append operations, to avoid the risk of accidentally corrupting existing data.
- Metadata extraction is very flexible, and it can be customized with the simple and well-known Lua scripting language.
- Metadata contains timestamped annotations to track data workflow.
- Metadata can be summarised, to represent what data can be found in a big dataset without needing to access its contents. Summaries can be shared to build data catalogs.
- Remote data access is provided through arki-server, an HTTP server application.
- arki-server can serve data from local datasets, as well as from remote datasets served by other arki-server instances (this makes it possible, for example, to provide a single external arki-server front-end to various internal arki-servers in an organisation).
- arki-server can be run behind Apache mod_proxy to provide encrypted (SSL) or authenticated access.
- Client data access uses libcurl, and can reach the server over SSL or through HTTP proxies.
- When performing a query, it is possible to extract only the summary of its results, as a quick preview before actually transferring the result data.
- Postprocessing chains can be provided by the server to transfer only the postprocessed data (e.g. transferring an average value instead of a large grid of data).
- File layout can be customised depending on data volumes (one file per day, one file per month, etc.).
- Each dataset can be configured to index a different set of metadata items, to provide the best tradeoff between indexing speed, disk space used by the index and query speed.
- arkimet can detect if a datum already exists in a dataset, and either replace the old version or refuse to import the new one. It is possible to customize which metadata fields make data unique in each dataset.
- Datasets are self-contained, so it is possible to store them on offline media, and query them as soon as the media comes online.
- A powerful and flexible suite of command-line tools makes it easy to integrate arkimet into automated data processing chains in production systems.
- arki-server not only allows remote access to the datasets, but it also provides a low-level, web-based query interface.
- ArkiWEB (soon to be released) is a web-based front-end to arkimet that provides simple and powerful browsing and data retrieval for end users.
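
As a rough illustration of how append-only storage and per-dataset uniqueness rules can fit together, here is a minimal Python sketch. It is a toy in-memory model, not arkimet's actual API or on-disk format: the `ToyDataset` class, its `acquire` and `read` methods, and the metadata field names are all hypothetical. It also assumes that replacing a datum means appending the new version and repointing the index, which is one way to honour the append-only rule and may not match what arkimet does internally.

```python
import io


class ToyDataset:
    """Hypothetical toy model; not arkimet's real API or file format."""

    def __init__(self, unique_fields):
        # Metadata fields that together make a datum unique in this dataset
        self.unique_fields = tuple(unique_fields)
        # Stands in for a data file that is only ever appended to
        self.segment = io.BytesIO()
        # Maps a unique key to the (offset, size) of the current version
        self.index = {}

    def _key(self, metadata):
        return tuple(metadata[f] for f in self.unique_fields)

    def acquire(self, metadata, data, replace=False):
        """Import a datum; return False if a duplicate was refused."""
        key = self._key(metadata)
        if key in self.index and not replace:
            return False
        # Always append: existing bytes are never overwritten
        self.segment.seek(0, io.SEEK_END)
        offset = self.segment.tell()
        self.segment.write(data)
        # Assumption: a replacement only repoints the index entry
        self.index[key] = (offset, len(data))
        return True

    def read(self, metadata):
        offset, size = self.index[self._key(metadata)]
        return self.segment.getvalue()[offset:offset + size]
```

With `ToyDataset(["product", "reftime"])`, importing a second datum with the same `product` and `reftime` is refused unless `replace=True` is passed. Fields outside `unique_fields` (an `origin` field, say) play no part in the duplicate check.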