Skip to content

Commit

Permalink
docs: main readme
Browse files Browse the repository at this point in the history
Signed-off-by: Alex Aizman <[email protected]>
  • Loading branch information
alex-aizman committed Jan 28, 2025
1 parent dfb0be9 commit dcd1175
Show file tree
Hide file tree
Showing 3 changed files with 38 additions and 27 deletions.
10 changes: 5 additions & 5 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -30,11 +30,11 @@ For easy usage, management, and monitoring, there's also:
```console
$ ais <TAB-TAB>

bucket job storage remote-cluster prefetch evict create
object auth archive alias put rmo dsort
cluster show log ls start wait search
config help tls stop get blob-download
etl advanced performance download rmb cp
bucket job storage remote-cluster download create wait
object auth archive alias get evict cp
cluster show log rmo dsort stop ls
config help tls space-cleanup put scrub prefetch
etl advanced performance blob-download rmb start search
```

AIS runs natively on Kubernetes and features open format - thus, the freedom to copy or move your data from AIS at any time using the familiar Linux `tar(1)`, `scp(1)`, `rsync(1)` and similar.
Expand Down
10 changes: 5 additions & 5 deletions docs/cli.md
Original file line number Diff line number Diff line change
Expand Up @@ -115,11 +115,11 @@ The recommended and, actually, fastest way to get started with CLI is to type `a
```console
$ ais <TAB-TAB>

bucket job storage remote-cluster prefetch evict create
object auth archive alias put rmo dsort
cluster show log ls start wait search
config help tls stop get blob-download
etl advanced performance download rmb cp
bucket job storage remote-cluster download create wait
object auth archive alias get evict cp
cluster show log rmo dsort stop ls
config help tls space-cleanup put scrub prefetch
etl advanced performance blob-download rmb start search
```

These are the current set of top-level commands. Each command has its own extended help (the `--help` option) and, usually, multiple sub-commands
Expand Down
45 changes: 28 additions & 17 deletions docs/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -18,22 +18,33 @@ AIS [consistently shows balanced I/O distribution and linear scalability](https:

## Features

* **Deploys anywhere**. AIS clusters are immediately deployable on any commodity hardware, on any Linux machine(s).
* **Highly available** control and data planes, end-to-end data protection, self-healing, n-way mirroring, erasure coding, and arbitrary number of extremely lightweight access points.
* **REST API**. Comprehensive native HTTP-based API, as well as compliant [Amazon S3 API](/docs/s3compat.md) to run unmodified S3 clients and apps.
* **Unified namespace** across multiple [remote backends](/docs/providers.md) including Amazon S3, Google Cloud, and Microsoft Azure.
* **Network of clusters**. Any AIS cluster can attach any other AIS cluster, thus gaining immediate visibility and fast access to the respective hosted datasets.
* **Turn-key cache**. Can be used as a standalone highly-available protected storage and/or LRU-based fast cache. Eviction watermarks, as well as numerous other management policies, are per-bucket configurable.
* **ETL offload**. The capability to run I/O intensive custom data transformations *close to data* - offline (dataset to dataset) and inline (on-the-fly).
* **File datasets**. AIS can be immediately populated from any file-based data source (local or remote, ad-hoc/on-demand or via asynchronus batch).
* **Read-after-write consistency**. Reading and writing (as well as all other control and data plane operations) can be performed via any (random, selected, or load-balanced) AIS gateway (a.k.a. "proxy"). Once the first replica of an object is written and _finalized_ subsequent reads are guaranteed to view the same content. Additional copies and/or EC slices, if configured, are added asynchronously via `put-copies` and `ec-put` jobs, respectively.
* **Write-through**. In presence of any [remote backend](/docs/providers.md), AIS executes remote write (e.g., using vendor's SDK) as part of the [transaction](https://github.com/NVIDIA/aistore/blob/main/docs/overview.md#read-after-write-consistency) that places and _finalizes_ the first replica.
* **Small file datasets.** To serialize small files and facilitate batch processing, AIS supports TAR, TAR.GZ (or TGZ), ZIP, and TAR.LZ4 formatted objects (often called _shards_). Resharding (for optimal sorting and sizing), listing contained files (samples), appending to existing shards, and generating new ones from existing objects and/or client-side files - is also fully supported.
* **Kubernetes**. Provides for easy Kubernetes deployment via a separate GitHub [repo](https://github.com/NVIDIA/ais-k8s) and [AIS/K8s Operator](https://github.com/NVIDIA/ais-k8s/tree/master/operator).
* **Command line management**. Integrated powerful [CLI](/docs/cli.md) for easy management and monitoring.
* **Access control**. For security and fine-grained access control, AIS includes OAuth 2.0 compliant [Authentication Server (AuthN)](/docs/authn.md). A single AuthN instance executes CLI requests over HTTPS and can serve multiple clusters.
* **Distributed shuffle** extension for massively parallel resharding of very large datasets.
* **Batch jobs**. APIs and CLI to start, stop, and monitor documented [batch operations](/docs/batch.md), such as `prefetch`, `download`, copy or transform datasets, and many more.
***Deploys anywhere**. AIStore deploys _immediately_ and _anywhere_, from an [all-in-one](https://github.com/NVIDIA/aistore/blob/main/deploy/prod/docker/single/README.md) ready-to-use docker container and Google [Colab](https://aistore.nvidia.com/blog/2024/09/18/google-colab-aistore) notebook, on the one hand, to multi-petabyte [Kubernetes](https://github.com/NVIDIA/ais-k8s) clusters at NVIDIA data centers, on the other. There are [no limitations](https://github.com/NVIDIA/aistore/blob/main/docs/overview.md#no-limitations-principle) - the system will run on any Linux machine, virtual or physical.
***Highly available** control and data planes, end-to-end data protection, self-healing, n-way mirroring, erasure coding, and arbitrary number of extremely lightweight access points.
***REST API**. Comprehensive native HTTP-based API, as well as compliant [Amazon S3 API](/docs/s3compat.md) to run unmodified S3 clients and apps.
***Unified namespace** across multiple [remote backends](/docs/providers.md) including Amazon S3, Google Cloud, and Microsoft Azure.
***Network of clusters**. Any AIS cluster can attach any other AIS cluster, thus gaining immediate visibility and fast access to the respective hosted datasets.
***Turn-key cache**. Can be used as a standalone highly-available protected storage and/or LRU-based fast cache. Eviction watermarks, as well as numerous other management policies, are per-bucket configurable.
***ETL offload**. The capability to run I/O intensive custom data transformations *close to data* - offline (dataset to dataset) and inline (on-the-fly).
***File datasets**. AIS can be immediately populated from any file-based data source (local or remote, ad-hoc/on-demand or via asynchronus batch).
***Read-after-write consistency**. Reading and writing (as well as all other control and data plane operations) can be performed via any (random, selected, or load-balanced) AIS gateway (a.k.a. "proxy"). Once the first replica of an object is written and _finalized_ subsequent reads are guaranteed to view the same content. Additional copies and/or EC slices, if configured, are added asynchronously via `put-copies` and `ec-put` jobs, respectively.
***Write-through**. In presence of any [remote backend](/docs/providers.md), AIS executes remote write (e.g., using vendor's SDK) as part of the [transaction](https://github.com/NVIDIA/aistore/blob/main/docs/overview.md#read-after-write-consistency) that places and _finalizes_ the first replica.
***Small file datasets.** To serialize small files and facilitate batch processing, AIS supports TAR, TAR.GZ (or TGZ), ZIP, and TAR.LZ4 formatted objects (often called _shards_). Resharding (for optimal sorting and sizing), listing contained files (samples), appending to existing shards, and generating new ones from existing objects and/or client-side files - is also fully supported.
***Kubernetes**. Provides for easy Kubernetes deployment via a separate GitHub [repo](https://github.com/NVIDIA/ais-k8s) and [AIS/K8s Operator](https://github.com/NVIDIA/ais-k8s/tree/master/operator).
***Access control**. For security and fine-grained access control, AIS includes OAuth 2.0 compliant [Authentication Server (AuthN)](/docs/authn.md). A single AuthN instance executes CLI requests over HTTPS and can serve multiple clusters.
***Distributed shuffle** extension for massively parallel resharding of very large datasets.
***Batch jobs**. APIs and CLI to start, stop, and monitor documented [batch operations](/docs/batch.md), such as `prefetch`, `download`, copy or transform datasets, and many more.

For easy usage, management, and monitoring, there's also:
* **Integrated easy-to-use [CLI](/docs/cli.md)**, with top-level commands including:
```console
$ ais <TAB-TAB>

bucket job storage remote-cluster download create wait
object auth archive alias get evict cp
cluster show log rmo dsort stop ls
config help tls space-cleanup put scrub prefetch
etl advanced performance blob-download rmb start search
```

AIS runs natively on Kubernetes and features open format - thus, the freedom to copy or move your data from AIS at any time using the familiar Linux `tar(1)`, `scp(1)`, `rsync(1)` and similar.

Expand All @@ -45,7 +56,7 @@ For developers and data scientists, there's also:
* [Boto3 support](https://github.com/NVIDIA/aistore/tree/main/python/aistore/botocore_patch) for interoperability with AWS SDK for Python (aka [Boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html)) client
- and other [Botocore](https://github.com/boto/botocore) derivatives.

For the original AIStore **white paper** and design philosophy, for introduction to large-scale deep learning and the most recently added features, please see [AIStore Overview](/docs/overview.md) (where you can also find six alternative ways to work with existing datasets). For our 2024 KubeCon presentation, please see [AIStore: enhancing petascale Deep Learning across Cloud backends](https://www.youtube.com/watch?v=N-d9cbROndg).
For the original AIStore **white paper** and design philosophy, for introduction to large-scale deep learning and the most recently added features, please see [AIStore Overview](/docs/overview.md) (where you can also find six alternative ways to work with existing datasets). For our 2024 KubeCon presentation, please see [AIStore: Enhancing petascale Deep Learning across Cloud backends](https://www.youtube.com/watch?v=N-d9cbROndg).

Finally, [getting started](/docs/getting_started.md) with AIS takes only a few minutes.

Expand Down

0 comments on commit dcd1175

Please sign in to comment.