- Argonne National Laboratory
- Lemont, IL
- https://scholar.google.com/citations?user=dd7fUtEAAAAJ&hl=en
- fsdp_proposal Public
  Forked from khossain4337/fsdp_proposal
  Compare FSDP with DeepSpeed
  TeX, updated Dec 18, 2024
- user-guides Public
  Forked from argonne-lcf/user-guides
  ALCF Systems User Documentation
  HTML, updated Dec 17, 2024
- pyutils Public
  This is a set of utils that I created throughout the years
- dlio_ml_workloads Public
  Forked from argonne-lcf/dlio_ml_workloads
  Reference workloads for DLIO Benchmark
- horovod Public
  Forked from horovod/horovod
  Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
  Python, Other, updated May 30, 2024
- dlio_benchmark Public
  Forked from argonne-lcf/dlio_benchmark
  An I/O benchmark for deep learning applications
  Python, Apache License 2.0, updated Apr 12, 2024
- Megatron-DeepSpeed Public
  Forked from argonne-lcf/Megatron-DeepSpeed
  Ongoing research training transformer language models at scale, including: BERT & GPT-2
  Python, Other, updated Mar 15, 2024
- DeepSpeed Public
  Forked from deepspeedai/DeepSpeed
  DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
  Python, Apache License 2.0, updated Mar 15, 2024
- E3SM-IO Public
  Forked from Parallel-NetCDF/E3SM-IO
  Benchmark programs using the I/O pattern of E3SM
  C++, Other, updated Feb 15, 2024
- dlio-profiler Public
  Forked from LLNL/dftracer
  A low-level profiler for capturing I/O calls from deep learning applications.
  C++, MIT License, updated Oct 19, 2023
- vol-cache Public
  Forked from HDFGroup/vol-cache
  HDF5 Cache VOL connector for caching data on fast storage layers and moving data asynchronously to the parallel file system to hide I/O overhead.
  C, BSD 3-Clause "New" or "Revised" License, updated Sep 20, 2023
- MLPerf_training Public
  Forked from mlcommons/training
  Reference implementations of MLPerf™ training benchmarks
- E4S-Documenter Public
  Forked from E4S-Project/E4S-Documenter
  A tool to generate documentation for a project based on project metadata (README, Changelog, License, etc.) stored in a yaml file.
  Python, MIT License, updated Apr 10, 2023
- h5bench Public
  Forked from hpc-io/h5bench
  A benchmark suite for measuring HDF5 performance.
  C, Other, updated Dec 6, 2022
- ai-science-training-series Public
  Forked from argonne-lcf/ai-science-training-series
  Jupyter Notebook, updated Sep 27, 2022
- incubator-mxnet Public
  Forked from apache/mxnet
  Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
  C++, Apache License 2.0, updated Jun 21, 2022
- dlio_profiling Public
  This repo demonstrates how to profile I/O for deep learning applications. This is based on VaniDL
  Python, updated May 26, 2022
- training_results_v1.1 Public
  Forked from mlcommons/training_results_v1.1
  Python, Other, updated Apr 5, 2022
- amrex Public
  Forked from AMReX-Codes/amrex
  AMReX: Software Framework for Block Structured AMR
  C++, Other, updated Jan 26, 2022
- scorpio Public
  Forked from E3SM-Project/scorpio
  A high-level Parallel I/O Library for structured grid applications
  C, updated Jan 21, 2022
- vanidl Public
  Forked from hariharan-devarajan/vanidl
  VaniDL is a tool for analyzing I/O patterns and behavior with Deep Learning Applications.
  Python, MIT License, updated Nov 2, 2021
- vol-async Public
  Forked from HDFGroup/vol-async
  HDF5 Asynchronous I/O VOL connector that enables asynchronous I/O for HDF5 applications
  C, Other, updated Sep 17, 2021