Skip to content

Latest commit

 

History

History
74 lines (59 loc) · 1.7 KB

HISTORY.rst

File metadata and controls

74 lines (59 loc) · 1.7 KB

History

0.1.16 (2023-11-15)

  • TBD

0.1.15 (2023-11-15)

  • Fix P2P issues

0.1.14 (2023-11-12)

  • Normalize dataframe column name to avg_latency_ms
  • Fix size_in_bytes to take into account dtype

0.1.13 (2023-11-06)

  • ModelArts multi-node entrypoint script
  • Fix local/global rank ambiguity

0.1.12 (2023-10-13)

  • Add device event timing in P2P benchmarks (bug reported by Shanlan li)

0.1.11 (2023-10-08)

  • Fix alltoall device issue

0.1.10 (2023-10-07)

  • Fix elapsed time measurement for NPU/CUDA
  • Improve plotting scripts

0.1.9 (2023-10-02)

  • Fix alltoall torch.arange issue
  • Improve further plotting scripts by directly using torchrun

0.1.8 (2023-09-29)

  • Measure time in nanoseconds
  • Fix bandwidth measurements (MR: 28)

0.1.7 (2023-09-28)

  • Improve scripts functionality
  • Introduce plotter scripts for collectives,latency and bandwidth

0.1.6 (2023-09-26)

  • Fix CSV output filename
  • Minor plotting improvements

0.1.5 (2023-09-21)

  • Fix bumpversion quotes
  • Fixed input tensor size bug on reduce_scatter benchmark
  • Cover all torch dtypes (int8/int16/uint8/etc.)

0.1.2 (2023-09-20)

  • Fix critical bug on setting up the NPU/HCCL environment
  • Introduce dtype as an argument parameter
  • Port all_gather and reduce_scatter collectives
  • Added basic plotting utility under scripts/

0.1.1 (2023-06-15)

  • First release containing P2P OMB-Py benchmarks
  • Support allreduce and broadcast collectives
  • Add torch_npu environment check