
[WIP] Set allocation as loop in fusion segmentor #3880

Draft
wants to merge 1 commit into base: main
Conversation

Priya2698
Collaborator

No description provided.

@Priya2698
Collaborator Author

!test


Description

  • Added loop allocation for sharded tensors

  • Updated sharded tensor check logic

  • Commented out unnecessary pass in pre-segmenter


Changes walkthrough 📝

Relevant files

Enhancement

csrc/fusion_segmenter.cpp (+17/-0): Added loop allocation for sharded tensors

  • Added setAllocationAsLoopForShardedTvs method to finalize
  • Implemented logic to set allocation domain as loop domain for sharded
    tensors

csrc/multidevice/utils.cpp (+23/-12): Updated sharded tensor check logic

  • Updated isSharded function to check both allocation and loop domains
  • Added error handling for multiple sharded axes

csrc/fusion_segmenter.h (+3/-0): Added method declaration

  • Added declaration for setAllocationAsLoopForShardedTvs method

Miscellaneous

csrc/preseg_passes/pre_segmenter.cpp (+1/-1): Commented out unnecessary pass

  • Commented out MakeReshardingContiguousPass as it may be unnecessary

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🧪 No relevant tests
⚡ Recommended focus areas for review

Performance Impact

The new function setAllocationAsLoopForShardedTvs modifies the allocation domain of sharded tensors. It is important to evaluate the performance impact of this change and ensure that it meets the expected performance goals.

void SegmentedFusion::setAllocationAsLoopForShardedTvs() {
  auto set_allocation_as_loop = [](std::vector<Val*> vals) {
    auto tvs = ir_utils::filterByType<TensorView>(vals);
    std::for_each(tvs.begin(), tvs.end(), [](TensorView* tv) {
      if (isSharded(tv)) {
        tv->setAllocationDomain(tv->getLoopDomain(), true);
      }
    });
  };

  for (auto group : groups()) {
    set_allocation_as_loop(group->inputs());
    set_allocation_as_loop(group->outputs());
  }
}
Logic Verification

The logic for determining if a TensorView is sharded has been updated. It is crucial to verify that this new logic correctly identifies sharded tensors and does not introduce any false positives or negatives.

bool isSharded(const TensorView* tv) {
  // First check allocation domain if available, or the logical domain.
  auto num_sharded_axes = std::count_if(
      tv->getMaybeAllocationDomain().begin(),
      tv->getMaybeAllocationDomain().end(),
      [](IterDomain* id) { return id->isDeviceDim(); });

  if (num_sharded_axes == 1) {
    return true;
  }

  // Check if only the loop domain is sharded.
  // It is possible if the allocation domain has not been set yet.
  if (num_sharded_axes == 0) {
    num_sharded_axes = std::count_if(
        tv->getLoopDomain().begin(),
        tv->getLoopDomain().end(),
        [](IterDomain* id) { return id->isDeviceDim(); });
  }

  NVF_ERROR(
      num_sharded_axes <= 1,
      "Multiple IterDomains parallelized on DIDx in TensorView ",
      tv);

  return num_sharded_axes == 1;
}
Commented Code

The line // OptimizationPass<MakeReshardingContiguousPass>::runPass(fusion); is commented out. It is important to understand why this pass is not being run and whether it is necessary for the overall functionality or performance of the code.

// OptimizationPass<MakeReshardingContiguousPass>::runPass(fusion);
