Commit
Proportional Uneven RW Inference Sharding (#2734)
Summary:
Support bucketization-aware inference sharding in TGIF for ZCH bucket boundaries from training. A "best effort" sharding is performed across bucket boundaries, in proportion to the memory list.

* Added bucketization awareness to RW sharding.
* TGIF sharding now ensures a difference of at most 1 bucket across uneven shards with equal memory, as opposed to the previous logic of assigning the remainder rows to the last shard.
* InferRWSparseDist checks for customized embedding_shard_metadata for uneven shards before dividing evenly.

Differential Revision: D69057627
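The proportional split described above can be illustrated with a minimal sketch. This is not the code from this commit: the function name `proportional_bucket_shards`, its parameters, and the largest-remainder tie-breaking are all assumptions made for illustration, and the sketch assumes the row count divides evenly into buckets.

```python
from typing import List


def proportional_bucket_shards(
    num_rows: int, num_buckets: int, memory: List[int]
) -> List[int]:
    """Hypothetical helper: split rows across shards on bucket boundaries.

    Each shard receives a whole number of buckets, roughly proportional to
    its entry in `memory`. Shards with equal memory differ by at most one
    bucket.
    """
    if num_buckets <= 0 or num_rows % num_buckets != 0:
        # Assumption for this sketch: every bucket has the same row count.
        raise ValueError("num_rows must be a positive multiple of num_buckets")
    total_memory = sum(memory)
    if total_memory <= 0:
        raise ValueError("memory list must have a positive sum")

    rows_per_bucket = num_rows // num_buckets

    # Integer floor of each shard's ideal bucket share, plus the remainder
    # that the floor discarded (kept as an integer to avoid float error).
    buckets = [num_buckets * m // total_memory for m in memory]
    remainders = [num_buckets * m % total_memory for m in memory]

    # Hand the leftover buckets, one each, to the shards with the largest
    # remainders; ties go to the lower shard index.
    leftover = num_buckets - sum(buckets)
    order = sorted(range(len(memory)), key=lambda i: (-remainders[i], i))
    for i in order[:leftover]:
        buckets[i] += 1

    return [b * rows_per_bucket for b in buckets]


if __name__ == "__main__":
    # 10 buckets of 100 rows over three equal-memory shards.
    print(proportional_bucket_shards(1000, 10, [1, 1, 1]))
```

Under these assumptions, three equal-memory shards sharing 10 buckets receive 4, 3, and 3 buckets, so the extra bucket is spread across shards instead of the whole remainder landing on the last one.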