Develop upstream sync 230201 #1987

i-chaochen · 2023-02-01T19:46:29Z

Disable unit tests:

1. multi_client_test_nccl_local_2gpus (Failed UT: multi_client_test_nccl_local_2gpus #1980)
2. device_tracer_test_gpu (https://github.com/ROCmSoftwarePlatform/frameworks-internal/issues/3175)
3. To fix the estimator_export error, I put estimator_export back into tf_export (tensorflow@3837d0f).
4. cudnn_deterministic_ops_test needs to be checked (https://github.com/ROCmSoftwarePlatform/frameworks-internal/issues/3563).

~~5. tensor_or_memref.h fails to build in the CPU test~~

Also, roc_blas.cc needs to be upstreamed.
https://github.com/ROCmSoftwarePlatform/frameworks-internal/issues/3547

Before this change, XLA compilation failed when calling `tf.image.non_max_suppression_padded` with an undetermined batch size for the input boxes. The failure occurs because:

1. When this op is called with `sorted_input=False`, `_sort_scores_and_boxes` is called.
2. In `_sort_scores_and_boxes`, if the input boxes have a dynamic batch size, `tf.shape(boxes)[1]` is unknown at XLA compilation time because of the reshape operation there.
3. In the while loop body (`supression_loop_body`), `num_tiles` cannot be determined because `tf.shape(boxes)[1]` is unknown during compilation.
4. The op `math_ops.range(num_tiles)` fails to compile because `num_tiles` is unknown at compilation time.

This change fixes the issue by adding `num_boxes` to the reshape op. PiperOrigin-RevId: 504902678
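
To illustrate why an explicit `num_boxes` helps, here is a minimal toy model of static shape inference for a reshape target. It is not TensorFlow or XLA code: `infer_reshape`, the `Dim` alias, and the box count of 100 are hypothetical, and it assumes the fix replaces an inferred `-1` dimension with the explicit box count.

```python
from typing import List, Optional, Sequence

Dim = Optional[int]  # None marks a dimension that is unknown at compile time


def infer_reshape(num_elements: Dim, target: Sequence[Dim]) -> List[Dim]:
    """Toy static shape inference for a reshape target (not TensorFlow code).

    A -1 entry can only be resolved when the element count and every other
    target dimension are known statically; otherwise it stays unknown (None).
    """
    others_known = all(d is not None for d in target if d != -1)
    known_product = 1
    for d in target:
        if d is not None and d != -1:
            known_product *= d

    result: List[Dim] = []
    for d in target:
        if d == -1:
            if num_elements is None or not others_known:
                result.append(None)
            else:
                result.append(num_elements // known_product)
        else:
            result.append(d)
    return result


# Dynamic batch size: the batch dimension and the element count are unknown.
before = infer_reshape(None, [None, -1, 4])   # box dimension left as -1
after = infer_reshape(None, [None, 100, 4])   # box count passed explicitly
```

In this model `before` leaves the box dimension unknown, while `after` keeps it at 100 even though the batch size is still dynamic, which is the property a later `range(num_tiles)` would need.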

In a similar fashion to the tfl.sin and tfl.cos legalization, a tosa.table provides an atan approximation. Additional logic is then used to determine the correct quadrant of the atan2 function. Signed-off-by: Luke Hutton <[email protected]> Change-Id: Iae1384009d825d01e5cf48ad7c3ff8fba77114cf
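
The quadrant logic can be sketched in plain Python. This is only an illustration of the general technique, not the actual TOSA legalization: `approx_atan_unit` is a hypothetical stand-in for the table lookup (it simply calls `math.atan`), and reducing the ratio to the range [0, 1] before the lookup is an assumption.

```python
import math


def approx_atan_unit(t: float) -> float:
    """Hypothetical stand-in for the table lookup: atan on the range [0, 1]."""
    return math.atan(t)


def atan2_from_table(y: float, x: float) -> float:
    """Build atan2 from an atan approximation plus quadrant correction."""
    ax, ay = abs(x), abs(y)
    if ax == 0.0 and ay == 0.0:
        return 0.0
    # Keep the ratio in [0, 1] so the lookup only has to cover a bounded range.
    if ay <= ax:
        angle = approx_atan_unit(ay / ax)
    else:
        angle = math.pi / 2 - approx_atan_unit(ax / ay)
    # angle is now in [0, pi/2]; mirror it into the correct quadrant.
    if x < 0.0:
        angle = math.pi - angle
    if y < 0.0:
        angle = -angle
    return angle
```

With an exact lookup the result matches `math.atan2` for ordinary inputs; signed zeros and other special values are not handled in this sketch.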

Currently, users have to provide a dummy deleter if they want to declare an ExecutionReference variable. This change solves that issue by declaring ExecutionReference as a class that wraps `std::unique_ptr<ExecutionContext, void(*)(ExecutionContext *)>`. Example: `struct Test { ExecutionReference exec_ref; };` PiperOrigin-RevId: 504904730

The shape refiner reruns shape inference for nested function calls every time a materialized argument is requested. Before this change, the constant folding mechanism used in the refiner stopped traversing the subgraph right after encountering the first unresolved argument. After this change, it continues the traversal to cover all unresolved arguments. This change reduces the preprocessing time of some of our models by 75%. PiperOrigin-RevId: 504931297
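
The difference between the two traversal strategies can be sketched with a toy graph walk. This is not the shape refiner's code: the graph layout, the `unresolved_args` helper, and the convention that argument nodes are named `arg*` are all invented for illustration.

```python
from typing import Dict, List, Set


def unresolved_args(graph: Dict[str, List[str]], root: str,
                    resolved: Set[str], stop_at_first: bool) -> List[str]:
    """Depth-first walk from root that reports unresolved argument nodes.

    Nodes whose name starts with "arg" are treated as function arguments
    (a naming convention invented for this sketch).
    """
    found: List[str] = []
    seen: Set[str] = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        if node.startswith("arg") and node not in resolved:
            found.append(node)
            if stop_at_first:
                break  # old behaviour in this sketch: give up after one hit
            continue   # new behaviour: keep walking the rest of the subgraph
        stack.extend(graph.get(node, []))
    return found


graph = {"out": ["add", "mul"], "add": ["arg0", "arg1"], "mul": ["arg1", "arg2"]}
first_only = unresolved_args(graph, "out", set(), stop_at_first=True)
all_args = unresolved_args(graph, "out", set(), stop_at_first=False)
```

In this toy model, stopping at the first hit reports one argument per walk, so resolving three arguments takes three walks, while the full traversal reports all three in a single pass.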

This fixes double-free errors and memory leaks, for example when running the HLO is unsuccessful. The old code path is also left in place, as a lot of our code depends on the ability to run the same HLO multiple times without reallocating the input buffers. PiperOrigin-RevId: 506238363

i-chaochen · 2023-02-01T20:04:38Z

retest Ubuntu-GPU-single please
retest Ubuntu-GPU-multi please
retest Ubuntu-CPU please
retest Ubuntu-sanity please

jayfurmanek · 2023-02-02T14:45:05Z

retest Ubuntu-sanity please

i-chaochen · 2023-02-02T22:40:45Z

retest Ubuntu-GPU-single please

tensorflower-gardener and others added 30 commits January 26, 2023 11:59

[JAX] Allow JAX_USE_PJRT_C_API_ON_TPU to be 0

1869d54

Prior to this change, it only accepted 1, false, or true. PiperOrigin-RevId: 504902827
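
As a rough sketch of the accepted values, a boolean environment flag that takes `0`, `1`, `false`, or `true` could be parsed as follows. This is not JAX's actual parser: `parse_bool_env`, the `environ` parameter, and the case-insensitive matching are assumptions made for illustration.

```python
import os
from typing import Mapping, Optional


def parse_bool_env(name: str, default: bool = False,
                   environ: Optional[Mapping[str, str]] = None) -> bool:
    """Parse a boolean flag from the environment (hypothetical helper)."""
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None:
        return default
    value = raw.strip().lower()
    if value in ("1", "true"):
        return True
    if value in ("0", "false"):
        return False
    raise ValueError(f"{name} must be one of 0, 1, false, true; got {raw!r}")
```

Passing a mapping as `environ` lets the helper be exercised without touching the real process environment.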

address comments

cbe054a

* remove unnecessary type checks * add note about numerical behaviour of std::atan2 * improve error message for expected inputs * undo change updating copyright year Change-Id: Iea8339da437a5ff3e6fe065c715c2c97e696fdbb

Update ops references from framework/type_spec.py to `framework/typ…

02438f1

…e_spec_registry.py`. PiperOrigin-RevId: 504905987

Update the metric.

f707809

PiperOrigin-RevId: 504906696

Factor out get_default_ops and make get_ops_from_nodedef a public met…

4b46509

…hod in TF selective_registration_header_lib. PiperOrigin-RevId: 504910088

For dynamic shapes, need to read metadata to get literal's real size.

40763f2

PiperOrigin-RevId: 504910311

Removed code related to PipelineSubMeshes.

6285a27

PiperOrigin-RevId: 504910444

[mhlo] Reuse 4 shape functions from StableHLO

27e0323

Move verifyRecvOp and verifyInfeedOp with TokenType openxla/stablehlo#852 Move inferFftOp and verifyRngOp with Enums openxla/stablehlo#853 PiperOrigin-RevId: 504910725

Update references in python/data/util from framework/type_spec.py…

a342ad9

… to `framework/type_spec_registry.py`. PiperOrigin-RevId: 504911860

[TFG] Fix index out of bound issue in TFG shape inference pass.

62efa48

Ensure that there is at least a single non control input to the Identity/IdentityN. PiperOrigin-RevId: 504913650
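
A guard of this kind can be sketched in Python using the GraphDef convention that control inputs are written with a leading `^`. The TFG pass itself works on MLIR operands, so this is only an analogy, and `first_data_input` is a hypothetical helper.

```python
from typing import Optional, Sequence


def first_data_input(inputs: Sequence[str]) -> Optional[str]:
    """Return the first non-control input, or None if there is none.

    GraphDef-style input names mark control dependencies with a leading "^".
    Returning None avoids indexing into an empty list of data inputs.
    """
    for name in inputs:
        if not name.startswith("^"):
            return name
    return None
```

A caller would skip shape propagation for an Identity/IdentityN-like node when this returns `None` instead of reading input 0 unconditionally.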

Update references in python/ops/linalg from `framework/type_spec.py…

024892d

…` to `framework/type_spec_registry.py`. PiperOrigin-RevId: 504917585

Migrate remaining tf.function validation to FunctionType

b7d4db7

PiperOrigin-RevId: 504920230

Update selective_build_scripts with NDK r21e

29bb19a

PiperOrigin-RevId: 504922596

[XLA:GPU] Layout normalization for dynamic-update-slice

f481e61

PiperOrigin-RevId: 504923081

Preserve HLO names when removing infeed and outfeed

f67e32b

PiperOrigin-RevId: 504928473

[XLA:GPU] Reuse GetBestAlgorithm in runtime autotuning

dfc491d

Export GetBestAlgorithm from GemmAlgorithmPicker so that runtime autotuning can use it as a subroutine. PiperOrigin-RevId: 504928946

Add check for "ksize" argument in MaxPoolWithArgmax

bec8b93

PiperOrigin-RevId: 504930976

Migrate remaining tf.function validation to FunctionType

88ec865

PiperOrigin-RevId: 504935198

[NFC] Use correct thunk kind for NcclCollectivePermuteStartThunk

a8ae7be

PiperOrigin-RevId: 504937807

#tf-data A stop gap fix to add available memory check in prefetch aut…

9f4acbb

…otuner. PiperOrigin-RevId: 504939086

#tf-data Restart the stage_based_autotune experiment at 5%

a887b9f

PiperOrigin-RevId: 504939291

Move MatMul flop calculation methods to flops_registry.py.

8326cb7

This breaks a cycle between "python/framework/graph_util_impl.py" and "python/framework/convert_to_constants.py". PiperOrigin-RevId: 504939493

Merge pull request tensorflow#59436 from awsaf49:to_ordinal

df98914

PiperOrigin-RevId: 504940089

Change DTensor to pass raw Module to ParallelExecutor

c7aa8db

PiperOrigin-RevId: 504940378

address comments

3986b07

* Cleanup of error messages. * Cleanup casting inputs/outputs. * Use CHECK-DAG instead of CHECK for constants. * Spelling Change-Id: Ibf01ac5b711944bb2efbd7184820a304cbae8501 Signed-off-by: Luke Hutton <[email protected]>

#tf-data-service Add thread annotations.

91e8d71

PiperOrigin-RevId: 504949702

tyb0807 and others added 19 commits January 31, 2023 22:23

[xla:cpu] Add debug info to XLA CPU pipeline

12eef94

This adds a pass that provides some debug info with which basic line number info can be generated. Adapted from Flang's AddDebugFoundationPass. PiperOrigin-RevId: 506213461

update fuzztest dependency

78034c6

PiperOrigin-RevId: 506217195

Remove references to stream_executor/lib

9edf0d9

PiperOrigin-RevId: 506225078

Merge pull request tensorflow#58763 from shawnwang18:upstream/xla_hor…

722fbb8

…izontal_loop_fusion PiperOrigin-RevId: 5062358

compat: Update forward compatibility horizon to 2023-02-01

98c1863

PiperOrigin-RevId: 506239134

Update GraphDef version to 1394.

e784dac

PiperOrigin-RevId: 506239156

Fix a typo in the documentation in preemption_watcher.py

d38f8f0

PiperOrigin-RevId: 506240202

Rollback of PR tensorflow#58763

6802154

PiperOrigin-RevId: 506243978

[GmlSt] Group tiling passes for cpu, gpu and triton.

b753419

PiperOrigin-RevId: 506244287

Propagate quantize_params in prepare_pass

f206f16

PiperOrigin-RevId: 506252805

init merge

f0b3a19

resolve all conflicts

ead926e

enable rocm/hip for multi_client_test

57d4b8d

disable nccl 2gpus tests

997bcd9

disable device_tracer_test_gpu as it's flaky

5a98ecf

disable tensor_or_memref_test and interpreter_value_test

3570021

add rocsolver_wrapper.h env.h

48cd8ae

keep estimator_export in tf_export

6fc8030

cudnn_deterministic_ops_test is disabled again

2b02f29

i-chaochen requested a review from jayfurmanek February 2, 2023 22:40

fix tensor_or_memref.h build error and enable tensor_or_memref_test

15bddce

i-chaochen requested a review from rahulbatra85 February 3, 2023 11:21

i-chaochen mentioned this pull request Feb 3, 2023

Develop upstream sync 230130 #1981

Closed

jayfurmanek approved these changes Feb 3, 2023

i-chaochen merged commit a61751c into develop-upstream Feb 3, 2023

i-chaochen mentioned this pull request Mar 15, 2023

Develop upstream sync 230313 #2020

Merged
