forked from facebookresearch/dlrm
Showing 17 changed files with 4,078 additions and 33 deletions.
@@ -0,0 +1,5 @@
# Code of Conduct

Facebook has adopted a Code of Conduct that we expect project participants to adhere to.
Please read the [full text](https://code.fb.com/codeofconduct/)
so that you can understand what actions will and will not be tolerated.
@@ -0,0 +1,36 @@
# Contributing to DLRM
We want to make contributing to this project as easy and transparent as
possible.

## Pull Requests
We actively welcome your pull requests.

1. Fork the repo and create your branch from `master`.
2. If you've added code that should be tested, add tests.
3. If you've changed APIs, update the documentation.
4. Ensure the test suite passes.
5. Make sure your code lints.
6. If you haven't already, complete the Contributor License Agreement ("CLA").

## Contributor License Agreement ("CLA")
In order to accept your pull request, we need you to submit a CLA. You only need
to do this once to work on any of Facebook's open source projects.

Complete your CLA here: <https://code.facebook.com/cla>

## Issues
We use GitHub issues to track public bugs. Please ensure your description is
clear and has sufficient instructions to be able to reproduce the issue.

Facebook has a [bounty program](https://www.facebook.com/whitehat/) for the safe
disclosure of security bugs. In those cases, please go through the process
outlined on that page and do not file a public issue.

## Coding Style
* 4 spaces for indentation rather than tabs
* 80 character line length
* in general, please maintain a consistent style with the rest of the code

## License
By contributing to DLRM, you agree that your contributions will be licensed
under the LICENSE file in the root directory of this source tree.
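
Two of the Coding Style rules above (no tab indentation, 80-character lines) can be checked mechanically. The following is a rough sketch and not part of this commit: `check_style` is a hypothetical helper name, and it only flags lines that begin with a tab or exceed 80 characters. It does not verify that indentation uses exactly 4 spaces.

```shell
#!/bin/bash
# Hypothetical helper (not part of the DLRM repository): report lines that
# start with a tab or are longer than 80 characters.
# Returns 0 if the file is clean, 1 otherwise.
check_style() {
    local file="$1" tab status=0
    tab=$(printf '\t')
    # grep prints "line_number:content" for every line that begins with a tab.
    if grep -n "^${tab}" "$file"; then
        status=1
    fi
    # awk exits non-zero if at least one line is longer than 80 characters.
    if ! awk 'length($0) > 80 { print NR ": longer than 80 characters"; bad = 1 }
              END { exit bad }' "$file"; then
        status=1
    fi
    return "$status"
}

# Demo on a throwaway file containing one tab-indented line.
demo=$(mktemp)
printf 'if true; then\n\techo "tab-indented"\nfi\n' > "$demo"
check_style "$demo" || echo "style problems found"
rm -f "$demo"
```

A check like this could run before step 5 of the pull request list ("Make sure your code lints"), although the source does not say which linter the project actually uses.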
@@ -0,0 +1,21 @@
MIT License

Copyright (c) Facebook, Inc. and its affiliates.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
@@ -0,0 +1,149 @@
#!/bin/bash
# Copyright (c) Facebook, Inc. and its affiliates.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

#check if extra argument is passed to the test
if [[ $# == 1 ]]; then
    dlrm_extra_option=$1
else
    dlrm_extra_option=""
fi
#echo $dlrm_extra_option

build=1
cpu=1
gpu=1
pt=1
c2=1

ncores=28 #12 #6
nsockets="0"

ngpus="1 2 4 8"

numa_cmd="numactl --physcpubind=0-$((ncores-1)) -m $nsockets" #run on one socket, without HT
dlrm_pt_bin="python dlrm_s_pytorch.py"
dlrm_c2_bin="python dlrm_s_caffe2.py"

data=random #synthetic
print_freq=100
rand_seed=727

c2_net="async_scheduling"

#Model param
mb_size=2048 #1024 #512 #256
nbatches=1000 #500 #100
bot_mlp="512-512-64"
top_mlp="1024-1024-1024-1"
emb_size=64
nindices=100
emb="1000000-1000000-1000000-1000000-1000000-1000000-1000000-1000000"
interaction="dot"

#_args="--mini-batch-size="${mb_size}\
_args=" --num-batches="${nbatches}\
" --data-generation="${data}\
" --arch-mlp-bot="${bot_mlp}\
" --arch-mlp-top="${top_mlp}\
" --arch-sparse-feature-size="${emb_size}\
" --arch-embedding-size="${emb}\
" --num-indices-per-lookup="${nindices}\
" --arch-interaction-op="${interaction}\
" --numpy-rand-seed="${rand_seed}\
" --print-freq="${print_freq}\
" --print-time"\
" --enable-profiling "

c2_args=" --caffe2-net-type="${c2_net}

if [ $build = 1 ]; then
    BUCK_DISTCC=0 buck build @mode/opt //experimental/mnaumov/hw/dlrm:dlrm_s_pytorch //experimental/mnaumov/hw/dlrm:dlrm_s_caffe2
fi

# CPU Benchmarking
if [ $cpu = 1 ]; then
    echo "--------------------------------------------"
    echo "CPU Benchmarking - running on $ncores cores"
    echo "--------------------------------------------"
    if [ $pt = 1 ]; then
        outf="model1_CPU_PT_$ncores.log"
        outp="dlrm_s_pytorch.prof"
        echo "-------------------------------"
        echo "Running PT (log file: $outf)"
        echo "-------------------------------"
        cmd="$numa_cmd $dlrm_pt_bin --mini-batch-size=$mb_size $_args $dlrm_extra_option > $outf"
        echo $cmd
        eval $cmd
        min=$(grep "iteration" $outf | awk 'BEGIN{best=999999} {if (best > $7) best=$7} END{print best}')
        echo "Min time per iteration = $min"
        # move profiling file(s)
        mv $outp ${outf//".log"/".prof"}
        mv ${outp//".prof"/".json"} ${outf//".log"/".json"}

    fi
    if [ $c2 = 1 ]; then
        outf="model1_CPU_C2_$ncores.log"
        outp="dlrm_s_caffe2.prof"
        echo "-------------------------------"
        echo "Running C2 (log file: $outf)"
        echo "-------------------------------"
        cmd="$numa_cmd $dlrm_c2_bin --mini-batch-size=$mb_size $_args $c2_args $dlrm_extra_option 1> $outf 2> $outp"
        echo $cmd
        eval $cmd
        min=$(grep "iteration" $outf | awk 'BEGIN{best=999999} {if (best > $7) best=$7} END{print best}')
        echo "Min time per iteration = $min"
        # move profiling file (collected from stderr above)
        mv $outp ${outf//".log"/".prof"}
    fi
fi

# GPU Benchmarking
if [ $gpu = 1 ]; then
    echo "--------------------------------------------"
    echo "GPU Benchmarking - running on $ngpus GPUs"
    echo "--------------------------------------------"
    for _ng in $ngpus
    do
        # weak scaling
        # _mb_size=$((mb_size*_ng))
        # strong scaling
        _mb_size=$((mb_size*1))
        _gpus=$(seq -s, 0 $((_ng-1)))
        cuda_arg="CUDA_VISIBLE_DEVICES=$_gpus"
        echo "-------------------"
        echo "Using GPUS: "$_gpus
        echo "-------------------"
        if [ $pt = 1 ]; then
            outf="model1_GPU_PT_$_ng.log"
            outp="dlrm_s_pytorch.prof"
            echo "-------------------------------"
            echo "Running PT (log file: $outf)"
            echo "-------------------------------"
            cmd="$cuda_arg $dlrm_pt_bin --mini-batch-size=$_mb_size $_args --use-gpu $dlrm_extra_option > $outf"
            echo $cmd
            eval $cmd
            min=$(grep "iteration" $outf | awk 'BEGIN{best=999999} {if (best > $7) best=$7} END{print best}')
            echo "Min time per iteration = $min"
            # move profiling file(s)
            mv $outp ${outf//".log"/".prof"}
            mv ${outp//".prof"/".json"} ${outf//".log"/".json"}
        fi
        if [ $c2 = 1 ]; then
            outf="model1_GPU_C2_$_ng.log"
            outp="dlrm_s_caffe2.prof"
            echo "-------------------------------"
            echo "Running C2 (log file: $outf)"
            echo "-------------------------------"
            cmd="$cuda_arg $dlrm_c2_bin --mini-batch-size=$_mb_size $_args $c2_args --use-gpu $dlrm_extra_option 1> $outf 2> $outp"
            echo $cmd
            eval $cmd
            min=$(grep "iteration" $outf | awk 'BEGIN{best=999999} {if (best > $7) best=$7} END{print best}')
            echo "Min time per iteration = $min"
            # move profiling file (collected from stderr above)
            mv $outp ${outf//".log"/".prof"}
        fi
    done
fi
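
The script above repeats the same `grep | awk` pipeline four times to report the minimum time per iteration. One way that step could be factored into a single function is sketched below. The name `min_iter_time` and the sample log lines are hypothetical: the real DLRM log format does not appear in this diff, so the sketch assumes only what the script itself assumes, namely that the timing value is the seventh whitespace-separated field of each line containing "iteration". Unlike the original pipeline, it does not rely on the `999999` sentinel and prints nothing when no line matches.

```shell
#!/bin/bash
# Hypothetical helper (not part of this commit): print the smallest value in
# field 7 of the lines that contain "iteration". Prints nothing if no line
# matches.
min_iter_time() {
    local logfile="$1"
    grep "iteration" "$logfile" |
        awk 'NR == 1 || $7 + 0 < best { best = $7 + 0 }
             END { if (NR > 0) print best }'
}

# Made-up sample log; the field layout is an assumption for illustration only.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
iteration 100 of 1000 finished in 14.25 ms
iteration 200 of 1000 finished in 12.50 ms
iteration 300 of 1000 finished in 13.75 ms
EOF

min_iter_time "$sample_log"   # prints 12.5
rm -f "$sample_log"
```

With such a helper, each benchmark branch could reduce to `min=$(min_iter_time "$outf")`, which would also keep the field index in one place if the log format changes.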