Run benchmark

Prerequisites

For best performance, refer to "Recommendations to achieve best performance".
Build intelcaffe with the script 'scripts/build_intelcaffe.sh'. For a single-node performance benchmark, use 'scripts/build_intelcaffe.sh --compiler icc/gcc'; for a multinode performance benchmark, use 'scripts/build_intelcaffe.sh --multinode --compiler icc/gcc --layer_timing'.

Steps to run benchmark

Prepare your own benchmark config file; a template is available at 'scripts/benchmark_config_template.json'.
- The fields of the benchmark config file are described below:
  - "topology" : "", the topology to benchmark; currently supported values are alexnet/googlenet/googlenet_v2/resnet_50/all;
  - "hostfile" : "", the path to your hostfile; required when running a multinode benchmark;
  - "network" : "", the network type; currently supported values are tcp/opa;
  - "netmask" : "", the name of your Ethernet network interface, such as eth0; required when running on a tcp network;
  - "dummy_data_use" : true, set to true to use dummy data, or to false to use actual datasets, in which case you must specify the dataset path in the model protocol file; default is true (dummy data) for benchmarking;
  - "scal_test" : false, set to false for a throughput test, or to true for a scaling test; the scaling test is only available in multinode mode and when the number of hosts is a power of 2; default is false;
  - "caffe_bin" : "", the path to the built caffe binary;
  - "engine" : "", choose CAFFE, MKL2017 or MKLDNN; MKLDNN gives the best performance on CPU;
  - "batch_size_table" : {}, the table of batch sizes to use for the various topology and platform combinations; by default, the best known batch sizes from our internal tests are used.
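The field descriptions above can be sketched as a small Python helper that builds a config and checks the constraints the list mentions. This is only an illustration: the field names come from the list, but every value is a hypothetical placeholder, the helper names (make_benchmark_config, check_config) are not part of intelcaffe, and the checks are one reading of the descriptions rather than what run_benchmark.py actually enforces. The "batch_size_table" is left empty here; starting from the shipped template is likely the safer way to keep its default table.

```python
import json

# Values taken from the field descriptions above.
SUPPORTED_TOPOLOGIES = {"alexnet", "googlenet", "googlenet_v2", "resnet_50", "all"}
SUPPORTED_NETWORKS = {"tcp", "opa"}
SUPPORTED_ENGINES = {"CAFFE", "MKL2017", "MKLDNN"}


def make_benchmark_config(topology="alexnet", hostfile="", network="tcp",
                          netmask="eth0", caffe_bin="", engine="MKLDNN",
                          scal_test=False):
    """Build a config dict; all defaults here are hypothetical placeholders."""
    return {
        "topology": topology,
        "hostfile": hostfile,
        "network": network,
        "netmask": netmask,
        "dummy_data_use": True,
        "scal_test": scal_test,
        "caffe_bin": caffe_bin,
        "engine": engine,
        "batch_size_table": {},  # placeholder; the template carries the defaults
    }


def is_power_of_two(n):
    return n > 0 and (n & (n - 1)) == 0


def check_config(config, host_count):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if config["topology"] not in SUPPORTED_TOPOLOGIES:
        problems.append("unsupported topology: %s" % config["topology"])
    if config["network"] not in SUPPORTED_NETWORKS:
        problems.append("unsupported network: %s" % config["network"])
    if config["engine"] not in SUPPORTED_ENGINES:
        problems.append("unsupported engine: %s" % config["engine"])
    if host_count > 1 and not config["hostfile"]:
        problems.append("hostfile is required for a multinode benchmark")
    if config["network"] == "tcp" and not config["netmask"]:
        problems.append("netmask is required on a tcp network")
    if config["scal_test"] and not (host_count > 1 and is_power_of_two(host_count)):
        problems.append("scal_test needs multinode mode with a power-of-2 host count")
    return problems


if __name__ == "__main__":
    config = make_benchmark_config()
    print(check_config(config, host_count=1))
    print(json.dumps(config, indent=4))
```

The printed JSON can be saved to a file such as your_config_file.json and adjusted by hand before it is passed to the benchmark script.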
Once your config file is ready, run 'scripts/run_benchmark.py --configfile your_config_file.json' to start benchmarking.
When the run finishes, check the final results or failure logs in 'result-benchmark-YearMonthDayHourMinuteSecond.log'; more detailed logs are in 'result-platform-topology-YearMonthDayHourMinuteSecond.log'.
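After several runs, a number of result logs accumulate. A minimal sketch for picking the most recent summary log is shown here. It assumes the timestamp in the file name is a fixed-width digit string, so that sorting by name is the same as sorting by time; the helper name latest_benchmark_log and the directory argument are hypothetical, since the location of the logs is not stated here.

```python
import glob
import os


def latest_benchmark_log(directory="."):
    """Return the newest 'result-benchmark-*.log' path in directory, or None.

    Assumption: the timestamp part of the name is fixed-width digits,
    so the lexicographically largest name is also the most recent one.
    """
    pattern = os.path.join(directory, "result-benchmark-*.log")
    candidates = sorted(glob.glob(pattern))
    return candidates[-1] if candidates else None


if __name__ == "__main__":
    print(latest_benchmark_log("."))
```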

Deprecated method using the old bash script (run_benchmark.sh will be removed in the next release; please use run_benchmark.py for benchmarking instead)

Run benchmark on a single node (using alexnet as an example).

    scripts/run_benchmark.sh --topology alexnet

Images/sec performance is reported at the end. Below is an example of alexnet performance on KNL:

    batch size : 256
    average time : 321 ms
    benchmark speed : 797 images/s

Run benchmark on multiple nodes (using alexnet as an example).

    scripts/run_benchmark.sh --topology alexnet --hostfile /your/path/hostfile --network tcp/opa --netmask your-NIC-name

Images/sec performance is reported at the end. Below is an example of alexnet performance on 2 BDWs:

    batch size : 512
    average time : 349 ms
    benchmark speed : 1467 images/s
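The reported speed in both examples is consistent with dividing the batch size by the average time per batch. The sketch below reproduces that arithmetic. The function name images_per_second is hypothetical, and truncating to a whole number is an assumption that happens to match the figures above; the script's own formula is not shown here.

```python
def images_per_second(batch_size, average_time_ms):
    """Throughput in images/s, given the average time per batch in milliseconds."""
    return batch_size * 1000.0 / average_time_ms


if __name__ == "__main__":
    # Single-node KNL example: 256 images in 321 ms.
    print(int(images_per_second(256, 321)))  # 797
    # Two-node BDW example: 512 images in 349 ms.
    print(int(images_per_second(512, 349)))  # 1467
```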

The scripts have been verified on CentOS 7.4.
