Speed Benchmarking the SDAR-algorithm
Source: vignettes/articles/speed_improvment.Rmd
The SDAR-algorithm as standardized in ASTM E3076-18 uses numerous linear regressions (.lm.fit() from the stats-package). As the number of linear regressions during the SDAR-algorithm scales with the square of the number of data points in the normalized data range, it can become painfully slow for test data with high resolution. An estimate of the expected processing time is given in NIST Technical Note 2050 by E. Lucon, which uses an Excel spreadsheet for the calculations.
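To get an intuition for this quadratic scaling, the following rough sketch counts the regressions under the simplifying assumption that one fit is performed for every contiguous sub-range covering at least 20 % of the normalized data range (the exact counting rule in the standard may differ slightly):

# rough sketch: number of linear fits vs. points in the normalized range,
# assuming one fit per contiguous sub-range spanning at least 20 % of it
# (the exact count per ASTM E3076-18 may differ slightly)
n_fits <- function(n, min_frac = 0.2) {
  min_len <- ceiling(min_frac * n)
  sum(sapply(min_len:n, function(len) n - len + 1))
}

sapply(c(53, 250, 500, 1021), n_fits)
# the count grows roughly with n^2, so doubling the resolution
# roughly quadruples the number of .lm.fit() calls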
Data Set for Speed Estimation
The test data set for speed estimation was generated with varying resolution of x (strain), ranging from 11.3 to 15.6 effective bits. A total of 24 synthetic test records resembling tensile mechanical testing of aluminium (EN AW-6060-T66), with a toe region added and some minor noise in the synthetic stress data, was created using sdarr::synthesize_test_data().
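A single test record can be generated along the following lines. This is only a minimal sketch using the defaults; the actual arguments for controlling resolution, noise and the toe region are documented in the function's help page:

# generate a single synthetic test record
# (called with defaults here; see ?sdarr::synthesize_test_data for the
#  actual arguments controlling resolution, noise and toe region)
library(sdarr)
test_record <- synthesize_test_data()
str(test_record)  # expected to contain strain (x) and stress (y) data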
Test data was evaluated with the sequential plan using one core, and the multisession plan using four and eight cores. The execution time of sdarr::sdar(), sdarr::sdar_lazy(), and sdarr::sdar_lazy() with enforced sub-sampling was measured using pracma::tic() and pracma::toc().
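The timing setup looks roughly as follows. This is a sketch only: it assumes a data frame test_record with strain and stress columns (e.g. the one generated above) and that sdar() takes the data frame followed by the x- and y-column names; see ?sdarr::sdar for the actual interface:

# sketch of the timing setup (call pattern of sdar() is an assumption,
# see ?sdarr::sdar for the actual interface)
library(sdarr)
library(future)
library(pracma)

# evaluate on four cores via the multisession plan
plan(multisession, workers = 4)

tic()
result <- sdar(test_record, strain, stress)
toc()

# revert to single-core processing
plan(sequential)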
The evaluation was run on a 2021 MacBook Pro (M1 Max processor, 64 GB RAM, macOS 13.4.1).
Side Note on Benchmarking
As benchmarking takes a significant amount of time, the results were pre-calculated (using the rmd-file speed_improvement_data.Rmd, which is available within the installed sdarr-package). Use rmarkdown::render() to have the benchmarking conducted on your computer and knitted to an html-document in the current working directory, i.e. run the following block:
# knit the benchmarking-rmd to html-file
# (and save the result data in the current working directory)
# caution: might take some time...
speed_improvement_data <- rmarkdown::render(
  input = paste0(
    system.file(package = "sdarr"),
    "/rmd/speed_improvement_data.Rmd"
  ),
  params = list(
    # set number of cores for benchmarking
    use_cores = c(1, 4, 8),
    # synthetic data - set min and max effective number of bits
    enob = c(11.3, 15.6),
    # synthetic data - set number of synthetic test records
    length.out = 24,
    sdarr_benchmark_result_filepath = file.path(
      getwd(), "sdarr_benchmark_result.rda"
    )
  ),
  knit_root_dir = getwd(),
  output_dir = getwd()
)
# knit the evaluation-rmd to html-file
speed_improvement_evaluation <- rmarkdown::render(
  input = paste0(
    system.file(package = "sdarr"),
    "/rmd/speed_improvement_evaluation.Rmd"
  ),
  params = list(
    sdarr_benchmark_result_filepath = file.path(
      getwd(), "sdarr_benchmark_result.rda"
    )
  ),
  knit_root_dir = getwd(),
  output_dir = getwd()
)
# view the knitted file
utils::browseURL(speed_improvement_evaluation)
Benchmarking Results Data Set
The results of the benchmarking contain speed estimations for normalized data ranges from 53 to 1021 points. A total of 24 synthetic test records was analyzed using 1 to 8 cores. Data was pre-calculated (using the rmd-file speed_improvement_data.Rmd, which is available within the installed sdarr-package) and copy-pasted here.
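If you have run the benchmarking yourself, the pre-calculated results can be loaded for inspection. This assumes the .rda file was written to the current working directory, as in the render() call above:

# load the pre-calculated benchmarking results
# (assuming the file was saved to the working directory by
#  speed_improvement_data.Rmd, as in the render() call above)
load(file.path(getwd(), "sdarr_benchmark_result.rda"))
ls()  # list the objects provided by the file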
Results
The execution time of the different algorithms over the number of points in the normalized data range is shown in the plot below.
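A plot of this kind could be recreated along the following lines. The data frame and its columns are assumptions for illustration only, not the actual structure of the benchmark result file:

# hypothetical sketch of the results plot, assuming a data frame
# benchmark_results with columns n_points, time_s and method
# (these names are assumptions, not the actual result structure)
library(ggplot2)
ggplot(benchmark_results, aes(x = n_points, y = time_s, colour = method)) +
  geom_point() +
  geom_line() +
  labs(
    x = "points in the normalized data range",
    y = "execution time (s)",
    colour = "algorithm / plan"
  )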
Summary and Conclusion
As expected, the standard SDAR-algorithm seems to scale quadratically with the number of points in the normalized data range. It also benefits the most from using several cores (except for some minor overhead of ~100 ms when using 8 cores at very few points in the normalized data range).
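One simple way to check such a scaling claim is a log-log regression of execution time against the number of points; this is a sketch only, and the data frame with its columns n_points and time_s is an assumption:

# quick check of the quadratic-scaling claim
# (standard_results and its columns n_points / time_s are assumptions)
fit <- lm(log(time_s) ~ log(n_points), data = standard_results)
coef(fit)  # a slope close to 2 indicates quadratic scaling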
The random sub-sampling modification, which is available via sdar_lazy(), drastically reduces processing time compared to the standard SDAR-algorithm for higher-resolution test data. For lower-resolution test data, the algorithm falls back to the standard SDAR-algorithm, but some additional overhead still has to be considered.
Comparing the results, the speed improvement of the random sub-sampling modification becomes apparent at 250 - 400 points in the normalized data range (depending on the number of cores used). For very high-resolution data (> 600 points in the normalized range), the execution time of sdar_lazy() seems to stabilize at a (more or less) constant value of 1.8 seconds (when using four cores) to 3 seconds (when using only one or eight cores). The actual time will vary depending on the machine used for computation.
Using several cores for the SDAR-algorithm by setting a multisession plan will drastically improve performance. Yet, the additional overhead might slow down processing when using the maximum number of available cores. Tweaking the plan (see the vignette A Future for R: Future Topologies for further information) might increase overall performance when processing batches of data.
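For batch processing, a nested topology as described in the Future Topologies vignette could look like the sketch below: an outer layer distributes test records across workers, while an inner layer parallelizes each individual analysis. The worker counts are examples only and should be adapted to your machine:

# sketch of a tweaked, nested plan for batch processing
# (worker counts are examples only)
library(future)
plan(list(
  tweak(multisession, workers = 2),  # outer layer: two test records in parallel
  tweak(multisession, workers = 4)   # inner layer: four cores per record
))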