Speed Benchmarking the SDAR-algorithm
Source: vignettes/articles/speed_improvment.Rmd
The SDAR-algorithm as standardized in ASTM E3076-18 uses numerous linear regressions (.lm.fit() from the stats-package). As the number of linear regressions during the SDAR-algorithm scales with the square of the number of data points in the normalized data range, it can become painfully slow for test data with high resolution. An estimate of the expected processing time is given in NIST Technical Note 2050 by E. Lucon, which uses an Excel spreadsheet for the calculations.
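To get an intuition for this quadratic scaling, the following rough sketch counts the regressions under the simplifying assumption that one fit is performed for every contiguous sub-range covering at least 20 % of the normalized data range (the exact counting rule in the standard may differ slightly):

# rough sketch: number of linear fits vs. points in the normalized range,
# assuming one fit per contiguous sub-range spanning at least 20 % of it
# (the exact count per ASTM E3076-18 may differ slightly)
n_fits <- function(n, min_frac = 0.2) {
  min_len <- ceiling(min_frac * n)
  sum(sapply(min_len:n, function(len) n - len + 1))
}

sapply(c(53, 250, 500, 1021), n_fits)
# the count grows roughly with n^2, so doubling the resolution
# roughly quadruples the number of .lm.fit() calls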
Data Set for Speed Estimation
The test data set for speed estimation was generated with varying resolution of x (strain), ranging from 11.3 to 15.6 effective bits. A total of 24 synthetic test records resembling tensile mechanical testing of aluminium (EN AW-6060-T66), with a toe region added and some minor noise in the synthetic stress data, was created using sdarr::synthesize_test_data().
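A single test record can be generated along the following lines. This is only a minimal sketch using the defaults; the actual arguments for controlling resolution, noise and the toe region are documented in the function's help page:

# generate a single synthetic test record
# (called with defaults here; see ?sdarr::synthesize_test_data for the
#  actual arguments controlling resolution, noise and toe region)
library(sdarr)
test_record <- synthesize_test_data()
str(test_record)  # expected to contain strain (x) and stress (y) data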
Test data was evaluated with the sequential plan using one core, and the multisession plan using four and eight cores. The execution time of sdarr::sdar(), sdarr::sdar_lazy(), and sdarr::sdar_lazy() with enforced sub-sampling was measured using pracma::tic() and pracma::toc().
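The timing setup looks roughly as follows. This is a sketch only: it assumes a data frame test_record with strain and stress columns (e.g. the one generated above) and that sdar() takes the data frame followed by the x- and y-column names; see ?sdarr::sdar for the actual interface:

# sketch of the timing setup (call pattern of sdar() is an assumption,
# see ?sdarr::sdar for the actual interface)
library(sdarr)
library(future)
library(pracma)

# evaluate on four cores via the multisession plan
plan(multisession, workers = 4)

tic()
result <- sdar(test_record, strain, stress)
toc()

# revert to single-core processing
plan(sequential)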
The evaluation was run on a 2021 MacBook Pro (M1 Max processor, 64 GB RAM, macOS 13.4.1).
Side Note on Benchmarking
As benchmarking takes a significant amount of time, the results were pre-calculated (using the rmd-file speed_improvement_data.Rmd, which is available within the installed sdarr-package). Use rmarkdown::render() to have the benchmarking conducted on your computer and knitted to an html-document in the current working directory, i.e. run the following block:
# knit the benchmarking-rmd to html-file
# (and save the result data in the current working directory)
# caution: might take some time...
speed_improvement_data <- rmarkdown::render(
  input = paste0(
    system.file(package = "sdarr"),
    "/rmd/speed_improvement_data.Rmd"
  ),
  params = list(
    # set number of cores for benchmarking
    use_cores = c(1, 4, 8),
    # synthetic data - set min and max effective number of bits
    enob = c(11.3, 15.6),
    # synthetic data - set number of synthetic test records
    length.out = 24,
    sdarr_benchmark_result_filepath = file.path(
      getwd(), "sdarr_benchmark_result.rda"
    )
  ),
  knit_root_dir = getwd(),
  output_dir = getwd()
)
# knit the evaluation-rmd to html-file
speed_improvement_evaluation <- rmarkdown::render(
  input = paste0(
    system.file(package = "sdarr"),
    "/rmd/speed_improvement_evaluation.Rmd"
  ),
  params = list(
    sdarr_benchmark_result_filepath = file.path(
      getwd(), "sdarr_benchmark_result.rda"
    )
  ),
  knit_root_dir = getwd(),
  output_dir = getwd()
)
# view the knitted file
utils::browseURL(speed_improvement_evaluation)
Benchmarking Results Data Set
The results of the benchmarking contain speed estimations for normalized data ranges from 53 to 1021 points. A total of 24 synthetic test records was analyzed using 1 to 8 cores. Data was pre-calculated (using the rmd-file speed_improvement_data.Rmd, which is available within the installed sdarr-package) and copy-pasted here.
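If you have run the benchmarking yourself, the pre-calculated results can be loaded for inspection. This assumes the .rda file was written to the current working directory, as in the render() call above:

# load the pre-calculated benchmarking results
# (assuming the file was saved to the working directory by
#  speed_improvement_data.Rmd, as in the render() call above)
load(file.path(getwd(), "sdarr_benchmark_result.rda"))
ls()  # list the objects provided by the file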
Results
The execution time of the different algorithms over the number of points in the normalized data range is shown in the plot below.
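A plot of this kind could be recreated along the following lines. The data frame and its columns are assumptions for illustration only, not the actual structure of the benchmark result file:

# hypothetical sketch of the results plot, assuming a data frame
# benchmark_results with columns n_points, time_s and method
# (these names are assumptions, not the actual result structure)
library(ggplot2)
ggplot(benchmark_results, aes(x = n_points, y = time_s, colour = method)) +
  geom_point() +
  geom_line() +
  labs(
    x = "points in the normalized data range",
    y = "execution time (s)",
    colour = "algorithm / plan"
  )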
Summary and Conclusion
As expected, the standard SDAR-algorithm seems to scale quadratically with the number of points in the normalized data range. It also benefits the most from using several cores (except for some minor overhead of ~100 ms when using 8 cores at very few points in the normalized data range).
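One simple way to check such a scaling claim is a log-log regression of execution time against the number of points; this is a sketch only, and the data frame with its columns n_points and time_s is an assumption:

# quick check of the quadratic-scaling claim
# (standard_results and its columns n_points / time_s are assumptions)
fit <- lm(log(time_s) ~ log(n_points), data = standard_results)
coef(fit)  # a slope close to 2 indicates quadratic scaling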
The random sub-sampling modification, which is available via sdar_lazy(), drastically reduces processing time compared to the standard SDAR-algorithm for higher-resolution test data. For lower-resolution test data, the algorithm falls back to the standard SDAR-algorithm, but some additional overhead still has to be considered.
Comparing the results, the speed improvement of the random sub-sampling modification becomes apparent at 250 - 400 points in the normalized data range (depending on the number of cores used). For very high-resolution data (> 600 points in the normalized range), the execution time of sdar_lazy() seems to stabilize at a (more or less) constant value of 1.8 seconds (when using four cores) to 3 seconds (when using only one or eight cores). The actual time will vary depending on the machine used for computation.
Using several cores for the SDAR-algorithm by setting a multisession plan will drastically improve performance. Yet, the additional overhead might slow down processing when using the maximum number of available cores. Tweaking the plan (see the vignette A Future for R: Future Topologies for further information) might increase overall performance when processing batches of data.
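For batch processing, a nested topology as described in the Future Topologies vignette could look like the sketch below: an outer layer distributes test records across workers, while an inner layer parallelizes each individual analysis. The worker counts are examples only and should be adapted to your machine:

# sketch of a tweaked, nested plan for batch processing
# (worker counts are examples only)
library(future)
plan(list(
  tweak(multisession, workers = 2),  # outer layer: two test records in parallel
  tweak(multisession, workers = 4)   # inner layer: four cores per record
))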