A practical approach to testing random number generators in computer algebra systems

Abstract

This paper has a practical aim. For a long time, implementations of pseudorandom number generators in standard libraries of programming languages had poor quality. The situation started to improve only recently. Up to now, a large number of libraries and weakly supported mathematical packages use outdated algorithms for random number generation. Four modern sets of statistical tests that can be used for verifying random number generators are described. It is proposed to use command line utilities, which makes it possible to avoid low-level programming in such languages as C or C++. Only free open source systems are considered.


arXiv [cs.MS]

Migran N. Gevorkyan,∗ Dmitry S. Kulyabov,1,2,† Anastasia V. Demidova,‡ and Anna V. Korolkova§

1 Department of Applied Probability and Informatics, Peoples' Friendship University of Russia (RUDN University), 6 Miklukho-Maklaya St, Moscow, 117198, Russian Federation
2 Laboratory of Information Technologies, Joint Institute for Nuclear Research, 6 Joliot-Curie, Dubna, Moscow region, 141980, Russia


Keywords: random number generation, TestU01, PractRand, DieHarder, gjrand

I. INTRODUCTION

While modeling technical systems with control, it is often required to study the characteristics of these systems. It is also necessary to study the influence of system parameters on these characteristics. In systems with control, a parasitic phenomenon known as the self-oscillating mode arises. We carried out studies to determine the region in which self-oscillations emerge. However, the parameters of these oscillations were not investigated. In this paper, we propose to use the harmonic linearization method for this task. This method is used in control theory, but this branch of mathematics is rarely used in classical mathematical modeling. The authors offer a methodological article in order to introduce this method to non-specialists.

II. INTRODUCTION

Random numbers have wide applications in computer science, e.g., for statistical testing, in cryptography, and in simulation. However, the generation of truly random numbers is a labor-intensive task because generators of truly random numbers are very complicated and costly. Moreover, the generation of truly random numbers can be time consuming, and a program can be forced to wait for the next number for a random amount of time. Most often, in computer science we deal with pseudorandom rather than with truly random numbers. In this paper, we consider software implementations of only pseudorandom number generators, which will be called random number generators below.

The random number generators must satisfy the following criteria:
• to prevent cycling of the random number sequence, the period must be sufficiently long;
• the algorithm must be efficient in terms of time and the amount of computation resources;
• the algorithm must be able to reproduce the same sequence of random numbers any number of times;
• the algorithm must be portable between various hardware architectures and operating environments.

Testing a random number generator actually reduces to verifying that the random numbers produced by it are independent and identically and uniformly distributed on the unit interval. Studies on this topic were performed by Soviet and Russian researchers.

There are various algorithms for generating random numbers (some of them are briefly described in Section III), and they can have different implementations. Many theoretical studies discuss and compare algorithms; however, we do not touch on this issue and concentrate on the practical estimation of the quality of random number sequences generated by software implementations. The estimation is performed using software tools.

In this paper, we consider the following issues. First, open source computer algebra systems often use random number generators that did not pass all available tests. Second, it is preferable to use the best (at least with respect to a certain criterion) random number generator. To resolve these issues, we describe the structure of a test bed for investigating software implementations of random number generators. This test bed makes it possible to assess the current implementation of the generator used in a computer algebra system. In addition, one can find another software implementation and build it into an open source computer algebra system.

As an illustration, we demonstrate the proposed procedure on several computer algebra systems.
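The uniformity requirement stated above can be illustrated with a very small check. The following is only a sketch, not one of the test batteries discussed in this paper: the function name chi_squared_uniform, the choice of 10 bins, and the seed are assumptions made for the illustration. It bins numbers from the unit interval and computes the Pearson chi-squared statistic against equal expected counts.

```python
import random


def chi_squared_uniform(samples, bins=10):
    """Pearson chi-squared statistic for samples assumed to lie in [0, 1)."""
    counts = [0] * bins
    for x in samples:
        # min() guards against a value landing exactly on the upper edge
        counts[min(int(x * bins), bins - 1)] += 1
    expected = len(samples) / bins
    return sum((c - expected) ** 2 / expected for c in counts)


if __name__ == "__main__":
    rng = random.Random(12345)  # fixed seed, so the run is reproducible
    data = [rng.random() for _ in range(100_000)]
    stat = chi_squared_uniform(data)
    # 10 bins give 9 degrees of freedom; 21.67 is the 1% critical value
    verdict = "suspicious" if stat > 21.67 else "no deviation found"
    print(f"chi-squared = {stat:.2f} ({verdict})")
```

A single statistic of this kind says nothing about independence between successive numbers, which is why the packages described later combine many different tests.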

III. GENERATORS IN MODERN COMPUTER ALGEBRA SYSTEMS

The invention of the first random number generator is attributed to von Neumann in 1946. Later, in 1949, Lehmer proposed another algorithm, which was then generalized and received the name of linear congruential generator (LCG) [1]. This generator and its modifications became the main algorithm implemented in the libraries of Fortran, Ada, and C.

In 1995, George Marsaglia proposed a set of statistical tests for verifying sequences of random numbers claimed to be uniformly distributed. The application of this test set showed that the vast majority of random number generators produce low-quality sequences that do not pass the majority of tests. This set of tests became widely known and stimulated researchers to seek better random number generator algorithms and their implementations.

At the current time, modern versions of standard libraries in supported programming languages and computer algebra systems, such as Maple [2], Mathematica [3], and SymPy [4], use implementations of the Mersenne Twister (MT) algorithm. This algorithm came into wide use as a high-quality replacement for LCG because it was the first algorithm whose implementations passed all tests available at that time. The algorithm was developed in 1997 [5] and received its name due to the use of the Mersenne prime number $2^{19937} - 1$. Depending on the implementation, it provides a period of up to $2^{19937} - 1$. The main drawback of this algorithm is its bulkiness and, as a consequence, slow program code. Note that currently much simpler and more efficient algorithms are available (see [6–8]). In other respects, this generator produces a high-quality random sequence and is applicable in the majority of applications.

We now proceed to the main goal of this paper. If a study relies on the generation of random numbers, then how can one verify the quality of a sequence of such numbers? This can be needed if a nonstandard computer algebra system or an outdated version is used. Even if a modern system is used, the issue of choosing the seed value remains.

An obvious answer to this question is the use of a software package that implements a set of statistical tests. However, all packages known to the authors of this paper are implemented in C or C++, and low-level programming is required to use their functions. Since computer algebra systems use high-level problem-oriented programming languages, the use of C and C++ functions can be difficult or even impossible.

In our opinion, this difficulty can be overcome by using command line utilities. Such utilities remove the need to embed C or C++ code in the program; instead, one writes a script that passes the sequence of numbers to be analyzed to the input of the testing utility.

IV. STATISTICAL TESTS

Testing random number generators is a classical problem of testing statistical hypotheses. However, in mathematical statistics the aim is typically to reject the null hypothesis. By contrast, in testing random number generators, the aim is to confirm it. In other words, tests are designed for validating that the generated random sequences indeed consist of independently distributed random numbers that are not deterministically related. The generator successfully passes the test if no statistically significant deviations from the null hypothesis were found.

Since each random number generator is a deterministic algorithm, there always exists a statistical test that this generator cannot pass. The generator's quality is determined by the number of tests that it can pass. For this reason, software implementations combine tests into test sets, which are jointly applied.

We now list some tests included in the statistical test set DieHard.
• The craps test. 200000 sequences of uniformly distributed random numbers are generated, and each of them is used to simulate the game of craps. Then, it is checked how well the empirical values of expectation and variance match their theoretical values. The χ-squared test is used for statistical testing.
• Birthday spacing. A random sequence on a large interval is generated. The spacings between the points should be asymptotically exponentially distributed.
• Overlapping permutations. A large number of samples consisting of five consecutive random numbers are generated. The 120 possible orderings should occur with statistically equal probability.

DieHard includes 12 tests. Modern sets of statistical tests include dozens of tests.

V. SOFTWARE PACKAGES OF SETS OF STATISTICAL TESTS

We have already mentioned that the first package of a set of statistical tests for random number generators was DieHard [9] developed by G. Marsaglia in 1995. It was distributed on a CD, and is currently available on the Internet. Currently, DieHard is out of use, but its tests are now included in other packages.

Among the software packages containing statistical tests, we distinguish the following four.
• TestU01 [10, 11] developed by Pierre L'Ecuyer and Richard Simard. This package is written in ANSI C. Currently, this is the most widespread set of tests. It can test the generators that produce numbers in the interval [0, 1). The latest version is 1.2.3 as of August 18, 2009.
• PractRand [12] developed by Chris Doty-Humphrey. This package is written in C++11 with elements of C99. It takes a stream of bytes at its input and is able to test 32- and 64-bit generators, and it can cope with large amounts of data. The latest version is 0.94 as of August 4, 2018.
• gjrand [13]. There is no author's name on the official site of this package. It is written in C99. It takes a stream of bytes at its input and is distributed with a set of various generators that are able to produce not only uniformly distributed sequences but also normal, Poisson, and some other distributions. The latest version is 4.2.1 as of November 28, 2014.
• DieHarder [14] developed by R. Brown. It is declared to be a successor of DieHard. It is written in C and requires the library GSL [15] for its operation; it is able to test any generator that has an interface like the interfaces of the generators included in GSL. The latest version is 3.31.1 as of June 19, 2017.

Table I. Summary characteristics of testing packages

Package    Language    CMD  Unix  Windows  Version  Date        Site
TestU01    ANSI C      -    ++    ±        1.2.3    18.08.2009  [10]
PractRand  C99, C++11  +    +     +        0.94     04.08.2018  [12]
gjrand     C99         +    +     ±        4.2.1    28.11.2014  [13]
DieHarder  C99         +    ++    ±        3.31.1   19.06.2017  [14]

Table I lists the basic characteristics of the packages discussed in this paper. The column Unix indicates if the package can be installed under *nix systems. The column Windows has the plus sign if the program can be built without installing CygWin or MinGW. By expending sufficient effort, each of these libraries can be compiled under Windows as well.

All the test packages listed in Table I have open source code. TestU01 and DieHarder are available for installation in the official repositories of distributions, in particular Ubuntu 18.10. The column CMD indicates the presence or absence of the command line utility. The double plus in the column Unix marks the program packages included in the repositories. The two other packages are installed by compiling the source code. The symbol ± in the column Windows marks the packages whose compilation requires the emulators CygWin or MinGW. Each package can be used by linking the library to a C or C++ program and through the command line utility (except for TestU01). DieHarder has the richest command line utility.

A. Installation of packages under Unix-type OS

Here we describe the installation of the packages into the user's home directory without administrator rights. The installation was performed under GNU/Linux Ubuntu 18.10. The compiler set gcc version 8 was used. According to the package documentation, any compiler supporting the C standards up to and including C99 and the C++ standards up to and including C++11 will do.

In the user's home directory, create the following hierarchy of directories:

    mkdir -p ~/usr/bin ~/usr/lib ~/usr/share ~/usr/include

• The directory ~/usr/bin for executable files.
• The directory ~/usr/lib for shared and static libraries (.so and .a).
• The directory ~/usr/include for header files.
• The directory ~/usr/share for examples and documentation.

To the file ~/.bashrc, we add the following environment variables:

    export PATH="$HOME/usr/bin/:$PATH"
    export LD_LIBRARY_PATH="$HOME/usr/lib:$LD_LIBRARY_PATH"
    export LIBRARY_PATH="$HOME/usr/lib:$LIBRARY_PATH"
    export C_INCLUDE_PATH="$HOME/usr/include:$C_INCLUDE_PATH"

This makes it possible for the command interpreter to seek the executable files also in the directory ~/usr/bin and allows the compiler to automatically include libraries and header files.

1. Installation of TestU01

To install TestU01, download the zip archive from the official site [10], unzip it, and go to the root directory:

    wget http://simul.iro.umontreal.ca/testu01/TestU01.zip
    unzip TestU01.zip
    cd TestU01-1.2.3/

TestU01 is distributed with Autoconf and Automake scripts; therefore, the compilation and installation processes are reduced to the execution of the following three commands:

    ./configure --prefix=$HOME/usr
    make
    make install

The option --prefix=$HOME/usr makes it possible to install the program locally into ~/usr. After the installation, the directory ~/usr/include will contain a lot of header files. To better organize them, we create the directory ~/usr/include/testu01 and move into it all .h files of TestU01. The same can be done in the directory ~/usr/lib containing library files.

2. Installation of gjrand

Download the source code archive from the official site [13]:

    wget https://datapacket.dl.sourceforge.net/project/gjrand/gjrand/gjrand.4.2.1/gjrand.4.2.1.tar.bz2
    tar -xvjf gjrand.4.2.1.tar.bz2
    cd gjrand.4.2.1

gjrand is compiled using the bash scripts named compile. The library source files are in the directory src:

    cd src && ./compile

At the output, we obtain the compiled files of the dynamic library gjrand.so and the static library gjrand.a, which we manually copy, together with the header file, into ~/usr/lib/gjrand and ~/usr/include/gjrand (both directories must already exist):

    cp -t ~/usr/lib/gjrand gjrand.a gjrand.so
    cp -t ~/usr/include/gjrand gjrand.h

Return to the root directory using the command cd .. and go through all other subdirectories in the same way. Each of them contains the script compile, which must be executed:

    cd testother/src && ./compile

At the output, we obtain the executable files in testother/bin. Again, return to the root directory, go to the next directory, and execute

    cd testmisc/ && ./compile

to obtain the executable file kat, which can be used to check the validity of the library operation. This utility performs a number of tests to check if the program works as designed.

Go to the next directory and execute

    cd testunif/src && ./compile

to obtain at the output the executable files for each statistical test. They are placed into the directory testunif/bin. Two files, mcp and pmcp, will appear in testunif; they are the command line utilities for testing the generated bit sequences. These utilities use executable files residing in the directory testunif/bin; therefore, they cannot be placed into another directory.

The directory testfunif

    cd testfunif/src && ./compile

is completely similar to the preceding directory, but the tests are intended for double numbers in the interval [0, 1).

The directory testother

    cd testother/src && ./compile

contains tests for nonuniform distributions. The executable files will also be placed into testother/bin.

As a result of installing gjrand, we obtained two library files in the directory ~/usr/lib/gjrand and a header file in the directory ~/usr/include/gjrand. The other utilities remain in the corresponding directories.

Note that the installation completed without errors. The package author used the compiler option -Wall, and only insignificant warnings about the use of if without parentheses were obtained.

3. Installation of DieHarder

To build the program, the library GSL (GNU Scientific Library) [15] is needed. In Ubuntu, it can be installed by executing the command

    apt install libgsl-dev

Download the source code archive from the official site [14] and unpack it:

    wget https://webhome.phy.duke.edu/~rgb/General/dieharder/dieharder-3.31.1.tgz
    tar -xzf dieharder-3.31.1.tgz
    cd dieharder-3.31.1

To install DieHarder, execute as in the other cases

    ./configure --prefix=$HOME/usr
    make
    make install

Additionally, when ./configure is executed, the option --disable-shared may be used, which results in the static compilation of the command line utility, and no dynamic library will be created. This is convenient if only the command line utility is needed.

In the process of compiling, there was an error: the compiler could not find the declaration of the type intptr_t. This can be repaired by including the header file stdint.h into the file ./include/dieharder/libdieharder.h. The rest of the installation completed without errors. While make install was executed, the library files were moved into the directory ~/usr/lib, the subdirectory dieharder was automatically created in ~/usr/include, and all header files were placed into it. The command line utility dieharder was also placed into ~/usr/bin.

4. Installation of PractRand

Download the source code archive from the official site [12] and unzip it:

    wget https://netcologne.dl.sourceforge.net/project/pracrand/PractRand_0.94.zip
    unzip PractRand_0.94.zip

The archive contains ready-to-use dynamic and static libraries for Windows. There is also a project file for Visual Studio. For the installation under Unix, go to the directory unix and start the building process using the command make:

    cd PractRand_0.94/unix
    make

This requires a C++ compiler that supports the C++11 standard.

During the building process, there were errors related to letter case in header file names. For example, the file Coup16.h was mentioned in test.cpp in lowercase letters even though the file name begins with an uppercase letter. The same error was found for the file names NearSeq.h and birthday.h and for the directory Tests. Probably, this is because the author developed the program under Windows, in which file names are case insensitive.

After eliminating the errors, we obtain four executable files and the static library libpracrand.a. We have described the installation of four program packages, and now we describe the utilities for testing sequences of random numbers provided by these packages.

B. Command line utilities

The packages PractRand, DieHarder, and gjrand provide command line utilities. These utilities make it possible to execute statistical tests on a sequence of random numbers read from the standard input stream or from a file. The most functionally rich utility is included in DieHarder.

1. PractRand utilities

After compiling and building PractRand, we obtain four executable files:
• RNG_output runs one of the generators included in the package.
• RNG_test is the utility designed for testing the generators distributed with the package or the data stream obtained from the standard input.
• RNG_benchmark measures the performance of embedded generators and the generators added by the user.
• Test_calibration is used for testing and configuring test sets.

Table II. Options of mcp of the package gjrand

Option        Description
--tiny        10 MB
--small       100 MB
--standard    1 GB (default)
--big         10 GB
--huge        100 GB
--tera        1 TB
--ten-tera    10 TB
number        Number of bytes
--no-rewind   Do not rewind

Testing is performed using RNG_test. For example, to run the test for the embedded generator jsf32, it is sufficient to execute the command

    RNG_test jsf32

For testing an external program, one should feed the generated binary stream of unsigned integers to the standard input of RNG_test, e.g.,

    ./random -G 4 -N 1000000000000000 | RNG_test stdin32

The option stdin32 makes RNG_test interpret the stream of binary data as a set of 32-bit numbers. If the option stdin64 is indicated, then the numbers are interpreted as 64-bit ones; with the option stdin, the program decides for itself how to interpret the input data. For testing data from a file, the following command can be used:

    cat file.data | RNG_test stdin64

The data in the file must be binary; text files are not supported.

The tests are executed fairly quickly, and the user gets a report at the output in which the failed and suspicious tests are listed. With the option -p1, the program prints all results (including the passed tests).
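A producer for the stdin32 mode can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than part of PractRand: the helper write_words, the fixed seed, the default word count, and the little-endian byte order (matching the byte order used in the Python fragment of Section VI) are choices made here, and Python's random module stands in for the generator under test.

```python
import os
import random
import struct
import sys


def write_words(out, n_words, rng, chunk=4096):
    """Write n_words unsigned 32-bit little-endian integers to a binary stream."""
    remaining = n_words
    while remaining > 0:
        k = min(chunk, remaining)
        words = [rng.getrandbits(32) for _ in range(k)]
        out.write(struct.pack(f"<{k}I", *words))
        remaining -= k


if __name__ == "__main__":
    # Word count from the command line; the default (256 KiB of data) is
    # only a smoke test and far too small for a meaningful verdict.
    has_count = len(sys.argv) > 1 and sys.argv[1].isdigit()
    n = int(sys.argv[1]) if has_count else 1 << 16
    rng = random.Random(2019)  # arbitrary fixed seed, for reproducibility
    try:
        write_words(sys.stdout.buffer, n, rng)
        sys.stdout.buffer.flush()
    except BrokenPipeError:
        # The consumer closed the pipe early; silence the flush at exit.
        os.dup2(os.open(os.devnull, os.O_WRONLY), sys.stdout.fileno())
```

Saved under a hypothetical name such as emit32.py, the script would be used as python emit32.py 1000000000 | RNG_test stdin32.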

2. gjrand utilities

After building, a lot of executable files are obtained. Two of them are designed for testing external generators. One of them resides in the directory testfunif and is called mcp (master computer program); the second one is in the directory testother and is called fmcp. The utility mcp is designed for testing sequences of unsigned random numbers, and fmcp tests numbers in the interval [0, 1). In other respects, there are no differences between them.

The program mcp takes only a binary stream at the standard input. This is done using a pipe:

    ./random -G 4 -N 1000000000000000 | mcp --big

The input stream is subjected to the whole set of tests. The results are printed as they are produced. The program has no other options, except for the amount of data to be tested (see Table II).
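When the generator lives inside a high-level environment, the pipe can also be driven from a script instead of the shell. The sketch below is an assumption-laden illustration: the helper pipe_to_tester is hypothetical, Python's random module stands in for the generator under test, and the tester's report is collected in a temporary file so that a utility printing results while it reads cannot stall on a full pipe.

```python
import random
import subprocess
import sys
import tempfile


def pipe_to_tester(command, n_bytes, rng, chunk=1 << 16):
    """Stream n_bytes of rng output into `command`; return what it printed."""
    with tempfile.TemporaryFile() as report:
        proc = subprocess.Popen(command, stdin=subprocess.PIPE, stdout=report)
        sent = 0
        try:
            while sent < n_bytes:
                k = min(chunk, n_bytes - sent)
                proc.stdin.write(rng.getrandbits(8 * k).to_bytes(k, "little"))
                sent += k
        except BrokenPipeError:
            pass  # the tester stopped reading before all data was sent
        finally:
            try:
                proc.stdin.close()  # signals end of input to the tester
            except BrokenPipeError:
                pass
        proc.wait()
        report.seek(0)
        return report.read().decode(errors="replace")


if __name__ == "__main__":
    # Stand-in consumer that only counts the bytes it receives.
    counter = [sys.executable, "-c",
               "import sys; print(len(sys.stdin.buffer.read()))"]
    print(pipe_to_tester(counter, 1_000_000, random.Random(7)).strip())
```

Replacing the stand-in consumer with a list such as ["./mcp", "--tiny"], run from the directory that holds mcp, would feed the same stream to gjrand; that path is an assumption about the local installation.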

3. dieharder utility

After the package DieHarder has been installed, the command line utility dieharder becomes available. This utility can run and test the random number generators of the library GSL and those included in DieHarder. In addition, it can test streams of random numbers loaded from files or from the standard input stream. Detailed help on all available options can be obtained by executing the utility with the option -h. Here we list the most important options:
• -l shows the list of available tests,
• -d n selects the test n for application,
• -a applies all tests,
• -g -1 shows the list of available random number generators. The selected generator can be tested by specifying the option -g n, where n is the index of the generator in the list.

In particular, the following generator options select external data sources:
• The option -g 200 instructs the utility to take the stream of binary data fed to the standard input stdin.
• The options -g 201 and -g 202 switch on reading data from a binary and a text file, respectively.
• The options -g 500 and -g 501 make it possible to read the stream of bytes from the pseudodevices /dev/random and /dev/urandom.

For testing external generators, the preferable method is to pass a continuous stream of binary data through a pipe ( | ). For example, this can be done as follows:

    ./random -G 4 -N 1000000000000000 | dieharder -g 200 -a

The program random uses the generator number 4 for generating numbers. Note that this does not cause the process to hang because dieharder will close the channel and terminate the process after completing all tests.

In the case of reading random numbers from a text file, this file must have a specific structure. Each random number must be on a separate line, and the first lines of the file must contain the following data: the type of data (d indicates that the numbers are long integers), the number of numbers in the file, and the number of bits in them (32 or 64 bits). Here is the header of such a file for five 64-bit numbers; the numbers themselves follow it, one per line:

    type: d
    count: 5
    numbit: 64

As soon as such a file is created, it can be passed to dieharder for testing:

    dieharder -a -g 202 -f file.in > file.out

The flag -f specifies the input file containing the numbers for the analysis. The testing results will be saved to file.out. For a full-fledged test, more than [...] numbers are required; therefore, the file size can be as large as tens of gigabytes or more. If fewer numbers are supplied, then dieharder can begin to read the file again from the beginning, which can degrade the testing results.

VI. EXAMPLE OF TESTING THE GENERATORS OF SYMPY AND MAXIMA

Let us discuss how the testing of an arbitrary random number generator can be organized using the computer algebra system Maxima and the library SymPy for Python. We use Maxima version 5.42.1 and Python 3.6.8 installed with Miniconda. We do not want to compare various implementations of generators; rather, we provide an example that can be used for testing generators in other libraries, packages, and computer algebra systems.

Since SymPy is a Python module, we use the standard module random for generating random numbers. The following code fragment writes unsigned 64-bit random integers to the standard output stream.

    import random
    import sys

    while True:
        # draw an integer from 0 to 2**64 - 1 inclusive
        r = random.randint(0, 2**64 - 1)
        # encode it as 8 bytes, least significant byte first
        r = r.to_bytes(8, byteorder="little", signed=False)
        sys.stdout.buffer.write(r)

This fragment generates an integer in the interval [0, 2^64), which is then transformed to binary form and placed into the standard output stream. By placing this script into the file rand_test.py, we can organize testing by executing the command line instruction

    python rand_test.py | dieharder -g 200 -a

or

    python rand_test.py | RNG_test stdin64

In the case of Maxima, a sequence of random integers can be generated using the code fragment

    for i: 1 thru 10 step 1 do print(random(2^64-1))$

However, at the output we obtain a column of integers in text format; for this reason, we use the ability of dieharder to read text data from files (see Subsection V B 3).

Since both computer algebra systems use the Mersenne Twister algorithm, the testing results are almost identical. The utility RNG_test for SymPy produces the result "no anomalies in 133 test result(s)", while DieHarder reports five weakly passed tests out of more than thirty. The results produced by DieHarder for Maxima are almost the same.
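A column of integers such as the one printed by Maxima still needs the header described in Subsection V B 3 before dieharder can read it. The conversion can be sketched as follows; the function names write_dieharder_text and convert_column are hypothetical, and the three header fields are taken from the example in that subsection, so any further requirements of dieharder's parser are not covered here.

```python
import io


def write_dieharder_text(out, numbers, numbit=64):
    """Write integers in the text layout described in Subsection V B 3."""
    numbers = list(numbers)
    limit = 1 << numbit
    if any(not 0 <= n < limit for n in numbers):
        raise ValueError(f"every value must fit in {numbit} bits")
    out.write("type: d\n")
    out.write(f"count: {len(numbers)}\n")
    out.write(f"numbit: {numbit}\n")
    for n in numbers:
        out.write(f"{n}\n")


def convert_column(src, dst, numbit=64):
    """Convert a column of decimal integers (one per line) read from src."""
    # int() tolerates surrounding whitespace; blank lines are skipped
    values = [int(line) for line in src if line.strip()]
    write_dieharder_text(dst, values, numbit)


if __name__ == "__main__":
    demo = io.StringIO("12 \n7\n")  # stand-in for captured Maxima output
    result = io.StringIO()
    convert_column(demo, result)
    print(result.getvalue(), end="")
```

In practice src and dst would be opened files, for example the captured Maxima output and the file.in later passed to dieharder with -g 202 -f file.in.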

VII. CONCLUSION

A detailed description of the experience in building a test bed for investigating software implementations of pseudorandom number generators is given. The software described in this paper provides a researcher with a selection of three command line utilities. These utilities eliminate the need for low-level programming. They can be conveniently used with an intermediate connecting programming language.

To use these utilities, one should place the generated random numbers into the standard output stream in binary form. Since the majority of computer algebra systems are designed for high-level programming, it can be difficult to provide data in binary form. In this case, only the utility dieharder can be used because it can accept data in text format.

At the end of the paper, methods of investigating software implementations of random number generators are demonstrated using two open source computer algebra systems as examples.

ACKNOWLEDGMENTS

The publication has been prepared with the support of the "RUDN University Program 5-100".

[1] D. E. Knuth, The Art of Computer Programming, 3rd Edition, Vol. 2, Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1997.
[2] Maple home site (2020).
[3] Mathematica home site (2020).
[4] SymPy home site (2020).
[5] M. Matsumoto, T. Nishimura, Mersenne Twister: A 623-dimensionally Equidistributed Uniform Pseudo-random Number Generator, ACM Trans. Model. Comput. Simul. 8 (1) (1998) 3–30. doi:10.1145/272991.272995.
[6] F. Panneton, P. L'Ecuyer, On the Xorshift Random Number Generators, ACM Trans. Model. Comput. Simul. 15 (4) (2005) 346–361. doi:10.1145/1113316.1113319.
[7] M. E. O'Neill, PCG: A Family of Simple Fast Space-Efficient Statistically Good Algorithms for Random Number Generation, Tech. Rep. HMC-CS-2014-0905, Harvey Mudd College, Claremont, CA (2014).
[8] P. Boldi, S. Vigna, On the Lattice of Antichains of Finite Intervals, Order 35 (1) (2018) 57–81. doi:10.1007/s11083-016-9418-8.
[9] G. Marsaglia, The Marsaglia Random Number CDROM including the Diehard Battery of Tests of Randomness (1995). URL https://web.archive.org/web/20160125103112/http://stat.fsu.edu/pub/diehard/
[10] P. L'Ecuyer, R. Simard, TestU01 - Empirical Testing of Random Number Generators (2009). URL http://simul.iro.umontreal.ca/testu01/tu01.html
[11] P. L'Ecuyer, R. Simard, TestU01: A C library for empirical testing of random number generators, ACM Transactions on Mathematical Software (TOMS) 33 (4) (2007) 22.
[12] C. Doty-Humphrey, PractRand official site (2018). URL http://pracrand.sourceforge.net/
[13] Gjrand random numbers official site (2014). URL http://gjrand.sourceforge.net/
[14] R. G. Brown, D. Eddelbuettel, D. Bauer, Dieharder: A Random Number Test Suite (2017).
[15] M. Galassi, B. Gough, G. Jungman, J. Theiler, J. Davies, M. Booth, F. Rossi, GSL - GNU Scientific Library (2019).
M S ] A p r Практический подход к тестированию генераторов случайных чисел системкомпьютерной алгебры

М. Н. Геворкян, ∗ Д. С. Кулябов,

1, 2, † А. В. Демидова, ‡ и А. В. Королькова § Кафедра прикладной информатики и теории вероятностей,Российский университет дружбы народов,117198, Москва, ул. Миклухо-Маклая, д. 6 Лаборатория информационных технологий,Объединённый институт ядерных исследований,ул. Жолио-Кюри 6, Дубна, Московская область, Россия, 141980

Данная работа носит практических характер. Долгое время реализации генераторов последо-вательностей псевдослучайных чисел в стандартных библиотеках языков программирования иматематических пакетов были плохо проработаны. Ситуация начала улучшатся сравнительнонедавно. До сих пор большое количество библиотек и слабо поддерживаемых математическихпакетов используют в своем составе старые алгоритмы генерации псевдослучайных чисел. Мыописываем четыре актуальных набора статистических тестов, которые можно применить дляпроверки генератора, который используется в той или иной программной системе. В рабо-те предлагается использовать для исследования утилиты командной строки, что позволяетизбежать низкоуровневого программирования на языках типа С или С++. Кроме того, рас-сматриваются только свободные системы с открытым программным кодом.

Ключевыеслова: генерация псевдослучайных чисел, TestU01, PractRand, DieHarder, gjrand ∗ [email protected] † [email protected] ‡ [email protected] § [email protected] I. ВВЕДЕНИЕ

При моделировании технических систем с управлением возникает необходимость исследования их характе-ристик. Также необходимо исследование влияния параметров систем на эти характеристики. В системах суправлением возникает такое паразитное явление, как автоколебательный режим. Нами проводились исследо-вания по определению области возникновения автоколебаний. Однако параметры этих автоколебаний нами неисследовались. В данной статье мы предлагаем использовать метод гармонической линеаризации для даннойзадачи. Этот метод применяется в теории управления, однако данный раздел математики достаточно редкоиспользуется в классическом математическом моделировании. Авторы предлагаю методическую статью, при-званную познакомить неспециалистов с применением этого метода.

II. INTRODUCTION

In computer science and computing, random numbers are widely used: in statistical testing, in cryptography, and in simulation modeling. However, obtaining truly random numbers is an extremely laborious process, primarily because true random number generators are complex and expensive. Generating truly random numbers may also take a long time, and it is quite possible that, once the source of true randomness is exhausted, the program has to wait (for a time that is itself random). For this reason one usually deals not with truly random numbers but with pseudorandom ones. In this paper we consider only software implementations of pseudorandom number generators.

Pseudorandom number generators must satisfy a number of criteria [1]:
• the period must be long enough to prevent the sequence of pseudorandom numbers from cycling;
• the algorithm must be efficient in terms of running time and consumption of computing resources;
• the algorithm must be reproducible, that is, it must be possible to reproduce a previously generated sequence of pseudorandom numbers any number of times;
• the algorithm must be portable across hardware architectures and operating environments;
• the algorithm must be fast and resource-saving.

Testing a pseudorandom number generator essentially amounts to checking a sequence of pseudorandom values for independence, identical distribution, and uniformity on the unit interval. Russian scientists have also worked actively on this topic [2–5]. A proof of the statistical detectability of pseudorandom numbers can be found, for example, in the works of Yu. N. Tyurin [6].

Every software implementation of a pseudorandom number generator is based on its own algorithm. Many theoretical studies discuss and compare the algorithms themselves, but we leave this aspect aside. We focus on the practical verification of the quality of the pseudorandom sequences produced by program code. The study is carried out with software tools.

Within this paper we touch upon the following problems. First, free computer algebra systems often use pseudorandom number generators that have not passed all available tests. Second, it is desirable to use the best generator (at least with respect to some criteria). To address these problems, we describe a practical test bench for studying software implementations of pseudorandom number generators. It makes it possible to assess the generator currently built into a given computer algebra system. It also helps to select another implementation to be embedded into a free computer algebra system.

As an illustration, the technique is demonstrated on several computer algebra systems.
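Of the criteria listed above, reproducibility is the easiest to demonstrate in code. The following minimal sketch uses Python's standard `random` module, which is based on the Mersenne Twister. The helper name `sample` and the seed values are arbitrary choices made for this illustration only.

```python
import random


def sample(seed, n=5):
    """Return the first n 32-bit outputs of a generator started from `seed`."""
    rng = random.Random(seed)  # independent generator instance with its own state
    return [rng.getrandbits(32) for _ in range(n)]


first = sample(12345)
second = sample(12345)  # same seed: the sequence is reproduced exactly

print(first == second)  # True
```

Using a separate `random.Random` instance rather than the module-level functions keeps the state local, so a test run can be repeated without being affected by other code that draws random numbers.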

III. GENERATORS IN MODERN COMPUTER ALGEBRA SYSTEMS

The invention of the first algorithm for generating a sequence of pseudorandom numbers is attributed to J. von Neumann in 1946. Then, in 1949, D. H. Lehmer proposed his own algorithm, which was later generalized and became known as the linear congruential generator (LCG) [7]. This generator, in its various modifications, became the main algorithm implemented in libraries for Fortran, Ada, and C.

In 1995, George Marsaglia released a set of statistical tests that made it possible to check how random and how uniformly distributed the sequence produced by a given algorithm is. Applying this test set showed that the overwhelming majority of random number generators produce low-quality sequences in practice and fail most of the tests.

This test set became widely known and prompted specialists to start searching for better random number generation algorithms and better software implementations of them.

At present, the current versions of the standard libraries of actively maintained programming languages and computer algebra systems, such as Maple [8], Mathematica [9], and SymPy [10], use an implementation of the Mersenne Twister (MT) algorithm. This algorithm came to be adopted everywhere as a high-quality replacement for the LCG because it was one of the first algorithms whose software implementation passed all the tests available at the time. It was developed in 1997 [11] and owes its name to the use of the Mersenne prime $2^{19937} - 1$. Depending on the implementation, it provides a period of up to $2^{216091} - 1$.

The main drawback of the algorithm is its relative bulkiness and, as a consequence, the comparatively slow operation of the program code. We also note that much more efficient and simpler algorithms have since been developed [12–14]. Otherwise this generator provides a pseudorandom sequence of good quality and is quite suitable for most tasks.

Let us turn, however, to the main goal of this paper. If a research project uses pseudorandom number generation, what tools can be used to check the quality of the generated sequences? The question may be relevant when a non-standard computer algebra system or an old version of a system is used. Even if one of the modern tools is used, the question of choosing a good initial value (seed) remains.

The obvious answer is to use a software package that implements a set of statistical tests. However, all packages known to the authors are implemented in C or C++, and calling their functions directly requires low-level programming. Since the language of a computer algebra system is a high-level, problem-oriented programming language, embedding functions written in C/C++ may be impossible or very laborious.

In our opinion this difficulty can be circumvented by using command-line utilities. Such utilities remove the need to embed C/C++ code into the program: it suffices to write a script that passes the sequence of numbers under analysis to the input of the testing utility.

IV. STATISTICAL TESTS

Testing pseudorandom number generators is essentially a classical problem of statistical hypothesis testing. However, whereas in mathematical statistics the goal is usually to reject the null hypothesis, in testing pseudorandom number generators one tries, on the contrary, to confirm it. In other words, the tests aim to substantiate that the generated pseudorandom numbers behave as independent, identically distributed random variables and are not connected by any deterministic law. A generator passes a test if no statistically significant deviations from the null hypothesis are found.

Since a pseudorandom number generator is a deterministic algorithm, there always exists some statistical test that the generator fails. The quality of a generator is determined by how many tests it can pass. For this reason, software implementations of the tests are combined into test sets and applied together.

We list several tests included in the DieHard set of statistical tests.
• The craps (dice game) test. 200000 sequences of uniformly distributed pseudorandom numbers are generated. Each sequence is used to simulate a game of dice. It is then checked how well the theoretical values of the expectation and the variance agree with the empirical ones. The chi-square ($\chi^2$) criterion is used for the statistical check.
• The birthday paradox. A sequence is generated on a large interval. The spacings between the numbers must be asymptotically Poisson distributed.
• Overlapping permutations. A large number of samples of five consecutive random numbers is generated. The probabilities of occurrence of all possible permutations must be statistically equivalent.

DieHard comprised 12 tests in total. In modern software packages that implement sets of statistical tests, the number of tests is substantially larger and exceeds several dozen.

V. SOFTWARE PACKAGES IMPLEMENTING SETS OF STATISTICAL TESTS

Table I. Summary of the characteristics of the test packages

Package    Language    CMD  Unix  Windows  Version  Date        Site
TestU01    ANSI C      -    +     ±        1.2.3    18.08.2009  [16]
PractRand  C99, C++11  +    +     +        0.94     04.08.2018  [18]
gjrand     C99         +    +     ±        4.2.1    28.11.2014  [19]
DieHarder  C99         +    +     ±        3.31.1   19.06.2017  [20]

As already noted, historically the first software package implementing a set of statistical tests (or simply a test set) for pseudorandom number generators was DieHard [15], created in 1995 by George Marsaglia. It was distributed on a CD, and its official page is now available only as an archived copy. DieHard itself is no longer up to date, but the tests it contained are now included in other packages of statistical tests.

Among the packages that are currently relevant, the following four can be singled out.
• TestU01 [16, 17] by Pierre L'Ecuyer and Richard Simard. Written in ANSI C. At present it is the best-known test set. It tests generators that produce numbers from the interval $[0, 1)$. The latest version is dated 18 August 2009.
• PractRand [18] by Chris Doty-Humphrey. Written in C++11 with elements of C99. It takes a byte stream as input and can test 32-bit and 64-bit generators. It can cope with very large volumes of data. The latest version is dated 4 August 2018.
• gjrand [19]. We could not find the author's contact details on the official site. Written in C99. It takes a byte stream as input. It ships with a set of generators capable of producing not only uniformly distributed pseudorandom sequences but also sequences that follow the normal, Poisson, and some other distributions. The latest version is dated 28 November 2014.
• DieHarder [20] by Robert Brown. It is positioned as the successor of the DieHard tests. Written in C. It requires the GSL library [21] and can test any generator whose interface follows the style of the GSL generator interfaces. The latest version is dated 19 June 2017.

Table I summarizes the main characteristics of the packages under review. The Unix column indicates whether the package can be installed on *nix systems. The Windows column contains a plus only if the program can be built without installing CygWin or MinGW. With enough effort, any of the libraries can be compiled under Windows as well.

All of the listed packages are open source. TestU01 and DieHarder can be installed from the official repositories of many distributions, in particular Ubuntu 18.10. The other two test sets have to be built from source. Each package can be used as a library linked to the user's C/C++ program; PractRand, gjrand, and DieHarder additionally provide command-line utilities. The most feature-rich command-line utility is the one provided by DieHarder.

A. Installing the test packages under Unix-like operating systems

We describe installation into the user's home directory, without administrator rights. All steps were performed under GNU/Linux Ubuntu 18.10 with the gcc compiler collection. First we create the following directories:

    mkdir -p ~/usr/bin ~/usr/lib ~/usr/share ~/usr/include

• ~/usr/bin for executable files;
• ~/usr/lib for shared and static library files (.so and .a);
• ~/usr/include for header files;
• ~/usr/share for examples and documentation.

We also add the following environment variables to ~/.bashrc:

    export PATH="$HOME/usr/bin/:$PATH"
    export LD_LIBRARY_PATH="$HOME/usr/lib:$LD_LIBRARY_PATH"
    export LIBRARY_PATH="$HOME/usr/lib:$LIBRARY_PATH"
    export C_INCLUDE_PATH="$HOME/usr/include:$C_INCLUDE_PATH"

This lets the shell look for executables in ~/usr/bin as well, and lets the compiler pick up the libraries and header files automatically.

1. Installing TestU01

To install TestU01, we download the zip archive from the official site [16], unpack it, and change to its root directory:

    wget http://simul.iro.umontreal.ca/testu01/TestU01.zip
    unzip TestU01.zip
    cd TestU01-1.2.3/

TestU01 ships with a set of Autoconf and Automake scripts, so compilation and installation reduce to the following three commands:

    ./configure --prefix=$HOME/usr
    make
    make install

The option --prefix=$HOME/usr installs the program locally into ~/usr. After installation, a large number of header files appear in ~/usr/include. To organize them more neatly, we create the directory ~/usr/include/testu01 and move all TestU01 .h files there. The same can be done in ~/usr/lib for the library files.

2. Installing gjrand

We download the source archive from the official site [19]:

    wget https://datapacket.dl.sourceforge.net/project/gjrand/gjrand/gjrand.4.2.1/gjrand.4.2.1.tar.bz2
    tar -xvjf gjrand.4.2.1.tar.bz2
    cd gjrand.4.2.1

gjrand is compiled with bash scripts named compile. The source files of the library are located in the src directory:

    cd src && ./compile

This produces the shared library gjrand.so and the static library gjrand.a, which we copy by hand into ~/usr/lib/gjrand, together with the header file:

    mkdir -p ~/usr/lib/gjrand ~/usr/include/gjrand
    cp -t ~/usr/lib/gjrand gjrand.a gjrand.so
    cp -t ~/usr/include/gjrand gjrand.h

We then return to the root directory with cd .. and go through all the subdirectories in the same way. Each of them contains a compile script that has to be run:

    cd testother/src && ./compile

This produces executables in testother/bin. We go one level up again with cd .. and move on to the next directory:

    cd testmisc/ && ./compile

This yields the executable kat, which can be used to check that the library works correctly. When run, it performs a series of tests to verify that the program behaves as its author intended. Moving on to the next directory,

    cd testunif/src && ./compile

we obtain an executable for each statistical test; they are placed in testunif/bin. Two files, mcp and pmcp, appear directly in testunif: these are the command-line utilities for testing generated bit sequences. They use the executables from testunif/bin, so they must not be moved to another directory.

The next directory,

    cd testfunif/src && ./compile

is completely analogous to the previous one, except that its tests are intended for numbers of type double from the interval $[0, 1)$.

The testother directory,

    cd testother/src && ./compile

contains tests for non-uniform distributions. Its executables are likewise placed in testother/bin.

As a result of installing gjrand we have two library files in ~/usr/lib/gjrand and a header file in ~/usr/include/gjrand. The remaining utilities stay in their respective directories. The installation went through without errors. The package author passes the -Wall option to the compiler, and only minor warnings about if conditions without enclosing braces were visible during the build.

3. Installing DieHarder

Building the program requires the GSL (GNU Scientific Library) [21]. On Ubuntu it can be installed with the command

    apt install libgsl-dev

We download the source archive from the official site [20] and unpack it:

    wget https://webhome.phy.duke.edu/~rgb/General/dieharder/dieharder-3.31.1.tgz
    tar -xzf dieharder-3.31.1.tgz
    cd dieharder-3.31.1

As in the case of TestU01, DieHarder is installed with three commands:

    ./configure --prefix=$HOME/usr
    make
    make install

Additionally, the option --disable-shared can be passed to ./configure, in which case the command-line utility is linked statically and no shared library is built. This is convenient if only the command-line utility is needed.

In our case compilation failed: the compiler could not find the definition of the type intptr_t. This can be fixed by including the header file stdint.h in ./include/dieharder/libdieharder.h. The rest of the installation completed without errors. Running make install copied the library files into ~/usr/lib, automatically created the subdirectory dieharder in ~/usr/include, and placed all header files there. The dieharder command-line utility ended up in ~/usr/bin.

4. Installing PractRand

We download the source archive from the official site [18] and unpack it:

    wget https://netcologne.dl.sourceforge.net/project/pracrand/PractRand_0.94.zip
    unzip PractRand_0.94.zip

The archive ships prebuilt shared and static library files for Windows, as well as a Visual Studio project file. To install under Unix, we change to the unix directory and start the build with make:

    cd PractRand_0.94/unix
    make

The build requires a C++ compiler that supports the C++11 standard. During the build, errors occurred that were related to letter case in the names of header files. For example, the file Coup16.h was referred to in test.cpp in lower case, although the actual file name begins with a capital letter. Several more errors had the same cause in the names of the files NearSeq.h and birthday.h and of the directory Tests. This confusion most likely arose because the author developed the program under Windows, where file names are case-insensitive.

After fixing the errors we obtain four executable files and the static library libpracrand.a.

Having described the installation of all four packages, we turn to the utilities that these packages provide for testing sequences of pseudorandom numbers.
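Case mismatches of the kind described above can be located before compiling. The following is a minimal sketch, not part of PractRand: the function name `find_case_mismatches` is hypothetical, only quoted `#include` directives are examined, and only the base file name is compared, so a wrongly cased directory name (such as Tests) is not detected.

```python
import os
import re

# Matches lines of the form: #include "some/path/name.h"
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*"([^"]+)"', re.MULTILINE)
SOURCE_SUFFIXES = (".c", ".cpp", ".h", ".hpp")


def find_case_mismatches(root):
    """Return (source_file, included_name, name_on_disk) for every quoted
    #include whose base name matches an existing file only up to letter case."""
    # Map lower-cased file names to the names actually present under root.
    on_disk = {}
    for _dirpath, _dirs, files in os.walk(root):
        for name in files:
            on_disk.setdefault(name.lower(), set()).add(name)

    mismatches = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith(SOURCE_SUFFIXES):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as fh:
                text = fh.read()
            for included in INCLUDE_RE.findall(text):
                base = included.replace("\\", "/").split("/")[-1]
                candidates = on_disk.get(base.lower(), set())
                if candidates and base not in candidates:
                    mismatches.append((path, included, sorted(candidates)[0]))
    return mismatches
```

Calling `find_case_mismatches("PractRand_0.94")` from the directory where the archive was unpacked would list each offending include together with the file name as it exists on disk, which can then be corrected by hand.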

B. Command-line utilities

The PractRand, DieHarder, and gjrand packages include command-line utilities. These utilities run statistical tests on a sequence of random numbers read from standard input or from a file. The most feature-rich utility is the one included in DieHarder.
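Feeding a testing utility through its standard input can also be done from a script instead of a shell pipe. The sketch below is one possible way to do this with Python's `subprocess` module; the helper name `feed` is hypothetical, and the consumer started at the end is only a stand-in that counts the bytes it receives, not one of the testing utilities.

```python
import subprocess
import sys


def feed(command, chunks):
    """Start `command` and write byte chunks to its standard input until the
    chunks run out or the child closes the pipe.
    Returns (exit_code, number_of_bytes_handed_to_the_pipe)."""
    proc = subprocess.Popen(command, stdin=subprocess.PIPE)
    sent = 0
    try:
        for chunk in chunks:
            proc.stdin.write(chunk)
            sent += len(chunk)
    except BrokenPipeError:
        pass  # the consumer has read as much as it needs and closed its end
    try:
        proc.stdin.close()  # flushes buffered data and signals end of input
    except BrokenPipeError:
        pass
    return proc.wait(), sent


# Stand-in consumer used only for this demonstration.
consumer = [sys.executable, "-c",
            "import sys; print(len(sys.stdin.buffer.read()))"]
exit_code, sent = feed(consumer, (bytes(1024) for _ in range(4)))
```

With a real utility the command list would be replaced accordingly, for example `["dieharder", "-g", "200", "-a"]` or `["RNG_test", "stdin64"]`, and `chunks` would be an iterator over the generator's binary output.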

1. PractRand utilities

After PractRand has been compiled and built, four executable files are at our disposal.
• RNG_output: runs one of the generators included in the package.
• RNG_test: runs statistical tests on the built-in generators or on a data stream supplied on standard input.
• RNG_benchmark: measures the speed of built-in or user-added generators.
• Test_calibration: used for testing and tuning the test sets.

RNG_test is the utility used for testing. For example, to test the built-in generator jsf32 it suffices to run the command

    RNG_test jsf32

To test an external program, the binary stream of unsigned integers it generates must be fed to the standard input of RNG_test, for example:

    ./random -G 4 -N 1000000000000000 | RNG_test stdin32

The stdin32 option makes RNG_test interpret the binary stream as a sequence of 32-bit numbers. With stdin64 the numbers are treated as 64-bit, and with stdin the program decides on the width itself. Data from a file can be tested, for example, as follows:

    cat file.data | RNG_test stdin64

The data in the file must also be in binary format; data in text format are not supported. The tests run rather quickly, and a report is printed that lists the failed and suspicious tests. With the option -p 1 the program prints all results, not only the failed ones.
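A binary stream of 32-bit unsigned integers of the kind expected with stdin32 can be produced as in the following sketch. The generator `lcg32` is a toy linear congruential generator used only as a data source (with the well-known multiplier 1664525 and increment 1013904223); the function names are hypothetical, and little-endian byte order is an assumption that matches typical x86 machines.

```python
import io
import struct
from itertools import islice


def lcg32(seed):
    """Toy 32-bit linear congruential generator, used here only as a data source."""
    state = seed & 0xFFFFFFFF
    while True:
        state = (1664525 * state + 1013904223) & 0xFFFFFFFF
        yield state


def pack_u32(values):
    """Pack integers from [0, 2**32) into little-endian 4-byte words."""
    values = list(values)
    return struct.pack("<%dI" % len(values), *values)


def write_stream(gen, out, count, batch=4096):
    """Write `count` 32-bit values taken from `gen` to the binary stream `out`."""
    remaining = count
    while remaining > 0:
        n = min(batch, remaining)
        out.write(pack_u32(islice(gen, n)))
        remaining -= n


buf = io.BytesIO()
write_stream(lcg32(1), buf, 10)
print(len(buf.getvalue()))  # 40: ten values of four bytes each
```

In a real run `out` would be `sys.stdout.buffer` and the script's output would be piped into `RNG_test stdin32`. Writing in batches rather than one value at a time reduces the per-call overhead of the interpreter.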

Table II. Options of the mcp utility from the gjrand package

Option        Description
--tiny        10 MB
--small       100 MB
--standard    1 GB (default)
--big         10 GB
--huge        100 GB
--tera        1 TB
--ten-tera    10 TB
number        number of bytes
--no-rewind   do not rewind (repeat) the input

2. gjrand utilities

The build produces many different executable files. Two of them are intended for testing external generators. One is located in the testunif directory and is called mcp (master computer program); the other is located in the testfunif directory and is called fmcp. The mcp utility tests sequences of unsigned integer pseudorandom numbers, and fmcp tests numbers from the interval $[0, 1)$. Otherwise there are no differences between them.

The mcp program accepts only a binary stream on its standard input, which is supplied through a pipe:

    ./random -G 4 -N 1000000000000000 | mcp --big

The entire set of tests is applied to the stream. The results are printed as they become available. Apart from setting the amount of data to be checked (see Table II), the program has no other options.
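When the numbers come from a source of limited size rather than from an endless stream, it can help to request no more data than is actually available. The sketch below picks the largest size option from Table II that fits into a given number of bytes. The function name is hypothetical, and the sizes are assumed to be decimal units (1 MB = 10^6 bytes), which is not stated in the table.

```python
# Size options of mcp as listed in Table II.
# Decimal units are an assumption made for this sketch.
MCP_SIZES = [
    ("--tiny", 10 * 10**6),
    ("--small", 100 * 10**6),
    ("--standard", 10**9),
    ("--big", 10 * 10**9),
    ("--huge", 100 * 10**9),
    ("--tera", 10**12),
    ("--ten-tera", 10 * 10**12),
]


def largest_fitting_option(available_bytes):
    """Return the largest size option not exceeding `available_bytes`,
    or None if even the smallest option needs more data."""
    best = None
    for flag, size in MCP_SIZES:
        if size <= available_bytes:
            best = flag
    return best


print(largest_fitting_option(5 * 10**9))  # --standard
```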

3. The dieharder utility

Once the DieHarder package is installed, the dieharder command becomes available in the console. It can run and test the pseudorandom number generators built into DieHarder and the GSL library. In addition, it can test a stream of random numbers read from files or from standard input. Detailed help on all options is printed with the -h option. We briefly list the most important ones:
• -l: show the list of available tests;
• -d n: select test n;
• -a: apply all tests;
• -g -1: show the list of available pseudorandom number generators. A chosen generator can be tested with the option -g n, where n is the number of the generator in the list.

The list of generators includes, in particular, the following entries.
• Option 200 reads a stream of binary data supplied on standard input (stdin).
• Options 201 and 202 read data from a file in binary and text format, respectively.
• Options 500 and 501 read a byte stream from the pseudo-devices /dev/random and /dev/urandom.

For testing external generators, the preferred way is to pass a continuous stream of binary data through a pipe (|), for example:

    ./random -G 4 -N 1000000000000000 | dieharder -g 200 -a

Here the program random uses generator number 4 to produce the numbers. Note that this does not cause the process to hang, since dieharder closes the pipe itself and terminates once it has run all the tests.

When pseudorandom numbers are read from a text file, the file must have a definite structure. Each pseudorandom number must be placed on a new line, and the first lines of the file must specify the type of the numbers (d: double-precision integers), the count of numbers in the file, and the bit width of the numbers (32 or 64 bits). Here is an example of the beginning of such a file:

    type: d
    count: 5
    numbit: 64
    1343742658553450546
    16329942027498366702
    3111285719358198731
    2966160837142136004
    17179712607770735227

Once such a file has been created, it can be passed to dieharder for testing:

    dieharder -a -g 202 -f file.in > file.out

The -f flag specifies the input file with the numbers to be analyzed. The test results are saved in file.out.

A full-scale test requires a very large quantity of numbers, so the size of the file may exceed tens of gigabytes. If fewer numbers are supplied, dieharder may start reading the file again from the beginning, which degrades the test results.

VI. EXAMPLE: TESTING THE GENERATORS OF SYMPY AND MAXIMA

Let us consider how the testing of any pseudorandom number generator can be organized, using CAS Maxima and the SymPy library for Python as examples. We use CAS Maxima version 5.42.1 and Python 3.6.8 from the Miniconda distribution. Our aim is not to compare the generator implementations but only to provide an example that can be reused for testing generators in other libraries, mathematical packages, and computer algebra systems.

Since SymPy is a Python module, pseudorandom numbers are generated with the standard module random. The following code fragment writes unsigned 64-bit integers to the standard output stream:

    import random
    import sys

    while True:
        # integer from the closed range [0, 2**64 - 1]
        r = random.randint(0, 2**64 - 1)
        # 8 bytes, little-endian, unsigned
        r = r.to_bytes(8, byteorder="little", signed=False)
        sys.stdout.buffer.write(r)

In this fragment an integer from the half-open interval $[0, 2^{64})$ is generated, converted to binary form, and written to the standard output stream. Having saved the script in the file rand_test.py, we can run the tests by executing the following command in the console:

    python rand_test.py | dieharder -g 200 -a

or

    python rand_test.py | RNG_test stdin64

In Maxima, a sequence of random integers can be generated with the following code fragment:

    for i: 1 thru 10 step 1 do print(random(2^64-1))$

However, the output is a column of integers in text form, so to test this generator we use the ability of the dieharder utility to read data in text form from files prepared in advance (see Section V B 3).

Since both computer algebra systems use the Mersenne Twister algorithm, the test results obtained are practically the same. For SymPy, the RNG_test utility reports no anomalies in 133 test result(s), while the dieharder utility reports 5 weakly passed tests out of more than thirty. The dieharder results for Maxima are practically identical.
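The text route used for Maxima can be sketched as follows: a column of decimal integers, one per line, is converted into a file with the header described in Section V B 3. The function names are hypothetical, and lines that are not plain non-negative integers (for example, prompts or blank lines in the captured output) are simply skipped, which is an assumption about how the output was captured.

```python
import re


def parse_integer_lines(lines):
    """Collect the non-negative decimal integers found one per line,
    skipping any line that contains anything else."""
    numbers = []
    for line in lines:
        token = line.strip()
        if re.fullmatch(r"[0-9]+", token):
            numbers.append(int(token))
    return numbers


def write_dieharder_text(numbers, path, numbit=64):
    """Write numbers to `path` with the header lines type, count, and numbit."""
    limit = 1 << numbit
    for x in numbers:
        if not 0 <= x < limit:
            raise ValueError("%d does not fit into %d bits" % (x, numbit))
    with open(path, "w") as fh:
        fh.write("type: d\n")
        fh.write("count: %d\n" % len(numbers))
        fh.write("numbit: %d\n" % numbit)
        for x in numbers:
            fh.write("%d\n" % x)
```

For instance, if the Maxima output has been captured in a file named maxima.out (a hypothetical name), then `write_dieharder_text(parse_integer_lines(open("maxima.out")), "file.in")` produces a file that can be passed to `dieharder -a -g 202 -f file.in`.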

VII. CONCLUSION

The authors have tried to give a comprehensive description of the practice of building a test bench for studying software implementations of pseudorandom number generators. The software described gives the researcher a choice of three command-line utilities. These utilities remove the need for low-level programming. In addition, they are convenient to use together with some intermediate glue programming language.

To use these utilities, the generated numbers must be written to the standard output stream in binary form. Since most computer algebra systems are oriented towards high-level programming, providing binary output may be difficult. In this case only the dieharder utility can be used, since it can process data in text form.

At the end of the paper, techniques for studying software implementations of random number generators are demonstrated on two free computer algebra systems.

ACKNOWLEDGMENTS

The publication has been prepared with the support of the RUDN University Program "5-100".

[1] Drozdova I. I., Zhilin V. V. Random and pseudorandom number generators // Technical Sciences in Russia and Abroad: Proceedings of the VII International Scientific Conference. Moscow: Buki-Vedi, 2017. 11. P. 13–15. (In Russian.) Access mode: https://moluch.ru/conf/tech/archive/286/13233.
[2] Kolchin V. F., Sevastyanov B. A., Chistyakov V. P. Random Allocations. Moscow: Nauka, 1976. 224 p. (In Russian.)
[3] Tikhomirova M. I., Chistyakov V. P. Statistical tests intended for detecting certain types of dependence between random sequences // Trudy po Diskretnoi Matematike. 2006. Vol. 3. P. 357–376. (In Russian.) Access mode: http://mi.mathnet.ru/tdm153.
[4] Tikhomirova M. I., Chistyakov V. P. A statistical test of the empty-boxes type // Matematicheskie Voprosy Kriptografii. 2010. Vol. 1, no. 1. P. 101–108. (In Russian.) Access mode: http://mi.mathnet.ru/mvk6.
[5] Kirichenko L. O., Tsekhmistro R. I., Krug O. Ya., Storozhenko A. V. Comparative analysis of pseudorandom number generation in modern wireless data transmission technologies // Systemy Obrobky Informatsii. 2009. Vol. 4 (78). P. 70–74. (In Russian.)
[6] Tyurin Yu. N., Makarov A. A. Statistical Data Analysis on a Computer / Ed. by V. E. Figurnov. Moscow: INFRA, 1998. 528 p. ISBN: 5-86225-662-8. (In Russian.)
[7] Knuth D. E. The Art of Computer Programming. 3rd ed. Moscow: Williams, 2001. Vol. 2. 832 p. ISBN: 5-8459-0081-6. (Russian edition.)
[8] Maple home site. 2020.
[9] Mathematica home site. 2020.
[10] SymPy home site. 2020.
[11] Matsumoto M., Nishimura T. Mersenne Twister: A 623-dimensionally Equidistributed Uniform Pseudo-random Number Generator // ACM Trans. Model. Comput. Simul. 1998. Vol. 8, no. 1. P. 3–30.
[12] Panneton F., L'Ecuyer P. On the Xorshift Random Number Generators // ACM Trans. Model. Comput. Simul. 2005. Vol. 15, no. 4. P. 346–361.
[13] O'Neill M. E. PCG: A Family of Simple Fast Space-Efficient Statistically Good Algorithms for Random Number Generation. Report HMC-CS-2014-0905. Harvey Mudd College, Claremont, CA, 2014.
[14] Boldi P., Vigna S. On the Lattice of Antichains of Finite Intervals // Order. 2018. Vol. 35, no. 1. P. 57–81.
[15] Marsaglia G. The Marsaglia Random Number CDROM including the Diehard Battery of Tests of Randomness. 1995. Access mode: https://web.archive.org/web/20160125103112/http://stat.fsu.edu/pub/diehard/.
[16] L'Ecuyer P., Simard R. TestU01: Empirical Testing of Random Number Generators. 2009. Access mode: http://simul.iro.umontreal.ca/testu01/tu01.html.
[17] L'Ecuyer P., Simard R. TestU01: A C library for empirical testing of random number generators // ACM Transactions on Mathematical Software (TOMS). 2007. Vol. 33, no. 4. P. 22.
[18] Doty-Humphrey C. PractRand official site. 2018. Access mode: http://pracrand.sourceforge.net/.
[19] Gjrand random numbers official site. 2014. Access mode: http://gjrand.sourceforge.net/.
[20] Brown R. G., Eddelbuettel D., Bauer D. Dieharder: A Random Number Test Suite. 2017.
[21] Galassi M., Gough B., Jungman G., Theiler J., Davies J., Booth M., Rossi F. GSL: GNU Scientific Library. 2019.