htswanalysis/MACS/lib/gsl/gsl-1.11/doc/gsl-design.texi

   1 \input texinfo @c -*-texinfo-*-
   2 @c %**start of header
   3 @setfilename gsl-design.info
   4 @settitle GNU Scientific Library
   5 @finalout
   6 @c -@setchapternewpage odd
   7 @c %**end of header
   8
   9 @dircategory Scientific software
  10 @direntry
  11 * GSL-design: (GSL-design).             GNU Scientific Library -- Design
  12 @end direntry
  13
  14 @comment @include version-design.texi
  15 @set GSL @i{GNU Scientific Library}
  16
  17 @ifinfo
  18 This file documents the @value{GSL}.
  19
  20 Copyright (C) 1996, 1997, 1998, 1999, 2000, 2001, 2004 The GSL Project.
  21
  22 Permission is granted to copy, distribute and/or modify this document
  23 under the terms of the GNU Free Documentation License, Version 1.2 or
  24 any later version published by the Free Software Foundation; with no
  25 Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.  A
  26 copy of the license is included in the section entitled ``GNU Free
  27 Documentation License''.
  28 @end ifinfo
  29
  30 @titlepage
  31 @title GNU Scientific Library -- Design document
  32 @comment @subtitle Edition @value{EDITION}, for gsl Version @value{VERSION}
  33 @comment @subtitle @value{UPDATED}
  34 @author Mark Galassi
  35 Los Alamos National Laboratory
  36
  37 @author James Theiler
  38 Astrophysics and Radiation Measurements Group, Los Alamos National Laboratory
  39
  40 @author Brian Gough
  41 Network Theory Limited
  42
  43 @page
  44 @vskip 0pt plus 1filll
  45 Copyright @copyright{} 1996,1997,1998,1999,2000,2001,2004 The GSL Project.
  46
  47 Permission is granted to make and distribute verbatim copies of
  48 this manual provided the copyright notice and this permission notice
  49 are preserved on all copies.
  50
  51 Permission is granted to copy and distribute modified versions of this
  52 manual under the conditions for verbatim copying, provided that the entire
  53 resulting derived work is distributed under the terms of a permission
  54 notice identical to this one.
  55
  56 Permission is granted to copy and distribute translations of this manual
  57 into another language, under the above conditions for modified versions,
  58 except that this permission notice may be stated in a translation approved
  59 by the Foundation.
  60 @end titlepage
  61
  62 @contents
  63
  64 @node Top, Motivation, (dir), (dir)
  65 @top About GSL
  66
  67 @ifinfo
  68 This file documents the design of @value{GSL}, a collection of numerical
  69 routines for scientific computing.
  70
  71 More information about GSL can be found at the project homepage,
  72 @uref{http://www.gnu.org/software/gsl/}.
  73 @end ifinfo
  74
  75 The @value{GSL} is a library of scientific subroutines.  It aims to
  76 provide a convenient interface to routines that do standard (and not so
  77 standard) tasks that arise in scientific research.  More than that, it
  78 also provides the source code.  Users are welcome to alter, adjust,
  79 modify, and improve the interfaces and/or implementations of whichever
  80 routines might be needed for a particular purpose.
  81
  82 GSL is intended to provide a free equivalent to existing proprietary
  83 numerical libraries written in C or Fortran, such as NAG, IMSL's CNL,
  84 IBM's ESSL, and SGI's SCSL.
  85
  86 The target platform is a low-end desktop workstation. The goal is to
  87 provide something which is generally useful, and the library is aimed at
  88 general users rather than specialists.
  89
  90 @menu
  91 * Motivation::
  92 * Contributing::
  93 * Design::
  94 * Bibliography::
  95 * Copying::
  96 * GNU Free Documentation License::
  97 @end menu
  98
  99 @node Motivation, Contributing, Top, Top
 100 @chapter Motivation
 101 @cindex numerical analysis
 102 @cindex free software
 103
 104 There is a need for scientists and engineers to have a numerical library
 105 that:
 106 @itemize @bullet
 107 @item
 108 is free (in the sense of freedom, not in the sense of gratis; see the
 109 GNU General Public License), so that people can use that library,
 110 redistribute it, modify it @dots{}
 111 @item
 112 is written in C using modern coding conventions, calling conventions,
 113 scoping @dots{}
 114 @item
  115 is clearly and pedagogically documented; preferably with Texinfo, so as
 116 to allow online info, WWW and TeX output.
 117 @item
 118 uses top quality state-of-the-art algorithms.
 119 @item
 120 is portable and configurable using @emph{autoconf} and @emph{automake}.
 121 @item
 122 basically, is GNUlitically correct.
 123 @end itemize
 124
 125 There are strengths and weaknesses with existing libraries:
 126
 127 @emph{Netlib} (http://www.netlib.org/) is probably the most advanced set
 128 of numerical algorithms available on the net, maintained by AT&T.
 129 Unfortunately most of the software is written in Fortran, with strange
 130 calling conventions in many places.  It is also not very well collected,
 131 so it is a lot of work to get started with netlib.
 132
 133 @emph{GAMS} (http://gams.nist.gov/) is an extremely well organized set
 134 of pointers to scientific software, but like netlib, the individual
 135 routines vary in their quality and their level of documentation.
 136
 137 @emph{Numerical Recipes} (http://www.nr.com,
 138 http://cfata2.harvard.edu/nr/) is an excellent book: it explains the
 139 algorithms in a very clear way.  Unfortunately the authors released the
 140 source code under a license which allows you to use it, but prevents you
 141 from re-distributing it.  Thus Numerical Recipes is not @emph{free} in
 142 the sense of @emph{freedom}.  On top of that, the implementation suffers
 143 from @emph{fortranitis} and other
 144 limitations. [http://www.lysator.liu.se/c/num-recipes-in-c.html]
 145
 146 @emph{SLATEC} is a large public domain collection of numerical routines
 147 in Fortran written under a Department of Energy program in the
 148 1970's. The routines are well tested and have a reasonable overall
 149 design (given the limitations of that era).  GSL should aim to be a
 150 modern version of SLATEC.
 151
 152 @emph{NSWC} is the Naval Surface Warfare Center numerical library.  It
 153 is a large public-domain Fortran library, containing a lot of
  154 high-quality code.  Documentation for the library is hard to find; only
 155 a few photocopies of the printed manual are still in circulation.
 156
 157 @emph{NAG} and @emph{IMSL} both sell high-quality libraries which are
 158 proprietary.  The NAG library is more advanced and has wider scope than
 159 IMSL. The IMSL library leans more towards ease-of-use and makes
 160 extensive use of variable length argument lists to emulate "default
 161 arguments".
 162
 163 @emph{ESSL} and @emph{SCSL} are proprietary libraries from IBM and SGI.
 164
 165 @emph{Forth Scientific Library} [see the URL
 166 http://www.taygeta.com/fsl/sciforth.html].  Mainly of interest to Forth
 167 users.
 168
 169 @emph{Numerical Algorithms with C} G. Engeln-Mullges, F. Uhlig. A nice
 170 numerical library written in ANSI C with an accompanying
 171 textbook. Source code is available but the library is not free software.
 172
 173 @emph{NUMAL} A C version of the NUMAL library has been written by
 174 H.T. Lau and is published as a book and disk with the title "A Numerical
 175 Library in C for Scientists and Engineers". Source code is available but
 176 the library is not free software.
 177
 178 @emph{C Mathematical Function Handbook} by Louis Baker. A library of
 179 function approximations and methods corresponding to those in the
 180 "Handbook of Mathematical Functions" by Abramowitz and Stegun.  Source
 181 code is available but the library is not free software.
 182
 183 @emph{CCMATH} by Daniel A. Atkinson. A C numerical library covering
 184 similar areas to GSL. The code is quite terse.  Earlier versions were
 185 under the GPL but unfortunately it has changed to the LGPL in recent
 186 versions.
 187
 188 @emph{CEPHES} A useful collection of high-quality special functions
 189 written in C. Not GPL'ed.
 190
 191 @emph{WNLIB} A small collection of numerical routines written in C by
 192 Will Naylor. Public domain.
 193
 194 @emph{MESHACH} A comprehensive matrix-vector linear algebra library
 195 written in C. Freely available but not GPL'ed (non-commercial license).
 196
 197 @emph{CERNLIB} is a large high-quality Fortran library developed at CERN
 198 over many years.  It was originally non-free software but has recently
 199 been released under the GPL.
 200
 201 @emph{COLT} is a free numerical library in Java developed at CERN by
 202 Wolfgang Hoschek.  It is under a BSD-style license.
 203
 204 The long-term goal will be to provide a framework to which the real
 205 numerical experts (or their graduate students) will contribute.
 206
 207 @node  Contributing, Design, Motivation, Top
 208 @chapter Contributing
 209
 210 This design document was originally written in 1996.  As of 2004, GSL
  211 itself is essentially feature complete; the developers are not actively
 212 working on any major new functionality.
 213
 214 The main emphasis is now on ensuring the stability of the existing
 215 functions, improving consistency, tidying up a few problem areas and
 216 fixing any bugs that are reported.  Potential contributors are
 217 encouraged to gain familiarity with the library by investigating and
 218 fixing known problems listed in the @file{BUGS} file in the CVS
 219 repository.
 220
 221 Adding large amounts of new code is difficult because it leads to
 222 differences in the maturity of different parts of the library.  To
 223 maintain stability, any new functionality is encouraged as
 224 @dfn{packages}, built on top of GSL and maintained independently by the
 225 author, as in other free software projects (such as the Perl CPAN
 226 archive and TeX CTAN archive, etc).
 227
 228 @menu
 229 * Packages::
 230 @end menu
 231
 232 @node Packages,  , Contributing, Contributing
 233 @section Packages
 234
 235 The design of GSL permits extensions to be used alongside the existing
 236 library easily by simple linking.  For example, additional random number
 237 generators can be provided in a separate library:
 238
 239 @example
 240 $ tar xvfz rngextra-0.1.tar.gz
 241 $ cd rngextra-0.1
 242 $ ./configure; make; make check; make install
 243 $ ...
 244 $ gcc -Wall main.c -lrngextra -lgsl -lgslcblas -lm
 245 @end example
 246
 247 The points below summarise the package design guidelines.  These are
 248 intended to ensure that packages are consistent with GSL itself, to make
 249 life easier for the end-user and make it possible to distribute popular
 250 well-tested packages as part of the core GSL in future.
 251
 252 @itemize @bullet
 253 @item Follow the GSL and GNU coding standards described in this document
 254
 255 This means using the standard GNU packaging tools, such as Automake,
 256 providing documentation in Texinfo format, and a test suite.  The test
 257 suite should run using @samp{make check}, and use the test functions
 258 provided in GSL to produce the output with @code{PASS:}/@code{FAIL:}
  259 lines.  It is not essential to use libtool: packages are likely to be
  260 small, so a static library is sufficient and simpler to build.
 261
 262 @item Use a new unique prefix for the package (do not use @samp{gsl_} -- this is reserved for internal use).
 263
 264 For example, a package of additional random number generators might use
 265 the prefix @code{rngextra}.
 266
 267 @example
 268 #include <rngextra.h>
 269
 270 gsl_rng * r = gsl_rng_alloc (rngextra_lsfr32);
 271 @end example
 272
 273 @item Use a meaningful version number which reflects the state of development
 274
 275 Generally, @code{0.x} are alpha versions, which provide no guarantees.
 276 Following that, @code{0.9.x} are beta versions, which should be essentially
 277 complete, subject only to minor changes and bug fixes.  The first major
 278 release is @code{1.0}.  Any version number of @code{1.0} or higher
 279 should be suitable for production use with a well-defined API.
 280
  281 The API must not change within a major release series and should be
 282 backwards-compatible in its behavior (excluding actual bug-fixes), so
  283 that existing code does not have to be modified.  Note that the API
 284 includes all exported definitions, including data-structures defined
 285 with @code{struct}.  If you need to change the API in a package, it
 286 requires a new major release (e.g. @code{2.0}).
 287
 288 @item Use the GNU General Public License (GPL)
 289
 290 Follow the normal procedures of obtaining a copyright disclaimer if you
 291 would like to have the package considered for inclusion in GSL itself in
 292 the future (@pxref{Legal issues}).
 293 @end itemize
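
The version-numbering guideline in the list above can be sketched as a small
classifier.  This is only an illustration: @code{classify_version} and the
@code{release_stage} names are hypothetical and are not part of GSL or of any
package.

```c
#include <stdio.h>

/* Hypothetical helper, not part of GSL: classify a "major.minor[.patch]"
   version string according to the guideline above. */
enum release_stage { STAGE_ALPHA, STAGE_BETA, STAGE_STABLE, STAGE_INVALID };

enum release_stage
classify_version (const char *version)
{
  int major = 0, minor = 0;

  /* Require at least "major.minor"; any ".patch" suffix is ignored. */
  if (sscanf (version, "%d.%d", &major, &minor) != 2
      || major < 0 || minor < 0)
    return STAGE_INVALID;

  if (major >= 1)
    return STAGE_STABLE;        /* 1.0 or higher: production use */
  if (minor == 9)
    return STAGE_BETA;          /* 0.9.x: beta, essentially complete */
  return STAGE_ALPHA;           /* other 0.x: alpha, no guarantees */
}
```

Under these assumptions @code{classify_version ("0.9.2")} yields
@code{STAGE_BETA} and @code{classify_version ("1.0")} yields
@code{STAGE_STABLE}.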
 294
 295 Post announcements of your package releases to
 296 @email{gsl-discuss@@sourceware.org} so that information about them
 297 can be added to the GSL webpages.
 298
 299 For security, sign your package with GPG (@code{gpg --detach-sign
 300 @var{file}}).
 301
 302 An example package @samp{rngextra} containing two additional random
 303 number generators can be found at
 304 @url{http://www.network-theory.co.uk/download/rngextra/}.
 305
 306 @node Design, Bibliography, Contributing, Top
 307 @chapter Design
 308
 309 @menu
 310 * Language for implementation::
 311 * Interface to other languages::
 312 * What routines are implemented::
 313 * What routines are not implemented::
 314 * Design of  Numerical Libraries::
 315 * Code Reuse::
 316 * Standards and conventions::
 317 * Background and Preparation::
 318 * Choice of Algorithms::
 319 * Documentation::
 320 * Namespace::
 321 * Header files::
 322 * Target system::
 323 * Function Names::
 324 * Object-orientation::
 325 * Comments::
 326 * Minimal structs::
 327 * Algorithm decomposition::
 328 * Memory allocation and ownership::
 329 * Memory layout::
 330 * Linear Algebra Levels::
 331 * Error estimates::
 332 * Exceptions and Error handling::
 333 * Persistence::
 334 * Using Return Values::
 335 * Variable Names::
 336 * Datatype widths::
 337 * size_t::
 338 * Arrays vs Pointers::
 339 * Pointers::
 340 * Constness::
 341 * Pseudo-templates::
 342 * Arbitrary Constants::
 343 * Test suites::
 344 * Compilation::
 345 * Thread-safety::
 346 * Legal issues::
 347 * Non-UNIX portability::
 348 * Compatibility with other libraries::
 349 * Parallelism::
 350 * Precision::
 351 * Miscellaneous::
 352 @end menu
 353
 354 @node Language for implementation, Interface to other languages, Design, Design
 355 @section Language for implementation
 356
 357 @strong{One language only (C)}
 358
 359 Advantages: simpler, compiler available and quite universal.
 360
 361 @node Interface to other languages, What routines are implemented, Language for implementation, Design
 362 @section Interface to other languages
 363
 364 Wrapper packages are supplied as "extra" packages; not as part of the
 365 "core". They are maintained separately by independent contributors.
 366
  367 Use standard tools to make wrappers: swig, g-wrap.
 368
 369 @node What routines are implemented, What routines are not implemented, Interface to other languages, Design
 370 @section What routines are implemented
 371
 372 Anything which is in any of the existing libraries.  Obviously it makes
 373 sense to prioritize and write code for the most important areas first.
 374
 375 @c @itemize @bullet
 376 @c @item Random number generators
 377
 378 @c Includes both random number generators and routines to give various
 379 @c interesting distributions.
 380
 381 @c @item Statistics
 382
 383 @c @item Special Functions
 384
 385 @c What I (jt) envision for this section is a collection of routines for
 386 @c reliable and accurate (but not necessarily fast or efficient) estimation
 387 @c of values for special functions, explicitly using Taylor series, asymptotic
 388 @c expansions, continued fraction expansions, etc.  As well as these routines,
 389 @c fast approximations will also be provided, primarily based on Chebyschev
 390 @c polynomials and ratios of polynomials.  In this vision, the approximations
 391 @c will be the "standard" routines for the users, and the exact (so-called)
 392 @c routines will be used for verification of the approximations.  It may also
 393 @c be useful to provide various identity-checking routines as part of the
 394 @c verification suite.
 395
 396 @c @item Curve fitting
 397
 398 @c polynomial, special functions, spline
 399
 400 @c @item Ordinary differential equations
 401
 402 @c @item Partial differential equations
 403
 404 @c @item Fourier Analysis
 405
 406 @c @item Wavelets
 407
 408 @c @item Matrix operations: linear equations
 409
 410 @c @item Matrix operations: eigenvalues and spectral analysis
 411
 412 @c @item Matrix operations: any others?
 413
 414 @c @item Direct integration
 415
 416 @c @item Monte carlo methods
 417
 418 @c @item Simulated annealing
 419
 420 @c @item Genetic algorithms
 421
 422 @c We need to think about what kinds of algorithms are basic generally
 423 @c useful numerical algorithms, and which ones are special purpose
 424 @c research projects.  We should concentrate on supplying the former.
 425
 426 @c @item Cellular automata
 427
 428 @c @end itemize
 429
 430 @node What routines are not implemented, Design of  Numerical Libraries, What routines are implemented, Design
 431 @section What routines are not implemented
 432
 433 @itemize @bullet
 434 @item anything which already exists as a high-quality GPL'ed package.
 435
 436 @item anything which is too big
 437  -- i.e. an application in its own right rather than a subroutine
 438
 439 For example, partial differential equation solvers are often huge and
 440 very specialized applications (since there are so many types of PDEs,
 441 types of solution, types of grid, etc).  This sort of thing should
 442 remain separate.  It is better to point people to the good applications
 443 which exist.
 444
 445 @item anything which is independent and useful separately.
 446
 447 Arguably functions for manipulating date and time, or financial
 448 functions might be included in a "scientific" library.  However, these
 449 sorts of modules could equally well be used independently in other
 450 programs, so it makes sense for them to be separate libraries.
 451 @end itemize
 452
 453 @node  Design of  Numerical Libraries, Code Reuse, What routines are not implemented, Design
 454 @section  Design of  Numerical Libraries
 455
  456 In writing a numerical library there is an unavoidable conflict between
 457 completeness and simplicity.  Completeness refers to the ability to
 458 perform operations on different objects so that the group is
 459 "closed". In mathematics objects can be combined and operated on in an
 460 infinite number of ways.  For example, I can take the derivative of a
 461 scalar field with respect to a vector and the derivative of a vector
  462 field with respect to a scalar (along a path).
 463
 464 There is a definite tendency to unconsciously try to reproduce all these
 465 possibilities in a numerical library, by adding new features one by
 466 one. After all, it is always easy enough to support just one more
  467 feature, so why not?
 468
 469 Looking at the big picture, no-one would start out by saying "I want to
 470 be able to represent every possible mathematical object and operation
 471 using C structs" -- this is a strategy which is doomed to fail.  There
 472 is a limited amount of complexity which can be represented in a
 473 programming language like C.  Attempts to reproduce the complexity of
 474 mathematics within such a language would just lead to a morass of
 475 unmaintainable code.  However, it's easy to go down that road if you
 476 don't think about it ahead of time.
 477
 478 It is better to choose simplicity over completeness.  In designing new
 479 parts of the library keep modules independent where possible. If
 480 interdependencies between modules are introduced be sure about where you
 481 are going to draw the line.
 482
 483 @node Code Reuse, Standards and conventions, Design of  Numerical Libraries, Design
 484 @section Code Reuse
 485
 486 It is useful if people can grab a single source file and include it in
 487 their own programs without needing the whole library.  Try to allow
 488 standalone files like this whenever it is reasonable.  Obviously the
 489 user might need to define a few macros, such as GSL_ERROR, to compile
 490 the file but that is ok.  Examples where this can be done: grabbing a
 491 single random number generator.
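
As a minimal sketch of what this can look like, the file below compiles on
its own because it supplies a fallback for @code{GSL_ERROR} when the macro is
not already defined.  The fallback definition, the error codes and the
generator @code{demo_rng_next} are all assumptions made for the illustration;
they are not taken from the GSL sources.

```c
#include <stdio.h>

/* Stand-in for the library's error macro, used only when the file is
   compiled outside GSL.  This definition is an assumption for the sketch:
   it reports the problem on stderr and returns the error code. */
#ifndef GSL_ERROR
#define GSL_ERROR(reason, code)                                     \
  do {                                                              \
    fprintf (stderr, "%s:%d: %s\n", __FILE__, __LINE__, (reason));  \
    return (code);                                                  \
  } while (0)
#endif

enum { DEMO_SUCCESS = 0, DEMO_EINVAL = 1 };   /* hypothetical codes */

/* Hypothetical standalone generator: one step of a 32-bit linear
   congruential generator, x -> 1664525 x + 1013904223 (mod 2^32). */
int
demo_rng_next (unsigned long *state, unsigned long *result)
{
  if (state == NULL || result == NULL)
    GSL_ERROR ("null pointer passed to demo_rng_next", DEMO_EINVAL);

  *state = (1664525UL * (*state) + 1013904223UL) & 0xffffffffUL;
  *result = *state;
  return DEMO_SUCCESS;
}
```

A user who copies such a file into a program only has to decide what
@code{GSL_ERROR} should do there; the numerical code itself is unchanged.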
 492
 493 @node Standards and conventions, Background and Preparation, Code Reuse, Design
 494 @section Standards and conventions
 495
 496 The people who kick off this project should set the coding standards and
  497 conventions.  In order of precedence, the standards that we follow are:
 498
 499 @itemize @bullet
 500 @item We follow the GNU Coding Standards.
 501 @item We follow the conventions of the ANSI Standard C Library.
 502 @item We follow the conventions of the GNU C Library.
 503 @item We follow the conventions of the glib GTK support Library.
 504 @end itemize
 505
 506 The references for these standards are the @cite{GNU Coding Standards}
 507 document, Harbison and Steele @cite{C: A Reference Manual}, the
 508 @cite{GNU C Library Manual} (version 2), and the Glib source code.
 509
 510 For mathematical formulas, always follow the conventions in Abramowitz &
 511 Stegun, the @cite{Handbook of Mathematical Functions}, since it is the
 512 definitive reference and also in the public domain.
 513
 514 If the project has a philosophy it is to "Think in C".  Since we are
 515 working in C we should only do what is natural in C, rather than trying
 516 to simulate features of other languages.  If there is something which is
 517 unnatural in C and has to be simulated then we avoid using it. If this
 518 means leaving something out of the library, or only offering a limited
 519 version then so be it.  It is not worthwhile making the library
 520 over-complicated.  There are numerical libraries in other languages, and
 521 if people need the features of those languages it would be sensible for
 522 them to use the corresponding libraries, rather than coercing a C
 523 library into doing that job.
 524
  525 It should be borne in mind at all times that C is a macro-assembler.  If
 526 you are in doubt about something being too complicated ask yourself the
 527 question "Would I try to write this in macro-assembler?" If the answer
 528 is obviously "No" then do not try to include it in GSL. [BJG]
 529
 530 It will be useful to read the following papers,
 531
 532 @itemize @asis
 533 @item
 534 Kiem-Phong Vo, ``The Discipline and Method Architecture for Reusable
 535 Libraries'', Software - Practice & Experience, v.30, pp.107-128, 2000.
 536 @end itemize
 537
 538 @noindent
 539 It is available from
 540 @uref{http://www.research.att.com/sw/tools/sfio/dm-spe.ps} or the earlier
 541 technical report Kiem-Phong Vo, "An Architecture for Reusable Libraries"
 542 @uref{http://citeseer.nj.nec.com/48973.html}.
 543
 544 There are associated papers on Vmalloc, SFIO, and CDT which are also
 545 relevant to the design of portable C libraries.
 546
 547 @itemize @asis
 548 @item
 549 Kiem-Phong Vo, ``Vmalloc: A General and Efficient Memory
 550 Allocator''. Software Practice & Experience, 26:1--18, 1996.
 551
 552 @uref{http://www.research.att.com/sw/tools/vmalloc/vmalloc.ps}
 553 @item
 554 Kiem-Phong Vo. ``Cdt: A Container Data Type Library''. Soft. Prac. &
 555 Exp., 27:1177--1197, 1997
 556
 557 @uref{http://www.research.att.com/sw/tools/cdt/cdt.ps}
 558 @item
 559 David G. Korn and Kiem-Phong Vo, ``Sfio: Safe/Fast String/File IO'',
 560 Proceedings of the Summer '91 Usenix Conference, pp.  235-256, 1991.
 561
 562 @uref{http://citeseer.nj.nec.com/korn91sfio.html}
 563 @end itemize
 564
 565 Source code should be indented according to the GNU Coding Standards,
 566 with spaces not tabs. For example, by using the @code{indent} command:
 567
 568 @example
 569 indent -gnu -nut *.c *.h
 570 @end example
 571
 572 @noindent
 573 The @code{-nut} option converts tabs into spaces.
 574
 575 @node Background and Preparation, Choice of Algorithms, Standards and conventions, Design
 576 @section Background and Preparation
 577
 578 Before implementing something be sure to research the subject
  579 thoroughly!  This will save a lot of time in the long run.  The three most
  580 important steps are:
 581
 582 @enumerate
 583 @item
 584 to determine whether there is already a free library (GPL or
 585 GPL-compatible) which does the job.  If so, there is no need to
  586 reimplement it.  Carry out a search on Netlib, GAMS, na-net,
  587 sci.math.num-analysis and the web in general. This should also provide
  588 you with a list of existing proprietary libraries which are relevant;
  589 keep a note of these for future reference in step 2.
 590
 591 @item
 592 make a comparative survey of existing implementations in the
 593 commercial/free libraries. Examine the typical APIs, methods of
 594 communication between program and subroutine, and classify them so that
 595 you are familiar with the key concepts or features that an
 596 implementation may or may not have, depending on the relevant tradeoffs
 597 chosen.  Be sure to review the documentation of existing libraries for
 598 useful references.
 599
 600 @item
 601 read up on the subject and determine the state-of-the-art.  Find the
 602 latest review papers.  A search of the following journals should be
 603 undertaken.
 604
 605 @itemize @asis
 606 @item ACM Transactions on Mathematical Software
 607 @item Numerische Mathematik
  608 @item Journal of Computational and Applied Mathematics
  609 @item Computer Physics Communications
  610 @item SIAM Journal on Numerical Analysis
  611 @item SIAM Journal on Scientific Computing
 612 @end itemize
 613 @end enumerate
 614
 615 @noindent
 616 Keep in mind that GSL is not a research project.  Making a good
 617 implementation is difficult enough, without also needing to invent new
 618 algorithms.  We want to implement existing algorithms whenever
 619 possible. Making minor improvements is ok, but don't let it be a
 620 time-sink.
 621
 622 @node Choice of Algorithms, Documentation, Background and Preparation, Design
 623 @section Choice of Algorithms
 624
 625 Whenever possible choose algorithms which scale well and always remember
 626 to handle asymptotic cases.  This is particularly relevant for functions
 627 with integer arguments.  It is tempting to implement these using the
 628 simple @math{O(n)} algorithms used to define the functions, such as the
 629 many recurrence relations found in Abramowitz and Stegun.  While such
 630 methods might be acceptable for @math{n=O(10-100)} they will not be
 631 satisfactory for a user who needs to compute the same function for
 632 @math{n=1000000}.
 633
 634 Similarly, do not make the implicit assumption that multivariate data
 635 has been scaled to have components of the same size or O(1).  Algorithms
 636 should take care of any necessary scaling or balancing internally, and
 637 use appropriate norms (e.g. |Dx| where D is a diagonal scaling matrix,
 638 rather than |x|).
 639
 640 @node Documentation, Namespace, Choice of Algorithms, Design
 641 @section Documentation
 642 Documentation: the project leaders should give examples of how things
 643 are to be documented.  High quality documentation is absolutely
 644 mandatory, so documentation should introduce the topic, and give careful
 645 reference for the provided functions. The priority is to provide
 646 reference documentation for each function. It is not necessary to
 647 provide tutorial documentation.
 648
 649 Use free software, such as GNU Plotutils, to produce the graphs in the
 650 manual.
 651
 652 Some of the graphs have been made with gnuplot which is not truly free
 653 (or GNU) software, and some have been made with proprietary
 654 programs. These should be replaced with output from GNU plotutils.
 655
 656 When citing references be sure to use the standard, definitive and
 657 best reference books in the field, rather than lesser known text-books
 658 or introductory books which happen to be available (e.g. from
 659 undergraduate studies).  For example, references concerning algorithms
 660 should be to Knuth, references concerning statistics should be to
 661 Kendall & Stuart, references concerning special functions should be to
 662 Abramowitz & Stegun (Handbook of Mathematical Functions AMS-55), etc.
  663 Wherever possible refer to Abramowitz & Stegun rather than other
 664 reference books because it is a public domain work, so it is
 665 inexpensive and freely redistributable.
 666
 667 The standard references have a better chance of being available in an
 668 accessible library for the user.  If they are not available and the user
 669 decides to buy a copy in order to look up the reference then this also
 670 gives them the best quality book which should also cover the largest
 671 number of other references in the GSL Manual.  If many different books
 672 were to be referenced this would be an expensive and inefficient use of
 673 resources for a user who needs to look up the details of the algorithms.
 674 Reference books also stay in print much longer than text books, which
 675 are often out-of-print after a few years.
 676
 677 Similarly, cite original papers wherever possible.  Be sure to keep
 678 copies of these for your own reference (e.g. when dealing with bug
 679 reports) or to pass on to future maintainers.
 680
 681 If you need help in tracking down references, ask on the
 682 @code{gsl-discuss} mailing list.  There is a group of volunteers with
 683 access to good libraries who have offered to help GSL developers get
 684 copies of papers.
 685
 686 @c [JT section: written by James Theiler
 687
 688 @c And we furthermore promise to try as hard as possible to document
 689 @c the software: this will ideally involve discussion of why you might want
 690 @c to use it, what precisely it does, how precisely to invoke it,
 691 @c how more-or-less it works, and where we learned about the algorithm,
 692 @c and (unless we wrote it from scratch) where we got the code.
 693 @c We do not plan to write this entire package from scratch, but to cannibalize
 694 @c existing mathematical freeware, just as we expect our own software to
 695 @c be cannibalized.]
 696
 697 To write mathematics in the texinfo file you can use the @code{@@math}
 698 command with @emph{simple} TeX commands. These are automatically
 699 surrounded by @code{$...$} for math mode. For example,
 700
 701 @example
 702 to calculate the coefficient @@math@{\alpha@} use the function...
 703 @end example
 704
 705 @noindent
 706 will be correctly formatted in both online and TeX versions of the
 707 documentation.
 708
 709 Note that you cannot use the special characters @{ and @}
 710 inside the @code{@@math} command because these conflict between TeX
 711 and Texinfo.  This is a problem if you want to write something like
 712 @code{\sqrt@{x+y@}}.
 713
  714 To work around it you can precede the math command with a special
 715 macro @code{@@c} which contains the explicit TeX commands you want to
 716 use (no restrictions), and put an ASCII approximation into the
 717 @code{@@math} command (you can write @code{@@@{} and
 718 @code{@@@}} there for the left and right braces).  The explicit TeX
  719 commands are used in the TeX output and the argument of @code{@@math}
 720 in the plain info output.
 721
 722 Note that the @code{@@c@{@}} macro must go at the end of the
  723 preceding line, because everything else after it is ignored---as far
 724 as texinfo is concerned it's actually a 'comment'. The comment
 725 command @@c has been modified to capture a TeX expression which is
 726 output by the next @@math command. For ordinary comments use the @@comment
 727 command.
 728
 729 For example,
 730
 731 @example
 732 this is a test @@c@{$\sqrt@{x+y@}$@}
 733 @@math@{\sqrt@@@{x+y@@@}@}
 734 @end example
 735
 736 @noindent
 737 is equivalent to @code{this is a test $\sqrt@{x+y@}$} in plain TeX
 738 and @code{this is a test @@math@{\sqrt@@@{x+y@@@}@}} in Info.
 739
It looks nicer if some of the more cryptic TeX commands are given
a C-style ASCII version, e.g.
 742
 743 @example
 744 for @@c@{$x \ge y$@}
 745 @@math@{x >= y@}
 746 @end example
 747
 748 @noindent
 749 will be appropriately displayed in both TeX and Info.
 750
 751
 752 @node Namespace, Header files, Documentation, Design
 753 @section Namespace
 754
 755 Use @code{gsl_} as a prefix for all exported functions and variables.
 756
 757 Use @code{GSL_} as a prefix for all exported macros.
 758
 759 All exported header files should have a filename with the prefix @code{gsl_}.
 760
All installed libraries should have a name like @file{libgslhistogram.a}.
 762
 763 Any installed executables (utility programs etc) should have the prefix
 764 @code{gsl-} (with a hyphen, not an underscore).
 765
 766 All function names, variables, etc should be in lower case.  Macros and
 767 preprocessor variables should be in upper case.
 768
 769 Some common conventions in variable and function names:
 770
 771 @table @code
 772 @item p1
 773 plus 1, e.g. function @code{log1p(x)} or a variable like @code{kp1}, @math{=k+1}.
 774
 775 @item m1
 776 minus 1, e.g. function @code{expm1(x)} or a variable like @code{km1}, @math{=k-1}.
 777 @end table
 778
 779 @node Header files, Target system, Namespace, Design
 780 @section Header files
 781
Installed header files should be idempotent, i.e. safe to include more
than once.  Surround their contents with preprocessor conditionals like
the following,
 784
 785 @example
 786 #ifndef __GSL_HISTOGRAM_H__
 787 #define __GSL_HISTOGRAM_H__
 788 ...
 789 #endif /* __GSL_HISTOGRAM_H__ */
 790 @end example
 791
 792 @node Target system, Function Names, Header files, Design
 793 @section Target system
 794
 795 The target system is ANSI C, with a full Standard C Library, and IEEE
 796 arithmetic.
 797
 798 @node Function Names, Object-orientation, Target system, Design
 799 @section Function Names
 800
Each module has a name, which prefixes any function names in that
module, e.g. the module @code{gsl_fft} has function names like
@code{gsl_fft_init}.  The modules correspond to subdirectories of the
library source tree.
 805
 806 @node Object-orientation, Comments, Function Names, Design
 807 @section Object-orientation
 808
 809 The algorithms should be object oriented, but only to the extent that is
 810 easy in portable ANSI C.  The use of casting or other tricks to simulate
 811 inheritance is not desirable, and the user should not have to be aware
 812 of anything like that.  This means many types of patterns are ruled
 813 out.  However, this is not considered a problem -- they are too
 814 complicated for the library.
 815
 816 Note: it is possible to define an abstract base class easily in C, using
 817 function pointers.  See the rng directory for an example.
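
As a rough illustration of the note above, an abstract base class can be
sketched as a struct of function pointers plus a pointer to the
parameters of the concrete type.  All names below (@code{shape_type},
@code{shape}, @code{shape_area} and so on) are hypothetical and are not
taken from the rng directory; this is only a minimal sketch of the
technique, assuming each concrete type supplies its own implementation.

```c
/* Hypothetical "abstract base class": a table of function pointers
   that every concrete type must fill in. */
typedef struct
{
  const char *name;
  double (*area) (const void *params);
}
shape_type;

/* An instance pairs a type with its own parameters. */
typedef struct
{
  const shape_type *type;
  const void *params;
}
shape;

typedef struct { double w, h; } rect_params;
typedef struct { double side; } square_params;

static double
rect_area (const void *vparams)
{
  const rect_params *p = (const rect_params *) vparams;
  return p->w * p->h;
}

static double
square_area (const void *vparams)
{
  const square_params *p = (const square_params *) vparams;
  return p->side * p->side;
}

static const shape_type rect_type = { "rect", rect_area };
static const shape_type square_type = { "square", square_area };

/* Generic code calls through the base-class interface only. */
double
shape_area (const shape * s)
{
  return (s->type->area) (s->params);
}
```

The caller only ever sees @code{shape} and @code{shape_area}; the
casts are confined to the implementation of each concrete type.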
 818
When reimplementing public domain Fortran code, please try to introduce
the appropriate object concepts as structs, rather than translating the
code literally in terms of arrays.  The structs can be useful just
within the file; you don't need to export them to the user.

For example, if a Fortran program repeatedly uses a subroutine like,
 825
 826 @example
 827 SUBROUTINE  RESIZE (X, K, ND, K1)
 828 @end example
 829
 830 @noindent
where X(K,ND) represents a grid to be resized to X(K1,ND) you can make
this more readable by introducing a struct,
 833
 834 @smallexample
 835 struct grid @{
 836     int nd;  /* number of dimensions */
 837     int k;   /* number of bins */
 838     double * x;   /* partition of axes, array of size x[k][nd] */
@};
 840
 841 void
 842 resize_grid (struct grid * g, int k_new)
 843 @{
 844 ...
 845 @}
 846 @end smallexample
 847
 848 @noindent
 849 Similarly, if you have a frequently recurring code fragment within a
 850 single file you can define a static or static inline function for it.
 851 This is typesafe and saves writing out everything in full.
 852
 853
 854 @node Comments, Minimal structs, Object-orientation, Design
 855 @section Comments
 856
 857 Follow the GNU Coding Standards.  A relevant quote is,
 858
 859 ``Please write complete sentences and capitalize the first word.  If a
 860 lower-case identifier comes at the beginning of a sentence, don't
 861 capitalize it!  Changing the spelling makes it a different identifier.
 862 If you don't like starting a sentence with a lower case letter, write
 863 the sentence differently (e.g., "The identifier lower-case is ...").''
 864
 865 @node Minimal structs, Algorithm decomposition, Comments, Design
 866 @section Minimal structs
 867
We prefer to make structs which are @dfn{minimal}.  For example, if a
certain type of problem can be solved by several classes of algorithm
(e.g. with and without derivative information) it is better to make
separate types of struct to handle those cases.  In other words,
run-time type identification is not desirable.
 873
 874 @node Algorithm decomposition, Memory allocation and ownership, Minimal structs, Design
 875 @section Algorithm decomposition
 876
 877 Iterative algorithms should be decomposed into an INITIALIZE, ITERATE,
 878 TEST form, so that the user can control the progress of the iteration
 879 and print out intermediate results.  This is better than using
 880 call-backs or using flags to control whether the function prints out
 881 intermediate results.  In fact, call-backs should not be used -- if they
 882 seem necessary then it's a sign that the algorithm should be broken down
 883 further into individual components so that the user has complete control
 884 over them.
 885
 886 For example, when solving a differential equation the user may need to
 887 be able to advance the solution by individual steps, while tracking a
 888 realtime process.  This is only possible if the algorithm is broken down
 889 into step-level components.  Higher level decompositions would not give
 890 sufficient flexibility.
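
The INITIALIZE, ITERATE, TEST decomposition might look like the
following minimal sketch, which uses Newton's method for a square root.
The names @code{sqrt_solver}, @code{sqrt_solver_init},
@code{sqrt_solver_iterate} and @code{sqrt_solver_test} are hypothetical
illustrations and not part of the library.

```c
#include <math.h>

/* Hypothetical solver state for computing sqrt(a) by Newton's method. */
typedef struct
{
  double a;     /* value whose square root is wanted */
  double x;     /* current estimate */
  double dx;    /* size of the last step */
}
sqrt_solver;

/* INITIALIZE: set up the state from the problem and a starting guess. */
void
sqrt_solver_init (sqrt_solver * s, double a, double x0)
{
  s->a = a;
  s->x = x0;
  s->dx = HUGE_VAL;   /* no step taken yet, so the test cannot pass */
}

/* ITERATE: perform exactly one Newton step, x <- (x + a/x)/2. */
void
sqrt_solver_iterate (sqrt_solver * s)
{
  const double x_new = 0.5 * (s->x + s->a / s->x);
  s->dx = fabs (x_new - s->x);
  s->x = x_new;
}

/* TEST: return 1 if the last step is within the tolerances supplied
   by the caller, 0 otherwise. */
int
sqrt_solver_test (const sqrt_solver * s, double epsabs, double epsrel)
{
  return s->dx <= epsabs + epsrel * fabs (s->x);
}
```

The loop itself belongs to the caller, who calls
@code{sqrt_solver_iterate} repeatedly, can print @code{s.x} after each
step, and stops when @code{sqrt_solver_test} succeeds or an iteration
limit is reached.  No call-back or print flag is needed.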
 891
 892 @node Memory allocation and ownership, Memory layout, Algorithm decomposition, Design
 893 @section Memory allocation and ownership
 894
Functions which allocate memory on the heap should have names ending in
@code{_alloc} (e.g. @code{gsl_foo_alloc}), and the memory should be
released by a corresponding @code{_free} function (@code{gsl_foo_free}).
 898
 899 Be sure to free any memory allocated by your function if you have to
 900 return an error in a partially initialized object.
 901
 902 Don't allocate memory 'temporarily' inside a function and then free it
 903 before the function returns.  This prevents the user from controlling
 904 memory allocation.  All memory should be allocated and freed through
 905 separate functions and passed around as a "workspace" argument.  This
 906 allows memory allocation to be factored out of tight loops.
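
A minimal sketch of these rules, assuming a hypothetical module
@code{foo} whose routine needs scratch storage: the workspace is
allocated and freed by separate functions, the allocator cleans up after
itself on failure, and the computational routine allocates nothing.
None of these names exist in the library.

```c
#include <stdlib.h>

/* Hypothetical workspace holding scratch storage for n elements. */
typedef struct
{
  size_t n;
  double *scratch;
}
foo_workspace;

/* Allocate a workspace for n > 0 elements.
   Returns NULL on failure and leaks nothing. */
foo_workspace *
foo_workspace_alloc (size_t n)
{
  foo_workspace *w = malloc (sizeof (foo_workspace));

  if (w == NULL)
    {
      return NULL;
    }

  w->scratch = malloc (n * sizeof (double));

  if (w->scratch == NULL)
    {
      free (w);   /* do not leave a partially initialized object behind */
      return NULL;
    }

  w->n = n;
  return w;
}

void
foo_workspace_free (foo_workspace * w)
{
  free (w->scratch);
  free (w);
}

/* Replace each interior element by the mean of itself and its two
   neighbours, using w->scratch instead of temporary memory.
   Returns 0 on success, -1 if the workspace size does not match. */
int
foo_smooth (double *data, size_t n, foo_workspace * w)
{
  size_t i;

  if (w->n != n)
    {
      return -1;
    }

  for (i = 0; i < n; i++)
    {
      w->scratch[i] = data[i];
    }

  for (i = 1; i + 1 < n; i++)
    {
      data[i] = (w->scratch[i - 1] + w->scratch[i] + w->scratch[i + 1]) / 3.0;
    }

  return 0;
}
```

A caller that smooths many arrays of the same length allocates the
workspace once, outside the loop, and frees it once at the end.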
 907
 908 @node Memory layout, Linear Algebra Levels, Memory allocation and ownership, Design
 909 @section Memory layout
 910
We use flat blocks of memory to store matrices and vectors, not C-style
pointer-to-pointer arrays.  The matrices are stored in row-major order,
i.e. the column index (second index) moves contiguously through memory.
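
As a small sketch of what row-major storage in a flat block means,
element @math{(i,j)} lives at offset @code{i * tda + j}, where
@code{tda} is the number of elements between the starts of consecutive
rows.  The accessor names @code{flat_get} and @code{flat_set} are
hypothetical and used only for illustration.

```c
#include <stddef.h>

/* Hypothetical accessors for a row-major matrix stored in a flat
   block.  tda is the number of doubles between the starts of
   consecutive rows (tda >= number of columns). */
double
flat_get (const double *data, size_t tda, size_t i, size_t j)
{
  return data[i * tda + j];
}

void
flat_set (double *data, size_t tda, size_t i, size_t j, double x)
{
  data[i * tda + j] = x;
}
```

For a block holding 2 rows of 3 columns with @code{tda} equal to 3,
element @math{(1,0)} is the fourth element of the block.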
 914
 915 @node Linear Algebra Levels, Error estimates, Memory layout, Design
 916 @section Linear Algebra Levels
 917
 918 Functions using linear algebra are divided into two levels:
 919
 920 For purely "1d" functions we use the C-style arguments (double *,
 921 stride, size) so that it is simpler to use the functions in a normal C
 922 program, without needing to invoke all the gsl_vector machinery.
 923
 924 The philosophy here is to minimize the learning curve. If someone only
 925 needs to use one function, like an fft, they can do so without having
 926 to learn about gsl_vector.
 927
 928 This leads to the question of why we don't do the same for matrices.
 929 In that case the argument list gets too long and confusing, with
 930 (size1, size2, tda) for each matrix and potential ambiguities over row
 931 vs column ordering. In this case, it makes sense to use gsl_vector and
 932 gsl_matrix, which take care of this for the user.
 933
 934 So really the library has two levels -- a lower level based on C types
 935 for 1d operations, and a higher level based on gsl_matrix and
 936 gsl_vector for general linear algebra.
 937
 938 Of course, it would be possible to define a vector version of the
 939 lower level functions too. So far we have not done that because it was
 940 not essential -- it could be done but it is easy enough to get by
 941 using the C arguments, by typing v->data, v->stride, v->size instead.
 942 A gsl_vector version of low-level functions would mainly be a
 943 convenience.
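
To make the lower level concrete, here is a minimal sketch of a ``1d''
function taking (pointer, stride, size) arguments.  The function
@code{strided_sum} and the struct @code{my_vector} are hypothetical;
@code{my_vector} is only a stand-in with the three fields mentioned
above, not the library's own vector type.

```c
#include <stddef.h>

/* Lower level: plain C arguments (pointer, stride, size). */
double
strided_sum (const double *data, size_t stride, size_t n)
{
  double sum = 0.0;
  size_t i;

  for (i = 0; i < n; i++)
    {
      sum += data[i * stride];
    }

  return sum;
}

/* Hypothetical stand-in for a vector type with data, stride and
   size fields. */
typedef struct
{
  size_t size;
  size_t stride;
  double *data;
}
my_vector;
```

A user with a plain C array calls @code{strided_sum (a, 1, n)}, while a
user holding a vector @code{v} passes @code{v.data}, @code{v.stride}
and @code{v.size}.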
 944
 945 Please use BLAS routines internally within the library whenever possible
 946 for efficiency.
 947
 948 @node Error estimates, Exceptions and Error handling, Linear Algebra Levels, Design
 949 @section Error estimates
 950
In the special functions error bounds are given as twice the expected
``gaussian'' error, i.e. 2-sigma, so the result is inside the error
about 95% of the time.  People expect the true value to be within +/- the
quoted error (this wouldn't be the case 32% of the time for 1 sigma).
Obviously the errors are not gaussian but a factor of two works well
in practice.
 957
 958 @node Exceptions and Error handling, Persistence, Error estimates, Design
 959 @section Exceptions and Error handling
 960
 961 The basic error handling procedure is the return code (see gsl_errno.h
 962 for a list of allowed values).  Use the GSL_ERROR macro to mark an
 963 error.  The current definition of this macro is not ideal but it can be
 964 changed at compile time.
 965
 966 You should always use the GSL_ERROR macro to indicate an error, rather
 967 than just returning an error code. The macro allows the user to trap
 968 errors using the debugger (by setting a breakpoint on the function
 969 gsl_error).
 970
 971 The only circumstances where GSL_ERROR should not be used are where the
 972 return value is "indicative" rather than an error -- for example, the
 973 iterative routines use the return code to indicate the success or
 974 failure of an iteration. By the nature of an iterative algorithm
 975 "failure" (a return code of GSL_CONTINUE) is a normal occurrence and
 976 there is no need to use GSL_ERROR there.
 977
 978 Be sure to free any memory allocated by your function if you return an
 979 error (in particular for errors in partially initialized objects).
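
The idea of routing every error through one function can be sketched as
follows.  The names @code{my_error}, @code{MY_ERROR},
@code{MY_SUCCESS}, @code{MY_EDOM} and @code{my_reciprocal} are
hypothetical stand-ins chosen so that the example is self-contained;
they are not the library's definitions, and this sketch merely reports
the error and returns the code.

```c
#include <stdio.h>

enum { MY_SUCCESS = 0, MY_EDOM = 1 };

/* Counts reported errors, so the sketch can be checked. */
static int my_error_count = 0;

/* Single choke point for all errors: a debugger breakpoint set on
   this function catches every error raised through MY_ERROR. */
void
my_error (const char *reason, const char *file, int line, int my_errno)
{
  my_error_count++;
  fprintf (stderr, "%s:%d: error %d: %s\n", file, line, my_errno, reason);
}

/* Report the error and return the error code from the calling
   function. */
#define MY_ERROR(reason, my_errno) \
  do { \
    my_error (reason, __FILE__, __LINE__, my_errno); \
    return my_errno; \
  } while (0)

int
my_reciprocal (double x, double *result)
{
  if (x == 0.0)
    {
      MY_ERROR ("division by zero", MY_EDOM);
    }

  *result = 1.0 / x;
  return MY_SUCCESS;
}
```

Returning @code{MY_EDOM} directly would give the caller the same code,
but would bypass the single function on which a breakpoint can be set.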
 980
 981 @node Persistence, Using Return Values, Exceptions and Error handling, Design
 982 @section Persistence
 983
 984 If you make an object foo which uses blocks of memory (e.g. vector,
 985 matrix, histogram) you can provide functions for reading and writing
 986 those blocks,
 987
 988 @smallexample
 989 int gsl_foo_fread (FILE * stream, gsl_foo * v);
 990 int gsl_foo_fwrite (FILE * stream, const gsl_foo * v);
 991 int gsl_foo_fscanf (FILE * stream, gsl_foo * v);
 992 int gsl_foo_fprintf (FILE * stream, const gsl_foo * v, const char *format);
 993 @end smallexample
 994
 995 @noindent
 996 Only dump out the blocks of memory, not any associated parameters such
 997 as lengths.  The idea is for the user to build higher level input/output
 998 facilities using the functions the library provides.  The fprintf/fscanf
 999 versions should be portable between architectures, while the binary
1000 versions should be the "raw" version of the data. Use the functions
1001
1002 @smallexample
1003 int gsl_block_fread (FILE * stream, gsl_block * b);
1004 int gsl_block_fwrite (FILE * stream, const gsl_block * b);
1005 int gsl_block_fscanf (FILE * stream, gsl_block * b);
1006 int gsl_block_fprintf (FILE * stream, const gsl_block * b, const char *format);
1007 @end smallexample
1008
1009 @noindent
1010 or
1011
1012 @smallexample
int gsl_block_raw_fread (FILE * stream, double * b, size_t n, size_t stride);
int gsl_block_raw_fwrite (FILE * stream, const double * b, size_t n, size_t stride);
int gsl_block_raw_fscanf (FILE * stream, double * b, size_t n, size_t stride);
int gsl_block_raw_fprintf (FILE * stream, const double * b, size_t n, size_t stride, const char *format);
1019 @end smallexample
1020
1021 @noindent
1022 to do the actual reading and writing.
1023
1024 @node Using Return Values, Variable Names, Persistence, Design
1025 @section Using Return Values
1026
1027 Always assign a return value to a variable before using it.  This allows
1028 easier debugging of the function, and inspection and modification of the
1029 return value.  If the variable is only needed temporarily then enclose
1030 it in a suitable scope.
1031
1032 For example, instead of writing,
1033
1034 @example
a = f(g(h(x,y)));
1036 @end example
1037
1038 @noindent
1039 use temporary variables to store the intermediate values,
1040 @example
1041 @{
1042   double u = h(x,y);
1043   double v = g(u);
1044   a = f(v);
1045 @}
1046 @end example
1047
1048 @noindent
1049 These can then be inspected more easily in the debugger, and breakpoints
1050 can be placed more precisely.  The compiler will eliminate the temporary
1051 variables automatically when the program is compiled with optimization.
1052
1053 @node Variable Names, Datatype widths, Using Return Values, Design
1054 @section Variable Names
1055
1056 Try to follow existing conventions for variable names,
1057
1058 @table @code
1059 @item dim
1060 number of dimensions
1061 @item w
1062 pointer to workspace
1063 @item state
1064 pointer to state variable (use @code{s} if you need to save characters)
1065 @item result
1066 pointer to result (output variable)
1067 @item abserr
1068 absolute error
1069 @item relerr
1070 relative error
1071 @item epsabs
1072 absolute tolerance
1073 @item epsrel
1074 relative tolerance
1075 @item size
the size of an array or vector, e.g. @code{double array[size]}
1077 @item stride
1078 the stride of a vector
1079 @item size1
1080 the number of rows in a matrix
1081 @item size2
1082 the number of columns in a matrix
1083 @item n
1084 general integer number, e.g. number of elements of array, in fft, etc
1085 @item r
1086 random number generator (gsl_rng)
1087 @end table
1088
1089 @node Datatype widths, size_t, Variable Names, Design
1090 @section Datatype widths
1091
Be aware that in ANSI C the type @code{int} is only guaranteed to
provide 16 bits.  It may provide more, but is not guaranteed to.
1094 Therefore if you require 32 bits you must use @code{long int}, which
1095 will have 32 bits or more.  Of course, on many platforms the type
1096 @code{int} does have 32 bits instead of 16 bits but we have to code to
1097 the ANSI standard rather than a specific platform.
1098
1099 @node size_t, Arrays vs Pointers, Datatype widths, Design
1100 @section size_t
1101
1102 All objects (blocks of memory, etc) should be measured in terms of a
1103 @code{size_t} type.  Therefore any iterations (e.g. @code{for(i=0; i<N;
1104 i++)}) should also use an index of type @code{size_t}.
1105
1106 Don't mix @code{int} and @code{size_t}.  They are @emph{not}
1107 interchangeable.
1108
1109 If you need to write a descending loop you have to be careful because
1110 @code{size_t} is unsigned, so instead of
1111
1112 @example
1113 for (i = N - 1; i >= 0; i--) @{ ... @} /* DOESN'T WORK */
1114 @end example
1115
1116 @noindent
1117 use something like
1118
1119 @example
1120 for (i = N; i > 0 && i--;) @{ ... @}
1121 @end example
1122
1123 @noindent
1124 to avoid problems with wrap-around at @code{i=0}.
1125
1126 If you really want to avoid confusion use a separate variable to invert
1127 the loop order,
1128 @example
for (i = 0; i < N; i++) @{ j = N - 1 - i; ... @}
1130 @end example
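
The two safe forms of a descending loop can be sketched as complete
functions.  The names @code{reverse_copy} and @code{reverse_copy_alt}
are hypothetical; both visit the source indices from the last element
down to index 0 and do nothing when the length is zero.

```c
#include <stddef.h>

/* Copy src into dst in reverse order using the descending-loop
   idiom.  Inside the body i takes the values n-1, n-2, ..., 0. */
void
reverse_copy (double *dst, const double *src, size_t n)
{
  size_t i;
  size_t k = 0;

  for (i = n; i > 0 && i--;)
    {
      dst[k] = src[i];
      k++;
    }
}

/* Same result with an ascending loop and a separate inverted index. */
void
reverse_copy_alt (double *dst, const double *src, size_t n)
{
  size_t i;

  for (i = 0; i < n; i++)
    {
      const size_t j = n - 1 - i;   /* j runs n-1, n-2, ..., 0 */
      dst[i] = src[j];
    }
}
```

In the second form the subtraction is only evaluated when @code{i} is
less than @code{n}, so it never wraps around.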
1131
1132 @node Arrays vs Pointers, Pointers, size_t, Design
1133 @section Arrays vs Pointers
1134
1135 A function can be declared with either pointer arguments or array
1136 arguments.  The C standard considers these to be equivalent. However, it
1137 is useful to distinguish between the case of a pointer, representing a
1138 single object which is being modified, and an array which represents a
1139 set of objects with unit stride (that are modified or not depending on
1140 the presence of @code{const}).  For vectors, where the stride is not
1141 required to be unity, the pointer form is preferred.
1142
1143 @smallexample
1144 /* real value, set on output */
1145 int foo (double * x);
1146
1147 /* real vector, modified */
1148 int foo (double * x, size_t stride, size_t n);
1149
1150 /* constant real vector */
1151 int foo (const double * x, size_t stride, size_t n);
1152
1153 /* real array, modified */
1154 int bar (double x[], size_t n);
1155
1156 /* real array, not modified */
1157 int baz (const double x[], size_t n);
1158 @end smallexample
1159
1160 @node  Pointers, Constness, Arrays vs Pointers, Design
1161 @section Pointers
1162
1163 Avoid dereferencing pointers on the right-hand side of an expression where
1164 possible.  It's better to introduce a temporary variable.  This is
1165 easier for the compiler to optimise and also more readable since it
1166 avoids confusion between the use of @code{*} for multiplication and
1167 dereferencing.
1168
1169 @example
1170 while (fabs (f) < 0.5)
1171 @{
1172   *e = *e - 1;
1173   f  *= 2;
1174 @}
1175 @end example
1176
1177 @noindent
1178 is better written as,
1179
1180 @example
1181 @{
1182   int p = *e;
1183
1184   while (fabs(f) < 0.5)
1185     @{
1186      p--;
1187      f *= 2;
1188     @}
1189
1190   *e = p;
1191 @}
1192 @end example
1193
1194 @node Constness, Pseudo-templates, Pointers, Design
1195 @section Constness
1196
1197 Use @code{const} in function prototypes wherever an object pointed to by
1198 a pointer is constant (obviously).  For variables which are meaningfully
1199 constant within a function/scope use @code{const} also.  This prevents
1200 you from accidentally modifying a variable which should be constant
1201 (e.g. length of an array, etc).  It can also help the compiler do
1202 optimization.  These comments also apply to arguments passed by value
1203 which should be made @code{const} when that is meaningful.
1204
1205 @node Pseudo-templates, Arbitrary Constants, Constness, Design
1206 @section Pseudo-templates
1207
There are some pseudo-template macros available in @file{templates_on.h}
and @file{templates_off.h}.  See a directory like @file{block} for
details on how to use them.  Use them sparingly: they are a bit of a
nightmare, but unavoidable in places.
1212
In particular, the convention is: templates are used for operations on
"data" only (vectors, matrices, statistics, sorting).  This is intended
to cover the case where the program must interface with an external
data-source which produces a fixed type, e.g. a big array of
@code{char}s produced by an 8-bit counter.
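
The general idea of generating one function per data type from a single
piece of source can be sketched with an ordinary macro.  This is an
assumption-laden illustration only: @code{DEFINE_SUM},
@code{sum_double} and @code{sum_uchar} are hypothetical names, and the
mechanism shown here is a simplified stand-in, not the one implemented
by the template header files.

```c
#include <stddef.h>

/* Hypothetical macro: expands to a definition of NAME, which sums n
   elements of type TYPE with the given stride and accumulates the
   result in type ACC. */
#define DEFINE_SUM(NAME, TYPE, ACC)                         \
  ACC                                                       \
  NAME (const TYPE *data, size_t stride, size_t n)          \
  {                                                         \
    ACC sum = 0;                                            \
    size_t i;                                               \
    for (i = 0; i < n; i++)                                 \
      {                                                     \
        sum += data[i * stride];                            \
      }                                                     \
    return sum;                                             \
  }

/* One expansion per data type that must be supported. */
DEFINE_SUM (sum_double, double, double)
DEFINE_SUM (sum_uchar, unsigned char, unsigned long)
```

The @code{unsigned char} version is the kind of function a user with a
big array of 8-bit counter values would call.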
1218
1219 All other functions can use double, for floating point, or the
1220 appropriate integer type for integers (e.g. unsigned long int for random
1221 numbers).  It is not the intention to provide a fully templated version
1222 of the library.
1223
1224 That would be "putting a quart into a pint pot". To summarize, almost
1225 everything should be in a "natural type" which is appropriate for
1226 typical usage, and templates are there to handle a few cases where it is
1227 unavoidable that other data-types will be encountered.
1228
1229 For floating point work "double" is considered a "natural type".  This
1230 sort of idea is a part of the C language.
1231
1232 @node  Arbitrary Constants, Test suites, Pseudo-templates, Design
1233 @section Arbitrary Constants
1234
1235 Avoid arbitrary constants.
1236
1237 For example, don't hard code "small" values like '1e-30', '1e-100' or
1238 @code{10*GSL_DBL_EPSILON} into the routines.  This is not appropriate
1239 for a general purpose library.
1240
1241 Compute values accurately using IEEE arithmetic.  If errors are
1242 potentially significant then error terms should be estimated reliably
1243 and returned to the user, by analytically deriving an error propagation
1244 formula, not using guesswork.
1245
A careful consideration of the algorithm usually shows that an arbitrary
constant is either unnecessary or represents an important parameter
which should be accessible to the user.
1249
1250 For example, consider the following code:
1251
1252 @example
1253 if (residual < 1e-30) @{
1254    return 0.0;  /* residual is zero within round-off error */
1255 @}
1256 @end example
1257
1258 @noindent
1259 This should be rewritten as,
1260
1261 @example
1262    return residual;
1263 @end example
1264
1265 @noindent
1266 in order to allow the user to determine whether the residual is
1267 significant or not.
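
A slightly fuller sketch of the same point, using hypothetical names
(@code{fit_residual}, @code{residual_is_negligible}): the routine
returns the residual exactly as computed, and any tolerance is a
parameter supplied by the caller.

```c
#include <stddef.h>

/* Hypothetical routine: sum of squared residuals y[i] - (c0 + c1 x[i]).
   The value is returned as computed; no built-in threshold decides
   whether it counts as zero. */
double
fit_residual (const double *x, const double *y, size_t n,
              double c0, double c1)
{
  double sumsq = 0.0;
  size_t i;

  for (i = 0; i < n; i++)
    {
      const double d = y[i] - (c0 + c1 * x[i]);
      sumsq += d * d;
    }

  return sumsq;
}

/* The tolerance is chosen by the caller, not hidden in the library. */
int
residual_is_negligible (double residual, double tol)
{
  return residual <= tol;
}
```

Two users with different accuracy requirements can then apply different
tolerances to the same returned value.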
1268
The only place where it is acceptable to use constants like
@code{GSL_DBL_EPSILON} is in function approximations (e.g. Taylor
series, asymptotic expansions, etc).  In these cases it is not an
arbitrary constant, but an inherent part of the algorithm.
1273
1274 @node Test suites, Compilation, Arbitrary Constants, Design
1275 @section Test suites
1276
1277 The implementor of each module should provide a reasonable test suite
1278 for the routines.
1279
1280 The test suite should be a program that uses the library and checks the
1281 result against known results, or invokes the library several times and
1282 does a statistical analysis on the results (for example in the case of
1283 random number generators).
1284
1285 Ideally the one test program per directory should aim for 100% path
1286 coverage of the code.  Obviously it would be a lot of work to really
1287 achieve this, so prioritize testing on the critical parts and use
1288 inspection for the rest.  Test all the error conditions by explicitly
1289 provoking them, because we consider it a serious defect if the function
1290 does not return an error for an invalid parameter. N.B. Don't bother to
1291 test for null pointers -- it's sufficient for the library to segfault if
1292 the user provides an invalid pointer.
1293
1294 The tests should be deterministic.  Use the @code{gsl_test} functions
1295 provided to perform separate tests for each feature with a separate
1296 output PASS/FAIL line, so that any failure can be uniquely identified.
1297
Use realistic test cases with `high entropy'.  Tests on simple values
such as 1 or 0 may not reveal bugs.  For example, a test using a value
of @math{x=1} will not pick up a missing factor of @math{x} in the code.
Similarly, a test using a value of @math{x=0} will not pick up any missing
terms involving @math{x} in the code.  Use values like @math{2.385} to
avoid silent failures.
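
The effect is easy to demonstrate with a deliberately broken routine.
In this sketch @code{f_correct} and @code{f_buggy} are hypothetical
functions invented for the illustration; the buggy one has lost a factor
of @math{x} in its first term.

```c
/* Correct routine: f(x) = x^2 + x. */
double
f_correct (double x)
{
  return x * x + x;
}

/* Deliberately buggy variant: the first term is missing a factor
   of x, so it computes x + x. */
double
f_buggy (double x)
{
  return x + x;
}
```

Both functions agree at @math{x=0} and @math{x=1}, so a test suite
restricted to those values would pass the buggy version.  At
@math{x=2.385} the results differ and the bug is caught.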
1304
1305 If your test uses multiple values make sure there are no simple
1306 relations between them that could allow bugs to be missed through silent
1307 cancellations.
1308
1309 If you need some random floats to put in the test programs use @code{od -f
1310 /dev/random} as a source of inspiration.
1311
Don't use @code{sprintf} to create output strings in the tests.  It can
cause hard-to-find bugs in the test programs themselves.  The functions
@code{gsl_test_...} support format string arguments, so use these
instead.
1316
1317 @node Compilation, Thread-safety, Test suites, Design
1318 @section Compilation
1319
1320 Make sure everything compiles cleanly.  Use the strict compilation
1321 options for extra checking.
1322
1323 @smallexample
1324 make CFLAGS="-ansi -pedantic -Werror -W -Wall -Wtraditional -Wconversion
1325   -Wshadow -Wpointer-arith -Wcast-qual -Wcast-align -Wwrite-strings
1326   -Wstrict-prototypes -fshort-enums -fno-common -Wmissing-prototypes
1327   -Wnested-externs -Dinline= -g -O4"
1328 @end smallexample
1329
1330 @noindent
1331 Also use @code{checkergcc} to check for memory problems on the stack and
1332 the heap.  It's the best memory checking tool.  If checkergcc isn't
1333 available then Electric Fence will check the heap, which is better than
1334 no checking.
1335
1336 There is a new tool @code{valgrind} for checking memory access.  Test
1337 the code with this as well.
1338
1339 Make sure that the library will also compile with C++ compilers
1340 (g++).  This should not be too much of a problem if you have been writing
1341 in ANSI C.
1342
1343 @node Thread-safety, Legal issues, Compilation, Design
1344 @section Thread-safety
1345
1346 The library should be usable in thread-safe programs.  All the functions
1347 should be thread-safe, in the sense that they shouldn't use static
1348 variables.
1349
1350 We don't require everything to be completely thread safe, but anything
1351 that isn't should be obvious.  For example, some global variables are
1352 used to control the overall behavior of the library (range-checking
1353 on/off, function to call on fatal error, etc).  Since these are accessed
1354 directly by the user it is obvious to the multi-threaded programmer that
1355 they shouldn't be modified by different threads.
1356
1357 There is no need to provide any explicit support for threads
1358 (e.g. locking mechanisms etc), just to avoid anything which would make
1359 it impossible for someone to call a GSL routine from a multithreaded
1360 program.
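
As an illustration of avoiding static variables, the following sketch
keeps all of a generator's state in a struct owned by the caller.  The
names @code{lcg_state}, @code{lcg_set} and @code{lcg_get} are
hypothetical, and the recurrence is the simple linear congruential
example commonly quoted for @code{rand} in C references, used here only
because it is short.

```c
/* Hypothetical generator state owned by the caller.  Because there
   are no static variables, separate threads can each use their own
   state without interfering. */
typedef struct
{
  unsigned long x;
}
lcg_state;

void
lcg_set (lcg_state * s, unsigned long seed)
{
  s->x = seed & 0x7fffffffUL;
}

/* One step of x <- (1103515245 x + 12345) mod 2^31. */
unsigned long
lcg_get (lcg_state * s)
{
  s->x = (1103515245UL * s->x + 12345UL) & 0x7fffffffUL;
  return s->x;
}
```

Two states seeded identically produce identical sequences regardless of
how calls on them are interleaved, which would not hold if the state
were a single static variable shared by all callers.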
1361
1362
1363 @node Legal issues, Non-UNIX portability, Thread-safety, Design
1364 @section Legal issues
1365
1366 @itemize @bullet
1367 @item
1368 Each contributor must make sure her code is under the GNU General Public
1369 License (GPL).   This means getting a disclaimer from your employer.
1370 @item
1371 We must clearly understand ownership of existing code and algorithms.
1372 @item
1373 Each contributor can retain ownership of their code, or sign it over to
1374 FSF as they prefer.
1375
1376 There is a standard disclaimer in the GPL (take a look at it).  The more
1377 specific you make your disclaimer the more likely it is that it will be
1378 accepted by an employer.  For example,
1379
1380 @smallexample
1381 Yoyodyne, Inc., hereby disclaims all copyright interest in the software
1382 `GNU Scientific Library - Legendre Functions' (routines for computing
1383 legendre functions numerically in C) written by James Hacker.
1384
1385 <signature of Ty Coon>, 1 April 1989
1386 Ty Coon, President of Vice
1387 @end smallexample
1388
1389 @item
1390 Obviously: don't use or translate non-free code.
1391
1392 In particular don't copy or translate code from @cite{Numerical Recipes}
1393 or @cite{ACM TOMS}.
1394
1395 Numerical Recipes is under a strict license and is not free software.
1396 The publishers Cambridge University Press claim copyright on all aspects
1397 of the book and the code, including function names, variable names and
1398 ordering of mathematical subexpressions.  Routines in GSL should not
1399 refer to Numerical Recipes or be based on it in any way.
1400
1401 The ACM algorithms published in TOMS (Transactions on Mathematical
1402 Software) are not public domain, even though they are distributed on the
1403 internet -- the ACM uses a special non-commercial license which is not
1404 compatible with the GPL. The details of this license can be found on the
1405 cover page of ACM Transactions on Mathematical Software or on the ACM
1406 Website.
1407
1408 Only use code which is explicitly under a free license: GPL or Public
1409 Domain.  If there is no license on the code then this does not mean it
1410 is public domain -- an explicit statement is required. If in doubt check
1411 with the author.
1412
1413 @item
1414 I @strong{think} one can reference algorithms from classic books on
1415 numerical analysis (BJG: yes, provided the code is an independent
1416 implementation and not copied from any existing software.  For
1417 example, it would be ok to read the papers in ACM TOMS and make an
1418 independent implementation from their description).
1419 @end itemize
1420
1421 @node Non-UNIX portability, Compatibility with other libraries, Legal issues, Design
1422 @section Non-UNIX portability
1423
There is good reason to make this library work on non-UNIX systems.  It
is probably safe to ignore DOS and only worry about Windows 95/Windows NT
portability (so filenames can be long, I think).
1427
1428 On the other hand, nobody should be forced to use non-UNIX systems for
1429 development.
1430
1431 The best solution is probably to issue guidelines for portability, like
1432 saying "don't use XYZ unless you absolutely have to".  Then the Windows
1433 people will be able to do their porting.
1434
1435 @node Compatibility with other libraries, Parallelism, Non-UNIX portability, Design
1436 @section Compatibility with other libraries
1437
1438 We do not regard compatibility with other numerical libraries as a
1439 priority.
1440
1441 However, other libraries, such as Numerical Recipes, are widely used.
1442 If somebody writes the code to allow drop-in replacement of these
1443 libraries it would be useful to people.  If it is done, it would be as a
1444 separate wrapper that can be maintained and shipped separately.
1445
1446 There is a separate issue of system libraries, such as BSD math library
1447 and functions like @code{expm1}, @code{log1p}, @code{hypot}.  The
1448 functions in this library are available on nearly every platform (but
1449 not all).
1450
1451 In this case, it is best to write code in terms of these native
1452 functions to take advantage of the vendor-supplied system library (for
1453 example log1p is a machine instruction on the Intel x86). The library
1454 also provides portable implementations e.g. @code{gsl_hypot} which are
1455 used as an automatic fall back via autoconf when necessary. See the
1456 usage of @code{hypot} in @file{gsl/complex/math.c}, the implementation
1457 of @code{gsl_hypot} and the corresponding parts of files
1458 @file{configure.in} and @file{config.h.in} as an example.
1459
1460 @node Parallelism, Precision, Compatibility with other libraries, Design
1461 @section Parallelism
1462
1463 We don't intend to provide support for parallelism within the library
1464 itself. A parallel library would require a completely different design
1465 and would carry overhead that other applications do not need.
1466
1467 @node Precision, Miscellaneous, Parallelism, Design
1468 @section Precision
1469
1470 For algorithms which use cutoffs or other precision-related terms please
1471 express these in terms of @code{GSL_DBL_EPSILON} and @code{GSL_DBL_MIN}, or powers or
1472 combinations of these.  This makes it easier to port the routines to
1473 different precisions.
1474
1475 @node Miscellaneous,  , Precision, Design
1476 @section Miscellaneous
1477
1478 Don't use the letter @code{l} as a variable name --- it is difficult to
1479 distinguish from the number @code{1}. (This seems to be a favorite in
1480 old Fortran programs).
1481
1482 Final tip: one perfect routine is better than any number of routines
1483 containing errors.
1484
1485 @node Bibliography, Copying, Design, Top
1486 @chapter Bibliography
1487
1488 @section General numerics
1489
1490 @itemize
1491
1492 @item
1493 @cite{Numerical Computation} (2 Volumes) by C.W. Ueberhuber,
1494 Springer 1997,  ISBN 3540620583  (Vol 1) and  ISBN 3540620575  (Vol 2).
1495
1496 @item
1497 @cite{Accuracy and Stability of Numerical Algorithms} by  N.J. Higham,
1498 SIAM,  ISBN 0898715210.
1499
1500 @item
1501 @cite{Sources and Development of Mathematical Software} edited by W.R. Cowell,
1502 Prentice Hall, ISBN 0138235015.
1503
1504 @item
1505 @cite{A Survey of Numerical Mathematics (2 vols)} by D.M. Young and R.T. Gregory,
1506  ISBN 0486656918, ISBN 0486656926.
1507
1508 @item
1509 @cite{Methods and Programs for Mathematical Functions} by Stephen L. Moshier,
1510 Hard to find (ISBN 13578980X or 0135789982, possibly others).
1511
1512 @item
1513 @cite{Numerical Methods That Work} by Forman S. Acton,
1514  ISBN 0883854503.
1515
1516 @item
1517 @cite{Real Computing Made Real: Preventing Errors in Scientific and Engineering Calculations} by Forman S. Acton,
1518  ISBN 0486442217.
1519 @end itemize
1520
1521 @section Reference
1522
1523 @itemize
1524 @item
1525 @cite{Handbook of Mathematical Functions} edited by Abramowitz & Stegun,
1526 Dover,  ISBN 0486612724.
1527
1528 @item
1529 @cite{The Art of Computer Programming} (3rd Edition, 3 Volumes) by D. Knuth,
1530 Addison Wesley,  ISBN 0201485419.
1531 @end itemize
1532
1533 @section Subject specific
1534
1535 @itemize
1536 @item
1537 @cite{Matrix Computations} (3rd Ed) by G.H. Golub, C.F. Van Loan,
1538 Johns Hopkins University Press 1996,  ISBN 0801854148.
1539
1540 @item
1541 @cite{LAPACK Users' Guide} (3rd Edition),
1542 SIAM 1999,  ISBN 0898714478.
1543
1544 @item
@cite{A Treatise on the Theory of Bessel Functions} (2nd Edition) by G.N. Watson,
ISBN 0521483913.
1547
1548 @item
@cite{Higher Transcendental Functions Satisfying Nonhomogeneous Linear Differential Equations} by A.W. Babister,
ISBN 1114401773.
1551
1552 @end itemize
1553
1554 @node Copying, GNU Free Documentation License, Bibliography, Top
1555 @unnumbered Copying
1556
   The subroutines and source code in the @value{GSL} package are ``free'';
1558 this means that everyone is free to use them and free to redistribute
1559 them on a free basis.  The @value{GSL}-related programs are not in the
1560 public domain; they are copyrighted and there are restrictions on their
1561 distribution, but these restrictions are designed to permit everything
1562 that a good cooperating citizen would want to do.  What is not allowed
1563 is to try to prevent others from further sharing any version of these
1564 programs that they might get from you.
1565
1566    Specifically, we want to make sure that you have the right to give
1567 away copies of the programs that relate to @value{GSL}, that you receive
1568 source code or else can get it if you want it, that you can change these
1569 programs or use pieces of them in new free programs, and that you know
1570 you can do these things.
1571
1572    To make sure that everyone has such rights, we have to forbid you to
1573 deprive anyone else of these rights.  For example, if you distribute
1574 copies of the @value{GSL}-related code, you must give the recipients all
1575 the rights that you have.  You must make sure that they, too, receive or
1576 can get the source code.  And you must tell them their rights.
1577
1578    Also, for our own protection, we must make certain that everyone
1579 finds out that there is no warranty for the programs that relate to
1580 @value{GSL}.  If these programs are modified by someone else and passed
1581 on, we want their recipients to know that what they have is not what we
1582 distributed, so that any problems introduced by others will not reflect
1583 on our reputation.
1584
1585    The precise conditions of the licenses for the programs currently
1586 being distributed that relate to @value{GSL} are found in the General
1587 Public Licenses that accompany them.
1588
1589 @node GNU Free Documentation License,  , Copying, Top
1590 @unnumbered GNU Free Documentation License
1591 @include fdl.texi
1592
1593 @c @printindex cp
1594
1595 @c @node Function Index
1596 @c @unnumbered Function Index
1597
1598 @c @printindex fn
1599
1600 @c @node Variable Index
1601 @c @unnumbered Variable Index
1602
1603 @c @printindex vr
1604
1605 @c @node Type Index
1606 @c @unnumbered Type Index
1607
1608 @c @printindex tp
1609
1610 @bye