Gamma Archive



Gamma Test References to January 2009

 

The Gamma Test is a non-linear modelling analysis tool that allows us to quantify the extent to which a numerical input/output data set can be expressed as a smooth relationship. In essence, it allows us to efficiently calculate that part of the variance of the output that cannot be accounted for by the existence of any smooth model based on the inputs, even though this model is unknown. A key aspect of this tool is its speed: the Gamma Test has time complexity O(M log M), where M is the number of data points. For data sets consisting of a few thousand points and a reasonable number of attributes, a single run of the Gamma Test typically takes a few seconds. A simple introduction to the basic ideas is given in [Jones 2002] (below). The original commercial software winGamma™ is now Freeware and available from this site.
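As a rough illustration of the calculation, the following Python sketch follows the procedure described in the literature listed below (for example [Jones 2004]): for each k up to p near neighbours, average the squared input distances (delta) and half the squared output differences (gamma), then take the intercept of the regression line of gamma on delta as the estimate of the noise variance. The function names, the brute-force O(M^2) neighbour search (used here instead of kd-trees, for brevity) and the synthetic sine data are our own assumptions for illustration; this is not the winGamma™ code.

```python
import math
import random


def gamma_test(inputs, outputs, p=10):
    """Estimate the output noise variance from input/output data.

    inputs  : list of equal-length tuples, one per data point
    outputs : list of floats, one per data point
    p       : number of near neighbours used in the regression
    Returns (gamma, gradient): intercept and slope of the regression line.
    """
    m = len(inputs)
    deltas = [0.0] * p
    gammas = [0.0] * p
    for i in range(m):
        # Brute-force neighbour search: O(M^2) overall (illustrative only).
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(inputs[i], inputs[j])), j)
            for j in range(m)
            if j != i
        )
        for k in range(p):
            d2, j = dists[k]
            deltas[k] += d2 / m
            gammas[k] += (outputs[j] - outputs[i]) ** 2 / (2 * m)
    # Least-squares line: gammas ~ gamma + gradient * deltas.
    mean_d = sum(deltas) / p
    mean_g = sum(gammas) / p
    sxx = sum((d - mean_d) ** 2 for d in deltas)
    sxy = sum((d - mean_d) * (g - mean_g) for d, g in zip(deltas, gammas))
    gradient = sxy / sxx
    return mean_g - gradient * mean_d, gradient


def demo(m=500, noise_variance=0.075, seed=1):
    """Hypothetical test data: a sine curve plus uniform noise."""
    rng = random.Random(seed)
    # Uniform noise on [-a, a] has variance a^2 / 3.
    half_width = math.sqrt(3 * noise_variance)
    xs = [(rng.uniform(0, 2 * math.pi),) for _ in range(m)]
    ys = [math.sin(x[0]) + rng.uniform(-half_width, half_width) for x in xs]
    return gamma_test(xs, ys)


if __name__ == "__main__":
    gamma, gradient = demo()
    print(f"Gamma = {gamma:.4f}, gradient = {gradient:.4f}")
```

With a few hundred points the returned intercept should lie close to the variance of the added noise, although the exact value depends on the random sample.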

The Gamma test computes the second moment of the noise distribution. Developments in 2002 (see [Evans 2008] below) include new algorithms to compute as many higher moments of the noise distribution as is justified by the amount of available data. This enables an approximate reconstruction of the noise distribution, which may have interesting applications in, for example, communication theory and economic modelling.

In 2003 the Gamma test was used as an anomaly detector against a background of noise. Work of Peter Boyce (available below) has shown how a moving window Gamma test can be used to greatly increase our ability to detect galaxies in the IR map of the universe. Unfortunately, Peter has moved on and it is not clear how this work can be continued.

This archive lists all the published material that we are aware of (up to the last update) which uses or discusses the Gamma Test.

Gamma Software

Brochure (pdf 631 Kb)

winGamma™ is an easy-to-use non-linear data-analysis and modelling package produced by Cardiff University as a Windows application. It is available as Freeware under license from Cardiff University. Its primary use is as a research and teaching tool for academics and students, and for data modellers requiring state-of-the-art analysis and modelling techniques. The software is particularly useful for time series prediction and dynamic system control applications, but also has a wide range of other applications. If you download this package you will also need the winGamma™ manual (2001).

Command-line Gamma test controlled by a script file. This is useful for running multiple Gamma tests, full embeddings etc. It comes in a compiled DOS form and is also available as UNIX source code.

Gamma test support files in Mathematica™. These files include test data generation, analysis and graphics generation. This collection also includes a fast C-code Windows compiled version of the Gamma test which can be called from a Mathematica™ file, as well as the original Mathematica™ code used to specify the O(M log M) Gamma test based on kd-trees.

There is an R library <here>, also available from CRAN or from a mirror site:

http://lib.stat.cmu.edu/R/CRAN/src/contrib/Archive/GammaTest/

http://www2.uaem.mx/r-mirror/src/contrib/Archive/GammaTest/

http://cran-r.c3sl.ufpr.br/src/contrib/Archive/GammaTest/

http://medipe.psu.ac.th/cran-r/contrib/main/Archive/GammaTest/

It provides a very fast Gamma test using (optionally) exact or approximate near neighbours, together with supporting tools to facilitate input variable selection (feature selection). Using approximate near neighbours can speed up a full embedding over 20 inputs from something like two days to twenty minutes without significantly affecting the effectiveness of the feature selection process. This was written by Samuel Kemp (University of Glamorgan).
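To give a feel for the scale of a full embedding, the short Python sketch below enumerates input masks and works out the average time available per Gamma test from the figures quoted above. It assumes that a full embedding means one Gamma test for every non-empty subset (mask) of the inputs, which is how the term is used in the Gamma test literature; the function names are ours and are not part of the R library.

```python
from itertools import product


def masks(n_inputs):
    """Yield every non-empty include/exclude mask over n_inputs inputs."""
    for bits in product((0, 1), repeat=n_inputs):
        if any(bits):
            yield bits


def seconds_per_test(total_seconds, n_inputs):
    """Average time per Gamma test if every non-empty mask is evaluated."""
    return total_seconds / (2 ** n_inputs - 1)


if __name__ == "__main__":
    print(list(masks(2)))                        # the three masks over two inputs
    print(2 ** 20 - 1)                           # number of masks over 20 inputs
    print(seconds_per_test(2 * 24 * 3600, 20))   # budget per test at two days
    print(seconds_per_test(20 * 60, 20))         # budget per test at twenty minutes
```

On these assumptions, 20 inputs give just over a million masks, so the quoted timings correspond to roughly a sixth of a second per Gamma test with exact near neighbours and about a millisecond with approximate ones.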

WISHFUL THINKING? - NO! IT'S HERE NOW....

Windows and Linux compiled fast C-code routine for (optional) approximate fast near neighbours for the MATLAB™ Gamma test. Available <here>.

COMING EVENTUALLY - Gamma test support for MAC

Papers/Theses

[Adalbjörn Stefánsson 1997] Adalbjörn Stefánsson, N. Koncar and Antonia J. Jones. A note on the Gamma test, Neural Computing & Applications 5(3):131-133, 1997. [31Kb]

[Koncar 1997] N. Koncar. Optimisation methodologies for direct inverse neurocontrol. PhD Thesis, Imperial College of Science Technology and Medicine, University of London, 1997.

[Otani 1997] Guiding Chaotic Orbits. Masyuki Otani and Antonia J. Jones. Technical Report. [1.5 Mb]

[Oliveira 1998] Ana G. Oliveira and Antonia J. Jones. Synchronization of chaotic maps by feedback control and application to secure communications using neural networks. International Journal of Bifurcation and Chaos 8(11):2225-2237, November 1998. URL: http://www.worldscinet.com/ijbc/0811/olive.html

[Connellan 1998] Connellan, O. P. & James, H. Forecasting Commercial Property Values in the Short Term. RICS Cutting Edge Conference, Leicester, 1998. RICS London. Available electronically from rics-foundation.org.

[Chuzhanova 1998] Nadia A. Chuzhanova, Antonia J. Jones and S. Margetts. Feature selection for genetic sequence classification. Bioinformatics 14(2):139-143, 1998. [46 Kb]

[Oliveira 1999] Ana R.S. Guedes de Oliveira. Synchronization of Chaos and Applications to Secure Communications. PhD thesis, Department of Computing, Imperial College of Science, Technology and Medicine, University of London, 1999.

[Haythorn 1999] W. Haythorn, S. Margetts, P. Durrant and Antonia J. Jones. Non-parametric smooth non-linear model identification and construction. International Seminar on Forecasting, Washington, June 1999. (Proceedings not published). [168 Kb]

[Tsui 1999a] Alban P.M. Tsui. Smooth Data Modelling and Stimulus-Response via Stabilisation of Neural Chaos. PhD thesis, Department of Computing, Imperial College of Science, Technology and Medicine, University of London, 1999.  [9.1 Mb]

[Tsui 1999b] Alban P.M. Tsui and Antonia J. Jones. Periodic response to external stimulation of a chaotic neural network with delayed feedback. International Journal of Bifurcation and Chaos, 9(4):713-722, 1999.

[Connellan 1999] O. Connellan and H. James. Forecasting values of real property using recently developed techniques. Final Report to the Company of Chartered Surveyors. 1999. URL: http://www.technicalforecasts.com/files/Forecasts.pdf

[Otani 2000] M. Otani and Antonia J Jones. Automated Embedding and Creep Phenomenon in Chaotic Time Series [EB/OL], Originally 1997. First made available on the web at http://users.cs.cf.ac.uk/Antonia.J.Jones/UnpublishedPapers/Creep.pdf in 2000.

[James 2000] H. James and O. P. Connellan. Forecasts of a Small feature in a Property Index. Proc. RICS Cutting Edge Conference, London, 2000. RICS London. Available electronically from rics-foundation.org.

[Connellan 2000] O. P. Connellan and H. James. Time Series Forecasting of Property Market Indices. Cutting Edge Conference, London, 2000. RICS London. URL: http://www.rics-foundation.org/publish/documents.jsp?pageNumber=1

[Jones 2000] Non-linear modelling and chaotic neural networks. Antonia J. Jones, Steve Margetts, Peter Durrant, and Alban P. M. Tsui. Invited paper in Proceedings IV Simpósio Brasileiro de Redes Neurais (IV Brazilian Symposium on Neural Networks) (SBRN2000), I pp7-14 ISBN 0-7695-0856. Rio de Janeiro, Brazil, November 2000. HTML Presentation. [7.7 MB] PDF file [692 Kb].

[Goodridge 2001] C.L. Goodridge, L.M. Pecora, T.L. Carroll, and F.J. Rachford. Detecting Functional Relationships between Simultaneous Time Series. Phys. Rev. E 64, 026221, 2001.

[Durrant 2001] P. J. Durrant. winGamma™: A non-linear data analysis and modelling tool with applications to flood prediction. Ph.D. thesis, Department of Computer Science, University of Wales, Cardiff, Wales U.K., 2001. [14 MB]

[Goodridge 2001] C. L. Goodridge, F. J. Rachford, L. M. Pecora, and T. L. Carroll. Functional dependence and quasiperiodicity in the spatiotemporal dynamics of yttrium iron garnet films. Phys. Rev. E 64, 016210, June 2001.

[Evans 2002a] The Gamma Test: Data derived estimates of noise for unknown smooth models using near neighbour asymptotics. D. Evans, Department of Computer Science, Cardiff University 2002. [1.5 Mb].

[Evans 2002b] D. Evans and Antonia J. Jones. A proof of the Gamma test. Proc. Roy. Soc. Lond. A, 458:2759-2799, 2002. Preprint PDF [643 Kb]

[Evans 2002c] D. Evans, Antonia J. Jones, W. M. Schmidt. Asymptotic moments of near neighbour distance distributions. Proc. Roy. Soc. Lond. A, 458:2839-2849, 2002. Draft PDF [248 Kb]

[Jones 2002] Antonia J. Jones, Dafydd Evans, Steve Margetts, Peter M. Durrant. The Gamma Test. Chapter IX in Heuristics and Optimization for Knowledge Discovery. Edited by Ruhul Sarker, Hussein Abbass and Charles Newton. Idea Group Publishing, Hershey, PA. 2002. [464 Kb]

[Jones 2002] Neural models of arbitrary chaotic systems: construction and the role of time delayed feedback in control and synchronization. Antonia J. Jones, A.P.M. Tsui, and Ana G. Oliveira. With html or pdf electronic supplement. Complexity International, Volume 09, 2002. ISSN 1320-0682. Paper ID: tsui01, URL: [217 Kb/119 KB/568 Kb] http://www.csu.edu.au/ci/vol09/tsui01/

[Tsui 2002] Alban P.M. Tsui, Antonia J. Jones, and Ana G. Oliveira. The construction of smooth models using irregular embeddings determined by a Gamma test analysis. Draft [897 Kb]. Neural Computing & Applications, 10(4), 318-329, April 2002.

[Durrant 2003] P. J. Durrant and Antonia J. Jones. Non-linear modelling of river levels using the Gamma test. Technical Report. Draft [2.5 MB]

[Corcoran 2003] J. Corcoran, I.D. Wilson, J.A. Ware. Predicting the Geo-Temporal Variation of Crime and Disorder. To appear in International Journal of Forecasting, Special Issue on Crime Forecasting.

[Boyce 2003] P. Boyce, GammaFinder: a Java application to find galaxies in astronomical spectral line datacubes. M.Sc. Dissertation, School of Computer Science, Cardiff University 2003. [8 Mb]

[Reyhani 2003] Nima Reyhani and Maral Jamshidi. A gamma test oriented approach to the problem of inferring gene regulatory models over time series microarray experiences. MASSEE 2003, Borovets, Bulgaria.

[Wilson 2004] I. D. Wilson, Antonia J. Jones, D. H. Jenkins, and J. A. Ware. Predicting Housing Value: Attribute Selection and Dependence Modelling Utilising the Gamma Test. Draft [418Kb]. Advances in Econometrics 19, 243-275, 2004. Elsevier Ltd. ISSN 0731-9053, doi:10.1016/S0731-9053(04)19010-5.

[Jones 2004] Antonia J. Jones. New Tools in Non-linear Modelling and Prediction. Computational Management Science, 1(2):109-149, 2004.

[Evans 2005] D. Evans. Estimating the variance of multiplicative noise. Proceedings of the 18th International Conference on Noise and Fluctuations, 99-102, 2005.

[Reyhani 2005] M. N. Reyhani, J. Hao, Y. Ji, A. Lendasse. Mutual Information and Gamma Test for Input Selection. ESANN 2005, European Symposium on Artificial Neural Networks, Bruges (Belgium), 27-29 April 2005, pp. 503-504.

[Lendasse 2005] Amaury Lendasse, Yongnan Ji, Nima Reyhani, and Michel Verleysen. LS-SVM Hyperparameter Selection with a Nonparametric Noise Estimator. ICANN'05, International Conference on Artificial Neural Networks, Artificial Neural Networks: Formal Models and Their Applications, W. Duch, J. Kacprzyk, E. Oja, S. Zadrozny (eds.), Springer, Lecture Notes in Computer Science 3697, 11-15 September 2005, Warsaw (Poland), pp. 625-630.

[Liu 2005] Daizhi Liu, Xihai Li and Bin Zhang. Feature Selection and Identification of Underground Nuclear Explosion and Natural Earthquake Based on Gamma Test and BP Neural Network. Advances in Neural Networks - ISNN 2005, Second International Symposium on Neural Networks, Chongqing, China, May 30 - June 1, 2005, Proceedings, Part II. Lecture Notes in Computer Science 3497. Springer 2005, ISBN 3-540-25913-9, pp. 393-398.

[Kashefipour 2005] S. M. Kashefipour, B. Lin, R. A. Falconer. Neural networks for predicting seawater bacterial levels. Water Management, 158(3), pp. 111-118.

[Iturrarán 2005] Ursula Iturrarán-Viveros and James H. Spurlin. The Gamma test applied to select seismic attributes to estimate effective porosity. SEG Technical Program Expanded Abstracts, 2005, pp. 1739-1742.

[Rangel 2005] José Luis Rangel, Ursula Iturrarán-Viveros, A. Gustavo Ayala, Francisco Cervantes. Tunnel stability analysis during construction using a neuro-fuzzy system. International Journal for Numerical and Analytical Methods in Geomechanics, 29(15), pp. 1433-1456.

[Hong-guang 2006] Ma Hong-guang and Han Chong-zhao. Selection of Embedding Dimension and Delay Time in Phase Space Reconstruction. Translated into English from Journal of Xi’an Jiaotong University, 2004, 38(4): 335–338 (in Chinese). Also Front. Electr. Electron. Eng. China (2006) 1: 111–114. Chinese version.

[Jones 2006] Antonia J. Jones and S. E. Kemp. Heuristic confidence intervals for the Gamma test. The 2006 International Conference on Artificial Intelligence (ICAI'06): June 26-29, 2006, Las Vegas, USA. Preprint PDF [165 Kb]

[Kemp 2006] Gamma test analysis tools for non-linear time series. Samuel E. Kemp. Ph.D. Thesis. Department of Computing & Mathematical Sciences, Faculty of Advanced Technology, University of Glamorgan, Wales UK, 2006 [4 MB].

[Jones 2007] A Note on the Gamma test Analysis of Noisy Input/Output data and Noisy Time Series. Antonia J. Jones, D. Evans and S. E. Kemp. Physica D: Nonlinear Phenomena, Volume 229(1): 1-8, 2007. doi:10.1016/j.physd.2006.12.013. Preprint PDF [365 Kb]

[Evans 2008] Dafydd Evans and Antonia J. Jones. Non-parametric estimation of residual moments and covariance. Proc. Roy. Soc. Lond. Series A, 464(2099): 2831-2846, 2008. doi:10.1098/rspa.2007.0195. [Preprint PDF]

[Remesan 2008] Model data selection using gamma test for daily solar radiation estimation. R. Remesan, M. A. Shamim and D. Han. Hydrol. Process. 22, 4301–4309 (2008).

Test Data Sets (this source will be extended in due course).

A useful calibration test set.

Sin500.asc [21KB] A sine curve with artificially added uniformly distributed noise having a variance of 0.075. This is the classic example used in many papers and theses. Reference results: Gamma = 0.0733545595048562, Gradient = 0.711221348889957, Standard Error = 0.00376506542836251, V-Ratio = 0.127617177846676, Near Neighbours = 10, Start Vector = 1, Unique Points = 500, Evaluated Output = 1, Zeroth Nearest Neighbours = 0, Lower 95% Confidence = -, Upper 95% Confidence = -, Mask = 1. Where there are zeroth nearest neighbours (repeated input points, which may or may not have identical output values) in the test set, the pointwise variance is computed and returned as Lower 95% Confidence and Upper 95% Confidence. In this case there are no zeroth nearest neighbours in the data set.
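The reported figures can be cross-checked with a little arithmetic, sketched below in Python. We assume here that the V-Ratio is Gamma divided by the variance of the output, as it is defined in the Gamma test literature, and that the underlying sine curve has unit amplitude and is sampled over whole periods (so its variance is close to 0.5); neither assumption is stated on this page, and the function name is ours.

```python
# Values quoted above for Sin500.asc.
GAMMA = 0.0733545595048562
V_RATIO = 0.127617177846676
NOISE_VARIANCE = 0.075


def implied_output_variance(gamma, v_ratio):
    """Output variance implied by V-Ratio = Gamma / Var(output) (assumed definition)."""
    return gamma / v_ratio


if __name__ == "__main__":
    implied = implied_output_variance(GAMMA, V_RATIO)
    # Assumption: unit-amplitude sine (variance about 0.5) plus independent noise.
    expected = 0.5 + NOISE_VARIANCE
    print(f"implied output variance  = {implied:.4f}")
    print(f"sine variance plus noise = {expected:.4f}")
    print(f"Gamma relative to the true noise variance = {GAMMA / NOISE_VARIANCE:.3f}")
```

Under these assumptions the two variances agree closely, and the Gamma statistic differs from the true noise variance by less than one reported standard error.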

The Henon Map (Zero noise time series)

Hen100.asc [2KB]

Hen1000.asc [20KB]

Hen50000.asc [961KB]
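For readers who wish to generate comparable data themselves, here is a minimal Python sketch of a Henon map time series together with a two-step delay embedding. The parameter values a = 1.4 and b = 0.3, the initial point, the discarded transient and the function names are our assumptions (they are the values most commonly used for this map); the files above may have been generated differently.

```python
def henon_series(n, a=1.4, b=0.3, x0=0.0, y0=0.0, discard=100):
    """Return n successive x-values of the Henon map after a short transient."""
    x, y = x0, y0
    series = []
    for step in range(n + discard):
        # x_{t+1} = 1 - a * x_t^2 + y_t,  y_{t+1} = b * x_t
        x, y = 1.0 - a * x * x + y, b * x
        if step >= discard:
            series.append(x)
    return series


def delay_embed(series, lag=2):
    """Build input vectors (x_{t-lag}, ..., x_{t-1}) with output x_t."""
    inputs = [tuple(series[t - lag:t]) for t in range(lag, len(series))]
    outputs = [series[t] for t in range(lag, len(series))]
    return inputs, outputs


if __name__ == "__main__":
    xs = henon_series(1000)
    inputs, outputs = delay_embed(xs)
    print(len(inputs), inputs[0], outputs[0])
```

Because each value is a smooth function of the two preceding values and no noise is added, a Gamma test on such an embedding would be expected to return a Gamma statistic close to zero.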

Gamma test Implementation and interpretation notes

[1] Near neighbour routines can sometimes be misled by unusual distributions of the input vectors, depending on how the routine is implemented: for example, if there are many identical input points (possibly with the same or different output values), or if the input vectors form a regular lattice. We then have to ask ourselves [a] can we detect these unusual distributions of data, and [b] if we can, what are we going to do about them? One approach is simply to ignore the possibility of these data distributions and perhaps even use approximate near neighbour routines; this has the advantage of speeding up the code. Another possibility is to code the near neighbour routines so as to detect anomalous distributions of data. This slows down the analysis, but is safer, and was the general approach taken in winGamma™, which maintains lists of equidistant near neighbours. Thus the list of near neighbours becomes a list of lists.
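The "list of lists" idea can be sketched in Python as follows. This is only an illustration of the principle, not the winGamma™ implementation: the function name, the brute-force search and the absolute tolerance on squared distances are all our own assumptions.

```python
def neighbour_shells(points, i, n_shells, tol=1e-12):
    """Group the other points into shells of equal distance from points[i].

    Returns (zeroth, shells). zeroth lists the indices of points identical
    to points[i]; shells is a list of lists of indices, nearest shell first.
    """
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(points[i], points[j])), j)
        for j in range(len(points))
        if j != i
    )
    zeroth, shells, shell_d2 = [], [], None
    for d2, j in dists:
        if d2 <= tol:
            zeroth.append(j)          # repeated input point
        elif shells and d2 - shell_d2 <= tol:
            shells[-1].append(j)      # same distance as the current shell
        elif len(shells) < n_shells:
            shells.append([j])        # start a new, more distant shell
            shell_d2 = d2
        else:
            break
    return zeroth, shells


if __name__ == "__main__":
    # A 3 x 3 lattice with the centre point repeated once.
    lattice = [(x, y) for x in range(3) for y in range(3)] + [(1, 1)]
    print(neighbour_shells(lattice, 4, 2))
```

For the centre of the lattice this reports one zeroth near neighbour, four equidistant first near neighbours and four equidistant second near neighbours, where a routine returning a single index per neighbour rank would have to break the ties arbitrarily.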

[2] As the number of well-distributed data points increases, the mean distance between input data points decreases, since (by hypothesis) the data is confined to a closed bounded set. The Gamma test relies on taking differences of the input vectors and works out the distances between pairs of points. If the input data is of low precision then, as the number of data points becomes large, the errors in the input data vectors cause the computed distances between pairs of points to become less and less significant. Thus with low precision data there comes a stage where increasing the number of data points is counterproductive: the Gamma statistic then starts to perform a random oscillation about the true value. In particular, using double-precision arithmetic in a Gamma test implementation, perhaps designed to deal with very large data sets, is only useful if the data is high precision.
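The effect of limited input precision can be seen in a small experiment, sketched here in Python under our own assumptions: one-dimensional inputs drawn uniformly from [0, 1] and then rounded to a fixed number of decimal places. The function name and the choice of three decimal places are purely illustrative.

```python
import random


def zero_distance_fraction(m, decimals, seed=0):
    """Fraction of points whose nearest neighbour lies at distance zero after rounding."""
    rng = random.Random(seed)
    xs = sorted(round(rng.random(), decimals) for _ in range(m))
    gaps = [b - a for a, b in zip(xs, xs[1:])]
    # In one dimension the nearest neighbour is across the smaller adjacent gap.
    nearest = [gaps[0]] + [min(g, h) for g, h in zip(gaps, gaps[1:])] + [gaps[-1]]
    return sum(1 for d in nearest if d == 0.0) / m


if __name__ == "__main__":
    for m in (100, 1000, 5000):
        print(m, zero_distance_fraction(m, decimals=3))
```

With data recorded to three decimal places, only a small fraction of points coincide with their nearest neighbour when there are a hundred points, whereas with several thousand points almost every nearest-neighbour distance is exactly zero, so the distances no longer carry useful information.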