
2 : The Modelling of Neural Systems

(a) Computational Processes and Causal Closure

Generally, a computational process is designed to be determined by the input conditions and the algorithms within the program so as to arrive at a logically precise outcome. A computer program which fails to conform to such criteria may crash, or give unreliable results. The inclusion of random variation within such a system is normally associated with noise or component failure and is avoided.

These criteria are softened in certain artificial intelligence approaches in which a computational system is required to handle input from an open system, where the external conditions being responded to may not yield to a single deterministic algorithmic strategy. A heuristic approach is then followed in which differing strategies are adopted with differing probabilities which may be a function of previous successes.

A second limitation on deterministic algorithmic processes is caused by problems which are intrinsically difficult, because deterministic approaches require computation times that grow exponentially as the number n of cases increases. An example is the travelling salesman problem, which involves finding the minimum cyclic path joining n points, fig 8. A simple estimate of the number of cases involved is n!/(2n), since the n! orderings are divided by n equivalent starting points and two directions of travel. This gives 12 possibilities for 5 points, 181,440 for 10 points, and 3.09 x 10^23 for 25 points, the last taking around 10^10 years at a million cases a second. Problems that scale like n^k as the number n increases are tractable, while ones scaling like e^n are intrinsically difficult (Cowan et al. 1988, Bern et al. 1989).
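
The scale of this combinatorial explosion can be checked directly; the following minimal sketch (in Python, using the million-cases-a-second rate of the text purely for illustration) tabulates n!/(2n) for increasing n:

    import math

    def distinct_tours(n):
        # n! orderings divided by n equivalent starting points and 2 directions
        return math.factorial(n) // (2 * n)

    for n in (5, 10, 25):
        tours = distinct_tours(n)
        years = tours / 1e6 / 3.15e7     # at a million tours checked per second
        print(f"n={n:2d}  tours={tours:.3e}  ~{years:.2e} years of brute force")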


Fig 8 : Travelling salesman problem for 9 vertices. Maximal and minimal paths.

A final limit on computability exists in the consequences of Gödel's theorem, which establishes that some logical propositions may be formally undecidable from within the axioms of the system in which they are constructed. A Turing machine is a formal computer with potentially infinite memory operating on binary data. Turing proved that it is formally undecidable whether a Turing machine will in fact complete a calculation determined by a given input code. This raises serious questions as to the logical completeness of computability in general. Roger Penrose has pointed out that the entire program of strong artificial intelligence, in which conceptual models of the brain are based on formal computing models and axioms, may thus be doomed to failure (Penrose 1989, Searle 1987, 1990, Churchland & Churchland 1990).

Spin glasses (Cowan 1988, Stein 1989) are materials with chaotically oriented atomic spins which can reach neither a ferromagnetic equilibrium (spins aligned) nor an antiferromagnetic one (spins cancelling in pairs), because of long-range spin interactions between magnetic trace atoms (Fe) and the conduction electrons of the host material (Cu). Because these effects reverse repeatedly with distance, no simple state fully resolves the dynamics, and spin glasses thus adopt a large variety of disordered states. Modelling the transition to a spin glass has close parallels in neural nets, particularly the Hopfield nets consisting of symmetrically coupled circuits. Optimization of a task is then modelled in terms of constrained minimization of a potential function. However the problem of determining the global minimum among all the local minima in a system with a large number of degrees of freedom is intrinsically difficult. Spin glasses are also chaotic and display sensitive dependence (Bray & Moore 1987). Similar dynamics occurs in simulations of continuous fields of neurons (Cowan & Sharp 1988).

Annealing is a thermodynamic simulation of a spin glass in which the temperature of random fluctuations is slowly lowered, allowing individual dynamic trajectories to have a good probability of finding quasi-optimal states. Suppose we start out at an arbitrary initial state of a system and follow the topography into the nearest valley, reaching a local minimum. If a random fluctuation now provides sufficient energy to carry the state past an adjacent saddle, the trajectory can explore further potential minima. Modelling such processes requires the inclusion of a controlled level of randomness in local dynamical states, something which in classical computing would be regarded as a leaky, entropic process. The open environment is notorious as a source of such intrinsically difficult problems, which may have encouraged the use of chaotic systems in the evolutionary development of the vertebrate brain.
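
A minimal sketch of such an annealing search (the one-dimensional energy landscape, step size and cooling schedule below are arbitrary illustrations, not taken from any particular model in the text):

    import math, random

    def energy(x):
        # a toy landscape with many local minima
        return x * x + 10 * math.sin(3 * x)

    def anneal(x=5.0, temp=10.0, cooling=0.995, steps=5000):
        for _ in range(steps):
            trial = x + random.gauss(0, 0.5)          # random fluctuation
            dE = energy(trial) - energy(x)
            # downhill moves always accepted; uphill moves accepted with
            # probability exp(-dE/T), letting the state cross saddles
            if dE < 0 or random.random() < math.exp(-dE / temp):
                x = trial
            temp *= cooling                            # slowly lower the temperature
        return x

    print(anneal())    # typically ends near a low-lying minimum of the landscape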

One of the most important aspects of the design of flexible algorithmic processes is a modular architecture in which new routines can be added to the existing repertoire in the event that existing strategies cannot provide an effective solution, even if this is at the level of a primitive. Such a system must have the capacity to generate variation at any level of processing.

A classical deterministic system following the principles of the Laplacian universe can be described by specifying its equations of evolution and the initial conditions. Often the equations of evolution take the form of differential equations. The classical Hamiltonian and Lagrangian dynamical equations, for example, are both expressed as differential equations defining the generalized coordinates q_i (and momenta p_i) as functions of increasing time, where the Hamiltonian H = T + V gives the total energy in terms of the kinetic (T) and potential (V) contributions, and the Lagrangian L = T − V similarly:

dq_i/dt = ∂H/∂p_i ,     dp_i/dt = −∂H/∂q_i     [2.1]

d/dt ( ∂L/∂(dq_i/dt) ) − ∂L/∂q_i = 0     [2.2]

We will call such a system causally closed because it obeys strict causality in time, and is completely specified by its initial or other boundary conditions. By contrast, a system which is not closed may display non-deterministic behavior, resulting from information which enters the system during its time evolution. We will call such a system causally open. A partially random process, such as a Markov process, is causally open. At a deeper level, the stochastic-causal processes of quantum mechanics are also causally open, because Heisenberg uncertainty [1.36] prevents a complete causal description of quantum dynamics. The probability interpretation of the wave function [1.35] places the open boundary at the hidden level of sub-quantum description. The theory can predict future states only as probabilities, leaving open the possibility that a hidden variable theory may solve the problem of how the universe makes a unique choice in each instance of reduction of the wave packet.
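
The causal closure of a system like [2.1] can be illustrated by numerically integrating Hamilton's equations for a unit-mass harmonic oscillator, H = p^2/2 + q^2/2; in this sketch the step size and initial state are arbitrary illustrations, and rerunning from the same initial conditions reproduces the trajectory exactly:

    def hamiltonian_step(q, p, dt=0.01):
        # dq/dt = dH/dp = p ,  dp/dt = -dH/dq = -q   (H = p**2/2 + q**2/2)
        # simple symplectic (semi-implicit Euler) update
        p -= q * dt
        q += p * dt
        return q, p

    q, p = 1.0, 0.0          # the initial conditions fix the entire trajectory
    for _ in range(1000):
        q, p = hamiltonian_step(q, p)
    print(q, p)              # the same initial state always yields the same final state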

(b) Mathematical Models of Neural Nets

At its simplest, the formal neuron is described as a discrete additive logic unit (Cowan 1988):

x_i(t+1) = Θ( Σ_j w_ij x_j(t) − μ_i ) ,   where Θ(u) = 1 if u ≥ 0 and 0 otherwise   [2.3]

in which there is a discrete 0 or 1 output representing excitation or quiescence, depending on whether the weighted sum of inputs exceeds a threshold. Mathematical modelling of neural nets is usually done in terms of such McCulloch-Pitts neurons. Other models include the PLMI, or piecewise linear map of the interval, which performs an iteration of the interval based on three linear segments in a similar manner to the logistic iteration (Labos 1986). This can mimic chaotic pacemaker activity or silent stable cells depending on whether the slope of the central section exceeds that of the line y = x.
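
As a concrete reading of the formal neuron [2.3], a minimal McCulloch-Pitts unit might be sketched as follows (the weights and threshold are illustrative, here chosen to give a two-input AND gate):

    def mcculloch_pitts(inputs, weights, threshold):
        # discrete 0/1 output: fire if the weighted input sum reaches threshold
        total = sum(w * x for w, x in zip(weights, inputs))
        return 1 if total >= threshold else 0

    # e.g. a two-input AND unit (illustrative weights)
    print(mcculloch_pitts([1, 1], [1.0, 1.0], threshold=1.5))   # -> 1
    print(mcculloch_pitts([1, 0], [1.0, 1.0], threshold=1.5))   # -> 0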


Fig 9 : (a) The x XOR y configuration for perceptrons. (b) Design of the Hopfield net. (c) The format of the back-propagation net.

The perceptron consists of a layer of McCulloch-Pitts neurons (m) synapsed by both inputs (x,y) and thresholds (t), fig 9(a). Hidden intermediate cells (h) may also modify the response. The system is trained to perform logic operations by altering the synaptic weightings, adding a positive increment when the output is 0 but should be 1, and conversely. Repeated perturbation of the weightings will eventually train such a net to perform logical discriminations.
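
A sketch of this training procedure (the learning rate, initial weightings and the OR task below are illustrative assumptions):

    import random

    def train_perceptron(samples, epochs=50, eta=0.1):
        # samples: ((x, y), target) pairs with 0/1 targets
        w = [random.uniform(-0.5, 0.5), random.uniform(-0.5, 0.5)]
        t = random.uniform(-0.5, 0.5)                  # threshold
        for _ in range(epochs):
            for (x, y), target in samples:
                out = 1 if w[0] * x + w[1] * y > t else 0
                # positive increment when output is 0 but should be 1, and conversely
                delta = eta * (target - out)
                w[0] += delta * x
                w[1] += delta * y
                t -= delta
        return w, t

    or_samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
    print(train_perceptron(or_samples))    # OR is learnable by a single layer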

Other algorithms can be used. The adaline works on the following scheme :

Let e(s) = actual(s) - expected(s).

Then the weights are adjusted by    Δw_i(s) = −η e(s) x_i(s)    [2.4]

Formal neurons can also be used to model time-varying continuous changes such as the integration of velocity information into distance (Cowan & Sharp 1988). However the XOR operation cannot be trained into a single-layer net using either of these weighting schemes, fig 9(a). While McCulloch-Pitts neurons cannot model chaotic dynamics, being restricted to discrete states, it is possible to design systems of formal neurons which have aperiodic output for a given finite collection of inputs, e.g. period 256 in an 8-bit channel (Labos 1986). This forms a discrete analogue of an aperiodic orbit in real dynamics.
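
The XOR limitation can be seen by applying the adaline-style rule [2.4] to XOR targets; in this sketch (the learning rate and epoch count are illustrative) at least one of the four cases necessarily remains misclassified, since no single-layer weighting separates XOR:

    def train_adaline(samples, epochs=100, eta=0.1):
        w, t = [0.0, 0.0], 0.0
        for _ in range(epochs):
            for (x, y), expected in samples:
                actual = 1 if w[0] * x + w[1] * y > t else 0
                e = actual - expected                  # adaline error, as in [2.4]
                w[0] -= eta * e * x
                w[1] -= eta * e * y
                t += eta * e
        return w, t

    xor = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
    w, t = train_adaline(xor)
    for (x, y), expected in xor:
        out = 1 if w[0] * x + w[1] * y > t else 0
        print((x, y), out, expected)    # at least one case is always wrong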

Continuous time-dependent models can be developed using a sigmoidal function of the input:

τ dx_i/dt = −x_i + g( Σ_j w_ij x_j − θ_i ) ,    g(u) = 1/(1 + e^−u)    [2.5]

A sigmoidal response is used in the Hopfield net (Hopfield & Tank 1986, Tank & Hopfield 1987). Neurons which are linear amplifiers (op. amps) are coupled through symmetrical inhibitory or excitatory links from their output back into the input of the other cells, fig 9(b). The requirement for symmetry ensures specifically that no limit-cycle attractors or chaotic dynamics can result in the system, which will have a dynamical energy surface consisting entirely of sources, saddles and sinks. If a mix of excitatory and inhibitory weights exists, the system will adopt a form analogous to a spin glass. Such a system can be trained to solve constrained optimization problems such as the travelling salesman problem, by suitably defining an energy surface through determining the synaptic weightings. For example the synaptic weightings can consist essentially of the distances between all pairs of cities, and a suitably wired net will move exponentially towards a solution close to the minimum distance connecting them. Although the total number of such quasi-optimal local minima is restricted by spin glass theory to approximately the number of cells, the system runs into the classical problem of multiplying local minima as the number of cases grows large. The solution also results from a careful choice of synaptic weightings to define the problem as a minimization rather than a learning process.
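
A minimal sketch of such symmetric relaxation, using a small discrete Hopfield net that stores a single pattern through Hebbian (outer-product) weightings rather than the Hopfield-Tank analogue circuit; the stored pattern is an arbitrary illustration:

    import random

    pattern = [1, -1, 1, -1, 1, -1]          # one stored +/-1 pattern (illustrative)
    n = len(pattern)
    # symmetric Hebbian weights with zero diagonal; symmetry guarantees that
    # asynchronous updates only ever lower (or keep) the energy
    W = [[0 if i == j else pattern[i] * pattern[j] for j in range(n)] for i in range(n)]

    state = [random.choice([1, -1]) for _ in range(n)]    # noisy starting state
    for _ in range(10 * n):                               # asynchronous updates
        i = random.randrange(n)
        h = sum(W[i][j] * state[j] for j in range(n))
        state[i] = 1 if h >= 0 else -1
    print(state)    # settles into the stored pattern or its mirror image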

To avoid having to test all the configurations of the state space, a strategy like annealing in spin glasses can be used. A trajectory from a given initial state is followed to a local minimum. The state of a given unit is then randomly reversed, and the new state is retained if it improves on the unreversed configuration (or, at non-zero temperature, with a probability that falls with the energy increase). Such a net is then termed a Boltzmann machine. Multi-layered perceptron and Hopfield nets can be trained to perform several sensory pattern-recognition tasks such as character recognition.

More efficient adaptation, and a solution to the XOR problem, arises from back-propagation as illustrated in fig 9(c). The neurons are amplifiers with a linear gain G, and modification occurs in 2 stages elaborating the adaline rules:

(i) output-layer weights:    Δw_jk = −G e_k h_j ,    where e_k = actual_k − expected_k and h_j is the output of hidden unit j

(ii) hidden-layer weights:    Δw_ij = −G ( Σ_k e_k w_jk ) x_i    [2.6]
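
A minimal back-propagation sketch that does learn XOR, using two hidden units with sigmoidal rather than linear gain; the network size, learning rate and iteration count are illustrative assumptions:

    import math, random

    def g(u):                                  # sigmoidal gain
        return 1.0 / (1.0 + math.exp(-u))

    # 2 inputs -> 2 hidden units -> 1 output; last weight in each row is a bias
    w_h = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
    w_o = [random.uniform(-1, 1) for _ in range(3)]
    data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
    eta = 0.5

    for _ in range(20000):
        (x1, x2), target = random.choice(data)
        h = [g(w[0] * x1 + w[1] * x2 + w[2]) for w in w_h]           # forward pass
        o = g(w_o[0] * h[0] + w_o[1] * h[1] + w_o[2])
        d_o = (target - o) * o * (1 - o)                             # (i) output-layer error
        d_h = [d_o * w_o[j] * h[j] * (1 - h[j]) for j in range(2)]   # (ii) propagated back
        w_o = [w_o[0] + eta * d_o * h[0], w_o[1] + eta * d_o * h[1], w_o[2] + eta * d_o]
        for j in range(2):
            w_h[j][0] += eta * d_h[j] * x1
            w_h[j][1] += eta * d_h[j] * x2
            w_h[j][2] += eta * d_h[j]

    # for most random initialisations the outputs approach the XOR targets
    for (x1, x2), target in data:
        h = [g(w[0] * x1 + w[1] * x2 + w[2]) for w in w_h]
        print((x1, x2), round(g(w_o[0] * h[0] + w_o[1] * h[1] + w_o[2]), 2), target)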

Chaos does not exist in these dynamical models because of the symmetry in synaptic weightings, which ensures the models stably seek local minima; however, the systems are dynamically capable of chaos if the symmetry is removed. It should be noted that chaotic pacemaker and silent cells of the PLMI type can be used to model periodic neural nets, making it possible for chaotic neurons to give rise to periodic circuits (Labos 1986).

(c) The Adaptive Resonance Architecture

One of the most interesting neural net models is the ART or adaptive resonance theory model of Grossberg (1983, 1987, 1988). The system consists of two coupled neural nets, which are capable of developing self-directed pattern recognition without the entrapment in local minima, memory saturation, and requirements for external guidance and resetting that beset other nets. The general configuration of the ART system is shown in fig 10(b). This consists of two fields of formal neurons. The field F1 receives the input pattern of 0's and 1's, which is passed via synaptic connections to the upper field F2. F2 has mutual (dipole) lateral inhibition, fig 10(a), which causes contrast enhancement and in the ideal case determines a winner-take-all strategy in which only the maximally-excited node is activated. Such activation will then result in the top-down synapses back to F1 activating the learned pattern represented by this node.

Gain controls enable the lower field to respond to the full bottom-up input pattern if F2 is inactive, but only to the intersection of the bottom-up and top-down features if F2 is activated, permitting both generalization of a set of features and mismatch masking. The two fields thus form an attentional subsystem which through such masking can detect match or mismatch between the input and the top-down model. In the event of a match, the intersected model is reinforced, but in the event of mismatch a second feedback system, the orienting subsystem, takes over and resets the given node in F2 so that it falls below the other thresholds. In this manner the system, beginning with the input, will sequentially seek a node which either represents the input within tolerance, or is free to be used as a template for the pattern presented. Only then does pattern discrimination learning take place. Further modulation varies the synaptic adjustments inversely with pattern complexity, so that complex patterns with only minor (random) variation are clumped together while simple patterns with significant variation are discriminated.
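
A highly simplified sketch of this search-and-reset cycle, in the style of binary fast-learning ART1; the choice ordering and vigilance test below follow the standard ART1 simplification and are illustrative rather than Grossberg's full differential equations:

    def art1_present(input_vec, categories, vigilance=0.8):
        # categories: list of binary template vectors (top-down LTM traces)
        candidates = sorted(range(len(categories)),
                            key=lambda j: -sum(a & b for a, b in zip(input_vec, categories[j])))
        for j in candidates:                               # winner-take-all search order
            match = [a & b for a, b in zip(input_vec, categories[j])]
            # orienting subsystem: accept only if the masked pattern covers
            # enough of the input, otherwise reset this node and search on
            if sum(match) >= vigilance * sum(input_vec):
                categories[j] = match                      # learn: template := input AND template
                return j
        categories.append(list(input_vec))                 # no node within tolerance: recruit a new one
        return len(categories) - 1

    cats = []
    for pattern in ([1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1]):
        print(art1_present(pattern, cats), cats)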

The system also captures characteristics of short-term (STM) and long-term (LTM) memory. The short-term memory resides in the dynamic activity of each layer. Long-term memory results from changes in the synapses once a stable representation of the input has been formed. A more advanced version, ART2 (Carpenter & Grossberg 1988), has three F1 fields to hold input and masked patterns separately and is capable of pattern discrimination learning of continuous analog inputs once they are converted into digital form.

The strengths of the model are that it has been developed through a mutual investigation of biological feature-detection systems and the theory of parallel distributed networks, and thus forms a relatively good model of biological systems; that it is capable of self-organization; and that it is less subject to the difficulties associated with constrained optimization. The model has been used successfully to predict a variety of aspects of central nervous behavior, from word-length effects in word recognition to a hippocampal generator for the P300 event-related potential, consistent with a role of the hippocampus in cortical memory formation. One weakness is that the system depends on digital processing of features which are pre-defined. Although the visual system has prominent detector types, the same generalizations have not been made so clearly, for example, in olfaction. Moreover there is some evidence that the structure of visual processing is at least partially dependent on the dynamics of input (Kalil 1989).

The versions of ART form one of the best model systems yet devised for active pattern recognition, and possess, at least in germinal form, many of the observed functional aspects of biological nervous function. It is thus a valid foil with which to compare chaotic models. Although the model is digital and is thus not formally capable of chaotic dynamics, its functional characteristics could lead to chaotic dynamics in continuous versions. In particular, the dipole mutual inhibition and selection of optimal nodes should lead to structurally unstable dynamics. The intervening phases of sequential search with orienting reset and stable attentional learning should result in dynamics with sensitive dependence, typified by the loss of phase coherence during unfamiliar stimuli followed by coherence once familiarization has occurred, as discussed later. Thus although the ART model is in a sense a competitor to chaotic models such as those of Freeman, to be discussed later, the two may in actuality be complementary perspectives on a common biological mechanism, both of which are simplifications of real systems.


Fig 10 (a) A Gated Dipole involves mutual inhibition between two competing excitatory pathways. Such an arrangement is characteristic of contrast-enhancement and can lead to dynamical instability. (b) The Adaptive Resonance Network involves two layered nets each with gain controls forming an attentional subsystem. Repeated search and mismatch detection governed by a second, orienting subsystem enables self-organized pattern recognition learning. The system also has short term (dynamic) and long term (synaptic) memory. (c) Siphon - Gill connections in Aplysia showing excitatory and inhibitory interneurons.

(d) Comparison with Biological Nerve Nets

Systems of biological neurons, both in simpler animals such as the sea snail Aplysia and in models of higher nervous systems such as Marr's model of the cerebellum fig 13(b), display an architecture similar to such formal nets in important ways. The architecture of biological neural systems comprises a layered parallel circuitry in which only a small number of synapses occurs between input & output. The parallel architecture is modulated by several interneuron cell types, each having distinct neurotransmitters, resulting in excitatory and particularly inhibitory feedback, and threshold modulation. In Marr's model of the cerebellum fig 13(b) (Marr 1969, Cowan 1988), several inhibitory interneurons such as basket and stellate cells act to modulate the connection between the incoming mossy fibres and Purkinje cell output via excitatory synapses between the granule and Purkinje cells, which are thus entrained by climbing fibre input. The Golgi cells then modulate the level of activity of the granule cells to prevent overload, and the other interneurons modulate the Purkinje cells to ensure recall of stored data rather than random excitations. The entrainment is believed to weaken the synapses which are entrained, so as to remove Purkinje inhibition from the cerebellar nucleus. The model has not been definitively tested for validity. It does not include the active masking and feedback of the ART configuration, and considers only long-term changes rather than dynamical features. Nevertheless the architecture of such inhibitory interneurons is capable of generating structural instability through modulation of transverse inhibitory connections.

Modular feature detection units such as the line-orientation and ocular-dominance columns in the visual cortex (Hubel & Wiesel 1979) appear to have a similar, although more elaborate, structure resulting from the 5-6 layers and the 20 or so neuron types in each unit. The chaotic model of Freeman, fig 16(b), hypothesizes a series of such parallel units linking the olfactory bulb (OB), pyriform cortex (PC) and entorhinal cortex (EC). Each of these units has a parallel architecture with a combination of positive and negative feedback, leading to dynamical behavior similar to the experimental EEG, fig 16(d).

In the abdominal ganglion of Aplysia (Kandel 1979), in which each neuron has a distinct genetically programmed function, a strongly parallel architecture connects sensory input in the siphon to motor action in the gill through a set of three interneurons, two excitatory and one inhibitory, fig 10(c). Notably, in Aplysia, identifiable master cells such as the cardiac regulator have intermittent firing patterns consistent with sensitive dependence and chaos, rather than the regular beating or bursting patterns characteristic of lower-level neurons. Facilitation is also mediated by synapto-synaptic junctions, as in fig 13(a).

Chaos may be characteristic of periodic feedback loops which in addition require sensitive plasticity to external modulation. Chaotic feedback loops may thus be a very general phenomenon, common for example to the cortico-thalamic links in the alpha rhythm, the limbic system and the orienting reaction, as well as pattern recognition in sense perception. The combination of such feedback with the mutual (dipole) inhibition characteristic of contrast enhancement leads to structural instability. Mutual global-local coupling between the dynamics of neurosystems and that of single neurons is notable in its potential to couple cellular and neurosystem instabilities (Alkon 1989).
