Under construction

Participate in building a complete model of being human: science is a community effort. The uncharted territory to explore is the holographic mind. The resources I am using to create a coherent model are the recently developed mathematics of fuzzy systems, fractals, self organizing systems, cellular automata, and neural nets. "Play" is the first experience of being self organizing and creative.

The brain is a biological neural network in which modern scientific research has discovered structural properties similar to those of a hologram.

So how did the human species evolve culture, with all the power of group learning passed down for generations outside of biological constraints? By the development of language as meta maps that allow actual brain sharing, even though each individual is still constrained by self organization to form their own implementation of these identical maps.

This is a linking of the inhibitory computational properties of the neocortex to form meta structures that have fractal properties, especially those of a Cantor dust, which can interface with "white light" holograms to separate the frequencies as a diffraction grating does. This was pointed to by the Taoists' pictorial representation of the I Ching and can be constructed from an information connectivity with the real local cosmos!

It is the purpose of this chapter to explain these statements, and of later chapters to show the consequences and implications for religion and science. There are many pieces to this jigsaw puzzle, or cards to this deck. I will not presume to teach the detailed nature of these immense subjects, but to explain how they fit together and to use the material introduced here in later chapters on religion and my model of reality. I will provide what I hope is a guide thru this forest of new research into the new view given by science: reality as complex self organizing and self regulating systems in which chaotic processes are the stuff from which life emerges.

WWW Links [under construction] Holography: explanation
Holographic Mind: biology
Fractals: explanations
Cellular Automata
Self Organizing Systems
Neural Nets: Intro


This self or person who is conscious is made of the same holograms, but has the fiction or illusion of viewing something separate: consciousness "extracts" itself from itself! This means that all the procedures used by all life forms throughout evolutionary time are also in a holographic form and are all present in our consciousness: all time is now! The development of the mathematics of neural networks opens up our understanding of how networks developed at different evolutionary times, existing "side by side" in holographic mind, conflict and interact. Humans are not the end product of evolution in the way the space shuttle is the end of development in transportation, but the sum and content of all of evolution, as if the chariot, horse and buggy, clipper ship, and steam engine were all integrated into the operation of the space shuttle. [At least then it could land on water!] This is also the paradigm of Astrology: humans receive instructions from all our mammal, reptilian, fish, and insect ancestors! So what is it to be human that is separate from those messages? It is our sharing and opening and trusting and inter-depending thru human communication. This is our self realization and spiritual resource.

I have briefly introduced neural networks in the opening of this chapter. I have used this theoretical work as a guide in understanding and shaping my study of the technical biology of the holographic brain, and subsequently, of human behavior. This follows a well known principle of the scientific method that will be discussed in subsequent chapters: completability. Present knowledge cannot contradict known facts in other fields unless it replaces them, but must be completable given the entire context of knowledge. This applies to neural networks and biology: it is known that supervised learning from behavior modification or automatic stimulus-response mechanisms is biologically impossible. We know from human behavior that competition and beauty and creativity exist. So I am looking for neural network structures that fit these facts: that are completable within these contexts. And sure enough, there are whole classes of competitive all-or-nothing neural networks, and another class that offers adaptive resonance. Both of these types are self organizing. So I will start with the details about them and the characteristics of the computational spaces that emerge from these networks.

Here is some material that first caught my interest in the literature of neural networks. I am briefly reporting the findings. [The complete text hopefully can be found on the WWW or in your library.]

Comments on excerpts from "Naturally Intelligent Systems"

by Maureen Caudill and Charles Butler

"… AI [artificial intelligence] attempts to capture intelligent behavior without regard to the underlying mechanisms producing the behavior. This approach involves describing behaviors, usually with rules and symbols. In contrast, neural networks do not describe behaviors; they imitate them."

[This is a metaprocess and again shows the difference between "thinking" and the underlying primitives of language, which are the neural networks that resonate as the meaning of the words.

Hypercubes are equivalent to the I Ching and fractal Cantor dusts as well as diffraction gratings. They can be fuzzy by having different dimensional hypercubes or I Ching patterns embedded within each other. It all ties together!]

[That human minds have the following qualities in far greater development is clear. Understand that living intelligent systems are not limited by the need to do numeric calculations or to store separate memories in holographic and fractal spaces. But it is obvious that we use these kinds of processes: competition; resonance, as in worship and the recognition of beauty; filtering; and learning rules at many levels of resolution. In the next chapter I will make the case that life can make these things available as self organizing only when we connect to the cosmos as sources for the components of these systems!]

"An autoassociative memory is one where each data item is associated with itself. In this case, a data pattern is recalled by providing part of a data item or a garbled version of the whole item. A heteroassociative memory is one in which two different data items, say A and B, are associated with each other. A can be used to recall B, and B can be used to recall A. ... Consider the case of a system used to "clean up" stereotyped pictures. We might, for example, want to give the network garbled or incomplete versions of the letters of the alphabet and receive back complete, legible versions. To do this, we would store "clean" examples of the letters A, B, and so on. Afterward, when we input a noisy or garbled F, for instance, we would get back the original, legible F pattern. As long as there is not so much garbling of the input patterns that the wrong associations are made, an autoassociative memory will clean up input patterns presented to it and return correct versions."
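This "clean-up" behavior can be sketched in a few lines of code. The following is my own minimal illustration, not the book's: it stores two bipolar patterns using a standard outer-product rule and recalls one of them from a garbled copy. The pattern values, sizes, and update scheme are my assumptions.

```python
import numpy as np

# Store two 8-element bipolar patterns as an autoassociative memory.
p1 = np.ones(8)
p2 = np.array([1, -1] * 4, dtype=float)

W = np.outer(p1, p1) + np.outer(p2, p2)
np.fill_diagonal(W, 0)           # no self-connections

def recall(x, steps=5):
    """Threshold the weighted sums repeatedly until the state settles."""
    for _ in range(steps):
        x = np.where(W @ x >= 0, 1.0, -1.0)
    return x

noisy = p1.copy()
noisy[0] = -1                    # a "garbled" version of p1
print(recall(noisy))             # [1. 1. 1. 1. 1. 1. 1. 1.] -- p1 restored
```

Presenting the garbled pattern pulls the state back to the nearest stored pattern, exactly the "clean up" the excerpt describes.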

[ Obviously, (I hope) culture uses this a lot to keep the integrity of its meanings. But if the neural networks as holoprocesses are tied to coherent sources in the cosmos, there will be "drift" where the meaning is not recognizable because the cosmos "moves". Thus trying to establish "absolute unmoving final" instructions for living or knowing "truth" does not work.]

Energy Surface Representation

[This model is used in my own model to represent how personality fits into a social structure. Although we think we are unique, the more we suppress our feelings and other areas of our "animal" intelligence, the more we participate in a cultural energy surface shared by most of those in a language group. The more we are "consistent" and obey the rules of culture, the more we are minima or solutions on a surface that may be like a lake with many holes in the bottom, producing whirlpools in which we are caught. This kind of behavior is most pronounced during the school ages and is called peer pressure, style, etc.]

"Probably the most useful construct in the study of crossbar networks is the energy surface."

[As a dimensional construct, it can exist in 3 dimensions, as slices of the higher dimensions in which it is embedded, or as a Möbius strip within a twisted dimension of broken symmetry.]

"The concept provides a useful physical analogy to the way a crossbar network stores information. Imagine that we have a supply of some soft plastic substance, say modeling clay, that we can dent by pressing it firmly with a finger. Let's suppose we want to associate the size and weight of several spheres made from different materials like lead, cotton, and wood. We spread the modeling clay in a rectangle on a large table so that it is an even 3 inches thick all over. We then label two adjacent sides of the rectangle with the ranges of the numbers we want to use. Along the side nearest us, we tape a scale with numbers ranging from zero to the maximum diameter we expect to encounter; along the side to our left, we tape another scale with numbers ranging from zero to some maximum expected weight.

For each sphere, we press our finger into the clay at the spot corresponding to the measured values of the diameter and weight, leaving a conical dent. As a reminder, we can place a slip of paper containing the name of the material at the bottom of each dent. ...

This is a slightly simplified model of the way an associative memory stores information. When we have made dents for all samples, we find that spheres that are similar in size and weight are associated. In neural network terms, memories of similar spheres are stored near each other. Given a new sphere of unknown makeup, we can easily find which of our example materials it most closely resembles in these two combined properties. If we place a marble on the surface of the clay at the spot corresponding to the size and weight of the unknown sphere, it will roll to the bottom of the nearest dent. Readers familiar with introductory physics recognize that the ball minimizes its potential energy by moving to the lowest accessible position on the surface; it seeks the nearest potential energy minimum.

If we look a little more closely at the characteristics of the clay surface, we see that there are two kinds of spots on it. If we release a marble at a place where the surface is level, it does not move. If we release the marble on a slope, it moves into the nearest dent that is downhill from it, as if the dent attracted the ball. By placing the marble at many different places around a dent, we can map out the region of influence of that dent, its "basin of attraction."

Crossbar networks have their own "energy surfaces" that are comparable to the clay surface of our example. Mathematically the crossbar's energy is analogous to the potential energy of the ball on the clay surface, but in the network, each energy value corresponds to a state of the network, that is, to a unique set of synapse weights and neurode activations. Researchers working with crossbar networks often talk about "sculpting the energy surface." They mean that they store patterns in the weight matrix so that appropriate "dents" are created in the energy surface. Just as with our clay surface, these dents have basins of attraction. Any input pattern that causes the state of the system to fall within the basin of attraction of a dent will also cause the system to recall the memory associated with that dent."
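The "energy surface" can also be computed directly. This is a hedged sketch of my own, not the book's code: it uses the standard crossbar energy E = -½·xᵀWx with an outer-product weight matrix, and shows that a stored pattern sits lower on the surface than a garbled version of it, just as the marble-and-dent analogy suggests.

```python
import numpy as np

# Build a weight matrix storing two bipolar patterns (outer-product rule).
p1 = np.ones(8)
p2 = np.array([1, -1] * 4, dtype=float)
W = np.outer(p1, p1) + np.outer(p2, p2)
np.fill_diagonal(W, 0)

def energy(x):
    """Crossbar energy of state x: the 'height' on the energy surface."""
    return -0.5 * x @ W @ x

noisy = p1.copy()
noisy[0] = -1                       # one element garbled
print(energy(p1), energy(noisy))    # -24.0 -12.0: the garbled state sits higher
```

The stored pattern is at the bottom of its dent; the garbled state is partway up the slope, inside the same basin of attraction.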

"Another complicating factor is the problem of storing patterns that are very similar to each other.

Crossbar associative memories produce the fewest recall problems when the patterns stored are orthogonal. In essence, two patterns are orthogonal if they do not overlap, that is, if they are completely distinct. Orthogonal three dimensional vectors, for instance, are all perpendicular to each other in space; they do not overlap. Of course, real-world problems are rarely orthogonal. Most of the time, we cannot be sure that the data patterns we must store are sufficiently different from each other to enable a crossbar to record them reliably. ..."
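Orthogonality is easy to test: two patterns are orthogonal when their dot product is zero. A minimal sketch (the patterns are my own examples, not the book's):

```python
import numpy as np

a = np.array([1, -1, 1, -1])
b = np.array([1, 1, -1, -1])
c = np.array([1, 1, 1, -1])

print(a @ b)   # 0 -> orthogonal: safe to store together in a crossbar
print(a @ c)   # 2 -> overlapping: recall of these two may suffer cross-talk
```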

[This problem of orthogonal separation fits very well into my model of connection with the cosmos. In this case it is the functions that derive answers that are either cleanly separable or too indistinguishable. These functions exist on different scales or levels of resolution that are correlated with the planets and their orbits.]

"There is yet another problem associated with storage and recall by crossbar associative memories. We can best discuss this problem using the energy surface analogy. When we "sculpt the energy surface" by placing the energy dents or wells where we want to create memories, we invariably and unavoidably end up adding extra energy wells we don't want. It is as though we have to walk across our clay surface to get to the place where we want to make a dent, and the process of walking across the clay leaves dents where we don't want them or smoothes out dents where we have stored data.

These extra energy wells, called "spurious minima" by researchers, cause crossbars in general, and BAMs in particular, sometimes to generate output patterns that have nothing whatever to do with any of the input patterns stored. When this happens, the imaginary marble placed on the surface has rolled into one of those extra energy wells rather than into one of the wells we deliberately produced. In the terminology of the computer world, it's "good stuff in, garbage out." …"

[Again this correlates with cultural behavior as "sin", error, and negativity. Thus as we adjust the "distance" between our functions, which are other neural networks on other fractal scales, we undo these false solutions, or reveal true solutions that have been "filled in", but create other errors at other places. We can wake up to being self correcting and self regulating, and stop trying to fix something that is what it is: not really broken, but just a cultural characteristic. This built-in production of error functions is more obvious in times like now, when the inner cities and culture in general are making many "jobs", lifestyles, and family situations into "errors". This will be discussed in later chapters.]

Adaptive Filter Associative Memories

"... The idea behind an adaptive signal filter is simply to make a system that can adjust the way it filters noise from a signal. The filter ideally will adapt to the types of noise presented to it and learn to filter the signal, removing the noise and thus enhancing the signal. ... The adaline is of great importance in our study of neural networks because it is the first network we will look at that learns through an iterative procedure. Such learning procedures are more typical of neural networks than the kind of single-pass algorithm we discovered in the crossbar networks. The adaline's iterative learning procedure is more similar to some types of animal learning than the crossbar because new patterns are not instantly stored; instead they must be presented a number of times before learning is complete."

[Iteration and recursion bring fractals into the picture.]

"The adaline also introduces a learning law that is one of the most important in the field of neural networks, the delta rule."

[Does this "rule" work with self organization? Is the training pattern done by other holograms or CA?]

Introducing the Adaline

" One of the oldest neural networks, the adaline has been around for more than a quarter-century. In its simplest form, this network consists of a single neurode along with its associated input interconnects and synapses. That single neurode can learn to sort complex input patterns into two classes. ...

The adaline ... forms a weighted sum of all inputs, applies a threshold, and in this case outputs a +1 or -1 signal as appropriate. It has one input and one modifiable synapse for every element in the expected input pattern. In addition, it has an extra input. We use this extra "mentor" input in the training process to tell the neurode what it is supposed to output for the current input pattern. We leave the weight of the mentor input at a constant value of 1.0. It does not contribute to the summed input unless the adaline is being taught, but when it is in use, we want the mentor signal to overwhelm the combined effect of all other inputs."
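The weighted sum, threshold, and mentor-driven delta rule can be sketched as follows. This is my own illustration under simplifying assumptions, not the book's implementation: the mentor wiring is replaced by simply supplying the desired output during training, a bias weight is added, and the inputs, learning rate, and epoch count are invented for the example.

```python
# A single adaline neurode trained by the delta (Widrow-Hoff) rule.
def train_adaline(samples, eta=0.05, epochs=100):
    n = len(samples[0][0])
    w = [0.0] * (n + 1)                       # last entry is a bias weight
    for _ in range(epochs):                   # iterative, multi-pass learning
        for x, d in samples:                  # d plays the role of the mentor signal
            xs = list(x) + [1.0]
            s = sum(wi * xi for wi, xi in zip(w, xs))
            err = d - s                       # delta rule uses the raw sum, not the thresholded output
            for i, xi in enumerate(xs):
                w[i] += eta * err * xi        # nudge each weight a little
    return w

def classify(w, x):
    xs = list(x) + [1.0]
    s = sum(wi * xi for wi, xi in zip(w, xs))
    return 1 if s >= 0 else -1                # threshold to +1 / -1

# Sort bipolar input pairs into two classes (a linearly separable split).
samples = [((-1, -1), -1), ((-1, 1), -1), ((1, -1), -1), ((1, 1), 1)]
w = train_adaline(samples)
print([classify(w, x) for x, _ in samples])   # [-1, -1, -1, 1]
```

Note that, unlike the crossbar's single-pass storage, the weights only settle after the patterns are presented many times, which is exactly the point the excerpt makes.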

[There were many steps in the development of neural nets, but I will skip the history and only include material pertinent to my model and real brain / mind problems. I find this book very well written even for non-mathematicians like me.]

Competitive Filter Associative Memories

"In the previous chapters we have seen two kinds of associative memories, each of which can learn the correct associations only when provided with the right response during training. There are times when such a training technique is adequate; however, there are also times when we need a system that can learn by itself. A neural network with such a capability is called a self-organizing system because during training the network changes the weights on its interconnects to learn appropriate associations, even though no right answers are provided. One of the simplest networks with this characteristic is the competitive filter associative memory, so called because the neurodes in the network compete for the privilege of learning.

Why is self-organization so important?

In the early 1960s, researchers had naive notions about the prerequisites for constructing intelligent systems. Some expected that they could just randomly interconnect huge numbers of neurodes and then turn them on and feed them data to create automatically an intelligent mechanical brain. This is nonsense. As we now know, our brains, even the brains of lizards and snails, are highly organized and structured devices. Mere random interconnection of elements will not work. And yet one observation underlying those naive notions is certainly valid: our brains can learn without being given the correct answer to a situation.

Certainly we sometimes need a tutor during learning; this is one of the reasons we go to school. And learning from a teacher or a book is often a more efficient means of mastering cognitive tasks than simple discovery. But how did you learn to move your arm or hand? How did you learn to walk? How did you learn to focus your eyes and interpret visual stimuli to gain an understanding of the physical reality around you? This kind of learning clearly occurs in all of us, and yet there is no teacher to tell us how to do it. Such learning is not taught in any traditional sense. How does it happen?

Some of the most exciting research in neural networks addresses this question: how is it possible for a neural network (such as the brain) to learn spontaneously, without benefit of a tutor?

In early days, many people postulated a little man living inside the brain, called a homunculus. The idea was that this little man acted as the decision maker/tutor/pilot for learning. The reason for this invention was simply that no one could envision a mechanism for learning that did not require some kind of tutor to be available. Of course, this explanation is not very helpful in the long run, because that means we still have to explain how the little man knows what to do. (Does the homunculus have a mini-homunculus resident in its brain for example? If not, how does the homunculus learn what to do?)

In any event, we clearly need a learning system that does not rely on predigested lessons with answers.

Self-organization and self organizing systems have thus taken on an important role in the search for biologically reasonable systems. Research into self organization has generally been concentrated on two specific kinds of networks, one relatively simple and one highly complex. In this chapter we will address the simpler kind, which has been intensively developed and investigated by Teuvo Kohonen. This Kohonen network, as it is often called, is the competitive filter associative memory, and we use these terms, as well as the descriptive phrase Kohonen feature map, interchangeably in this book.

A Self-Organizing Architecture

The competitive filter network is exquisitely simple in concept yet has some remarkable properties. In its usual form, this network consists of three layers. The input layer consists only of fan-out neurodes, which distribute the input pattern to each neurode in the middle, competitive layer. The output layer similarly receives the complete pattern of activity generated in the middle-layer neurodes and processes it in some manner appropriate to each particular application. Both of these layers are garden-variety neurode layers, with little to distinguish them from other networks.

The interesting layer is the middle, competitive layer, and we will concentrate on its operation. Neurodes in this layer have connections to the input and output layers and also strong connections to other neurodes within the layer. We have not seen such intralayer connections before in this book. Since they are central to competitive learning, it is important that we understand their function before discussing how the network learns.

Lateral Inhibition

We have previously considered one way of introducing competition among the neurodes of a neural network. The crossbar associative network, when implemented in hardware, uses feedback competition to ensure that the correct neurodes become active. In that system, the output pattern is fed back to the input during network operation.

The Kohonen network uses a different sort of competition, called "lateral inhibition" or "lateral competition." In this scheme, the neurodes in the competitive layer have many connections to each other, as well as the usual connections from the input layer and to the output layer. The strengths of these intralayer connections are fixed rather than modifiable, and are generally arranged so that a given neurode is linked to nearby neurodes by excitatory connections and to neurodes farther away by inhibitory connections. In other words, when any given neurode fires, the excitatory connections to its immediate neighbors tend to help them fire as well, and the inhibitory connections to neurodes farther away try to keep those neurodes from firing. All neurodes in the layer receive a complex mixture of excitatory and inhibitory signals from input-layer neurodes and from other competitive-layer neurodes. If properly designed, however, the layer's activity will quickly stabilize so that only a single neurode has a strong output; all others are suppressed. This kind of connection scheme is also sometimes called an on-center, off-surround architecture, a term used for biological structures that operate in the same way.

In lateral inhibition, an input is presented to all the neurodes in the competitive layer. Some of these are sufficiently excited that they try to generate output signals. These output signals are sent to the other neurodes in the layer through the intralayer connections, where they try to squash the receiver's output (an inhibitory connection) or try to assist it in firing (an excitatory connection). The result is that some of these receiving neurodes that were on the verge of firing have their activity suppressed. This strengthens the remaining neurode's outputs since the suppressed neurodes are no longer inhibiting their neighbors. Eventually one neurode's output will prove to be the strongest of all; that one neurode transmits a signal to the output layer for further processing. All other neurodes have their output suppressed in this winner-take-all scheme. A very real competition has occurred, with the strongest neurode winning the competition and thus winning the right to output to the next layer.

Several variations on this are possible. The number of neurodes that are excited and suppressed by the intralayer connections can vary, as can the values of the fixed excitatory and inhibitory weights. It is not necessary, for example, for all of these fixed weights to have the same value. Lateral inhibition has a number of subtleties of this sort that can make it reasonably complex to implement, but that are unimportant here. The point is that by using this scheme, we can enforce a system whereby the neurode with the strongest response to the input pattern is the single winner. Furthermore, we have a mechanism that makes this scheme work without having to call upon some outside mediator to decide upon a winner arbitrarily. The need for the homunculus has disappeared."
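Winner-take-all competition can be sketched numerically. In this illustration of mine (the inhibition strength and the simplified all-to-all update scheme are my assumptions, not the book's), every neurode is suppressed in proportion to its rivals' total output until only the strongest response survives:

```python
import numpy as np

def winner_take_all(activity, inhibition=0.2, steps=100):
    """Iteratively suppress each neurode by its rivals' total output."""
    a = np.array(activity, dtype=float)
    for _ in range(steps):
        rivals = a.sum() - a                      # what everyone else outputs
        a = np.clip(a - inhibition * rivals, 0.0, None)
        if (a > 0).sum() <= 1:                    # a single winner remains
            break
    return int(np.argmax(a))

print(winner_take_all([0.3, 0.9, 0.5, 0.7]))      # 1: the strongest neurode wins
```

No outside mediator decides the outcome; the competition itself selects the winner, which is the point of the excerpt's closing remark about the homunculus.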

[Within the holographic mind there is a distribution of centers of information, where each area contains the whole yet from a different viewpoint or location in phase space. This can be pictured as if the content of mind is within a room whose walls are holograms. We look at the contents from one direction, which gives us a selective view: we can best see the opposite wall and space, but can't see what is directly on or near the wall we are looking from. Using this metaphor, "animal" minds would have their viewpoint set by their species, and our animal mind or midbrain viewpoint would be set at birth. This implies that humans as a group can have multiple viewpoints, which is the paradigm implicit in Astrology. Further, the human brain or neocortex can take many positions, including a view from the ceiling which sees the entire scene.]

The network literally organizes itself based only on the input patterns presented to it, so that it models the distribution of input patterns.

"If we use a training set that is too small, we will not get a valid model of the input distribution, just as too small a population sample will give invalid survey results. Similarly we also need to choose carefully the input patterns we use for training to ensure that they are representative of the actual input pattern distribution."

[In humans, being exposed to a confined, reduced environment is abuse by deprivation: the neural nets do not form or self organize correctly. Yet when they are finally exposed, unusual results can take place, as illustrated by the story of the Buddha. He was isolated from suffering until, as an adult, he came to his own conclusions. There is evidence of retarded brain development in language areas when movements such as infants' crawling are constricted. Science itself uses a method of controlled input restriction which tests for relevance and only allows input that has a causative relationship. This method has great successes in the material sciences, but in applying it to humans and animals before self organization was even discovered, incorrect conclusions have been made. Since neural nets create their own models, there is no exterior cause to be found, and humans are not the effect of anything except insufficient or excess input-output.

The problem is of random distribution of levels of resolution in presentation of the number and type of patterns to human minds. This ensures variety and successful matching because every possibility is tried. The connection with the cosmos gives this random distribution of patterns and levels of resolution. Thus every individual is useful as it is and standardization thru human engineering defeats the structure of mind.

The connection with the cosmos directly applies to the scope and width of the inner model vectors, which have different levels of resolution. Thus a data set can have different internal representational sizes, as well as different sizes in the input, inner representation, and output. For instance, the input may have two very broad vectors while the representational map has 12 or 90 vectors, or vice versa. There may also be fuzzy or merged vectors. Emotions like fear and love fall in this category. New levels of resolution may be established, as happened during the 16th, 19th, and 20th centuries. When applied to the work context, this implies a great increase in specialties, and with the personality, an increase in fragmentation. It seems that these networks in life forms other than human connect an entire ecosystem, but that humans in the neocortex established networks of networks, or meta structures, embodied in language, which is what I understand Astrology and other "spiritual" systems to have modeled for thousands of years. Now it is as if the fragments of human consciousness have taken on the size and intelligence of insects.]

"Because of the need to use many input patterns and because we keep the nudges small so the weight vectors stay normalized, we can expect the training of this network to be fairly slow. In fact, this network needs a great deal of careful thought when designing it for an application. We need to be concerned with exactly how to normalize the input and weight vectors. The simple normalization procedure discussed earlier may not be the most appropriate method for a given problem. We need also to consider how we should initialize the weight vectors before training. The simplest, and most obvious, solution is to place them randomly; however, this may not be the best solution."

[Astrologically, this is a random fractal, and is the solution that is used by life forms. Life uses solutions that work and by natural selection hopefully evolves or develops into "best solutions". But the major way things get stuck in a "working but not best" solution is that the structure of the networks is based on resonance, which makes change difficult in the operation of life. It is almost as if resonance must be defeated before progress can be made. Hence the revolution in 500 BC of focusing on the world as suffering and labeling its cause as forms of resonance called attachment and desire! This was further amplified in the Roman Christian and Moslem ideologies, and brought to a peak in the Protestant and Puritan anti-enjoyment attitude. Why should we not see that withdrawal of nurturing can lead to change? This direction also "suffers" from stuckness in resonance, as evidenced by the first half of the 20th century and the competition of Fascism and Communism to be harshest in their removal of competition and of the resonance provided by religion! In a world where the modern, the new, and change are what resonate, there is little room for tradition, especially in love and marriage. Thus the benefits of stability and of being able to find ground states are replaced by agitation and dissatisfaction: suffering.]

"We have to assure ourselves, for example, that every neurode in the competitive layer is initialized so that it will have an opportunity to be the winner or at least to be the neighbor of a winner. Otherwise some of the neurodes may never participate in the training.

[So much or maybe all of what society does that we label good or bad has its origin in brain structure and is only elaborated in a projected form.]

These are called "dead vectors" when they occur, for obvious reasons. Appropriate initialization procedures can be quite tricky to implement. Although the network designer does not need to have a detailed mathematical form for the distribution of input vectors, he or she must have an understanding of the characteristics of the input data. Only with such an understanding can proper normalization and weight randomization be defined. And of course the training set must be carefully selected so that it accurately portrays the statistical characteristics of the overall data set. This in itself is nontrivial in many cases."
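The training loop these excerpts describe, winner-only learning with normalized weight vectors and an initialization that leaves no dead vectors, can be sketched as follows. The data, learning rate, epoch count, and hand-picked initial weights are my own assumptions for illustration; a real Kohonen map would also update a neighborhood around the winner.

```python
import numpy as np

# Two clusters of normalized 2-D inputs: near angle 0 and near angle 90.
inputs = np.array([
    [1.0, 0.0], [0.98, 0.199], [0.98, -0.199],
    [0.0, 1.0], [0.199, 0.98], [-0.199, 0.98],
])

# Both units start close enough to the data that neither is "dead".
w = np.array([[0.6, 0.8], [0.8, 0.6]])
eta = 0.1
for _ in range(100):
    for x in inputs:
        i = np.argmax(w @ x)            # competition: best-matching unit wins
        w[i] += eta * (x - w[i])        # only the winner learns (a small nudge)
        w[i] /= np.linalg.norm(w[i])    # keep the weight vectors normalized

print(np.round(w, 2))                   # one weight vector settles in each cluster
```

With no "right answers" supplied, the weight vectors drift until they model the two clusters actually present in the input distribution; a bad initialization (one unit far from all data) would instead leave a dead vector.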

[Normalization is one of the major features of connection with the cosmos where the intersections of the planes of the orbits provide "built-in" normalization. But there are 2 symmetric nodes and they drift! It will be my contention that this drift is also reflected in cultural drift which is mislabeled as history and progress.]

"There is still another important point about these networks: they are particularly useful in modeling the statistical characteristics of the input data, but the statistical models they create are only as accurate as the network size permits. A competitive layer of 100 neurodes produces a statistical model that is 10 times as detailed as one produced by a layer with only 10 neurodes but only a tenth as detailed as one from a layer with 1000 neurodes will be. The network will do its best to model the input data correctly, but the more neurodes it has available, the less area each weight vector must cover, and the more accurate the final trained network. For a perfect data set model, there would be one weight vector, or neurode, available for each possible input vector. A moment's thought reveals that this arrangement is not feasible. It is equivalent to saying that the ideal model of the input data set is the input data set itself. If we want a network capable of telling us something nontrivial about the data, we must use networks having fewer neurodes than the number of possible input vectors. For this reason, a Kohonen network will never be perfectly accurate."

[Here is the source of the problems of politics and of the democratic solution versus the single ruler with divine resonance. The structure of divine resonance is that only one person is allowed to be fully connected with their nonverbal self and whole: therefore holy! The nonverbal self as a neural network still follows the structure of "Data in - rules out", but is not inhibited by resonance with the current rules, where "current", to our inner self, means anything within hundreds (or thousands) of years, the time necessary for cultural rules to be embodied in the preverbal "divine resonance"!]

"On the other hand, there are some interesting possibilities for this kind of network. For example, suppose we train a network with a collection of input patterns and after training find that the weight vectors are clustered. We can then replace these clusters of weight vectors with single supervectors that serve to represent that cluster. As long as we keep the correct ratios of weight vectors in each cluster, we can use as few replacement supervectors as we like. For example, an initial set of weight vectors might have 100 vectors in one cluster and 200 in another. We can replace these with a single weight vector pointing to the average location of the first cluster and two vectors pointing to the average location of the second cluster. Since each weight vector corresponds to a single neurode in the competitive layer, this represents a dramatic reduction in the size of the network needed for this application.

When we have done this clustering replacement of the network, we have a smaller, more efficient network that effectively performs a data compression on the original patterns. Furthermore, we are guaranteed that this data compression scheme is statistically meaningful relative to the input data patterns. These clusters correspond to feature vectors of the input data set, and the scheme that produces them is sometimes called vector quantization."
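The excerpt's 100-and-200-vector example of vector quantization can be sketched directly. This is my own toy illustration of the idea, with hypothetical function names; the `per` parameter is the number of original weight vectors each supervector stands for:

```python
def centroid(vectors):
    """Average location of a cluster of weight vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def to_supervectors(clusters, per=100):
    """Replace each cluster with copies of its centroid, one copy per
    `per` members, so the ratios between clusters are preserved
    (100 vectors -> 1 supervector, 200 vectors -> 2)."""
    result = []
    for cluster in clusters:
        copies = max(1, round(len(cluster) / per))
        result.extend([centroid(cluster)] * copies)
    return result
```

Since each weight vector corresponds to one neurode, shrinking 300 vectors to 3 supervectors is exactly the "dramatic reduction in the size of the network" the excerpt describes.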

[This discussion relates to levels of resolution and language as supervectors. There must be a level of resolution fine enough to allow the emergence of supervectors, which I believe happens in the neocortex where there is a 20,000 to 1 ramification of connections between the midbrain and the neocortex. So how many words are needed to represent a single feeling state of the midbrain?]

As new input patterns are represented, we can relate the new inputs to the old by specifying how far away the new ones are from the nearest feature vector. If we have stored these original feature vectors somewhere, we can store the new inputs by simply saving the differences between the input and the stored feature vectors. This may not sound difficult, but for vectors with many elements, such as might be found in digital images, transmitting only the differences between the current image and some standard feature image can result in enormous efficiency improvements.
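The difference-coding scheme just described is simple enough to sketch. This is a generic illustration under my own naming, not the book's implementation: store the index of the nearest feature vector plus the (hopefully small) residual, and reconstruct by adding them back together:

```python
def nearest(features, v):
    """Index of the stored feature vector closest to v (squared distance)."""
    return min(range(len(features)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(features[i], v)))

def encode(features, v):
    """Transmit only the nearest feature's index plus the residual."""
    i = nearest(features, v)
    return i, [a - b for a, b in zip(v, features[i])]

def decode(features, code):
    """Rebuild the original vector from the index and residual."""
    i, residual = code
    return [a + b for a, b in zip(features[i], residual)]
```

For high-dimensional vectors such as digital images, the residual compresses far better than the raw vector, which is where the "enormous efficiency improvements" come from.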

[An excellent description of social language processes and of cultural rules integrated into the preverbal intelligence. But these "enormous efficiency improvements" do not include "drift", and thus become "enormous efficiency inhibitors"! The scientific method, with its understanding of the shift of paradigms, is coming to grips with this built-in problem. But it has it operating on a highly impersonal, non-biological plane of trying to isolate "Truth".]

The Topology-Preserving Map

The one application of these networks that best illustrates their usefulness is the topology-preserving map, studied extensively by Teuvo Kohonen. The easiest way to understand what a topology-preserving map means is to consider an example of how to create one. Imagine a sheet of paper and a robot arm, with a pencil in the robot's hand. Let's assume that we can move the hand to any location on the sheet of paper and have the robot make a dot. Suppose we connect sensors to the arm and hand that report back the position of the robot's arm as we make dots on the paper. When we are done making dots on the paper (and we must make many, many dots to make a valid statistical set), we have a pattern on the paper giving the distribution of locations where we placed the pencil point. Places that are very dark were visited many times and thus had a high probability of occurrence. Places that are still blank or contain few dots had a very low probability of being visited. The coordinates of each dot define the input vector for that dot. We use these vectors as input data to a Kohonen network. The competitive layer of the network is laid out as a two-dimensional grid, with connections between neighbors in rows and columns. Imagine a grid like that found on ordinary graph paper to understand the connections between the competitive layer neurodes. Suppose we make 2000 dots on the surface of the paper, feeding each dot's coordinates into the Kohonen network as training data. Let's stop every 100 dots and make a note of the network's weight vectors, for a snapshot of the state of the network at these times. Now we take the snapshots and make a series of plots of the weight vectors in the network. As we plot the positions, we draw a line between the weight vectors of nearest-neighbor neurodes, defining neighbor as a neurode that is only one column or one row away in the layer's grid.
The plot we are making connects the weight vectors of neurodes that are physically positioned next to each other in the grid. It should be clear that there is no particular reason that neighboring neurodes in the grid should have weight vectors that point anywhere near each other. Remember that initialization of the network deliberately scrambled the weight vectors before we began, so we would expect the chart we make to be a jumbled tangle of connecting lines. Figure 7.3a shows that initially we have just such a tangled mess of lines. (In the figure, we initially forced all the weight vectors to be randomly located within the upper right quadrant.)

What will the snapshots of the network look like over time? The other sections of figure 7.3 show the weight vector chart after 100, 200, and 2000 data points have been passed through the network. In this case, about half of the input patterns came from points in the upper right quadrant of the circle, and the remaining input patterns were about evenly divided between the upper left and lower right quadrants. Notice that as the number of input patterns increases, there are fewer and fewer lines crossing the center of the circle and that the edges of the plot become closer and closer to an actual circle. This indicates that the physical ordering of the weight vectors over time becomes organized according to the characteristics of the input data. In other words, if a neurode's neighbor has a weight vector pointing in a particular direction, the neurode itself very likely has its weight vector point in a similar direction. The jumbled mass of lines is gone, replaced with an orderly mesh. It is as though the weight vectors form a stretchy fishnet that begins as a crumpled, tangled ball and tries to conform itself to the shape of the input pattern distribution, with more mesh intersections where input patterns are more likely and fewer mesh intersections where they are less likely. It turns out that no matter what the input pattern distribution is, the network will organize itself so that the weight vector fishnet stretches and twists so that it makes a reasonably good mapping of the input pattern distribution.
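The whole robot-arm experiment fits in a few dozen lines. The following is a minimal sketch of a Kohonen-style map, with parameter values (learning rate, neighborhood radius, decay schedule) chosen by me for illustration rather than taken from the book. The key line is the neighborhood pull: the winner drags its grid neighbors' weight vectors along, which is what untangles the "fishnet":

```python
import math
import random

def train_som(points, rows=5, cols=5, epochs=20, lr=0.5, radius=2.0, seed=0):
    """Self-organize a rows x cols grid of 2-D weight vectors to the
    distribution of `points`. Returns a dict mapping grid position
    (row, col) to its weight vector."""
    rng = random.Random(seed)
    w = {(r, c): [rng.random(), rng.random()]
         for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        # shrink the learning rate and neighborhood over time
        a = lr * (1 - epoch / epochs)
        rad = max(0.5, radius * (1 - epoch / epochs))
        for p in points:
            # winner: neurode whose weight vector is closest to the input
            win = min(w, key=lambda k: (w[k][0] - p[0]) ** 2
                                       + (w[k][1] - p[1]) ** 2)
            for (r, c), v in w.items():
                d = math.hypot(r - win[0], c - win[1])    # distance in the grid
                h = math.exp(-(d * d) / (2 * rad * rad))  # neighborhood pull
                v[0] += a * h * (p[0] - v[0])
                v[1] += a * h * (p[1] - v[1])
    return w
```

Plotting lines between the weight vectors of grid-neighboring keys of the returned dict reproduces the figure-7.3 sequence: a tangle at first, an orderly mesh after training.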

[This is another reason why life forms that modeled and tracked the cosmos, for use in initialization and normalization as well as resonance, became successful. The use of cycles of the cosmos self-organizes into models of the cosmos that track the cosmos!]

Furthermore, we can experiment and connect the neurodes in the competitive layer in a simple linear array instead of a grid, with each one connected only to the neurodes before and after it in the line. If we do this, the weight vectors behave as if they were a ball of twine, and their distribution after training becomes like a string twisting along the input vector pattern distribution. These plots are topology-preserving maps because the topology, or shape, of the distribution of the input patterns in coordinate space is preserved in the physical organization of the weight vectors. Topology-preserving maps do not necessarily have to map physical locations. They can map frequencies, for example. A common name for a topology-preserving map when the input data corresponds to sound frequencies is a tonotopic map. In this case, the map represents an ascending or descending set of frequencies, and the neurodes are sensitive to a graduated scale of frequencies. In other words, the weight vectors of neighboring neurodes point to neighboring frequency inputs.

Because the robot arm example is truly a plot of spatial distributions, we call it a geotopic map. This may sound like a somewhat wild-eyed, and perhaps even useless, trait, but in fact topology preserving maps exist in animals. It is known, for instance, that certain structures in the brain that form part of the auditory system are physically organized by the acoustic pitch or frequencies they respond to. Quite literally, tonotopic maps exist in the brain for sound inputs. In addition, there appear to be other such abstract maps existing in the brain for such purposes as geographic-location mapping, such as retinotopic maps in vision and somatotopic maps in the sense of touch. For example, rats that have been trained in a maze have certain spatially ordered brain cells that fire when they are in a particular location in the maze. Such spatial ordering cannot possibly have existed in the animal before training unless we argue that it is that rat's destiny to learn that particular maze. Some mechanism must exist that allows the physical structure of the brain to modify during learning so that the neurons order themselves according to the layout of the maze. While competitive learning may not be exactly correct as a mechanism for this process, it certainly offers an elegant, simple model of how this might occur.

Why does the competitive filter network preserve input data topology?

The fishnet analogy is quite apropos. As Kohonen has described, there are two forces working on the weight vectors. First, the vectors are trying to model the probability distribution function of the input data. Second, their interconnections are also trying to form into a continuous surface because of the synaptic links between each neurode and its neighbors. These different forces establish the model of the input data that we have seen. In other words, when each winning neurode adjusts its weight vector in the direction of an input vector, it pulls its neighbor's weight vectors along with it. Therefore, after training, the weight vectors have formed a more or less continuous surface. Finally, the continuity of the maps means that the trained network has the ability to generalize from its specific experiences to process novel data patterns.

... Let's now examine more complex learning systems, ones that are even more directly modeled after biological systems. In the next part we explore some of these biologically based learning systems.

Application: The Voice Typewriter

... accuracy is only marginally adequate when working in the large vocabulary, speaker-independent mode.

[This book was written in 1990 and much has happened in this field since!]

[Learning is of grave concern to our present culture. We have implemented systems in schools based on classical conditioning and behavior modification research. This research started with birds and animals without an understanding of even what the possible differences are between humans, with a shared mind that can be supervised, and animals, with only self-organization. In effect, science imposed an unexamined bias onto animal intelligence. The structure of this bias emerged from European child-rearing practices, which are big on reward and punishment that do not exist in the animal kingdom. Since the learning of rational knowledge is a subset or lower dimension of the holographic mind and holoprocesses, humans have projected this "lower intelligence" onto animals, children, and women. So I will use much material from the learning chapters of this book and hope to correlate it with social and personal problems around learning and mental health. The correlation of all this material with the cosmos is still hypothetical, but fruitful in opening to the possible ways life has connected itself to the cosmos. By studying this one will not predict any Fate or Future, but open to the way cultural practices have set humans up to believe in fate and predestination by mere association with supposed laws of the influence of the "Stars", as in "It is written in one's stars". Our neural networks like structures and rules, and can set up shared expectations as if they were laws, when in fact they are self-fulfilling prophecies. Self-fulfilling and self-referencing expectations are like computer viruses that eat away our natural defense of testing and trying new alternatives and being self-organizing.]

pg. 113

Part III

A learned blockhead is a greater blockhead than an ignorant one. --Ben Franklin

Learning without thought is useless; thought without learning is dangerous. --Confucius


Neural networks are trained, not programmed; they learn. We have already seen two distinctly different types of learning in the adaline and the Kohonen feature map. The subject of learning and memory in artificial systems is so important, however, that we need to consider it in a more structured manner.

We will start this introduction to learning in artificial systems by looking at the ways animals and people learn and the kinds of memory that have been identified in humans by psychologists. We will then be in a position to relate these to the ways neural networks learn and remember. Finally, we will adopt a more operational view and discuss the major methods for training artificial systems.

[This material is covered by many sources from traditional academic experimental psychology. I do not agree with the epistemology or ontology of this direction. I conceptualize the results of these areas of human endeavor as projections that lend application and support to the politics of fascism and other forms of repression, which take it as proven that intelligence is instrumental and mechanical. This neocortex inhibition projection asserts that intelligence is not part of a mechanical universe, but something to be manipulated. We are seeing the consequences of this manipulation in the violence of our present school systems. Every person I have met and questioned who is attending school has reported and confirmed an awareness of these major abuses of their true nature!

I will skip to the material on the instar and outstar, which can begin to model holographic processes. The input material can be transformed into light and dark patterns, which essentially store information as interference patterns or convolution spaces.]

Types of Learning

Learning has taken place, in an animal or in a neural network, when there is some lasting behavioral change or increase in knowledge resulting from an experience. For our purposes, we can break learning in animals and humans into three broad classes: instrumental conditioning, classical conditioning, and observational learning.


pg. 119

[I have presented much material that I do not accept as an accurate model of how the holoprocesses operate, but do accept as "rational cultural" processes that can be changed and reframed. This includes assumptions about the need to change synaptic weights in order for learning to take place, as well as classical conditioning and behavior modification. With state-of-the-art theories I can agree that science doesn't "know any better", but I am using a different model of shared holomind, fractals, etc., which sees neural-net connectivity applied to holoprocesses.]

Learning models in neural networks are rules or procedures that tell a neurode how to modify its synaptic weights in response to stimuli.


Training a Neural Network


pg. 123

Those seeking a new neural network design often adopt the ideas of biologists or psychologists.


The neohebbian model accounts for the fact that biological systems not only learn but also forget.


Differential Hebbian Learning

In our discussion of simple hebbian learning, we had to introduce two features found in biological systems that are necessary for proper operation of a neural network: the possibility of both decreasing and increasing weights during learning and the presence of inhibitory as well as excitatory synapses. ... In mathematical terms, the expression "rate of change" refers to the derivative of a neurode's output with respect to time.
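These two ideas, forgetting and rate-of-change learning, can each be written as a one-line update rule. The sketches below are my own simplified discrete-time forms, not Grossberg's or Klopf's exact equations; `pre` and `post` are activities, and `d_post` stands in for the derivative of the neurode's output:

```python
def neohebbian_step(w, pre, post, lr=0.1, decay=0.05):
    """Neohebbian sketch: correlated activity strengthens the weight,
    while a passive decay term makes unused weights 'forget'."""
    return w + lr * pre * post - decay * w

def diff_hebbian_step(w, pre, d_post, lr=0.1):
    """Differential hebbian sketch: learning is driven by the rate of
    change of the postsynaptic output, so a falling output can
    decrease the weight as well as a rising one increase it."""
    return w + lr * pre * d_post
```

The decay term supplies the decreasing weights, and the sign of `d_post` supplies the inhibitory-versus-excitatory direction, the two features the text says biology requires.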


pg. 127


[I am not concerned with models of Classical conditioning because I surmise that it is a political use of science to justify what we now know to be dysfunctional social practices supported by Christian traditions and models of the universe based on Heaven and Hell. But as is often done in science, new discoveries may be developed that are misapplied. This does not change the mathematical correctness of the model. I see these models of animal behavior working in a "schizoid" context of a laboratory setting or in a human environment which is invested in splitting off our "animal" intelligence and thus our connection with holoprocess in favor of cultural values producing rational behavior devoid of natural fractal complexity and self organization. In fact, in the myth of the Garden of Eden, self organization is represented as rebellion against God inspired by the reptilian brain!]

pg. 130

The Instar and Outstar

Let's begin our discussion of the outstar by looking at the neurode from a new, geometric perspective. We describe here only those minimal characteristics needed to understand outstar learning. In Grossberg's work the concepts of instar and outstar imply much more than the simple physical structure we outline. We know that each neurode in a neural network receives input from hundreds or thousands of other neurodes. Thus, each neurode is at the focus of a vast array of other neurodes feeding signals to it. In three dimensions, this construct resembles a many-pointed star with all its radii directed inward. Stephen Grossberg terms this an "instar." From another, equally valid point of view, each neurode is a hub from which signals fan out to a vast array of other neurodes, since each neurode sends its output to hundreds or thousands of others. Grossberg, reasonably enough, calls this an "outstar."

Every neurode in any neural network is, at the same time, both the focus of an instar and the hub of an outstar. Thus, a neural network can be viewed as a highly complex, interwoven mesh of these structures, with the inwardly feeding inputs of each instar arising from the outwardly directed signals of other outstars. In a properly designed network, this complicated arrangement does not result in chaos. In fact, it is precisely this complex mesh in which the ever-changing activity takes place that generates the behavior characteristic of neural networks.

So far, we have not mentioned the synapses that we know lie at the end of every interconnect in both the instar and outstar. In an instar, the synapses form a tight cluster about the input end of the focus neurode. In an outstar, there is a synapse where each interconnect terminates at one of the outer, or "border," neurodes. If we could in some way make the weights on these synapses visible during learning, we would see a beehive of activity, with some weights tending upward, others tending downward, and yet others staying nearly constant.

We can use this instar-outstar concept to understand how a neural network can learn complex patterns. First let's consider how a network of instars and outstars might learn a static spatial pattern, one that does not change in time.

Outstar Learning

Let's build a small network that consists of a single neurode, acting as an outstar, connected to an array of neurodes that act as instars. For this network, we need to use only two inputs on each instar neurode: one from the outstar neurode that has an adjustable weight and a training input that has a fixed synapse weight of 1.0.

Imagine that we cluster the instar neurodes together into a two-dimensional grid, similar to the pixel grid that makes up the image on a computer monitor. (A pixel is the smallest element of light or dark a monitor can display.) So that we can more easily visualize the operation of the outstar, we assume that we have arranged a way to make the output of each neurode visible. We see a tiny spot of light proportional to the output of each grid neurode. Thus whenever a grid neurode has an output near 1, we get a very bright spot on the grid at that neurode's position, and whenever a neurode has an output near zero, little light is emitted. In between these limits, the light output varies with the neurode's output. If we use enough neurodes in the instar grid, it will be able to produce an image much like that appearing on a computer screen. Finally, we will place a threshold on each of the incoming signals from the outstar neurode so that only stimuli that are at least as strong as the threshold value will be perceived by the grid neurodes; any smaller stimuli will be ignored. This threshold will suppress random noise firings in the network.
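A toy version of this outstar-and-grid arrangement is easy to simulate. In the sketch below (my own simplification, with illustrative learning rate and threshold values), the grid's activity is held at the target pattern by the training inputs while each adjustable weight drifts toward the activity of its grid neurode; afterward, firing the outstar alone reproduces the pattern:

```python
def train_outstar(weights, pattern, lr=0.2, steps=25):
    """While the fixed-weight training inputs impress `pattern` on the
    grid, each outstar weight moves toward its grid neurode's activity."""
    for _ in range(steps):
        weights = [w + lr * (p - w) for w, p in zip(weights, pattern)]
    return weights

def recall(weights, threshold=0.3):
    """Fire the outstar with no training input: each grid neurode's
    brightness is the learned weight, with sub-threshold signals
    suppressed as noise."""
    return [w if w >= threshold else 0.0 for w in weights]
```

The grid here is flattened to a list for brevity; a real display would reshape it into rows and columns of "pixels".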

[Instead of arriving at a picture that duplicates the input in space, an interference pattern is constructed that is independent of space; otherwise we are starting with elements that depend on exact size, distance, and other spatial characteristics that are not relevant.]


pg. 135


[Drive reinforcement is one of the best examples of total dysfunction that creates a belief in, and grasping of, cause and effect as unalterable real processes. Yet, as stated, the process I am investigating is how living systems generate various networks that can produce subsets of holoprocesses. We produce hierarchies of processing the results of other processes until we have lost sight of our original holoprocess and shared brain (One Mind). This is a recurring theme in religions and among spiritual teachers, of whom I am a modern example. My spiritual insight emerges from and is informed directly by the holoprocess and millions of years of animal intelligence, which has changed little over the mere five thousand years of recorded spiritual leaders. The difference is that I have the resources of science, but the task of reframing both modern science and religious traditions is the same as that of other "reformers". Now reform is built into the system, is expected, and is reframed as progress. Isn't this one of the spiritual blessings of the age we live in?]


Drive-Reinforcement Theory


Even the drive-reinforcement network, however, still looks at the world only one step at a time. If we truly want naturally intelligent systems, we must do more than process a series of single moments; we must deal with continuity of actions and events. We must be able to handle patterns that change in time rather than just patterns that are static; in the next chapter we look at some ways neural networks can handle such sequences of patterns.


[The following excerpt is concerned with learning sequences and, in its simple goal, with learning alphabets. I see this as working with fractals and dimensions of fractals by creating dependencies: predefined "moves" must be present before the calculation continues. Thus I see this as a process of computationally iterated procedures rather than one applied to learning data. It is like a combination lock: it doesn't open until all conditions are met. When applying this to the connection with the cosmos, I hypothesize a topology of time: the next state of a computation may be completed in the next moment, or in a month, or in a year. And it may be completed without the "question" having been asked, or without alerting the "self" [beyond awareness] that anything happened, so that the answer seems to appear suddenly in a totally unrelated context, like a dream or preparing food. Many such occurrences are recorded in Buddhist literature, and in fact Zen training such as the koan develops just such awareness.]

pg. 141-149

There is nothing permanent except change. --Heraclitus

Learning Sequences of Patterns


The Music Box Associative Memory

... The "music box" name for this technique derives from its similarity to the frozen sequence of notes that a music box plays. The sequence can be repeated indefinitely, and there is no variation or alteration; each replay is exactly the same as the one before. This style of associative memory operation is also called the "tape recorder" mode, for obvious reasons.

Single-Neurode Systems

We can make a system that recalls sequences of patterns with a much simpler layout, however. In fact, Stephen Grossberg has shown that a single outstar neurode can initiate and sustain a lengthy spatiotemporal sequence. ...

The Outstar Avalanche

We can use the outstar learning model to build a network that can learn pattern sequences by making one basic change to the model: we must use neurodes that are slow to lose their activity once they become activated.

... For the network, we use an input, middle, and output layer. The input layer provides fan-out. It transmits every element of the initiating or "trigger" pattern to every middle-layer neurode and starts the replay of the sequence of letters. Thus, each input neurode is the hub of an outstar.

The output layer comprises the grid that displays the appropriate letters. Its purpose is thus the same as the corresponding layer of the static outstar network. Its structure is also the same: each grid neurode is the focus of an instar beginning on neurodes in the previous layer. In the static network, there was only a single neurode in the previous layer; in this case, however, there are many. Also as in the previous network, these output neurodes each have an additional external input used during training to impress the pattern for each letter on the output grid.

The middle layer is the most complex. These neurodes receive inputs from the input layer and from some number of other neurodes within the middle layer itself. We will be a bit vague about how many other neurodes each middle layer unit connects to, but it should be some small number, say between 1 and 10. These intralayer connections are essential for the correct operation of the network.

Now we need only two more things before describing the way a trained system works. First, although the operation of the avalanche network is continuous, it will be helpful in our description if we break time into short intervals as with the crossbar network. Second, we need to indicate the size of the activation decay constant of the input- and middle-layer neurodes. ...

Operation of the trained avalanche is exquisitely simple in concept. The neurodes in the middle, avalanche layer are trained to fire only if they receive a stimulus from the currently active neurode and if the previously firing neurodes in the sequence are still at least partially active. Each succeeding neurode can thus be triggered only if the correct combination of stimuli is received at the correct time. For example, if the correct sequence of neurode firings in the middle layer is 1, then 5, then 3, then 6, the network is set so that neurode 3 will not fire unless it sees stimulating activity from neurode 5 and at least partial activity from neurode 1. Neurode 6 will not fire unless it sees stimulating activity from neurode 3 and partial activity from neurode 5, and so on. The intralayer connections enforce the temporal relationships between the avalanche layer neurodes.

There are a number of points to note about the operation of this network. For instance, if the middle-layer neurodes are excited in the wrong order or accidentally stimulated with noise, any resulting spurious activity in the layer soon dies out, and the process continues as if nothing happened. Also, for complex patterns, we can require more than one neurode to be active for the pattern to continue. We can also arrange for operator interaction. At any point in the process, for instance, we can require a reinforcing command from the input layer in order for the recollection to continue; thus, for instance, a single input prod will not necessarily cause the network to run through the entire alphabet. Finally, we can store many sequences with a relatively small number of neurodes since it is their temporal relationship during stimulation that determines whether or not they activate.
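The firing condition in the 1, 5, 3, 6 example can be captured in a toy simulation. This is my own sketch of the mechanism, not the book's code: `rules` plays the role of the trained intralayer connections, mapping each neurode to the pair (predecessor, earlier neurode) it must see, and the thresholds (0.5 for "stimulating activity", 0.2 for "partial activity") are illustrative choices:

```python
def run_avalanche(rules, trigger, steps, decay=0.6):
    """Replay a trained avalanche layer. Activity decays slowly each
    tick, so recently fired neurodes stay 'partially active' and
    encode the temporal order of the sequence."""
    activity = {trigger: 1.0}
    fired = [trigger]
    for _ in range(steps):
        for n in activity:              # slow loss of activity
            activity[n] *= decay
        nxt = None
        for n, (pred, earlier) in rules.items():
            if n in fired:
                continue
            # fire only on stimulating predecessor + partially active earlier
            if activity.get(pred, 0.0) >= 0.5 and \
               (earlier is None or activity.get(earlier, 0.0) >= 0.2):
                nxt = n
                break
        if nxt is not None:
            activity[nxt] = 1.0
            fired.append(nxt)
    return fired
```

Triggering the layer mid-sequence (say, at neurode 3) fires nothing further, because neurode 6 never sees partial activity from 5; this is the property that spurious activity "soon dies out".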

[Time dimension fractals: The length or scale of the line is the duration and the interruptions and direction are the actions? Here are I Ching ground states and transitions?]


These are fundamental properties of voluntary behavior; we can stop and start such behaviors at will.


Recognizing Sequences of Patterns


pg. 149

[So on to self-organization, or as it is called, "Buddha Mind". I "cannot" resist (actually, I don't want to) including an excerpt concerning a very well-known Buddhist teacher of hundreds of years ago, Naropa. "Tilopa sang this song of his oral instructions in which the meaning of supreme goal-realization is condensed:

Naropa, you are a worthy vessel: In the lamasery of Pullahari In the spacious sphere of radiant light, ineffable, The little bird of mind as transference has risen high by its wings of coincidence. Dismiss the craving of belief in an ego.

In the lamasery of non-dual transcending awareness, In the offering pit of the apparitional body By the fire of awareness deriving from the bliss and heat of mystic warmth The fuel of evil tendencies of normal forms of body, speech, and mind has been consumed; The fuel of dream tendencies has been burnt up. Dismiss the craving for the duality of this and that.

In the lamasery of the ineffable, The sharp knife of intuitive understanding Of Great Bliss, of Mahamudra, has cut the rope of jealousy in the intermediate state. Dismiss the craving that causes all attachment.

Walk the hidden path of the Wish-Fulfilling Gem Leading to the realm of the heavenly tree, the changeless. Untie the tongues of mutes. Stop the stream of Samsara, of belief in an ego. Recognize your very nature as a mother knows her child.

This is transcendent awareness cognizant in itself, Beyond the path of speech, the object of no thought. I, Tilopa, have nothing at which to point. Know this as pointing in itself to itself.

Do not imagine, think, deliberate, Meditate, act, but be at rest. With an object do not be concerned. Spirituality, self-existing, radiant, in which there is no memory to upset you cannot be called a thing.

Naropa then said that action which is free from all bias had been fully understood.

Naropa had imbibed all the qualities that were to be found in the treasure-house of Tilopa's mind. He had realized the twelfth spiritual level and he expressed his intuitive understanding in the words:

One need not ask when one has seen the actuality, The mind beyond all thought, ineffable, unveiled; This yoga, immaculate and self-risen, in itself is free. Through the Guru's grace highest realization has been won, One's own and others' interests fulfilled. Thus it is."

I will leave it to the reader to mull this over until later chapters, where I can fully explain what is being discussed. Samsara is the word used to point to our so-called consciousness of reality, conditioned by our cultural and personal biases. It is not an "evil" state; it is the awareness and the programs of daily life. It is what you are experiencing at this very moment: the act of reading, the visual and other sensory surroundings, as well as your internal voices. But since it is derived from, or a subset of, a holoprocess, it is also nirvana.]

pg. 151-153

Who learns by Finding Out has sevenfold
The Skill of him who learns by Being Told.
--Guiterman

12 Autonomous Learning

We have discussed several important learning models in this part. Let's now step back and take one last, broader look at autonomous learning systems. We will distinguish autonomous learning from the more general unsupervised learning of, say, the competitive filter associative memory by the following characteristic: an ordinary unsupervised learning system learns every input pattern, whether or not it is important; the only way to prevent an input pattern from being learned is temporarily to disable (turn off) learning. An autonomous system, on the other hand, can learn selectively; it learns only "important" input patterns. As a result, learning can be enabled (left on) at all times.
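The ordinary case can be made concrete with a toy winner-take-all learner. This is a minimal sketch of my own, not from the text: the class name, the Euclidean matching rule, and the learning rate are illustrative assumptions, and real competitive filter networks are more elaborate. The point is only that the weight update fires on every input, so the only protection for stored knowledge is to switch learning off externally.

```python
import numpy as np

class CompetitiveFilter:
    """Toy ordinary unsupervised learner: EVERY input moves the winning
    prototype, so the network absorbs the statistics of whatever data
    stream it is shown, balanced or not."""

    def __init__(self, prototypes, rate=0.5):
        self.w = np.asarray(prototypes, dtype=float)  # one prototype per row
        self.rate = rate                              # learning rate

    def present(self, x):
        x = np.asarray(x, dtype=float)
        # winner = prototype closest to the input (Euclidean distance)
        winner = int(np.argmin(np.linalg.norm(self.w - x, axis=1)))
        # learning is always on: the winner drifts toward the input
        self.w[winner] += self.rate * (x - self.w[winner])
        return winner
```

Feeding a skewed stream (say, the same pattern over and over) drags the winning prototype toward it, which is why the text insists on a balanced, carefully schooled training set.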


Characteristics of Autonomous Learning Systems

The competitive filter associative memory is capable of ordinary unsupervised learning; for example, it can learn the statistical properties of its input data set without a tutor. But we must provide the network with a carefully controlled schooling experience for it to learn correctly. For instance, we must arrange for the learning data set to be a balanced and rich representation of the real-world data we expect the network to experience in operation. What are the characteristics we would like to build into a system capable of truly autonomous learning? We suggest the following, based on a list originally set forth by Gail Carpenter and Stephen Grossberg.

1. The system functions as an autonomous associative memory; it organizes its knowledge into associated categories with no help from us and reliably retrieves related information from partial or even garbled input cues.

2. It responds rapidly when asked to recall information or to recognize some input pattern it has already learned. This means that the system utilizes parallel architecture and parallel search techniques.

3. Since it must function in the real world, the system learns and recalls arbitrarily complex input patterns. Further, it places no mathematical restrictions on the form of those patterns. A mathematician would say that the input does not need to be orthogonal or linearly separable, for instance.

4. The system learns constantly but learns only significant information, and it does not have to be told what information is significant.

5. New knowledge does not cover or destroy information that the system has already learned.

6. It automatically learns more detail in a particular associative category if feedback information indicates that this is necessary. The autonomous system may suddenly begin treating as significant some input that it had previously been ignoring, or vice versa.

7. It reorganizes its associative categories if new knowledge indicates that its present ones are inefficient or inadequate.

8. It can generalize from specific examples to general categories.

9. Its storage capacity is essentially unlimited.

These are ambitious requirements, but some of the neural network designs we will describe display almost all of them. Let's explore each requirement in more detail to see what it might mean in the operation of an autonomous system.
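Requirements 4 and 5 (learn only significant information; do not destroy old knowledge) can be caricatured with a vigilance-gated learner in the spirit of Carpenter and Grossberg's adaptive resonance idea. This is a hedged sketch, not their algorithm: the cosine match, the vigilance value, and the class and method names are my own illustrative assumptions; real ART networks use a different matching and resonance cycle.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors (0 when either is zero)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

class SelectiveLearner:
    """Toy autonomous learner: learning stays enabled at all times, but an
    input only updates memory when it matches an existing category well
    enough (refinement) or is novel enough to found a new category, so old
    categories are never overwritten by unrelated inputs."""

    def __init__(self, vigilance=0.9, rate=0.2):
        self.vigilance = vigilance  # match threshold, the "importance" gate
        self.rate = rate            # learning rate for prototype refinement
        self.prototypes = []        # one stored vector per learned category

    def present(self, x):
        x = np.asarray(x, dtype=float)
        if not self.prototypes:
            self.prototypes.append(x.copy())
            return 0
        sims = [cosine(x, p) for p in self.prototypes]
        best = int(np.argmax(sims))
        if sims[best] >= self.vigilance:
            # familiar pattern: refine only the matching category
            self.prototypes[best] += self.rate * (x - self.prototypes[best])
            return best
        # novel pattern: learn it as a NEW category, leaving the old ones intact
        self.prototypes.append(x.copy())
        return len(self.prototypes) - 1
```

Raising the vigilance makes the system carve finer categories (requirement 6's "more detail"); lowering it makes it generalize more broadly (requirement 8); and because mismatched inputs open new categories rather than overwrite old prototypes, prior knowledge survives (requirement 5).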

Neural net theories are continued here!