A Dissertation Submitted to the Faculty of the Graduate School of Bowie State University in Partial Fulfillment of the Requirements for the Degree of DOCTOR OF SCIENCE Department of Computer Science by Kenneth MBALE
September 28th, 2017
BOWIE STATE UNIVERSITY THE GRADUATE SCHOOL DEPARTMENT OF COMPUTER SCIENCE
DISSERTATION COMMITTEE: ___________________________________________ Dr. Darsana Josyula, Ph.D., Chair ___________________________________________ Dr. Seonho Choi, Ph.D. ___________________________________________ Dr. Joseph Gomes, Ph.D. ___________________________________________ Dr. Joan Langdon, Ph.D. ___________________________________________ Dr. Donald Perlis, Ph.D. – UMD College Park, External Examiner
Candidate: Kenneth MBALE Date of Defense: September 28th, 2017
ABSTRACT
Title of Dissertation:
Behavior Oriented Intelligence
Kenneth MBALE
Dissertation Chaired by:
Dr. Darsana P. Josyula, PhD Department of Computer Science Bowie State University
Intelligence is the ability to acquire behavior through observation of the environment, including other individuals, and to select the correct behavior in response to stimuli emanating from the environment. In this dissertation, we describe the Behavior Oriented Intelligence framework, with a focus on the abstract data type that supports the knowledge base of a behavior oriented artificial intelligence.
ACKNOWLEDGEMENTS
I would like to express the deepest appreciation to my committee chair Dr. Darsana P. Josyula, who has the attitude and the substance I aspire to exemplify. Her passion and energy for the field of Artificial Intelligence powered my own drive to study, research and write. Her patience gave me the opportunity to learn and understand this important field at my own pace. Without her guidance, this dissertation would not have been possible. I also would like to express gratitude to my fellow students in the BSU Autonomous Systems Lab for their support, patience, and assistance in developing and testing the results incorporated within this dissertation:
Derrick Addo Tagrid Alshalali Christion Banks Revanth Baskar Jesuye David Joseph Emelike Daryl Freeman
Anthony Herron Benyam Mengesha Francis Onodueze Archanaben Patel Paul Sabbagh Christopher Stone
TABLE OF CONTENTS
ABSTRACT
ACKNOWLEDGEMENTS
List of Tables
List of Figures
CHAPTER 1: INTRODUCTION
  1.1 Background
  1.2 Statement of the Problem
  1.3 Purpose of the Study
  1.4 Significance of the Study
CHAPTER 2: LITERATURE REVIEW
  2.1 Biological Learning of Behaviors
  2.2 Techniques for Emulating Behavior
  2.3 Frameworks for Emulating Behavior
  2.4 Using Metacognition to Improve Behaviors
  2.5 Time-Series Analysis and Prediction
  2.6 Inter-Agent Communication Interfaces
CHAPTER 3: METHODOLOGY
  3.1 Behavior Oriented Framework
    3.1.1 Stimulus-Response Behavior Model
    3.1.2 Perception Processing and Organization
    3.1.3 Behavior Formation and Learning
    3.1.4 Cognition Cycle
  3.2 The Kasai
    3.2.1 Structure of Data Series
    3.2.2 Dynamic Analysis of Data Series
    3.2.3 Design of The Kasai
    3.2.4 Kasai Network
The original and fundamental focus of Artificial Intelligence is to create an artificial system that emulates the reasoning ability of human beings. Since natural intelligence is the result of biological evolutionary processes, one avenue of research to solve this problem is to examine the biology of intelligent animals. Nature creates intelligence as the output of a biochemical process that produces an autodidactic organism. Situated within its environment, the intelligent organism observes, explores, and experiments with its environment, detects patterns of actions and circumstances, and internalizes some of these patterns. The organism applies the internalized patterns in response to stimuli in the environment, generally to achieve the maximum reward based on its own definition of value. A behavior is an internalized pattern with a determinable reward. Learning is the process that allows the organism to know which behavior will result in the most reward in response to specific stimuli.
Figure 1.1 - Observation, Pattern, Behavior (diagram relating observations, captured through sensors and actuators, to patterns, behaviors, utility, learning, and determinism; controllable and non-controllable characteristics and the ordering mechanism appear as attributes of the observation series)
This research describes a design for an artificial intelligence (AI) that acquires behavior from observations. The observations report the state of the environment, including the individual itself and other actors. The series of observations consists of characteristics controllable by the individual through its actuators and non-controllable characteristics that are a function of the environment itself or of other participants. The series of observations has an ordering mechanism, such as time, dimension, or spatial position. A pattern is a series of observations. Some patterns are candidate behaviors. The reward of a pattern may not be known, or it may change over time. Learning is associating the behavior with stimuli. Intelligence is choosing, from several candidate behaviors, the one that maximizes reward. The dissertation focuses on the capture and organization of observations that enables an AI to discover intelligent behavior from the observations. The main contributions of this dissertation are:
• A framework for acquiring behavior through observation;
• The reference design of a behavior oriented agent;
• The abstract data type for capturing and organizing time series.
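To make these definitions concrete, the following sketch shows one way an observation series, its ordering mechanism, and its controllable and non-controllable characteristics could be represented (an illustrative data model only; the class and field names are assumptions, not the GPME implementation):

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class Observation:
    # One sample of the environment; order_key is the ordering mechanism (e.g. a timestamp).
    order_key: float
    controllable: Dict[str, float] = field(default_factory=dict)      # set through the individual's actuators
    non_controllable: Dict[str, float] = field(default_factory=dict)  # produced by the environment or other actors

@dataclass
class Pattern:
    # A pattern is a series of observations; it becomes a candidate behavior when its reward is determinable.
    observations: List[Observation] = field(default_factory=list)
    reward: Optional[float] = None  # may be unknown, and may change over time

    def is_candidate_behavior(self) -> bool:
        return self.reward is not None
```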
1.1 Background

Artificial intelligence research has taken two seemingly divergent paths: symbolic and
connectionist [1]. In the symbolic (or declarative) paradigm, intelligence is the result of the application of a set of rules. In the connectionist paradigm, intelligence is a byproduct of the interaction of simple non-intelligent components. Intelligence is said to emerge from their interaction. The symbolic paradigm relies on the specification of a corpus of rules that provides the AI construct with the ability to deal with any stimuli that arise from the environment. An example of
this paradigm is the expert system. A human expert defines a series of if-then rules. The expert system can then emulate the decision-making of the human expert. The typical expert system consists of an inference engine and a knowledge base. The knowledge base is a set of facts about the environment. The inference engine employs techniques such as forward chaining or backward chaining, using the facts and the user input, to reach a conclusion.
Connectionism hopes to explain intellectual abilities using neural networks. The connectionist paradigm relies on training the AI construct by correcting its response to stimuli until the response is correct. Learning can be supervised, unsupervised, or based on reinforcement. An example of this paradigm is the artificial neural network (ANN). ANNs are used for a variety of near-reasoning tasks such as handwriting and voice recognition. ANNs emulate the structure of the biological neural network found in the animal brain by incorporating layers of neurons. These layers of neurons encode the function that models the complex mapping between inputs and outputs (perceptions and actions). Given that the encoded function models the mapping accurately, ANNs can show intelligent behavior in specialized areas where a substantial amount of prior data is available to learn the mapping. Nonetheless, ANNs work well and predictably in the appropriate field of use.
Supervised learning requires that solutions are available to train the AI construct. The AI construct switches between training and operating modes. In training mode, the AI construct adjusts its internal processing until its outputs reflect the expected outputs for the given input. In situations where a training set is not available, switching to training mode does not enable a useful response. In addition, the AI construct cannot learn from a single exposure to a case (a problem and its solution).
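Returning to the expert system described above, the sketch below shows naive forward chaining over if-then rules; the rules and facts are invented for illustration and do not come from any particular expert system:

```python
# Minimal forward-chaining sketch: each rule is (set of antecedent facts, conclusion).
rules = [
    ({"raining"}, "ground_wet"),
    ({"ground_wet", "freezing"}, "ground_icy"),
]
facts = {"raining", "freezing"}  # initial knowledge base plus user input

changed = True
while changed:                      # keep firing rules until no new fact can be derived
    changed = False
    for antecedents, conclusion in rules:
        if antecedents <= facts and conclusion not in facts:
            facts.add(conclusion)   # forward chaining: assert the conclusion as a new fact
            changed = True

print(facts)  # {'raining', 'freezing', 'ground_wet', 'ground_icy'}
```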
Unsupervised learning enables the AI construct to learn from observing the input data directly and applying certain techniques to discover inferences in the data. An example of such a technique is clustering. To cluster, the AI construct uses attributes of the data it can recognize and process to find patterns and groupings within the data. In reinforcement learning, the AI construct has a reward function that enables it to select the behavior that maximizes the reward. Based on behaviorist psychology, reinforcement learning is usually combined with supervised or unsupervised learning.
These approaches embed the designer's knowledge and understanding of the environment. In supervised learning, the designer's knowledge is embedded in the training process. In unsupervised learning, the designer's knowledge is embedded in the clustering technique and the attributes selected as input to the clustering function. In reinforcement learning, the reward is usually static, although it is possible to define a reward hierarchy that provides the construct some adaptability. Nonetheless, the hierarchy itself reflects the designer's knowledge and understanding of the environment.
As soon as environmental conditions fall outside the designer's expectations, the AI construct fails. The failure can be caused by the limitations of the knowledge of the expert. The failure can also be caused by changes in the environment that are not detected or accounted for. An expert system cannot overcome the limitations of the experts that designed it. To do so, it would have to observe its own errors and correct its own knowledge base. An ANN cannot overcome the limitations of its training data set. To do so, it would have to observe its own errors, generate a new training data set, and re-train itself. In the terms of [2], such a construct has only a System 1; a System 2 is needed to replace the role of the designer. For an AI construct to also demonstrate the
agility of natural intelligence, it must combine the symbolic (System 2) and connectionist (System 1) paradigms into a new paradigm.
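The following toy sketch illustrates, under loose assumptions, how a slow rule-based System 2 could hand validated responses to a fast System 1 lookup; it is a conceptual illustration only, not the GPME architecture described later:

```python
# System 2: slow, rule-based deliberation (placeholder rules for illustration only).
def system2_deliberate(stimulus: str) -> str:
    rules = {"obstacle_ahead": "stop", "path_clear": "advance"}
    return rules.get(stimulus, "explore")

system1_cache = {}  # System 1: fast stimulus -> response mapping, trained from System 2's answers

def respond(stimulus: str) -> str:
    if stimulus in system1_cache:              # reflex-like fast path
        return system1_cache[stimulus]
    response = system2_deliberate(stimulus)    # fall back to slow deliberation
    system1_cache[stimulus] = response         # cache the validated response for next time
    return response

print(respond("obstacle_ahead"))  # deliberated the first time
print(respond("obstacle_ahead"))  # answered from the fast path thereafter
```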
1.2 Statement of the Problem
This dissertation presents a model of the mind as a behavior processing machine and an abstract data type for the representation of observations and behaviors: the Kasai [3]. This data type is the foundational data structure for an artificial intelligence that bridges the gap between connectionist (black box ANN) and symbolic approaches. The focus of the dissertation is on the Kasai object and its functioning. However, to provide a proper context, this dissertation also describes an AI construct that defines and applies a new AI paradigm that reconciles symbolic and connectionist approaches to emulate biological intelligence more fully. The description of the construct covers the representation of observations, the detection of patterns, and the construction of the knowledge base of behaviors.
This AI construct is designed to reside within and support the functioning of a host. An example of a host is an autonomous vehicle or a robot. The AI construct autodidactically learns behaviors from the processing of the time-series of observations. To determine the value or usefulness of a behavior in a circumstance, the construct must predict the future state of the environment to evaluate the usefulness therein. Therefore, the equivalent of the inference engine constructs a virtual environment internally that it uses to predict rewards. After enacting a behavior, it can test the virtual environment against the real environment and learn from any differences it finds. Prediction requires the rules mechanism of the symbolic paradigm.
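A minimal sketch of this predict-then-compare loop follows; the linear extrapolation and the threshold are illustrative assumptions and stand in for the Kasai-based prediction developed later:

```python
# Toy virtual environment: predict the next observation, compare it with reality,
# and adjust the internal model when the difference (anomaly) is large.
history = [1.0, 2.0, 3.0, 4.0]
bias = 0.0                          # simple correction term standing in for the knowledge base

def predict_next(series):
    # naive prediction: extrapolate the last step, plus the learned correction
    return series[-1] + (series[-1] - series[-2]) + bias

actual_next = 5.5                   # what the real environment produced
predicted = predict_next(history)
anomaly = abs(actual_next - predicted)

if anomaly > 0.1:                   # mismatch between virtual and real environment
    bias += 0.5 * (actual_next - predicted)   # learn from the difference
history.append(actual_next)
print(predicted, anomaly, bias)     # 5.0 0.5 0.25
```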
Once the differences between the virtual and real environments are minimized, the construct must eliminate inference processing because it is slow; the stimulus-response mechanism must acquire a reflex-like characteristic. The connectionist paradigm provides better performance. Therefore, the new paradigm uses the symbolic component to create rules that reflect the environment. Once the rules are validated, they are used to train the connectionist component. The normal operating mode provides for cognition. Correcting the rules to retrain the connectionist component provides for metacognition. Metacognition enables the AI construct to adapt to changing environments or deal with surprises or failures. When an anomaly occurs, metacognition is the ability to detect the mismatch between the virtual and the real environment, and thus provides an opportunity to correct the knowledge base that the virtual environment is founded upon.
There are two constraints on the design. The first constraint is that the AI construct must be implementable in hardware. The second constraint is that the AI construct must be embeddable within a host construct. The hardware constraint ensures the AI construct can process a substantial amount of data at high speed. The embedding constraint ensures that the AI construct can be integrated into a wide variety of applications.
The AI construct needs a flexible interface that supports integration into other agents or hosts. The interface between the AI construct and the host needs to be flexible enough to support several different types of hosts. For example, the same AI construct should be deployable in a self-driving automobile or as a big data analytics agent. The interface must support receiving a variety of input types: sound, video, or stock prices, for example. It must also support specifying behaviors in a manner the host comprehends. Therefore, the interface mechanism must allow for
a description of the environment and of the host. The AI construct must learn how to communicate with the host using the language it learns from the observation series, unsupervised. The AI construct is general-purpose in that it enhances the performance of any system it is embedded within. Social animals also learn behaviors by observing other members of their society. Advanced social animals use language to teach behaviors virtually, without having to enact the behavior. In turn, the AI construct needs a mechanism to share its knowledge base with and acquire behaviors from other AI construct instances that are part of its society. A special protocol is needed to enable this society of AI constructs. This protocol defines the communication mechanism, the data interface, and the social rules between instances. Going forward, we refer to the AI construct by the name General-Purpose Metacognition Engine (GPME). The GPME uses the Kasai as its core data structure. The GPME is a reference implementation of the behavior oriented framework.
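One way to picture the host interface is a narrow boundary through which any host pushes observations and receives behaviors expressed in its own vocabulary. The sketch below is a hypothetical interface for illustration; the actual interface specification is given later in the dissertation:

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List

class HostInterface(ABC):
    """Hypothetical boundary between a host (vehicle, robot, analytics agent) and the embedded AI construct."""

    @abstractmethod
    def observe(self) -> Dict[str, Any]:
        """Return the host's current observation (sound, video frames, prices, ...)."""

    @abstractmethod
    def enact(self, behavior: List[str]) -> None:
        """Carry out a behavior expressed in terms the host comprehends."""

class EmbeddedConstruct:
    def __init__(self, host: HostInterface):
        self.host = host

    def step(self) -> None:
        observation = self.host.observe()   # input of any type, supplied by the host
        behavior = ["hold_position"]        # placeholder for behavior selection from the observation
        self.host.enact(behavior)           # output in the host's own vocabulary
```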
1.3 Purpose of the Study
The contribution of this research is to identify the software processes, interfaces, protocols, and algorithms necessary to implement a single GPME and its society. The emphasis is on the Kasai and its ability to support both the memory and prediction functions the GPME design requires. The scope of this research is broad and will extend into post-doctoral studies. The focus of the dissertation is on the Kasai since it is the foundational component of the GPME. For this dissertation, the questions under examination are:
• How can behavior be acquired through observation?
• What is the complete theoretical design of an agent (GPME) that can acquire behavior through observation?
• How are observations organized for processing (the Kasai)?
• How does the GPME use the observations organized by the Kasai?
• What is the specification of the communication interface between the GPME and the host, and between GPME instances?
The results of the dissertation include:
• The specification of the behavior oriented intelligence framework;
• The complete design of the GPME's internal components - processes and knowledge base;
• The specification of the interfaces for integrating the GPME with host agents and for communicating with other GPME instances;
• The implementation of the Kasai for organizing observations and predicting the next state of the environment.
The implementation of the Kasai is described in the Results section. The remaining implementation of the GPME, including the hardware design, is future work.
1.4 Significance of the Study
The GPME reconciles the symbolic and connectionist paradigms to propose a design that more closely emulates biological intelligence. Embedded within other systems, the GPME will provide improved perturbation tolerance and adaptability without human intervention. In other words, independent of its existing cognitive ability, an existing system should perform better when equipped with the GPME than without it.
Autonomous systems include patient monitoring systems, robotic nurses, self-driving cars, and space exploration robots. The autonomous characteristic means these agents must operate effectively in their respective environments with minimal input from a human operator. For example, space exploration robots are constrained by their distance from human operators. Health care systems must respond correctly and immediately to the needs of the patient. Self-driving cars must react immediately to changing circumstances on the road. These applications preclude interaction with a human operator. The autonomous system must therefore be able to handle ambiguous situations.
Humans deal with ambiguity by relying on context, on their experience, or even on their intuition. Certain situations require that the decision-making process be nearly instantaneous, almost instinctive, while others allow more time for deliberation. The most difficult situations involve epistemic uncertainty: the available knowledge is partial or incomplete. For example, the full context of a situation is not known at the time the decision must be made. This level of ambiguity introduces the need for metacognition, thinking about thinking. Before the autonomous system decides, it must determine whether it can even decide; it needs to know that it has sufficient context to apply cognition. Each decision involves a metacognitive decision followed by a cognitive process. This decision-making process must be nearly instantaneous and autonomous.
The significance of this study is the complete design of a system that provides the computing capabilities necessary to support this type of reasoning. The specific result of this dissertation is the mechanism that enables the recognition of patterns in systematic data series, at the speed necessary for practical use.
Within the knowledge base, the data series must be organized in a manner that supports very high-speed processing in response to stimuli. When situated in a stable environment, the patterns of perception are also stable. For example, the motion of the sun across the sky is predictable. The correlation between the ambient temperature and the position of the sun is also predictable. We can think of observations as the language of the environment. The GPME needs a grammar for this language to organize and understand the information in the data series. Since the GPME is autodidactic, it must define a grammar that supports the description of a virtual future environment. The dissertation results demonstrate the use of the Kasai for the automatic generation of a grammar that describes a systematic pattern. This compact grammar can then be used for behavior processing.
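As a rough illustration of what a compact grammar for a systematic series means, the sketch below reduces a repeating observation sequence to its shortest repeating unit; this is a naive stand-in, not the Kasai algorithm described in Chapter 3:

```python
def compact_grammar(series):
    # Find the shortest repeating unit that regenerates the series exactly.
    for length in range(1, len(series) + 1):
        unit = series[:length]
        repeats = -(-len(series) // length)            # ceiling division
        if (unit * repeats)[:len(series)] == series:
            return {"unit": unit, "repeats": repeats}  # compact description of the pattern
    return {"unit": series, "repeats": 1}

# A stable environment: the same daily cycle observed three times.
observations = ["sunrise", "noon", "sunset", "night"] * 3
print(compact_grammar(observations))
# {'unit': ['sunrise', 'noon', 'sunset', 'night'], 'repeats': 3}
```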
CHAPTER 2: LITERATURE REVIEW
The literature review is divided into:
• Biological learning of behaviors;
• Techniques for emulating behavior;
• Frameworks for emulating behaviors;
• Using metacognition to improve behaviors;
• Time-series analysis and prediction;
• Swarm communication.
2.1 Biological Learning of Behaviors
Learning is the process of acquiring knowledge through experience, practice, observation, or instruction. Conditioning is a pattern of stimulus and response, or behavior and consequence, which the learner internalizes after many exposures. Observing living beings, we can identify three broad classes of behavior: instinctive, acquired, and deliberate. Instinctive behavior requires the least cognitive deliberation. All living beings are born with instinctive behaviors that need not be learned [4], [5]. Deliberate behavior requires the highest degree of cognitive deliberation during its performance. While these behaviors can be the most complex ones a being exhibits, their performance tends to be slow in comparison to instinctive behaviors because of the substantial participation of cognitive deliberation. Acquired behavior is deliberate behavior that, through repetition, requires substantially less cognitive deliberation than deliberate behavior. In effect, acquired behaviors are "soft" instinctive behaviors created from deliberate behaviors. An acquired behavior is faster than a deliberate behavior if only because of its lesser reliance on active deliberation.
Behaviorism proposes that all learning occurs through conditioning, where the environment or the trainer is the source of the stimuli and the responses. Behaviorism suggests the notion of correlated patterns as a basis for learning; we refer to these correlated patterns as rhythms. Behavior in the learner is machine-like, that is, independent of internal mental states. Skinner is the best-known proponent of behaviorism [6], although the concept originates with John Watson [7]. Behaviorism introduces two important concepts: classical conditioning and operant conditioning. In classical conditioning, a naturally occurring stimulus is associated with a response. Then, a neutral stimulus is associated with the previous naturally occurring stimulus. The effect in the learner is to associate the neutral stimulus with the natural response, even in the absence of the naturally occurring stimulus. The neutral stimulus is referred to as the conditioned stimulus and the response as the conditioned response. In operant conditioning, an association is created within the learner between a behavior and a consequence for that behavior, i.e., a reward or a punishment.
Chomsky refutes behaviorism as a basis for learning [8] and exposes its inadequate support for natural language. Chomsky argues that the mechanisms for processing systems of knowledge, including language, are built into our brains. These mechanisms interact with each other in complex ways we generally characterize as intelligence. For example, humans are not born knowing how to speak any language, but they are born with the ability to acquire and process a natural language. Specifically, language acquisition theory [9], [10] proposes that:
• Children acquire language effortlessly.
• Children acquire language quickly.
• Children do not need a formal setting for learning language.
• Children discover language using a very small sample.
• Children acquire language without relying on imitation.
• Children acquire language without conditioning (reinforcement learning).
• Children learn language actively, that is, they say things they have not heard from adults.
The theory implies that an innate mechanism within the learner processes the patterns it
perceives with a specific intent to learn language. The learning mechanism supports several strategies, such as conditioning (reinforcement learning), imitation, extrapolation, and experimentation. No single strategy is ideal for every learning situation. Cognitive ethology research explores the innate mechanism by teaching language to dolphins, pigeons, and chimpanzees, amongst many other animals [11]–[13]. Savage-Rumbaugh's experiments with bonobos [14] begin with teaching basic sign language to a bonobo mother. The learner was taught to associate words with objects. In this case, the base pattern consists of the spoken or signed word, the object, and the behavior of the researcher. The children of the learner were also present during the training sessions although they were not the targets of the instruction. However, it turns out that the children subsequently outperformed their mother in the cognitive exercises. In fact, the children could be said to have had both bonobo and human models [15] and exhibited behaviors, and the absence of behaviors, consistent with development in a hybrid human-bonobo culture.
In the bonobo experiment, the child imitated its mother. Imitation is an important survival mechanism for all animals [16]–[18], as successful behaviors are replicated without depending on hardwired genetic knowledge or time-consuming, error-prone experimentation and discovery. Kaye discusses the mirror neuron [19] as the innate mechanism that enables imitation in infants. Rizzolatti discusses the mirror system in the context of social cognition [20].
Observational learning includes imitation. Imitation implies that the learner acquires a behavior from observation. However, observational learning also includes the notion that the learner learns not to acquire the behavior [21]. In human society, we apply observational learning where children learn through listening-in and keen observation [22]. The process of observing and listening with intent, concentration, and initiative is called intent participation. Intent participation is critical in early language acquisition, and we already observed that it was an important aspect of the performance of the bonobo children. Indeed, the bonobo children were intently observing and listening, unbeknownst to the researchers.
In presenting social cognitive theory, the four proposed stages of observational learning are [23]:
1. Attention: Develop cognitive processes to pay attention to a model and to observe accurately enough to imitate;
2. Retention: Memorize aspects of the behavior to be able to imitate it later, using language or images;
3. Production: Recall memories and translate them into overt behavior. Then, evaluate the accuracy of the imitation;
4. Motivation: Reinforcements speed the translation between observation and action, increase attention, and improve retention. This is true even if the reinforcement is negative.
The ability to filter irrelevant sensory information and to focus on relevant information is necessary for higher-order cognitive functions such as selective attention and working memory. Recent research suggests that this ability is based on spontaneous alpha oscillations. An anticipatory increase in the alpha rhythm in the primary sensory cortex of the brain before the arrival of a stimulus inhibits the processing of non-attended stimuli [24].
Extensive research in biological rhythms establishes that there exist several naturally occurring rhythms in nature [25]. Many living organisms are predisposed to note and align themselves with the lunar cycle, circadian cycle, and other natural cycles. A rhythm is a patterned repetition. However, in this context, the concept of a rhythm includes the notion that several cycles are synchronized with each other. Thus, a rhythm is a correlated pattern. Research into the role of biological rhythms in behavior establishes that a disruption of the rhythm leads to several neurological problems, such as disruption of the sleep cycle or even of social rhythms [26]. Studies have shown that infants as young as five months are able to recognize and differentiate between various rhythms in music [27].
In 1951, Gustav Kramer discovered the sun compass. He performed his experiments by placing European Starlings in orientation cages and then used mirrors to shift the apparent location of the sun. In response, the birds shifted their migratory restlessness to match the compass direction indicated by the apparent new position of the sun. Further research confirmed that the pigeon's sun compass is tied to its internal clock [28], its circadian rhythm. Keenen reports experiments where the birds' internal clock was shifted by six hours by holding them in a light-tight room with timer-controlled lighting. When released at high noon, the shifted birds headed south for home while the control birds correctly headed north. Since their internal clock had been reset, the shifted birds incorrectly determined the position of the sun.
2.2 Techniques for Emulating Behavior
Broadly, there are two approaches for emulating intelligent behavior. One approach focuses on algorithms that result in intelligent behaviors. Examples of this approach include statistical methods and rules-based methods. The other approach studies the biochemistry of
animal brains with the objective of creating an artificial system that results in intelligent behavior by closely mimicking the biological processes of the brain.
The goal of AI was to study intelligence by implementing its essential features using man-made technology [29]. This goal has resulted in several practical applications people use every day. The field has produced significant advances in search engines, data mining, speech recognition, image processing, and expert systems, to name a few. The engineering of these practical solutions has taken AI in a direction that enables the rapid implementation of the essential features of intelligence these applications require. A search engine can be very efficient at finding relevant results but it does not comprehend what it is searching for. A data mining application can identify relevant features from noise in a dataset, but it does not comprehend the meaning or significance of what it finds.
To fill the void created by the absence of comprehension, AI researchers rely on formalisms and, more recently, on statistical methods. Modern AI has abandoned the use of formalisms [30], in favor of probabilistic and statistical models [31], in its decision-making. This shift reflects the substantial increase in computing capacity to process Big Data. Statistical methods are effective at identifying salient features and at predicting the next event. However, they neither impart nor proceed from comprehension. Comprehension requires intelligence. These applications are tools in the hand of the intelligence that still resides within the user. A consequence of statistical methods is a loss of transparency. Often, the processing of these applications is difficult for a user to understand, even though the results are understandable.
We view this lack of transparency as a reflection of our lack of understanding of how intelligence works within our own biological hardware. Of course, there is no assumption that our brains use statistical methods to achieve cognition. The statistical methods achieve sufficient approximation of intelligence to be useful within the narrow application domain.
Other approaches focus on emulating the biological architecture of the brain. These approaches are based on the hypothesis that the brain is a collection of simple systems that collaborate to produce intelligence, and it makes sense to emulate this architecture to produce the same result. This line of thinking has resulted in several thrusts such as neural networks and, more recently, Connectomics [32]. Connectomics aims to map the brain's synapses to decipher how information flows through the brain's neural circuits. From the standpoint of neuroscience, the goal is to understand how the brain generates thoughts and perceptions. While this area of research will undoubtedly yield positive results in our struggle against diseases such as dementia and schizophrenia, it is not clear how it can provide insight into how intelligence works. In [33], the author wonders: if experiences are coded into brain connections, could a wiring diagram simulate your mind? Even if such a simulation happens, our understanding of intelligence would not have significantly advanced. Cognitive scientists question what we expect to find at the end of this immense omic brainbow [34]. Brenner is largely credited with establishing brain mapping but he does not believe this path will yield results for our understanding of cognition.
The feasibility of observing the brain in action is still in question [35]. The question is whether the functioning of the brain, observed at its core level, will make sense to the researcher. This can only happen if the reality of the systems of the brain is a subset of human reality.
Otherwise, the researcher will not have a frame of reference to understand intelligence even if she can replicate it. Can a researcher understand what drives an animal by taking it apart? Which chemical test does she use to determine whether a cat enjoys tickles? The frame of reference appears mismatched; advanced intelligence is not a direct function of the anatomical hardware it operates upon.
Our current limited understanding of the brain shows that various regions of the brain are dedicated to performing certain functions, including enabling communications between regions. The evidence from neuroscience shows how specific centers in the brain are dedicated to different cognitive tasks [36]. But these centers do not merely do signal processing: each operates within the universe of its experience so that it is able to generalize individually. This generalization keeps up with new experience and is further related to other cognitive processes in the brain. It is in this manner that cognitive ability is holistic and irreducible to a mechanistic computing algorithm. Viewed differently, each agent is an apparatus that taps into the "universal field of consciousness." On the other hand, AI machines based on classical computing principles have a fixed universe of discourse so they are unable to adapt in a flexible manner to a changing universe. Therefore, they cannot match biological intelligence.
2.3 Frameworks for Emulating Behavior
Several excellent literature surveys on AI architectures have been published. For instance, [37], [38] provide thorough and comprehensive analyses of the cognitive and agent architectures. The integrated cognition (INCOG) Framework [39] provides a frame of reference that unifies the major cognitive approaches developed to date.
Kahneman describes the mind as consisting of two systems, System 1 and System 2 [2]. System 1 is fast, instinctive, and emotional. System 2 is slow, deliberate, and logical. The book describes several cognitive biases that result from the architecture and interaction of the two systems. In effect, it provides a useful set of use cases and test scenarios for an AI construct. In the behavior oriented framework, System 2 is further divided into two parts: one that focuses on external rewards, and one that focuses on internal rewards.
The Skill, Rule, Knowledge based classification approach is another important framework [40]. While the paper focuses on the identification of the types of errors likely to occur in different operational situations, the classification also sharpens the definition of a behavior as it is used in this dissertation. The concepts of knowledge, rule, and skill correlate somewhat to acquired, derived, and instinctive behaviors.
Goertzel et al. describe new theories on how the mammalian brain might represent complex knowledge [41]. The combinatory logic encodes abstract relationships without explicit variable bindings. This theory resonates with aspects of progressive reduction as described later in the behavior oriented intelligence framework, in Section 3.1.3.
Rosenbloom et al. describe the components of a standard model of the mind [42]. The standard model consists of three types of memory: declarative long-term, procedural long-term, and working. Instrumentation (perception and motor) acts through the working memory. The standard model suggests that the mind is not an undifferentiated pool of information and processing; instead, it consists of distinct and interdependent modules. Together, these modules perform the cognitive cycle. Similarly, the behavior oriented framework presented in this
dissertation is also a model of the mind with distinct components that together perform a cognitive cycle.
The integrated cognition framework [39] organizes the essential capabilities of human-level cognition. It provides a common frame of reference for comparing cognition architectures. The framework unifies the major cognitive approaches developed to date.
Figure 2.1 - INCOG Framework for Ingredients of Integrated Cognition

The framework depicts advanced capabilities further from the center. An advanced capability requires that the underlying capabilities are achieved. Moreover, a capability along one axis is also dependent on capabilities on other axes. For example, the multi-level mind axis
has cognitive capabilities (Instinct, Reactive, and Deliberative) and metacognitive capabilities (Reflective, Self-Reflective, and Self-Conscious). These capabilities build upon each other.
Elgot-Drapkin et al. propose active logics as the formal approach for addressing these problems in a system's knowledge base [43]. Active logics extend first-order logic with the concepts of time and retraction. The result is an episodic logic reasoner that is capable of planning with deadlines and of reasoning with contradictions and with changes in the language of discourse.
Nuxoll and Laird [44], [45] discuss the application of episodic memory in the context of Soar. Episodic memory is a history of events that can be used to improve decision-making. Humans have and continually make use of their episodic memories. An episodic memory is:
• Automatic: The system creates an episodic memory automatically.
• Autonoetic: A stored memory is distinct and distinguishable from current sensory activity.
• Temporally indexed: The memory's metadata includes temporal information that orders the memories in order of perception in time.
The conclusion is that episodic memory is essential to sophisticated cognitive capabilities.
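A minimal sketch of an episodic store with these three properties might look like the following; the class and field names are assumptions for illustration, not Soar's or the GPME's actual structures:

```python
import time
from dataclasses import dataclass
from typing import Any, List

@dataclass
class Episode:
    timestamp: float   # temporally indexed: ordered by time of perception
    perception: Any    # stored snapshot, distinct from current sensory activity (autonoetic)

class EpisodicMemory:
    def __init__(self):
        self.episodes: List[Episode] = []

    def record(self, perception: Any) -> None:
        # automatic: every perception is stored without an explicit decision to remember
        self.episodes.append(Episode(time.time(), perception))

    def replay(self, start: float, end: float) -> List[Episode]:
        # retroactive learning: rehearse events captured between two points in time
        return [e for e in self.episodes if start <= e.timestamp <= end]
```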
The following capabilities are relevant to this research:
• Action Modeling: An agent can use the episodic memory to predict future outcomes.
• Decision Making: The history of success and failure informs future decision making.
• Retroactive Learning: Learning after the fact by replaying or rehearsing events captured in the memories.
Case-based reasoning is closely related to episodic memory. A case describes a problem
the system encountered and the solution to the problem [46]. The system needs to match a new problem to an existing case to arrive at a previously successful solution. [47] describes a case
selection algorithm using rough sets and fuzzy logic. [48] describes ontology-based case-based reasoning. As described in the introduction, the GPME uses a variant of case-based reasoning. Traditionally, the cases result in well-known solutions. However, the GPME creates its own cases from its experiences and refines them over time. As a result, each case provides several overlapping solutions to the same problem. The GPME must apply a choice-making strategy to choose from the possible solutions.
The Procedural Reasoning System [49] works on a library of plans called knowledge areas (KAs). Agents based on this architecture can construct and act upon partial plans and pursue goal-directed tasks while being responsive to changes in the environment. The metacognitive ability is achieved by user-defined meta-level KAs. Because of this user reliance, the architecture is only as good as the meta-level KAs that the user specifies. Thus, there is no dynamic monitoring and correction in the true sense. If proper meta-level KAs are not specified, potential problems could arise, especially when multiple tasks are executed at the same time.
The Belief-Desire-Intention (BDI) agents [50] make use of an opportunity analyzer to propose plan options in response to perceived changes in the environment. A component checks which options are compatible with the agent's existing plans and passes on the surviving options to the deliberation process. The opportunity analyzer and the compatibility filter together provide the metacognitive ability to the underlying agents. The inherent expectation generation functionality in the GPME provides functionality like these two components.
Recent work using Soar [44] and [51] enhances the performance of agents by providing them with episodic memory to support cognitive capability. OpenCog Prime [52] is an
architecture that builds on Cognitive Synergy Theory, which defines several types of memory to support the cognitive processes. The opportunistic control model [53] describes an architecture that combines reactive (run-time, event-driven actions) and planning (strategic, goal-driven actions) approaches in a manner consistent with the intended evolution of metacognition.
Further research continues on the Metacognitive Loop, in particular with the advent of the metacognitive integrated dual-cycle architecture (MIDCA) [54]. MIDCA encompasses both the cognitive and metacognitive agent. The support of both cognition and metacognition is also present in the GPME. Earlier research on MCL suggested that the metacognition function could be separate and general purpose [55]. This approach requires an interface between the cognitive and metacognitive functions. In practice, the two functions use the knowledge base directly, effectively combining them [56], [57].
Alexander et al. present an approach for meta-level control in a Markov Decision Process (MDP) agent [58], [59]. An MDP is a probabilistic model of a sequential decision problem, where the current state and actions determine a probability distribution on future states. The agent uses meta-level control to determine when to derive new policies or to unroll earlier states towards the problem horizon. To achieve balance between the two activities, the agent incorporates heuristics. This research illustrates that heuristics can be part of the metacognitive processes without constraining them.
Lim et al. study the knowledge representation of a robot [60]. Their approach for knowledge representation also combines a graph and an ontology. The key difference between their research and the GPME is that there is no architectural differentiation between the host and the metacognitive engine. The single unified system is designed for human interaction. This
fundamental difference results in a substantial difference in the content and use of the knowledge base.
Moore and Atkeson introduce a method called prioritized sweeping that combines Markov prediction and reinforcement learning [61]. It is a memory-based method that explicitly remembers all real-world experiences of the agent. Heintz et al. describe a knowledge processing middleware that bridges an agent's instruments and the reasoning mechanisms [62]. The GPME addresses the same concern, albeit using a different architecture for the middleware. Their approach is not designed to capture behaviors or to process time series; it relies on deriving qualitative spatial relations between known objects in the environment.
2.4 Using Metacognition to Improve Behaviors
Metacognition is the ability to observe one's own behavior and adapt it as needed to improve one's own performance. The roots of metacognition extend into psychology, where research sought to understand its role in development [63] and learning disabilities. [64] describes metacognition in humans as a collaboration of several cognitive processes and structures interconnected by the view of self. This theory of mind emerges in childhood as the child separates itself from its environment and distinguishes between reality and the model of reality in its mind. In [65], the survey reviews research results of experiments conducted on adult human subjects to determine how they solve problems. The conclusions were that subjects placed in metacognitive conditions perform better because, while problem-focused deliberation leads to good local solutions, metacognition provides the flexibility necessary to discover more complex
and efficient solutions by leveraging a more global perspective. When people reflect on their own thought processes as they solve problems, they perform better.
Russell and Norvig classify types of agents [66]. An artificial agent has a performance measure to determine the effect of a sequence of acts on the environment. Inherently, this performance measure is externally focused. Therefore, an agent is built with instinctive behavior, such as a vacuum cleaner robot's drive to clean the floor, or a space probe's drive to explore. The performance measure quantifies the effect of the agent's actions on the environment, in terms of its own objectives. Metacognition enhances an artificial agent by enabling it to examine and improve its own reasoning.
Unlike the classical artificial agent, a behavior oriented agent needs an additional performance measure for metacognition. This performance measure is internally focused, measuring the accuracy of the internal knowledge base as compared to the environment. The goal is to minimize the differences between the knowledge base and the environment and to achieve homeostasis: a state where there are no anomalies, or variances, between the internal and external representations of the environment. The result is that the behavior oriented framework extends the Kahneman framework from two systems to three. Systems 1 and 2 are externally focused and are built into the function of the agent. "System 3" is internally focused on improving the knowledge base.
A key contribution of this dissertation is the notion that a metacognitive system consists of at least two software components. One, the cognitive component, thinks about goals and plans to achieve goals. The other, the metacognitive component, thinks about thinking and how it arrives at plans and solutions. The metacognitive component provides for introspection and self-improvement. In most research, this distinction is logical. At the implementation level, the two
components are blended into a single agent. However, the GPME is specifically designed to be a separate agent focused on metacognition only.
A perpetual self-aware cognitive agent is one that fully integrates cognition (planning, understanding, and learning) and metacognition (control and monitoring of cognition) [67]. In effect, this definition combines the GPME and the host as software components of a single agent. Meta-AQUA is an implementation of this approach using multi-strategy learning. Meta-AQUA uses a form of failure blame assignment called case-based introspection to generate explicit learning goals. A system that detects its knowledge to be flawed must create a specific learning goal to correct the problem. Anderson and Oates survey the field and review the emergence of metacognition across several fields and application domains [68].
Zheng and Horsch address a key issue: control of computation [69]. Control of computation is knowing when to stop a given process. In terms of the metacognitive agent, stopping should occur when the proven best solution to a problem is found. This problem needs to be constrained by addressing the problem of time-bound reasoning [70], [71]. Reasoning with a deadline refines the meaning of best solution by requiring that the agent can execute the solution while conditions still permit it.
Cognitive systems experience three general types of problems: slippage, knowledge representation mismatch, and contradictions [72]. Slippage refers to ongoing changes to the truth of known facts over time. What is true now is not necessarily true later. Knowledge representation mismatch refers to the problems introduced by representing the same knowledge differently. The representation is not important; the meaning behind the representation is what needs to be conveyed. A contradiction occurs when the system simultaneously believes two
opposite beliefs. Metacognition needs to provide general mechanisms to enable the system to overcome these problems.
One of the objectives of Artificial Intelligence is to impart upon systems the ability humans have for overcoming the "brittleness problem." The "brittleness problem" is the characteristic of systems to fail when operating circumstances exceed the designer's expectations. The Metacognitive Loop is a proposed solution for addressing this problem [65]. Anderson and Perlis define brittleness as a system's inability to manage perturbations [72]. A perturbation is any change in the environment or within the system itself that affects its performance in an undesirable way. Perturbation tolerance is the ability to adapt to the conditions by re-establishing the desired performance. Achieving perturbation tolerance requires the system to detect the perturbation and to make targeted adjustments to its configuration. The system must be self-aware and self-guided as it copes with changing conditions.
The strategy to achieve perturbation tolerance is called the Metacognitive Loop (MCL) [73]. The loop consists of continually noticing the perturbation, assessing the anomaly, and guiding a solution into place. MCL enables the system to monitor itself and influence its own performance in real time. It also directs the system to learn when it encounters something it did not know or when it needs to correct beliefs that are now wrong. Metacognition is cognition about cognition; reasoning about one's own reasoning.
Conceptually, the solution consists of one system, referred to as the host, which is integrated with another system referred to as MCL. The host supplies information to MCL about its actions and about the expectations of the results of these actions. MCL monitors the success of the host's actions by comparing expectations and outcomes. When an outcome does not match an
expectation, MCL notes the anomaly. It assesses the anomaly using its internal knowledge, such as significance, priority, similarity to other anomalies, and possible responses. Finally, MCL guides the host by providing a suggestion to address the anomaly. MCL applies a basic algorithm composed of these steps: note, assess, guide, repeat [74].
Consider a robot trained to perform a certain function. Its initial training occurs on a dry surface. After some time in operation, it arrives at a wet surface. It needs to learn how to function efficiently on this new surface. A slightly wet surface might require minor adjustments such as tolerating wheel slippage. A very wet surface requires major adjustments that amount to relearning how to function. Learning is a time-consuming and expensive operation. After some time operating on the wet surface, the robot moves again onto a dry surface. Ideally, it is not necessary for the robot to invest the same level of effort in learning once again how to function on a dry surface. A better option is for the robot to note the change in the environment it is situated in, to assess which of its learned procedures have the best chance to work, and to proceed efficiently. While this example used a robot, MCL and the GPME are intended to integrate with cognitive robots or cognitive software agents. We use the terms host, agent, and robot interchangeably to mean a cognitive host.
Agents equipped with MCL can recover from unanticipated failures. Several applications of MCL exist that improve the performance of the underlying cognitive agent. The earliest MCL implementations utilized simple strategies to improve a host system that consisted of a Q-Learner as the baseline [75]. The baseline Q-Learner enhanced with MCL was capable of increasingly complex responses to expectation violations (anomalies).
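The note-assess-guide cycle can be summarized schematically as below; the host and knowledge-base methods are hypothetical placeholders for illustration, not the published MCL implementation:

```python
def mcl_loop(host, knowledge):
    # Schematic Metacognitive Loop: note, assess, guide, repeat.
    while True:
        expectation = host.expected_outcome()   # what the host expects its action to achieve
        outcome = host.observed_outcome()       # what actually happened
        if outcome != expectation:              # NOTE the anomaly
            anomaly = {
                "expected": expectation,
                "observed": outcome,
                "similar": knowledge.find_similar(expectation, outcome),  # ASSESS it
            }
            suggestion = knowledge.best_response(anomaly)
            host.apply(suggestion)              # GUIDE the host toward a correction
```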
MCL has been applied to other systems to improve their responses to anomalies. In the Air Traffic Controller (ATC) [78] and the Natural Language Processor (Alfred) [56] implementations, MCL is implemented as a component within the host agent. In the Mars Rover [77] implementation, MCL is an external component that controls the behavior of the host agent.
A commercial anomaly detection framework and the key characteristics it should possess are [79]:
• Timeliness – how quickly a determination must be made on whether something is an anomaly or not
• Scale – how many metrics must be processed, and what volume of data each metric has
• Rate of change – how quickly a data pattern changes, if at all
• Conciseness – whether all the metrics must be considered holistically when looking for anomalies, or if they can be analyzed and assessed individually
• Definition of incidents – how well anomalous incidents can be defined in advance
The Kasai must also demonstrate similar characteristics to enable the GPME to respond to stimuli.
2.5 Time-Series Analysis and Prediction
Observational learning uses a time-series of observations as its input. Recurrent networks have often been used for prediction of time series [80]. In [81], a special Elman-Jordan recurrent neural network is used to predict the future 18 values given a time-series of 111 historical values. The number of nodes in the hidden layers was determined experimentally using test data. The best performing network was then used to predict the 18 values. In [82], an echo state network is used to predict stock prices of the S&P 500 index. A Kalman filter is used as a baseline for comparison of results. These references present a variety of methodologies that use neural
networks for time-series prediction. Unfortunately, since they all rely on supervised learning, they are not applicable to the GPME; the GPME cannot use supervised learning.
Pecar et al. [83] propose a method for analyzing time series using case-based analysis to overcome the limitations of rule-based time-series analysis, in particular the dependency on supervised learning. The resulting approach is free from underlying models and assumptions about the nature and behaviors of the time series. Similarly, the GPME design precludes supervised training. The inability to train the GPME in situations we ourselves have never encountered also led to the incorporation of CBR.
Giles et al. [84] propose an approach where the input time-series is converted into a sequence of symbols using a self-organizing map to support grammatical inference. The approach leverages the ability of recurrent neural networks to learn unknown grammars. Normally, inference from a limited set of data is an ill-formed problem because there are an infinite number of solutions that fit the limited data but do not generalize to a larger data set. In the case of the GPME, the data set is limited to the inputs the host and the swarm provide.
Prasad [85] uses a deep recurrent network with a back-propagation through time and space training method to predict epileptic seizures from electroencephalography. Erik trains recurrent neural networks with an evolutionary algorithm (genetic algorithm) [86]. This algorithm affects the architecture of the neural network, in addition to the weights. The approach described in this paper is closer to the requirements of the GPME. It is possible to conceive of the behavior composition component as a genetic algorithm. In the GPME design, the component includes a genetic programming component to enable creativity in the composition of behaviors.
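As an illustration of converting a numeric time-series into a symbol sequence before grammatical inference, the sketch below uses simple equal-width binning as a stand-in for the self-organizing map of [84]:

```python
def symbolize(series, n_symbols=4):
    # Map each numeric value to a discrete symbol by equal-width binning.
    lo, hi = min(series), max(series)
    width = (hi - lo) / n_symbols or 1.0   # avoid division by zero on a constant series
    symbols = []
    for x in series:
        index = min(int((x - lo) / width), n_symbols - 1)
        symbols.append(chr(ord("A") + index))   # 'A'..'D' for n_symbols = 4
    return "".join(symbols)

prices = [10.0, 10.5, 12.0, 15.0, 14.5, 11.0, 10.2]
print(symbolize(prices))   # "AABDDAA" -- grammatical inference can then run on this string
```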
Kurupt et al. [87] describe an approach for predicting a future state as a function of the current state, and then comparing the prediction to the actual state as a trigger to correct the internal model. The implementation uses ACT-R's blending and partial matching. While the problem in the paper is not a time-series, the approach used is conceptually similar to the one described in this dissertation. Tsumori and Ozawa showed that in cyclical environments, reinforcement learning performance could be enhanced with a long-term memory and a "change detector", which would recall stored policies when a given known environment reappeared [76]. The different "throw-out current policy and explore" MCLs discussed above act as change detectors, but they do not have a memory to store policies associated with a given environment and to recall those policies when a known environment reappears (seasonality) [77]. Several approaches have been researched to address seasonality. For example, Zhang and Qi describe the inability of artificial neural networks to handle seasonality [88]. Using simulated and real trend time series data, their research concludes that neural networks are not well suited to forecast without substantial prior data processing. Taskaya-Temizel and Casey conclude that neural networks can model seasonality provided their architecture is properly configured [89]; to address seasonality, the size of the input layer should correspond to the longest cycle in the data. The A-distance can be applied in detecting anomalies in streams of symbolic predicates in the context of the MIDCA cognitive architecture [90]. The method requires the calculation of the probability of occurrence of a predicate at a point in the stream. In the paper, a low probability of occurrence is an anomaly that triggers an analysis of the input. Similarly, the
GPME can use the A-distance algorithm to detect the occurrence of an unexpected value in a perception time-series. Case-based reasoning is closely related to episodic memory. A case describes a problem the system encountered and the solution to the problem [46]. The system needs to match a new problem to an existing case to arrive at a previously successful solution. Finite state machines can be classified as deterministic and non-deterministic [91]. Deterministic machines produce one trace for any given input while non-deterministic machines can produce multiple traces. Probabilistic state machines have a probability assigned to each transition. The Kasai is like a deterministic state machine in the sense that it can reproduce one and only one input sequence. Like a probabilistic state machine, its transitions carry attributes that affect the selection of the transition based on the state of the input. Bergmann et al. [92] present an approach for incremental graph pattern matching based on RETE. The authors apply the VIATRA2 language and demonstrate their approach using Petri nets. In VTCL, graph transformation rules are specified using a precondition on the left-hand side and a postcondition on the right-hand side. Similarly, the Kasai defines the rules that describe the input sequence as a precondition implying a postcondition. Berstel et al. [93] discuss the application of rule-based programming using the RETE algorithm. Rules are expressed in ILOG JRules. A person must specify the rules based on their knowledge of the event domain the application will process. A Kasai object can detect rules in the event stream and either present them to the person for approval or automatically specify the rules and update the rules engine. The approach for augmenting RETE described in [94] suggests mechanisms that can be used to combine the Kasai with RETE. The traditional RETE algorithm
does not support temporal operators. Several extensions have been proposed to enable complex event processing using RETE [95], [96]. The Kasai object natively supports a representation of time; this representation of time is relative to the Kasai itself. Intrusion detection continues to be a significant problem [97]. Detection approaches can be categorized as anomaly detection or misuse detection. Anomaly detection assumes that intrusive activity varies from a norm; it relies on establishing a statistical model and detecting large variances. Misuse detection focuses on behavior and on detecting unusual patterns. The authors describe a misuse detection approach based on state transition analysis, using pattern matching to detect system attacks. The Kasai object encapsulates the signature layer and the matching engine into a single object.
2.6 Inter-Agent Communication Interfaces

The GPME communicates with other systems to provide them with suggested behaviors or to learn new behaviors. The communication interface relies on previous research in agent communication languages (ACL). The work on ACLs finds its basis in older communication efforts such as Abstract Syntax Notation One (ASN.1) and the Knowledge Sharing Effort (KSE). ACLs address a much more general problem: enabling any agent to communicate with any other. The GPME and the host are autonomous processes that need to collaborate to accomplish the system's goals. In this context, each one is an agent in a multi-agent system. Collaboration requires a semantic interface based on a shared ontology that properly represents the objects, concepts, entities, and relationships of the environment within which the agents are operating. The authors of [98] describe an ACL as a collection of message types, each with a reserved meaning, that separates the content of a message from its meaning.
An early ACL, the Knowledge Query & Manipulation Language (KQML: http://www.cs.umbc.edu/kqml/), uses a syntax based on Lisp s-expressions, a reflection of its AI roots. Another similar effort is the Foundation for Intelligent Physical Agents (FIPA: http://fipa.org/index.html) ACL [99]–[101]. The FIPA ACL is structurally similar to KQML. FIPA ACL is an industry standard whereas KQML is a de-facto standard. The concepts and architecture FIPA and KQML share form a sound basis for the GPME host interface. In addition, FIPA provides guidelines for the development of ontologies. The extensible markup language (XML) is the modern way to exchange semantic messages. The literature contains examples of enabling agent communication by embedding XML within ACL messages and by expressing the message completely in XML. The latter approach is pursued as a possible solution to the difficulty developers have with processing the Lisp-based message syntax outside of Lisp. In a practical application, [102] combines XML with KQML in the design of the Plant Automation Markup Language (PAML). PAML enables communication between software agents operating in a manufacturing plant. PAML is expressed in XML. A PAML document is the content of a KQML message. The plant automation environment is the basis of the ontology while PAML itself is the language of the message content. In this case, the message is in KQML format and the content is in PAML (XML). Moore [103] compares KQML with the Formal Language for Business Communications (FLBC: http://www-personal.umich.edu/~samoore/research/flbc/), which is XML based. While FLBC is less restrictive than KQML, this flexibility comes at the cost of more complex message processing. By expressing five of the KQML performatives in terms of FLBC illocutions, the
expressive equivalence of the two languages is clearly demonstrated. In this case, the message is completely in FLBC (XML) with the functionality of KQML demonstrably preserved. Vasedevan et al. [104] compare FIPA ACL and KQML and appropriately highlight a key distinction between the two ACLs. FIPA ACL forbids one agent from manipulating the virtual knowledge base internal to another agent. KQML specifies performatives for this purpose: insert, un-insert, delete, delete-all, delete-one and undelete. It is important to note that KQML requires one agent to request these updates from another agent using the advertise performative. To minimize coupling between the GPME and the host, neither agent will be able to manipulate the knowledge base of the other. Liu et al. [105] create a direct translation of FIPA ACL into XML. This approach substantially facilitates the use of FIPA ACL in modern software engineering. It also becomes straightforward to express a FIPA ACL message in XML, and vice versa. This research uses a practical case study in the travel industry to demonstrate the practicality of the approach. Luo et al. [106] describe a modeling and verification methodology for KQML. The methodology guarantees the validity and correctness of KQML using a model checker for multi-agent systems (MCMAS). Because the problem of interfacing the GPME to the host is simpler than the general problem, the verification algorithms can be reflected in the specification of the interface. By using XML to specify the interface, a corresponding XML schema can validate the interface specification and ensure the semantic integrity of the message in terms of the rules of the underlying ACL. Raja et al. [107] propose a service-oriented architecture (SOA) compliant FIPA ACL ontology to integrate ACL within SOA. This research is significant because SOA concepts have
come to dominate inter-agent communication over the Internet through Web Services. The research demonstrates that it is possible to specify a communication framework that takes advantage of both ACL and SOA. This research supports the notion that SOA concepts can also be extended to the GPME interface, specifically to the swarm communication.
CHAPTER 3: METHODOLOGY

3.1 Behavior Oriented Framework

Artificially intelligent agents have been the holy grail of computer science ever since visionaries
imagined automatons as the ultimate human tool. Intelligence is the ability to bring all the knowledge a system has to bear in the solution of a problem [108]. Artificial Intelligence, therefore, is the application of algorithms that use knowledge to solve a problem. An AI system is a biologically inspired attempt to realize or emulate the natural intelligence found in living creatures. The behavior oriented mind model is based on the premise that intelligence is the ability to select an appropriate behavior in response to stimuli, and to acquire new behaviors through observation of the environment. In this context, behaviors range from the simple (crying, laughing, blinking, etc.) to the complex (creating, planning, forecasting, etc.). Intelligence is defined, not by the number or complexity of behaviors the individual masters, but simply by the ability to acquire and apply a behavior in a manner that maximizes success (however success is defined in the context of the individual). From observing nature, living creatures are born with certain innate abilities. They use these abilities to define and interact with their reality. The reality of a creature is the scope of the universe it can perceive. Nature's engineering of intelligence relies on simple components collaborating to produce a result greater than their combined capacity. The components that contribute to intelligence do not appear aware of the result; they appear to only perform their basic biological function. Individual components of a brain, a neuron or a dendrite for example, do not appear intelligent. Yet it is obvious that their collaboration results in intelligence and
intelligent behavior. This observation leads to a design that enables intelligent behavior to emerge from the interaction of the components. No single component is intelligent in and of itself. In this chapter, we define the model of the mind that is the basis for the behavior oriented framework. We begin by describing how behavior is acquired from the basic biological stimulus-response cycle. The cognition cycle describes how a behavior oriented construct interacts with the environment to build and improve its knowledge base. The GPME is a theoretical implementation of the behavior oriented mind model.
3.1.1 Stimulus-Response Behavior Model

Any organism exhibits intelligence when it acquires knowledge through observation of its environment or its peers, develops new ideas and concepts from the observations, and adapts to its environment to ensure its own success. That is, an intelligent organism acts in its environment with a purpose and strategy to ensure its own success. Of course, each organism defines the meaning of success. Intelligence requires purposeful action taken as a result of reasoning. The true test of reasoning is the presence of choices and the selection of a choice through some deliberation. Observing living beings, we can identify three broad classes of behavior: instinctive, acquired, and deliberate. Instinctive behavior requires the least cognitive deliberation. All living beings are born with instinctive behaviors that need not be learned [4], [5]. Deliberate behavior requires the highest degree of cognitive deliberation during its performance. While these behaviors can be the most complex ones a being exhibits, the performance of these behaviors tends to be slow, in comparison to instinctive behaviors, because of the substantial participation of
cognitive deliberation. Acquired behavior is deliberate behavior that, through repetition, requires substantially less cognitive deliberation than deliberate behavior. In effect, acquired behaviors are "soft" instinctive behaviors created from deliberate behaviors. A living being performs an acquired behavior faster than a deliberate behavior because of its lesser reliance on active deliberation. However, the performance of acquired behaviors is slower than that of instinctive behaviors. An example of a behavior is shooting a basketball. A player engages instinctive behaviors, such as grasping the ball and maintaining their balance. The novice player learns through practice; that is, by shooting the basketball several times. The player crafts the shooting technique by creating several deliberate behaviors. She actively thinks about how to hold the ball, the placement of her feet, the follow-through of the shot, and so on. Once the player identifies the behavior that produces the best results, she actively attempts to repeat it. Eventually, the player doesn't have to think about each step in the act of shooting. The deliberate behavior becomes an acquired behavior. Once shooting the basketball becomes an acquired behavior, the player can use her deliberate faculties on the game situation and not on the mechanics of the shot.¹
¹ A good strategy for the opponent is to interrupt the execution of the acquired behavior by re-engaging the deliberate functions into the act of shooting. For example, the opposing coach might call a timeout before a foul shot to allow the pressure of the situation to interfere with the performance of the acquired behavior by forcing a deliberate behavior to occur.
When a stimulus occurs, the agent triggers an appropriate behavior to respond, which is the output of a Response Selection function. The function chooses the response behavior from a set of candidate behaviors.
Figure 3.1 - Simple Stimulus Response Behavior Model

Initially, as depicted in Figure 3.1, there are only instinctive behaviors to choose from. For example, a baby's response to most negative stimuli is to cry. We claim, without substantiation, that crying is an instinctive behavior in babies because they appear to do it without being taught. Over time, deliberate behaviors appear. For example, a baby learns to hear and pronounce words from their maternal language. The same stimulus that caused the baby to cry earlier might now cause the utterance of words, along with crying. Eventually, deliberate behaviors become acquired behaviors. For example, when a toddler calls her mother by name (mom, mama, etc.), the behavior appears instinctive and not deliberate. However, since the baby did not call the mother by name at birth, the behavior is acquired and not instinctive. The response selection function is flexible enough to change its output from an instinctive response to an acquired response or a deliberate response for the same stimulus. Where the baby first responded to discomfort by crying only, the baby now responds by crying and calling for its mother, or, better yet, by just calling for its mother. The crying behavior is not lost; there is a new behavior available that produces better results. For the response selection
function to change its choice of the default (instinctive) response, it needs some information about the outcomes of the available behaviors, as depicted in Figure 3.2.
Figure 3.2 - Expanded Stimulus Response Behavior Model

In living beings, stimuli are embedded in the time-series of physical and mental perceptions that include actions performed and data points about the environment and the being's internal state. The information about outcomes is a projection of the reward of the behavior in the given situation. Therefore, we can view the process of creating behaviors as the result of the analysis of a time-series of perceptions. This analysis results in the identification of many deliberate behaviors that mature into acquired behaviors once their reward is demonstrated (see Figure 3.3).
Figure 3.3 - Behavior Composition Model

Deliberate behaviors that are performed infrequently remain available but always require a high degree of cognitive deliberation to perform them. Deliberate behaviors that are performed frequently become acquired behaviors and require a substantially lesser degree of cognitive deliberation to perform them. This model requires another function, a behavior composition function, that creates the deliberate behaviors and compiles a useful deliberate behavior into an acquired behavior. In our model, an instinctive behavior is atomic. An acquired behavior can be composed from instinctive and acquired behaviors. A deliberate behavior can be composed from instinctive, acquired, and deliberate behaviors. The Behavior Composition function administers the body of behaviors in the manner we have described thus far, as depicted in Figure 3.4. It creates, modifies, and deletes acquired and deliberate behaviors as it processes the time-series of perceptions. Since the number of behaviors grows rapidly, behaviors with low utility are expunged. Utility is a measure of how valuable the behavior is (how often it is applied, how it was learned, etc.). Based on our initial
definition of behaviors, instinctive behaviors are static. The number of acquired behaviors steadies over time as adaptation to the environment occurs.
Figure 3.4 - Complete Stimulus Response Behavior Model

It is now conceivable to separate the physical systems from the heuristics through a formal interface. Furthermore, it should be possible to design and implement this model in such a manner that it can provide the cognitive and metacognitive capabilities for any autonomous platform, through an interface that describes the capabilities of the platform and provides it the time-series of perceptions.
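As an illustration of this separation, the following minimal sketch renders the Response Selection and Behavior Composition functions as plain Python functions. The names (Behavior, response_selection, behavior_composition) and the thresholds are illustrative assumptions, not part of the GPME specification.

from dataclasses import dataclass

@dataclass
class Behavior:
    name: str
    kind: str              # "instinctive", "acquired", or "deliberate"
    utility: float = 0.0   # running measure of how valuable the behavior has been

def response_selection(stimulus, behaviors, outcomes):
    # Choose the behavior with the best known outcome for this stimulus;
    # fall back to an instinctive behavior when no outcome information exists.
    candidates = [b for b in behaviors if stimulus in outcomes.get(b.name, {})]
    if not candidates:
        return next(b for b in behaviors if b.kind == "instinctive")
    return max(candidates, key=lambda b: outcomes[b.name][stimulus])

def behavior_composition(behaviors):
    # Promote frequently rewarded deliberate behaviors into acquired behaviors
    # and expunge behaviors whose utility has decayed to near zero.
    for b in behaviors:
        if b.kind == "deliberate" and b.utility > 5.0:
            b.kind = "acquired"
    return [b for b in behaviors if b.kind == "instinctive" or abs(b.utility) > 0.1]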
3.1.2 Perception Processing and Organization
The perception time-series consists of discrete elements produced by the sensors. Sensors provide inputs at varying frequencies. In addition, the inputs are drastically different from each other. In animals, for example, some inputs are electromagnetic while others are chemical in nature. Nonetheless, the Response Selection and Behavior Composition heuristics process the sensory inputs. We know that in the animal brain, the sensory information is represented as electrochemical signals. In emulating the biological approach, the design cannot rely on
discovering an underlying mathematical formula to establish the relationship between perceptions. Therefore, it treats all perceptions as discrete symbols. The fundamental relationship that exists between perceptions is a correlation. The first correlation is coincidence. All perceptions perceived in the same time interval are coincident. Coincident perceptions are contained in a Frame. The second correlation is adjacency. A frame received at time tn is adjacent to the frames received at times tn-1 and tn+1. Adjacency suggests a graph representation of the time series of frames, such as tn-1 → tn → tn+1 → tn+2. A set of adjacent frames is called an Episode. Each perception within the frame inherits the edges of the frame. The edge between frames simply indicates correlation, not causality, attribution, order, composition, correctness, or any other characteristic. Imparting these other characteristics is the function of Behavior Composition. The heuristic examines episodes to overlay characteristics on edges. A set of episodes with characteristics assigned to its edges is called a Case. Like an episode, a case consists of frames processed to reduce specificity, to make the case more abstract and general than an episode. A processed frame is called a Fragment. We can now define two general classes of stimuli. The first class comes from the environment and conforms with the classical notion of a stimulus. Any individual situated in the environment would perceive a stimulus of this class. We call this first class of stimulus an Event. The second class comes from the Behavior Composition in response to noticing a difference between observed and expected results of applying a behavior. Only individuals who have acquired the same behavior and the same expectations would perceive the same stimulus of this class. We call this second class of stimulus an Anomaly. An anomaly is, therefore, always relative to the individual's expectations. The response to an anomaly depends on its type.
The response to a cognitive anomaly is to adjust the utility and the expected reward, and to select a different behavior. The response to a metacognitive anomaly is learning (i.e., the composition of new behaviors).
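A minimal sketch of this organization is shown below, assuming simple Python containers; the class and field names are illustrative, not the GPME's actual schema.

from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Frame:
    time: int
    perceptions: List[str]        # all symbols perceived in the same time interval

@dataclass
class Episode:
    frames: List[Frame]           # adjacent frames; the edge between frames is pure correlation

@dataclass
class Fragment:
    perceptions: List[str]        # a frame with its specificity reduced

@dataclass
class Case:
    fragments: List[Fragment]
    edge_labels: Dict[int, str]   # characteristics overlaid on edges, e.g. "temporal", "causal"

# A frame received at time tn is adjacent to the frames received at tn-1 and tn+1:
episode = Episode(frames=[Frame(1, ["warm", "bright"]), Frame(2, ["warm", "dark"])])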
3.1.3 Behavior Formation and Learning
Observational learning allows learning to take place using a model instead of merely relying on conditioning. Any model the learner can observe is suitable for teaching a response through observation. A model is another individual that is exhibiting the behavior under observation. Because there is no reinforcement in observational learning, it is crucial that the learner can filter the noise in the stream of perceptions to focus attention on the relevant information. In modern software engineering, software is implemented using conditioning or a behaviorist approach. As a result, even intelligent agents are designed and trained to respond to a narrow set of conditions. If circumstances take the agent even slightly out of this conditioning, the agent's performance suffers dramatically [72]. Biological agents employ observational learning to mitigate brittleness and to adapt to changing conditions in their environment. As a result, the GPME emulates that approach. In the behavior oriented framework, there are two learning mechanisms. The first learning mechanism is bounded within the knowledge base. The knowledge base is a correlation graph of observations organized, on a scale from specific to abstract, by perception, frame & episode, and fragment & case. The Behavior Composition heuristic creates fragments from frames, cases from episodes, and classified edges from correlation edges. Edge classifications include temporal, causal, spatial, composition, ordering, etc. This translation from the specific to
the abstract is called Progressive Reduction. Progressive Reduction creates an internal representation of the environment that the Response Selection heuristic can use to select the behavior that results in maximum reward, based on the predicted state of the environment after the behavior is enacted. The second learning mechanism involves acquiring behavior from another individual, called a Model. A model must have a shared understanding of the environment with the learner. Specifically, they must share a common understanding of the characteristics of observations. In that case, the learner can acquire behavior from the model directly, without having to perform progressive reduction. This learning mechanism is called Selective Imitation. Learning is unfettered. It occurs continually. There needs to be a mechanism for purging behaviors that do not contribute to success. Atrophy is Nature's Ockham's razor. The shortest path to success involves investing resources in what results in success. Behaviors that clearly result in or impede success both support parsimony. Behaviors that contribute neither towards nor against success are not useful. The behavior oriented framework employs atrophy to eliminate this clutter. Every behavior carries an attribute we call Utility. Utility is a positive or negative value that reflects how its application has historically impacted success. As a behavior atrophies, its utility tends towards zero. At or near zero, it is automatically purged from the knowledge base. The behavior oriented framework implements unsupervised learning from a time-series of observations. It superimposes reinforcement learning. Reinforcement learning hinges on a reward. The reward is relative to the specific state of the environment. For example, depending on circumstances, the same behavior results in a reward or a penalty (a negative reward) of
varying magnitude. The potential application of a behavior results in an expected reward. A situation where the expected reward and the actual reward do not match reflects an inconsistency between the understanding of the environment and the reality of the environment. In the behavior oriented framework, there are two rewards. The external reward matches the classical reinforcement learning definition, and the goal is to maximize this reward. The external reward is found within the environment. The internal reward is a function of the inconsistencies between the understanding of the environment (the virtual internal representation of the environment) and the reality of the environment. The goal is to minimize the number and impact of the inconsistencies. Managing the external reward is the role of the cognitive functions. Managing the internal reward is the role of the metacognitive functions. Figure 3.5 depicts the learning hierarchy.
Figure 3.5 - Learning Hierarchy (unsupervised learning at the base, with cognitive reinforcement learning and metacognitive reinforcement learning layered above it)
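As a concrete illustration of the utility and dual-reward bookkeeping described above, consider the following sketch. The decay rate and purge threshold are illustrative assumptions, not values specified by the framework.

def update_utility(behavior, external_reward, expected_reward, decay=0.95, purge_at=0.05):
    # Internal reward: penalize the inconsistency between expectation and observation.
    inconsistency = abs(expected_reward - external_reward)
    # Blend the external (environmental) reward into utility and let unused
    # behaviors atrophy toward zero.
    behavior.utility = behavior.utility * decay + external_reward - inconsistency
    if abs(behavior.utility) < purge_at:
        return None        # atrophied: purge from the knowledge base
    return behavior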
3.1.4 Cognition Cycle

When referring to the environment, we include the setting, the host itself and its internal state, and other hosts (peers) and state information they are sharing. The host is equipped with
instruments that perceive the environment and others that can act in the environment. The GPME is the component within the host that implements the behavior oriented intelligence concepts. A perception is a raw data sample from an instrument. An observation is a perception encapsulated with its metadata. The metadata is additional information the instrument provides about the perception. For example, if the instrument is a camera, the image it captures is the perception. Examples of metadata are the resolution of the image or the location of objects detected in the image. In general, we use perception and observation interchangeably with the understanding that a perception is contained within an observation.
Figure 3.6 - Cognition Cycle (Reality, Perception, Observation, Experience, Imagination, Idea, Behavior)

A contiguous set of observations forms an experience of the environment. From experiences, the GPME creates an internal representation of the environment. When presented with a stimulus, it uses this internal representation to test candidate responses by simulating their
effect on the internal representation. We refer to a candidate response as an idea. It can then choose the idea that produces the desired results in the real environment. Just as the internal representation of the environment, the perceived environment, mirrors the real environment, an idea mirrors an experience. The expectation is that the applied idea matches the corresponding future experience. The society of mind is a scheme where a mind consists of small distinct processes called agents. An agent is not aware and not capable of thought. However, when several agents cooperate in societies, intelligence arises. The intelligence is an outcome of the interaction. Societies are hierarchical, with higher agents controlling those below. In this scheme, actions and decisions emerge from conflicts and negotiations among societies of agents that constantly challenge one another [109]. Similarly, the GPME should be social. It should also be possible to learn from peers. The objective is to acquire the other's experiences without having to go through the same steps. The obvious choice is for peers to share ideas, since ideas are the culmination of processing experiences. For peers to share information, we again require an interface that describes the types of perceptions the peer acquires, because this information allows each peer to determine whether they share a context, that is, whether their internal perceived environments are sufficiently alike to learn from each other.

3.1.4.1 CANDIDATE BEHAVIOR IDENTIFICATION

A case is an abstraction of several similar experiences. In other words, a case is the data representation of an idea constructed as the centroid of a set of experiences. Going forward, we use the terms case and idea interchangeably with the understanding that a case is the computerized representation of an idea. A case is created by clustering a set of experiences and
generating a centroid. The centroid is not an experience, as it did not actually occur in the real environment. However, it is equivalent to each experience in the cluster. The GPME records the success (or failure) resulting from the application of the case. Over time, the GPME develops a utility score and an expected reward for each case. When a similar circumstance arises, the agent reacts (chooses its behavior) based on the case that most closely matches the circumstance. The GPME experiences the environment. It uses a Reasoning Mechanism to match a case from its database to the experience. The matched case includes behaviors that the GPME selects and performs. The behavior may change the environment or the GPME itself. The environment can also change independently of the GPME.
Figure 3.7 - Reasoning

This approach fails routinely because of three broad causes:
• Case not matched error: The GPME does not recognize the current circumstance even though there is a matching case present in its database.
• No case exists error: The GPME has no experience (existing cases) to draw upon.
• Unexpected result error: The GPME finds the matching case and produces the idea but the behavior does not produce the expected result.
The first two causes result in complete failure. The GPME is unable to respond. When an
unexpected result occurs, the GPME had been able to respond. However, all three situations are failures because the GPME is now paralyzed. When subjected to the first two causes, it does nothing until something changes in the environment that it recognizes. When subjected to the third cause, it continually repeats the same incorrect behavior. When subjected to these causes, the GPME needs a breakthrough. Fortunately, these situations are easily recognizable. Therefore, they can be treated like a stimulus whose response is the application of a heuristic designed to create breakthrough behavior. When the unexpected result error occurs, it is necessary to examine the reasoning that led to the selection of the behavior to adjust either the selection criteria or the expected results. We call this process of reasoning about reasoning metacognition, to distinguish it from cognition. Therefore, an unexpected result stimulus triggers the metacognition heuristic. When the case not matched error or the no case exists error occurs, the reasoning mechanisms do not produce any ideas. The GPME needs to guess at an idea. The guesses come from existing ideas; there are no spontaneous ideas because all ideas come from cases (past observations). Therefore, a "no idea found" stimulus triggers the guessing heuristic.
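The routing of these failures to the two heuristics can be sketched as follows. The helper functions are hypothetical placeholders; only the dispatch logic mirrors the description above.

def guessing_heuristic(stimulus):
    # Placeholder: in the GPME this would propose an idea drawn from existing cases.
    return f"guessed idea for {stimulus}"

def metacognition_heuristic(case, observed, expected):
    # Placeholder: adjust the selection criteria or the expected results of the case.
    return f"revised {case}: expected {expected}, observed {observed}"

def handle_reasoning_outcome(error, stimulus=None, case=None, observed=None, expected=None, idea=None):
    # Route the three failure types to the heuristic associated with them.
    if error in ("case_not_matched", "no_case_exists"):
        return guessing_heuristic(stimulus)                        # no idea produced: guess
    if error == "unexpected_result":
        return metacognition_heuristic(case, observed, expected)   # reason about the reasoning
    return idea                                                    # no failure: use the matched idea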
3.1.4.2 BEHAVIOR SELECTION

Many circumstances have deadlines for the application of the behavior. The GPME attempts all reasoning mechanisms against all applicable cases simultaneously. When a behavior is selected or the deadline expires, the GPME abandons incomplete attempts. Because of the parallelism of the reasoning mechanism, the GPME often produces several candidate ideas and must make a choice. Therefore, the GPME needs a choice making heuristic. The choice making strategies are: Ignore, Pick, Blend and Combine. The ignore strategy results in the selection of none of the ideas and therefore in no behavior. The pick strategy results in the selection of one and only one of the available choices. The pick-first strategy always results in choosing the first idea that was produced. The pick-last strategy always results in choosing the last idea that was produced. The blend strategy results in the selection of more than one but less than all of the ideas. The blend-max strategy blends the first and the last ideas. The blend-mix strategy blends any number of ideas. The combine strategy results in the selection of all the ideas.
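The strategies can be sketched directly as a small selection function; the blend-mix subset chosen here is arbitrary and only illustrative.

def choose(ideas, strategy):
    # Apply one of the choice making strategies to an ordered list of candidate ideas.
    if not ideas or strategy == "ignore":
        return []                        # ignore: select nothing, so no behavior
    if strategy == "pick_first":
        return [ideas[0]]
    if strategy == "pick_last":
        return [ideas[-1]]
    if strategy == "blend_max":
        return [ideas[0], ideas[-1]]     # blend the first and the last ideas
    if strategy == "blend_mix":
        return ideas[:-1]                # an arbitrary proper subset of the ideas
    if strategy == "combine":
        return list(ideas)               # select all the ideas
    raise ValueError(strategy)

# Example:
print(choose(["i1", "i2", "i3"], "pick_first"))   # ['i1']
print(choose(["i1", "i2", "i3"], "blend_max"))    # ['i1', 'i3']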
Figure 3.8 - Choice Making Strategy (ignore, pick first, pick last, blend max, blend mix, combine)

The GPME is social. Each member of the society of GPMEs stewards a portion of the collective knowledge base. Collaboration requires sharing cases. However, if there isn't sufficient variety in choice strategy, all the members will converge to the same behaviors and minimize the amount of learning. If each member has a slightly different choice making strategy, the diversity of the population is maintained and the amount of learning can be maximized. The complete GPME design therefore is as follows:
Figure 3.9 - Complete GPME Design

3.1.4.3 HOMEOSTASIS

Circumstances have widely varying impacts on success. When the GPME chooses an idea, it describes a certain outcome in the environment that becomes an expectation once the behavior is enacted. When the observed outcome significantly varies from expectation, the variation must influence the reasoning process. This parameter into reasoning can be neutral when no significant variation is observed, positive when expectations are exceeded, and negative when expectations are not met. We call this parameter Emotion. Emotion represents the difference between the expected success and the actual success. At zero, there are no differences between expectations and reality. Phrased another way, we detect no anomalies when comparing the state of the environment our instruments report and the state of the environment in our ideas. Homeostasis in a system is a state where internal conditions remain stable and constant. For the GPME, success means achieving emotional homeostasis. Reasoning mechanisms produce ideas. Ideas that result in (the best) emotional homeostasis are preferred. The choice
strategy results in behavior intended to maximize emotional homeostasis. Maximization of emotional homeostasis means the minimization of anomalies, the minimization of differences between the perceived environment and the real environment.
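A small sketch of this preference, under the assumption that each idea carries its expected and predicted success values:

def emotion(expected_success, actual_success):
    # Positive when expectations are exceeded, negative when they are not met, zero at homeostasis.
    return actual_success - expected_success

def prefer_homeostatic(ideas):
    # Prefer the idea whose predicted emotion is closest to zero (fewest anomalies).
    return min(ideas, key=lambda idea: abs(emotion(idea["expected"], idea["predicted"])))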
3.2 The Kasai

In the introduction, we postulated that the ability to derive intelligent behavior through
observation depends on a mechanism for detecting systematic patterns in the stream of observations. The mechanism is the Kasai [110]. To understand the Kasai, we consider the problems of pattern recognition, prediction, and anomaly detection in a series of observations. The inputs are symbolic. They are not, for example, numbers between which we can find a mathematical relationship. We are considering the type of observations that help a lion determine where the gazelles will be at a certain time of day. Observations such as the time of day, the temperature, the scent on the wind, the direction of the wind, and the history of experiences inform the decision about where to hunt. Assume that we can observe an individual. We can represent symbolically the actions that she performs. For example, the hand motion primitives data set [111] provides accelerometer data reflecting behaviors such as brush_teeth, climb_stairs, comb_hair, descend_stairs, pour_water, drink_glass, eat_meat, eat_soup, getup_bed, liedown_bed, sitDown_chair, and standUp_chair. We denote each behavior using two letters for brevity. Our observation yields the following time series: [GB, BT, CH, DS, PW, DG, DC, ES, DG, UC, CS, LB]. Given several episodes of such a series, we can recognize patterns and we can detect anomalies. Note that there is no mathematical formula that can be used to describe this series. There are no relationships
between its tokens. It appears we need a declarative approach to describe rules such as: one must DC (sitDown_chair) before one can UC (standUp_chair).
3.2.1 Structure of Data Series
Data points under study can be examined individually or in a group. A datum carries its significance individually within its own context. Data in a group carry significance both as individual data and in terms of each datum's participation in the group. An ordered group where the position of the datum in the sequence is also significant is called a data series. Our objective is to detect patterns of elements in a data series. The concept of order allows us to distinguish between two types of series: random and systematic. A random data series contains no patterns in the occurrence of the data within it. Consider two series of observations: P1=[k,b,a,z,q,p,m] and P2=[k,a,b,k,a,b]. We qualify P1 as random because it does not contain any pattern. Observational learning is not possible on a random series. We qualify P2 as systematic because it contains a pattern. Observational learning is possible as there are at least two discernible episodes [k,a,b]. P2 is simple and obvious. In Nature, observational learning must occur using much more complex intertwined series. There are several types of systematic patterns in the environment. For example, we can consider weather. In the temperate zone of Earth, there are four annual seasons: spring, summer, fall and winter. These seasons repeat in the same cycle. Within each season, there are also sub-seasons. For example, in North America, we experience an Indian Summer in the fall. Consider the climate over 50,000 years. From this perspective, we observe a similar cycle of warming and cooling. For example, ice ages last several thousand years, followed by warming periods of several thousand years. We can also refer to an ice age as a season that possesses sub-seasons.
An Epoch is the period over which the pattern completes one cycle. For example, when we consider the four seasons, the epoch is the year. When we consider the ice ages, the epoch is 50,000 years. The annual epoch is a sub-epoch of this longer epoch. The detection of a pattern in a data series is a function of the method used to detect it. For example, P1 does not have a pattern in terms of its components. However, there is a pattern when one considers the relationships between [k, b, a] in the alphabet and notices that the relationship is the same as that between [z, q, p]. This relationship lets us predict what should follow [m] if P1 is systematic. The pattern in P2 is obvious and does not depend on the lexical relationships of the letters. Another way to make this observation is that detection methods cannot detect patterns they are not designed to detect. The elements of the data pattern are symbols. We do not assume there is an embedded relationship between them. Our use of letters, for example, is not meant to imply a lexical ordering. In this context, every element of the data series must be considered. If the element can be detected, it is not noise. The element in the data series is a symbol. The detection method processes the symbol and produces a token. The symbol and the token are the same object. A symbol that cannot be processed into a token is not presented to the Kasai. For example, if the methods of detection of P1 or P2 are presented with a number instead of a letter, they cannot create a token for the number. On the other hand, the Kasai ignores no tokens. Therefore, the Kasai detects patterns in systematic series of tokens. When we consider the types of systematic patterns that can be constructed from tokens, we identify two types of token series, which we call reflexive (Figure 3.10) and periodic (Figure 3.11).
A reflexive pattern uses the same token: P3=[a,a,a,a,…]. A periodic pattern repeats a series of tokens: P4=[a,b,c,a,b,c,…].
Figure 3.10. Reflexive Pattern
Figure 3.11. Periodic Pattern
In the figures, the edge with the arrow indicates that a sequence repeats at some point. In the case of P3, it repeats immediately. In the case of P4, it repeats after the token [c], when n=3. The next type of systematic pattern is composed of token subsequences. We denote a subsequence with a capital letter. A subsequence is a finite symbol series such as S=[a,b,c]. A branched pattern occurs when a symbol periodically occurs in the series. Consider the series P5=[a,b,c,a,b,c,a,b,k,a,b,c,a,b,c,a,b,k,…]. It contains the subsequences S=[a,b,c] and R=[a,b,k]. Written in terms of subsequences, P5 becomes P5=[S,S,R,S,S,R,…], reducing it to a periodic pattern. Branched patterns are periodic patterns of symbols.
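A naive way to visualize the difference between reflexive, periodic, and random series is to look for the shortest repeating unit, as in the sketch below. This scan is only an illustration; it is not the Kasai algorithm, which also handles branched and hybrid patterns incrementally.

def smallest_period(tokens):
    # Return the shortest repeating unit of a token series, or None if no
    # proper period fits the observed prefix.
    n = len(tokens)
    for p in range(1, n):
        if all(tokens[i] == tokens[i % p] for i in range(n)):
            return tokens[:p]
    return None

print(smallest_period(list("aaaa")))     # ['a']            -> reflexive pattern (P3)
print(smallest_period(list("abcabc")))   # ['a', 'b', 'c']  -> periodic pattern (P4)
print(smallest_period(list("kbazqpm")))  # None             -> no period detected (P1)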
Figure 3.12 - Branched Pattern

A branched pattern contains a cycle with at least one branch, with the repetition of one or more subsequences. For example, in series P5, subsequence S cycles once before subsequence R occurs. At the token level, there are two occurrences of token [b] before token [k] occurs. Edges
between nodes capture this behavior in Figure 3.12. In practice, edges are labeled with the cycle count, as in Figure 3.13.
Figure 3.13 - Labeled Branched Pattern

The graph in Figure 3.13 produces [a,b,c] on cycles 1 and 2, and [a,b,k] on cycle 3, ad infinitum, just like series P5. We can visualize each subsequence as a simple season, and a series of subsequences as an epoch. The seasons occur in a certain order over an epoch. An epoch is the period over which all sequences appear at least once, before starting again. Within each season, there can be sub-seasons, sub-sub-seasons, and so on. A complex season has sub-seasons while a simple season does not. This observation leads us to define a hybrid pattern (Figure 3.14) as one that contains any combination of reflexive, periodic and branched patterns (the letters on the edges denote the cycle count).
Figure 3.14 - Hybrid Pattern
The Kasai is a technique that processes a data series and derives the rules that produce the data series. A set of rules that mirrors a data series has several advantages. It acts as a memory because it captures the static and dynamic characteristics of the data series. It enables the prediction of future state of the data series based on the current state. It supports comparison of data series using set operations and graph analysis techniques, which can be more efficient and insightful than brute force comparisons.
3.2.2 Dynamic Analysis of Data Series
The Kasai dynamically builds a set of rules that describe the sequence processed to date. A Rule takes the form subsequence → token. The rule Sx → tn denotes that subsequence Sx predicts token tn. The collection of rules the Kasai builds is called a Grammar. Within the Kasai, the grammar is represented as a directed graph. The nodes of the graph are the tokens. The edges are directed and form a unique path through the nodes. Thus, a Path is a sequence of edges that connects a set of nodes such that the node at which an edge ends becomes the node at which the next edge starts. The graph is fully connected and all nodes are reachable. Each rule of the form Sx → tn causes the set of nodes that form the subsequence Sx, together with tn, to form a path. The first node added to the graph is referred to as the Root. Since all rules are connected, any node could be designated as the root. By convention, the first node is the root. However, the best root is the most frequently occurring rule in the sequence. Unfortunately, the most frequently occurring rule may not be known at the outset, or it can change over time. The Kasai refactors the grammar to reposition the root node. To circumvent the need for refactoring, the root node can be an independent mechanism, such as an internal clock tick, that occurs at a higher rate than the perception rate of the sensors.
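As an illustration only, the grammar for the systematic series P2=[k,a,b,k,a,b] can be written as subsequence → token rules. The dictionary encoding below is an assumed, simplified rendering for readability; it is not the Kasai's internal graph representation.

# Each rule Sx -> tn states that subsequence Sx predicts token tn.
grammar = {
    ("k",): "a",           # [k]       -> a
    ("k", "a"): "b",       # [k, a]    -> b
    ("k", "a", "b"): "k",  # [k, a, b] -> k (the pattern cycles back to the root)
}

def predict(history):
    # Predict the next token from the longest matching rule, if any.
    for length in range(len(history), 0, -1):
        rule = tuple(history[-length:])
        if rule in grammar:
            return grammar[rule]
    return None

print(predict(["k", "a"]))       # 'b'
print(predict(["k", "a", "b"]))  # 'k'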
The grammar is a static construct, but the description of the sequence must include the dynamic aspects of the sequence as well. The grammar describes the static structure of the sequence using rules and paths. To capture the dynamic aspects, the Kasai introduces the concept of cycles overlaid on top of the grammar. A Cycle is a path that leads back to the root node. The graph that captures both the static and dynamic aspects of the sequence is called a Sarufi. Figure 3.13 is an example of a Sarufi. A pattern is seasonal whenever its Sarufi has cycles greater than one (1). Each cycle in a Sarufi has a charge. As we traverse the Sarufi, the charge builds by one each time we pass through the root node. When the charge reaches the cycle value, the cycle is active. Once the cycle is traversed, its charge goes back to zero (0). We define an Ideal Sarufi as one where the root node is in cycle 1 and there is a path from each node to any other node. For example, the Sarufi of a genome will be ideal because there is a path from each node to any other node. However, the Sarufi of weather will not be ideal immediately because it starts somewhere in the middle of the weather pattern. Eventually, a non-ideal Sarufi will become ideal because the algorithm refactors the Sarufi as it discovers new patterns in the data. Systematic patterns result in ideal Sarufi. Random patterns cannot. Theorem: The Kasai is complete. It produces only the rules reflecting the data series it has processed to date, and the rules it produces fully reproduce the data series. To prove this theorem, we give an exhaustive list below of all the different types of seasonality that can occur in a data series and show that each of these cases is represented by one of the patterns (reflexive, periodic, branched or hybrid) in the Kasai.
CASE 1: EPOCH WITH ONE SEASON
{[a],[a],[a],…} An infinite series of the same season where each epoch has only one season. Kasai represents this type of series as a reflexive pattern.

CASE 2 – EPOCH WITH DISTINCT MULTIPLE SEASONS
Sub-Case 2.a
{[a b c],[a b c],…} An infinite series of multiple seasons of the same length. In this example, each epoch has 3 seasons [a], [b] and [c]. This case corresponds to a periodic pattern in Kasai.
Sub-Case 2.b
{[a a a b], [a a a b], …} An infinite series of multiple seasons of different lengths. This is obtained by combining a finite number of case 1 epochs with a simple season. In this example, each
epoch has 3 [a] seasons followed by a [b] season. This case corresponds to a reflexive and periodic pattern in Kasai.
Sub-Case 2.c
{[a a a b b b], [a a a b b b], …} An infinite series of multiple seasons of different lengths. This is obtained by combining a finite number of case 1 epochs with a finite number of other case 1 epochs. In this example, each epoch has three [a] seasons followed by three [b] seasons. This case corresponds to a reflexive and periodic pattern in Kasai.
Sub-Case 2.d
{[a a a b c c c], [a a a b c c c], …} An infinite series of multiple seasons of different lengths. This is obtained by combining multiple finite case 1’s with a simple season. This case corresponds to a reflexive, periodic and branched pattern in Kasai.
CASE 3 – NON-OVERLAPPING COMPLEX SEASONS
Sub-Case 3.a
{[a b c a b c], [d e f d e f], [a b c a b c], [d e f d e f], …} An infinite series of complex seasons obtained by combining multiple finite numbers of case 2 epochs. Each epoch in this example contains multiple complex seasons {[a b c a b c], [d e f d e f]}. This case corresponds to a periodic and branched pattern in Kasai.
Sub-Case 3.b
{[a b c a b c], [k], [d e f d e f], …} An infinite series of complex seasons obtained by combining multiple finite numbers of case 2 epochs with simple seasons. It contains the multiple complex seasons [a b c a b c] and [d e f d e f], and the simple season [k]. This case corresponds to a periodic and branched pattern in Kasai.
Sub-Case 3.c
{[a a a b b b c], [a a a b b b c], [d e e e f],[d e e e f]}, {[a a a b b b c], [a a a b b b c],[d e e e f],[d e e e f]}, … An infinite series of complex seasons obtained by combining multiple finite numbers of case 2 epochs with simple seasons, with repeating seasons. This case corresponds to a reflexive, periodic and branched pattern in Kasai.
CASE 4 – OVERLAPPING COMPLEX SEASONS
Sub-Case 4.a
{[a b c],[a b c],[a b k], [a b c],[a b c],[a b k], …} An infinite series of overlapping complex seasons. In this example, the complex season [a b c] and the complex season [a b k] have two seasons that overlap ([a] and [b]). This
case corresponds to a periodic and branched pattern in Kasai.
Sub-Case 4.b
{{[a b c],[a b c],[a b k]}, {[a b c],[a b c],[a b k]}, {[a a a b b b c], [a a a b b b c], [d e e e f],[d e e e f]}}, {{[a b c],[a b c],[a b k]},… An infinite series of overlapping and/or non-overlapping complex seasons over multiple epochs. This is obtained by combining multiple finite numbers of case 4 epochs with simple seasons or case 1 or 2 or 3 epochs. This case corresponds to a reflexive, periodic and branched pattern in Kasai.
Since the final case is overlapping and recursive, no sequence can be formed at a level higher than complex seasons. From this point on, we find that more complex combinations of seasons are equivalent to the final case. We conclude that seasonality in a data series can only contain reflexive, periodic, branched and hybrid patterns. Since the Kasai creates rules for any reflexive, periodic, branched or hybrid patterns, it also creates a complete set of rules for seasonal patterns.
3.2.3 Design of The Kasai
The implementation of the Kasai begins with a perceptron. The idea is to build a more abstract structure on top of a perceptron that combines the symbolic and connectionist paradigms by adding state-machine characteristics.
At first glance, it appears that a recurrent network such as an LSTM could be used as the underlying data structure for the Kasai. Schmidhuber discusses deep and shallow learning using recurrent neural networks [112]. In a recurrent neural network, the number of past observations is fixed and predetermined. Recurrent neural networks need to know the depth of the seasonality of a time series. Unfortunately, the characteristics of the time series are not known at the outset; the Kasai does not need to know when or where seasonality will occur within the time series.

3.2.3.1 THE KASAI ABSTRACT DATA TYPE

The context view of the Kasai (Figure 3.15) shows that it accepts a data series as input and produces either a prediction of the next token in the series or a report of an anomaly. A control, depicted at the bottom, activates the learning mode. When the learning mode is activated, the Kasai modifies its internal structures to reflect its observations of the data series. When it is not activated, no modification occurs.
Figure 3.15 - Kasai Context Diagram

The core atomic internal element of the Kasai is called a Kasi. As shown in Figure 3.16, the Kasi element is modeled on a perceptron.
Figure 3.16 - Kasai Kasi Component

It accepts one external input token, denoted t, and an internal recurrent input called the charge, denoted c. The activation function is denoted Θ (Theta). The arguments in brackets are set when the Kasi is instantiated. The argument τ (Tau) is the expected matching token and the argument κ (Kappa) is the target cycle count. Referring to Figure 3.18, for example, the rightmost Kasi is initialized with κ=3 and τ=[e]. The Θ function is as follows:
function Θ (t, c):
    if t = τ and c = κ
        c = 0;
        return Forward;
    else if t = τ
        return Back;
    else
        return NULL;
    end if;
end function Θ;
The charge variable c is special. There is one instance of c associated with each Kasi. Each time an input t is processed by the Kasai object, all c variables are incremented by 1. The connections Back and Forward point to the next Kasi that will accept the next input t. If the inputs t and c match τ and κ, the Kasi returns the Forward link. If the input t matches τ but c does not match κ, the Kasi returns the Back link. Otherwise, the input t did not match τ and, therefore, the Kasi returns a NULL link. Obviously, the Kasi participates in a network of Kasi. Referring to Figure 3.17, for example, each node on the graph corresponds to a Kasi. A Sarufi (Figure 3.17) is a graph that connects multiple Kasi; it is the data arrangement in memory, a graph of interconnected Kasi. The Kasai and the Kasi are instantiated objects. There is only one Kasai object, and it contains all Kasi objects and the Sarufi.
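A direct, sequential Python transcription of the Θ function above is sketched below, under the assumption that Back and Forward are references to other Kasi objects; incrementing every charge on each input is the responsibility of the enclosing Kasai object, as described above.

class Kasi:
    # One node of a Sarufi: expects token tau on cycle kappa.
    def __init__(self, tau, kappa):
        self.tau = tau          # expected matching token
        self.kappa = kappa      # target cycle count
        self.c = 0              # charge, incremented by the Kasai object
        self.back = None        # next Kasi when the cycle target is not yet reached
        self.forward = None     # next Kasi when the cycle target is reached

    def theta(self, t):
        if t == self.tau and self.c == self.kappa:
            self.c = 0
            return self.forward
        if t == self.tau:
            return self.back
        return None             # anomaly: t did not match tau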
Figure 3.17 - Kasai Sarufi Data Structure
The mainline of the Kasai object is as follows:
Kasai Mainline:
loop:
    t = next data series input;
    if activeKasi is the rootKasi
        for all kasi:                          //parallel for
            kasi->c = kasi->c + 1;
    predecessorKasi = activeKasi;
    activeKasi = activeKasi->Θ(t);
    if activeKasi is NULL                      //anomaly
        signalAnomaly(t, predecessorKasi->τ);
        if learning mode is active
            activeKasi = adjustSarufi();
        end if;
    else
        signalPrediction(activeKasi->τ);       //prediction
    end if;
end loop;
end Kasai Mainline;
The Kasai begins by incrementing the c of all Kasi upon receiving the input t. It uses the last link it received to apply its Θ function using t (and c). If it receives a link to another Kasi
(Back or Forward), it signals its τ. In effect, it is predicting the next value of the data series. Signaling means that the Kasai informs the overall client application that is using it to accomplish its purpose. Signaling could be implemented as a message, a printout, or a file output, for example. When the Kasi returns NULL, it indicates that an anomaly has occurred. The anomaly is the fact that the expression (t = τ) is false. If the Kasai is in learning mode, it modifies the Sarufi to deal with the anomaly. If the Kasai is not in learning mode, the Sarufi is not modified. In either case, the Kasai signals the state information about the anomaly (i.e., the input token t and the last correct token τ). Figure 3.17 depicts a Sarufi created by the algorithm. Note that where the Back and Forward links go to the same Kasi, the link is not labelled, for the sake of clarity. As an example, let us apply the following τ to Figure 3.17 from left to right: τ = [a,b,c,d,e], where the Back link points to [t = b]. Let the charges be κ = [1, {2,3}, 3, 3, 3], where the κ for Back is 2 and for Forward is 3. Figure 3.18 shows the resulting Sarufi. The Sarufi then encodes the sequence P=[a,b,c,a,b,c,a,b,d,e,c,a,b,c,a,b,c,…]. This infinitely long series is fully represented using a very compact representation. It is recurrent and, unlike a neural network, its recurrence can be infinitely expanded. It allows for the comparison of the structures of patterns in series.
Figure 3.18 - Kasai Sarufi Example

The adjustSarufi() function modifies the Sarufi when an anomaly is detected. The anomaly did not occur at the current Kasi; it occurred at the predecessor Kasi that directed the Kasai incorrectly. The correction is simple. We need to create a new Kasi using the current highest c value as κ and the unexpected input token t as τ. The new Kasi needs to connect to the predecessor Kasi. Of course, the predecessor Kasi cannot be modified, as the parameter κ cannot be altered after the creation of the Kasi. However, we need a new Kasi in the same position in the Sarufi as the predecessor Kasi. The new predecessor uses the same τ as the other predecessor but it uses the new κ. Therefore, there is now a possibility that the Θ function could return more than one Kasi link. To eliminate this problem, we merge the predecessor Kasi into one, as shown in Figure 3.19, and we modify the Θ function accordingly into a more general form, since we have replaced the concept of the Back and Forward links with unlimited links.
Figure 3.19 - General Kasi

It is now possible for a Kasi to branch to more than two other Kasi, thus becoming general. The new Θ function becomes parallel so that it evaluates all Θ simultaneously inside the Kasi:
function Θ (t, c):
    if t is equal to τ
        for all Κ: //parallel
            if cn >= κn
                cn = 0;
                map (κn, Link);
            end if;
        link = reduce by maximum κn;
        return link;
    else
        return NULL;
    end if;
end function Θ;
The general Θ function uses map-reduce. For each Θn where cn >= κn, the link and the κn are mapped, and the charge cn is reset. The Kasi performs these actions in parallel with the FOR ALL instruction. The reduce step selects the link with the largest κn in the map and returns it. Note that this function always returns a link if the input t = τ, because the adjustSarufi() function always creates at least one Kasi with a function definition Θ[τ, 1]. Given the general Kasi function Θ definition, the specification of the adjustSarufi() function is as follows:
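As an informal rendering of the generalized Θ function, the Python sketch below keeps one charge per (κ, link) pair and performs the map and reduce steps sequentially; the class and field names are assumptions made for this sketch, and the parallelism of the FOR ALL instruction is not modeled.

class GeneralKasi:
    """Sketch of a general Kasi: several (κ, link) pairs instead of Back/Forward."""
    def __init__(self, tau):
        self.tau = tau       # expected input token τ
        self.links = {}      # κ value -> target Kasi
        self.charges = {}    # κ value -> its charge c_n

    def add_link(self, kappa, target):
        self.links[kappa] = target
        self.charges[kappa] = 0

    def theta(self, t):
        """Map: collect every link whose charge has reached its κ; reduce: keep the
        link with the largest κ. Return None when t does not match τ (an anomaly)."""
        if t != self.tau:
            return None
        fired = []
        for kappa, link in self.links.items():
            if self.charges[kappa] >= kappa:
                self.charges[kappa] = 0
                fired.append((kappa, link))
        if not fired:
            return None
        return max(fired, key=lambda pair: pair[0])[1]   # reduce by maximum κ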
function adjustSarufi ():
    newKasi = new Kasi (t, global_c);
    if predecessorKasi is NULL          //This is a new Sarufi
        rootKasi = newKasi;
    else
        predecessorKasi->new Θ (global_c, newKasi);
    end if;
    return newKasi;
end function adjustSarufi;
When the input t is the unexpected input that caused the anomaly, it needs to be added to the Sarufi. It occurred at the current maximum charge within the Kasai, which the Kasai stores in the global_c variable. We now finalize the Kasai mainline:
Kasai Mainline:
    predecessorKasi = NULL;
    rootKasi = NULL;
    global_c = 0;
    loop:
        t = next data series input;
        if rootKasi is NULL                        //create the first Sarufi
            activeKasi = adjustSarufi();
        if activeKasi is the rootKasi
            global_c = global_c + 1;
            for all kasi: //parallel for
                kasi->c = kasi->c + 1;
        end if;
        if t is equal to activeKasi->τ             //no anomaly
            predecessorKasi = activeKasi;
            activeKasi = activeKasi->Kasi();
            signalPrediction(activeKasi->τ);       //prediction
        else                                       //anomaly
            signalAnomaly(t, predecessorKasi->τ);
            if learning mode is active
                activeKasi = adjustSarufi(t, predecessorKasi);
            else
                activeKasi = rootKasi;
            end if;
        end if;
    end loop;
end Kasai Mainline;
We now finalize the Kasi function given that the Kasai mainline detects the anomaly and no longer invokes the Kasi when an anomaly occurs:
function Kasi ():
    //Variables c, κ, τ are part of the Kasi object. If t is not τ, this function is not called.
    for all Θ: //parallel
        if cn >= κn
            cn = 0;
            map (κn, Link);
        end if;
    link = parallel reduce by maximum κn;
    return link;
end function Kasi;
The Θ function is clearly designed to take advantage of multiple threads or processors. Implementing each Kasi object to run on, for example, an nVidia GPU warp makes sense. The following diagram represents a Sarufi constructed using the algorithm described above:
Figure 3.20 - Kasai General Sarufi Data Structure
When a new Kasi is created, the value of κ is set to the global_c. The Kasai mainline increments the global_c variable each time it traverses the rootKasi. Therefore, the rootKasi must point to the most frequently visited Kasi. Each Kasi needs a variable to count visits, denoted v. If the value of v for the rootKasi is the highest v in the Sarufi, the Sarufi is ideal. Otherwise, it is no longer ideal and must be refactored. The refactorization concludes with the most visited Kasi becoming the rootKasi. To the Kasai mainline, we add an instruction, prior to the end loop statement, that increments the visit count v of the active Kasi.
Refactoring occurs when the Sarufi is expanded with a new Kasi. Therefore, to the adjustSarufi() function, we add the following instruction within the else clause:

    refactorSarufi();
The invocation of the refactorSarufi() function is asynchronous; refactoring the Sarufi happens in parallel and does not interrupt or pause the Kasai object. If the refactorSarufi() function determines that the rootKasi no longer points to the most visited Kasi, it creates a new Sarufi, beginning with the Kasi with the highest visit count, with κ=1. Then, it propagates forward in the old Sarufi, processing each τ by creating a new Kasi with the adjusted κ. When the new Sarufi is complete, the Kasai switches to the new Sarufi at the next anomaly. In summary, Figure 3.21 depicts the architecture of the Kasai:
Figure 3.21 - Kasai Architecture Diagram
3.2.3.2 HARDWARE IMPLEMENTATION
The Kasai is meant to be implemented in hardware, on-chip. Each Kasi circuit consists of memory storage for the parameters (κ and τ) and the charge. It requires an addition circuit to increment the charge. Its output is the τ.
Figure 3.22 - Conceptual Hardware Kasi Element
Once initialized with its parameters, the Kasi element increments the charge when it receives the COUNT CYCLE signal. When the Kasi receives the ACTIVATION signal, it delays resetting the charge adder so that the Kasi circuit can operate one last time. The activation enables the final (rightmost) output AND gate. On the left, the first logic gate compares the charge to κ. If the output is TRUE, it enables the AND gate that presents τ to the final gate. Thus, τ is the output upon activation. A Kasai chip can be built using several Kasi elements as shown in Figure 3.23.
Figure 3.23 - Conceptual Kasai Hardware Chip
The Kasai chip consists of a bank of Kasi elements as depicted in Figure 3.22, connected to the firmware processor via three connections. The data bus, in the center, addresses each Kasi by its index. It sets the τ and κ when the Kasi is initialized, and receives the τ and κ when the Kasi fires. On the left, the activation bus addresses each Kasi individually to signal it to fire. Only one Kasi receives the activation signal at a time. On the right, the Count Cycle Signal Broadcast instructs each Kasi to increment its charge.
The chip accepts a setting for the learning mode and an input τ. It produces the prediction and the anomaly signals. Each chip operates a single Kasai. Creating a Kasai network requires interconnecting several Kasai chips.
3.2.4 KASAI NETWORK
A Kasai Network is an arrangement of Kasai such that the output of one Kasai is the input of another. A prediction Kasai produces a prediction of the next token in the sequence based on the sequence processed to date.
Figure 3.24 - Kasai Network Example
The example Kasai Network depicted in Figure 3.24 shows the construction of a unified environment Kasai created from the combination of other Kasai. On the left, physical sensors produce sequences that are input into their own assigned Kasai. The outputs of these Kasai are combined to form virtual sensors. In the example, the combined visual and auditory Kasai output form a virtual energy sensor. The combined auditory and touch Kasai output form a virtual physical sensor. The combined touch, taste and scent Kasai output form a virtual chemical sensor. In turn, the combined energy, physical and chemical Kasai output form a virtual environment sensor. This Kasai Network enables prediction of the state of the environment. The Kasai network captures the current patterns that apply in the environment, combining facts and rules that specify the pattern into a single structure.
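A minimal sketch of this composition is shown below. It assumes each Kasai exposes a process() method that consumes one token and returns its prediction; that interface is an assumption made here to keep the example short, not part of the design above.

def run_virtual_energy_sensor(visual_kasai, auditory_kasai, energy_kasai,
                              visual_stream, auditory_stream):
    """Feed two physical-sensor Kasai and combine their outputs into the token
    stream of a virtual energy Kasai, as in the network of Figure 3.24."""
    for v_token, a_token in zip(visual_stream, auditory_stream):
        v_prediction = visual_kasai.process(v_token)       # output of the visual Kasai
        a_prediction = auditory_kasai.process(a_token)     # output of the auditory Kasai
        # the combined outputs are the input tokens of the virtual energy sensor
        yield energy_kasai.process((v_prediction, a_prediction))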
3.3 GENERAL-PURPOSE METACOGNITION ENGINE DESIGN
This section provides a detailed specification of the GPME. The GPME's goal is to organize an internal representation of the environment that perfectly matches the environment. That is, every stimulus has a response that maximizes rewards, and there exist no unmet expectations.
3.3.1 CONTEXTUAL VIEW
Figure 3.25 depicts the global context of the system. The environment contains the host. The host operates within and interacts with the environment. The host contains the GPME. The GPME provides the host with metacognitive capabilities. We refer to the combined host and GPME as the system.
Figure 3.25 - Global Context
From a logical perspective, Figure 3.26 depicts that the host is the GPME's interface to the environment. From the GPME's perspective, the host is the environment. More precisely, the GPME only knows of the environment what the host shares. We refer to the flow of information between the GPME and the host as the telemetry. Specifically, information flowing from the GPME to the host is a suggestion (to act). Information flowing from the host to the GPME is an observation (of the environment or of the host itself). We refer to a device or capability the host possesses to interact with the environment as an instrument. Specifically, a device or capability to collect information about the environment or about the host is a sensor. A device or capability to affect the environment or the host, in a manner detectable by a sensor, is an actuator.
Figure 3.26 - Global Context Logical View
The host is a sophisticated system capable of performing several functions autonomously. For example, the host is a robot capable of movement, equipped with a gripping arm, and auditory and visual sensors. The host can safely navigate a space from an original location to a target location. The GPME does not provide detailed systematic instruction to navigate from point A to point B. The GPME suggests that the host move from point A to point B. The host is sophisticated enough to act on this suggestion and report its status back to the GPME. We refer to it as a suggestion because the host may not be able to act or may not succeed in the act. The instruments provide raw data from sampling the environment and the host, as well as processed data from the sampling. For example, an advanced visual sensor in the host reports an image in PNG format (raw data) and it provides a list of objects or faces it detects in the image (processed data). Therefore, an observation consists of raw data and processed data. The GPME requires some built-in knowledge, such as how to use its sensors and actuators, but also certain heuristics to enable reasoning. It can be designed as a layered architecture consisting of:
• The physical host or body:
o Physical components of the robot body and instruments;
o Primitive heuristics to control the body, use the instruments and perform primitive behaviors such as reflexes;
• Cognitive heuristics to organize the information the GPME collects about its environment, and to develop behaviors through reasoning;
• Metacognitive heuristics that enable reasoning about reasoning.
The purpose of the GPME heuristics (primitive, cognitive and metacognitive) is to implement a complete Stimulus Response Behavior Model, enabling instinctive, acquired, and deliberate behaviors. We refer to the individual who is creating the system as the designer. The designer incorporates the GPME within her work and designs the host to accomplish her requirements. The designer composes an XML document called the Environmental Interface Specification (EIS). The EIS defines the environment to the GPME, including the capabilities of the host. The designer also composes another XML document called the Operational Interface Specification (OIS). The OIS specifies the specific communication mechanisms available between the host and the GPME. The EIS and OIS can be revised at any time during the operation of the system.
Figure 3.27 - GPME Communication Interface Documents
Therefore, the GPME is a pure intelligence. It is not a human or animal intelligence, and as such, is not designed to interact directly with living things. Since living things are part of the environment, interaction with them is a function of the host. For example, if the system requires human interaction, the host must be equipped with instruments that support human interaction and a human model of the world.
3.3.2 GPME-HOST DEPLOYMENT TOPOLOGIES
There are three basic topologies for deploying the GPME and the host:
Figure 3.28 - GPME-Host Topologies
In the integrated topology, the host shares its computing facilities with the GPME. For example, the designer loads the GPME within the robot's onboard computer. In the networked topology, the host and the GPME operate on distinct computing facilities. For example, the host is a robot in the field while the GPME runs on a remote server. The networked topology supports two variants: dedicated and coordinated. In the dedicated variant, the remote GPME behaves as if it is integrated into each host it services. Therefore, the GPME maintains a separate context for each host. In the dedicated variant, there is no sharing of knowledge. In the coordinated variant, one instance of the GPME supports several hosts with a shared knowledge base. For example, a remote GPME provides metacognitive services to a group of robots that share their knowledge and experiences while each functions autonomously. For example, assume robot A experiences a desert and robot B experiences an ice field. In the integrated mode and in the networked dedicated variant, both A and B have to learn how to deal with the terrain type when encountered. In the networked coordinated variant, robot A benefits from robot B's experiences with ice fields, and vice-versa, as soon as it encounters the new terrain type. In the dedicated variant, both robots must learn how to deal with the new environments on their own.
3.3.3 GPME SWARM
Independent of the GPME-Host topology, the GPME can be deployed as a single instance, as several independent instances, or as part of a cooperating group. When the designer opts to deploy several cooperating GPME instances, we refer to the collective GPME instances as a swarm. Members of the swarm share parts of their respective knowledge bases with each other. The designer enables the swarm through the OIS. The swarm members compose and share an XML document called the Social Interface Specification (SIS). Figure 3.27 depicts the SIS between the two GPME instances. The result of operating in a swarm is an acceleration of the development of the GPME's knowledge bases. As depicted in Figure 3.29, the swarm topology is employed in conjunction with the integrated and networked deployment topologies described earlier.
Figure 3.29 - GPME Swarm
For example, assume the designer is preparing Mars for habitation. She sends a fleet of robots of various roles: builders, excavators, prospectors and explorers. They share a common platform but their instruments are different based on their roles. A single GPME could service the fleet, but a swarm is a better option. The designer deploys a GPME for each type of robot so that the GPME specializes in providing metacognition for the specific type of robot. The GPME instances participate in a swarm, allowing them to share the relevant parts of their knowledge bases.
3.3.4 COGNITION CIRCUIT
At the core of the GPME, there occurs a continuous cycle called the Perpetual Cognition Circuit, depicted in Figure 3.30. The GPME receives telemetry from the environment, which includes the host. The instruments create an observation. The observation triggers the learning apparatus to process the new observation into the knowledge base. The assimilation of a new observation changes the organization of the episodic memory. The projection apparatus uses the knowledge base to project the future wellbeing of the system. If the future wellbeing of the system is in jeopardy, it suggests actions that maximize wellbeing and monitors success. The measure of wellbeing is called homeostasis. The purpose and only goal of the GPME is to maximize homeostasis. All other goals are effectively steps towards this overarching goal.
Figure 3.30 - Perpetual Cognition Circuit
The sensory apparatus provides a continuous time series of perceptions of the environment and of the host. The time series of perceptions contains noise that needs to be identified and separated from the observations. Reasoning is based on observations. We can think of the time series as a language and the GPME as a child trying to communicate. Not all the sounds the child hears are meaningful, including some coming from other people, such as a cough or a grunt. The cough or grunt is noise (ignoring the emotional messaging!) while the words others speak are observations. Similarly, the GPME dynamically builds a grammar of the time series so that it can distinguish noise from observations. As the grammar matures, the observations become more precise and the learning apparatus can perform to its maximal extent.
3.3.5 LEARNING
The GPME accepts telemetry about the environment and the host. To filter noise and identify significant features in the telemetry, the GPME looks for rhythmic patterns. This objective requires identifying base and correlated patterns. Sections of the telemetry that provide patterns are candidates for learning the relationship between the pattern and the effect on homeostasis. The GPME has two learning mechanisms with which to build the knowledge base. The direct mechanism processes the telemetry. It is called progressive reduction. Progressive reduction allows the agent to process high volume streams into manageable clusters where machine learning techniques can be applied to conduct observational learning. In this manner, the GPME is capable of learning and adapting from its environment in a manner very similar to biological agents.
The indirect mechanism obtains inference rules from other more mature GPME instances. It is called selective imitation. The GPME learns from other GPME instances without processing the telemetry that produced the knowledge.
3.3.6 STREAMS
A unit of telemetry is called a segment. An instrument produces one or more segments during a moment. The instrument creates a stream of segments. The combination of all segments captured during a single moment is called a frame. Each frame contains several segments from each instrument, including empty segments. Each frame also contains a segment from the GPME stream of suggestions. Each frame also contains a segment from the internal base patterns. The segments share the same moment, the same temporal index. Figure 3.31 shows the frame vertically with its constituent segments denoted by σ.
Figure 3.31 - Streams of Segments and Frames
The telemetry consists of several observation streams (one for each instrument defined in the EIS, collectively denoted σo) and the single suggestion stream (denoted σS). The frame also contains the GPME's homeostasis and emotional state calculated at that moment (denoted σH).
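The segment and frame structure just described could be rendered roughly as follows; the field names are illustrative only and not taken from the GPME specification.

from dataclasses import dataclass, field
from typing import Any, Dict

@dataclass
class Frame:
    """All segments captured during a single moment: one per instrument, plus the
    suggestion segment and the homeostasis/emotional-state segment."""
    moment: int                                                   # temporal index shared by the segments
    observations: Dict[str, Any] = field(default_factory=dict)   # σo: instrument name -> segment (possibly empty)
    suggestion: Any = None                                        # σS: segment from the GPME suggestion stream
    homeostasis: Any = None                                       # σH: homeostasis and emotional state at this moment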
The moment indexes the stream of frames (Figure 3.31). Therefore, at the fundamental level, the GPME processes a stream of frames (Figure 3.32) from the perspective of the host. The stream of frames can also be viewed as a stream of segments from the perspective of an instrument. The GPME is now able to analyze the streams and detect patterns within and across them.
Figure 3.32 - Stream of Frames
We define short-term memory as the stream of frames the GPME keeps in its working memory. Short-term memory has a depth, which is the number of frames or the length of the stream in short-term memory. Refining our earlier definition of emotional state, the calculation uses all frames in short-term memory, including the one from the current moment, to measure the change in homeostasis and establish the current emotional state. Looking at Figure 3.31 horizontally, we see the instrument's and internal cycle's stream of segments. The first step in pattern detection is to calculate the probability of occurrence of a segment, based on previous segments in the stream, using the A-distance. A segment with a low probability of occurrence is called a significant segment. A significant segment focuses the attention of the learning apparatus. The learning apparatus begins with significant segments and reduces the stream to higher levels of abstractions where it can develop inference rules to project the future state of the streams.
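The A-distance calculation itself is not reproduced here. As a hedged stand-in, the sketch below flags a segment as significant when its empirical probability of occurrence in the stream falls below a threshold.

from collections import Counter

def significant_segments(stream, threshold=0.05):
    """Return the segments whose probability of occurrence is low enough to be
    treated as significant (an illustrative stand-in for the A-distance test)."""
    counts = Counter(stream)
    total = len(stream)
    return {segment for segment, n in counts.items() if n / total < threshold}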
3.3.7 EPISODES
Tulving [113] coined the term episodic memory to refer to the ability to recall specific past events about what happened where and when. Episodic memory is specifically about actual events that occurred in the past. The stream of Figure 3.31 contains a great deal of information that needs to be broken into sections for analysis. We refer to a section of temporally contiguous frames from the stream as an episode. An episode includes all frames in short-term memory; it is therefore important for short-term memory not to be too large.
Figure 3.33 - An Episode consists of several Contiguous Frames
The GPME creates episodes when an anomaly occurs. The episode ends when the anomaly no longer exists, either because expectations have caught up with the current state or because the anomalous conditions no longer exist. The GPME considers four types of anomaly: reflex, rational, context and emotional.
The rational method detects that a deadline for achieving a certain homeostasis value has passed, or that the projected homeostasis value is achieved much sooner than projected. If the homeostasis value is not achieved when expected, the GPME detects the anomaly and creates an episode. A hardwired anomaly is a type of rational anomaly that is the result of a violation of a designer-specified expectation in the EIS. An objective anomaly is a type of rational anomaly that is the result of a violation of an expectation created in response to a host request. The other types of anomaly detection use the concept of a bandwidth. A bandwidth is the projected range of a certain value. The projection is based on historical values contained in the short-term memory. The value is projected to occur within an upper and lower bound. An anomaly occurs whenever the value falls outside the band. The anomaly is resolved when the value returns to its original projection. Since the short-term memory changes over time, the bandwidth also changes. Therefore, it is possible for the bandwidth to catch up to the projected value. When this situation occurs, the anomaly is aborted. The reflex method projects an arrival rate of frames for each instrument stream. This projection is called the instrument arrival rate bandwidth. For example, the GPME expects the camera to provide an image every five seconds. After six seconds, if an image has not arrived, the GPME detects a reflex anomaly. The same anomaly would also be detected if the image arrived three seconds after the previous one. Since instruments are unlikely to be as regular as the example indicates, the GPME uses a range based on its experiences. The context method detects an anomaly in two different ways. The first way relies on segment significance. The anomaly occurs when a segment that should be significant is not, or when a segment that should not be significant is found to be. When the significance of a segment does not match its projected significance, the GPME detects the anomaly and creates an episode. The second way projects the accuracy of the projection. This expectation is called the projection accuracy bandwidth. The emotional method relies on the homeostasis value. The GPME projects the homeostasis value to be within a certain range, called the homeostasis bandwidth (see Figure 3.34). If the homeostasis value is outside the band, the GPME detects the anomaly and creates an episode. The bandwidth is the range between the highest and lowest homeostasis value in short-term memory; however, it is further adjusted by the emotional state.
Figure 3.34 - Homeostasis Bandwidth
With this information about the detection of anomalies and the creation of episodes, we can better articulate the calculation of homeostasis. Homeostasis is a formula that uses the number of active anomalies by type. The formula assigns the highest weight in order: reflex, hardwired (rational), objective (rational), context, emotional and rational (all others besides hardwired).
For example, the highest and lowest homeostasis values in short-term memory are 100 and 150. The bandwidth is 50. The homeostasis of the previous moment is 105. Therefore, the unadjusted range is (105 – 25 =) 80 to (105 + 25 =) 130. Since the GPME is happy, the bandwidth is adjusted by +70% to 85. The adjusted range is (105 – 43 =) 62 to (105 + 43 =) 148. If the homeostasis value of the current moment falls within the adjusted range, it is normal. A value outside the range is considered anomalous. Note that the anomaly can occur because of exceeding expectations (> 148) or failing to meet expectations (< 62). In other words, an anomaly occurs when the agent is happier than it expected or when it is sadder than it expected. An analogous example could be applied to the instrument arrival rate bandwidth. The emotional state is a multiplier on the bandwidth. Its value modifies the calculated bandwidth in the next moment, by shrinking or widening the band. This mechanism emulates biological responses. It enables the GPME to gradually become desensitized and automatically adjust its state of normalcy. Therefore, the GPME has the ability to create episodes from the telemetry. As depicted in Figure 3.33, a newly created episode is referred to as a candidate episode. It is a candidate for inclusion in a cluster of episodes. Figure 3.35 depicts an example of several episodes.
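The band arithmetic in this example can be summarized in a short sketch (illustrative only; the emotional-state multiplier and the rounding follow the worked numbers above):

def homeostasis_band(short_term_values, previous, emotional_multiplier):
    """Band = spread of homeostasis values in short-term memory, scaled by the
    emotional-state multiplier and centred on the previous moment's value."""
    band = max(short_term_values) - min(short_term_values)   # 150 - 100 = 50
    adjusted = band * emotional_multiplier                    # 50 * 1.7 = 85
    half = adjusted / 2                                       # 42.5 (rounded to 43 in the text)
    return previous - half, previous + half                   # roughly (62, 148) around 105

low, high = homeostasis_band([100, 150], previous=105, emotional_multiplier=1.7)
# a current homeostasis value outside [low, high] is flagged as an anomaly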
Figure 3.35 - Example of Episodes
The GPME processes the telemetry (suggestion and observation streams) into frames. In this example, the frames are vertical; s1, s2 and o1 occurred at the same moment. Two anomalies occurred at moment T1, resulting in the creation of episode 1 and episode 2. At moment T3, another anomaly resulted in the creation of episode 3. At moment T4, episode 2 completed. At moment T7, episode 3 completed. At moment T8, episode 1 completed. At completion, the episode becomes available for learning. Episode 2 consists of frames 1 and 2, which is the same as saying that episode 2 consists of segments (o1, o2, o3, s1, s2, s3). Episode 2 and episode 3 overlap in frame 2. Both episodes 2 and 3 are subsets of episode 1. Let us assume that the resolution of the anomaly ended all episodes. Then, we can express the episode as an inference rule: anomaly + episode ==> resolution, where resolution asserts ¬anomaly. While the inference rule is obviously true, the episode contains a significant amount of noise. Noise refers to telemetry that had no actual bearing on the anomaly or its resolution. Therefore, the challenge is to find sufficient examples of the same anomaly to identify the noise and to determine what part of the telemetry affected the anomaly so that we can derive a clean and efficient inference rule.
3.3.8 CASES
Over time, the GPME creates many candidate episodes, many of which do not contain any valuable information. To identify the valuable information, the GPME clusters episodes together based on the type of anomaly that created them. Therefore, there are four clusters of episodes: reflex, emotional, hardwired and rational. These four clusters are further divided in terms of their significant segments. Finally, these clusters are further divided in terms of the similarity of the patterns they contain. In the context of this analysis, rhythmic patterns are of greater significance than the ones that are not. In other words, finding several episodes that have the same rhythmic patterns is significant because this discovery immediately identifies the noise contained within the episodes. We can talk about the distance between two episodes as the quantification of their difference in terms of the clustering method we described above. Episodes that originate from different anomalies are the most distant from each other. Within the anomaly cluster, the ones with the most different significant segments are most distant. Within the significant segment cluster, the ones with the least similar patterns are most distant from each other. Within the similar pattern cluster, the ones with the same rhythms are closest to each other. The GPME attempts to create small clusters that contain episodes that have very small distances between them. Given a sufficiently tight episode cluster, the GPME generates a centroid that is called a case. Structurally, a case is identical to an episode. However, it is not an episode because it did not actually originate from the telemetry. It is not an actual experience and cannot be called an episode. Like an episode, the case consists of a sequence of frames. The number of frames reflects the number of frames of the cluster episodes. The observation and suggestion segments in the case's frames contain only the significant segments from the cluster episodes. The case derives the homeostasis segments based on the homeostasis values of the cluster episodes. It is now clear that rhythmic patterns are preferred because they result in a higher information gain in identifying which significant segments are actually significant! Since the case only contains significant segments, it is composed of complete, partial and empty frames. We refer to a case's derived frame as a fragment. We defined short-term memory earlier. Now, we define working memory as the combined short-term memory and projected fragments. The case is an abstract or pseudo episode created from the significant information in the cluster episodes. The case tells the GPME that, given a certain set of preconditions (from the observation stream), the suggestions (from the suggestion stream) will have a certain effect on homeostasis. As depicted in Figure 3.36, we refer to this inference rule as the case predicate. The GPME uses the case predicate to project the future state of the stream. The future state is expressed in fragments. The telemetry is expressed in frames. A rational anomaly results in a future moment when the frame of that future moment cannot be matched to a fragment projected into that future moment. The length in moments of the case establishes a deadline for achieving the projected homeostasis value. As we discussed earlier, a rational anomaly results when the number of moments elapses and the projection is not achieved. The availability of predicates and episodic memory makes active logics [43] highly suitable for managing the agent's responses to expectation violations.
Figure 3.36 - Cases and Case Predicate
It is important to note that the case can support several predicates, in particular when it is relatively new. The number of predicates arises from the possible combinations of significant segments that arise from their linkages. Further in the document, we will see that frames are connected using several types of links. Thus far, we have only considered the temporal link that is automatically created in the episodic memory. The GPME will need a reasoning mechanism not only to choose a case but also to choose the best predicate for the given situation. We refer to the process of choosing a case to project as the deliberation process. To reiterate an earlier point, the current emotional state of the GPME affects the deliberation process.
The GPME sets expectations of its well-being and responds to anomalies by implementing a plan of action to return to a normal or better state. The plan of action is derived from the significant suggestions in the episode cluster. The case predicate of Figure 3.36 creates an expectation. The expectation is that, given the preconditions in the telemetry, applying the plan of action results in a projected homeostasis and observation stream. If the expectation is not achieved, the expectation violation is a rational anomaly. This introduction to cases focused on what is called a primitive case. The GPME uses an episode cluster to build a primitive case. A hybrid case is built on a combination of primitive cases and episodes. A super case is built exclusively on other cases.
Figure 3.37 - Case Hierarchy
Having elaborated on the concept of matching frames to fragments in each moment, we can measure the accuracy of the projection of a moment as a function of the number of fragments and the number of matched fragments. The projection accuracy bandwidth is determined in the same manner as the homeostasis bandwidth, using, however, the highest and lowest moment projection accuracy in short-term memory. The emotional state does not modify the projection accuracy bandwidth.
3.3.9 APPLICATION OF THE KASAI IN THE GPME
The GPME uses the Kasai to build its knowledge base.
3.3.9.1 EPISODES
Recalling the earlier discussion on behaviors, the observation time series arrives in the GPME in time order. The only relationship between one observation and the next is that they are correlated by being sequential and adjacent. The GPME creates a correlation graph using the Kasai. The correlation graph captures the state of the observation series in compact form. The GPME can apply graph theory algorithms to compare correlation graphs and to create additional constructs for its knowledge base. The GPME creates a Sarufi to represent each episode. Several episodes are clustered to create cases. The case is the centroid of the cluster. To create the centroid, dissimilar Kasi lose their Θ or their Κ and become wildcards. The case is a Sarufi with wildcard Kasi, while episodes contain no wildcards.
3.3.9.2 PREDICTION WEB
In the GPME, the knowledge base captures the internal representation of the environment used for reasoning. A construct called the Prediction Web, built on a Kasai network, is used to predict the future state of the observation time-series using the previous states. The nodes in a prediction web correspond to observed fragments, output links capture prediction and input links capture priming information. An observation primes all the links emanating from it, much as each piece of evidence triggers multiple nodes in a Bayes Net. However, unlike a Bayes net, a prediction web does not have any associated probability distributions. In a Bayes net, the strength of the relationship between the nodes is quantified by the associated probability distributions. However, all priming links have equal strength in a prediction web. Also, the prediction web has an output link at each node that generates an output (with probability 1) from each state, if all the priming links are active. The prediction web can thus be considered a dynamically generated deterministic finite state machine.
Figure 3.38 - Prediction Web Example
The algorithm for constructing the prediction web is as follows:
Assume that the structure of a Prediction Web rule is: LHS ==> RHS

Mainline:
1     predictedToken = null
2     read prevToken
3     newLHS = prevToken
4     loop:
5         read token
6         if predictedToken = NULL
7             pwAdd (newLHS, token)
8         else if predictedToken ≠ token
9             pwAdd (newLHS, token)
10            newLHS = null
11        end if
12        newLHS = newLHS + token
13        prevToken = token
14        predictedToken = pwPredict (newLHS)
15    until no more tokens

16    pwPredict (LHS):
17        pToken = null
18        index = pwGet (LHS)
19        while (index = null) and (length(LHS) > 0)
20            LHS = LHS string without first token
21            index = pwGet (LHS)
22        end while
23        if index ≠ null
24            pToken = pwGetRHS (index)
25        end if
26        return pToken

pwGet (LHS): Return the index of the rule whose left-hand side is LHS, or return NULL
pwGetRHS (index): Return the RHS of the rule identified by index
pwAdd (LHS, RHS): Add the rule LHS ==> RHS to the prediction web
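As a cross-check of the pseudocode, the short Python sketch below implements the same strict prediction web; the function names and the dictionary representation of the rules are choices made here for illustration.

def pw_predict(rules, lhs):
    """Match the longest suffix of lhs that appears as a rule LHS; return its RHS or None."""
    while lhs:
        if lhs in rules:
            return rules[lhs]
        lhs = lhs[1:]               # drop the first token and retry with a shorter LHS
    return None

def build_prediction_web(tokens):
    """Consume a token sequence and return the rules of the strict prediction web."""
    rules = {}
    predicted = None
    new_lhs = tokens[0]             # read prevToken
    for token in tokens[1:]:
        if predicted is None:
            rules[new_lhs] = token  # add rule newLHS ==> token
        elif predicted != token:
            rules[new_lhs] = token  # misprediction: add a corrective rule
            new_lhs = ""
        new_lhs = new_lhs + token
        predicted = pw_predict(rules, new_lhs)
    return rules

# build_prediction_web("ABCDABCDABCKABCDABCDABCK") produces the rules
# {"A": "B", "AB": "C", "ABC": "D", "ABCD": "A", "ABCDABCDABC": "K", "K": "A"}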
The following table is the trace of the algorithm, using the same input as the example in figure 1. Each line of the table represents an iteration of the loop on line 4 of the mainline. After line 12 of the loop, the prediction web no longer changes.

Loop Iteration   token   prevToken   predictedToken   newLHS        Prediction Web State
1                B       A           NULL             A             A ==> B
2                C       B           NULL             AB            AB ==> C
3                D       C           NULL             ABC           ABC ==> D
4                A       D           NULL             ABCD          ABCD ==> A
5                B       A           B                ABCDA
6                C       B           C                ABCDAB
7                D       C           D                ABCDABC
8                A       D           A                ABCDABCD
9                B       A           B                ABCDABCDA
10               C       B           C                ABCDABCDAB
11               K       C           D                ABCDABCDABC   ABCDABCDABC ==> K
12               A       K           NULL             K             K ==> A
13               B       A           B                KA
Table 3.1 Algorithm Trace
Consider the function createPWRule defined as:

createPWRule (newLHS, token):
    pwAdd (newLHS, token)
    for all rules where RHS = newLHS
        pwAdd ([LHS+RHS], token)
    end for

If lines 7 and 9 are replaced with a call to this function, the algorithm creates all the possible rules. For example, when rule AB ==> C is added to the prediction web, rule B ==> C is also added. Then, when rule ABC ==> D is added to the prediction web, rules C ==> D and BC ==> D are also added. Since the predict function always uses the longest left-hand side to match a rule, rules B ==> C, C ==> D and BC ==> D decay and are eventually purged from the prediction web. Nonetheless, this function allows the algorithm to handle more complex seasonality such as: abcdabcdabckabcdabcdabckabcz… We refer to the original algorithm as a strict prediction web and the latter version as a greedy prediction web. Consider a sequence formed by repetitions of the same series S, i.e., (S-S-S-S-…). This sequence has a seasonality factor of zero and it represents the trivial case. A sequence has nontrivial seasonality when it contains at least two distinct series S1 and S2 that repeat and form a pattern in their repetition, as in the sequence (S1-S1-S2-S1-S1-S2-…). The prediction web could
be represented as a Markov model (Figure 3.39) in which a state corresponds to one or more series that preserve their order of occurrence. In such a model, each state would predict the first token of the next series Sp as t1Sp, where the notation tiSj denotes the ith token of the jth series. The translated Markov model is depicted below. If Sn predicts Sp, then the corresponding prediction web encodes the rule Sn ==> t1Sp.
Figure 3.39 - Prediction Web as Markov Model
The GPME is designed to identify behaviors from the time-series of observations. A behavior is encapsulated in a case. The GPME automatically creates a case as the centroid of a cluster of episodes with the same anomaly signature. The GPME design identifies four types of anomalies that lead to the creation of episodes. Each case is a rule for predicting the future state of the environment and homeostasis of the GPME. The GPME design calls this rule the Case Predicate. Since there is no direct way to determine a change in context (a fundamental change in the environment), the cases are continually tested. To perform the test, the case predicate is encoded in the Prediction Web. If the case predicate is valid in the current context, it correctly predicts the future. Otherwise, either the case is not valid or a new case has just been identified. The GPME design always assumes the latter (a new case) because low utility cases (such as invalid cases) will automatically decay and will be purged from the knowledge base over time. Given a knowledge base of tested and valid cases, the GPME can provide appropriate suggestions to its host in response to stimuli from the environment. Consistent with the Behavior Oriented Intelligence framework, we can view the GPME design as two concurrent macro processes. The first process accepts the time-series of observations and maintains the knowledge base. The second process uses the knowledge base to advise the host.
Figure 3.40 - Contextual View
3.3.9.3 IMPLEMENTATION OF THE KNOWLEDGE BASE
Two possible implementations for the case database were considered: a tree and a graph. The root of the tree is an arbitrary node representing the self. A case has a signature of attributes representing the observations. At the higher levels of the tree, cases have few attributes in their signature. This means that they are matched quickly and to a variety of circumstances. The
deeper cases have increasingly larger signatures in terms of attributes. These cases are more specific to circumstances the GPME encounters in its environment. Upper level cases result in simple behaviors such as reflexes, and lower level nodes result in complex behavior. Unfortunately, a hierarchy is not sufficiently flexible. In fact, the cases should be organized into a graph instead of a tree. Using a graph, the attribute signatures can label the edges instead of the nodes. Therefore, the nodes only need to contain the ideas. The graph structure provides the necessary flexibility to represent the relationships between the cases. Several types of edges can connect nodes. One set of edges describes the structure of the cases and stores attributes. Another type of edge connects ideas in terms of how they are experienced. For example, a temporal edge can connect one case to another denoting the order in which they were created. This type of edge helps organize experiences along the dimension of time and allows time-dependent reasoning. Another type of edge can indicate a causal relationship between two ideas. The causal edge tells us that a set of preceding materialized cases results in a specific case. A materialized case is an idea that the GPME observes in the environment. In other words, the GPME selects an idea and enacts a behavior. Then the environment reflects the idea without anomalies. This means the idea became materialized. We can imagine that the GPME is in a conversation with the environment. In that case, the causal chains in the case database form the grammar of the language where words are the observations of the environment. The GPME needs to learn the grammar and vocabulary of the environment to interact with it effectively. The GPME can construct the grammar using causal edges.
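A rough sketch of such a case graph, using plain dictionaries rather than any particular graph library, is shown below; every name in it is illustrative.

# Nodes are case identifiers; edges carry a type label and optional attributes.
case_graph = {
    "nodes": {"case_1": {}, "case_2": {}},
    "edges": [
        {"from": "case_1", "to": "case_2", "kind": "temporal"},   # case_2 was created after case_1
        {"from": "case_1", "to": "case_2", "kind": "causal"},     # materializing case_1 leads to case_2
        {"from": "case_1", "to": "case_2", "kind": "attribute",
         "signature": ("max_temperature", "min_temperature")},    # structural edge carrying the attribute signature
    ],
}

def causal_successors(graph, case_id):
    """Follow only the causal edges, i.e. walk the 'grammar' described above."""
    return [e["to"] for e in graph["edges"]
            if e["from"] == case_id and e["kind"] == "causal"]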
Each observation is parsed through the grammar to determine whether the observation is regular and normal, or whether it is new or abnormal. In natural languages or in computer languages, the latter case is an error to be corrected. For the GPME, however, the latter case represents an opportunity to improve the grammar. The functionality that organizes the cases into a graph and builds edges maintains the fundamental data structure of the GPME. The focus of this dissertation is on the implementation of the GPME case graph, specifically on the creation of the causal node network that expresses the grammar the GPME can use to predict the state of the environment and, therefore, to detect anomalies. The learning processes engage in response to the presence of an anomaly. When the GPME's ideas consistently materialize, it indicates there is nothing for it to learn. It is only when there is a difference between the expectation and the observation, an anomaly, that there is an opportunity to learn. Learning takes the form of modifying the case graph by adding or updating cases and edges of various types.
3.3.10 SELECTIVE IMITATION
Imitation is an important learning mechanism in biological agents. It enables a learner to acquire behavior from a model without having to go through the deliberation process to arrive at the same conclusion. This benefit is also the source of the unfortunate problem with imitation: the learner does not learn the circumstances of the conclusion, the "why." Without this information, it is difficult to improve the cases acquired through this method.
We distinguish between imitation at the host level and at the GPME level. Host level imitation is the classical imitation exhibited by living creatures. The hosts involved require shared capabilities and a shared frame of reference in understanding their environment. As a result, the designer must account for host level imitation within the design of the host itself. In such a case, social sensors and actuators designed to recognize and interact with other hosts will be present in the context section of the EIS. The GPME will treat their information like any other part of the telemetry. At the GPME level, imitation requires sharing the episodic memory. However, for imitation to occur, the GPME instances must also share their context. The sharing occurs through the SIS shown in Figure 3.27. Two GPME instances share their SIS and create a communication channel between them. The section of their respective EIS allows each of them to select the relevant portion of episodic memory to share. For example, assume that a GPME social network consists of two GPME instances. The EIS of GPME A defines only a spectrum analyzer and a camera, and that of GPME B defines only a camera and a microphone. There is no need to share any part of the episodic memory where the spectrum analyzer and the microphone play a part, because any case predicate built on these sensors will never be used by the other. For imitation to occur, one GPME instance provides its episodic memory to another instance. The provider is referred to as the model and the recipient as the learner. Imitation relies on two measures. We define the maturity level as a function of the number of primitive, hybrid and super cases a GPME possesses. We also define the prestige level as a function of the number of learners watching the GPME. Each GPME advertises its own maturity and prestige levels. A
learner prefers high maturity and prestige models, and will not become a learner to a lower maturity GPME. We define the confidence level as a rating of the utility a learner places on a model. This rating is a function of the usefulness of the episodic memory the model has provided to the learner, in terms of impact on its own homeostasis. The shared episodic memory includes the model's homeostasis impact. The confidence level modifies the baseline homeostasis impact by either raising or lowering it when assimilated into the learner's knowledge base. Therefore, the learner can determine whether to emulate or avoid the plan of action that the model provides. We refer to this learning mode as selective imitation because the learner selects what to imitate. The learner chooses the model and the portion of episodic memory it will assimilate from the model. The relationship between two GPME instances is unidirectional; a learner watches a model. The learner assimilates the model's episodic memory into its own knowledge base. Information does not necessarily flow in the opposite direction when the model has a higher maturity level than the learner or when the model is itself a learner of higher maturity models. When two GPME communicate, they leverage each other's learning. Consider a deployment of a single GPME instance. The number of hosts is not relevant. The GPME must learn everything from scratch. As it builds its episodic memory, it will eventually learn enough to effectively guide the hosts, provided that no catastrophic failure occurs because of its immaturity. Consider a deployment of a mature GPME and a new GPME. The mature GPME can provide several useful cases that have a high probability of averting disaster during this initial period. The more mature GPME instances the learner communicates with, the faster it matures and the sooner it can become a model for other instances.
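The selection rules above can be illustrated with a small sketch; the attribute names (maturity, prestige) are assumptions for the sketch, mirroring the levels each GPME advertises.

def choose_model(candidates, own_maturity):
    """Prefer high maturity and prestige, and never learn from a less mature instance."""
    eligible = [m for m in candidates if m["maturity"] > own_maturity]
    if not eligible:
        return None
    return max(eligible, key=lambda m: (m["maturity"], m["prestige"]))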
CHAPTER 4: EXPERIMENTAL RESULTS
This section reports experimental results from applying the Kasai algorithm to data series, and it describes the design of the GPME.
4.1 KASAI EXPERIMENTS
In this section, we discuss experiments using the Kasai on data series of symbols. Each letter represents a unique input token; there is no lexical relationship assumed between the tokens. We conclude with a time analysis.
4.1.1 SIMPLE DATA SERIES
The following figures show graphical representations of the Sarufi generated for the input data series shown in the caption. The graphs were generated using AT&T GraphViz. The following table shows the Sarufi for various types of series.
Series: [abcabcabc…]

digraph G {
    a -> b [label=1];
    b -> c [label=1];
    c -> a [label=1];
}
Series: [abcabcabckabcabcabck…]

digraph G {
    a -> b [label=1];
    b -> c [label=1];
    c -> a [label=1];
    c -> k [label=3];
    k -> a [label=3];
}
Series: [aaabaaab…]

digraph G {
    a -> a [label=1];
    a -> b [label=3];
    b -> a [label=3];
}
Series: [aaabaabaaabaab…]

digraph G {
    a -> a [label=1];
    a -> b [label=3];
    b -> a [label=3];
    a -> b [label=5];
    b -> a [label=5];
}
Series: [abcabcabckabcabcabckabcabcabck…]

digraph G {
    a -> b [label=1];
    b -> c [label=1];
    c -> a [label=1];
    c -> k [label=3];
    k -> a [label=3];
}

digraph G {
    a -> b [label=1];
    b -> c [label=1];
    c -> a [label=1];
    b -> k [label=3];
    k -> a [label=3];
    k -> d [label=6];
    d -> a [label=6];
    d -> r [label=12];
    r -> a [label=12];
}
In this example, the processing of the series starts midstream, at an infrequently occurring token. This next series tests the algorithm's refactorization function.
Series: […kabcabcabckabcabcabckabcabcabc…]

digraph G {
    k -> a [label=1];
    a -> b [label=1];
    b -> c [label=1];
    c -> a [label=1];
    c -> k [label=3];
}

digraph G {
    k -> d [label=1];
    d -> a [label=1];
    a -> b [label=1];
    b -> c [label=1];
    c -> a [label=1];
    b -> k [label=3];
    k -> a [label=3];
    k -> d [label=6];
    d -> r [label=6];
    r -> a [label=6];
    d -> a [label=12];
}
4.1.2 CHAOTIC TIME SERIES
Is the Kasai limited only to identifying relationship rules? How does it handle chaotic time series? A chaotic time series, where no element occurs more than once, creates a Sarufi without cycles: a set of single-use Kasi. For example, a Kasi abc ==> d never fires because [abc] will not occur again in the series. In addition, the global charge remains forever at 1 because the root Kasi is never revisited since there are no cycles. Nonetheless, it is possible to use the Kasai with random time series by applying a preprocessing strategy that generates a symbol that represents the existing pattern in the time series. The preprocessing strategy leverages patterns known to exist in the time series. Consider the Mackey-Glass differential equation chaotic time series depicted in Figure 4.1, using 12000 samples, a delay factor of 17 and a time-step of 0.1 (generated using Matlab). The resulting time series consists of 12000 nonrepeating numbers. However, the graph shows there is a clear pattern that results from the mathematical relationship created by the formula.
Figure 4.1 - Chaotic Time Series
Because the numbers, x(t), do not repeat, the time series is chaotic and is not a candidate for use with the Kasai directly. However, we can take the time series and transform it into a time series of symbols. Our approach is to create buckets, each labelled by a symbol. For example, assume a bucket size of 3 and the time series S=[1,2,3,4,5,6,7,8,9]. We can create three buckets a=[1,2,3], b=[4,5,6] and c=[7,8,9]. We can then express S=[a,b,c]. Examining Figure 4.1, we see that the time series is bounded by 1.3 > x(t) > 0.2. We can create buckets that divide this range. Instead of listing every element of the bucket, we can define the bucket in terms of its unique symbol label, an upper bound and a lower bound value. We denote the bucket as {symbol, lower bound, upper bound}. The following table shows a fragment of the Mackey-Glass data depicted in Figure 4.1:
Table 4.1 Mackey-Glass Fragment
We can subdivide the range [0.2, 1.3] into buckets denoted {symbol, lower bound, upper bound}:
• {a, 0.868810, 0.885138}
• {b, 0.890401, 0.905640}
• {c, 0.910534, 0.924647}
• {d, 0.929160, 0.942122}
Given these buckets, the time series fragment in the table transforms into S1=[aaaabbbbccccdddd…]. We can now predict the bucket that the next value will fall into. The accuracy is a function of the size of the bucket. For example, if we create a single bucket {a, 0.2, 1.3}, the time series is S1=[aaa…], like Figure 3.10. This is not very useful for prediction. At the other extreme, if we create 12000 buckets, one for each number in the time series, then S2=[abcdefg…] is random and contains no seasonal patterns we can use to create Kasi. In fact, the Kasai creates 12000 Kasi with κ=1 that never fire, for two reasons. First, since there are no cycles, their c value never increments past 0, so c is never equal to κ. Second, τ never occurs again. The series S1 above, with a bucket size of 12000, creates a single Kasi that fires all the time. The series S2, with a bucket size of 1, produces 12000 Kasi that never fire. The ideal bucket size is between these two values. The bucket size is, in effect, the error tolerance of the prediction.
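The bucketing step can be sketched as follows; the uniform bucket width and the letter labels are choices made for this illustration (the experiment's preprocessor creates buckets dynamically as it scans the series).

def to_symbols(series, lower_bound, bucket_size):
    """Map a numeric series onto bucket symbols; bucket_size is the error tolerance."""
    symbols = []
    for x in series:
        index = int((x - lower_bound) / bucket_size)   # which bucket the value falls into
        symbols.append(chr(ord('a') + index))          # label buckets a, b, c, ...
    return symbols

# For S = [1,2,3,4,5,6,7,8,9] with lower_bound 1 and bucket_size 3, this yields
# ['a','a','a','b','b','b','c','c','c'], matching the a/b/c buckets described earlier.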
Our experiment compares chaotic time series prediction using the Kasai with ANFIS and NAR implemented in Matlab. All three techniques in the experiment use the same Mackey-Glass chaotic time series as input. The ANFIS implementation can be found at: https://www.mathworks.com/help/fuzzy/predict-chaotic-time-seriescode.html?searchHighlight=mackey%20glass%20anfis&s_tid=doc_srchtitle. The ANFIS results are shown in Figure 4.2.
Figure 4.2 - ANFIS Prediction Results
The NAR neural network implementation can be found at: http://lab.fs.unilj.si/lasin/wp/IMIT_files/neural/nn05_narnet/. The NARx results are shown in Figure 4.3.
Figure 4.3 - NARx Prediction Results
The preprocessor accepts the tolerance (bucket size) as input. It dynamically creates new buckets as it processes the Mackey-Glass time series, and it creates a time series of symbols that is then input to the Kasai. The table below shows the results of the experiment. Error Tolerance is the bucket size. For example, with a tolerance of 0.1, the range 0 to 1.5 divides into 15 buckets (0.0-0.1, 0.1-0.2, 0.2-0.3, …, 1.4-1.5). Empty buckets are ignored as they are not represented in the input series to the Kasai. When the Kasai generates a prediction, the prediction is within the tolerance. Otherwise, the Kasai generates no prediction. The ideal tolerance (bucket size) is therefore a function of the problem being solved. Predictions Made is the number of predictions the Kasai makes.
Prediction Accuracy is the percentage of Predictions Made relative to the length of the data series (12,000).

Error Tolerance    Predictions Made    Prediction Accuracy
0.5                8017                67%
0.1                6529                54%
0.05               5036                42%
0.025              5297                44%
0.0125             5122                43%
0.00625            3319                28%
Table 4.2 Kasai Prediction Results using Chaotic Series
The ANFIS and NAR experiments result in errors at most points of the time series. The ANFIS error range is consistent across the series. The NARX error worsens as the values get further away from the training data set. The Kasai does not use a training data set. Based on the results, and comparing the Kasai's performance to ANFIS and NARX, the Kasai produces a reasonable prediction of chaotic time series. However, it is important to understand the context in which the Kasai was developed. It is part of an AI system that combines the symbolic procedural AI paradigm with the connectionist paradigm. For example, the ANFIS implementation uses four prior values to project a future value during the training phase of the underlying neural network. The four values are x(t-18), x(t-12), x(t-6) and x(t) to predict x(t+6). A possibly better approach is to have the Kasai determine the optimal training data set. This approach should result in improved performance for ANFIS or NAR prediction. The Kasai is not meant to compete with connectionist approaches; it is meant to enhance them.
4.1.3 WEATHER PREDICTION
The previous experiment uses a Mackey-Glass generated chaotic time series. The data series is converted into fingerprints, which are then directly fed into the Kasai algorithm. This
experiment more closely approximates the design of the GPME. In the GPME, the raw input is mapped to a fingerprint. The fingerprint is clustered with other fingerprints. The centroid of the cluster the fingerprint belongs to is the input into the Kasai algorithm. This experiment uses historical weather data. Specifically, it uses two years of weather data to predict the weather in the third year. The source of the data is Weather Underground. The years used are 1965 and 1966 to predict 1967, in Baltimore, MD. WU provides several data points in the daily summary. WU acts like a rich sensor that produces one frame each day. For the sake of simplicity, the frame of this experiment uses a 4-tuple for each day: minimum and maximum temperature, and minimum and maximum dew point. Temperature measures the amount of heat in the atmosphere while the dew point measures the amount of moisture. In the GPME, there are two clusters: signature and distance. The signature cluster groups frames with the same attributes from the same instruments, independent of values. The distance cluster groups frames that are at the same distance from the Prime Fingerprint. The signature cluster is of little value since there is only one sensor and the frames all have the same attributes. Therefore, the experiment only uses the distance cluster. Because the frame uses four values, each cluster is a tesseract. The experiment varies the size of the tesseract. The Prime Fingerprint is the same. The Sarufi constructed from the first two years of data is frozen during the prediction of the third year. Its learning mode is turned off after the second year.

Tesseract Size   Match   Stay   Miss
5                65      261    39
10               128     172    65
15               181     99     85
20               228     54     83
25               249     35     81
30               283     11     71
Table 4.3 Weather Prediction Results
Prediction algorithms attempt to establish causal relationships between data elements of the time series. When there is not sufficient information to establish the causal relationship, a prediction error is introduced. Classical prediction algorithms provide formulas to determine confidence in the prediction. The Kasai algorithm is different from classical prediction algorithms in that it recognizes when there is not sufficient information to make a prediction. In the results table above, occurrences of this circumstance appear in the stay column. The stay column counts the situations in which the algorithm could not provide a prediction. In those situations, the agent should use another method to predict the weather. This situation is equivalent to a person realizing that she does not know something and needs more information. For example, in the case of weather prediction, wind direction and speed, as well as the current condition upwind, are a better predictor of tomorrow's weather conditions than yesterday's weather conditions. However, our agent lacks sensors to determine wind speed or weather conditions upwind. It can tell that it does not have enough information to make a prediction. This situation could prompt the agent to include other sensory information in the time series to find a better combination of sensory inputs. The agent knows exactly under which conditions to test new prediction models. The GPME performs this act of creativity using the AMRI language. The GPME applies genetic programming techniques using AMRI to find different prediction models. In our case, since the agent lacks the proper sensors, it will not perform better.
The match and miss columns are situations where the Kasai algorithm provides a prediction. Matches are correct predictions while misses are incorrect predictions. In terms of the GPME design, misses are anomalies; an expectation that is not met and requires correcting the Sarufi. In biological systems, certainty is preferred over accuracy, especially when multiple sensors are available for prediction. It is preferable that the Kasai provides certainty that the learning mechanisms must be engaged, and it does so accurately whenever a stay or miss event occurs. The response to a miss is to correct the Sarufi. The response to a stay is to use a creative approach; to create new Sarufis that use different combinations of sensor input. For example, assume that the agent is part of a swarm. It receives current position, current weather conditions, and wind speed and direction from the others. Even though it may initially only use local weather conditions to predict future weather, the stay situations will lead it to use other sensory data from its peers in the swarm. Eventually, it will create a model that uses wind and conditions at other locations, in conjunction with local conditions, to predict weather. These predictions would be better. All WEKA classifier algorithms used in this experiment could only produce one output based on three inputs. For example, if we input maximum temperature, minimum temperature, and maximum dew point, the algorithm outputs maximum dew point. This approach was then employed for all other attributes to get a four-dimensional point in space and form the tesseract. We then compared the actual values from the third year to the ones WEKA predicted for the third year. To describe the distances between actual and predicted values for the four attributes that make up each point in space, we calculated their Euclidean distances. This distance was then used to construct several class widths, which are the ranges of 'error' allowance, i.e., the width of the tesseract.
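The distance-and-class-width comparison described above can be sketched as follows; this is our illustration of the evaluation step, not the code used in the experiment, and the sample tuples are invented.

    import math

    def euclidean(actual, predicted):
        """Distance between the actual and predicted 4-tuples
        (min/max temperature, min/max dew point) for one day."""
        return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)))

    def classify(actual, predicted, class_width):
        """A prediction counts as a Match if it falls within the class width
        (the tesseract size); otherwise it counts as a Miss."""
        return "Match" if euclidean(actual, predicted) <= class_width else "Miss"

    print(classify((28, 41, 20, 33), (30, 44, 22, 31), class_width=5))   # Match
    print(classify((28, 41, 20, 33), (45, 60, 35, 50), class_width=5))   # Miss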
The following table compares the performance of the Kasai against traditional machine learning algorithms applied to the same data set, implemented using WEKA. These methods do not have a Stay outcome, as they produce either a Match or a Miss for every sample.

M5P
Class Width | Match | Miss
5   | 49  |
10  |     |
15  |     |
15+ |     |

Random Forest
Class Width | Match | Miss
5   | 49  | 323
10  | 200 | 172
15  | 312 | 60
15+ | 60  | 312

Regression By Discretization
Class Width | Match | Miss
5   | 38  | 334
10  | 163 | 209
15  | 294 | 78
15+ | 78  | 294

RandC
Class Width | Match | Miss
5   | 46  | 326
10  | 167 | 205
15  | 281 | 91
15+ | 91  | 281

RandSS
Class Width | Match | Miss
5   | 48  | 324
10  | 200 | 172
15  | 320 | 52
15+ | 52  | 320
The following chart maps the differences between the Kasai and the other methods:
Figure 4.4 - Comparison of the WEKA Classification Algorithms and the Kasai
The comparison data shows that the Kasai outperforms the other methods at the highest level of precision in terms of matches. More importantly, in the context of behavior oriented intelligence, the Kasai clearly distinguishes metacognitive anomalies (Misses) from cognitive anomalies (Stays), enabling the GPME to respond appropriately. This result confirms that the Kasai is better suited to support prediction within an implementation of the behavior oriented framework.
4.1.4 Time Complexity Analysis
The Kasai consists of four distinct functions; the Kasai mainline, the Kasi function, the adjustSarufi function and the refactorSarufi function. We examine the time complexity of the refactorSarufi function separately from the first three. The Kasai mainline consists of a loop iterating over the data series, one token at a time. The loop body consists of a series of if statements. In the case where the rootKasi is active, a parallel operation increments all the Kasi charge variables. Since all the increments are done in parallel, it is O(1). In addition, there are two functions, signalAnomaly and signalPrediction, that communicate with the client application. We assume these two functions operate asynchronously, without having to wait for each other, and therefore have a time complexity of O(1) as well. Thus, the time complexity of the Kasai mainline is O(1). The Kasi contains several instances of Θ functions. In the worst case, there are κ instances. The Θ functions execute in parallel. Each thread compares the current charge to its own κ. If the current charge is less than or equal to κ, it maps its link. After all the map operations complete, the reduce operation examines the mapped links and selects the one with the highest κ. We assume an unsorted dataset because sorting the map introduces an additional performance
cost. Therefore, the map operation creates an unsorted list. A parallel reduce operation on an unsorted dataset with κ elements has a time complexity of O(log κ). The worst-case series consists of alternating tokens of the form [a, b, a, c, a, d, a, …]. If the sequence has n elements, the Kasi with τ=[a] has n/2 κ-edges. Therefore, the time complexity of the Kasi function is on the order of O(log n/2), which is O(log n). The adjustSarufi function is a simple sequence of operations. The algorithm always knows exactly where to place the new Kasi in the Sarufi and does not need to perform a search. Therefore, its time complexity is O(1). The time complexity of the Kasai functions except refactorSarufi is O(log n). The refactorSarufi function performs the same processing except that it has a different starting point. The refactored Sarufi has the same number of Kasi but the number of edges might vary. The time complexity of the refactorSarufi function is also O(log n), and it executes independently of the other functions. Therefore, the time complexity of the Kasai is O(log n).
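The edge-selection step assumed in this analysis can be sketched sequentially as follows. The map and the reduce run in parallel in the actual design (hence the O(log κ) reduce); the sequential Python below, and the names select_link, edges, charge and kappa, are illustrative only.

    def select_link(edges, charge):
        """Kasi edge selection as described above: every edge whose kappa is at
        least the current charge is 'mapped', then the mapped edges are reduced
        to the one with the highest kappa. Sequential here; in the design the map
        and the reduce run in parallel, so the reduce costs O(log k) for k edges."""
        mapped = [e for e in edges if charge <= e["kappa"]]        # parallel map in the design
        return max(mapped, key=lambda e: e["kappa"]) if mapped else None

    edges = [{"kappa": 2, "next": "b"}, {"kappa": 5, "next": "c"}]
    print(select_link(edges, charge=3))     # -> {'kappa': 5, 'next': 'c'}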
4.2 GPME Design
4.2.1 Data Structures
4.2.1.1 DAMARU DATA TYPE
Before we proceed with a description of the knowledge base, we introduce the Damaru data type. The GPME is a biologically inspired design. This data type emulates biological processes of growth, uncertainty, and inertia. Further on, when we describe an object or construct
as a Damaru object, it is understood that it inherits and exhibits the behavior described in this section. In Hindu mythology, the Damaru is a drum used to mark the passage of time. The Damaru data type introduces an attribute called the Damaru metric. The metric is an integer that reflects the utility of the object to the GPME. Its domain is the range (-∞, +∞), which is denoted ±∞. The actual value of ∞ is an implementation issue; in practice, it is a large integer. The symbol Δ denotes the midpoint of the range. The metric of a frequently used object approaches ±∞. When the metric of a Damaru object reaches Δ, the object is deleted. An internal process continually moves all Damaru metrics closer to Δ. Utilization moves the metric away from Δ. In addition to Δ, there are special constants denoted β, µ, ɸ.
Figure 4.5 - Damaru Triggers
Optimal values for the metric are between ±∞ and (±∞ * 0.5). Suboptimal values are between [(∞ * 0.5) and (Δ + 1)] and [(-∞ * 0.5) and (Δ - 1)]. The constant β represents the default initial value of the metric variable. It is equal to (±∞ * 0.75). The constant µ represents the midpoint between β and the suboptimal range (±∞ * 0.625). The constant ɸ represents the
midpoint of the suboptimal range (±∞ * 0.25). Special processing occurs whenever the metric of a Damaru object reaches these constants. Specifically, a value of ɸ or µ triggers a deeper clustering analysis of the object.
Decay: Every Damaru object in existence must have its metric adjusted once each moment. Earlier, we defined a moment as the smallest indivisible unit of time internal to the GPME. A more precise definition is that a moment is the time it takes the GPME to adjust the metric of every Damaru object in the knowledge base. The adjustment moves the metric one unit closer to Δ. This behavior is called Decay. Each moment, every Damaru object decays. Counteracting decay, other processes within the GPME adjust the metric away from Δ based on their usage of the object.
4.2.1.2 CONNECTIONS
The knowledge base makes use of connections between objects. A connection is a Damaru object. Therefore, it carries a Damaru metric and is subject to decay. The connection designates the two objects and provides bi-directional access. We depict a direction of the connection to indicate the driver of the connection. Nonetheless, the underlying data structure allows traversal in either direction.
Impact on Decay
The number of active connections an object participates in affects its rate of decay. An object that supports connections is limited to 12±1 connections. We use the symbol 12±1 to denote the following behavior:

Phase     | Number of Connections | Decay Rate
Growth    | 0 to 10               | (1 + f(Connections)) units per moment, where f(Connections) decreases linearly as the number of connections increases
Stable    | 11                    | (1 + f(10)/10)
Teetering | 12                    | (1 + f(1) * 10)
Mature    | 13                    | 1

Table 4.4 Decay Rate

During the growth phase, connections are created with few restrictions. Decay proceeds at a variable rate. During the stable phase, adding or modifying connections becomes more restrictive. The rate of decay of the object reduces slightly. During the teetering phase, the rate of decay of the object is high. During the mature phase, the rate of decay of the object is normal. Therefore, a Damaru object decays every moment. The specific amount by which the Damaru metric reduces is a function of the number of its connections. A connection is a Damaru object that also decays, at a rate that is a function of the objects it connects. Utilization of the object (or connection) raises the Damaru metric by an amount commensurate with the nature of the use. Since a moment is the time required to execute a single cycle of the decay process on all Damaru objects, the length of a moment also varies in real time as the size of the knowledge base grows. Therefore, GPME instances cannot use their internal clocks as a synchronization mechanism with each other, but the clock can be used in pattern making with an offset adjustment. The purpose of decay and connections is to keep the observational memory manageable. Objects that do not have a sufficiently high relative utility are automatically pruned from the memory. As in biology, memory objects do not have an infinite number of connections, but there are nearly an infinite number of paths through the objects. Note that we are describing a behavior we are labeling as 12±1. In practice, the actual number can be any prime number and is
an implementation issue that impacts memory and CPU utilization. For example, instead of 13, the maximum number can be 977 (976±1). However, 13 is a good starting point, as the evidence from biology is that the number is around 7 [114].
Connection Types and Strength
Two objects can have more than one type of connection with each other. The GPME reasoning mechanisms make use of certain types of connection as they traverse the case database to make projections. Cases are connected to each other through shared frames. The strength of a connection is the number of connections that exist between two objects. The reasoning mechanism may prefer stronger connections over weaker ones. There are six types of connection; temporal, causative, attributive, spatial, order and composition. The GPME links successive frames with the temporal, causative, attributive and spatial connections. In effect, the GPME assumes that a strong connection exists between consecutive frames that will need to be reinforced or disproven later on. Anomalies or decay will cause these connections to be deleted over time. The GPME creates order and composition links in response to certain anomalies.
• Temporal: Indicates the order of arrival of the frames into the GPME;
• Causative: Indicates that the source frame caused a change of state in the target. The GPME creates this link between two successive frames whenever the source frame contains a suggestion;
• Attributive: Indicates that the state of the source frame requires the state of the target. The GPME creates this link between two successive frames whenever the target frame contains a suggestion and the target precedes the source in time;
• Order: Indicates a non-causative order of precedence between the source and target frames. This is a weak causative link that occurs when a causative link decays;
• Spatial: Indicates that the telemetry in the target frame was obtained in the same location as the source frame;
• Composition: The GPME uses the composition link to assemble hierarchies of cases.
The GPME actively changes the links in response to anomalies.
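A minimal sketch of the Damaru decay behavior described in this section follows. The class and function names are ours, the large integer standing in for ±∞ is arbitrary, and the growth-phase function f is only assumed to decrease linearly as connections are added, since the dissertation leaves its exact form open.

    INFINITY = 1_000_000          # the "large integer" standing in for ±infinity
    DELTA = 0                     # midpoint of the metric range
    BETA = int(INFINITY * 0.75)   # default initial metric

    def f(connections, max_growth=10):
        """Assumed growth-phase penalty: decreases linearly as connections rise from 0 to 10."""
        return max(1, max_growth - connections)

    class Damaru:
        def __init__(self):
            self.metric = BETA
            self.connections = 0

        def decay_rate(self):
            c = self.connections
            if c <= 10:                  # growth phase
                return 1 + f(c)
            if c == 11:                  # stable phase
                return 1 + f(10) / 10
            if c == 12:                  # teetering phase
                return 1 + f(1) * 10
            return 1                     # mature phase (13 connections)

        def decay(self):
            """One moment of decay: move the metric one decay step closer to DELTA,
            without overshooting. An object whose metric reaches DELTA is deleted."""
            step = self.decay_rate()
            if self.metric > DELTA:
                self.metric = max(DELTA, self.metric - step)
            else:
                self.metric = min(DELTA, self.metric + step)
            return self.metric == DELTA   # True -> prune the object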
4.2.1.3 CLUSTERS
A cluster is a Damaru object composed of other objects that share similar characteristics and may be disjoint. A cluster is based on similarity. The GPME has two types of clusters; a frame cluster and an episode cluster. The significant characteristics of the objects in the cluster are assembled into an abstract object called a centroid. The centroid of a frame cluster has the structure of a frame while the centroid of an episode cluster has the structure of an episode. A cluster has a centroid component that points to the members of the cluster. The centroid consists of the most significant features of the members of the cluster. For example, the centroid of the signature cluster is a frame that only carries significant attributes from the frames and segments. As depicted in Figure 4.6, a cluster member participates in several clusters. The centroid connects to 12±1 cluster members. The cluster members are closest to the centroid in terms of the similarity of their features. A new candidate member that is closer than existing members replaces the furthest member, subject to the phase the centroid is in. We can envision that, as cluster membership changes during the growth phase, the centroid moves wildly as it points to cluster members that are far apart. In the stable phase, the cluster is less susceptible to being influenced by new, farther objects. In the teetering phase, its connections decay faster, causing the farthest members to exit the cluster. In the mature phase, the cluster members are close to each other and admission of new candidates becomes more difficult.
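The bounded membership rule described above can be sketched as follows; the distance function, the limit of 13 members (12±1), and the scalar example are assumptions made only for illustration.

    def admit(cluster_members, candidate, centroid, distance, limit=13):
        """Add the candidate if the cluster has room; otherwise replace the
        furthest existing member, but only if the candidate is closer to the
        centroid than that member is."""
        if len(cluster_members) < limit:
            cluster_members.append(candidate)
            return True
        furthest = max(cluster_members, key=lambda m: distance(m, centroid))
        if distance(candidate, centroid) < distance(furthest, centroid):
            cluster_members[cluster_members.index(furthest)] = candidate
            return True
        return False                      # candidate rejected

    # Example with scalar "features" and absolute-difference distance.
    members = [4, 5, 7]
    admit(members, 6, centroid=5, distance=lambda a, b: abs(a - b), limit=3)
    print(members)   # 7 (the furthest member) is replaced by 6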
Figure 4.6 - GPME Cluster Map Example
We refer to the centroid as abstract because it is an abstraction of the members of the cluster. The distance and signature clusters are organized around an abstract frame which is a fragment, just like a case frame. Like any frame, the fragment consists of segments, but it does not include the raw data attribute and includes only some of the segment processed data. The fragment's and its segments' attributes are an aggregation of the cluster member attribute values, except for the Damaru attribute, which is always initialized when a Damaru object is created.
Figure 4.7 – Frame Cluster centered on Fragment
GPME also creates clusters using episodes that exhibit similar patterns in their attributes. The centroid has the same structure as an episode but its frames are fragments. As discussed earlier, the centroid of an episode cluster is called a case.
Figure 4.8 - Cluster with Abstract Case
4.2.1.4 FRAME CLUSTERS
When the GPME receives observations (raw and processed data), it adds three attributes to the segment depicted in Figure 4.9; the α attribute, the ψ attribute and the fingerprint attribute. The α attribute is the A-distance; the probability of occurrence of this segment and its values within its own observation stream. To create the fingerprint, the GPME samples 256 words (2 bytes each) at fixed positions in the raw data. The fingerprint is an abbreviation of the raw data created from a fixed positional filter. To create the ψ attribute, the GPME measures the Hamming distance between the fingerprint and a constant fingerprint called the Prime Fingerprint. The Prime Fingerprint is randomly generated at initialization and it never changes throughout the life of the GPME.
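The fingerprint and ψ computation described above can be sketched as follows. The sampling positions and helper names are illustrative; the dissertation only specifies that 256 two-byte words are taken at fixed positions and compared to the Prime Fingerprint by Hamming distance.

    import random

    WORDS = 256   # fingerprint length: 256 two-byte words sampled at fixed positions

    def make_positions(raw_len, seed=7):
        """Fixed positional filter: choose the sampling offsets once and reuse them."""
        rng = random.Random(seed)
        return sorted(rng.randrange(0, raw_len - 1) for _ in range(WORDS))

    def fingerprint(raw, positions):
        """Abbreviate the raw data by sampling a two-byte word at each fixed position."""
        return [(raw[p] << 8) | raw[p + 1] for p in positions]

    def psi(fp, prime_fp):
        """Hamming distance (number of differing bits) between a fingerprint
        and the Prime Fingerprint."""
        return sum(bin(a ^ b).count("1") for a, b in zip(fp, prime_fp))

    rng = random.Random(0)
    raw = bytes(rng.randrange(256) for _ in range(4096))        # stand-in raw segment
    prime = [rng.randrange(1 << 16) for _ in range(WORDS)]      # random Prime Fingerprint
    positions = make_positions(len(raw))
    print(psi(fingerprint(raw, positions), prime))              # the segment's psi value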
Figure 4.9 - GPME Segment All segments that arrived during a moment are grouped into a frame as depicted on Figure 4.10. The GPME adds the unique moment identifier to the frame. It initializes the Damaru metric to β. It records the GPME’s current homeostasis and emotional state. The GPME determines the signature of the frame. The signature of the frame is a vector that identifies which instruments contributed segments to the frame and the number of segments each one contributed. The identification of the contributors is in the segment processed data.
Figure 4.10 - GPME Frame
For example, assume that Segments A and B are from the video camera and Segment C is from the speed sensor. The signature would be something like: {video=(0,1);speed=(2);left_arm=();right_arm=();battery=()} The schema of the segment is the set of non-null attributes in the processed data combined with the attributes the GPME adds to the segment. The GPME uses the segment ψ attributes and the signature to place the frame in frame clusters; the ψ cluster and the signature cluster. The ψ cluster aggregates frames that have the closest ψ attribute value in comparison to each other. Figure 4.11 shows a representation of the ψ
clusters with respect to the Prime Fingerprint. Each circle represents a large cluster containing frames at the same distance from the Prime Fingerprint. Eventually, the concentric bands will shrink to include only frames that are at the same distance. Similarly, the signature cluster aggregates frames that have the closest signature in comparison to each other. The ψ and signature clusters are based on a number and a string of tag-value pairs, respectively. The simplicity of these values allows the clustering algorithm to be simple and therefore very fast [115]. All frames that occur in the same cluster are considered equivalent. Clustering is the basic abstraction mechanism. The frame clusters support automatic projections based on experience but not on learning; that is, they enable instinctive reactions or reflexes.
Figure 4.11 - Example of ψ Clusters
4.2.1.5 EPISODE CLUSTERS
An episode is the record of an actual experience; that is, it is a section of the stream of frames. The GPME creates an episode in response to an anomaly. An episode begins with the
content of short-term memory and includes all future frames, until the anomaly is no longer detected or decays. Episodes can overlap by sharing frames. There is a one-to-one relationship between the occurrence of an anomaly and an episode. An anomaly identifier also uniquely identifies its episode. When assembling an episode, the GPME adds the unique anomaly (episode) identifier, an initial Damaru metric and an attribute called the significance. The significance is a Boolean vector that identifies the frames in the episode that contain data relevant to the anomaly. Significant frames are the ones that contain significant segments.
Figure 4.12 - GPME Episode
The stereotype of the episode is the combined schema of its composing segments merged with its GPME episode attributes. The GPME uses schemas and stereotypes during episode cluster creation. Once clusters are identified, deeper clustering considers the data content of the segments (processed and raw). Once the cluster is as small as possible, the GPME generates the case predicates. An episode completes in one of three states:
• Resolved: The anomaly is resolved because the homeostasis value was achieved.
• Aborted: The anomaly is resolved because the homeostasis bandwidth was adjusted.
• Unresolved: The anomaly is resolved because it decayed.
The completion states Aborted and Unresolved cause a rational anomaly. Their Damaru metric is initialized at a value of µ since they are less likely to contain an actual resolution, i.e., to support a useful case predicate. The GPME clusters episodes that share similar features and, then further clusters those that share patterns in their features. The clustering method is fuzzy [116] but it does not suffer from poor performance since the feature set is fixed (from the EIS) and it is ordered in terms of information gain. Nonetheless, episode clustering is at least an order of magnitude slower than frame clustering. 4.2.1.6 CASE GRAPH Figure 4.13 summarizes the knowledge base structure. From the bottom up, the knowledge base becomes more abstract. A segment is part of one and only one frame. A frame contributes to one or many signature clusters, distance clusters or episodes or any combination. In turn, an episode, signature cluster or a distance cluster, in any combination, contribute to one or many cases. All cases contribute all their respective case predicates to the ontology. From the top down, we move from the abstract to the specific. Therefore, for cases derived from its own deliberation, the GPME can identify the specific set of frames that support a case predicate. The social case is a case obtained through the social network and assimilated into the ontology. Since the model only provides case predicates, the observation hierarchy is not available.
Figure 4.13 - Case Network At the lowest level, it is a directed graph built using connections between frames. At the next level, the GPME can navigate a directed graph of observations arranged temporally. Since the frames order the segments, the GPME can examine the observations from a single sensor or actuator over time. At the next level, frames are clustered into distance and signature clusters. These clusters ignore the temporal order and focus on very basic features of the observations. At the next level, in response to an anomaly, the frames are clustered into episodes. An episode is
temporally based but comes into being as a result of a higher level of reasoning. An episode captures sections of the stream of frames for analysis. At the next level, episodes reflecting similar patterns are clustered into cases. Again, the cases ignore the temporal order and focus on the attributes. Therefore, the observational knowledge base creates an index on the stream of frames. In other words, it organizes the experiences of the host. The case points to all significant episodes that begin with an anomaly and end in a resolution so that the GPME identifies the specific observations that actually result in the resolution. Note that the size of short-term memory may not allow the GPME to actually capture the true beginning of the anomaly. Over time the telemetry in the case becomes more precise, that is, more relevant to the outcome. The GPME uses the case observations to create a derived expectation that determines the success or failure of the suggestion. It is possible that different episodes, created in response to the same anomaly, result in different suggestions. Referring back to Figure 3.35, s1 is the suggestion for Episode 1, s2 for Episode 2 and s3 for Episode 3. Assume all episodes are in the same cluster. Each episode only has one suggestion. However, since all three episodes are in the same cluster, the abstract case results in more than one suggestion. Therefore, the GPME needs a reasoning mechanism to choose the suggestion. For example, one reasoning mechanism is probabilistic. The GPME counts the number of episodes in the case database by suggestion and builds the case parameter square depicted in Figure 4.14:
Figure 4.14 - Example of a Case Parameters Square Suggestion s2 is the solution 73% of the time, suggestion s1 is the solution 24% of the time and suggestion s3 is the solution 3% of the time. Therefore, the GPME using probabilistic reasoning would suggest s2. The GPME uses several reasoning mechanisms once the description of the knowledge base is complete. The GPME uses a Bayesian ontology to choose the reasoning mechanism, or, to retrieve the best performing suggestion. 4.2.1.7 BAYESIAN ONTOLOGY To more effectively organize all of the case predicates, the GPME uses an ontology. Figure 4.15 depicts the generalized ontology that maps directly to the case predicate. The Indications ontology organizes the preconditions. The Failures ontology enables the selection between several case predicates. The Responses ontology provides the plan of action that eventually results in suggestions.
Figure 4.15 - GPME Bayesian Ontology When an anomaly occurs, the GPME uses the Indications ontology to classify the anomaly. The indication points to several possible failure causes in the Failures ontology. The Failures ontology identifies the highest probability cause of failure that resulted in the anomaly based on its classification in the Indications ontology. From the cause, the Responses ontology identifies the highest probability response the GPME should use. Effectively, the GPME uses the ontology to dynamically manage the case predicates. The designer can hardwire the ontology through the EIS. This mechanism allows her to provide standard responses to certain known conditions. For example, the designer can specify that in the case of certain catastrophic failure patterns, the suggestion should be to return to base. Figure 4.24 shows several canned suggestions a designer might hardwire. Hardwired predicates are directly loaded into the ontology and never fully decay. However, the GPME will manage them like any other predicate in the ontology in terms of tracking their utility over time. A hardwired suggestion can fail to produce the expected result, in which case the GPME still needs
to find alternative suggestions. Even though the violation of a hardwired expectation is a rational anomaly, the GPME weights hardwired anomalies heavier than other rational anomalies when calculating its homeostasis. However, the Responses ontology organizes both reasoning mechanisms and suggestions. Initially, the Responses ontology classifies the success rate of the various reasoning mechanisms in addressing a given expectation violation. The GPME then uses the chosen reasoning mechanism to query the episodic memory for a plan of action. Eventually, if a reasoning mechanism consistently achieves high enough success, the predicate is stored directly into the ontology as a new suggestion. From that point on, the suggestion is immediately available. Therefore, a mature ontology consists primarily of suggestions with high probabilities of success. Referring to Figure 4.16, note that every possible suggestion is connected to every response. However, the connection between the response type and the suggestion carries its success rate. For example, the designer has specified two responses classes, A and B, and three suggestions, S1, S2 and S3. Each suggestion maps to plan of action. For example, suggestion S1 maps to action plan (a1, a2, a3). Suggestion S1 has a 90% success rate when applied in response to Class A and a success rate of 34% when applied in response to Class B. When the navigation from Indications to Failures to Response leads us to Response Class A, the GPME will suggest S1.
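A minimal sketch of this selection over the Responses ontology follows, using the S1/Class A figures from the example above; the dictionary layout is ours, and the success rates and action plans for S2 and S3 are illustrative placeholders rather than values from the text.

    # Success rate of each suggestion per response class (S2/S3 values are placeholders).
    success = {
        "Class A": {"S1": 0.90, "S2": 0.40, "S3": 0.10},
        "Class B": {"S1": 0.34, "S2": 0.33, "S3": 0.21},
    }
    plans = {"S1": ("a1", "a2", "a3"), "S2": ("a4",), "S3": ("a5", "a6")}

    def suggest(response_class):
        """Pick the suggestion with the highest success rate for the response class
        reached by navigating Indications -> Failures -> Responses."""
        best = max(success[response_class], key=success[response_class].get)
        return best, plans[best]

    print(suggest("Class A"))    # -> ('S1', ('a1', 'a2', 'a3'))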
Figure 4.16 - Ontology Example - Initial State The suggestion and the plan of action are directly related to a case that provides its preconditions to the Indications and Failures ontology and its plan of action to the Responses ontology. With a 90% probability of success, these cases are mature and there is little to improve at this time. The specification of the Responses ontology is ideal for the circumstances as there is a definite successful suggestion. Consider the same situation for cases that lead to Class B. Those cases are immature because there is no definite successful response. However, it appears that action a2 is significant. The GPME creates a new suggestion S4 with an action plan of (a2). From Class B, the probability for S4 is 33%. The probability for S1 and S3 becomes 23%, and 21% for S2. From Class A, the probability for S4 is 0%. As we described earlier, the GPME also needs to teach the new S4 action plan to the host and make sure it is accepted. It is possible, for example, that the host is not capable of performing action a2 without performing a1, a7 or a6 first. Figure 4.17 shows the improved response ontology. 150
Figure 4.17 - Ontology Example - Improved State If S4 is ineffective (because it does not work or because the host cannot do it), the GPME will adjust the probabilities as S4 fails. If S4 is effective, its probability will eclipse the others. Eventually, through reasoning and testing, the GPME develops a new suggestion the designer did not specify, that enables mature cases. In this way, the GPME breaks through the brittleness of the ontology. Ontology objects are GPME objects; they decay when unused. A suggestion with a probability of success of 0% across the Responses ontology will eventually be deleted. Over time, the Responses ontology contains high probability suggestions for every response class. Therefore, the GPME spends less time in deliberation to find possible responses. However, as circumstances change, the reasoning mechanism option and the hardwired predicates can re-emerge as viable responses, as they never fully decay.
4.2.2 Computing Processes
In the previous section, the dissertation described the structure and assembly of the knowledge base. We can now proceed with the description of the way the GPME uses the
knowledge base. This section describes a few internal processes (Figure 4.22) that take their name from Hindu mythology. The GPME processes are multithreaded and concurrent. 4.2.2.1 THE VISHNU PROCESS In mythology, Vishnu is the supreme god who masters the past, the present and the future, creates and destroys all existence, and preserves the universe. In the GPME, the Vishnu process uses the knowledge base to project the future. The future consists of several moments. Projecting the future involves determining what fragments will occur in future moments. Vishnu begins by constructing the current frame and adding it to short-term memory and into the frame clusters. Vishnu uses the instrument arrival rate and triggers an anomaly when the rate is outside the band. It matches the current frame to the fragments. For each matched fragment, it selects the 12±1 highest probability of occurrence Ψ and signature clusters that contain similar fragments to the current frames in short-term memory, using the previously matched fragments.
Figure 4.18 - Vishnu in action
As described earlier, the centroids of these clusters contain fragments that reflect the most significant information from the underlying frames and episodes. These fragments are arranged in temporal order. Therefore, Vishnu distributes those fragments over future moments in the same order. In Figure 4.18, the circles labeled R represent the current frame or experience from the telemetry. The circles labeled T represent the expectation set for that moment. For each fragment Vishnu places in a future moment, it projects forward from that fragment. The process is called spreading activation. Therefore, we can envision that the projection forms a web as depicted in Figure 4.19. The Vishnu web has a fixed spread which represents the maximum distance, in terms of case hops, that a projected fragment is from the current frame. In the example, the spread is 3. From the current frame, the projection goes no further than 3-deep.
Figure 4.19 - Spread 3 Vishnu Web
The number of frames and fragments in working memory is the sum of the short-term memory (frames) and the Vishnu web (fragments). For example, with a spread of 3, the web contains up to 40 fragments. The number may be less because some fragments overlap such as the 2/3 fragment in Figure 4.19. A more practical spread of 7 contains nearly a million fragments allowing for about forty thousand frames in short-term memory. An overlapping fragment such as the 2/3 fragment in Figure 4.19 is called a nexus fragment. The GPME treats a nexus fragment like a single fragment from a navigation perspective but it maintains its distinct nature otherwise. Therefore, a nexus fragment can have multiples of 12±1 connections and each component fragment of the nexus decays at its own rate.
Figure 4.20 - Nexus Example In Figure 4.20, projections A and B share a nexus fragment. Projections C and D do not have a nexus fragment. When the GPME uses the nexus fragment, all paths are available. Under certain conditions, Projection B can switch to A.
4.2.2.2 THE SHIVA PROCESS
In mythology, Shiva is the god of transformation. In the GPME, the Shiva process examines the projected fragments expected to occur in the current moment in order to match one of the fragments to the current frame. Typically, the frame overlaps the matched fragment, but the overlap does not need to be perfect. When a fragment is matched to a frame, the combined fragment and frame pair is referred to as an anchor. An anchor is placed in short-term memory; short-term memory exclusively contains anchors. There are two types of anchors; matched and forced. When a fragment is not matched, Shiva sets its Damaru metric to Δ. For example, T14 and T22 will be set to Δ. A matched anchor is referred to as partial or perfect. A forced anchor occurs when no match is found (R3). In that case, the experience becomes the matched expectation. Perfect matches do not represent an anomaly and strongly reinforce the course of action. Partial matches and forced matches represent anomalies. Shiva uses the following rules to trigger anomalies (a schematic sketch follows the list). Note that the rules examine the segments of the frames:
• R segment is Null and T segment is Null: No anomaly
• R segment is Null and T segment is not Null: Strong anomaly
• R segment is complete and T segment is complete and they are the same: No anomaly
• R segment is complete and T segment is complete and they are not the same: Weak anomaly
• R segment is complete and T fragment is a subset: No anomaly
• R segment is complete and T fragment is not a subset: Weak anomaly
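The rules can be sketched as a single function. This is our simplification: segments are modeled as sets of attribute-value pairs (or None when absent), the subset and equality tests are collapsed into one branch, and the case of an experienced segment with no expectation, which the rules do not list, is assumed benign.

    def anomaly(r_segment, t_segment):
        """Shiva's per-segment matching rules. r_segment is the experienced (R)
        segment, t_segment the expected (T) one; both are sets of
        attribute-value pairs, or None when the segment is absent."""
        if r_segment is None and t_segment is None:
            return "none"
        if r_segment is None:                 # expectation with no experience
            return "strong"
        if t_segment is None:                 # experience with no expectation (assumed benign)
            return "none"
        if t_segment == r_segment or t_segment <= r_segment:   # same, or expected is a subset
            return "none"
        return "weak"

    print(anomaly({("temp", 20)}, {("temp", 20)}))    # none
    print(anomaly({("temp", 20)}, {("temp", 25)}))    # weak
    print(anomaly(None, {("temp", 25)}))              # strong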
When there is no anomaly, Shiva reverses the decay of the cases that caused the anchor frames to be projected into the moment. Since Shiva knows which frames are in which future moments, it calculates the homeostasis bandwidth for each future moment. Once Shiva identifies the anchors in the current moment, Vishnu can push the projection from each new anchor to the full depth. Referring to Figure 4.19, assume the middle 1 is the only new anchor. Then, a new level of projections labeled 4 must be appended at the bottom of the sub-tree. Vishnu leaves the other sub-trees as they are. 4.2.2.3 THE KALI PROCESS In mythology, Kali is the goddess of time and death. In the GPME, the Kali process carries out the decay process. Therefore, it decays every Damaru object and calculates the current homeostasis value and emotional state. If the current homeostasis is outside the bandwidth projected for the current moment, Kali triggers an anomaly. After decaying all Damaru objects, Kali calculates the current homeostasis. It publishes the new moment identifier, the current homeostasis and the current emotional state. Fully decayed objects are deleted from the knowledge base. Because a Kali cycle defines a moment, it is important that a single cycle occurs quickly. If the Kali cycle is too slow, frames will become too large to process effectively. As a result, Kali operates on the GPGPU on a data structure that only contains the Damaru metric of objects both on the GPGPU and in main memory. This approach allows Kali to use several hundred threads for decay.
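A sketch of one Kali cycle over the flat array of metrics follows; NumPy stands in for the many-threaded GPGPU kernel, and the function name, step values and delta=0 are illustrative assumptions.

    import numpy as np

    def kali_moment(metrics, decay_steps, delta=0):
        """One Kali cycle: move every Damaru metric decay_steps[i] units toward
        delta without overshooting, and report which entries reached delta and
        should therefore be deleted from the knowledge base."""
        direction = np.sign(delta - metrics)              # +1 below delta, -1 above
        moved = metrics + direction * decay_steps
        overshoot = np.sign(delta - moved) != direction   # crossed delta: clamp
        moved[overshoot] = delta
        return moved, moved == delta

    metrics = np.array([750_000.0, -120.0, 3.0])
    steps = np.array([1.0, 2.0, 5.0])
    print(kali_moment(metrics, steps))    # the third object decays past delta and is pruned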
4.2.2.4 THE GANESH PROCESS In mythology, Ganesh is the god of intellect and the remover of obstacles. In the GPME, the Ganesh process determines the course of action to respond to an anomaly. The response can be to do nothing, to create a timer, to create a new projection or to correct a case’s links. Anomaly Handling When the anomaly involves the frame arrival rate, Ganesh consults the ontology for the hardwired response the designer specified. When the anomaly involves the project segment significance, Ganesh measures the frequency of occurrence of the anomaly using a timer. If the frequency trends upwards, it indicates that conditions in the environment may be diverging from the experience built into the knowledge base. Ganesh accelerates the decay of cached case predicates in the ontology to favor deliberation. When the anomaly involves the homeostasis bandwidth, Ganesh creates a new episode and monitors homeostasis to determine when to conclude the episode. The GPME employs heuristics to modify the cases that originated the fragments responsible for the anomaly. For example, refer to Figure 4.21. The Vishnu process selected a case and projected fragments C1 through C6. However, the homeostasis target at C6 occurs early in C4. This causes an anomaly because the actual homeostasis exceeds the projection at C4. Since the projected homeostasis is achieved, no new episode is necessary. However, Ganesh adjusts the case by creating a new causal link between C4 and C6. The result is a new case predicate that omits C5. The original predicate still exists but its Damaru is lesser.
Figure 4.21 - Example of Ganesh Heuristic Ganesh heuristics create or accelerate the decay of all types of links except temporal. When the anomaly involves an anchoring problem, Ganesh accelerates the decay of the cases that caused the anomalies. The degree of acceleration is a function of whether the anomaly is weak or strong. Ganesh packages information about the projection error for future analysis. Case Selection Ganesh consults the ontology to identify a response to the anomaly. The ontology can provide a reasoning mechanism or an actual plan of action. Ganesh uses the reasoning mechanism to select the case predicate. Ganesh communicates its suggestions to the host. Once the host accepts the suggestions, Ganesh projects the case predicate fragments into the appropriate future moments. A fragment Ganesh places in the future is called a goal fragment. A
goal fragment is associated with the anomaly that created it. If a goal fragment reaches Δ, Ganesh packages projection error information and re-plans a response to the anomaly using a different approach than the previous time. Case Exception A case exception occurs when no case is found to project from. This situation occurs when existing cases have failed or when no case exists. Ganesh uses its internal programming capability, called Amri, to experiment with its instruments to arrive at a solution. Ganesh uses the best performing case predicates and modifies the parameters of the actions contained in the plan of action. It uses values from the domain defined in the EIS that have not yet been tried. The instruments used for experimentation are the ones present in the case predicate. If there is no case predicate, it uses a random combination of instruments. 4.2.2.5 THE BRAHMA PROCESS In mythology, Brahma is the god of creation. In the GPME, the Brahma process creates clusters. Brahma takes each new frame Shiva creates and manages the signature and distance clusters. Brahma takes each new episode Vishnu creates and manages the case clusters. Managing the clusters means looking for all clusters where the new frame or episode is a better fit than existing frames or clusters. If a change is made to the cluster, Brahma generates new predicates and populates the ontology. When the Damaru metric of a cluster reaches μ, Brahma re-evaluates the cluster taking the projection error information Ganesh previously packaged into account. This analysis rates the predicates of the cluster based on the projection error information from anomalies.
4.2.2.6 THE SARASWATI PROCESS In mythology, Saraswati is the goddess of knowledge, music, arts, and science. She endows humans with the power of speech, wisdom, and learning. In the GPME, the Saraswati process is responsible for social network communication. Lakshmi prepares the SIS and broadcast messages to other GPME instances. Periodically, Saraswati examines the knowledge base and issues the Request-Whenever message. Saraswati also accepts messages from other GPME instances and populates the social cases into the knowledge base. 4.2.2.7 THE LAKSHMI PROCESS In mythology, Lakshmi is the goddess of wealth and prosperity. She is also the mother of Kama (Desire). In the GPME, Lakshmi is responsible for handling requests from the host. A request is not telemetry; it is the dynamic formulation of an objective expressed in terms of observations. Lakshmi projects case predicates to achieve the objective. The fragments Lakshmi projects are called objective fragments. An objective fragment is a type of goal fragment. The Lakshmi process manages the projection accuracy bandwidth. It detects variations between the knowledge base and the environment and triggers the anomaly Ganesh can react to. 4.2.2.8 THE PARVATI PROCESS In mythology, Parvati is the divine mother. In the GPME, the Parvati process is responsible for processing the EIS, establishing communication with the host and initiating all the other processes. The Parvati process provides management instrumentation into the GPME instance.
When the designer composes the EIS, she creates instrument types and instrument instances. The type specifies the capabilities and attributes of the instrument. However, instrument types are optional in the EIS. If the EIS references an instrument type that is not defined locally, Parvati looks for a publicly registered sensor of the same type name on the web site. This approach allows designers to share instrument type definitions and facilitate the creation of swarms. 4.2.2.9 PROCESS DEPLOYMENT CONSTRAINTS The Vishnu, Shiva and Kali processes are heavily threaded. As a result, they are designed to run on a GPU. The data structures that enable the Ganesh web must account for the GPU’s constraints. The Ganesh and Brahma processes are also multithreaded but require substantially fewer threads. The Lakshmi and Parvati processes can operate single threaded. The latter four processes are CPU bound since they must access and manage the episodic memory. The Kali process must perform its function on both episodic memory and Ganesh web. As a result, it straddles both CPU and GPU.
Figure 4.22 - GPME Processes
4.3 Communication Interface Specification
There are two communication interfaces; GPME-Host and GPME-GPME. This section
gives a clear understanding of how the host communicates with the GPME and how GPME instances communicate with each other. The GPME-Host interface specification consists of two parts; Environmental and Operational. The Environmental interface is an XML document that describes the capabilities of the cognitive agent and the characteristics of the environment, to the metacognitive agent. In the process, the Environmental interface (EIS) defines the ontology the Operational interface uses. The Operational interface (OIS) supports the monitor and control flows during execution time. It
defines the structure of operational messages exchanged between the host and the GPME in real time. The Operational interface is a subset of FIPA ACL or an XML implementation of a subset of FIPA ACL. The designer specifies the Environmental interface to the metacognitive agent before execution time. It is part of the initialization of the agents. Once the metacognitive agent validates and accepts the document, it is ready to collaborate with the cognitive agent. The EIS defines the environment in terms of the sensors and actuators the host possesses. The sensors are defined in terms of the observations they are capable of making. A sensor can be focused on the environment or on the host itself. The actuators are defined in terms of their capabilities to affect observations. The EIS also defines the expectations the host has. An expectation is defined in terms of predicates. A predicate consists of three parts; an operator (p), a feature (f) defined in the interface, and an expected value or state (v). A predicate is denoted (p, f, v) and it evaluates true or false when a specific observation (f, v) is applied. Typically, an expectation consists of several predicates such as E1 = {(p1, f1, v1), (p2, f2, v2),…,(pn, fn, vn)}. An expectation is conjunctive; all predicates must be true for the expectation to be met, and the expectation is deemed violated if any predicate is false. The OIS defines the message format that will be exchanged in real time. Since the GPME does not have direct access to the environment or the sensors, the host reports observations to the GPME. This telemetry is continuous and asynchronous. The host does not wait for a response; it simply reports its newest observations or all its observations as they occur. The telemetry therefore consists of a continuous stream of observation value pairs, (fn, vn), in a format
specified in the Environmental specification. For example, the observation value pairs could be fixed format, XML, FIPA, etc. There is no timing implied in the observation stream. At any point in time, the GPME can interrupt the host with a suggestion. This suggestion is a stream of actuator action parameter pairs, (an,pn), in a format defined in the EIS using a delivery mechanism also specified in the same interface specification. For example, the response could be fixed format, XML, FIPA, etc., delivered by callback RPC, web service call, file exchange, etc. There is timing implied in the response stream, in that all the actuator parameter pairs the GPME delivers at a given point in time are considered to form a single response. The GPME-GPME interface is also divided into two parts like the EIS and OIS. Therefore, the GPME and the host are fully decoupled. The Environmental and Operational interface specifications allow the integration of new hosts without requiring any reprogramming of the GPME. The GPME can use separate computing resources from the host. The GPME can be deployed in the cloud as a service. The host may require some programming if it was not designed to report observations or to receive suggestions.
4.3.1 Information Model
The GPME information model depicted below is the foundation for the Environmental interface specification and for the internal data structure within the GPME. Shaded entities contain information specific to the application. Clear entities contain static information. Each entity has a primary key to uniquely identify its members (rows). The arrow indicates the direction of the parent-child relationship, where the child points to the parent. Children inherit the primary key of the parent entity row they are related to.
Figure 4.23 - GPME Host Information Model For example, the parents of the entity called SENSOR TYPE are the entities SENSOR SCOPE and SENSOR CLASS. As a result, the entity Sensor inherits the primary key of both its parents as part of its own primary key. The primary key of a row in SENSOR is the concatenation of the primary keys of the related rows in SENSOR TYPE, SENSOR CLASS and its own primary key. A SENSOR SCOPE classifies a sensor definition as either observing the agent (self) or observing the environment. A sensor class classifies the behavior of the sensor. Classified by sensor scope and sensor class, a sensor type defines a type of sensor the agent possesses. A sensor noise classifies the noise characteristics of a specific instance of a sensor. A sensor is an actual real sensor as an instance of a sensor type. For example, assume that “Camera” is a sensor type. The agent can define Camera-1 and Camera-2 as instances of Camera, each with a different (sensor) noise profile. 165
A feature data type classifies the domain of a feature. The Set type allows the definition of an ordered enumerated list of values. The Predicate type allows the definition of a logical predicate that can be asserted (made true), denied (made false) or contradicted (made not its current state). A feature is a discrete information item the host can report to the GPME. The feature data type classifies it. The Sensor Type Feature entity associates a feature with a type of sensor. For example, the Energy Level is a feature reported by the Energy Sensor. The same feature can be reported by different sensor types. An Actuator Type defines a class of actuator. An Action defines a possible action the cognitive agent can perform. The entity Actuator Type Action associates the action with the Actuator Type that can perform it. Similarly, the entity Actuator Type Features associates the Actuator Type with the Features in the environment that it can affect. The entity Actuator Type Feature Actions associates the actions of the Actuator Type that affect the feature. This information can be specified in the Environmental interface; however, GPME also derives it over time. Lastly, an Actuator is an instance of an actuator type. For example, Arm is a type of actuator and Right Arm and Left Arm are instances of Arm. An observation is an instance of a sensor reporting the value of a feature. While a sensor may report all feature values defined in the entity sensor type features, the GPME only considers observations in its deliberations.
An expectation observation is a projection of a feature value. If the projection is not achieved, an expectation violation occurs. The expectation type classifies an expectation based on when the GPME checks for a violation. The bold expectation types are checked continually and can trigger guidance asynchronously. The expectation operator assigns the comparison operator for comparing the projected value and the observed value. If the result of the comparison is false, an expectation violation has occurred. An expectation is a predicate composed of Observation Expectations. If any of the expectation observations is not met, an expectation violation has occurred (conjunctive). An expectation group contains expectations that are evaluated together. The Expectation Group Member entity associates an Expectation to a group. An expectation can be part of several groups. An expectation group is either conjunctive or disjunctive. In a conjunctive group, the logical operator AND is applied to all member expectations to determine the truth value. In a disjunctive group, the logical operator OR is applied instead.
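A minimal sketch of how predicates, expectations and expectation groups evaluate follows. The comparison operators shown are generic illustrations (the EIS defines its own richer operator set), and all function names and the sample observation are ours.

    import operator

    OPS = {">=": operator.ge, "<=": operator.le, "==": operator.eq, "!=": operator.ne}

    def predicate_holds(pred, observations):
        """A predicate (p, f, v): operator p applied to the observed value of
        feature f and the expected value or state v."""
        p, f, v = pred
        return f in observations and OPS[p](observations[f], v)

    def expectation_met(predicates, observations):
        """An expectation is conjunctive: every predicate must hold;
        otherwise an expectation violation has occurred."""
        return all(predicate_holds(pred, observations) for pred in predicates)

    def group_met(expectations, observations, conjunctive=True):
        """Expectation groups combine member expectations with AND (conjunctive)
        or OR (disjunctive)."""
        results = (expectation_met(e, observations) for e in expectations)
        return all(results) if conjunctive else any(results)

    obs = {"energy": 40, "speed": 3}
    E1 = [(">=", "energy", 25), ("<=", "speed", 5)]    # illustrative expectation
    print(expectation_met(E1, obs))                    # True: no violation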
4.3.2 Environment Interface Specification
The purpose of the EIS is to establish a context of operations, to define the environment or universe within which the system operates. The GPME-GPME EIS interface reuses parts of the GPME-Host EIS. Therefore, we begin by describing the GPME-Host EIS.
4.3.2.1 GPME-HOST EIS
Based on the model, the Interface Specification XML takes the following form:
Host Interface Specification (EIS, Ontology, and OIS)
Host Interface Specification (EIS, Ontology, and OIS) Enumerated list of values delimited by DELIMITER if enum is YES in sorted order if ordered is YES initial value expected value or parameter based on operator OR List of expectation names in strict order of precedence if ordered is YES descriptive text descriptive text descriptive text descriptive text descriptive text path or URI path or URI
A primitive action the agent can perform. If the action has no child actuatorTypeFeature tag defined, the action is excluded from case based reasoning. State: Control: Spatial: Temporal: Resource: Reward: Ambient: Property: Message: Counter: Unspecified: Specifies the initial cost of a link between ontology nodes
Class
Cost Goal
Key
Yes: This expectation group contains expectations that reflect a goal of the agent. No: This expectation group does not contain any expectations that support any goals of the agent. For example, they are expectations about the environment. String that uniquely identifies the agents to each other.
Creates a link between the ontology node specified in the parent NODE tag and the ontology node specified in the name attribute of the LinksToNode tag. LinksToExpectationGroup Creates a link between the ontology node specified in the parent NODE tag and the expectation group specified in the name attribute of the LinksToExpectationGroup tag. Must be a leaf node in the ontology. Creates a link between the ontology node specified in the parent LinksToFeature NODE tag and the sensor feature specified by the feature and sensor attributes of the LinksToFeature tag. Must be a leaf node in the ontology. Creates a link between the ontology node specified in the parent LinksToResponse NODE tag and the response. Must be a leaf node in the ontology. A node in the ontology. The nodes FAILURES, INDICATIONS, Node RESPONSES are the root nodes for which all nodes in the given ontology have a path. Perfect: Noise Uniform: Automatic: Unspecified: String that uniquely identifies the specification instance. The Version agent can revise the EIS during runtime. LinksToNode
Attribute
Value
Ontology
Defines the ontology GPME uses. If omitted, GPME will use the MCL2 Bayesian ontology. Go Up: Trend Go Down: Trend Net Zero: Value Any Change: Value Net Range: Value Take Value: Value Don’t Care: Value Be Legal: Value Real Time: Watchdog Tick Time: Watchdog Stay Under: Value Stay Over: Value Maintain Value: Value Within Normal: Value Parent class within the Response ontology.
Operator
Response Scope SetsTime
Type
Self: Applies to the cognitive agent itself Environment: Applies to the environment If YES, any observation reported from this sensor is a clock tick. This allows GPME and the agent to establish timing between them using an observation. For example, this can be a heartbeat the agent sends every second. Integer: Rational: Natural: >= 0 Binary: Bitfield: -1, 0 , 1 Symbol: Set: Range: Boolean: 0, 1 Predicate: String Table 4.6 Host Interface Dictionary
The following EIS specification is an example: N;E;W;S
10 Exp1;Exp2; someGPMEURI someAgentURI
4.3.2.2 HARDWIRING THE ONTOLOGY GPME uses a Bayesian ontology to link designer-specified expectations to responses. We will define expectations further in the paper. Designer specified expectations are called hardwired expectations. The ontology is based on the Bayesian MCL implementation described in [74]. The host must know how to handle the responses shown on Figure 4.24 as well as any other responses the designer specifies. The EIS enables the designer to specify the ontologies and their linkages through the abstract nodes by using the , , ,
and tags. For example, the designer can specify that a certain failure as reported by an internal sensor requires a response of Solicit Help. When the GPME detects the expectation violation, it will suggest Solicit Help and the host inherently possesses the ability to do so, for example, flashing a warning light or broadcasting a distress signal.
Figure 4.24 - GPME Bayesian Ontology The EIS specification of the ontology described in Figure 4.24 follows, using only one response. Note that linkages are only defined once since a Bayesian network is a directed acyclic graph and the GPME always proceeds from indications to responses through failures: Try it again, Sam!
leftmost abstract node middle abstract node rightmost abstract node middle abstract node rightmost abstract node try the action because of failure
The hardwired components defined in the EIS are always part of the knowledge base.
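As an illustration, the sketch below (Python) shows one possible in-memory representation of such a hardwired linkage as a directed acyclic graph; the indication and failure node names are hypothetical placeholders, and only the Solicit Help response comes from the example above.

# A minimal sketch of a hardwired linkage, assuming a plain adjacency-list DAG;
# node names other than "Solicit Help" are hypothetical placeholders for the
# abstract nodes a designer would declare in the EIS.
hardwired = {
    "INDICATIONS": ["sensor-out-of-range"],        # indication ontology root
    "sensor-out-of-range": ["actuator-failure"],   # indication -> failure link
    "FAILURES": ["actuator-failure"],              # failure ontology root
    "actuator-failure": ["Solicit Help"],          # failure -> response link
    "RESPONSES": ["Solicit Help"],                 # response ontology root
    "Solicit Help": [],                            # leaf response the host implements
}

def responses_for(indication, graph):
    """Walk the DAG from an indication to the reachable response leaves."""
    frontier, seen, leaves = [indication], set(), []
    while frontier:
        node = frontier.pop()
        if node in seen:
            continue
        seen.add(node)
        children = graph.get(node, [])
        if children:
            frontier.extend(children)
        else:
            leaves.append(node)
    return leaves

print(responses_for("sensor-out-of-range", hardwired))   # ['Solicit Help']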
4.3.3 OPERATIONAL INTERFACE SPECIFICATION
The Operational Interface is based on FIPA and its concept of ontology-based communication. GPME does not manipulate the host's knowledge base, and it does not allow the host to manipulate its own knowledge base. The interface requirements align better with FIPA. The Operational Interface begins in the EIS between dedicated tags; the invocation value is a path or URI.
The conversation attribute is set to YES to indicate that every message from the GPME to the host requires an acknowledgement response. When acknowledgment is required, the recipient sends a confirmation message, using the sender's invocation specification, for every message received. Another message cannot be sent until the confirmation is received. This attribute enables conversational mode at the session level. When the attribute is set to NO, the session is non-conversational. The emote attribute is set to YES to indicate that the GPME will inform the host of any changes to the GPME's emotional state. Otherwise, the host receives the GPME's emotional state only when the GPME has a suggestion. The acknowledge attribute is set to YES to indicate that the host will inform the GPME of the completion status of each suggestion it receives. The invocation tag specifies the invocation method for the GPME and for the host. The method WS indicates that the agent accepts a web service invocation. The method PATH indicates that the sender must write the message to the specified file path. The use of web services enables the GPME to apply security best practices, thereby addressing the need for secure communication between the GPME and the agent. Secure web services best practices are defined in National Institute of Standards and Technology (NIST) Special Publication (SP) 800-95 [117]. The format attribute defines the syntax of the exchanges between the agent and the GPME: flat (delimited format), FIPA, or FIPA as XML. When the message format is FLAT, the agent sends a fixed-format delimited message in which the delimiter separates observations. Each feature has a unique identifier formed by concatenating the sensor name attribute with the sensorTypeFeature name attribute in the EIS, immediately followed by the value and the delimiter. Using the EIS specification example above, gauge_mainF100; is the message that reports a fuel level of 100 for the sensor gauge_main.
4.3.3.1 SYNTAX AND DICTIONARY
The following table describes the FIPA and FIPAXML message formats:
FIPA Syntax: (performative :sender KEY :receiver KEY :content X :language XML|FLAT :ontology V :reply-with W :in-reply-to P)
FIPAXML Syntax: <performative sender="KEY" receiver="KEY" language="XML|FLAT" ontology="V" reply-with="W" in-reply-to="P">X</performative>, where X is the content of the message in the language format.
The dictionary is as follows:
Attribute | Value
KEY | A unique value defining the specific instance of the agent in the EIS.
X | The content of the message being sent from one agent to the other.
Language | FLAT: The content of the message is in flat delimited format. XML: The content of the message is in XML.
V | Version ID of the EIS the agent is using. The concatenation of KEY and V uniquely identifies the ontology that gives semantic meaning to content X.
W | Episode ID when sent by GPME; message sequence number when sent by host.
P | Last value of W when used for agree, cancel, failure, not-understood and refuse; ignored otherwise.
Performative | The communicative acts are:
  Inform: from host to GPME to report telemetry, or, from GPME to host to provide a known suggestion.
  Refuse: from host to GPME, or vice-versa, to refuse to perform an understood request.
  Failure: from host to GPME that a request failed.
  Request: from host to GPME to ask for a plan to achieve the stated objectives.
  Subscribe: from GPME to host to request observations.
  Cancel: from GPME to host to cancel a subscribe or request, or from host to GPME to cease all learning activity.
  Not-Understood: from host to GPME that it could not perform a request.
  Agree: response to subscribe, request and cancel.
  Propose: from host to GPME to propose the EIS, or from GPME to host to propose a new response type and action plan.
  Accept-Proposal: from GPME to host when accepting the proposed EIS, or from host to GPME when accepting a new response type.
  Reject-Proposal: from GPME to host when rejecting the proposed EIS, or from host to GPME when rejecting a new response type.
  Confirm: from host to GPME that it is ready for the next message when conversation mode is enabled, or, from host to GPME that it has completed the actions for the specified suggestion.
Table 4.7 Host Interface Dictionary
4.3.3.2 COMMUNICATION FLOW
Figure 4.25 depicts the communicative acts that can flow between the two agents:
Figure 4.25 - GPME-Host Interface Acts
Figure 4.26 depicts the start of communication between the agents. In the proposal diagram, the host submits the EIS to GPME. If GPME accepts the EIS, the subscribe diagram shows the start of their communications:
Figure 4.26 - Proposal Flow
Figure 4.27 - Subscribe Flow
The inform flow is very simple because the transport mechanism between GPME and the host is reliable:
Figure 4.28 - Inform Flow
The request flow communicates a response from GPME to the host, indicating that GPME requests the host to perform some action. The host always sends an agree message to indicate receipt and acceptance of the request message. In conversational mode, the host controls the arrival of responses by sending a confirm message when it is ready to receive the next request message.
Figure 4.29 - Request Flow
In the examples that follow, the FIPA message omits the labels (:sender, :receiver, etc.). However, these labels are present in practice. Returning to the example where the agent reports the fuel level at 100:

Format | Language | Agent sends Message
DELIMITED | N/A | gauge_mainF100
FIPA | FLAT | (inform A B (gauge_mainF100) FLAT 1 1 1)
FIPA | XML | (inform A B (100) FLAT 1 1 1)
FIPAXML | FLAT | gauge_mainF100
FIPAXML | XML | 100
Table 4.8 FIPA Examples
When the invocation method is WS, the host provides a method for each communicative act, as well as a utility method called Post. The Post method accepts one parameter, which is the message itself. For example, assuming GPMEWS is the URI specified in the invocation tag, the agent can call GPMEWS.Inform("robot1", "GPME", "FLAT", "1", "1", "1", "gauge_mainF100") or GPMEWS.Post("(inform A B (gauge_mainF100) FLAT 1 1 1)").
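To illustrate, the minimal sketch below (Python) shows how a host might compose these messages; flat_observation and fipa_inform are illustrative helpers, and gpme_ws stands in for a web-service client exposing the Inform and Post methods described above, so these names are assumptions rather than part of the specification.

# Sketch of message composition by a hypothetical host.
def flat_observation(sensor, feature, value):
    # FLAT format: sensor name + feature name immediately followed by the value
    return f"{sensor}{feature}{value}"

def fipa_inform(sender, receiver, content, language="FLAT",
                ontology="1", reply_with="1", in_reply_to="1"):
    # FIPA string form with explicit labels, as used in practice
    return (f"(inform :sender {sender} :receiver {receiver} :content ({content}) "
            f":language {language} :ontology {ontology} "
            f":reply-with {reply_with} :in-reply-to {in_reply_to})")

message = fipa_inform("robot1", "GPME", flat_observation("gauge_main", "F", 100))
print(message)
# gpme_ws.Post(message)                                        # utility method
# gpme_ws.Inform("robot1", "GPME", "FLAT", "1", "1", "1", "gauge_mainF100")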
4.3.4 SOCIAL INTERFACE SPECIFICATION
Although the GPME-GPME Social Interface Specification (SIS) is very similar to the EIS, the two are used very differently. The EIS is loaded into the GPME at initialization. From that point on, the host and the GPME use the OIS to communicate. However, the GPME uses the SIS in real time to communicate with other GPME instances. A GPME always broadcasts its message to all other GPME instances, and the message is always in SIS format.
The SIS duplicates the context (all feature, sensorType, sensor, and actuator tags) from the EIS. Its remaining elements carry the content of each segment, enforce strict order when the ordered attribute is YES, and specify the SOA topic name, path, or URI, as well as pad text.
Table 4.9 Social Interface Specification
Duplicating the feature, sensorType, sensor, and actuator tags from its own EIS allows another GPME instance to determine whether it should communicate. If there is insufficient overlap in these tags, the two GPME instances do not share enough context to have a useful communication. If the other GPME instance determines that communication will be useful, it uses the OIS to access the broadcast of information from the originating GPME.
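A minimal sketch of this overlap test follows; representing the duplicated tags as (tag, name) pairs and deciding sufficiency with a fixed threshold are assumptions for illustration, not requirements of the SIS.

# Sketch of a context-overlap test between two GPME instances.
def sufficient_overlap(own_tags, received_tags, threshold=0.5):
    # own_tags / received_tags: sets of (tag, name) pairs duplicated from the EIS
    if not own_tags:
        return False
    shared = own_tags & received_tags
    return len(shared) / len(own_tags) >= threshold

own = {("sensor", "gauge_main"), ("feature", "F"), ("actuator", "wheel_left")}
other = {("sensor", "gauge_main"), ("feature", "F"), ("actuator", "gripper")}
print(sufficient_overlap(own, other))   # True: two of three tags are shared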
The SIS provides an additional method for invocation: Topic. In this context, a topic is a durable subscription to a message stream [118]. This communication method is particularly appropriate for the GPME social network. The SIS introduces two new acts: Propagate and Request-Whenever. The originating GPME uses the Propagate act to broadcast a message to all other GPME instances. The Request-Whenever act registers the GPME to receive all messages that meet certain conditions relating to the maturity level attribute and the contents of the duplicated context tags. The Subscribe act subscribes the GPME to the broadcast of another GPME instance. The Cancel act ends a subscription.
Figure 4.30 – GPME-GPME Acts
In Figure 4.30, the GPME that receives the episodic memory is labeled learner. The GPME that provides the episodic memory is labeled model. When a GPME starts operating, it uses Propagate to broadcast its first SIS message. It uses Request-Whenever to specify the maturity level and specific context characteristics of models it is interested in. It receives the most recent SIS message from each of the other GPME instances that match or exceed the maturity level and meet the context selection criteria. It uses Subscribe to receive all future messages from the other GPME. Subscribe also causes the model to provide all relevant episodic
memory to date. The two GPME instances exchange security information to encrypt their communication. From this point on, the other GPME is a model, and the learner begins receiving all future messages from the model. Note that the Request-Whenever still filters arriving messages. The learner can change the filter by issuing a new Request-Whenever. If the model's maturity level drops, the learner stops receiving broadcasts because of the filter. This behavior also applies if the learner raises the desired maturity level above that of the model. However, the subscription is still active. As soon as the model's maturity level is high enough, message delivery resumes. To end a subscription and disconnect from another GPME completely, the learner must issue a Cancel. Figure 4.31 summarizes the GPME social network interactions from the learner's perspective.
Figure 4.31 - GPME Learner’s Social Network Interactions
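The delivery rule described above can be summarized in a small sketch; the Subscription class and numeric maturity levels are illustrative assumptions.

# Sketch of the learner-side delivery filter: a model's broadcast is delivered only
# while the subscription is active and the model's maturity level meets the level
# requested by the latest Request-Whenever.
class Subscription:
    def __init__(self, model_id, required_maturity):
        self.model_id = model_id
        self.required_maturity = required_maturity   # set by Request-Whenever
        self.active = True                           # cleared only by Cancel

    def deliver(self, model_maturity):
        # Delivery pauses when maturity drops, but the subscription stays active
        # and delivery resumes once the model's maturity is high enough again.
        return self.active and model_maturity >= self.required_maturity

sub = Subscription("model-7", required_maturity=3)
print(sub.deliver(4))   # True  - delivered
print(sub.deliver(2))   # False - paused, subscription still active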
4.4 AMRI QUERY LANGUAGE
The GPME develops, tests, stores, and executes case queries. A case query produces a projection. The Ganesh process creates a thread that generates and tests case queries. This thread is called a Viumbe.
4.4.1 VIUMBE RUN
The Viumbe runtime uses two private storage areas: the stack and the parameter vector. The stack holds 13² (169) or 13³ (2197) entries, each consisting of a set identifier and a link; 13² or 13³ links are the maximum depth and scope of any command. All links are the same data type. The parameter vector holds (key, value) pairs that are inputs into the functions. The runtime initializes its own stack. The Viumbe initializes the parameter vector and passes it to the runtime. The runtime has a single thread of control that executes the program it was invoked with. Since the program is stored in a Memory, the Viumbe passes the Memory link of the program it wants executed to the runtime. The Viumbe starts the runtime by calling the function ViumbeRun(ProgramLink, ParameterVector). This is a blocking procedure call provided by the Kiota. The Kiota initiates a thread running the ViumbeRun interpreter program with the supplied parameters. ViumbeRun returns either a link to the memory inside the Viumbe's store where the resulting pattern is kept, or a null link if the resulting pattern is empty.
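A sketch of this calling convention, with placeholder types and a stubbed interpreter body, is shown below; the real command interpreter, the Kiota threading, and the Memory store are not reproduced.

# Sketch of the ViumbeRun calling convention (placeholder types, stubbed body).
from typing import Optional

MAX_STACK_ENTRIES = 13 ** 2          # 169 (set identifier, link) entries

def viumbe_run(program_link: int, parameter_vector: dict) -> Optional[int]:
    """Blocking call: interpret the program stored at program_link and return a
    link to the memory holding the resulting pattern, or None (a null link)."""
    stack = []                        # the runtime initializes its own stack
    # ... MKS / SLT / MKP commands of the stored program would execute here ...
    result_link = None                # null link when the resulting pattern is empty
    return result_link

parameters = {"moment": 42}           # (key, value) pairs supplied by the Viumbe
print(viumbe_run(program_link=1001, parameter_vector=parameters))   # None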
4.4.2 MAKESET
MAKESET (MKS) is a function that builds a set of links by analyzing the features of Memories. It can analyze the features supplied from the sensor feed and the meta-features added at creation, including the distinct vector element snapshots attached to the memory at creation. All of the parameters defined in the following table are required:

Parameter | Values | Description
Mode | MRG, DST | MERGED: The result of the function will be merged into a single set on the stack. DISTINCT: The result of the function will be several sets on the stack that will not be merged into one.
Source | STK, MMR | STACK: Use the stack as the source of the links on which to perform the function. MEMORY: Use the Viumbe's Memory store as the source of the links on which to perform the function, starting with the Memory whose link is at the top of the stack.
Type | PTN, PGM, TPL, SSR, ANY | Specifies the type of Memory that links in the result set should point to. PTN: PATTERN; PGM: PROGRAM; TPL: THEORY; SSR: SENSOR.
Origin | SLF, MNT, ANY | Specifies the provider of the Memory that links in the result set point to. SLF: Self only; ANY: Any swarm member including self; MNT: Model only.
Table 4.10 MAKESET Command Syntax

The following are examples of MAKESET commands:
• MKS MMT DST MMR ANY SLF
  o This command builds one set of memory links for each moment in the parameter vector. Any memory created by the Viumbe itself in one of the specified moments is included.
• MKS NSM MRG STK SSR TEM
  o Using links already on the stack, this command builds several sets of sensory memory links where all the entropic features are in the optimal range and the provider of the memory was a clique member.
All 1320 combinations of parameters are valid and result in a set (including a null set) of links.
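As an illustration, the following sketch dispatches a MAKESET-style command over an in-memory list of memory records; the record fields and the one-set-per-link handling of DST are simplifying assumptions rather than the behavior of the actual Memory store.

# Sketch of a MAKESET-style dispatcher over illustrative memory records.
def makeset(memories, mode, source, mem_type, origin, stack=None):
    """Return the new stack contents: one merged set (MRG) or several sets (DST)."""
    pool = (stack or []) if source == "STK" else memories            # STK vs MMR
    selected = [m for m in pool
                if (mem_type == "ANY" or m["type"] == mem_type)
                and (origin == "ANY" or m["origin"] == origin)]
    links = [m["link"] for m in selected]
    if mode == "MRG":
        return [set(links)]                     # a single merged set on the stack
    return [{link} for link in links]           # several sets, left unmerged

store = [{"link": 1, "type": "SSR", "origin": "SLF"},
         {"link": 2, "type": "PTN", "origin": "MNT"}]
print(makeset(store, "MRG", "MMR", "SSR", "SLF"))   # [{1}]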
4.4.3 SELECT
The SELECT (SLT) command instructs the runtime to select a set of links from the stack. This command only uses the stack.
Parameter | Values | Description
Method | INT, UNN, DFR, SML, FZY, EQV, MSF, LSF, BST, WST |
  INT: Leave only the intersection of all sets on the stack.
  UNN: Leave the union of all sets on the stack, removing all duplicate links.
  DFR (Difference): Leave only links that only appear in one set on the stack.
  SML (Similar): Leave links of memories that have the same Damaru features in the optimal range.
  FZY (Fuzzy): Leave links of memories that have the same set of features defined.
  EQV (Equivalent): Leave links of memories that have the same set of features defined and the same values.
  MSF (MostSignificant): Leave the 12±1 links of memories that have the highest Damaru metrics across any feature.
  LSF (LeastSignificant): Leave the 12±1 links that have the lowest Damaru metrics across any feature.
  BST (Best): Leave the link to the memory with the highest Damaru metric across any feature.
  WST (Worst): Leave the link to the memory with the lowest Damaru metric across any feature.
Mode | MRG, DST |
  MERGED: The result of the function will be merged into a single set on the stack.
  DISTINCT: The result of the function will be several sets on the stack that will not be merged into one.
Table 4.11 SELECT Command Syntax
The following are examples of selection commands:
• SLT INT MRG
  o Merge the links common to all sets into a single set.
• SLT DFR DST
  o Remove all links that appear in another set, leaving only unique set members on the stack, each in its original set.
All 20 combinations of parameters are valid selection strategies.
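The stack-only set operations behind INT, UNN, and DFR can be sketched directly with Python sets; the Damaru-based methods (SML, FZY, EQV, MSF, LSF, BST, WST) are omitted because they depend on feature metrics not modeled in this sketch.

# Sketch of the INT, UNN and DFR selection methods over a stack of link sets.
def select(stack, method, mode="MRG"):
    if method == "INT":                         # intersection of all sets
        result = set.intersection(*stack) if stack else set()
    elif method == "UNN":                       # union, duplicates removed
        result = set.union(*stack) if stack else set()
    elif method == "DFR":                       # links appearing in exactly one set
        counts = {}
        for links in stack:
            for link in links:
                counts[link] = counts.get(link, 0) + 1
        result = {link for link, n in counts.items() if n == 1}
    else:
        raise ValueError("only INT, UNN and DFR are sketched here")
    if mode == "MRG":
        return [result]                         # single merged set
    return [result & links for links in stack]  # keep each original set (DST)

print(select([{1, 2, 3}, {2, 3, 4}], "INT"))      # [{2, 3}]
print(select([{1, 2}, {2, 3}], "DFR", "DST"))     # [{1}, {3}]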
4.4.4 MAKEPATTERN
The MAKEPATTERN (MKP) command instructs the runtime to create the pattern using the links currently on the stack. If the stack is empty, it returns a null link. If the stack is not empty, it builds the pattern, calculates the Hormonal Potential vector, and requests the Indexing Operator to store it in a memory location. It then returns the link to the memory location to the Action Operator. MKP then clears the stack. MKP is normally the last command in a theory, since it initiates the pattern; however, MKP can occur anywhere in the theory. Each program carries a Hormonal Potential vector as a feature of the memory containing it. The Hormonal Potential vector is calculated as the memory is created. As MKP processes the final set of links on the stack to create the action list, it considers the snapshot of the Hormonal Vector that each memory carries and computes the difference between the elements of the snapshot vectors as it processes each memory. At the end, the Hormonal Potential vector is a Hormonal Vector that carries the potential change in the Hormonal Vector should the program be applied. Typically, the program whose Hormonal Potential vector has the highest value in element 0 is the best choice.
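One way to read this computation, assuming the element-wise differences between successive snapshots are accumulated, is sketched below; the vector length and sample values are illustrative only.

# Sketch of a Hormonal Potential computation over successive snapshots.
def hormonal_potential(snapshots):
    if len(snapshots) < 2:
        return [0.0] * (len(snapshots[0]) if snapshots else 0)
    potential = [0.0] * len(snapshots[0])
    for prev, curr in zip(snapshots, snapshots[1:]):
        for i, (a, b) in enumerate(zip(prev, curr)):
            potential[i] += b - a               # change carried between memories
    return potential

snapshots = [[0.2, 0.5, 0.1], [0.4, 0.4, 0.1], [0.9, 0.3, 0.2]]
print(hormonal_potential(snapshots))            # approximately [0.7, -0.2, 0.1]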
CHAPTER 5:
SUMMARY AND CONCLUSIONS
The questions the study examines are:
• What is the behavior oriented intelligence framework?
• What is the complete theoretical design of the GPME?
• How are observations organized for processing (the Kasai)?
• How does the GPME use the observations organized by the Kasai to predict?
• What is the specification of the communication interface between the GPME and the host, and between GPME instances?
In this dissertation, I presented a complete theoretical design for the GPME. The design
identifies the data structures, data representation, and processes necessary to create an artificial intelligence that acquires behaviors through observation. The observations are obtained directly from the environment through sensors that are integrated within the GPME. The design includes an open and flexible interface that extends the GPME's data model to include new sensors and new actuators. In addition, the GPME can learn behaviors from other instances by reconciling its own data structure model with that of the other instance. The core enabling capability within the GPME is the Kasai algorithm. The Kasai algorithm allows the detection and encoding of behaviors found in a series. The grammar the Kasai discovers is used to define the normal state of the environment. Differences from the normal state are anomalies; these are the stimuli that trigger the behavior composition function of the GPME. When the GPME sees something unexpected, it engages its learning mechanism. The dissertation focused on describing and demonstrating the Kasai algorithm. In the results section, I present various input series to the Kasai algorithm to demonstrate its ability to capture behaviors as grammatical rules. The time complexity of the Kasai algorithm is
O(log n). The Kasai algorithm generated a complete set of rules representing the input data series with 100% accuracy. Therefore, the Kasai algorithm provides the ability to predict the future state of the input series. In addition, the Kasai algorithm recognizes anomalies 100% of the time. I found that unless anomalies are recognized every time, the rules cannot be correct, since the Sarufi will not fully represent the data series. Anomaly recognition is thus the other side of the coin from accurate prediction. Future work will also focus on validating the activation algorithm against the data series cases through a formal proof of correctness. With the functionality of the Kasai algorithm established, post-doctoral work can proceed to build Kasai networks and the Vishnu Web. The Kasai network is the system that produces predictions and signals anomalies. The Vishnu Web is the propagated prediction state. The Kasai algorithm predicts one moment forward; the Vishnu Web is generated by applying the Kasai algorithm to project several moments forward into the future, to enable the detection of future anomalies. Future anomalies give the GPME the time to plan, to avoid problems, and to choose the course of action that provides the best rewards.
APPENDIX I: WEATHER EXPERIMENT DETAILS
The following table gives a detailed description of the parameters of the appropriate WEKA algorithm that was used in this experiment. Each algorithm was run four times to generate a composite model that in turn generated a four-dimensional point in space. Observe how each attribute is calculated from its model's inputs and weights.

WEKA Algorithm: M5P, an M5 pruned model tree (using smoothed linear models)
Parameters:
MinTemp = 0.2614 * MaxTemp + 0.5068 * MinDew + 0.1213 * MaxDew + 3.9795
MaxTemp = 0.6905 * MinTemp - 0.1134 * MinDew + 0.4507 * MaxDew + 15.6291
MinDew = 0.6582 * MinTemp - 0.0478 * MaxTemp + 0.4558 * MaxDew - 12.6865
MaxDew = 0.1503 * MinTemp + 0.2882 * MaxTemp + 0.5625 * MinDew + 3.3953

WEKA Algorithm: MULTILAYER PERCEPTRON. In order to predict all four attributes, WEKA used a multilayer perceptron whose hidden layer consists of two sigmoid function nodes and whose output layer consists of one linear function node.
Parameters:
MinTemp:
Linear Node 0: Inputs | Weights; Threshold | -0.08205495661629685; Node 1 | -1.1758383531184908; Node 2 | 1.0978109055429828
Sigmoid Node 1: Inputs | Weights; Threshold | -1.9328466264016315; Attrib MaxTemp | -1.1454180658985536; Attrib MinDew | -2.006464685228817; Attrib MaxDew | 0.07117407205306661
Sigmoid Node 2: Inputs | Weights; Threshold | -1.9992476530126; Attrib MaxTemp | 0.8647434601865871; Attrib MinDew | 1.5157897665198308; Attrib MaxDew | 1.382562284799789
MaxTemp:
Linear Node 0: Inputs | Weights; Threshold | 0.09643594174420236; Node 1 | -1.0994821411882159; Node 2 | 1.0422307884350526
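As a check on how the composite model is applied, the sketch below evaluates the M5P linear models listed above, each attribute being a weighted sum of the other three plus an intercept; the sample input values are hypothetical.

# Sketch evaluating the M5P smoothed linear models from the table above.
M5P_MODELS = {
    "MinTemp": ({"MaxTemp": 0.2614, "MinDew": 0.5068, "MaxDew": 0.1213}, 3.9795),
    "MaxTemp": ({"MinTemp": 0.6905, "MinDew": -0.1134, "MaxDew": 0.4507}, 15.6291),
    "MinDew":  ({"MinTemp": 0.6582, "MaxTemp": -0.0478, "MaxDew": 0.4558}, -12.6865),
    "MaxDew":  ({"MinTemp": 0.1503, "MaxTemp": 0.2882, "MinDew": 0.5625}, 3.3953),
}

def predict(attribute, inputs):
    weights, intercept = M5P_MODELS[attribute]
    return sum(weights[name] * inputs[name] for name in weights) + intercept

sample = {"MaxTemp": 30.0, "MinDew": 10.0, "MaxDew": 18.0}   # hypothetical day
print(round(predict("MinTemp", sample), 2))                  # 19.07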
For the tree based methods (REPTREE, Random Forest, Regression by Discretization, RANDC), the following decision trees were used:
(Decision tree figures for the MinTemp, MaxTemp, and MinDew attributes.)
REFERENCES
[1]
R. Sun, “Artificial Intelligence: Connectionist and Symbolic Approaches,” 2000.
[2]
D. Kahneman, Thinking, fast and slow. 2011.
[3]
K. M. M’Balé, “System, method and algorithm to Analyze Data Series to Create a Set of Rules (Patent Pending 62/500770),” 62/500770 (Provisional), 2017.
[4]
M. Heisenberg, “The Beauty of the Network in the Brain and the Origin of the Mind in the Control of Behavior.,” J. Neurogenet., vol. 28, no. April, pp. 1–11, 2014.
[5]
N. S. Chouhan, R. Wolf, C. Helfrich-Förster, and M. Heisenberg, “Flies Remember the Time of Day,” Curr. Biol., vol. 25, no. 12, pp. 1619–1624, 2015.
[6]
B. F. Skinner, “Are theories of learning necessary?,” Psychol. Rev., vol. 57, pp. 193–216, 1950.
[7]
J. B. Watson, “Psychology as the Behaviorist Views It,” Psychol. Rev., vol. 101, no. 2, pp. 248–253, 1994.
[8]
N. Chomsky, “A Review of B. F. Skinner’s Verbal Behavior,” in Readings in the Psychology of Language, L. A. Jakobovits and M. S. Miron, Eds. Englewood Cliffs, NJ: Prentice-Hall, 1967, pp. 142–143.
[9]
N. Chomsky, Language and Thought. Wakefield, RI: Moyer Bell, 1994.
[10]
N. Chomsky, Language and Mind, 3rd ed. Cambridge University Press, 2006.
[11]
B. Hale, “The last distinction? Talking to the animals,” Harper’s Magazine, 2012.
[12]
E. Wasserman, W. James, and D. Brooks, “Baboons And Pigeons Are Capable Of HigherLevel Cognition , Behavioral Studies Show,” Science Daily, pp. 11–13, 2009.
[13]
T. R. Zentall, J. Sutton, and L. Sherburne, "True Imitative Learning in Pigeons," Psychol. Sci., vol. 7, no. 8, pp. 343–346, 1996.
[14]
K. Brakke and S. Savage-Rumbaugh, “The development of language skills in bonobo and chimpanzee,” Lang. Commun., vol. 15, no. 2, pp. 121–148, 1995.
[15]
S. Savage-Rumbaugh, W. Fields, Par Segerdahl, and D. Rumbaugh, “Culture Prefigures Cognition in Pan / Homo Bonobos,” Theoria, vol. 4548, pp. 311–328, 2005.
[16]
T. R. Zentall, “Imitation: definitions, evidence and mechanisms,” Anim. Cogn., vol. 9, no. 4, pp. 335–353, 2006.
[17] T. R. Zentall, “Imitation by Animals : How Do They Do It ?,” Curr. Dir. Psychol. Sci., vol. 12, no. 3, pp. 91–94, 2003. [18]
C. M. Heyes, “Imitation, Culture and Cognition,” Anim. Behav., vol. 46, pp. 999–1010, 1993.
[19] K. Kaye and J. Markus, “Infant Imitation: The Sensory-Motor Agenda,” Dev. Psychol., vol. 17, no. 3, pp. 258–265, 1981. [20]
G. Rizzolatti and M. Fabbri-Destro, “The mirror system and its role in social cognition.,” Curr. Opin. Neurobiol., vol. 18, no. 2, pp. 179–84, Apr. 2008.
[21]
C. Steifel, “What your child learns by imitating you,” Parents Magazine, 2012. [Online]. Available: http://www.parents.com/toddlerspreschoolers/development/behavioral/learning-by-imitating-you/.
[22]
B. Rogoff, R. Paradise, R. M. Arauz, M. Correa-Chavez, and C. Angelillo, “Firsthand learning through intent participation.,” Annu. Rev. Psychol., vol. 54, pp. 175–203, Jan. 2003.
[23]
A. Bandura, Psychological Modeling: Conflicting Theories. Transaction Publishers, 2006.
[24]
J. J. Foxe and A. C. Snyder, “The Role of Alpha-Band Brain Oscillations as a Sensory Suppression Mechanism during Selective Attention.,” Front. Psychol., vol. 2, no. July, p. 154, Jan. 2011.
[25]
J. S. Takahashi, “Molecular Neurobiology and Genetics of Circadian Rhythms in Mammals,” Annu. Rev. Neurosci., vol. 18, no. 1, pp. 531–553, 1995.
[26]
R. Benca, M. J. Duncan, E. Frank, C. McClung, R. J. Nelson, and A. Vicentic, “Biological rhythms, higher brain function, and behavior: Gaps, opportunities, and challenges,” Brain Res. Rev., vol. 62, no. 1, pp. 57–70, 2009.
[27]
J. Plantinga and L. J. Trainor, “Melody recognition by two-month-old infants.,” J. Acoust. Soc. Am., vol. 125, no. 2, p. EL58-62, Feb. 2009.
[28] W. T. Keeton, “Orientation by pigeons: is the sun necessary?,” Science, vol. 165, no. 3896, pp. 922–8, Aug. 1969. [29]
G. G. Ramsay, “Noam Chomsky on Where Artificial Intelligence Went Wrong,” The Atlantic, pp. 1–20, 2012.
[30]
T. Eiter and M. Simkus, "FDNC: Decidable Nonmonotonic Disjunctive Logic Programs with Function Symbols," vol. 9, no. 9, pp. 1–45, 2007. [31]
A. Darwiche, “Bayesian networks,” Commun. ACM, vol. 53, no. 12, p. 80, Dec. 2010.
[32]
K. Kelland, “Scientists find way to map brain’s complexity,” Reuters, 2011. [Online]. Available: http://www.reuters.com/article/2011/04/10/us-brain-modelidUSTRE7392KU20110410.
[33]
T. J. Sejnowski, “The product of our neurons,” New Scientist, vol. 334, no. 6056, p. 1, 04Nov-2012.
[34]
S. Brenner and T. J. Sejnowski, “Understanding the human brain.,” Science, vol. 334, no. 6056, p. 567, Nov. 2011.
[35]
E. Underwood, "BRAIN project meets physics," Science, vol. 344, no. 6187, pp. 954–955, 2014.
[36] S. Kak, C. Donald, and E. T. Delaune, “Machines and Consciousness,” Ubiquity, New York, NY, pp. 1–16, Nov-2005. [37]
D. Vernon, G. Metta, and G. Sandini, “A Survey of Artificial Cognitive Systems : Implications for the Autonomous Development of Mental Capabilities in Computational Agents,” IEEE Trans. Evol. Comput., vol. 11, no. 2, pp. 151–180, 2007.
[38]
S. Baveja, “A Survey of Cognitive and Agent Architectures,” 1992. [Online]. Available: http://ai.eecs.umich.edu/cogarch0/index.html.
[39]
R. M. Rolfe and B. A. Haugh, "Integrated Cognition – A Proposed Definition of Ingredients, A Survey of Systems, and Example Architecture," Alexandria, VA, 2004.
[40]
D. Embrey and H. Lane, “Understanding Human Behaviour and Error The Skill , Rule and Knowledge Based Classification,” System, pp. 1–10, 1990.
[41] B. Goertzel, “How might the brain represent complex symbolic knowledge?,” Proc. Int. Jt. Conf. Neural Networks, pp. 2587–2591, 2014. [42]
M. J. E. Laird, C. Lebiere, and P. S. Rosenbloom, “A Standard Model of the Mind : Toward a Common Computational Framework across Artificial Intelligence, Cognitive Science, Neuroscience, and Robotics,” AI Res., 2017.
[43]
J. J. Elgot-Drapkin, D. Perlis, S. Kraus, M. Miller, and M. Nirkhe, “Active Logics: A Unified Formal Approach to Episodic Reasoning,” 1999.
[44]
A. M. Nuxoll and J. E. Laird, “Enhancing intelligent agents with episodic memory,” Cogn. Syst. Res., vol. 17–18, pp. 34–48, Jul. 2012.
[45]
A. M. Nuxoll, “Enhancing Intelligent Agents with Episodic Memory,” 2007.
[46] R. C. Schank, Dynamic Memory Revisited, 2nd ed. New York, NY: Cambridge Press, 1999. [47]
L. Geng and H. J. Hamilton, “ESRS : A Case Selection Algorithm Using Extended Similarity-based Rough Sets,” in 2002 IEEE International Conference on Data Mining, 2002. ICDM 2003., 2002, pp. 609–612.
[48]
Y.-J. Chen, Y.-M. Chen, and Y.-S. Su, “An Ontology-Based Distributed Case-Based Reasoning for Virtual Enterprises,” 2009 Int. Conf. Complex, Intell. Softw. Intensive Syst., pp. 128–135, Mar. 2009.
[49] F. F. Ingrand, A. S. Rao, and M. P. Georgeff, “An Architecture for Real-Time Reasoning and System Control 1 Introduction 2 Requirements for the Design of Situated Reasoning Systems,” IEEE Expert, vol. 7, no. 6, pp. 33–44, 1992. [50]
M. E. Bratman, D. J. Israel, and M. E. Pollack, “Plans and resource-bounded practical reasoning bratman.pdf,” Comput. Intell., vol. 4, no. 4, pp. 349–355, 1988.
[51]
J. E. Laird, “Online Determination of Value-Function Structure and Action-value Estimates for Reinforcement Learning in a Cognitive Architecture,” vol. 2, pp. 221–238, 2012.
[52]
B. Goertzel, “OpenCogPrime : A Cognitive Synergy Based Architecture for Artificial General Intelligence,” in IEEE International Conference on Cognitive Informatics, 2009, pp. 60–68.
[53]
B. Hayes-Roth, “Opportunistic control of action in intelligent agents,” IEEE Trans. Syst. Man. Cybern., vol. 23, no. 6, pp. 1575–1587, 1993.
[54]
M. T. Cox, T. Oates, and D. Perlis, “Toward an Integrated Metacognitive Architecture,” in Advances in Cognitive Systems: AAAI Fall 2011 Symposium, 2011, no. Papers from the 2011 AAAI Symposium (FS-11-01), pp. 74–81.
[55]
H. Haidarian, W. Dinalankara, S. Fults, S. Wilson, D. Perlis, M. D. Schmill, T. Oates, D. P. Josyula, and M. L. Anderson, “The Metacognitive Loop : An Architecture for Building Robust Intelligent Systems,” in AAAI Fall Symposium, no. ii, pp. 33–39.
[56]
D. P. Josyula, S. Fults, M. L. Anderson, S. Wilson, and D. Perlis, “Application of MCL in a Dialog Agent,” in Third Language and Technology Conference, 2007, no. Figure 1.
[57] D. P. Josyula, “A Unified Theory of Acting and Agency for A Universal Interfacing Agent,” 2005. [58]
G. Alexander, A. Raja, and D. J. Musliner, "Controlling Deliberation in a Markov Decision Process-Based Agent," in Proceedings of the 7th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2008), 2008, pp. 461–468. [59]
G. Alexander, A. Raja, E. H. Durfee, and D. J. Musliner, “Design Paradigms for MetaControl in Multiagent Systems,” in Proceedings of AAMAS 2007 Workshop on Metareasoning in Agent-based Systems, 2007, pp. 92–103.
[60] G. H. Lim, I. H. Suh, and H. Suh, “Ontology-Based Unified Robot Knowledge for Service Robots in Indoor Environments,” IEEE Trans. Syst. Man. Cybern., vol. 41, no. 3, pp. 492– 509, 2011. [61]
A. W. Moore and C. G. Atkeson, “Prioritized Sweeping: Reinforcement Learning with Less Data and Less Real Time,” Mach. Learn., pp. 103–130, 1993.
[62]
F. Heintz, J. Kvarnstrom, and P. Doherty, “Bridging the sense-reasoning gap: DyKnow Stream-based middleware for knowledge processing,” Adv. Eng. Informatics, vol. 24, no. 1, pp. 14–26, 2010.
[63]
S. R. Yussen, The Growth of Reflection in Children. New York, NY: Academic Press, 1985.
[64] H. F. Wellman, The Child’s Theory of Mind. Cambridge, MA: MIT Press, 1990. [65]
M. T. Cox, “Metacognition in computation: A selected research review,” Artif. Intell., vol. 169, no. 2, pp. 104–141, Dec. 2005.
[66] S. J. Russell and P. Norvig, Artificial Intelligence: A Modern Approach, 3e. Pearson Education Limited, 2014. [67]
M. T. Cox, “Perpetual Self-Aware Cognitive Agents,” AI Magazine, no. 2002, pp. 32–51, 2007.
[68] M. L. Anderson and T. Oates, “A review of recent research in metareasoning and metalearning,” AI Magazine, vol. 17604, pp. 1–17, 2007. [69] J. Zheng and M. C. Horsch, “A Decision Theoretic Meta-Reasoner for Constraint Optimization,” in Proceedings of the 18th Canadian Society conference on Advances in Artificial Intelligence, 2005, pp. 53–65. [70]
M. Nirkhe, “Time-Situated Reasoning within Tight Deadlines and Realistic Space and Computation Bounds,” 1994.
[71]
D. P. Josyula and K. M. M’Balé, “Bounded Metacognition,” in COGNITIVE 2013, 2013, pp. 147–152.
[72]
M. L. Anderson and D. Perlis, "Logic, Self-awareness and Self-improvement: the Metacognitive Loop and the Problem of Brittleness," J. Log. Comput., vol. 14, no. 4, pp. 1–20, 2004.
[73]
M. L. Anderson, T. Oates, W. Chong, and D. Perlis, “Enhancing reinforcement learning with metacognitive monitoring and control for improved perturbation tolerance.”
[74]
M. D. Schmill, M. L. Anderson, S. Fults, D. P. Josyula, T. Oates, D. Perlis, H. Shahri, S. Wilson, and D. Wright, “The Metacognitive Loop and Reasoning about Anomalies.” p. 17, 2011.
[75]
M. L. Anderson, T. Oates, W. Chong, and D. Perlis, “The metacognitive loop I: Enhancing reinforcement learning with metacognitive monitoring and control for improved perturbation tolerance,” J. Exp. Theor. Artif. Intell., vol. 18, no. 3, pp. 387–411, Sep. 2006.
[76]
K. Tsumori and S. Ozawa, “Incremental learning in dynamic environments using neural network with long-term memory,” Proc. Int. Jt. Conf. Neural Networks, 2003., vol. 4, pp. 2583–2588, 2003.
[77] D. Wright, “Finding a Temporal Comparison Function for the Metacognitive Loop,” 2011. [78]
D. P. Josyula, H. Vadali, B. J. Donahue, and F. C. Hughes, “Modeling metacognition for learning in artificial systems,” 2009 World Congr. Nat. Biol. Inspired Comput., pp. 1419– 1424, 2009.
[79]
Anodot, “Building a Large Scale , Machine Learning- Based Anomaly Detection System Part 2 : Learning the Normal Behavior of,” 2017.
[80]
R. Boné and H. Cardot, “Advanced Methods for Time Series Prediction Using Recurrent Neural Networks,” Recurr. Neural Networks Temporal Data …, no. 1, 2011.
[81]
M. Abou-nasr, “Time Series forecasting with Recurrent Neural Networks NN3 Competition,” in 2010 Time Series Forecasting Grand Competition for Computational Intelligence, 2010.
[82]
A. Bernal, S. Fok, and R. Pidaparthi, “Financial Market Time Series Prediction with Recurrent Neural Networks,” 2012.
[83]
B. Pecar, “Automating Time Series Analysis - A Case-based Reasoning Approach and Web Services,” in UK Academy for Information Systems Conference, 2003, pp. 1–12.
[84]
C. Giles, S. Lawrence, and A. Tsoi, “Noisy time series prediction using recurrent neural networks and grammatical inference,” Mach. Learn., pp. 1–31, 2001.
[85]
S. C. Prasad and P. Prasad, “Deep Recurrent Neural Networks for Time Series Prediction,” vol. 95070, pp. 1–54, 2014.
[86] H. Erik, “Improving Time Series Prediction using Recurrent Neural Networks and Evolutionary Algorithms,” Chalmers University of Technology, 2004.
[87]
U. Kurup, C. Lebiere, A. Stentz, and M. Hebert, “Using Expectations to Drive Cognitive Behavior,” in AAAI Conference on Artificial Intelligence, 2011, pp. 221–227.
[88]
G. P. Zhang and M. Qi, “Neural network forecasting for seasonal and trend time series,” Eur. J. Oper. Res., vol. 160, no. 2, pp. 501–514, 2005.
[89] T. Taskaya-Temizel and M. C. Casey, “A comparative study of autoregressive neural network hybrids.,” Neural Netw., vol. 18, no. 5–6, pp. 781–9, 2005. [90] M. T. Cox, T. Oates, M. Paisner, and D. Perlis, “Noting Anomalies in Streams of Symbolic Predicates Using A-Distance,” Adv. Cogn. Syst., vol. 2, pp. 167–184, 2012. [91]
M. Cryan, “Inf1A : Probabilistic Finite State Machines and Hidden Markov Models,” pp. 44–49, 2004.
[92]
G. Bergmann, A. Ökrös, I. Ráth, D. Varró, and G. Varró, “Incremental pattern matching in the viatra model transformation system,” Proc. third Int. Work. Graph Model Transform. - GRaMoT ’08, p. 25, 2008.
[93]
B. Berstel, “Extending the RETE algorithm for event management,” Proc. Int. Work. Temporal Represent. Reason., vol. 2002–Janua, pp. 49–51, 2002.
[94] M. Schor, T. Daly, H. Lee, and B. Tibbitts, “Advances in RETE pattern matching,” … Fifth Natl. Conf. …, pp. 226–232, 1986. [95]
K. Walzer, T. Breddin, and M. Groch, “Relative temporal constraints in the Rete algorithm for complex event detection,” Proc. Second …, pp. 147–155, 2008.
[96] D. Zhou, Y. Fu, S. Zhong, and R. Zhao, “The Rete algorithm improvement and implementation,” Proc. Int. Conf. Inf. Manag. Int. Conf. Inf. Manag. Innov. Manag. Ind. Eng. ICIII 2008, vol. 1, pp. 426–429, 2008. [97] S. Kumar and E. H. Spafford, “A Pattern Matching Model for Misuse Intrusion Detection,” Proc. 17th Natl. Comput. Secur. Conf., pp. 11–21, 1994. [98] Y. Labrou, T. Finin, and Y. Peng, “Agent communication languages: the current landscape,” Intell. Syst. their Appl., vol. 14, no. 2, pp. 45–52, 1999. [99] FIPA, “FIPA Ontology Service Specification,” 2001. [100] FIPA, FIPA ACL Message Structure Specification. 2002. [101] FIPA, FIPA Communicative Act Library Specification. 2002. [102] L. Guoqiang and F. Qibin, “XML-based agent communications for plant automation,” in IEEE International Workshop on Factory Communication Systems, 2004. Proceedings., 206
2004, pp. 301–304. [103] S. A. Moore, “KQML & FLBC : Contrasting Agent Communication Languages,” in Proceedings of the 32nd Annual Hawaii International Conference on on Systems Sciences, 1999, 1999, vol. Track 6, no. c, pp. 1–10. [104] V. Vasudevan, “Comparing Agent Communication Languages Speech Act Theory : In Brief,” 2002. [105] X. Liu, Y. Shi, H. Wu, and Z. Ye, “Research and application of Agent Communication Language Extended XML,” 2011 Seventh Int. Conf. Nat. Comput., pp. 1309–1313, Jul. 2011. [106] X. Luo, M. Zou, and L. Luo, “A modeling and verification method to multi-agent systems based on KQML,” in 2012 IEEE Symposium on Electrical & Electronics Engineering (EEESYM), 2012, pp. 690–693. [107] M. A. Nazir Raja, H. Farooq Ahmad, H. Suguri, P. Bloodsworth, and N. Khalid, “SOA compliant FIPA agent communication language,” 2008 First Int. Conf. Appl. Digit. Inf. Web Technol., pp. 470–477, Aug. 2008. [108] A. Newell, Unified Theories of Cognition. Cambridge, MA: Harvard University Press, 1990. [109] M. Minsky, Society of Mind. New York, NY: Simon & Shuster, 1986. [110] K. M. M’Balé and D. P. Josyula, “Encoding seasonal patterns using the Kasai algorithm,” Artif. Intell. Res., vol. 6, no. 2, p. 13, 2017. [111] R. Bruno, B., Mastrogiovanni, F., Sgorbissa, A., Vernazza, T., Zaccaria, “Analysis of human behavior recognition algorithms based on acceleration data,” in IEEE Int Conf on Robotics and Automation (ICRA), 2013, pp. 1602--1607. [112] J. Schmidhuber, “Deep Learning in neural networks: An overview,” Neural Networks, vol. 61, pp. 85–117, 2015. [113] E. Tulving, “Episodic memory and autonoesis: Uniquely human?,” in The Missing Link in Cognition, H. S. Terrace and J. Metcalfe, Eds. New York, NY: Oxford University Press, 2005, pp. 4–56. [114] E. Koechlin, C. Ody, and F. Kouneiher, “The architecture of cognitive control in the human prefrontal cortex.,” Science, vol. 302, no. 5648, pp. 1181–5, Nov. 2003. [115] K. Wagstaff, C. Cardie, S. Rogers, and S. Schroedl, “Constrained K-means Clustering with Background Knowledge,” in Proceedings of the Eighteenth International Conference on Machine Learning, 2001, 2001, pp. 577–584.
[116] R. L. Cannon, J. V. Dave, and J. C. Bezdek, "Efficient Implementation of the Fuzzy c-Means Clustering Algorithms," IEEE Trans. Pattern Anal. Mach. Intell., vol. 8, no. 2, pp. 248–255, Feb. 1986. [117] A. Singhal, T. Winograd, and K. Scarfone, "Guide to Secure Web Services," NIST Special Publication 800-95, 2007. [118] K. M. M'Balé, D. P. Josyula, and S. Sharma, "Topic-Based Service Integration in Software Systems," in 22nd International Conference on Software Engineering and Data Engineering, 2013, pp. 3–8.