Featured Research

Artificial Intelligence

Ordinal Monte Carlo Tree Search

In many problem settings, most notably in game playing, an agent receives a possibly delayed reward for its actions. Often, those rewards are handcrafted rather than naturally given. Even simple terminal-only rewards, such as winning equals one and losing equals minus one, cannot be seen as unbiased, since these values are chosen arbitrarily and the behavior of the learner may change under different encodings. It is hard to argue about good rewards, and the performance of an agent often depends on the design of the reward signal. In particular, in domains where states by nature only admit an ordinal ranking and no meaningful distance information between game state values is available, a numerical reward signal is necessarily biased. In this paper we take a look at Monte Carlo Tree Search (MCTS), a popular algorithm for solving Markov decision processes (MDPs), highlight a recurring problem concerning its use of rewards, and show that an ordinal treatment of the rewards overcomes this problem. Using the General Video Game Playing framework, we show that our newly proposed ordinal MCTS algorithm, which is based on a novel bandit algorithm that we also introduce and evaluate against UCB, dominates other MCTS variants.
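
As a rough illustration of the distinction the abstract draws, the sketch below contrasts standard UCB1 arm selection, which averages numerical rewards, with a simple ordinal alternative that only uses pairwise comparisons between outcomes (a Borda-style win rate). This is a minimal sketch for intuition, not the paper's algorithm; the names ucb1_select and ordinal_select and the Borda-style statistic are assumptions.

```python
import math

def ucb1_select(counts, sums, total, c=math.sqrt(2)):
    """Standard UCB1: relies on numerical rewards (here: their running sums)."""
    def score(i):
        if counts[i] == 0:
            return float("inf")  # try every arm at least once
        mean = sums[i] / counts[i]
        return mean + c * math.sqrt(math.log(total) / counts[i])
    return max(range(len(counts)), key=score)

def ordinal_select(counts, outcomes, total, c=math.sqrt(2)):
    """Ordinal variant (illustrative): replace the mean reward with a
    Borda-style statistic -- the probability that a random outcome of this
    arm beats a random outcome of any other arm. Only the ordering of
    outcomes is used, never their numerical differences."""
    def borda(i):
        others = [o for j, os in enumerate(outcomes) if j != i for o in os]
        if not outcomes[i] or not others:
            return 0.5
        wins = sum(mine > other for mine in outcomes[i] for other in others)
        return wins / (len(outcomes[i]) * len(others))
    def score(i):
        if counts[i] == 0:
            return float("inf")
        return borda(i) + c * math.sqrt(math.log(total) / counts[i])
    return max(range(len(counts)), key=score)
```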

Artificial Intelligence

Ordinal relative belief entropy

Specially customised entropies are widely applied to measure the degree of uncertainty in a frame of discernment. However, all of these entropies regard the frame as a whole that has already been determined, which does not conform to actual situations. In real life, things come in an order, so how to measure the uncertainty of the dynamic process of determining the sequence of propositions contained in a frame of discernment is still an open issue, and no related research has been conducted. Therefore, this paper proposes a novel ordinal entropy that measures the uncertainty of a frame of discernment while taking into account the order in which propositions are confirmed. Compared with traditional entropies, it captures the effect that the order of propositions in a frame of discernment has on the degree of uncertainty. Besides, some numerical examples are provided to verify the correctness and validity of the proposed entropy.
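
For context, the classical baseline that such work builds on is an entropy over a basic probability assignment (mass function) on a frame of discernment, which ignores any ordering of the propositions. The sketch below computes that order-blind Shannon-style entropy; it is not the ordinal entropy proposed in the paper, and the function name and example masses are assumptions.

```python
import math

def mass_entropy(masses):
    """Shannon-style entropy of a basic probability assignment on a frame
    of discernment. It treats the frame as a fixed whole: permuting the
    propositions leaves the result unchanged, which is exactly the
    limitation the abstract points out."""
    assert abs(sum(masses.values()) - 1.0) < 1e-9, "masses must sum to 1"
    return -sum(m * math.log2(m) for m in masses.values() if m > 0)

# Example: a frame {A, B} with some mass on the composite proposition {A, B}.
print(mass_entropy({"A": 0.5, "B": 0.3, "AB": 0.2}))
```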

Artificial Intelligence

PYCSP3: Modeling Combinatorial Constrained Problems in Python

In this document, we introduce PyCSP3, a Python library that allows us to write models of combinatorial constrained problems in a simple and declarative way. Currently, with PyCSP3, you can write models of constraint satisfaction and optimization problems. More specifically, you can build CSP (Constraint Satisfaction Problem) and COP (Constraint Optimization Problem) models. Importantly, there is a complete separation between the modeling and solving phases: you write a model, you compile it (while providing some data) in order to generate an XCSP3 instance (file), and you solve that problem instance by means of a constraint solver. In this document, you will find all that you need to know about PyCSP3, with more than 40 illustrative models.
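
To give a feel for the declarative style described above, here is a small model in PyCSP3's documented idiom, the classic n-queens problem (this particular model is a sketch assembled from the library's public API, not an excerpt from the document):

```python
from pycsp3 import *

n = 8  # board size
# q[i] = column of the queen placed on row i
q = VarArray(size=n, dom=range(n))

satisfy(
    AllDifferent(q),                           # one queen per column
    AllDifferent(q[i] + i for i in range(n)),  # distinct ascending diagonals
    AllDifferent(q[i] - i for i in range(n)),  # distinct descending diagonals
)
```

Running such a script through the PyCSP3 compiler produces an XCSP3 instance file, which any XCSP3-aware constraint solver can then be asked to solve; this is the modeling/solving separation the abstract emphasizes.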

Artificial Intelligence

Paraconsistent Foundations for Quantum Probability

It is argued that a fuzzy version of 4-truth-valued paraconsistent logic (with truth values corresponding to True, False, Both and Neither) can be approximately isomorphically mapped into the complex-number algebra of quantum probabilities. That is, p-bits (paraconsistent bits) can be transformed into close approximations of qubits. The approximation error can be made arbitrarily small, at least in a formal sense, and can be related to the degree of irreducible "evidential error" assumed to plague an observer's observations. This logical correspondence manifests itself in program space via an approximate mapping between probabilistic and quantum types in programming languages.
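
The abstract does not spell out the mapping, but a deliberately naive sketch can make the idea of "p-bits to qubits" concrete: represent a p-bit by independent truth and falsity degrees (so Both and Neither correspond to t + f above or below 1), and normalize them into qubit amplitudes. Everything below, including the function pbit_to_qubit and the square-root normalization, is a hypothetical illustration, not the paper's construction.

```python
import math

def pbit_to_qubit(t, f, eps=1e-12):
    """Hypothetical illustration only: map a fuzzy paraconsistent bit, given
    by independent truth degree t and falsity degree f in [0, 1], to a
    normalized pair of real qubit amplitudes (alpha, beta) with
    alpha**2 + beta**2 = 1. 'Both' is t = f = 1, 'Neither' is t = f = 0."""
    if t + f < eps:  # 'Neither': no evidence either way -> equal superposition
        return 1 / math.sqrt(2), 1 / math.sqrt(2)
    norm = math.sqrt(t + f)
    return math.sqrt(t) / norm, math.sqrt(f) / norm

print(pbit_to_qubit(1.0, 1.0))  # 'Both' -> equal superposition
print(pbit_to_qubit(0.9, 0.1))  # mostly-true p-bit
```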

Artificial Intelligence

Patterns of Cognition: Cognitive Algorithms as Galois Connections Fulfilled by Chronomorphisms On Probabilistically Typed Metagraphs

It is argued that a broad class of AGI-relevant algorithms can be expressed in a common formal framework, via specifying Galois connections linking search and optimization processes on directed metagraphs whose edge targets are labeled with probabilistic dependent types, and then showing these connections are fulfilled by processes involving metagraph chronomorphisms. Examples are drawn from the core cognitive algorithms used in the OpenCog AGI framework: probabilistic logical inference, evolutionary program learning, pattern mining, agglomerative clustering, and nonlinear-dynamical attention allocation. The analysis presented involves representing these cognitive algorithms as recursive discrete decision processes involving optimizing functions defined over metagraphs, in which the key decisions involve sampling from probability distributions over metagraphs and enacting sets of combinatory operations on selected sub-metagraphs. The mutual associativity of the combinatory operations involved in a cognitive process is shown to often play a key role in enabling the decomposition of the process into folding and unfolding operations, a conclusion that has some practical implications for the particulars of cognitive processes, e.g., militating toward the use of reversible logic and reversible program execution. It is also observed that where this mutual associativity holds, there is an alignment between the hierarchy of subgoals used in recursive decision process execution and a hierarchy of subpatterns definable in terms of formal pattern theory.
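
The folding and unfolding operations mentioned above are the recursion schemes known as catamorphisms and anamorphisms. As a toy reference point only (the paper works with chronomorphisms over probabilistically typed metagraphs, not lists), the sketch below shows an unfold that grows a structure from a seed and a fold that collapses it; the function names are assumptions.

```python
from typing import Callable, List, Optional, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")

def unfold(step: Callable[[B], Optional[Tuple[A, B]]], seed: B) -> List[A]:
    """Anamorphism: repeatedly apply `step` to a seed, emitting one element
    per application, until `step` returns None."""
    out: List[A] = []
    state = seed
    while (nxt := step(state)) is not None:
        elem, state = nxt
        out.append(elem)
    return out

def fold(combine: Callable[[A, B], B], init: B, xs: List[A]) -> B:
    """Catamorphism: collapse the structure produced by an unfold."""
    acc = init
    for x in reversed(xs):
        acc = combine(x, acc)
    return acc

# Grow the list [5, 4, 3, 2, 1] from the seed 5, then fold it into a sum.
countdown = unfold(lambda n: (n, n - 1) if n > 0 else None, 5)
print(fold(lambda x, acc: x + acc, 0, countdown))  # 15
```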

Artificial Intelligence

Persistent And Scalable JADE: A Cloud based InMemory Multi-agent Framework

Multi-agent systems are often limited in terms of persistence and scalability. This issue is more prevalent for applications in which agent states change frequently, which makes the existing methods less usable, as they increase the agent's complexity and are less scalable. This research study presents a novel in-memory agent persistence framework. Two prototypes have been implemented, one using the proposed solution and the other using an established agent persistency environment. Experimental results confirm that the proposed framework is more scalable than existing approaches whilst providing a similar level of persistency. These findings will help future real-time multi-agent systems to become scalable and persistent in a dynamic cloud environment.

Artificial Intelligence

Persistent Rule-based Interactive Reinforcement Learning

Interactive reinforcement learning has allowed speeding up the learning process in autonomous agents by including a human trainer who provides extra information to the agent in real-time. Current interactive reinforcement learning research has been limited to interactions that offer advice relevant to the current state only. Additionally, the information provided by each interaction is not retained, but instead discarded by the agent after a single use. In this work, we propose a persistent rule-based interactive reinforcement learning approach, i.e., a method for retaining and reusing provided knowledge, allowing trainers to give general advice relevant to more than just the current state. Our experimental results show that persistent advice substantially improves the performance of the agent while reducing the number of interactions required from the trainer. Moreover, rule-based advice shows a similar performance impact to state-based advice, but with a substantially reduced interaction count.
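
One way to picture the retain-and-reuse idea: instead of consuming a piece of advice once, the agent stores it as a condition-action rule and consults the rule store on every subsequent step before falling back to its own policy. The sketch below is a minimal, generic illustration under that reading; the RuleAdvisor class and its methods are assumptions, not the paper's implementation.

```python
import random
from typing import Callable, List, Optional, Tuple

State = tuple
Action = int
Rule = Tuple[Callable[[State], bool], Action]  # (condition over states, advised action)

class RuleAdvisor:
    """Persistent rule store: advice survives beyond the state it was given in."""

    def __init__(self) -> None:
        self.rules: List[Rule] = []

    def add_rule(self, condition: Callable[[State], bool], action: Action) -> None:
        # One interaction from the trainer yields a reusable rule,
        # e.g. "whenever the goal is to your left, move left".
        self.rules.append((condition, action))

    def advise(self, state: State) -> Optional[Action]:
        for condition, action in self.rules:
            if condition(state):
                return action
        return None  # no rule applies; the agent falls back to its policy

def act(advisor: RuleAdvisor, policy: Callable[[State], Action], state: State) -> Action:
    advice = advisor.advise(state)
    return advice if advice is not None else policy(state)

# Usage: a single piece of general advice now covers many future states.
advisor = RuleAdvisor()
advisor.add_rule(lambda s: s[0] > s[1], action=0)  # hypothetical rule
print(act(advisor, lambda s: random.choice([0, 1]), (3, 1)))  # -> 0
```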

Artificial Intelligence

Perspective: Purposeful Failure in Artificial Life and Artificial Intelligence

Complex systems fail. I argue that failures can be a blueprint characterizing living organisms and biological intelligence, a control mechanism to increase complexity in evolutionary simulations, and an alternative to classical fitness optimization. Imitating biological successes in Artificial Life and Artificial Intelligence can be misleading; imitating failures offers a path towards understanding and emulating life in artificial systems.

Artificial Intelligence

Physical Reasoning Using Dynamics-Aware Models

A common approach to solving physical-reasoning tasks is to train a value learner on example tasks. A limitation of such an approach is that it requires learning about object dynamics solely from reward values assigned to the final state of a rollout of the environment. This study aims to address this limitation by augmenting the reward value with additional supervisory signals about object dynamics. Specifically, we define a distance measure between the trajectories of two target objects, and use this distance measure to characterize the similarity of two environment rollouts. We train the model to correctly rank rollouts according to this measure in addition to predicting the correct reward. Empirically, we find that this approach leads to substantial performance improvements on the PHYRE benchmark for physical reasoning: our approach obtains a new state-of-the-art on that benchmark.
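
A plausible concrete reading of the two ingredients above, a trajectory distance and a ranking objective, is sketched below: mean per-timestep Euclidean distance between object positions, and a pairwise margin ranking loss over rollout scores. These choices (the function names traj_distance and ranking_loss, the mean-L2 metric, and the margin form) are assumptions for illustration, not the paper's exact definitions.

```python
import numpy as np

def traj_distance(traj_a: np.ndarray, traj_b: np.ndarray) -> float:
    """Mean Euclidean distance between two object trajectories,
    each of shape (T, 2) for T timesteps of 2-D positions."""
    return float(np.linalg.norm(traj_a - traj_b, axis=1).mean())

def ranking_loss(score_i: float, score_j: float, dist_i: float, dist_j: float,
                 margin: float = 1.0) -> float:
    """Pairwise margin loss: if rollout i is closer to the target trajectory
    than rollout j (dist_i < dist_j), its predicted score should be higher."""
    sign = 1.0 if dist_i < dist_j else -1.0
    return max(0.0, margin - sign * (score_i - score_j))

# Toy usage: two 3-step rollouts compared against a target trajectory.
target = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
roll_i = np.array([[0.0, 0.1], [1.0, 0.1], [2.0, 0.1]])
roll_j = np.array([[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]])
d_i, d_j = traj_distance(roll_i, target), traj_distance(roll_j, target)
print(ranking_loss(score_i=0.8, score_j=0.2, dist_i=d_i, dist_j=d_j))
```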

Artificial Intelligence

Physically Embedded Planning Problems: New Challenges for Reinforcement Learning

Recent work in deep reinforcement learning (RL) has produced algorithms capable of mastering challenging games such as Go, chess, or shogi. In these works the RL agent directly observes the natural state of the game and controls that state directly with its actions. However, when humans play such games, they do not just reason about the moves but also interact with their physical environment. They understand the state of the game by looking at the physical board in front of them and modify it by manipulating pieces using touch and fine-grained motor control. Mastering complicated physical systems with abstract goals is a central challenge for artificial intelligence, but it remains out of reach for existing RL algorithms. To encourage progress towards this goal we introduce a set of physically embedded planning problems and make them publicly available. We embed challenging symbolic tasks (Sokoban, tic-tac-toe, and Go) in a physics engine to produce a set of tasks that require perception, reasoning, and motor control over long time horizons. Although existing RL algorithms can tackle the symbolic versions of these tasks, we find that they struggle to master even the simplest of their physically embedded counterparts. As a first step towards characterizing the space of solutions to these tasks, we introduce a strong baseline that uses a pre-trained expert game player to provide hints in the abstract space to an RL agent's policy while training it on the full sensorimotor control task. The resulting agent solves many of the tasks, underlining the need for methods that bridge the gap between abstract planning and embodied control. See the illustrative video at this https URL.
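
The baseline described above combines standard RL with guidance from a pre-trained abstract-level expert. A generic way such hint-following can be wired in is an auxiliary imitation term added to the policy loss; the sketch below shows only that shape, with combined_loss, hint_loss, and the cross-entropy hint term all being assumptions rather than the paper's method.

```python
import numpy as np

def hint_loss(policy_probs: np.ndarray, expert_action: int) -> float:
    """Cross-entropy pushing the policy toward the expert's abstract-space
    hint (e.g. which board move the pre-trained game player would make)."""
    return -float(np.log(policy_probs[expert_action] + 1e-12))

def combined_loss(rl_loss: float, policy_probs: np.ndarray,
                  expert_action: int, hint_weight: float = 0.5) -> float:
    """RL objective plus a weighted imitation-of-hints term."""
    return rl_loss + hint_weight * hint_loss(policy_probs, expert_action)

# Toy usage: 4 abstract actions, the expert hints at action 2.
probs = np.array([0.1, 0.2, 0.6, 0.1])
print(combined_loss(rl_loss=1.3, policy_probs=probs, expert_action=2))
```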

