Archive for March, 2007
Turing Test
Somehow I bet when Turning designed his test for artificial intelligence he never thought there’d also be a simple computer test to see if you are a human.
Turing’s test involves putting a human and a computer behind a screen. A judge then communicates by email, text or some other method that does not give away whether he is speaking to a machine or human, with the machine and the human.
He asks what ever questions he likes of both. If he can not tell which is the machine and which is the human, then the computer passes the Turing test.
The Loebner Prize exists today to find the best imitator of human conversation. It is the first formal prize for passing a Turning-like test.
Now to tell humans from spam and other bots on the internet we test to see if you are human by having you type some warped text into a box. To prove you can do what the computer can not.
More information:
Chatbots.org ( Meet Suzette winner of Loebner )
AI Researchers think ‘Rascals’ Can Pass Turing Test
Child-like intelligence created in Second Life
Game Theory
Game theory is the study of how decision makers will act and react in various situations like negotiating business deals. It is used quite a bit in the study of economics and politics. This is the field that has recently gained some fame with the ‘A Beautiful Mind’ book and movie about John Nash. Much of game theory is built on the Nash Equilibrium. John Von Neumann laid much of the ground work for game theory. Using simplified models, often based only a few rules, many behaviors of people in various situations can be predicted.In these game models it is assumed that the players will make rational choices in each decision. Rational choice is defined as: ‘the action chosen by a decision maker is at least as good as every other available action’. A player is presented with a set of actions from which she chooses one. She may prefer one action over another or may consider some actions to be equally preferable. The only restriction on the actions preferences is that if a player prefers action A over action B, and she prefers action B over actions C then she must also prefer action A over C. A payoff function is used to describe the benefit of each action from the player’s point of view. For example I may visit a used car lot and find a car that is worth 1,000 by my valuation, you may value that car at 500. So the payoff function for the car for me is 1,000 for you it is 500. If I see another car I like that I value at 250, it only means I prefer the first car to the second car. It does not mean that I prefer it 4 times as much. The values are only to show ordering of choices.
The Nash Equilibrium is the place where each player can do no better, no matter what the other person decides to do.
In the well known game of ‘Prisoner’s Dilemma’ we have two crooks who worked together on a crime and have each been caught and are being held separately from each other. They each have a choice of finking or remaining quiet. If one finks, he walks with 1 years time in jail, the other person gets 4 years in jail. If neither finks there is enough evidence to put each away for 2 years time.
| Crook | quiet | fink |
| 2 | 2,2 | 0,3 |
| 1 | 3,0 | 1,1 |
The highest score of all the plays is for both crooks remain quiet and each receives 2 years jail time. But if Crook one finks, he gets a score of 3 (1 years time ) against the other player who remains quiet or a score of 1(and gets 3 years jail time) So his best bet is to fink, as is the other crooks. The Nash equilibrium is at fink/fink (1,1) since finking is the best move for each player individually.
The payoff function for this game is the same for each player and is: f(Fink, Quiet) f(Quiet, Quiet) f(Fink, Fink) f(Quiet, Fink) So we could as easily score it 3, 2, 1, 0 rather than counting years out or 32, 21, 9, 1 the score only serves to order the choices. A clearer example is a game in which we have 2 players moving pieces on a 3-d game board. Each player can move in the x, y, or z direction.
| X1 | Y1 | Z1 | |
| X2 | 2*,1* | 0,1* | 0,0 |
| Y2 | 0,0 | 2*,1 | 1*,2* |
| Z2 | 1,2* | 1,0 | 0,1 |
The * is the best move for each player considering what the other player does. If player 1 moves in the Y direction player 2′s best move is also in the Y direction (2*,1) The squares with both players have *s are X1 X2 and Z1 Y 2 . These are both Nash equilibriums. Finding the Nash Equilibrium this way is called ‘Best Response’
Suppose instead of set numbers I have a function that describes the payoff for each player. I could have A(x) = y 2 andB(y) = 1 2 x + xy then to find the Nash equilibrium I take the derivative of each function, set it to zero, solve and plot. Any and all places the functions cross on the plot are Nash equilibriums.
It is not possible, even for most simple games, to search all the possible routes the game can take. Too much time and hardware would be needed. For example tic-tac-toe is one of the simplest games we all know. A game tree that mapped all possible moves from start to finish would be 9! or 362,880 nodes large. The first player would have 9 choices of which box to play in, the second 8 choices since the first player had taken one, the first player’s second move would have 7 choices, etc. So the top level of nodes would have 9 choices. Each level in the tree represents a turn in the game. The second level would have 8 nodes off of each of the original 9 nodes and so on. So you can imagine what chess or other more complicated games have as the number of possible moves. All games and all competitions can be represented by trees.
Pruning is used to take sections off of the search tree that make no difference to play. Heuristic (rule of thumb) evaluations allow approximations to save search time. For instance in the tic-tac-toe tree described above once the first player chooses a position to play then the other 8 nodes of the top layer can be trimmed off and only the 8 trees under that node need to be searched.
Since it is not usually practical to calculate each possible outcome a cut off is usually put in place. As an example look at a chess game. For each board in play we can calculate the advantage by adding up the point value of the pieces on the board or adding points for position. Then the program can see which of those gives the program a higher score. Then the program need only calculate five or so moves ahead, calculate the advantage at each node and choose the best path. Rather than calculate ahead a set number of moves, the program can use an iterative deepening approach and calculate until time runs out. A quiescent search restricts the above approach. This eliminates moves that are likely to cause wild swings in the score. The horizon problem occurs when searches do not look ahead to the end of the game. This is a current unsolved problem in game programming.
The min max algorithm assumes a ‘zero sum game’, such as tic-tac-toe where what is good for one player is bad for the other player. This algorithm assumes that both players will play perfectly and attempt to maximize their scores. The algorithm only generates the trees on the nodes that are likely to be played. Max is the computer, min is the opposing player. It is assumed max will get first turn.
-generate entire game tree down to the maximum level to check generate each terminal state value, high values are most beneficial to max, negative values are most beneficial to min, zero holds no advantage for either player.
-go up one level, give the node above the previous layer the best score from the prior layer
-continue up the tree one level at a time until top is reached
-pick the node with the highest score.
The Alpha-Beta method determines whether an evaluation should be made of the top node by the min-max algorithm. It searches all of the nodes, like min max, then eliminates (prunes) those that are never going to reached. The program begins by proceeding with the min-max algorithm systematically through the nodes of a tree. First we go down a branch of the tree and calculate the score for that node. Then we proceed down the next branch. If the score at one of the leaves is lower than the score obtained in a previous branch of the tree we don’t finish evaluating all the nodes of the branch, rather we move onto the next branch. The search can be shallow rather than deep saving time. Further gains in speed can be made by caching the information from branches in a look up table, re-ordering results, extending some and shortening other searches, or using probabilities rather than actual numbers for cutoffs and using parallel algorithms.
Ordering may be used to save time as well. In chess captures would be considered first, followed by forward moves, followed by backward moves. Or, ordering can consider the nodes with the highest values first.
The program must try to find a winning strategy that does not depend on the human user’s moves. Humans often make small goals and consider moves that work toward that goal, i.e. capture the queen. David Wilkins Paradise is the only program so far to do this successfully. Another approach is to use book learning. Several boards are loaded into a table in memory and if the same board comes into play the computer can look up what to do from there. The Monte Carlo simulation has been used successfully in games with non-deterministic information, such as; Scrabble, dice, and card games.
Temporal-difference learning is derived from Samuel’s machine learning re search. Several games are played out and kept in a database. This works well with board games like backgammon, chess and checkers. Neural nets can be trained to play games this way, TD Gammon being one the more famous ones.
More information:
Game Theory
Martin J Osborne
Gam3r 7h30ry
Distinguishing the opponents in the prisoner dilemma in well-mixed populations
Towards Network Games with Social Preferences
An introduction to quantum game theory
Games on social networks: On a problem posed by Goyal
Backward Induction
Are mathematics the new artificial intelligence?
Backward Induction
Hopfield Networks
John Hopfield, in the late 1970′s, brought us these networks. These networks can be generalized and are robust. These networks can also be described mathematically. On the downside they can only store 15% as many states as they have neurodes, and the patterns stored must have Hamming distances that are about 50% of the number of neurodes.
Hopfield networks, aka crossbar systems, are a networks that recall what is fed into them. This makes it useful for restoring degraded images. It is a fully connected net, every node is connected to every other node. The nodes are not connected to themselves.
Calculating the weight matrix for a Hopfield network is easy. This is an example with 3 input vectors. You can train the network to match any number of vectors provided that they are orthogonal.
Orthogonal vectors are vectors that give zero when you calculate the dot product.
orthogonal (0, 0, 0, 1) (1, 1, 1, 0) => 0*1 + 0*1 + 0*1 + 1*0 = 0
orthogonal (1, 0, 1, 0) (0, 1, 0, 1) => 1*0 + 0*1 + 1*0 + 0*1 = 0
NOT orthogonal (0, 0, 0, 1) (0, 1, 0, 1) = 0*0 + 0*1 + 0*0 + 1*1 = 1
Orthogonal vectors are perpendicular to each other.
To calculate the weight matrix for the orthogonal vectors (0, 1, 0, 0), (1, 0, 1, 0), (0, 0, 0, 1)
first make sure all the vectors are orthogonal
(0, 1, 0, 0) (1, 0, 1, 0) => 0*1 + 1*0 + 0*1 + 0*0 = 0
(0, 1, 0, 0) (0, 0, 0, 1) => 0*0 + 1*0 + 0*0 + 0*1 = 0
(1, 0, 1, 0) (0, 0, 0, 1) => 1*0 + 1*0 + 1*0 + 0*1 = 0
Change the zeros to negative ones in each vector
(0, 1, 0, 0) => (-1, 1, -1, -1)
(1, 0, 1, 0) => (1, -1, 1, -1)
(0, 0, 0, 1) => (-1, -1, -1, 1)
Multiply each matrix by itself
(-1) (-1, 1, -1, -1)
[ 1, -1, 1, 1]
( 1)
[-1, 1, -1, -1]
(-1)
[ 1, -1, 1, 1]
(-1)
[ 1, -1, 1, 1]
( 1) (1, -1, 1, -1)
[ 1, -1, 1, -1]
(-1)
[-1, 1, -1, 1]
( 1)
[ 1, -1, 1, -1]
(-1)
[-1, 1, -1, 1]
(-1) (-1, -1, -1, 1)
[ 1, 1, 1,-1]
(-1)
[ 1, 1, 1,-1]
(-1)
[ 1, 1, 1,-1]
( 1)
[-1,-1,-1,-1]
The third step is to put zeros on the main diagonal of each matrix and add them together. (Putting zeros on the main diagonal keeps each node from being connected to itself.
( 0,-1, 1, 1)
( 0,-1, 1,-1)
(0, 1, 1,-1)
(-1, 0,-1,-1)
+
(-1, 0,-1, 1)
+
(1, 0, 1,-1)
( 1,-1, 0, 1)
( 1,-1, 0,-1)
(1, 1, 0,-1)
( 1,-1, 1, 1)
(-1, 1,-1, 0)
(1, 1, 1, 0)
The resulting matrix is:
[ 0,-1, 3,-1]
[-1, 0,-1,-1]
[ 3,-1, 0,-1]
[ 1, 1, 0, 1]
The Hopfield network is fully connected, each weight connects to every other weight
[n1] -> [n2] => weight is -1
[n1] -> [n3] => weight is 3
[n1] -> [n4] => weight is -1
[n2] -> [n1] => weight is -1
[n2] -> [n3] => weight is -1
[n2] -> [n4] => weight is -1
[n3] -> [n1] => weight is 3
[n3] -> [n2] => weight is -1
[n3] -> [n4] => weight is -1
[n4] -> [n1] => weight is 1
[n4] -> [n2] => weight is 1
[n4] -> [n3] => weight is 1
They can also be described as having a potential energy surface with conical holes representing the data. Holes having similar depth and diameter represent data with similar properties. The input data seeks the lowest potential energy and settles in to the closest hole. The energy surfaces of these networks are mathematically equivalent to that of ‘spin glasses’.
Some problems with these neural nets are they are computationally intensive, use lots of memory, and although I haven’t seen it mentioned I would guess race conditions may present a problem since data is updated continuously at each node with the output from one becoming the input for another.
BAM, bidirectional associative memory is an example of a Hopfield network. It consists of two fully connected layers, one for input and one for output. The nodes in each layer do not have connections to other nodes in the same layer. The weights are bidirectional, meaning that there is communication in both directions along the weight vector. There are no connections between neurodes in the same layer. BAM networks take only -1′s and 1′s as input and only output -1′s and 1′s. So if you are working with binary data, you must convert the zeros to -1′s. The weights are calculated in the same way as the Hopfield example above. The nodes are either 0 or 1 (on or off).
Hopfield Network Example