In the interest of content for this growing website, I’m going to explore some real and hypothetical applications of AI here. The focus will be on translating those applications into working code, which will be an ongoing process.
An incredibly useful tool for designing neural networks, the JavaNNS application has saved us a lot of work. Here’s a bit about it:
The Java Neural Network Simulator (JavaNNS) was developed at the Wilhelm-Schickard-Institute for Computer Science (WSI) in Tübingen, Germany. It is based on the Stuttgart Neural Network Simulator (SNNS) 4.2 kernel, which was written in C, with a new GUI written in Java glued to the front end. Because little of the main computation program has been changed, the capabilities of JavaNNS are mostly equal to those of SNNS, with the aim of a more intuitive and user-friendly interface. Not all of the features are present yet, however, such as the 3D display of neural networks, but the project is under continuing development.
The SNNS website can be found here.
Physical Symbol System Hypothesis
Computer scientists Allen Newell and Herbert Simon presented this model of computation in their 1975 Turing Award lecture. Their proposition was that an intelligent system is one that manipulates physical symbols as a fundamental aspect of its operation. By their definition a symbol is any ‘thing’ that exists in reality: a rock, the charge on a transistor (see where we’re going with this?), a chemical reaction, and so on. The possibilities are close to infinite.
Symbols can be grouped to form ‘symbol structures’ or ‘expressions’. Multiple expressions grouped together form a ‘physical symbol system’. It’s clear that languages and processor instructions fit this definition. By extension, certain intelligence models also fit this definition.
The travelling salesman problem (TSP) is one of the classics. A salesman has a number of cities to visit, scattered at random locations all around the country. His problem is to figure out the most efficient route that gets him around all the cities without visiting the same location more than once.
Sounds simple enough, but it turns out to be quite computationally intensive to solve, with an asymptotic complexity of O(n²·2ⁿ) (exponential) using dynamic programming (the Held-Karp algorithm) and O(n!) (factorial!) using brute-force search. Either prospect is in the “inefficient” category (look up Big O notation if you don’t know what I’m talking about). A small enough state space can still be searched reasonably by modern computers; beyond a certain threshold, though, you would need several big-bang-to-heat-death universal cycles to solve the problem.
The set of all possible states of a problem, represented together, is called the state space. For example, in the TSP the start state, goal state and every position in between can be drawn as a graph, with the points as the towns and the edges as the distances between them. This graphical representation of the state space is used extensively in optimisation problems.
Generic Search Algorithm
(We’ll talk about genetic algorithms a bit later on).
A generic solution for a tree-based search algorithm:
1. Initialise N with the start state of the space to be searched (here, the root node of the tree). As the algorithm progresses, N holds the nodes that are yet to be checked by the program.
2. If N is empty, every node in the tree has been visited without discovering the goal: return False and the program ends.
3. Remove a node Ni from N.
4. If Ni is the goal node, return True and the program ends.
5. If Ni is not the goal node, add Ni’s children to N and jump back to Step 2.
With Depth First Search (DFS), at Step 3 take the node Ni that was added to N last, so N behaves as a stack.
With Breadth First Search (BFS), at Step 3 take the node Ni that was added to N first, so N behaves as a queue.
A coded solution can be found on the code page.
The application will capture live or still images and extract recognisable objects from the image stream in real time. These objects will then be processed through an algorithm that retrieves relevant information specific to the image object.
The user interface will display the image stream along with formatted data passed as information adjacent to the object within the image.
Controls should be available to the user such as:
- start/stop visual analysis
- number of objects to process at any one time
- ability to select particular objects to process
- ability to load a still image for processing
- ability to record a stream sequence
- ability to save information obtained
- control over artificial intelligence parameters
The program should be as lightweight and cross-platform as possible, perform in as close to real time as possible, be adaptable for international use, and be open source.
Use Case Diagram
The diagram below illustrates a use case for the augmented reality application. The user asks for an object on screen to be identified. The internet and database searches will be primarily reverse image lookups, but the user will be able to issue keyword prompts to narrow down the search criteria.
Breaking down some components:
Cross-platform development is important from the outset, and the OpenCV library meets this requirement. There may be some juggling to do on the networking side with the choice between Winsock 2.0 and Berkeley sockets; I envisage using #ifdefs in the preprocessor to organise this. For the GUI and tools, wxWidgets is the choice, though the main IDE will be VC2010.
Along with a file access mechanism, we will need to capture images from an on-board camera. The OpenCV wiki provides us with a framework to do just that; the code can be viewed here.
Once we have an image stream we can begin to process it, again using OpenCV. A full study of the edge detection algorithm will be needed, and perhaps a look into the mechanisms of the face recognition algorithm to adapt it for use on other shapes.
An unresolved component at this stage is the set of sources for data retrieval.
The OpenCV library supplies a substantial amount of computation that we do not have to create ourselves. There are classes and functions for everything in image processing you might think of, and more besides for things you’ve never heard of.
Edge detection, an essential function for this project, is solidly implemented in OpenCV with Canny edge detection. It is almost trivial to combine this with streaming camera capture and have live object tracking straight out of the box.
The next stage after this is to isolate certain tracked objects on screen, extract them from the image and pass them into our data mining black box. The following diagram illustrates the main sequence.
The above is a high level view of the core of the application that will be built. The more fine grained detail will be explored and defined on the Projects page.