Why connectionism?

What advantages do building connectionist models give you over building other types of models? What disadvantages exist?

Neural Plausibility

Clearly, one of the most alluring features of connectionist models is their analogy to how neurons in the brain might be wired.

Similarities:

Connectionist units have activation values that are intended to roughly correspond to the firing of a neuron.
Connections between units are modeled on connections between axons and dendrites in that these connections can be inhibitory or excitatory, and they are weighted so that not all connections are of equal strength.
The activation level of a unit, like a neuron, is a function of the net inputs from the units it is connected to.
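As a rough illustration of the three similarities above, here is a minimal sketch of a single unit. The logistic squashing function and the particular weights are assumptions chosen for the example; the notes do not commit to any specific activation function.

```python
import math

def net_input(activations, weights):
    """Weighted sum of the sending units' activations."""
    return sum(a * w for a, w in zip(activations, weights))

def logistic(net):
    """Squash the net input into the range (0, 1). Assumed for illustration."""
    return 1.0 / (1.0 + math.exp(-net))

# Three sending units. Positive weights are excitatory, negative are inhibitory,
# and the magnitudes differ, so not all connections are equally strong.
sending_activations = [1.0, 0.5, 1.0]
weights = [0.8, 0.4, -1.2]

net = net_input(sending_activations, weights)
activation = logistic(net)
print(f"net input = {net:.2f}, activation = {activation:.3f}")
```

The receiving unit's activation depends only on the net input, so changing any one sender or weight shifts the result without determining it outright.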

Pushing the analogy

To try and push the neural analogy as far as possible, some interesting parallels have been proposed:

Many cognitive processes only take a few hundred milliseconds to execute. Since it takes a few milliseconds for a neuron to fire, connectionists argue that a given cognitive operation can involve no more than 100 or so steps. It is claimed that it is easier to meet this limit in a parallel network than in a serial system.
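The arithmetic behind the 100-step limit can be written out as follows. The values 300 ms and 3 ms are illustrative stand-ins for "a few hundred milliseconds" and "a few milliseconds"; they are assumptions, not measurements from the notes.

```latex
\[
\text{maximum serial steps}
\approx \frac{\text{time per cognitive operation}}{\text{time per neuron firing}}
\approx \frac{300\ \text{ms}}{3\ \text{ms}} = 100
\]
```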

Where neural plausibility fails

The strength of activation of a neuron is reflected in the neuron’s firing rate, whereas the strength of activation of a connectionist unit is represented as some real value.
The way input activation is added up to determine how active a connectionist unit should be is very simple compared to how neurons do it.
Nobody knows in any detail how neurons "know" when to change the strength of their connections to each other. Feed-forward learning algorithms do not seem to allow complex tasks to be learned, and the neural plausibility of the best alternative algorithm, back propagation, is in question, since it is not known whether real neurons can pass error signals backward along their connections.

Soft constraints

Traditional rule-based models encode lots of hard constraints. For example, in a production system, whenever the conditions on the left-hand side (LHS) of a production are satisfied, the actions on the right-hand side (RHS) must be executed.

In a network, the fact that two units are connected can be seen as a constraint on the processing in that network, but it is not a hard constraint. Because units receive input from many other units, one connection by itself will not necessarily determine the behavior of a given unit.
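The contrast between hard and soft constraints can be sketched in a few lines. The function names, the weights, and the logistic activation below are all hypothetical choices made for illustration.

```python
import math

def production_rule(condition_met):
    """Hard constraint: if the LHS is satisfied, the RHS always executes."""
    return "execute action" if condition_met else "do nothing"

def unit_activation(inputs, weights):
    """Soft constraint: every connection contributes, none decides alone."""
    net = sum(a * w for a, w in zip(inputs, weights))
    return 1.0 / (1.0 + math.exp(-net))

# One strong inhibitory connection followed by three weaker excitatory ones.
weights = [-2.0, 1.0, 1.0, 1.0]

# With only the inhibitory sender active, the unit is pushed toward "off".
only_inhibitor_active = unit_activation([1, 0, 0, 0], weights)

# With all senders active, the excitatory inputs outweigh the inhibitory one.
all_active = unit_activation([1, 1, 1, 1], weights)

print(production_rule(True))
print(f"inhibitor alone: {only_inhibitor_active:.2f}, all senders: {all_active:.2f}")
```

The inhibitory connection is the same in both cases, yet the unit ends up below 0.5 in one and above 0.5 in the other, which is what makes the constraint soft.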

Graceful degradation

For the most part, the brain tends to be a pretty reliable machine, particularly when compared with, say, a car or a computer.

Which is not to say that the brain always functions perfectly. It can be damaged, or there can be too many demands on its resources, or too much information can be available for it to process.

When this happens, the brain does not "crash" and ask you to reboot. It simply performs in a sub-optimal fashion. Capabilities don’t disappear, but rather become gradually more limited.

Rule-based systems generally cannot model this graceful degradation. Either a rule exists or it doesn't, and the rule's existence determines whether that particular process can execute or not.
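A toy sketch of the difference, under strong simplifying assumptions: a single receiving unit with 100 incoming connections, all senders fully active, random positive weights, and a one-entry rule table. None of these details come from a specific model.

```python
import random

def response_strength(weights, lesioned):
    """Net input to one receiving unit when all senders are fully active,
    skipping any connections that have been lesioned."""
    return sum(w for i, w in enumerate(weights) if i not in lesioned)

rng = random.Random(0)  # fixed seed so the run is repeatable
n_connections = 100
weights = [rng.uniform(0.5, 1.5) for _ in range(n_connections)]

# Damage accumulates: each larger lesion includes the smaller ones.
damage_order = list(range(n_connections))
rng.shuffle(damage_order)

intact = response_strength(weights, set())
relative_strengths = []
for fraction in (0.0, 0.25, 0.5, 0.75):
    lesioned = set(damage_order[: int(fraction * n_connections)])
    relative = response_strength(weights, lesioned) / intact
    relative_strengths.append(relative)
    print(f"{fraction:.0%} of connections removed -> {relative:.2f} of intact response")

# Rule-based contrast: deleting the one relevant rule removes the behavior outright.
rules = {"red light": "stop"}
rule_before = rules.get("red light")
del rules["red light"]
rule_after = rules.get("red light")
print(rule_before, "->", rule_after)
```

In the network, the response shrinks step by step as damage increases but never vanishes at these lesion sizes. In the rule table, the response is either present or gone.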

Memory lookup

In a rule-based system, looking up items in memory based on those items’ properties generally involves one of two procedures. Either every item in memory is scanned until a match is found, or some sort of index is scanned to find a related subset of items, and then all of those items are scanned.

In a connectionist system, retrieving items from memory is straightforward and does not require an exhaustive search. In the Jets & Sharks model, it was the use of spreading activation that allowed for the retrieval of either information about specific people, or information about groups of people.
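The following sketch shows one pass of spreading activation over a tiny network in the spirit of Jets & Sharks. The three records and the unit names are hypothetical, not the data from the original model. The sketch also leaves out the inhibitory connections and the repeated settling cycles that a full interactive activation network would use.

```python
from collections import defaultdict

# Hypothetical toy records (not the data from the original model).
people = {
    "person_a": {"gang:Jets", "age:20s", "job:burglar"},
    "person_b": {"gang:Jets", "age:20s", "job:bookie"},
    "person_c": {"gang:Sharks", "age:30s", "job:burglar"},
}

def retrieve(probe):
    """One round of spreading activation:
    probe units -> person units -> property units."""
    # Each person unit is activated once for every probe unit it is connected to.
    person_act = {}
    for name, props in people.items():
        connected_units = props | {f"name:{name}"}
        person_act[name] = len(connected_units & probe)
    # Active person units pass their activation on to their property units.
    prop_act = defaultdict(int)
    for name, act in person_act.items():
        for prop in people[name]:
            prop_act[prop] += act
    return dict(prop_act)

# Probing with a name retrieves information about a specific person.
by_name = retrieve({"name:person_c"})
# Probing with a property retrieves information about the group as a whole.
by_group = retrieve({"gang:Jets"})

print(by_name)
print(by_group)
```

Both kinds of retrieval use the same mechanism. Activation flows along the connections, and no index or item-by-item comparison against a query is needed.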

Learning from experience

Connectionist systems have learning, in the form of changing weights, built into them. Rule-based systems have to develop complex procedures for learning new information, and how well they learn is often dependent on the cleverness of the programmer.

However, connectionist systems cannot perform one-trial learning, where something is learned very quickly, rather than in the gradual fashion embodied in current learning algorithms.
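To make the gradual character of weight-based learning concrete, here is a minimal sketch using the delta rule on a single linear unit. The choice of rule, the learning rate, the number of epochs, and the three training examples are all assumptions made for illustration.

```python
def train(examples, learning_rate=0.1, epochs=50):
    """Delta rule for one linear unit: nudge each weight in proportion
    to the error on the current example."""
    weights = [0.0, 0.0]
    errors = []  # summed squared error per pass through the examples
    for _ in range(epochs):
        total_error = 0.0
        for inputs, target in examples:
            output = sum(w * x for w, x in zip(weights, inputs))
            error = target - output
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            total_error += error ** 2
        errors.append(total_error)
    return weights, errors

# The target simply copies the first input; weights of [1, 0] solve it exactly.
examples = [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0), ([1.0, 1.0], 1.0)]
weights, errors = train(examples)

for epoch in (0, 9, 49):
    print(f"pass {epoch + 1}: summed squared error = {errors[epoch]:.4f}")
print("final weights:", [round(w, 3) for w in weights])
```

After a single pass the error is still large, and it only becomes small after many passes. This is the sense in which such a network learns gradually and not in one trial.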

Why not always use connectionism?

With all of these advantages, why would anyone ever use anything but connectionist networks? There are a number of reasons people put forth:

There is the issue of networks simply being association mechanisms. You put in input, look at the output, and train the network to better match the proper outputs, but what goes on in between could be just about anything that satisfies the constraints on the outputs. What does that really tell us about cognition?
Many high-level cognitive processes really seem to be serial and rule-based, e.g., problem solving, grammar, attention.
The neural plausibility argument is not really as strong as some connectionists would have you believe. Back propagation is a perfect example.
Because the neural plausibility argument is, at best, weak, it is by no means clear that, even if we keep building bigger and bigger connectionist networks to solve complex tasks, these networks would reflect what is actually going on in the brain any more accurately than rule-based models do.

Generally, connectionist models have succeeded in modeling tasks that clearly involve a high degree of parallelism, such as early visual perception, and that need to be able to generate similar outputs from many different inputs, such as object recognition.