"It does this by decomposing the complete behavior into sub-behaviors. These sub-behaviors are organized into a hierarchy of layers. Each layer implements a particular level of behavioral competence, and higher levels are able to subsume lower levels (= integrate/combine lower levels to a more comprehensive whole) in order to create viable behavior. For example, a robot's lowest layer could be "avoid an object". The second layer would be "wander around", which runs beneath the third layer "explore the world". Because a robot must have the ability to "avoid objects" in order to "wander around" effectively, the subsumption architecture creates a system in which the higher layers utilize the lower-level competencies. "
One can imagine how backpropagation-style training on "explore the world" or other high-level scenarios would result in the formation of "avoid objects" kernels at the lower levels, similar to how image-recognition deep nets produce Gabor-like kernels in their lower layers. We do have some theorems on the optimality of such deep networks for image recognition, and I wouldn't be surprised if similar optimality held for behavioral deep networks. Also, it seems that, as with image deep nets, a deep architecture for behavior naturally allows for transfer learning by reusing the lower layers: just as one would reuse the low/mid layers of an image deep net, one would naturally reuse the lower "avoid objects" layers (and maybe some more complex aggregate behaviors from the mid-levels) for other tasks.
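A minimal sketch of that reuse, assuming a PyTorch-style setup (the network shape, layer names, and action counts are invented for illustration): copy the lower layers of a net trained on one behavior into a new model, freeze them, and train only a fresh task-specific head.

```python
import torch
import torch.nn as nn

class BehaviorNet(nn.Module):
    """Hypothetical behavioral net: lower layers learn reusable competencies
    (e.g. obstacle-avoidance features), the head maps them to task actions."""
    def __init__(self, obs_dim=32, n_actions=4):
        super().__init__()
        self.lower = nn.Sequential(           # reusable lower-level competencies
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
        )
        self.head = nn.Linear(64, n_actions)  # task-specific behavior

    def forward(self, obs):
        return self.head(self.lower(obs))

# Pretend this was trained end-to-end on "explore the world".
pretrained = BehaviorNet()

# Transfer: reuse the lower layers for a new task and train only a new head.
new_task = BehaviorNet(n_actions=6)
new_task.lower.load_state_dict(pretrained.lower.state_dict())
for p in new_task.lower.parameters():
    p.requires_grad = False                   # freeze the reused competencies

optimizer = torch.optim.Adam(new_task.head.parameters(), lr=1e-3)
```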
The architecture of the subsumption system is fairly well mapped in the brain: from basic optical edge detectors in the banks of the calcarine sulcus to increasingly sophisticated levels as you move forward (object recognition, numeracy, and vocabulary in the parietal lobe, then motor function, then judgement and grammar in the frontal and prefrontal cortical areas). The further dynamic of taking over or controlling traditionally limbic behavior (emotion) is an additional process that I would characterize as a blend of subsumption and adaptation: the cortex acquires data and abstracts it, but does so, in part, as an adaptive response.
"It does this by decomposing the complete behavior into sub-behaviors. These sub-behaviors are organized into a hierarchy of layers. Each layer implements a particular level of behavioral competence, and higher levels are able to subsume lower levels (= integrate/combine lower levels to a more comprehensive whole) in order to create viable behavior. For example, a robot's lowest layer could be "avoid an object". The second layer would be "wander around", which runs beneath the third layer "explore the world". Because a robot must have the ability to "avoid objects" in order to "wander around" effectively, the subsumption architecture creates a system in which the higher layers utilize the lower-level competencies. "
One can imagine how backpropagation style training on "explore the world" or other high-level scenarios would result in formation of "avoid objects" kernels at the lower levels similar to how image recognition deep nets produce Gabor like kernels at the lower levels. We do have some theorems on optimality of such deep networks for image recognition and i wouldn't be surprised if similar optimality was present for behavioral deep networks. Also, it seems that like in case of image deep nets, the deep architecture for behavior naturally allows for transfer learning by reusing the lower layers too - like one would reuse the low/mid layers of image deep net, one would naturally reuse the lower "avoid objects" layers (and may be some more complex aggregate behaviors from "mid-levels") for other tasks.