Less is More: Six Paradigms of Intelligence Design
Exploring defining characteristics of AI design vs the human mind.
In part 2 of this series on AGI I want to take a closer look at the contrast between how we build artificial intelligence (AI) systems and how differently biological minds develop. As you will see, this is not just a matter of semantics or differing methodologies; it is a reflection of two diametrically opposed approaches to understanding and creating intelligence. To break this complex topic down into more palatable concepts, I’ll discuss six key dimensions: design direction, organisational paradigm, boundary conditions, learning algorithm, optimisation algorithm and, lastly, energy balance. Each section first looks at AI systems and then describes the biological-mind equivalent.
This juxtaposition is inspired by a talk given by Joscha Bach at the University of Tokyo on 8 July 2024, which is well worth watching. It is one of the most interesting talks I have come across in a while, bringing together surprising ideas for an intriguing proposition in consciousness research.
I want to preface that I don’t believe in ‘speciesism’ and don’t mean to suggest that human level intelligence could only be achieved in the wetware of a human mind. Intelligence built or arising in silicon is certainly conceivable. To accept this does not mean we cannot make a sober assessment of where we are with current AI systems, relative to some of the seemingly defining characteristics of what makes human minds uniquely capable.
I find these questions interesting on a theoretical level, but the practical implications are no less relevant, because discussing these differences should give some perspective on all the promises made for ‘agents’ and ‘agentic systems’. It would seem we have some way to go, and mind you, what I am describing here is, more or less, ‘state of the art’. Of course, I’m not a mind reader and could not possibly say what is being cooked up in some lab, but for none of the aspects discussed would I confidently say that a solution is within reach in the next few years.
The question then turns on whether any or all of these aspects are mission-critical for AGI. I posit that creating effective and robust AI systems without addressing at least some of the six areas explored here would be exceptionally challenging, if not impossible. All of these dimensions seem fundamental to the very nature of general intelligence and play critical roles in determining how a system is conceived, developed and operated.
A. Design Direction
The first aspect to discuss is the ‘direction’ in which we currently build AI systems, and how this stacks up against biological minds.
Outside-in AI Design
The current approach to designing AI systems is an outside-in endeavour, where the design and functions are imposed on the system from external sources, namely data scientists and software engineers. This is how engineering as a broader discipline is approached: a mostly linear, top-down process in which the desired outcome is clearly defined at the outset and the system is built to meet those specific requirements.
Decisions are made hierarchically, mostly at the top, and implemented down the chain, which ideally streamlines decision-making and ensures that the outcomes are aligned with the initial vision.
The definition of a specific, well-defined problem sits at the beginning, such as a set of tasks the AI system should be able to perform. On this basis engineers then construct the system by determining and implementing the algorithms and data structures that will enable it to perform those tasks, while each component is meticulously planned and its behaviour tightly controlled.
This method is firmly rooted in the somewhat technologically deterministic nature of traditional engineering as a way to practically implement humanity’s understanding of the universe for best outcomes.
This method typically then leads to a fixed architecture: once the system’s structure and function are determined, they remain relatively static.
This outside-in approach comes with significant upside. First and foremost, it creates predictable systems, which (mostly) perform in the expected manner, affording their architects foresight into their impacts. Such systems are also typically quite reliable at performing the functions for which they were designed in the first place. This might also be seen as a fairly efficient design approach.
Inside-out Biological Mind
Biological minds are shaped by a fundamentally different ‘design philosophy’, best described as ‘inside-out’, because the development and complexity of these systems emerge from within, rather than being imposed via an external blueprint.
Biological systems do not start from a fully formed structure or function. Instead, they evolve from within, guided by intrinsic processes such as genetic instructions, cellular interactions and environmental influences. Over time, these processes give rise to intricate architectures and sophisticated functionalities.
Interactions within biological systems, such as between cells or neurons, are not rigidly controlled but are instead dynamic and responsive, allowing the system to adapt and refine its structure and behaviour through environmental interaction, such as learning and experiences. In the brain, an adaptive process known as neuroplasticity allows neurons to extend their axons to form new connections with other neurons or make experience-dependent long-lasting changes in the strength of neuronal connections.
The complex architecture of biological systems, such as the human brain, emerges from quite simple rules or processes operating at a small scale. For example, the differentiation of cells into various types during embryonic development follows basic biochemical signals and physical rules that guide cells to take on specific roles.
Nota bene: You can take a look at Michael Levin’s fascinating work on how electrical gradients across cells give even individual cells a sense of ‘left and right’ and how that mechanism can be used to rewrite the anatomical pattern memory of planaria, a kind of flatworm.
When repeated and compounded over time, these small-scale processes lead to the formation of highly complex and specialised, adaptable structures, not architected based on an externally imposed, predetermined blueprint.
This inside-out design allows biological minds to be remarkably flexible and resilient. Unlike current AI systems, biological systems can spontaneously generate new behaviours in response to changing environments.
B. Organisational Paradigm
The organisational principles that govern AI systems and biological minds further highlight the difference in design philosophies: Where AI is designed based on a deterministic substrate, the mind is self-organising.
Deterministic Substrate
Current AI systems are built on a deterministic substrate, which is silicon. In this context, ‘deterministic’ means that the behaviour of the system is precisely controlled and predictable. Each operation performed by the hardware follows exact rules, with no room for deviation.
Nota bene: This is why computer systems need to deal with the issue of bit-flips.
Silicon-based chips are designed with a specific architecture that dictates how data flows through the system, how instructions are processed and how operations are executed. The predictability of operations ensures that the same input will always produce the same output, which is crucial for tasks that require high reliability and repeatability.
Nota bene: At the same time, this is largely the reason why current chip architectures are not particularly suitable for the kinds of AI models we are running on them: models make decisions based on probability distributions, which is a process of quantifying randomness. It turns out that this is quite hard to ‘simulate’ on these deterministic machines.
For instance, in the artificial neural network of a machine learning model, the organisation of neurons is fixed once the model architecture is defined and training is completed. The connections between neurons and the flow of data through the network are determined by the design and training choices made by engineers.
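To make this concrete, here is a minimal sketch in Python (a toy network with made-up numbers, not any particular production system): once the weights are frozen, the same input always yields the same output, and even the ‘randomness’ of sampling is simulated with a seeded pseudo-random number generator.

```python
import numpy as np

# A toy "trained" network: the weights are frozen once training is done.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))          # fixed weight matrix, standing in for trained parameters

def forward(x: np.ndarray) -> np.ndarray:
    """Deterministic forward pass: softmax over a linear layer."""
    logits = x @ W
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

x = np.array([0.2, -1.0, 0.5, 3.0])
p1, p2 = forward(x), forward(x)
assert np.allclose(p1, p2)           # same input -> identical output, every time

# Even "sampling" from the output distribution is simulated with a
# seeded pseudo-random generator on deterministic hardware.
sampler = np.random.default_rng(seed=42)
print(sampler.choice(len(p1), p=p1)) # reproducible whenever the seed is fixed
```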
Self-Organisation
Science’s understanding of biological minds on the other hand is that they are characterised by self-organisation and emergence, where complex behaviours and structures arise from simpler interactions within the system.
The idea of self-organisation is relied upon to explain complex systems of many kinds: it is the process by which a system naturally forms structured and complex behaviours or patterns without external control or guidance. This occurs through the local interactions of simpler components, leading to emergent order that is adaptive, robust and governed by feedback loops. The system operates without a centralised authority, relying instead on the collective dynamics of its parts to self-regulate, evolve and ultimately balance itself.
We witness the capacity for self-organisation across many domains other than biology:
In physics, self-organisation occurs in phenomena like the formation of snowflakes, where the structure emerges from the interactions of water molecules as they freeze, or in the patterning of sand dunes by wind.
In social systems of human societies, self-organisation is observed in the emergence of complex societies, markets, social norms and networks without centralised control.
Nota bene 1: an exciting extension to the general concept of self-organisation is Lee Cronin’s Assembly Theory, which aims to explain and quantify selection and evolution from physics without an inherent design blueprint.
Nota bene 2: For the sake of completeness, in the context of thermodynamics, the concept of entropy could be seen as at least limiting the applicability of self-organisation. Where self-organising systems create local order, the second law of thermodynamics suggests that this cannot come for free: it comes at the cost of increased disorder elsewhere in the system or its environment, which constrains how far self-organisation can go in isolated or closed systems.
Assuming that one subscribes to the theory of Darwinian evolution, then in biology, self-organisation can be seen in the formation of cellular structures, the development of organisms, and the functioning of ecosystems. The development of the nervous system in a human embryo begins with a few cells that eventually differentiate and organise into the complex structures of the brain. This process is not guided by a detailed blueprint but rather by genetic and environmental factors that influence the ways in which these cells interact. The result is a system that is highly adaptable and capable of learning and evolving in ways that are not strictly predictable, but also highly dependent on its environment.
Another key feature of self-organisation is that it enables biological minds to maintain functionality even in the face of disruptions. For instance, if part of the brain or the sensory input system is damaged, other parts can often reorganise to compensate for the loss.
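A toy way to see ‘complex structure from simple local rules’ in code is an elementary cellular automaton (purely an illustration, not a model of any biological process): every cell updates based only on its immediate neighbours, with no central controller or blueprint, yet intricate global patterns emerge.

```python
import numpy as np

def step(cells: np.ndarray, rule: int = 110) -> np.ndarray:
    """One update of an elementary cellular automaton.

    Each cell looks only at its left neighbour, itself and its right
    neighbour -- a purely local rule with no central controller.
    """
    left = np.roll(cells, 1)
    right = np.roll(cells, -1)
    neighbourhood = (left << 2) | (cells << 1) | right           # values 0..7
    rule_table = np.array([(rule >> i) & 1 for i in range(8)], dtype=np.uint8)
    return rule_table[neighbourhood]

# Start from a single 'on' cell; complex, non-repeating structure emerges.
cells = np.zeros(80, dtype=np.uint8)
cells[40] = 1
for _ in range(40):
    print("".join("#" if c else "." for c in cells))
    cells = step(cells)
```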
C. Boundary Conditions
The way AI systems and biological minds interact with their environment further distinguishes their designs and highlights the contrasting approaches to how each system processes information and responds to external factors.
Decoupled Static System
Static systems do not exhibit any change over time. The variables and parameters that define them remain constant, leading to a predictable and unchanging output or behaviour. They typically have well-defined, closed boundaries, meaning they are isolated from external influences. Such a system’s behaviour is entirely determined by its internal structure, with no input or output exchange with the environment.
AI systems are trained on static data that is decoupled from the world, resulting in a static model that does not adjust later on. Prior to training, and as of a specific point in time, training data from the real world is collected and curated. This data, and nothing else, determines the model’s trained parameter weights, which in turn dictate the way in which it will respond to future user inputs at the inference stage. For example, an ML model for image recognition might be trained on a large dataset of labelled images. This dataset is static. It does not change, whether during training or at inference. Once training is complete, the model is ‘ready’.
Nota bene: For the sake of clarity, I am not suggesting that current GenAI systems, like LLMs, aren’t able to ‘dynamically’ output responses. Part of the problem we experience with them is that no two outputs are exactly the same. This observation, however, is quite distinct from the fact that the model itself does not change outside of training.
Updates to the internal structure of the models take place only when engineers re-train or further fine-tune the model, once again based on static training data.
Of course, current AI products can typically access the internet, but this is quite different from a dynamic coupling with the environment that relies on feedback loops to update the software. The information accessed by an AI model in this way does not update the model and adapt it, for example, to user expectations. In an imperfect analogy, it’s more like walking into the library and collecting some relevant books based on title and author, rather than studying the books and creating a learning from this process so as to update the system for next time. When the same question comes up again in the future, a static AI system will, structurally, have to re-retrieve the same books.
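To put the library analogy into a deliberately simplistic sketch (hypothetical names and data, not any real product’s API): the retrieval step can fetch information at inference time, but nothing about the frozen model changes between calls, so the same question triggers the same retrieval work all over again.

```python
# Hypothetical sketch of 'retrieval without learning': fetching documents
# does not change the frozen model's parameters between calls.
FROZEN_WEIGHTS = {"w": 0.7}          # stands in for parameters fixed at training time

LIBRARY = {
    "photosynthesis": "Plants convert light into chemical energy.",
    "mitosis": "A cell divides into two identical daughter cells.",
}

def retrieve(query: str) -> str:
    # Walk into the library and pick a book by title; no learning happens here.
    return LIBRARY.get(query, "no book found")

def answer(query: str) -> str:
    context = retrieve(query)        # same query -> same retrieval work, every time
    return f"[model w={FROZEN_WEIGHTS['w']}] {context}"

print(answer("photosynthesis"))
print(answer("photosynthesis"))      # identical work repeated; FROZEN_WEIGHTS never changed
```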
Work on AIRIS, the ‘Autonomous Intelligent Reinforcement Inferred Symbolism’, has made interesting progress on continuously updating an algorithm with behavioural rules as it encounters unknown aspects of the environment, via a graph-based approach that leans towards exploration. This works quite effectively for MNIST image classification of handwritten digits, CartPole balancing and a puzzle game, each representing a quite constrained environment. Most recently the system has been expanded to Minecraft, a more continuous space.
Figure 2 from the AIRIS paper.
Environmental Coupling
In contrast, biological minds are continuously coupled to their environment. From the moment of conception, biological systems are in constant interaction with their surroundings, receiving sensory input. The nervous system processes these sensory inputs through a coordinated network of neurons that transmit information from the body’s sensory receptors to the brain. When one of the five senses detects stimuli from the environment, these signals are transmitted to the central nervous system. In the brain, particularly within specialised regions (e.g. the visual or auditory cortex), the signals are interpreted and integrated to form coherent perceptions of the external world. Based on this processed information the brain generates appropriate responses, whether they be motor actions, emotional reactions or cognitive assessments, ensuring that the body can react swiftly and accurately to its surroundings. As such, environmental interactions continually shape the brain’s development and an agent’s ongoing function. Every interaction with the world, rather literally, changes the brain’s structure.
This continuous environmental coupling allows biological minds to be highly adaptive. Because they are always interacting with and responding to their environment, they can adjust their behaviour in real-time. This adaptability is crucial for survival and performance in dynamic and unpredictable environments.
D. Learning Algorithm
The processes of learning in AI and biological systems expose fundamental differences in acquiring and adapting knowledge.
ML Training Runs
Learning in AI systems is conducted through machine learning training, where the system is exposed to large datasets and learns to perform its tasks based on this data. This includes both supervised and unsupervised learning. The training process is discrete and occurs in defined phases, or training runs. Once the AI model is trained, it is deployed to perform tasks, and its learning does not continue unless it is retrained on new data. This happens, for example, when certain issues get patched along the way, before an entirely new model gets trained.
Training large models such as GPT-4 is time- and resource-intensive, taking months and costing more than $100 million.
The effect of this is certain limitations to generalisation, especially regarding ‘out-of-distribution’ inputs which the model has never encountered in its training data, and the still largely unsolved issue of ‘catastrophic forgetting’, a phenomenon where the model loses previously acquired information as it is trained on new data or fine-tuned for specific tasks.
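A toy illustration of catastrophic forgetting (a single-parameter model, purely for intuition): once the same parameter is subsequently trained on task B, the knowledge it encoded about task A is overwritten and the error on task A explodes.

```python
import numpy as np

def sgd_fit(w: float, xs: np.ndarray, ys: np.ndarray, lr: float = 0.05, epochs: int = 200) -> float:
    """Fit y = w * x by plain stochastic gradient descent on squared error."""
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            grad = 2 * (w * x - y) * x
            w -= lr * grad
    return w

def mse(w: float, xs: np.ndarray, ys: np.ndarray) -> float:
    return float(np.mean((w * xs - ys) ** 2))

xs = np.linspace(-1, 1, 20)
task_a = 2.0 * xs                     # task A: y = 2x
task_b = -3.0 * xs                    # task B: y = -3x

w = sgd_fit(0.0, xs, task_a)          # train on task A
print("after A:", round(w, 2), " error on A:", round(mse(w, xs, task_a), 4))

w = sgd_fit(w, xs, task_b)            # then train only on task B
print("after B:", round(w, 2), " error on A:", round(mse(w, xs, task_a), 4))
# The single shared parameter now encodes task B; what was learned about task A is gone.
```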
Continuous Development
In contrast, learning in biological brains is continuous and integrated with the system’s operation. From the moment a biological system begins to interact with its environment, and throughout its lifespan, it is constantly learning and adapting.
What is remarkable is that this learning feedback works already at a very early stage in embryos and infants. The other remarkable detail is that this learning process is highly ‘context dependent’. A toddler, for example, operates in the plane, something the embryo is believed to have learned from its position in the placenta (!), and has no concept of the third spatial dimension until this becomes relevant.
This learning process is ongoing and does not require separate training phases. Humans and other animals learn continuously throughout their lives, adapting to new experiences and environments in real-time.
E. Optimisation Algorithm
Looking at the optimisation strategies employed in AI and biological systems reveals further differences. Where AI systems seek to optimise prediction, research on the human brain shows that we continuously optimise for coherence of sensory input with our existing world model.
Image: the structured, repetitive nature of machine learning training runs (left) contrasted with the fluid, evolving nature of continuous learning (right).
Prediction Optimisation
AI systems are typically optimised for prediction accuracy. During training, machine learning models are tuned via iterative optimisation techniques, mostly gradient descent, to minimise a loss function which quantifies the difference between the model’s predictions and the desired outcomes.
This is worth emphasising: as discussed in section A on design direction, the model (via its parameter and weight adjustments) is designed to ‘arrive’ at a pre-defined outcome. This process is complete when the model’s parameter weights have been adjusted to allow for a sufficiently high ability to predict outcomes based on the training data.
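As a minimal sketch of what ‘minimising a loss function via gradient descent’ means in practice (a toy one-dimensional regression, not a production training loop): the parameters are repeatedly nudged in the direction that shrinks the gap between predictions and desired outcomes, and training stops once that gap is small enough.

```python
import numpy as np

# Toy training data: inputs and the 'desired outcomes' (labels).
X = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1.5 * X + 0.5                          # ground truth the model should learn to predict

w, b = 0.0, 0.0                            # parameters, initialised arbitrarily

def loss(w: float, b: float) -> float:
    """Mean squared error: the gap between predictions and desired outcomes."""
    return float(np.mean((w * X + b - y) ** 2))

lr = 0.02
for step in range(2001):
    pred = w * X + b
    grad_w = np.mean(2 * (pred - y) * X)   # partial derivative of the loss w.r.t. w
    grad_b = np.mean(2 * (pred - y))       # partial derivative of the loss w.r.t. b
    w -= lr * grad_w                       # nudge parameters against the gradient
    b -= lr * grad_b
    if step % 500 == 0:
        print(f"step {step:4d}  loss {loss(w, b):.5f}")

# Training is 'complete' once the loss is sufficiently small;
# afterwards w and b are frozen and only used for prediction.
print("learned:", round(w, 2), round(b, 2))
```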
From an engineering perspective, the focus on prediction accuracy is well-suited for tasks where the goal is to make precise, repeatable predictions, such as in image recognition or language translation. The downside of this reproducibility of outcomes is that the model may struggle to generalise to new or unforeseen situations.
Coherence Optimisation
In contrast, biological systems optimise for coherence rather than only prediction accuracy. This is not a hierarchical feed-forward model, like the layers in a transformer, but an inferential process carried out by two classes of neurons working together: one that transmits predictions about sensory data and another that signals sensory information which deviates from those predictions.
Essentially the brain relies on a world model, a mental representation of how we perceive the world, the mechanisms which drive it and how we can influence it. This world model is constantly compared with our sensory inputs.
Attention passes over the various inputs. This can be in the form of a ‘spot-light’, when we focus attention on a specific object, for example. Attention can also take the form of a ‘flood-light’, when we widen attention to take in any new cues that arise, such as in a dark alley at night. Lastly, attention can also be goal-selected. This type of attention oversees and directs executive functions, which involve managing multiple skills in pursuit of goals and ensuring that we don’t get side-tracked by distractions.
Nota bene: There are other, interesting ways to describe the attention process as modulating between opacity and transparency, like here, by John Vervaeke.
To the extent that the sensory input highlighted by the attention mechanism deviates from our world model, i.e. leads to ‘prediction error’, the brain exploits these aberrations to trigger appropriate actions to impact the environment so as to realign it with the internal world model. In this way,
perception is the process of identifying the perceptual hypothesis that best predicts sensory input and hence, minimizes prediction error,
- Walsh/McGovern/Clark/O'Connell – see above (the two classes of neurons)
for maximum coherence.
This is a highly dynamic process allowing biological systems to function effectively in complex, unpredictable environments. It involves balancing competing demands and integrating diverse sources of information to achieve a harmonious outcome that limits violations of the constraints imposed by the environment as perceived by its internal reference structure.
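A highly simplified sketch of this predictive-processing loop (a single scalar ‘world model’, not a neuroscientific simulation): one signal carries the prediction, another carries the prediction error, and the internal estimate is continually nudged to keep the two coherent.

```python
import numpy as np

rng = np.random.default_rng(1)

belief = 0.0              # internal estimate of some environmental quantity (the 'world model')
learning_rate = 0.1       # how strongly prediction errors update the internal model

true_state = 5.0          # the actual (hidden) state of the environment
for t in range(50):
    sensory_input = true_state + rng.normal(scale=0.5)   # noisy observation
    prediction = belief                                   # top-down signal: what we expect to sense
    prediction_error = sensory_input - prediction         # bottom-up signal: mismatch with expectation
    belief += learning_rate * prediction_error            # update the world model towards coherence
    if t % 10 == 0:
        print(f"t={t:2d}  belief={belief:5.2f}  error={prediction_error:5.2f}")

# After repeated updates the belief tracks the environment and the prediction error stays small.
```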
F. Energy balance
The energy balance between training and operating an AI system and running a biological mind also differs significantly due to their architectural and processing differences.
Giga-Watt-Hours
The energy requirements for training and operating an AI system are enormous. Training a large-scale AI model, such as a deep neural network, consumes a brain-melting amount of energy. The process can require thousands of megawatt-hours (MWh) of electricity to power large clusters of GPUs or TPUs over extended periods. For example, training a state-of-the-art model like GPT-4o consumed an estimated 50 GWh, which translates into ca. 40 billion kcal. Once trained, the energy needed for inference, i.e. applying the model to make predictions, is also significant: GPT-4’s energy consumption for inference is estimated at 91 GWh per year. ChatGPT is estimated to consume, per month, approximately the energy that would otherwise power 26,000 households.
Nota bene: Research is being done on ‘1-bit models’, which should bring energy consumption down drastically compared with the current 16- or 32-bit models.
Kilo-Watt-Hours
In stark contrast, the human brain operates with remarkable energy efficiency. The brain consumes somewhere between 250 and 400 kcal per day, so approximately 0.3 to 0.5 kWh, which means the brain runs on 2-3 bananas per day. It’s a bit of a silly conversion, but indulge me: with the annual energy consumption of running GPT-4 inference we could power ca. 227 million brains for a day, which is roughly the population of Brazil.
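For those who want to check the back-of-the-envelope arithmetic behind these comparisons, here it is using the figures quoted above and the approximation 1 kWh ≈ 860 kcal:

```python
KCAL_PER_KWH = 860                       # 1 kWh is roughly 860 kcal

# Training: 50 GWh for GPT-4o (estimate quoted above)
training_kwh = 50e6                      # 50 GWh = 50,000,000 kWh
print(training_kwh * KCAL_PER_KWH / 1e9, "billion kcal")       # ~43, i.e. 'ca. 40 billion kcal'

# Brain: 250-400 kcal per day -> roughly 0.3-0.5 kWh per day
print(250 / KCAL_PER_KWH, "to", 400 / KCAL_PER_KWH, "kWh/day") # ~0.29 to ~0.47

# Inference: 91 GWh per year vs. a brain at ~0.4 kWh per day
inference_kwh_per_year = 91e6
brain_kwh_per_day = 0.4
print(inference_kwh_per_year / brain_kwh_per_day / 1e6, "million brain-days")  # ~227 million
```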
G. Conclusion
This was a long piece. I suspect that – if you are still here reading this – you would have gotten a sense of the distance between current AI systems and the human brain. None of this means that all hope is lost, quite the opposite in fact. Technology has made possible AI systems that are nothing short of astonishing. Despite the progress so far, more work is needed on the path to AGI.