
Daniel Polani on information theory and embodied cognition

Episode 3 15.03.2018

Season 2018

Description

What if evolution discovered that information itself is the most reliable local gradient for finding good solutions? Computer scientist Daniel Polani explains how information theory provides a normative framework for understanding why sensors are optimized, why brains are expensive, and why cognition is fundamentally constrained by the physics of embodiment. Subscribe for more from the Convergent Science Network podcast series.
Daniel Polani joins Paul Verschure and Tony Prescott at the BCBT summer school to present his information-theoretic approach to embodied cognition. Starting from the observation that biological sensors often operate near their physical limits, Polani argues that information serves as a local proxy that evolution uses to direct adaptation: organisms that capture more relevant information gain access to new ecological niches, creating a positive feedback loop between sensory refinement and behavioral complexity. The information bottleneck framework allows relevant information to be distinguished from noise, providing a principled way to think about what an organism needs to sense versus what it can afford to ignore.
The discussion moves from sensor optimization to the metabolic cost of processing. Polani draws an analogy to the Carnot cycle, proposing that at every level of biological organization, from ATP management to cellular logistics to high-level cognition, there is information processing happening, with each hierarchical level consuming most of the available free energy for administration and leaving only a fraction for novel computation. He introduces the distinction between open-loop and closed-loop control to formalize how sensing adds power to an agent: the extra entropic influence of a closed-loop agent is bounded by how much information it takes in, establishing that cognitive performance has hard informational limits.
The conversation addresses how embodiment constrains the information flow available to an agent, why memory is the natural next step beyond reactive sensing, and how the framework generates sub-goals naturally from the interaction between long-term goals and environmental structure. Polani argues that unlike abstract AI approaches that treat decision-making as unconstrained, this information-theoretic view reveals tangible physical limits on what any embodied agent can achieve. Key topics include the evolution of sensors, relevant information versus noise, the metabolic cost of cognition, open-loop versus closed-loop control, Landauer’s principle and its connection to biological information processing, and why parsimony in neural computation is an evolutionary necessity. Part of the Convergent Science Network podcast series from the BCBT Summer School.

Tagged as:
Cognition embodied cognition Evolution information theory Physical Limits Polani Information Relevant Information

About the author

CSN Podcasts

Both the triumphs of humanity and its most evil deeds have resulted from collaboration. In a time where humanity is required to aspire to the former and minimize the latter, the question arises of how collaboration arises and why it fails. Surprisingly, this phenomenon, so central to who we are, is not well understood. Hence, a collaborative effort is required to understand collaboration in its full biological, psychological, sociological, cultural, and economic complexity and to translate this understanding into operational impact. This series of podcasts is one step toward achieving these complementary goals. The Collaboration Podcast presents interviews with people who are central orchestrators of collaboration in various domains including business, government, science, art, health, sustainability, and the military. The discussions were conducted by Prof. Dr. Paul F.M.J. Verschure and members of the Program Advisory Committee of the Ernst Strungmann Forum on Collaboration (https://www.esforum.de/forums/ESF32_Collaboration.html) during 2021 and had the goal to sketch a map of opportunities, challenges, and obstacles in human collaboration. The forum took place in May 2022, and now we would like to share this series of interviews with a broader audience. The full report of the Forum will be published in 2023 by MIT Press. The podcast was produced by the Convergent Science Network (https://www.convergentsciencenetwork.org/). Context: The stability of social systems depends critically on realizing sustainable methods of “collaboration,” yet how and by which means collaboration is achieved is not clearly understood; neither are the conditions or processes that lead to its breakdown or failure. Collaboration can be understood as cooperation between agents toward mutually constructed goals. 
Part of the reason for our lack of understanding is that the phenomenon of collaboration is, by nature, a highly multidisciplinary problem, and effective research into its complexities has been difficult to achieve across the broad range of scientific and technical disciplines involved. The need for a fundamental understanding of collaboration, however, has become increasingly important. Not only does humankind demand answers as it attempts to address critical challenges at multiple scales (e.g., climate change, migration, enhanced automation, social and economic inequality), but ever-increasing technological and economic means of interconnecting people and societies are disrupting long-established, familiar patterns of how we interact. Radical technological changes that are ongoing have the potential to reshape collaboration in ways that are currently hard to predict or influence (e.g., by altering configurations in interaction, information creation, and modes of communication). On one hand, such changes could disrupt hitherto stable forms of collaboration by affecting critical communication channels and traditional roles, as can be observed in the rapidly changing patterns in governance, commerce, and social interaction. Conversely, technology could lead to the emergence of novel, successful forms of collaboration that deviate from traditional “hierarchical” architectures. Evidence of this can be seen in areas as diverse as highly automated manufacturing plants, the open science movement, collaborative software repositories, user-centered services, and the sharing of economy-based modes of organization. Without a fundamental understanding of the mechanisms, processes, and boundary conditions of collaboration, it is not possible to evaluate or predict which of these possible scenarios are sustainable or even plausible. 
The Forum “How Collaboration Arises and Why it Fails” (May 8–13, 2022, Location: Frankfurt am Main, Germany) Chairs: Andreas Roepstorff and Paul Verschure Program Advisory Committee: Jenna Bednar, Julia R. Lupp, Bhavani R. Rao, Andreas Roepstorff, Ferdinand von Siemens, and Paul Verschure

Timestamp

00:00:03 - This is the Convergent Science Network podcast. Leading researchers in the domain
00:00:10 - of neuroscience, brain theory and technology are interviewed by Paul Verschure and Tony Prescott.
00:00:21 - This is Paul Verschure with the Convergent Science Network podcast and I'm here with Daniel Polani.
00:00:30 - Daniel spoke about his work here at BCBT Summer School 2016.
00:00:36 - And Daniel, you very much emphasized the more, let's say, information-oriented
00:00:42 - perspective on cognition, and in particular embodied cognition.
00:00:47 - So, how did you end up taking that specific perspective and trying to understand
00:00:53 - this complex phenomenon?
00:00:56 - One key issue that bothered me for a long time is the question,
00:01:02 - how is evolution directed in a way that moves it forward sufficiently fast?
00:01:11 - So one question I was interested in many years ago was evolution of sensors.
00:01:15 - How does an organism or a species learn evolutionarily that a particular sensory
00:01:23 - channel has information that's relevant for its survival, or could be relevant for its survival?
00:01:28 - Which means that, because we can assume that evolution is very local in terms
00:01:34 - of exploring the solution space,
00:01:37 - there are indicators of a local nature that nevertheless
00:01:44 - give a drift to the evolutionary process, which advances it towards exploration
00:01:49 - of further and deeper informational sources.
00:01:52 - And the way to quantify that, to characterize that: for that you want to be
00:01:58 - able to say how much, or what, this information is about.
00:02:03 - And for this we had to understand how to reinterpret information theory, as basically handed down by Shannon,
00:02:12 - in a way that allows us to incorporate the concept of relevance.
00:02:16 - So relevant information.
00:02:19 - And a very important article came out in 1999.
00:02:23 - It was the information bottleneck article, which basically made the argument
00:02:27 - that you can actually color, you can actually tag information.
00:02:31 - In other words, you can make a distinction between information that's relevant
00:02:34 - and information that you may have
00:02:36 - but you don't want.
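The distinction between relevant information and information you have but do not want can be illustrated with a small calculation. The following is only a minimal sketch, not the model discussed in the talk: the four-state sensor, the relevance variable and the `compress` function are hypothetical choices made for illustration. It computes mutual information in bits and shows a sensor that carries 2 bits, of which only 1 bit is about the relevant variable; a compressed representation keeps that 1 bit and drops the rest, which is the kind of trade-off the information bottleneck formalizes.

```python
import math

def mutual_information(joint):
    """Mutual information I(A;B) in bits for a joint distribution {(a, b): p}."""
    pa, pb = {}, {}
    for (a, b), p in joint.items():
        pa[a] = pa.get(a, 0.0) + p
        pb[b] = pb.get(b, 0.0) + p
    return sum(p * math.log2(p / (pa[a] * pb[b]))
               for (a, b), p in joint.items() if p > 0)

# Hypothetical toy world: the sensor state X takes four equally likely values.
# The relevant variable Y depends only on whether X is in the upper half.
joint_xy = {(x, int(x >= 2)): 0.25 for x in range(4)}

def compress(x):
    """Hypothetical representation T that keeps only the relevant bit of X."""
    return int(x >= 2)

joint_xt = {(x, compress(x)): 0.25 for x in range(4)}
joint_ty = {}
for (x, y), p in joint_xy.items():
    key = (compress(x), y)
    joint_ty[key] = joint_ty.get(key, 0.0) + p

sensor_bits = mutual_information({(x, x): 0.25 for x in range(4)})  # H(X)
relevant_bits = mutual_information(joint_xy)        # I(X;Y), what matters
kept_bits = mutual_information(joint_xt)            # I(X;T), what T stores
relevant_kept_bits = mutual_information(joint_ty)   # I(T;Y), relevance retained

print(sensor_bits, relevant_bits, kept_bits, relevant_kept_bits)
```

In this toy case the representation halves what has to be stored without losing any information about the relevant variable.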
00:02:40 - You take a very specific view
00:02:43 - on evolution in that sense, of both sensing
00:02:46 - and cognition, because you consider sensors as being highly optimized systems
00:02:52 - that work close to, let's say, their physical limits. Is that critical
00:02:59 - to this approach that you take, that sensors are really that highly optimized, or is that arbitrary?
00:03:04 - In principle, it's arbitrary, but there are indicators that this gives us a
00:03:10 - hint on the fact that information really is a major driver for evolution.
00:03:15 - So I would rather say that originally it was a motivator to say,
00:03:21 - yes, information is important for our nature.
00:03:23 - But as we started looking at the information perspective itself, we suddenly
00:03:29 - saw that, if you assume that evolution uses information as
00:03:34 - a proxy for directing the genesis of
00:03:39 - information processing or behaviors, etc.,
00:03:42 - it may actually, not desire, but drive or push towards increasingly
00:03:51 - high-resolution sensorics.
00:03:55 - So in other words, it's basically a mutual
00:03:58 - positive feedback loop. But now,
00:04:03 - okay, if we consider that sensors are
00:04:06 - highly optimized, they're highly optimized relative to what? Excellent
00:04:10 - question. The original idea was that they are highly optimized relative to some
00:04:16 - hypothesized goals. That was what you originally looked at, so the original relevant
00:04:21 - information was just about that. Some indicators came up that indicate no, it's not enough,
00:04:28 - you actually get a high level of other possible goals that become accessible.
00:04:35 - And of course, if your niche changes, or your agent gets a bigger brain,
00:04:39 - slowly, or you have additional goals in your life, it turns out that suddenly,
00:04:46 - with the same senses, you can solve other problems.
00:04:48 - So in other words, you get the opportunity to drift from goal to goal.
00:04:53 - So several questions emerge. Where do the goals come from?
00:04:56 - The second question is, how is this drift possible at all?
00:05:00 - Because often, if you're highly specialized, you can't change anything without
00:05:04 - losing performance or something else.
00:05:05 - The argument that we propose now, or that the information-theoretic view has given
00:05:10 - us, is that sometimes you get more information than you bargained for.
00:05:16 - This extra information gives you what people would call some kind of
00:05:21 - openness for evolution, or permits you more
00:05:24 - adaptiveness in evolution than you
00:05:27 - would naively expect. But now we
00:05:30 - can distinguish different sensor systems, right? So for
00:05:33 - instance, the oldest sensory systems are probably mechanical systems,
00:05:37 - so they just detect forces. Then we would have sensor systems that deal with
00:05:42 - molecules, molecular structure, which is chemical sensing. Then you have sensory
00:05:47 - systems for sound pressure waves, which might build again on mechanical
00:05:50 - sensing, that's your auditory system,
00:05:52 - or you might look at the lateral line of fish, which picks up turbulence in flow.
00:05:56 - And then lastly, you would have sensor systems that detect photons,
00:05:59 - and there you have vision, right?
00:06:01 - So in that sense, for these biological systems you look at, and that you try to
00:06:06 - understand, as a first step,
00:06:09 - through the sensors that interface to these different environments in which they
00:06:12 - can exist, you could argue, well,
00:06:15 - maybe these different environments also have different informational requirements,
00:06:19 - and in a sense a single notion of optimization will fall flat rather
00:06:23 - quickly, because there's a diversity of subdomains in which sensors have to adapt.
00:06:28 - So do you believe we can have a generic informational optimization criterion
00:06:34 - to look at these sensor systems, or should we already link it to the specific
00:06:37 - subdomain in which they act, if we go from mechanical sensing to vision?
00:06:42 - I should expect that, of course, the ecological niche plays a role.
00:06:47 - And you will basically select for various, there will be different attractors.
00:06:51 - If you look at how a sensor and its environment interact, you have different
00:06:55 - attractors, different solutions for the same problem.
00:06:58 - A classic example is, of course, bats.
00:07:01 - Bats use essentially the auditory sense in a vision form, in a vision modality.
00:07:07 - This is a very interesting development because it's vision, but it's also active
00:07:12 - sensing, very, very active sensing: without sound generation it doesn't work.
00:07:16 - But on the other hand, you have, for example, owls, and they just improved their night vision.
00:07:21 - So it's not a unique solution, and maybe there is a historical accident that
00:07:27 - gets you in a particular direction.
00:07:28 - One interesting case is, of course, that snakes can detect infrared,
00:07:33 - and they actually use the skin sense,
00:07:35 - which has been anatomically formed in such a way that it operates
00:07:40 - as a camera obscura for infrared.
00:07:43 - Now, the funny thing about that is that everything about the hardware that they
00:07:47 - use is essentially equivalent to what we have.
00:07:50 - So in other words, the main difference in the snake's skin sense is,
00:07:56 - apart from the wiring in the brain, essentially the
00:08:00 - anatomy, not the actual nerve
00:08:03 - quality or skin property or so.
00:08:05 - It's really mostly the anatomy. But now,
00:08:09 - if you say that sensors tend to
00:08:12 - operate at some physical limit, right, they're optimized, and indeed we look at
00:08:17 - these strange kinds of sensor systems, like infrared in the snake, which
00:08:21 - is capitalizing on different sensory systems, then in addition, of course, something like
00:08:28 - an infrared detector can only be
00:08:31 - effective as a sensor if there's actual signal transduction related to it.
00:08:35 - So can we really talk about the evolution of a sensor in isolation, without taking
00:08:41 - into account the morphology in which it's embedded and also the signal transduction
00:08:45 - mechanisms that it has to deploy or has to be interfaced to?
00:08:49 - Of course, I completely agree. I mean, you have multiple systems interacting.
00:08:52 - And in the evolutionary history, you may have had different balances,
00:08:57 - different drives in different directions, and there may not be a single attractor
00:09:02 - into which you converge, starting from the same original species.
00:09:07 - So I don't expect that. But again, this becomes then a process of actually modeling
00:09:14 - the particular evolutionary pathway.
00:09:18 - What the informational view gives you is not a statement of what this pathway will look
00:09:23 - like, because that requires a lot of assumptions.
00:09:26 - What it says is, however, what possible niches could look like.
00:09:30 - So what are places where information is hidden that could be discovered?
00:09:35 - And it gives us a hint why you actually can find this niche.
00:09:39 - The big question in evolution is not that there are solutions which work.
00:09:43 - The big question in evolution is, how do you get there? How does evolution actually discover the niche?
00:09:49 - And the argument is, if you have some information that indicates something interesting
00:09:54 - is there, there is, that would be the hypothesis,
00:09:58 - an innate drive, or an innate, relatively generic mechanism that always assumes
00:10:06 - that when there's better information available,
00:10:08 - then it's worth refining it, wherever it comes from.
00:10:11 - And then you have basically something that accelerates adaptation by pure,
00:10:18 - if you like, experience that this actually works in our world.
00:10:21 - In other words, information gives you a local gradient or a local
00:10:26 - indicator of where good solutions may be found.
00:10:30 - Right. But this is an important issue, right? Because it's also sometimes because
00:10:35 - of the processing that goes on behind the sensor that you actually can reach
00:10:41 - the actual hyperacuity of the sensor itself.
00:10:44 - So that means the optimization of that sensor sheet, in some sense,
00:10:48 - is not informing you about the kind of information processing that is possible.
00:10:53 - You start, let's say, to integrate across multiple senses. A typical example is chemical sensing,
00:10:58 - right, where the sensitivity of single chemoreceptors on the male moth antenna
00:11:04 - is orders of magnitude weaker than the detectability,
00:11:10 - the ability that the animal has with respect to heart rate responses to pheromones, right?
00:11:15 - So, you will see heart rate changes to homeopathic levels of pheromones in the
00:11:20 - air that you will never be able to pick up with a single sensor that they have on their antenna.
00:11:25 - So, that means, in some sense, it would argue that to really think about sensor
00:11:30 - systems in isolation could also then be misleading you in terms of understanding
00:11:35 - what the informational capabilities are of the system that is integrating that sensor.
00:11:41 - So, isn't there a risk in the analysis you presented when you say,
00:11:46 - well, we have a sensor and the sensor gives you the upper limit of what can be achieved?
00:11:50 - We know that for many biological systems this is maybe not really the case.
00:11:54 - They can go beyond what the sensor as such, in isolation, will be able to deliver.
00:12:00 - Well, if you look at that, you'd have to look at integration over time,
00:12:04 - for example, and take memory into account.
00:12:06 - Right. So, of course,
00:12:10 - when we say a sensor limits the information, it is in the sense of the data processing
00:12:18 - inequality: you can't get more information through a sensor than the sensor permits.
00:12:22 - But of course, if you can integrate over time, then the effective bandwidth is much higher.
00:12:28 - You can accumulate evidence, there's no question about that.
00:12:32 - And if you do it for long enough, then at some point you have enough information
00:12:35 - to make your decision.
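The point about integrating over time can be made concrete with a small calculation. This is a sketch under assumptions that are not from the talk: a hidden binary state, and a hypothetical sensor whose every reading is flipped independently with probability 0.4. A single reading then carries about 0.03 bits about the state, yet the information in a sequence of readings keeps growing with its length, while never exceeding the 1 bit there is to know.

```python
import math

def info_after_n_readings(n, flip_prob):
    """I(S; R_1..R_n) in bits for a uniform hidden bit S and n noisy readings.

    Each reading equals S flipped independently with probability flip_prob.
    The number of readings equal to 1 is a sufficient statistic, so we sum over it.
    """
    total = 0.0
    for k in range(n + 1):
        ways = math.comb(n, k)
        p_k_given_0 = ways * flip_prob**k * (1 - flip_prob)**(n - k)
        p_k_given_1 = ways * (1 - flip_prob)**k * flip_prob**(n - k)
        p_k = 0.5 * (p_k_given_0 + p_k_given_1)
        for p_cond in (p_k_given_0, p_k_given_1):
            if p_cond > 0:
                total += 0.5 * p_cond * math.log2(p_cond / p_k)
    return total

FLIP_PROB = 0.4  # hypothetical noise level of the sensor
per_reading = info_after_n_readings(1, FLIP_PROB)
counts = (1, 5, 25, 100)
curve = [info_after_n_readings(n, FLIP_PROB) for n in counts]

for n, bits in zip(counts, curve):
    print(f"{n:4d} readings: {bits:.3f} bits (at most {n * per_reading:.3f})")
```

The per-reading limit still holds at every step: n readings can never deliver more than n times what one reading delivers, which is why accumulation takes time.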
00:12:37 - Or you can integrate information which otherwise is worthless and suddenly other
00:12:41 - information comes in and suddenly this information becomes valuable.
00:12:44 - In fact, in the case of the snakes, the infrared and optical information is
00:12:50 - integrated in the optic tectum, and it's quite intricate how they interact and
00:12:55 - how the snake decides there's an object of interest there.
00:13:01 - So this now becomes relevant for the second point that you made,
00:13:06 - also in your talk, where you say, well, sensors are optimized, so that means also
00:13:10 - from an energetic perspective
00:13:12 - they are optimized to give you certain kinds of
00:13:15 - information efficiently, while the processing and the cognition that runs behind
00:13:20 - it is expensive, right? Because metabolically the brain takes a disproportionate
00:13:26 - amount of your energy budget as an organism. So you see,
00:13:34 - from an informational perspective,
00:13:37 - this energetic optimization of the processing
00:13:40 - that brains perform as the main challenge for evolution?
00:13:43 - Is that the consequence of what you're saying? It's a bit more complicated
00:13:48 - than that. There are different time scales at play here, because if I have
00:13:53 - a brain that's big, it will eat that energy, and I can't just say, okay, as
00:13:57 - an adult I will just shrink my brain by 50 percent, who needs it? It actually does happen in some animals.
00:14:03 - There's one, it's called a fish that eats its own brain.
00:14:06 - And that animal basically finds a rock.
00:14:10 - So it roams until it finds a rock, and when it finds it, it will never leave the rock again.
00:14:15 - And when that happens, it actually basically consumes its own brain.
00:14:19 - So it does happen, but it's not typical.
00:14:21 - So the way to look at it, in my opinion, is from the point of view:
00:14:26 - can you profit from a bigger brain on a longer timescale,
00:14:32 - where longer means over several generations?
00:14:34 - And if you can, then yes, you maintain
00:14:37 - that; otherwise you just reduce the investment in bigger brains.
00:14:43 - You should also not forget that there's of course Darwin's brilliant idea of sexual selection.
00:14:47 - So maybe the highly social animals will be more selective towards more intelligent
00:14:53 - sexual partners, and thereby the brains will be driven to be bigger.
00:14:58 - On the other hand, that has to be sustainable,
00:15:01 - and that only works if the brain actually does something sensible. Right.
00:15:05 - But now you see
00:15:08 - what I'm driving at, right? Because I was
00:15:11 - making the point that
00:15:14 - the perceptual or the sensory capabilities of the
00:15:17 - organism can go beyond what
00:15:20 - the sensor can deliver by virtue of doing the processing, right? So now I could
00:15:25 - argue, well, maybe this solution was identified or was converged on during evolution
00:15:30 - because to really optimize the sensor would actually be metabolically way more
00:15:35 - expensive than to put that effort into the processing,
00:15:39 - like integration in time, or the use of memory.
00:15:42 - And it's in that combination that they can actually get a virtual sensor,
00:15:45 - if you want, or an effective sensor that gives me the information I need.
00:15:49 - I basically would be loath to separate them so strictly, because sensing and
00:15:57 - processing are, I can imagine situations where it's almost impossible to distinguish.
00:16:06 - So in some cases it's relatively simple to distinguish, in some cases not.
00:16:11 - We have an example, I didn't show that in the talk because of lack of time,
00:16:17 - but we have an example where we can choose whether we prefer to use memory or
00:16:22 - the sensor to achieve a certain utility value.
00:16:28 - And you can shift that around and say if the sensing is cheaper,
00:16:31 - then you shift it towards sensing and you use less memory, and vice versa.
00:16:37 - And you can look at it yourself. If you look at your Google Maps
00:16:41 - when you find a road, and look at the map all the time, then essentially you say,
00:16:45 - sensing is cheaper than remembering the path.
00:16:48 - But of course, this is very inconvenient when you do the path a lot of times,
00:16:54 - and it may actually be cheaper for you to keep it in mind, not having to look
00:16:58 - and stop at every corner to watch the map externally.
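The map example amounts to a break-even calculation between paying for sensing on every trip and paying once to memorize. The sketch below is only an illustration of that arithmetic: the function name and all cost figures are hypothetical, in arbitrary units, and it is not the utility model mentioned in the talk.

```python
def cheaper_strategy(trips, sense_cost_per_trip, memorize_cost, recall_cost_per_trip):
    """Compare total cost of consulting the map on every trip versus learning the path once.

    All costs are hypothetical and in arbitrary units.
    Returns the cheaper strategy name and both totals.
    """
    sensing_total = trips * sense_cost_per_trip
    memory_total = memorize_cost + trips * recall_cost_per_trip
    choice = "sense" if sensing_total <= memory_total else "remember"
    return choice, sensing_total, memory_total

# Hypothetical costs: checking the map costs 5 per trip, memorizing the
# path costs 40 once, and recalling it costs 1 per trip.
for trips in (1, 10, 50):
    print(trips, cheaper_strategy(trips, 5, 40, 1))
```

With these numbers, looking at the map wins for a single trip and memory wins once the path is travelled often; making sensing cheaper shifts the break-even point towards sensing, and vice versa.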
00:17:01 - So I think I do agree
00:17:04 - with you that processing may be
00:17:07 - viewed as an extension of sensing. Perhaps that's
00:17:10 - the way it emerged, that basically
00:17:13 - the system discovered that treating your own brain as a meta-sensor
00:17:19 - is a good thing. On the other hand, I'm not clear whether
00:17:25 - we can really separate that. Yeah, but still, for your framework, this is,
00:17:31 - let's call it, a challenge or a constraint we have to look at again,
00:17:36 - maybe after we went through the framework.
00:17:39 - Because it seems to say, well I must be optimal in informational terms because
00:17:45 - that will help me to reduce the cost of the processing.
00:17:51 - I would say
00:17:54 - that the cost of the processing must be sustainable.
00:17:58 - It's like having a company that permits itself to have a certain level of administration.
00:18:06 - And for example, if you have a big administration, you have to make enough money
00:18:11 - to keep these people working.
00:18:14 - That doesn't mean that you immediately reduce the administration when you don't have
00:18:19 - enough money, or that you will not survive in the long term if you have
00:18:24 - a big administration that you can handle or that can handle your stuff efficiently.
00:18:28 - So the parsimony principle is, of course, that if I have something to process,
00:18:35 - then I don't want to waste too much effort because I may have other things to process.
00:18:42 - If I have an emergency service, of course, that emergency service has a certain
00:18:46 - bandwidth which is required, and it has to be possible to activate it at any time.
00:18:53 - And if the emergency service requires 10 bits per second for some reason,
00:18:57 - these 10 bits must be available to me as an organism. But of course,
00:19:01 - I may do different things, and for the boring things I'm doing, like grazing or just walking around,
00:19:07 - I don't want to lose a lot of processing power because I may need it for other
00:19:12 - things. So the argument is a mixture of energy processing, a mixture of other
00:19:18 - resources that may be required for other tasks.
00:19:22 - It is, however, still parsimony.
00:19:26 - Right. Okay. But then that brings us to the linking of the
00:19:34 - sensing and the action and decisions, right?
00:19:36 - And this is where I think you really put the brunt of your effort, to try to
00:19:41 - understand what should be the properties of what's called a decision-making component.
00:19:51 - So how do you see, on the one hand, the core constraints that this decision-making system is facing?
00:19:59 - And what do you see as the main principles that allow a decision-making
00:20:04 - system to satisfy those constraints?
00:20:07 - Well, the core constraint is, of course, the way the agent is embodied in the
00:20:11 - world. This imprints a signature on the information flow.
00:20:16 - The way what you do in the world impinges on the world, and how you perceive that
00:20:22 - impingement again, determines what you can possibly do.
00:20:26 - Because that's not something that is your choice. It's given by physics,
00:20:30 - by your physicality, by your body.
00:20:33 - So this is the main constraint that determines what happens.
00:20:37 - The other constraint is, and that's much freer, and that's of course a postulate,
00:20:41 - which may be wrong, of course, that the brain,
00:20:45 - at least in our models, is essentially free to organize this information.
00:20:51 - This is, of course, not real.
00:20:53 - In reality, there are other constraints. But what we would like to know is what
00:20:59 - other constraints are natural.
00:21:01 - So one example is this goal-relevant information, where we specifically
00:21:06 - split the simple decision-making and the long-term goal, say,
00:21:11 - to study the emergence of sub-goals.
00:21:14 - Of course, we put in an assumption here that there's a long-term memory that
00:21:19 - stores the goal we ultimately want to go to. This is an assumption. As Sander
00:21:24 - van Dijk, one of my former PhD students, said, it's embrainment: it's
00:21:30 - not the body that we are fixing here, but actually how the brain is constrained internally.
00:21:36 - Ideally, in our studies, we want to limit the assumptions about embrainment as
00:21:42 - far as possible, because we would like to have an answer to what type of brain
00:21:46 - structures you want in order to process this information efficiently.
00:21:51 - So in other words, you ask, I want to process information, okay, a certain amount.
00:21:57 - And the question is, are there special sub-manifolds of solutions which prefer
00:22:01 - certain brain organization for processing this, adapting faster or more efficiently?
00:22:08 - This is a question that is ongoing research.
00:22:10 - We don't have a clear answer. But clearly, when you make assumptions about
00:22:14 - embrainment, if I may use this word,
00:22:18 - then we get things like, for example, versions of sub-goals,
00:22:21 - just by virtue of saying, okay, there's somewhere where I'm storing slowly changing long-term goals.
00:22:27 - And then this concept of sub-goal emerges naturally from the interaction in the world.
00:22:32 - So, making judicious assumptions about how the brain is structured,
00:22:37 - how the world is structured, can give you very natural hypotheses about emergence of natural phenomena.
00:22:44 - Again, yeah. But wouldn't it be fair to say that it's more like enmindment?
00:22:48 - Because you cannot really make normative statements on structure; at best you
00:22:55 - can make normative statements on function, on information flows. Just for clarity, I'm not very
00:23:05 - ideological on that. It is not a statement about how the brain actually looks;
00:23:11 - it's a statement about how a possible organization of information processing
00:23:18 - may look for certain purposes.
00:23:22 - We have assumptions about how information is being processed,
00:23:26 - but these assumptions are
00:23:29 - at this stage not really well founded, and
00:23:32 - they are based on plausibility and whether
00:23:35 - the phenomena that emerge are something you
00:23:38 - actually see. Yeah, but in that sense you really want to
00:23:41 - get to a normative formal framework, right?
00:23:44 - Because that was also one of the points you made, that
00:23:47 - the biology is ambiguous, we have no clear understanding of
00:23:50 - how this works as a structure; robotics is
00:23:53 - arbitrary, people have many different solutions for the same
00:23:55 - thing, and we don't know whether there's any common principle. So
00:23:59 - what we really need is, let's say,
00:24:02 - a normative framework that says, okay, these are the
00:24:05 - decisive criteria that all these systems have to follow in order
00:24:08 - to be informationally optimal. Yes. Okay.
00:24:12 - And that's why I was hassling you earlier
00:24:15 - on this notion of informational optimality, because that's of course a very
00:24:18 - important guiding principle and assumption of this whole framework. Yes. Right,
00:24:24 - yeah. So now, you linked this informational
00:24:30 - framework that you're advancing to the Carnot cycle,
00:24:34 - which essentially describes energy.
00:24:37 - It's sort of the expansion of a chamber that allows you to do work.
00:24:41 - So why do you think that's a good metaphor to look at decision-making and information
00:24:46 - processing in biological systems?
00:24:48 - I was very kind of going quite bravely into an aggressive metaphor here.
00:24:57 - But of course, there are actual attempts to link information processing and physics.
00:25:03 - And we have seen quite a bit of progress in recent years, actually.
00:25:09 - For example, by David Wolpert, who has generalized Landauer's principle,
00:25:13 - and there are a couple of other really interesting pieces of work in this area.
00:25:16 - And the question is, on a very low level of physics, there seems to be an intricate
00:25:23 - relation between energy processing, energy consumption,
00:25:27 - energy production, entropy production.
00:25:29 - Of course, these levels are very, very far away from where an organism sits
00:25:34 - in its information processing level. However, in between...
00:25:40 - There are many levels, and we have to take into account that in every level,
00:25:44 - there is information processing happening.
00:25:46 - When a cell organizes its organelles, the organelles organize the ATP consumption.
00:25:51 - This is organization. This is information processing.
00:25:54 - And I would now go on an extreme speculation.
00:25:58 - I may be entirely wrong, so don't take my word for it.
00:26:02 - But what I would say is that as you progress to arrange your information to
00:26:09 - higher and higher hierarchies,
00:26:10 - you have a kind of loss function or loss component: basically every hierarchy
00:26:15 - level loses you a factor of 100 of the free energy that you have.
00:26:18 - And what remains is kind of your investment for the next level.
00:26:22 - And as you go up and up and up, only very little remains for you to actually
00:26:28 - operate on freely and free in the sense that you can do new discoveries that
00:26:33 - accumulate this information. Most of the stuff is administration.
00:26:36 - So administering ATP, where ATP goes, where your cellular motors are driving
00:26:41 - stuff out, getting trash out, getting nutrients in.
00:26:46 - This is information processing except that you don't know what's happening.
00:26:50 - It just happens under the hood.
00:26:53 - But I do think that in principle a complete theory would encompass the lowest
00:26:58 - level unbreakable barriers of Landauer essentially and friends.
00:27:05 - To the highest level where essentially information is almost detached from the physicality,
00:27:11 - in a way. No, no,
00:27:14 - of course it's not detached. There is a link, but I think this link becomes more
00:27:18 - and more tenuous with every level. So at this stage we're very far
00:27:21 - from actually seeing how the high-level information processing constraints are
00:27:25 - linked with the low-level physical ones. So two parts, two answers. One answer
00:27:30 - is: an aggressive metaphor, nothing
00:27:32 - else. The other part is: no, it's actually not just a metaphor, it's real, but
00:27:37 - that kind of cycle is really far down the scale and cognition is very far. Right, but that's an important
00:27:45 - point about this right because it also means that you are willing to commit
00:27:48 - the notion of information processing to actually physical properties of the
00:27:52 - system because in the end what we're talking about is quantifying the entropy
00:27:57 - in the system where the entropy is increasing or decreasing.
00:27:59 - If you say, look, this is information processing, it means there's a change in entropy.
00:28:04 - Essentially, this is really important that we don't get confused what we mean with information then.
00:28:09 - We need to be careful. You don't necessarily reduce entropy in the system itself
00:28:13 - by doing information processing.
00:28:16 - You have to take into account the environment.
00:28:19 - So in the environment, you can choose sub-environments where you reduce entropy,
00:28:24 - for example, by information processing, by sensing.
00:28:27 - You definitely increase other types of entropy because you simply generate trash, if you want to.
00:28:33 - And your system can even remain unchanged.
00:28:36 - You can have a completely reactive robot that essentially pushes all the boxes
00:28:40 - in your room to the walls.
00:28:43 - But that system reduces the entropy of the boxes distributed over the room.
00:28:49 - Itself, it's completely reactive, so there's no internal change of entropy.
00:28:54 - And of course, entropy in total of the universe increases, because there's heat
00:28:59 - and the atoms, the gas of the air moves faster, whatever. So in other words.
00:29:06 - Information processing does not mean for the agent itself reduction of entropy.
00:29:10 - No, it relates to the observer describing the agent in its environment.
00:29:15 - Yes, if you include the environment, then yes. And also it depends on what you
00:29:19 - consider the environment.
00:29:20 - So it really depends what you look at. And on top of that, even it would depend
00:29:27 - on which state variables of the agent environment you consider.
00:29:30 - Absolutely, yes. So this is really important to understand. It's a relative perspective.
00:29:36 - It's observer-dependent. It's always observer-dependent, except if you go to
00:29:41 - the full wave function, if you like, the full state,
00:29:44 - in which case the problem is, in my opinion, not yet satisfactorily addressed,
00:29:50 - but perhaps it will happen.
00:29:52 - Right. So then to illustrate, to introduce your framework,
00:29:56 - you start to make a distinction between open and closed-loop systems And to
00:30:00 - try to build a more formal perspective on how they would be differentiated better or worse, right?
00:30:07 - So why do you think now open and closed loop, that comparison is helpful to
00:30:14 - introduce this information theoretic framework that you're advancing?
00:30:18 - We could have looked at, let's say, sensory processing as we discussed earlier, right?
00:30:21 - Okay, I mean, essentially, this is the idea of Touchette and Lloyd of considering these
00:30:28 - two cases. They are in a way the extreme cases: an agent that is basically
00:30:32 - nothing else but a blind process that doesn't take in any information.
00:30:36 - It's basically a modulation of physics.
00:30:40 - Consider a modulation of physics, which has a particular property that itself,
00:30:44 - it does not take in any information.
00:30:46 - Closed loop means it does take in this information and gives it extra power.
00:30:52 - It makes it a more complicated process.
00:30:55 - But it turns out that you can, and that's where it gets interesting.
00:31:00 - Bound the extra entropic influence of this closed loop agent by how much it takes in.
00:31:07 - And this is, in my opinion, very cool, because you see for the first time,
00:31:11 - in a way, well, not for the first time, I should be solving it before,
00:31:14 - but in a precise sense for the first time,
00:31:17 - that cognition or cognitive performance is subject to information processing principles.
00:31:26 - You can't just make decisions of a certain quality without having a certain
00:31:34 - informational invariance or minimal balance respected.
00:31:40 - And that is cool because you essentially say cognition is not some kind of abstract
00:31:46 - platonic thing that just happens somewhere and you can make anything happen.
00:31:50 - No, you can't make anything happen. There are certain constraints on
00:31:54 - what you can make happen under certain circumstances,
00:31:57 - which I think is a really important step because it says: unlike what AI usually
00:32:03 - does, which treats intelligent decision making in an abstract world
00:32:07 - that's devoid of any constraint,
00:32:10 - where you can essentially think of anything, you are constrained by very, very tangible
00:32:18 - aspects of the world.
00:32:20 - But now, so what you're saying is, look, if you consider the open loop case
00:32:24 - plus the information added by being able to sense, you give an upper bound of
00:32:28 - what the closed loop controller can do. Yes. Right?
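The bound just described (closed-loop entropy reduction is at most the best open-loop reduction plus the information taken in by the sensor) can be checked on a toy system. The sketch below is an illustration only and is not taken from the episode: the four-state world, the one-bit sensor, and the bit-flip dynamics are all assumptions chosen so that the numbers come out exactly.

```python
from collections import defaultdict
from math import log2


def entropy(dist):
    """Shannon entropy in bits of a dict mapping outcome -> probability."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)


def push_forward(dist, f):
    """Distribution of f(x) when x is drawn from dist."""
    out = defaultdict(float)
    for x, p in dist.items():
        out[f(x)] += p
    return dict(out)


# Hypothetical toy world: four equally likely states (2 bits of uncertainty).
p_x = {x: 0.25 for x in range(4)}


def sensor(x):
    """Assumed one-bit sensor: reports only the high bit of the state."""
    return x // 2


def dynamics(x, a):
    """Assumed actuator: action 1 flips the high bit, action 0 does nothing."""
    return x ^ (2 * a)


h_before = entropy(p_x)

# Open loop: the action is fixed in advance. Each action here is a
# permutation of the states, so it cannot reduce entropy.
open_loop_drop = max(
    h_before - entropy(push_forward(p_x, lambda x, a=a: dynamics(x, a)))
    for a in (0, 1)
)

# Closed loop: the action is chosen from the sensor reading.
closed_loop_drop = h_before - entropy(
    push_forward(p_x, lambda x: dynamics(x, sensor(x)))
)

# The sensor is deterministic, so I(X; S) equals H(S).
sensed_bits = entropy(push_forward(p_x, sensor))

print(open_loop_drop, closed_loop_drop, sensed_bits)
```

In this toy case the closed-loop controller removes one bit of uncertainty, which is exactly the one bit it senses, while no open-loop action removes any. Adding memory, as raised next in the conversation, is outside what this sketch models.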
00:32:32 - But I could say that's a toy example, because within the context of the niche
00:32:38 - that you earlier emphasized as being also relevant, if I have an open loop control
00:32:43 - in an environment with predators, I'm dead in no time.
00:32:46 - Right? So then you can go, okay, but maybe the upper bound would require the
00:32:52 - open loop plus the information coming from your sensors plus some minimum memory system.
00:32:58 - Sure. Right? To satisfy, let's say, some lower bound of survivability. Of course.
00:33:03 - Does that matter to you or that doesn't matter? No, no. Memory is the next step.
00:33:07 - Okay. We did not consider memory for a very simple reason: because,
00:33:12 - well, Touchette and Lloyd did not consider memory. But of course,
00:33:16 - memory is the next natural step.
00:33:18 - In fact, there are some attempts to consider memory as a kind of constructed
00:33:24 - reality that is constructed in such a way that it keeps the things active and alive, which are...
00:33:33 - Relevant and which require history to accumulate, for example. Yeah.
00:33:37 - So if you want to have a thresholding process, you have a memory that is able
00:33:42 - to count or to measure, oh, yes, I have seen enough, and please,
00:33:46 - now we can make a decision.
00:33:48 - So yes, of course, memory is the natural next step.
00:33:52 - But memory is a strange thing, in my opinion, because it's half environment
00:33:57 - and half agent in a way. It's a quite hybrid thing.
00:34:02 - I agree. No, you're right. So then,
00:34:05 - okay, so here we have this example, but
00:34:08 - now we can start to define, let's say, constraints onto this agent-environment
00:34:13 - interaction. But now in the open and closed-loop case you discuss, basically we have a
00:34:17 - world state, which are considered discrete states, if I got it right, and this
00:34:22 - is something else we can worry about, and then we have sensory states and we have actions, okay?
00:34:28 - So we have a three-state system. And so world, sensor, action,
00:34:32 - and they're coupled to each other.
00:34:36 - What in that makes the agent then? That is a very sharp question.
00:34:42 - So if you remember my diagram about world, sensors, memory, actors,
00:34:50 - memory, and so on, I emphasized the symmetry between world and memory.
00:34:55 - And when you ask what makes the
00:34:57 - agent there, the question is very subtle. If you look just
00:35:00 - at the graph, you don't see it. The graph
00:35:04 - itself does not make a distinction; there's no way of distinguishing world and
00:35:07 - memory. My personal opinion is, and that's completely speculative, I can't prove
00:35:11 - it at this stage, the main difference between world and memory is the fact that the
00:35:16 - world arrows are highly constrained. There is very little that can happen there,
00:35:22 - and the information density is low. The world is simple.
00:35:26 - Memory is where you can rewire in principle, at least arbitrarily.
00:35:32 - So you could have a maximally compressed information processing.
00:35:38 - Which is essentially when we talk free will, I think that's what hides behind it.
00:35:44 - The fact that in principle you can have via evolution or adaptation or whatever,
00:35:49 - a very, very complicated rewiring of the memory, which is virtually arbitrary.
00:35:54 - Think of a computer keyboard: you can choose any keyboard you want, you
00:35:58 - can rewire it as you want.
00:36:00 - But what you can't choose is how to organize the pixels on the screen so that
00:36:04 - your eyes will recognize it.
00:36:06 - So in other words, there you have very strict constraint about geometry,
00:36:10 - but on the way your actuators will operate with that, and basically your memory
00:36:14 - can operate with it, and you have many choices.
00:36:17 - And I believe that the agent, If I give you a system which contains a world
00:36:22 - and an agent, I think the agent will be that part of the system.
00:36:27 - Where the constraints are basically unconstrainedly complicated.
00:36:32 - So the world would be the part which has lots of compressible structure.
00:36:37 - It's a very vague statement, I know. But let's resonate with it anyway.
00:36:42 - So already the diagram that you sketch is in some sense a tiny fraction of the
00:36:51 - total set of possible states, because in some sense in your diagram, you go back in time.
00:36:56 - You say, okay, I'm at T0, and now I can show you back into the past where I came from.
00:37:02 - And that's this trajectory of world states, sensor states, action, et cetera.
00:37:06 - Because at any point in time, there's a plurality of world states, right?
00:37:11 - Sure, yeah. There's a plurality of sensor states.
00:37:15 - There's a huge action potential. That's true. Right?
00:37:19 - And then it all collapses into one world state, one sensory state,
00:37:23 - one action. And then we have our next world state, right?
00:37:27 - So that means we have to constrain that highly variable set of states of different kinds.
00:37:34 - So my question is, aren't we lacking the key state that makes the agent an agent,
00:37:41 - and that is an internal drive state,
00:37:43 - that the agent is, I'm ready to explore to serve my survival of my informational
00:37:50 - needs, or I'm ready to consume a resource because I have to take care of my energetic needs.
00:37:57 - So, isn't that the internal state defined by the survivability of the agent,
00:38:03 - not a key constraint on this plurality of world sensor and action states?
00:38:08 - Well, the model is quite general. So, in fact, what M is,
00:38:14 - I didn't say. So it's not a problem to plug into
00:38:17 - M, or to internally consider M as
00:38:20 - containing, such a, let's call it, pseudo-goal
00:38:23 - or pseudo-teleological parameter. That's
00:38:28 - not a problem. So we could split it. The assumption that M is one coherent
00:38:33 - blob is the most generic assumption when you have just one agent. So you could
00:38:39 - put it in. Wait, why would you put it in M? It should actually already mediate
00:38:43 - between the sensory state and the action state,
00:38:45 - right? In the earlier example.
00:38:48 - Yes. In the earlier example, we had just a reaction. But essentially,
00:38:53 - in the later example, S doesn't talk to A without M.
00:38:58 - Yeah, exactly. So you collapse it all in M, essentially. Everything is in M, yes. Okay.
00:39:04 - So then, okay, so we have a scheme now. We can think about how behavior comes
00:39:11 - about and how behavior in turn depends on sensory states in the more advanced
00:39:16 - version, how it also depends on memory.
00:39:17 - Okay, good. So we got that. But now what you really want to understand is,
00:39:23 - okay, what should my actions be?
00:39:24 - And then you say, well, my normative perspective on that is that your actions
00:39:29 - should serve your informational needs because the controller wants to optimize
00:39:34 - its information processing because that's the most expensive thing it's facing. Correct?
00:39:39 - It's a mixture. I mean, we looked at Lagrangians. So we looked at a mixture
00:39:46 - of goals or goal rewards versus informational needs.
00:39:52 - It can go just for informational needs, but in that case, you can just do nothing, for example.
00:39:57 - In the case of empowerment, it doesn't care about informational needs.
00:40:00 - In fact, it's completely orthogonal to that. It says, this is my goal.
00:40:04 - It produces a goal from this prediction. direction.
00:40:07 - I have not made a statement on how to balance informational needs and empowerment.
00:40:13 - It's possible to do that. We have some work in that direction.
00:40:16 - Then you get some salient strategies emerging. This is work by Tom Anthony.
00:40:22 - But the interesting point is really at this stage that we want first to understand
00:40:27 - the ingredients before we try to build a synthesis of things which we individually
00:40:33 - don't understand. Right. Okay.
00:40:35 - Fair enough. So basically what you're saying is, look,
00:40:39 - within the informational perspective, right, I
00:40:42 - can make a normative statement of how I should bring these
00:40:45 - things together and how I can use my memory to do that. Yeah, it's
00:40:48 - just one view, yes, on
00:40:51 - this system. Right. And there
00:40:54 - you made the point that there might not be a free lunch, but
00:40:57 - sometimes there's free beer. So what
00:41:00 - does that mean relative to optimization of information? That
00:41:04 - was essentially an allusion to the embodiment example,
00:41:06 - so the twisted world example, where basically,
00:41:09 - for the agent, if the world is
00:41:12 - simply organized, so there is a labeling of conditions which is consistent over
00:41:18 - the world, then the agent can solve certain natural problems very, very easily,
00:41:23 - with little information processing. When you do the relabeling, which in terms
00:41:28 - of what we would call traditional
00:41:31 - AI is completely equivalent, people would say it doesn't
00:41:33 - make a difference. It turns out, for an embodied agent, if
00:41:36 - you take the embodiment seriously, so the actions north and east are essentially
00:41:42 - a local coordinate system of the agent, which the agent takes with it, if that's
00:41:47 - completely screwed up and twisted around the world, the agent will
00:41:55 - have a much harder time performing the same task. In other words,
00:42:01 - your world, if it's well designed, if your embodiment is
00:42:04 - well designed, or your world is nice to you, that's the
00:42:07 - way I like to say it, then your cognitive cost
00:42:10 - is so low that you can easily solve a task that actually looks quite difficult.
00:42:15 - And that is something that of course the group of Pfeifer and many others have
00:42:19 - made for a long time, but I believe information theory gives us a window
00:42:25 - into why it's such an advantage.
00:42:28 - It really reduces the cognitive load we need to solve the task.
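The twisted-world point can be made concrete with a very small sketch. This is not the experiment discussed in the episode: the four-cell corridor, the "always move east" task, and the alternating relabeling are assumptions made purely for illustration. The code measures how many bits of location information the agent's action choice must depend on, as the mutual information between cell and chosen action label.

```python
from collections import defaultdict
from math import log2


def mutual_information(joint):
    """I(S; A) in bits from a dict mapping (s, a) -> probability."""
    p_s, p_a = defaultdict(float), defaultdict(float)
    for (s, a), p in joint.items():
        p_s[s] += p
        p_a[a] += p
    return sum(
        p * log2(p / (p_s[s] * p_a[a])) for (s, a), p in joint.items() if p > 0
    )


CELLS = range(4)  # hypothetical corridor of four cells, visited uniformly


def east_label_consistent(cell):
    """Consistent world: the action labelled 0 means 'east' in every cell."""
    return 0


def east_label_twisted(cell):
    """Twisted world (assumed relabeling): which label means 'east'
    alternates from cell to cell."""
    return cell % 2


def policy_information(east_label):
    """Bits of location the agent must use in order to always move east."""
    joint = {(cell, east_label(cell)): 1 / len(CELLS) for cell in CELLS}
    return mutual_information(joint)


consistent_bits = policy_information(east_label_consistent)
twisted_bits = policy_information(east_label_twisted)
print(consistent_bits, twisted_bits)
```

With consistent labels the agent can pick the same action everywhere and needs no location information at all; with the alternating labels it needs one bit of location to solve the identical task.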
00:42:33 - Yeah, but I could argue that isn't that almost a trivial statement because if
00:42:40 - you get information for free then it's easier.
00:42:43 - Well, you don't get this explicitly for free. The information is implicit and
00:42:48 - the physicists would call it a gauge symmetry.
00:42:52 - So an agent that has its actions Basically, when you move the agent around the
00:42:57 - world, these actions keep a certain type of meaning.
00:43:01 - If that meaning completely is perturbed or shuffled around from the movement
00:43:06 - of the agent, then this meaning doesn't help you.
00:43:12 - This is what we have in the twisted world. In that case, the agent basically
00:43:16 - is moved around, but the actions north, east, south, west lose their meaning in any other location.
00:43:22 - They mean different things. On the other hand, if you have basically anything
00:43:26 - that takes its action with it, it's like a local gauge symmetry,
00:43:30 - basically saying, I'm taking this property of north means roughly the same thing in this world.
00:43:36 - Now, in which sense does it mean the same thing?
00:43:41 - We haven't properly defined that. But what we did say, we measured the information impact.
00:43:46 - That is very visible. So it's not saying, this is the number of bits that the
00:43:52 - world actually gives you.
00:43:54 - We say, if the world has this kind of symmetry or pseudo-symmetry.
00:43:59 - Then you gain in terms of cognitive load: your cognitive burden is reduced.
00:44:05 - Right. My hope would be that there would be at some point a way of actually
00:44:09 - writing down, saying this is how much your world actually tells you up front. Right, exactly.
00:44:18 - So if we forget about the embodiment for a second, what is the specific informational
00:44:26 - quantity that's being now optimized?
00:44:29 - Is it like the description length of the information I deal with?
00:44:32 - Is it, let's say, the information gain I have per step size?
00:44:39 - In our examples, it was what we call relevant information, the information you
00:44:43 - need to take an action at a certain average utility level.
00:44:50 - So that's basically a pull quantity. It tells you how much information do I
00:44:55 - need from the environment to perform actions at that level.
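As a rough illustration of relevant information as described here, the sketch below compares two hand-picked policies on an invented one-step task and reports, for each, the average utility and the information about the state that the action carries. Everything in it (the four states, the parity-matching utility, the two policies) is an assumption for illustration; the actual quantity is a minimum over all policies reaching a given utility level, which this sketch does not compute.

```python
from collections import defaultdict
from math import log2


def mutual_information(joint):
    """I(S; A) in bits from a dict mapping (s, a) -> probability."""
    p_s, p_a = defaultdict(float), defaultdict(float)
    for (s, a), p in joint.items():
        p_s[s] += p
        p_a[a] += p
    return sum(
        p * log2(p / (p_s[s] * p_a[a])) for (s, a), p in joint.items() if p > 0
    )


STATES = range(4)   # hypothetical world states, equally likely (2 bits)
ACTIONS = (0, 1)


def utility(s, a):
    """Assumed task: an action pays off when it matches the state's parity."""
    return 1.0 if a == s % 2 else 0.0


def evaluate(policy):
    """Return (expected utility, I(S; A) in bits) for policy(s, a) = P(a | s)."""
    joint = {(s, a): policy(s, a) / len(STATES) for s in STATES for a in ACTIONS}
    expected_utility = sum(p * utility(s, a) for (s, a), p in joint.items())
    return expected_utility, mutual_information(joint)


def informed(s, a):
    """Policy that reads the parity of the state and acts on it."""
    return 1.0 if a == s % 2 else 0.0


def blind(s, a):
    """Policy that ignores the state and picks an action at random."""
    return 0.5


informed_utility, informed_bits = evaluate(informed)
blind_utility, blind_bits = evaluate(blind)
print(informed_utility, informed_bits, blind_utility, blind_bits)
```

The state carries two bits, but acting at full utility here only requires one of them; acting at the chance level requires none.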
00:44:58 - Description length is something more complicated than that; we don't look at that at all.
00:45:04 - This will also probably be more related to learning itself,
00:45:08 - rather than to what we would like to call metabolic information. It's
00:45:12 - information that you just process,
00:45:15 - basically on every step. It's like you bring out your trash, you
00:45:19 - process it, and this is what you do every day. It's basically how much time
00:45:23 - do I have to allocate, or how many information resources
00:45:25 - do I have to allocate, for just maintaining the status quo. Okay.
00:45:30 - So then the criterion is really utility, right? I have to sort of optimize it
00:45:34 - to reflect some utility.
00:45:35 - But then how well does this scale if the number of possible goal states increases
00:45:42 - and also when the potential goal states can be contradictory?
00:45:48 - Excellent question. The scaling is something we start to address.
00:45:52 - It's a problem in empowerment. The component has some aspects,
00:45:56 - gets some aspects of that.
00:45:57 - We have various tricks, at least approximate algorithms, which we are using.
00:46:03 - It's probably of the method to be doing the most well-developed one,
00:46:07 - because it turned out to be.
00:46:10 - Now, you used to talk about contradictory goals. It's an excellent question.
00:46:13 - And I'll give you an example. If you are in Barcelona and you
00:46:19 - have to decide whether you go to Lisbon or you go to Granada,
00:46:25 - then I would say that, first of all, you can go to Madrid and then decide.
00:46:30 - So in other words, it's part of the route which you can take without committing to a particular goal.
00:46:36 - Then at some point you have to commit, and then you have to split and make the decision.
00:46:40 - And I do think that organisms, if we believe our goal-relevant information formalism,
00:46:48 - would profit from doing so for various reasons.
00:46:51 - Number one, you don't have to know so much.
00:46:54 - You can concentrate on moving on highways only. For example, the road to Madrid.
00:46:59 - You don't have to learn all the side roads that you would have to take to Granada or to Lisbon.
00:47:05 - First you go to Madrid, and only then you worry about the next step.
00:47:09 - Second, organisms, I think, profit from not committing to a decision too early.
00:47:14 - So if you can avoid committing, it's actually an advantage because it means
00:47:19 - that you can still reorganize, re-decide if another goal pops up.
00:47:25 - So my argument would be contradictory goals, as long as you don't have to make
00:47:30 - the decision now, don't hurt you necessarily. You go just to the intermediary
00:47:36 - goal, sub-goal that supports the goal.
00:47:40 - Okay. But now there might be other constraints on that, right? Like cost.
00:47:46 - It's not only that goals can be contradictory, like I want to go east or west.
00:47:49 - It can also be I can get a higher reward, 10% more than baseline,
00:47:55 - for 20% more metabolic cost.
00:47:58 - Now that might be a bad deal. But if I just go for my utility,
00:48:02 - it might be an acceptable increase, or it might be I have to take a certain risk of damage.
00:48:11 - In order to obtain a certain reward. So it's not only that they are opposing
00:48:17 - within the task domain, if you want,
00:48:20 - but they can also be opposing on different dimensions of survivability,
00:48:23 - to call it that. This is an excellent question.
00:48:26 - I don't think there's one answer to that.
00:48:31 - Risk-taking is an interesting point. I would say,
00:48:36 - if I had to make a blunt statement, that you take only risks if you really think
00:48:45 - that your chances in the future to get that goal are not that high.
00:48:50 - So risk-taking will be higher if you are in a bad position. It will be lower if you are not.
00:48:55 - That's very natural. I think that would also, if you write it down properly,
00:48:58 - emerge from a mixed evolutionary reinforcement framework.
00:49:02 - I would expect an agent that is very confident of continuous and steady growth of power not to risk it.
00:49:10 - In fact, if you look at utility curves, they look typically risk
00:49:15 - averse when you are in positive and high positive values, but they are risk
00:49:19 - friendly when you're not.
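The remark about utility curves can be illustrated with a small worked comparison. The S-shaped utility below (square root for gains, mirrored for losses) and the specific lotteries are assumptions picked for illustration, not something specified in the episode. The sketch checks whether an expected-utility maximizer prefers a coin flip over a sure outcome with the same expected value.

```python
from math import sqrt


def utility(x):
    """Assumed S-shaped utility: concave for gains, convex for losses."""
    return sqrt(x) if x >= 0 else -sqrt(-x)


def expected_utility(lottery):
    """lottery is a list of (probability, outcome) pairs."""
    return sum(p * utility(x) for p, x in lottery)


def prefers_gamble(sure_outcome, lottery):
    """True if the lottery has higher expected utility than the sure outcome."""
    return expected_utility(lottery) > utility(sure_outcome)


# Good position: a sure gain of 50 versus a coin flip between 0 and 100.
gamble_when_ahead = prefers_gamble(50, [(0.5, 0), (0.5, 100)])

# Bad position: a sure loss of 50 versus a coin flip between 0 and -100.
gamble_when_behind = prefers_gamble(-50, [(0.5, 0), (0.5, -100)])

print(gamble_when_ahead, gamble_when_behind)
```

Under this assumed curve the agent declines the gamble when it is ahead and accepts it when it is behind, which matches the qualitative pattern described above.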
00:49:20 - The other point, and this is very interesting, is what is more important,
00:49:26 - the goal or information saving?
00:49:29 - Now, I would answer it this way.
00:49:32 - If you are somebody who is trying to
00:49:39 - run through a door because he wants to catch the train,
00:49:42 - so he runs to
00:49:45 - the door and just, you know, tries to get in the middle of the
00:49:47 - door so that he has enough space, and doesn't bother about getting stuck a little or
00:49:51 - something like that, and tries to get the train, that's it. So he probably
00:49:56 - will go for information saving. Yes, he wants to be fast, but he doesn't have the
00:49:59 - time and hasn't practiced it. But imagine an Olympic sport which is
00:50:03 - running through doors in the shortest time
00:50:05 - possible, reaching a train that leaves in exactly 15
00:50:09 - seconds, and people train for 10 years for
00:50:11 - the Olympics of running through doors and catching the train.
00:50:14 - Catching the train. I bet you
00:50:17 - these guys will not care about the information cost. They will
00:50:21 - basically run through the door in a way that will optimize their
00:50:24 - throughput. So they will run through the door, or whatever, gripping the
00:50:27 - handle with a hand so they can swing
00:50:30 - around in a very specific way, and train probably to
00:50:33 - leave exactly at the angle of 63.5 degrees so that they
00:50:36 - will be exactly propagated into the just
00:50:39 - closing door of the train. In other words, yes, if
00:50:43 - I do it ad infinitum, then I don't care about saving information. Right. But if
00:50:48 - I'm in my default behavior, that's one of many possible actions I do, and that's
00:50:52 - the typical behavior for living organisms, they're not playing Olympics usually,
00:50:56 - then I take the one that's cheaper.
00:51:01 - Right. Okay, so we have constraints now on how to optimize information processing.
00:51:09 - And already you indicated that embodiment itself can be a source of constraints
00:51:14 - that help you to optimize information processing within the decision-making system.
00:51:22 - But now there's sort of a hidden assumption there that the world is actually
00:51:27 - Markovian so far.
00:51:30 - Right? Not exactly, no, no, it's not.
00:51:35 - So we have done that for systems which in principle are Markovian.
00:51:40 - If you do relevant information according to the formulas we've shown,
00:51:44 - what we do is the sensor is actually sub-Markovian.
00:51:48 - So you choose a sensor that just picks out the information it wants and that
00:51:53 - creates a non-Markovian world.
00:51:56 - However it doesn't care. So the information in the models we have looked at,
00:51:59 - the sensor essentially will be less information-carrying than the world,
00:52:05 - but it has the freedom of choosing information it wants.
00:52:11 - But you know the upper bound for your information stays constant,
00:52:14 - right? The upper bound is a full… Because the world is predictable.
00:52:17 - In principle, yes. However, you can do the same thing with a limited sensor.
00:52:22 - And what you get there is typically that your relevant information goes up, goes up, not down.
00:52:29 - A surprising result, from Christoph Salge: you get more relevant information
00:52:35 - if your sensor is incomplete, because essentially your quality of information is worse.
00:52:40 - You can't choose the information you want; you get a worse set of information.
00:52:44 - So you have to look at a bigger part of it to get the same quality of information.
00:52:50 - But that would mean that, let's say, locally you have less information,
00:52:54 - but let's say collectively over all your sensors, you gain information.
00:52:59 - Not necessarily. You gain what we call piggyback information.
00:53:03 - Piggyback information is information that's not useful for the original goal.
00:53:07 - But you have to collect it to be able to reach your original goal at the desired level.
00:53:13 - It's kind of like when you order a laptop from a company:
00:53:18 - you don't get just the laptop, you get a big box with lots of styrofoam, which you don't want, but you get it.
00:53:24 - And it's a bit like that. So this extra information comes with it.
00:53:28 - But the extra information is correlated with the goal relevant information.
00:53:32 - It's not uncorrelated. It is correlated, but
00:53:37 - you could throw it away and just keep the core valuable information.
00:53:44 - But the problem is, throwing it away is a waste. You processed it already. It's already there.
00:53:50 - So can you do something with it? And we claim that it gives you an opportunity for exaptation.
00:53:56 - So I'm using it for other purposes than the original goal.
00:54:02 - So open-ended evolution and that it may be a driver for pushing sensors to the
00:54:08 - maximum refinement without requiring this to be an explicit.
00:54:14 - Evolutionary pressure which would answer why you may have very good sensors
00:54:20 - although there's not an obvious reason why you'd have to have a maximum resolution,
00:54:25 - right so I get that but why I brought up this idea of the predictable world
00:54:31 - or this Markovian assumption is that,
00:54:35 - in terms of a normative framework where you want to dictate in some sense the
00:54:40 - principles along which the system has to optimize itself,
00:54:43 - maybe in a Markovian world those principles are collectively different than
00:54:47 - in a non-Markovian world because in a non-Markovian world I am forced to explore.
00:54:52 - Okay, yeah. To a larger extent and following maybe different procedures than
00:54:57 - I can do in a predictable world, right?
00:55:00 - Now, if I'm forced to explore, this might compromise my optimization norms for
00:55:07 - my information processing system.
00:55:09 - Absolutely. So how do you see that trade-off between then the ability to explore
00:55:14 - in an unknown world or partially unknown world while optimizing my information processing?
00:55:21 - I would like to make a comparison with business.
00:55:24 - In business, you have two components. You have basically the normal money flow.
00:55:30 - And you have the investment. And investment is what we would call exploration.
00:55:35 - If you have lots of extra resources, you can invest a lot in trying to basically
00:55:41 - reach out to new markets.
00:55:44 - But sometimes you don't have the reserves, and then you just live on your metabolism.
00:55:48 - You don't invest anything into exploring.
00:55:51 - So I don't think there's a single answer to that, because exploration is not
00:55:56 - per se a value. It's a value for two reasons: the world does change,
00:56:02 - and we may want to be ready for this change.
00:56:07 - And also because, well, we just can.
00:56:12 - We have the resources, the extra resources that we can use,
00:56:15 - like Christopher Columbus used when Spain finished the war,
00:56:21 - the Reconquista. They had extra ambition, extra money. Portugal was already
00:56:26 - starting to explore navigational routes.
00:56:29 - And the Spanish thought, okay, we have the extra money, let's invest it.
00:56:33 - But I think if you operate on the verge of survival, you don't explore. You just try to survive.
00:56:42 - Or if you're in a very, very well exploited niche; there are some examples of that.
00:56:50 - One issue I see there is that your exploration norm might actually work orthogonally
00:56:58 - to your exploitation norm.
00:57:00 - And this is, of course, a conflict that you have to resolve in some way if you
00:57:04 - talk about the controller, right?
00:57:06 - So, of course, you can then say, well, under survival pressure,
00:57:10 - exploration will be minimized.
00:57:11 - But at baseline, let's say, you might want to explore so you actually know how
00:57:15 - to escape in the future more efficiently or what have you, right?
00:57:19 - So, in some sense, it also means you might want to break stable states in your
00:57:27 - optimized information processing in order to, let's say,
00:57:30 - identify new models by which you can describe your environment.
00:57:36 - Which in some sense relates to the question I had this morning about three-year-olds
00:57:40 - having absolutely no memory about the period before that time.
00:57:44 - Because there's some state transition there in human memory,
00:57:47 - right? So the information processing optimization might face a similar catastrophic
00:57:52 - forgetting phase that is absolutely required to abduct into a new level of operation.
00:57:59 - I do think that this is an interesting question. I view the process of doing
00:58:05 - stuff as you have a point, this is where you are, say, and the question is,
00:58:10 - do you have knowledge about the environment of where you could move to?
00:58:14 - This lateral or virtual
00:58:18 - knowledge of what could happen if I
00:58:21 - would move there, I think, is very important for whether you want to explore
00:58:25 - or not. So if the other options are pretty good, there's no reason not to send
00:58:30 - out some species or spend some
00:58:33 - time in nearby solutions. If you know the solution is just the best, then
00:58:39 - exploration becomes a problem. It's like you have a company which has an incredibly successful product,
00:58:44 - and it's very clear that any modification of the product will
00:58:48 - not be good. And that does happen, and this company then has a big problem actually finding a new niche.
00:58:55 - They have this problem. So why am I taking the company? Because in the company
00:59:00 - there are sentient beings who control it, and they can actually do a forward
00:59:03 - model. Evolution, as far as we know, is limited in forward models. Perhaps there's
00:59:08 - some local forward modeling it can do in sexual selection, but it's very limited.
00:59:14 - So it would really depend. If the optimum is sharply defined,
00:59:19 - I think you have a problem with exploration.
00:59:21 - Exploration can only happen in big steps, and I don't see, for example,
00:59:25 - a local algorithm like evolution doing that.
00:59:28 - Humans do something else.
00:59:31 - When humans explore, because of the superior wiring or the more intricate
00:59:37 - wiring of what we call M, the memory,
00:59:39 - they can modify the topology of a search space, and suddenly things that are
00:59:44 - far away become close. So the moment I have Newtonian dynamics, suddenly the concept of
00:59:53 - ballistic flight is something that's close, where otherwise it would be just a local
00:59:59 - trying to get something somehow right. Now suddenly I can make a prediction:
01:00:03 - yes, I can throw a stone to the moon in principle if I do it with the right type of energy.
01:00:09 - In other words, this is a qualitative transition in this concept space,
01:00:16 - which we cannot see in evolution.
01:00:19 - Evolution doesn't have it. So in evolution, everything must be somehow locally visible.
01:00:23 - There must be local hints that an exploration
01:00:28 - can be successful. If there are no local hints, then the poor
01:00:31 - species will be going extinct if something
01:00:34 - changes. We can see that in cycles of,
01:00:36 - say, parasite-host cycles
01:00:40 - that are very, very strongly linked, or
01:00:42 - orchids and hummingbirds, which
01:00:46 - are very, very tightly linked, and they can't actually
01:00:49 - separate. To take out one member of the
01:00:52 - ecological web that will never disappear is,
01:00:55 - no, no, very unlikely they
01:00:58 - will end up. Right, so then, okay,
01:01:02 - so now we have this, the informational perspective on, if
01:01:05 - you want, optimal decision making. But now you also brought in this notion of
01:01:09 - empowerment, which sort of is a complement to this informational perspective.
01:01:13 - So what is unique about this empowerment notion, and where did it come
01:01:17 - from? The original idea, it's very funny to say, came originally
01:01:21 - from this robot football project.
01:01:24 - We had the wish to model agents that go to the ball and kick it without having to tell them so.
01:01:32 - So we wanted to avoid an external reward function.
01:01:37 - So the idea became more formal when we introduced the perception-action loop
01:01:44 - and the information view.
01:01:45 - It was clear that this model would very naturally represent
01:01:49 - this idea by having a set of potential
01:01:55 - actions in the future, and how much change they could possibly invoke in the environment;
01:02:01 - in other words, how much can you influence the environment. And it was very clear
01:02:06 - also that you need to see the influence. If you can't see it, it doesn't count. And
01:02:10 - it was very natural to take,
01:02:12 - basically from the social sciences, the concept of empowerment. Empowerment means
01:02:17 - that people, disenfranchised people, for example, realize they can change the situation they are in,
01:02:23 - and they can also perceive that change.
01:02:27 - The perception is also important. It's a subjective measure.
01:02:30 - And this measure turned out to be surprisingly effective.
01:02:37 - We tried it in various scenarios, something like 11 or 12 different scenarios,
01:02:42 - and it seems to really produce behavior, motivated, self-motivated behavior,
01:02:47 - which does not require an external reward function. It basically produces goals.
01:02:51 - You give it a dynamics, and it gives you goals, more or less.
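A minimal sketch of the measure described here, assuming the usual formalization of empowerment as the channel capacity from an agent's actions to its subsequent sensor readings. The function name, the three toy channels and the iteration count are illustrative assumptions, not anything specified in the conversation. Capacity is computed with the Blahut-Arimoto iteration; the "blind" channel shows the point that influence you cannot perceive does not count.

```python
import math

def empowerment_bits(p_sensor_given_action, iterations=200):
    """Channel capacity (bits) from actions to sensor readings via Blahut-Arimoto.

    p_sensor_given_action[a][s] is the probability of perceiving s after action a.
    """
    n_actions = len(p_sensor_given_action)
    n_sensors = len(p_sensor_given_action[0])
    q = [1.0 / n_actions] * n_actions  # start from a uniform action distribution

    def divergences(q, log):
        # D( p(s|a) || p(s) ) for every action a, under action distribution q
        marginal = [sum(q[a] * p_sensor_given_action[a][s] for a in range(n_actions))
                    for s in range(n_sensors)]
        return [sum(p * log(p / marginal[s]) for s, p in enumerate(row) if p > 0)
                for row in p_sensor_given_action]

    for _ in range(iterations):
        d = divergences(q, math.log)
        weights = [q[a] * math.exp(d[a]) for a in range(n_actions)]
        total = sum(weights)
        q = [w / total for w in weights]  # re-weight actions toward informative ones

    d_bits = divergences(q, math.log2)
    return sum(q[a] * d_bits[a] for a in range(n_actions))

# Hypothetical two-action, two-reading channels.
perfect = [[1.0, 0.0], [0.0, 1.0]]  # every action has a distinct, visible effect
noisy = [[0.9, 0.1], [0.1, 0.9]]    # the effect is seen correctly 90% of the time
blind = [[0.5, 0.5], [0.5, 0.5]]    # the reading is unrelated to the action
for name, channel in [("perfect", perfect), ("noisy", noisy), ("blind", blind)]:
    print(name, round(empowerment_bits(channel), 3))
```

The perfect channel yields one full bit, the blind channel yields zero, and the noisy channel lands in between.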
01:02:57 - The idea behind it is, if you have an organism, how does this organism choose its goals?
01:03:02 - Of course, there are some fundamental goals, like finding enough food and mates
01:03:06 - and so on. These are fundamental goals. And what you do comes from that.
01:03:10 - And it seems to be that empowerment works: maximize your options, because that maximizes
01:03:17 - the states you can reach in the next step. If some states go away or your niche
01:03:21 - gets smaller, it increases your chance of getting out of this situation. And that
01:03:26 - seems to work surprisingly well. So the motivation was: can we understand, from
01:03:32 - an evolutionary point of view, with very
01:03:36 - limited assumptions, how organisms can generically create their own goals when
01:03:43 - there is not a very clearly defined goal like eating or fleeing a predator or something like that?
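The "maximize your options" idea can be sketched for the simplest possible case. Assuming a deterministic, fully observed world, n-step empowerment reduces to the logarithm of the number of distinct states reachable by n-step action sequences. The grid, the action set and the function names below are hypothetical choices for illustration only.

```python
import math

# Hypothetical action set: stay, east, west, north, south.
MOVES = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]

def step(cell, move, size):
    """Deterministic move on a size x size grid; walking into a wall leaves you in place."""
    x, y = cell[0] + move[0], cell[1] + move[1]
    return (x, y) if 0 <= x < size and 0 <= y < size else cell

def n_step_empowerment_bits(cell, horizon, size):
    """log2 of the number of distinct cells reachable by horizon-step action sequences."""
    reachable = {cell}
    for _ in range(horizon):
        reachable = {step(c, m, size) for c in reachable for m in MOVES}
    return math.log2(len(reachable))

size, horizon = 5, 2
for cell in [(2, 2), (2, 0), (0, 0)]:  # centre, edge, corner
    print(cell, round(n_step_empowerment_bits(cell, horizon, size), 2))
```

In this toy room the centre can reach 13 cells within two steps and the corner only 6, so an agent climbing this measure drifts away from walls and corners without being given any explicit goal.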
01:03:49 - So that means if we take as an example tool use, right?
01:03:53 - So I encounter an object and empowerment would then tell me,
01:03:57 - well, you can learn that this object will have a certain affordance.
01:04:01 - That means relative to your morphology and your goals, you can achieve a certain
01:04:05 - objective with that, right?
01:04:07 - So that's what you do. You don't necessarily learn just the local properties of that object.
01:04:11 - You learn how to embed it within your own affordance repertoire. Yes. But now, so empowerment then
01:04:22 - allows you to incorporate objects. But how well does that again scale, given the
01:04:29 - same issues also in the informational sense? Because depending on how you process this,
01:04:34 - how you represent this, how you segment across different objects, you might have capacity
01:04:39 - limitations. Yes. So how does it scale? Well,
01:04:44 - empowerment itself, as it's defined, doesn't
01:04:47 - care about cost. So it's really, as you said, more formal to the other view. You
01:04:51 - can combine it: you can put a cost or a kind of cost limitation on how many action
01:04:57 - sequences or potential actions you want to consider. And when you do that,
01:05:00 - we get interesting results, namely dominant strategies,
01:05:04 - which are particularly effective in changing the world.
01:05:08 - So you won't probably remember some kind of weird wobbly movement that happens to move somewhere.
01:05:17 - You will remember really a clear, well-directed, well-established movement that
01:05:23 - clearly changes the world to one state rather than to another.
01:05:27 - So that's something that actually this limitation gives you.
01:05:31 - We have also various tricks and algorithms and approximations how to calculate
01:05:36 - empowerment, also in the continuum.
01:05:37 - And this is being developed because now that it's established that empowerment
01:05:42 - really does a lot of cool behaviors, it's worth investing and actually scaling it up.
01:05:49 - And some tricks allow us, for example, to push empowerment forward many hundred
01:05:54 - steps or something. There are quite a few tricks. These are really,
01:05:57 - really drastic approximations, but they give you qualitatively,
01:06:02 - again, sensible results. And this
01:06:05 - is what an organism needs. An organism will not optimize this function
01:06:08 - to the very best. It wants some kind of thing that
01:06:11 - works, that's good enough. It's good enough, yes. But now, so for empowerment also,
01:06:17 - the way you conceptualize this, it's like injecting information into the world
01:06:22 - and recovering the result, right? And then in some sense you can frame it again
01:06:27 - in a compatible framework of information processing.
01:06:32 - But now the exploration that I have to engage in to understand what this object
01:06:38 - might contribute to my goals will take a certain amount of time.
01:06:44 - So how rapidly does such a process converge and how does it also depend on the
01:06:49 - degrees of freedom that the object would afford?
01:06:52 - Learning is not yet part of the model except for this one example that I showed
01:06:56 - you with a pendulum, where it actually learns how to model the forward model.
01:07:02 - When you say a concrete goal, we have not yet linked concrete goals to a goal,
01:07:07 - which is something that needs to be done, of course.
01:07:10 - But in real life it's also similar. Imagine you play some new game that you
01:07:13 - just learned the rules of.
01:07:15 - I won't mention Go, you probably know how to play; I'll mention Focus,
01:07:19 - which is a really nice game.
01:07:21 - We once did it many years ago as an exercise for our students.
01:07:26 - And the point about the game is that none of the students knew the game.
01:07:30 - We didn't know the game. There are no libraries, opening libraries.
01:07:34 - So we really had to learn this game from scratch.
01:07:38 - And it was very interesting. In the beginning, it just looked like random walks.
01:07:42 - You do something, something happens.
01:07:44 - And after four or five games, you as a human player start to see structures.
01:07:49 - You start to see, oh, this does this, this does that.
01:07:52 - This does that, and you start to pick up the
01:07:55 - salient points. And this is where empowerment would come in. It would basically
01:07:59 - say, these are the salient points of the world. Of course it doesn't solve the
01:08:02 - problem of actually beating your opponent, but it structures it. It tells you,
01:08:07 - okay, these are the points from where I can try to see whether I can beat my
01:08:11 - opponent. So the argument would be, it creates a broad
01:08:16 - road map or milestones which tell me this is where I want to be if I want
01:08:23 - to have control over these and these states of the game.
01:08:27 - And then you can ask, can I get to the goal there? So the argument is always, you are local.
01:08:31 - Your understanding is always local. Do these
01:08:38 - landmarks in my mental map give me a hint how I'm getting closer to whatever
01:08:44 - goal I may want to achieve?
01:08:46 - And this is, in my opinion, how we are able to learn very abstract games or
01:08:52 - math or things like that, that we get these landmarks where we go to,
01:08:59 - and then we start mapping out where from these landmarks are sub-landmarks,
01:09:04 - and which sub-landmarks are conceptually closer to where we want to go to.
01:09:10 - It's purely hypothetical, but I think that's the way to probably look at empowerment.
01:09:16 - Empowerment itself finds the landmarks, the main ones, doesn't find the goals,
01:09:20 - or it creates the landmarks as a goal.
01:09:22 - But of course, when you have a specific goal, it just may give you a way of getting there.
01:09:27 - So, but now, if we look at these two frameworks, right, this
01:09:30 - information-theoretical framework that talks about optimizing information processing, linking
01:09:34 - sensory states to actions, and then
01:09:37 - we have this more embodied, action-oriented empowerment
01:09:40 - view, right, and they're orthogonal. But empowerment is something that also challenges
01:09:46 - the informational view, because empowerment is also telling us, well, there's a lot
01:09:50 - of information really in the embodiment, in the action, in the world, that is offloading
01:09:55 - the informational processing that is going on.
01:09:58 - So maybe this whole emphasis on this very centralized view on behavior,
01:10:03 - where it all has to happen in this cognitive engine that is optimizing informational,
01:10:08 - maybe this is so far at an extreme of our search space, right?
01:10:13 - If we talk about embodied action, that maybe your empowerment notion will sort of invalidate it.
01:10:21 - How do you see that? Well, it's an interesting question. Why do we have agents at all?
01:10:25 - Why is there a concept of an agent in physics?
01:10:29 - Why did they emerge? In my opinion, it's because in a way, some type of information
01:10:38 - likes to be accumulated.
01:10:40 - So essentially, like wants to like, if you like.
01:10:45 - An organism, why does it want to procreate?
01:10:50 - Frankly, I think the reason is because there are some processes that basically
01:10:54 - parasitize, or are parasites in a way, on the physical world.
01:10:58 - And these parasites, they like to continue doing so.
01:11:04 - Because that's what a parasite does. It wants to propagate its unique way of life.
01:11:09 - Even if you, the physics doesn't care. So it's not antagonistic either.
01:11:14 - If a parasite parasites another organism, that other organism may not like it.
01:11:18 - And so that will try to get rid of it, of course.
01:11:21 - So then it becomes antagonistic, at a meta level.
01:11:25 - But I don't think that you can make a unique statement on, oh,
01:11:30 - empowerment is one that's giving you the one perspective, the other one is getting the other.
01:11:37 - They operate and they may have different parameters, different time scales.
01:11:43 - For example, the increase of your bandwidth of processing is slow.
01:11:47 - You can't just make your brain twice as big.
01:11:52 - Unlikely, perhaps with CRISPR we can at some point try that.
01:11:55 - I think it would be ethically questionable, but in principle not to try.
01:12:01 - What's fast? Empowerment is relatively fast. You need a forward model.
01:12:06 - If you look at information preservation and information saving, that's something we do
01:12:12 - subconsciously. When we learn something, we use up a lot of bandwidth.
01:12:16 - Once we know how to do it, we use very little bandwidth because it's probably
01:12:20 - rewired, reorganized in such a way that it will eat up less information.
01:12:25 - So I would say learning to grab a complicated object or handle a complicated
01:12:30 - object will be very bandwidth intensive, takes a long time.
01:12:34 - And it's translated somehow, it's rewired in a
01:12:37 - way that internally will use less information.
01:12:40 - So in other words, this process happens
01:12:44 - all the time, and what the time constants are
01:12:46 - and what the weights are, it's not something I
01:12:49 - would be able to speculate on. Right. But now, if we try to map
01:12:56 - this concept to physical systems like the brain, would it imply that
01:13:03 - the brain tries to optimize its mean activity level? Is it really that isomorphic,
01:13:10 - or not? No, no, no, no.
01:13:13 - I think the way to think of it is slightly different.
01:13:18 - If I have two brains, one uses a lot of information processing to do something,
01:13:23 - the other one uses very little.
01:13:25 - The other one has an optimized way of doing it. It's clear that the other one has an advantage. Why?
01:13:33 - Because it can handle other tasks too. It can learn additional tasks.
01:13:37 - It can concentrate on other tasks. It can react to danger faster.
01:13:42 - So it has lots of advantages indirectly. So, both brains may be, for example, two twins.
01:13:49 - One twin has learned how to play tennis many years ago. The second twin is just
01:13:53 - learning it. They play against each other.
01:13:55 - Well, who's going to win? Of course, the one who spends less time thinking about his moves.
01:14:01 - Very simply so. That's because he has had the opportunity to compress and squeeze
01:14:09 - and optimize his information flow.
01:14:11 - And he can also perhaps even be talking on the phone and upsetting his twin brother
01:14:17 - this way, while the twin brother is sweating and trying to just catch the ball.
01:14:25 - So in other words, the advantage is not necessarily energetic.
01:14:31 - It's advantage in many dimensions.
01:14:34 - Information theory is basically just saying, with these resources,
01:14:39 - if you have them, that's how much you can process, and that's how well you can process. Right.
01:14:46 - So Daniel, you also, in parallel to your theoretical work, you're now also the
01:14:52 - president-elect of the RoboCup organization,
01:14:55 - which of course is an understandable concern because this is also very much
01:15:00 - about testing a lot of these ideas in the real world.
01:15:03 - But why do you invest effort and time in sort of advancing this notion of RoboCup?
01:15:11 - Why is that so important to you?
01:15:13 - I do believe that we have several advantages by having that.
01:15:16 - First of all, we have a direct comparison.
01:15:19 - You can essentially come up with all kinds of algorithms which work in a lab
01:15:24 - under very controlled conditions.
01:15:26 - But at the end of the day, they will have to work in the field.
01:15:31 - And you can compare: does it actually work? I can predict,
01:15:34 - say, that it works if I have these and these and these and these constraints.
01:15:38 - But on the field, there's no excuse. Either it works or it doesn't.
01:15:42 - And you see it immediately.
01:15:43 - You see, oh, this guy has, this group has an excellent vision system or this
01:15:49 - group has a very good walking system and then you can learn.
01:15:53 - And even if you don't copy their system exactly, you can pick up.
01:15:58 - Ideas at various levels of abstraction, either concrete code,
01:16:02 - if that's being published, for example, which happens in some groups and leagues,
01:16:06 - or else by seeing, oh, this is a concept that essentially everybody else is now using.
01:16:13 - Let's take a very simple example. Omnidirectional drive used to be a concept like that.
01:16:17 - At the beginning, it was not obvious for the mid-size league; later everybody introduced it.
01:16:23 - It's a very simple example, but there are more intricate ones.
01:16:26 - The second thing is, I do believe that a lot of interesting questions emerge.
01:16:34 - You have a relatively clearly defined task, a relatively clearly defined world,
01:16:39 - and yet it's very complicated.
01:16:41 - You have to get several things to work at once.
01:16:43 - And it's very motivating to think about, okay, what do I need in principle if
01:16:49 - I want to have such a machine to learn something like that from scratch?
01:16:53 - Of course, many teams write code to win the competition, so they have to be
01:16:58 - very specific about how to do that.
01:17:01 - On the other hand, I think it's a great motivator to think, if I have a robot
01:17:05 - that is not just running forward, I mean, if we look at walking robots, that's what they do.
01:17:10 - A football robot cannot just walk forward. It has to understand what a sidekick
01:17:14 - is and when to do it, when to use it.
01:17:16 - Okay, this is all done by hand, but in principle, the challenge is, what at all?
01:17:19 - When you're a footballer, you do
01:17:22 - that by instinct. You train also a
01:17:26 - lot, but you have many things that you just do in the moment,
01:17:29 - at the opportunity. And I do think that this context switching that
01:17:33 - happens all the time is one of the major challenges of AI. So I think if we can
01:17:38 - address that in a proper way, so we can move away from hand-crafted behavioral
01:17:43 - rules to a more automatic,
01:17:47 - autonomous decision of how to switch contexts from,
01:17:50 - say, walking to stopping to kicking to whatever,
01:17:54 - we will have made a big step ahead in AI.
01:17:56 - And finally, my personal view, that's a very, very personal view,
01:18:01 - it's not official or anything: I believe that we need new materials, new algorithms,
01:18:09 - and new types of embodiment for robots to be actually able to achieve such
01:18:15 - a level of competence. And with the big goal for 2050, to actually play and possibly
01:18:21 - win against the world champion,
01:18:23 - it will push this envelope much more strongly than if we say, yes, we
01:18:29 - need soft materials, but, yeah, at some point, when it's ready. Rather, having
01:18:34 - this perspective gives you an incentive to actually try these materials a bit earlier, of course.
01:18:40 - But when will the first robot team win the Champions League?
01:18:44 - Well, the Champions League is beyond 2050.
01:18:50 - In 2050, the goal has been declared, playing against the world champion,
01:18:54 - the human world champion, and win.
01:18:56 - It's a very ambitious goal, but let's put it this way, when it was declared
01:19:01 - in 1997, people really didn't believe that's even possible.
01:19:05 - There were hardly any humanoid robots in labs. There were probably a handful
01:19:09 - of labs in the world that could afford a humanoid robot. And today,
01:19:13 - humanoid robots are everywhere.
01:19:15 - So, even that already was a huge push ahead in terms of making robotic science more democratic,
01:19:25 - more popular, and actually sell people that, yes, it's possible to make a humanoid
01:19:31 - robot that walks and doesn't fall down all the time.
01:19:33 - But do you think the main challenge is in the biomechanics or in the cognition and the motor control?
01:19:41 - Everywhere. I think biomechanics is a big issue. Energy is a big issue.
01:19:46 - I think having the energy so that the robot can run for 45 minutes uninterrupted is massive.
01:19:53 - So biomechanics, energy is a major issue. But I think the cognition is also
01:19:58 - a major issue because you will have to contend with players like say Messi or
01:20:05 - that really are flexible in their thinking.
01:20:08 - You can optimize in a particular situation something that you can shoot a penalty
01:20:14 - shot without fail, essentially, assuming the hardware doesn't break.
01:20:19 - So you could be better than humans on a penalty, for example.
01:20:23 - It might be possible that we would beat humans on very specific subtasks.
01:20:27 - But in a generic game situation, to make the right decision,
01:20:31 - hopping up, risking life and then to kick a ball above your head into the goal.
01:20:38 - That's something that humans do.
01:20:41 - Good, let's call it a chilena here,
01:20:44 - right? Yeah, yeah, exactly. And then essentially doing
01:20:48 - that as a human player requires a
01:20:51 - lot of guts and nerve and instinct, and
01:20:54 - not every player can do that. So it's very clear that it's a very special skill.
01:20:58 - But now, can a robot football player in the current competitions get a red card? There
01:21:05 - are ways of getting fouls, but right now the fouls are relatively mild.
01:21:10 - It's basically blocking the goal and things like that.
01:21:12 - In the future, and for example in the simulation league, they already introduced
01:21:16 - fouls. I'm not exactly sure how they implement it, but it's automatic.
01:21:22 - If a player kicks another player without going for the ball, I think then it's a foul.
01:21:30 - There are certain rules that Roger implemented.
01:21:33 - And the RoboCup is a bit like the Robot Olympics if you want, right?
01:21:36 - Even though there's a separate competition also with that name.
01:21:38 - And the Olympics also continuously change the disciplines that are participating.
01:21:44 - So do you see also a RoboCup that might be changing that will further expand
01:21:47 - into other domains? Maybe soon we have robot basketball or robot tennis?
01:21:52 - There are changes. First of all, in the main leagues they are actually,
01:21:57 - they become harder and harder.
01:21:59 - That's why if you watch the games, sometimes the games look less interesting
01:22:03 - as time goes by because they become much harder to follow.
01:22:07 - So in football, for example, they took
01:22:10 - away the colors of the goals, they took away lots of
01:22:13 - structure from the field, the ball has no color
01:22:16 - anymore, so it's really a very challenging problem.
01:22:20 - Other leagues come and go. So there are leagues
01:22:22 - that emerge. There was the four-legged league,
01:22:27 - that was the Sony AIBO robots. That league was introduced, I think
01:22:34 - the first demo games were in '98, and it disappeared later when it was superseded
01:22:40 - by a standard platform league,
01:22:43 - which is basically the NAO. That's where you see the NAO robots.
01:22:46 - So in this case, you have a development of the leagues or a disappearance of leagues
01:22:51 - or emergence of new leagues. Like we have a logistics league.
01:22:55 - We have an @Home league, which is concerned with making robots more flexible,
01:23:02 - so flexible they can deal with problems of home robotics, which is,
01:23:07 - of course, a huge challenge.
01:23:09 - It's much easier to develop a robot for industrial settings than for homes,
01:23:15 - because homes are notoriously unpredictable.
01:23:19 - Would you feel that something like a robot war league would be fitting?
01:23:23 - Well, first of all, I must say that war is not a very nice term.
01:23:30 - Set of tests. And the second thing is, first of all, it's destructive.
01:23:36 - I find that a little bit unsettling.
01:23:40 - And also it's an issue of ethics in general. I don't think that we want autonomous
01:23:45 - robots to know how to destroy other entities.
01:23:48 - I think that's where it gets problematic.
01:23:51 - But the other thing is that Robot Wars exists, and it's a remote-controlled
01:23:56 - league, so it's not autonomous. So in other words, RoboCup is fully about autonomy. You want autonomy.
01:24:03 - Why I bring it up is that maybe the real challenge here is about building moral robots.
01:24:09 - That's our real challenge. Because I think on the midterm already,
01:24:15 - we really have to master how to build robots that are autonomously and truly,
01:24:21 - let's say, assistive and moral in their behavior.
01:24:23 - Because to build a destructive robot is easy.
01:24:26 - But our challenge is, how do we control this, and how
01:24:29 - do we make it transparent? And what I'm worried about is that
01:24:32 - right now the robot league, the robot war competitions, are
01:24:35 - a bit sort of in the public media, and military organizations are of course looking
01:24:41 - into these things. It's completely outside the realms of transparent analysis
01:24:47 - and debate, and I think that's an even bigger risk. So I was wondering
01:24:53 - whether it would not make sense to maybe have a league where robots can do damage,
01:24:58 - but they manage, autonomously, to not do it.
01:25:01 - Because this is what we have to master, and we have to drive this debate as researchers.
01:25:06 - We cannot leave it to non-academic institutions to do this behind closed doors.
01:25:13 - No one knows where it's going. And once it hits the streets,
01:25:16 - we have no frameworks to deal with it.
01:25:18 - So that's why I was wondering whether it would not be an idea to at
01:25:20 - least start to think about how to also incorporate this. Even though it is a
01:25:24 - painful issue, it is an insulting issue sometimes, we cannot close our eyes
01:25:29 - to it, I would say. It's really nice that you bring it up. One of the
01:25:35 - statements I made before I was basically selected as the coming
01:25:42 - president is that I think that robot ethics is a major point that should be discussed
01:25:48 - in the... I think it's an excellent opportunity, because even in the football game,
01:25:53 - how much damage am I ready to do to my fellow player if I want to win?
01:25:59 - So it's already appearing there. It's very clear that the concept of fair game is already there.
01:26:05 - I think that robot ethics has the same problems as human ethics.
01:26:09 - How do you prevent them? This is an example I wrote many years ago and repeated,
01:26:14 - and unfortunately, reality has caught up with me,
01:26:19 - with my example. I was saying, how do you ensure that a passenger airline pilot
01:26:23 - does not take the plane and crash it somewhere?
01:26:27 - This was an example I actually brought up. And you don't.
01:26:31 - You don't know. You can't see into the head. You believe that socialization
01:26:34 - helps, that you know the person, that you trust the person, and you believe
01:26:40 - that they have a self-preservation need and so on.
01:26:44 - So there's a whole set of safeguards which we assume.
01:26:48 - But when you take them away, when
01:26:51 - they disappear, when people don't pay attention, bad things can happen.
01:26:55 - So I don't think that robots will be exempt from that. However, I think I see
01:27:00 - a way forward for making robots more ethical, and that's very simple. It's actually
01:27:05 - the same thing that humans need to do to be more ethical, namely,
01:27:09 - basically a generalization of the concept of empathy. And this is just an
01:27:15 - example of how that could look, and again, this is just pure motivation.
01:27:21 - There's nothing well developed; it's just a first glimpse. Christian Guckelsberger
01:27:25 - from Goldsmiths and Christoph Salge from Herts
01:27:29 - have developed a model of, basically, NPCs.
01:27:33 - So you have a video game where players are accompanied by a pet or a companion,
01:27:39 - and the problem is these companions are usually quite stupid. They act
01:27:45 - in a really stupid way. One thing that they did was use empowerment, cross-empowerment,
01:27:52 - as a value function for the pet, for the companion.
01:27:58 - So the companion tries to maximize the empowerment of its master and its own
01:28:03 - too. So it tries not to die, of course.
01:28:06 - And when you do that, it's very interesting. For example, it will shoot an enemy
01:28:10 - that endangers the master.
01:28:13 - It will behave sensibly in various ways.
01:28:17 - One thing, for example, that shows how almost human-like it reacts:
01:28:22 - empowerment looks at and takes into account all possible actions, and so the pet
01:28:28 - by default assumes that the human could be a psycho and kill the pet. So
01:28:34 - the pet also has a sense of self-preservation, so it will accompany and help
01:28:38 - the human, but it will stay out of his shooting direction unless you turn on
01:28:44 - a trust flag, in which case,
01:28:48 - basically, the pet will trust the human and will enter his shooting direction.
01:28:56 - So here we have empathy, of course, because the pet has a model of the human
01:29:00 - and what the human could be doing.
01:29:02 - And if that model is wrong, then yes, I have a problem.
01:29:06 - But at least it's very clear where empathy and trustworthiness and probably ethics come from.
01:29:15 - It's basically knowing what the other one needs and not acting too much against it. Right.
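
Editor's sketch: the companion idea described above can be illustrated with a toy grid world. This is a minimal sketch under stated assumptions, not the model from the work mentioned in the conversation. The 5x5 grid, the names `step`, `empowerment` and `companion_move`, the two-step horizon and the `self_weight` parameter are all hypothetical. It assumes deterministic dynamics, where n-step empowerment reduces to the base-2 logarithm of the number of distinct states an agent can reach. Shooting and the trust flag are not modeled.

```python
import math
from itertools import product

# Hypothetical 5x5 grid world; all names here are illustrative assumptions.
SIZE = 5
MOVES = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]  # stay, then four axis steps


def step(pos, move, blocked):
    """Deterministic move; hitting a wall or a blocked cell leaves pos unchanged."""
    nxt = (pos[0] + move[0], pos[1] + move[1])
    inside = 0 <= nxt[0] < SIZE and 0 <= nxt[1] < SIZE
    return nxt if inside and nxt not in blocked else pos


def empowerment(pos, blocked=frozenset(), horizon=2):
    """n-step empowerment in bits. With deterministic dynamics, the channel
    capacity from action sequences to end states is log2(#distinct end states)."""
    ends = set()
    for seq in product(MOVES, repeat=horizon):
        p = pos
        for move in seq:
            p = step(p, move, blocked)
        ends.add(p)
    return math.log2(len(ends))


def companion_move(companion, player, horizon=2, self_weight=0.5):
    """Pick the companion position that maximizes the player's empowerment
    plus a weighted share of the companion's own empowerment."""
    best_pos, best_value = companion, -math.inf
    for move in MOVES:
        cand = step(companion, move, blocked={player})
        value = (empowerment(player, {cand}, horizon)
                 + self_weight * empowerment(cand, {player}, horizon))
        if value > best_value:
            best_pos, best_value = cand, value
    return best_pos


# A companion standing directly next to the player evaluates its five moves.
print(companion_move((2, 1), (2, 2)))
```

In this toy setting the companion blocks a cell, so standing next to the player removes some of the player's reachable states. Maximizing the combined value makes the companion step aside while keeping its own options open, which loosely mirrors the "help the master but also preserve yourself" behavior described in the conversation.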
01:29:23 - My prediction there is that what we're going to see is that building machines that
01:29:28 - know how to handle trust will be orders of magnitude more difficult than getting the biomechanics right.
01:29:35 - I'm not sure about that. I think trust is a state of mind, so it's a belief
01:29:41 - state. It's basically saying, what do I believe the other one will do? It's
01:29:47 - related to game theory, common knowledge, advantage.
01:29:50 - For example, if I am a player and I have an advantage having a pet,
01:29:56 - my pet knows that I have an advantage having it and so on, then the trust can emerge on its own.
01:30:03 - If I'm a new player, basically a newbie, perhaps the pet will say,
01:30:08 - no, this guy doesn't know that yet, so I'll wait and see how he behaves before I trust him.
01:30:13 - And in real life it happens too. You have a team of people and
01:30:17 - one person is new to the business, a new
01:30:19 - boss. You don't trust him immediately. You see how he behaves, and it turns out, okay,
01:30:24 - he knows what he's doing, he has a good model of the future, and then you start
01:30:28 - trusting. So in other words, I don't think that this is... I mean, the practice will
01:30:34 - be very difficult, of course, as all learning is, but I think conceptually,
01:30:39 - I don't think we are that far away from that, conceptually. The practice is a different story.
01:30:45 - With biomechanics, we are really far away. So I don't think I agree with it. That's fine.
01:30:53 - But I think the other issue that's really important here, why I think this should
01:30:57 - be included in RoboCup, is that if we look at the history of our science,
01:31:01 - we know there's no value-free science.
01:31:03 - And the mistake science has made over the ages is to develop technologies and
01:31:09 - knowledge and then leave it to others to figure out what the ethics of it is.
01:31:14 - This is not working. So if we, as the researchers behind these kinds of machines,
01:31:20 - are not actively engaged in that debate, we will not have normative frameworks when we need them.
01:31:26 - Because, for instance, if we go to bioethics or general ethics of human behavior,
01:31:29 - we have no normative frameworks. We're stuck.
01:31:32 - And I feel that just for that reason, and also given this historical consideration
01:31:36 - that science is not value-free,
01:31:40 - it's also us, the scientists, who now have to engage with that.
01:31:43 - So for that, I think it's important that it gets included.
01:31:46 - But now, so you have a broad set of interests.
01:31:49 - You're driving this whole RoboCup community now forward into the future.
01:31:54 - 2050, you're going to beat the world champion. It's great to have predictions
01:31:57 - that only need to be tested by the time we are in a retirement home somewhere.
01:32:02 - So no one can blame us for making predictions that fail. But now,
01:32:06 - if we would like to follow in a tradition that you represent,
01:32:10 - what would be the Polani law that we have to adhere to?
01:32:15 - Well, I wouldn't formulate that as a Polani law. I think it's a very simple law
01:32:18 - that is much older. That's the golden rule.
01:32:23 - Don't do to others what you don't want done to you.
01:32:26 - But actually, the rule needs to be generalized. Because a machine that can offload
01:32:32 - its memory onto a big computer has a completely different view on survival than a human.
01:32:37 - A human turned off will not be turned on again.
01:32:41 - A machine turned off can very well be turned on again.
01:32:45 - And so I would say the golden rule needs to be generalized.
01:32:50 - Don't do to somebody else what they themselves, according to the best model
01:32:57 - that you can have of them, would not like to be done to them. Okay.
01:33:03 - So it's a law of empathy, essentially. Yeah.
01:33:07 - And then three years from now, we're going to go visit you with Anna,
01:33:11 - who's waving at us there behind the glass.
01:33:12 - And we're going to check
01:33:15 - whether the prediction you're going to make today was confirmed or
01:33:18 - not. So what's the one non-trivial prediction you would like to see tested
01:33:23 - three years from now, that you're going to see confirmed? Three years is a short
01:33:28 - time. Can we increase the period? Four? That's too small. Ten? No, come on, let's do
01:33:36 - it like this: a compromise between three and ten.
01:33:39 - Oh, that's not a compromise. I've made it hard. Well, I think I'm happy to make predictions.
01:33:47 - I'm not so happy putting times on them, I'll tell you. Okay.
01:33:51 - Because I think that discoveries are power law distributed.
01:33:56 - So it's like avalanches. You know they will happen. Or earthquakes: they will happen.
01:34:02 - You don't know when or how big.
01:34:04 - So the prediction that I think is that we will have to completely,
01:34:14 - not completely, but significantly expand our understanding of how to create
01:34:21 - contexts or switch contexts if we want proper AI to emerge, rather than highly specialized,
01:34:30 - highly optimized, one-domain-optimized AI.
01:34:35 - That is something that I'm sure of. I would not put a number on it. Probably, say...
01:34:45 - No, I'd rather not give a number to
01:34:47 - that, because this is not... that may happen. So it's more
01:34:51 - an aspiration of, let's say, if you
01:34:54 - want the general intelligence of this context-independent RoboCup, where they can play
01:34:58 - football and basketball, for example, or even in a RoboCup that can play
01:35:02 - football, but where I don't have to encode how to switch context
01:35:08 - from a stance to a kick to a defense, et cetera.
01:35:12 - If I don't have to do that myself anymore, I would say we made a huge step ahead.
01:35:17 - Not sure whether it's enough for what we would call AI to be complete,
01:35:22 - but I would say without this, it will not happen.
01:35:25 - Okay, great. Daniel Polani, thank you very much for this conversation. Thank you.
01:35:35 - The CSN podcast was produced by the Convergent Science Network of Biomimetics
01:35:41 - and Biohybrid Systems, a project funded by the European Seventh Research Framework Programme.
01:35:49 - For more interviews, recorded lectures, or upcoming conferences in the field
01:35:54 - of biomimetics and biohybrid systems, go to csnnetwork.eu.
01:36:00 - Music.
01:36:01 - And thank you for listening.
