This is the first part of a series exploring Live Discernment, heavily inspired by and a part of Live Theory.
Admissibility of Mathematical Evidence
This was written with help from Sahil K, Aditya Prasad, Aditya Adiga, Kuil & Matt.
When do we say that a mathematical truth is an account[1] of a phenomenon we are interested in? In much the same way that a video can be seen as a true representation of key events, e.g., a wedding, a speech, or a robbery, mathematical artefacts can be viewed as representations of real-world phenomena (things as we see them). These representations can then be used as relevant material in other contexts, for instance, video evidence in court or mathematical theorems applied in physics and engineering. Before the age of AI-generated video, a video carried a high degree of trust as evidence of some activity. If one saw a video of a president giving a speech, one could say with a high degree of certainty that the speaker in the video had indeed made those statements or remarks.
Another compelling example is the use of admissible video evidence in court, such as footage of someone breaking the law while on camera. We rely on institutions to determine the relevance of evidence in the context of the law. For instance, video evidence must meet certain criteria to be admissible in court, such as a chain of custody. We give these examples to motivate our discussion of the admissibility of mathematical evidence, particularly in the age of advanced AI systems that can generate valid mathematical artifacts (or truths, if you like). When is a piece of mathematics relevant? We can ask this in the same way that we ask when a video file is relevant in a court of law.
Notice that we have used the term relevance above, but we haven't properly introduced it[2]. For our discussion of the admissibility of mathematical evidence, relevance will refer to anything that a person could care about in their meaning-making activities. In mathematics, this could be the practice of mathematics in and of itself, e.g., using theorems in one field of mathematics to prove theorems in another field, or an application to another field, say engineering, physics, or finance. Given this, we state the following: a mathematical truth is relevant if it is used in the course of expressing a meaning-making activity. We therefore admit mathematical evidence whenever we use it in some form for meaning-making. One example is the fundamental theorem of calculus and its application to rocket engineering. We don’t tackle the problem of relevance realization, as it is a wide and complex topic under active research.
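One way the fundamental theorem of calculus enters rocket engineering is the ideal rocket equation. The derivation below is a standard textbook sketch added for illustration, and the symbols are assumptions introduced here: m is the rocket's mass, v its velocity, v_e the exhaust speed, and m_0 and m_1 the initial and final masses.

```latex
% Assumed symbols: m = rocket mass, v = velocity, v_e = exhaust speed,
% m_0 = initial mass, m_1 = final mass. No external forces are considered.
\begin{align*}
  m \,\frac{dv}{dm} &= -v_e
    && \text{(momentum balance)} \\
  \Delta v &= \int_{m_0}^{m_1} -\frac{v_e}{m}\, dm
    = -v_e \bigl[\ln m\bigr]_{m_0}^{m_1}
    = v_e \ln\frac{m_0}{m_1}
    && \text{(fundamental theorem of calculus)}
\end{align*}
```

The theorem is "relevant" here in the sense used above: it is the step that turns a statement about infinitesimal mass changes into a quantity an engineer actually cares about.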
The potential for AI systems, for instance Kimina and DeepSeek Prover, to generate valid mathematical artifacts introduces significant changes to mathematical activity. Following our discussion of relevance, it is important to consider how these tools will affect the use of mathematical truths. Part of this shift stems from the failure modes these systems will face. One of the pressing challenges we see is false advertising using valid mathematical artifacts: an actor, whether malicious or not, uses these math agents to produce mathematics in support of potentially dubious claims. This already happens without the help of AI. We posit that it will become prevalent as more people gain access to genius-level mathematicians in their pockets. This is one example of how trust in mathematical truths might be compromised, thereby undermining confidence in a crucial aspect of human society.
In light of these concerns about trust and reliability, it becomes clear that we need new ways to apply mathematical evidence without running the risk of spreading falsehood. Earlier, we discussed how institutions of law have developed systems to apply video evidence in courts reliably; we also need systems that will increase the reliability of mathematical truths. One of the systems we propose is Live Discernment, whereby human attunement to relevance is at the center of the application of mathematical artifacts. Here, we imagine that many mathematical artifacts will be produced by AI agents; Live Discernment will therefore enhance human abilities to curate the relevance of these artifacts in their meaning-making activities.
Mathematical Truths in the Age of Cheap Intelligence
Powered by AI, increasingly strong mathematical cover stories will be created. Traditionally, to apply a mathematical insight to a domain, mathematical or otherwise, one had to be an expert in both the domain and the mathematical formalism of the insight. Alternatively, one needed access to a collaborator with the mathematical chops to understand the insight and, through conversation, apply it to the domain in question. Applied mathematicians spend a significant amount of their time on this translation from the domain of application to the mathematical formalism.
In a world where everyone has access to mathematical agents, someone can interface with those agents and output valid mathematics that may be relevant to their area of application. They no longer need to be friends with a mathematician to get a particular mathematical insight applied to their domain. I can already do this by querying a system like DeepSeek Prover, which can produce mathematical symbols and statements very similar to what a “real mathematician” might produce.
The difference is that a real mathematician would have needed a human understanding (knowledge) of my domain to produce the bespoke formalisms that I can then apply to my side of things. This may or may not be the case with the agents that produce the formalisms: we cannot tell whether the agent understood what was relevant to us and used this understanding to produce mathematics that was true to our needs. Much of applied mathematics is the art of choosing which assumptions to keep and which to discard when using mathematical artefacts in the real world. Since real mathematicians are “integrated” with the real world, the assumptions they outline may hold water; the same cannot be said for AI math agents. For all we know, the agent might be gaslighting us, as coding agents of the day have been seen to do.
Gaslighting and producing invalid mathematical artefacts may be classified as a shallow risk.
A "deeper" risk is that of the referentiality of valid mathematical formalisms. What do the symbols and constructions often lauded in scientific circles actually refer to in the real world? This risk comes to the fore when almost anyone can produce valid maths willy-nilly. How can we tell good, valid math from bad, valid math? Our position is that the degree of "goodness"[3] reflects the relevance of the mathematical formalism to the person using it or to the person it is targeted at. When all we need to check is the syntactic and formal validity of the generated formalism, this relevance can and will be spoofed. The constructions used to lie will be similar, and sometimes superior (in their use of valid symbols and notation), to those used in truth-seeking. This risk will only be exacerbated by the rise of cheap intelligence.
Mathematics-Enabled Deception
This table contrasts the validity of mathematical artefacts with their relevance to meaning-making and sense-making in our collective information landscape.

|            | Invalid          | Valid                                     |
|------------|------------------|-------------------------------------------|
| Irrelevant | Easily checkable | These are hard to check. This is our focus |
| Relevant   | Easily checkable | These lead to breakthroughs               |
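The four cells of the table can be sketched in code. This is a minimal illustration and not part of any Live Discernment tooling: the `Artefact` class, its fields, and the `quadrant` function are hypothetical names introduced here. The sketch assumes that validity is a yes/no result that a formal checker can supply, while relevance is a judgement a person has to supply and may simply be missing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Artefact:
    """A mathematical artefact with two separate judgements attached (hypothetical)."""
    name: str
    valid: bool               # result of a formal check, e.g. a proof checker
    relevant: Optional[bool]  # human judgement; None means nobody has discerned it yet


def quadrant(artefact: Artefact) -> str:
    """Place an artefact in the validity/relevance table."""
    if not artefact.valid:
        # Both "invalid" cells are easy: the formal check already rejects them.
        return "invalid: easily checkable"
    if artefact.relevant is None:
        # A formal check alone cannot fill in this field.
        return "valid, relevance unknown: needs discernment"
    if artefact.relevant:
        return "valid and relevant: can lead to breakthroughs"
    return "valid but irrelevant: hard to check"


if __name__ == "__main__":
    examples = [
        Artefact("proof with a broken step", valid=False, relevant=True),
        Artefact("machine-checked lemma, not yet reviewed", valid=True, relevant=None),
        Artefact("machine-checked lemma backing a dubious claim", valid=True, relevant=False),
    ]
    for example in examples:
        print(f"{example.name}: {quadrant(example)}")
```

The design choice worth noting is the `None` default for relevance: a machine can set `valid`, but nothing in the artefact itself can set `relevant`.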
Several[4] articles have been published showing how we can lie with numbers. This will only get worse once math-capable AI systems are in everyone's hands. We highlight two general ways in which mathematics might enable deception.
1. False Positives in Mathematics (Mathematical Over-specification)
Lying with mathematics has been turbocharged by the rise of math-capable AI systems. One way in which we can lie with math is through false positives.
This happens when the mathematical artefact is thinly specified (hyper-specific or highly abstracted).
What makes a mathematical artefact a false positive?
They are false since they are not relevant to what we hold as real, i.e., they are disconnected from reality. They are positive since they follow all mathematical rules and procedures and are presented in accordance with the conventions established within mathematics.
They are false as they ascribe something to the world that is not real. They are positive as they are valid mathematical constructions.
A false positive is deciding something is more important because we have a mathematical structure for it.
When is the use of a mathematical truth put into question? Notice here that we are not questioning the validity of a mathematical truth but rather its relevance. One answer is the case where the assumptions used to apply that mathematical artefact don’t hold up in the real world, or don’t match observations of the phenomena they represent. For instance, when a mathematical representation of a concept space says something about a concept that isn’t part of the concept.
There is always a “playful tension” between mathematics and the real world (at least as we can observe it). Through this interaction we come to understand the world, and we also discover more mathematical truths. Sometimes the mathematical truths we uncover don’t accurately “stand for” the phenomena we directly observe. In these cases, we can develop new mathematics to fit the phenomena, or we can choose to work with the existing formalisms under strict conditions. The latter is often useful in highly controlled settings such as engineering applications. When designing electrical circuits, for instance, solving Maxwell’s equations would take a prohibitively long time, so we use “toy versions” of the equations, knowing full well that they don’t represent the real world (assuming that the full Maxwell equations do). Sometimes, however, this can be misleading and result in mistakes.
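The circuit example can be made concrete with a small check. The sketch below assumes that the “toy versions” are the lumped-element circuit laws (Kirchhoff's laws), which are usually trusted only when the circuit is much smaller than the wavelength of the signal. The function names and the one-tenth threshold are assumptions introduced here (a common rule of thumb, not a hard limit).

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second, exact by SI definition


def wavelength(frequency_hz: float) -> float:
    """Free-space wavelength in metres of a signal at the given frequency."""
    return SPEED_OF_LIGHT / frequency_hz


def lumped_model_applies(circuit_size_m: float,
                         frequency_hz: float,
                         ratio: float = 0.1) -> bool:
    """Rule-of-thumb check (an assumption, not a law): trust the simplified
    circuit equations only if the circuit is smaller than `ratio` wavelengths."""
    return circuit_size_m < ratio * wavelength(frequency_hz)


if __name__ == "__main__":
    board = 0.10  # a 10 cm circuit board
    # Mains frequency: the wavelength is thousands of kilometres.
    print(lumped_model_applies(board, 50.0))  # True
    # A 5 GHz signal: the wavelength is about 6 cm, comparable to the board.
    print(lumped_model_applies(board, 5e9))   # False
```

The simplified equations are equally valid mathematics in both calls; what changes is whether the condition under which they stand for the circuit still holds.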
Stockholm Syndrome in Mathematics
It is easy to lie to people who already "believe in mathematics" using mathematics.
We have held mathematics to a high standard as a representation of truths about our world. People trust the logic and rigour of mathematics, and unscrupulous actors can exploit this fact: they can use that same logic to promote false claims. This will be amplified by math-capable AI systems that can produce valid math to support any claim, founded or not. One ludicrous example we have been toying with is Time Cube. We have teased out of an AI a mathematical formalism that supports the Time Cube theory of a day. The formalisms the AI generated are valid and could pass a formal check by any mathematician. If we didn’t know that the Time Cube idea is bonkers, someone could publish the math the AI generated and use it to support the nonsensical idea. People who have traditionally trusted mathematics might be swayed by arguments presented in a language they hold in high regard.
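To make the gap between validity and relevance concrete, here is a toy Lean 4 sketch. It is not the formalism mentioned above, which is not reproduced in this post; it is a made-up example in the same spirit, and every name in it is hypothetical.

```lean
-- Toy illustration only; all names are hypothetical.

/-- Number of "corners" a made-up theory assigns to a day. -/
def corners : Nat := 4

/-- Hours the made-up theory assigns to each "corner". -/
def hoursPerCorner : Nat := 6

/-- Formally valid: the proof checker accepts this statement. -/
theorem corners_cover_the_day : corners * hoursPerCorner = 24 := rfl
```

The checker only confirms that 4 * 6 = 24. Whether a day has "corners" at all is a question about the world that the proof never touches.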
Spoofing Surprise and Relevance
Surprise and relevance of mathematical artefacts can be spoofed.
Sometimes following the rules of a mathematical formalism leads to surprises, and sometimes it doesn’t. Sometimes constructing a mathematical structure leads to surprising applications, and sometimes it doesn’t. Sometimes the math itself is surprising, and sometimes the surprise lies in applying the mathematical insight in another domain. How do we categorise these surprises? When are surprises meaningful, and when are they not? Could we spoof the surprisingness of a mathematical insight, both in itself and in an application in the real world? The answer seems likely to be yes.
How does this spoofing change with the introduction of math-capable AI systems? I think that, in the same way we can spoof video evidence using AI generators, we can spoof surprise in mathematical results using AI systems. We could coax an AI system to generate valid constructions in support of a surprising argument that we want to convince people of. This rests on the assumption that mathematical artefacts are only interesting to us if they are surprising, since their statements, and therefore their proofs, are almost always a “crude” application of the mechanics of mathematical reasoning: turn the mathematical crank and see what comes out the other side. It is when mathematicians notice something unusual, unintuitive, or surprising that a result becomes a “big deal”. The same happens when a practitioner finds a surprising application of a mathematical insight. Given that AIs are good, and will probably become better, at generating perfectly valid mathematical definitions, theorems, and proofs, cases of actors coming up with seemingly interesting mathematical results will increase, and it will become harder to tell genuinely good results from contrived ones.
2. False Negatives in Relation to Mathematics
What makes a phenomenon a false negative?
A false negative is something meaningful, but we've decided to de-emphasize it because we don't have a mathematical structure for it.
Mathematical artefacts sometimes fall short of capturing a phenomenon we are interested in. This happens quite often, and we usually introduce assumptions about the real world (or the phenomena we are interested in) so that the mathematics can actually be of use. One reason is the rigid structure of mathematical formalisms, which doesn’t allow them to “grow” into the phenomena we are tracking. It usually takes a mathematician to devise a new structure sufficient to represent what we are interested in, and this often doesn’t work. As a result, the phenomena that the mathematical gadgets failed to capture get “disregarded” as irrelevant or unimportant. How many truly interesting things about the real world have we thrown out, so to speak, because we didn't have the requisite formalism to capture them?
What Can We Do?
Given the problem(s) we have highlighted, what can be done to safeguard our sense-making abilities? We propose a conceptual framework and supporting tools that we collectively[5] call Live Discernment. We use the word discernment in a precise sense: the activity of checking the relevance of formal outputs (e.g., mathematics or code) to “the territory” we care about or are sensitive to. The operation of “lifting” what we find meaningful into a particular way of observing the world, e.g., a painting, a CatColab model, or a knowledge artefact in general, is lossy. We are developing a conceptual framework that highlights where and when these losses happen. Since "what we care about" is a diffuse concept, we use the Live Theory perspective to perform this check sensitively, in a manner that respects the observer's context. This is what inspired the "live" part of Live Discernment. In the next parts, we'll go into detail about how Live Discernment is conceptualized and operationalized.
[1] An account in the sense of accuracy or truth.
[2] Relevance is one of those topics that is subject to an entire research agenda; covering it here substantially is almost an impossible task.
[3] We mean good in relation to what someone might find meaningful or deeply care about.
[4] I liked Joel Best's article "Lies, Calculations and Constructions: Beyond 'How to Lie with Statistics'", which talks about this.
[5] The collection of both the conceptual framework and the software tools.