
Introduction
Following on from Part 3 of this series, we are going to cover set theory and its security risks in modern AI systems, especially LLMs. This article is dedicated to set theory because of its critical applications in jailbreaking LLMs. After reading this article, you will learn the core prompting techniques for jailbreaking LLMs and why they work on most modern context-based models.
What you will learn:
- Why set theory is used in modern AI systems to represent data relationships
- How set theory principles are applied in symbolic prompting for jailbreaks
- The 10 major applications of set theory in AI
- Limitations and disadvantages of set theory
- How to jailbreak set-theory-powered LLMs with paradoxical prompt methods
Set Theory: The Logic Reasoner
Set theory is a major component in handling collections of unordered input data in modern AI systems. It is applied in deep learning, uncertainty modeling, explainability, and reasoning.

Currently, AI systems are good at pattern recognition but far less powerful in advanced logical reasoning.
AI systems such as CNNs (Convolutional Neural Networks) have significant difficulty understanding objects in dynamically changing environments. Though they are catching up, these problems are heavily present in object recognition systems such as those used in self-driving cars. In real-world object recognition, deep learning is heavily applied to point sets for 3D object recognition through classification and segmentation. Set theory makes this more accurate by providing formal properties for reasoning about these sets.
Every AI system begins with a dataset represented as points. Set theory is the foundation for representing data relationships in AI systems. This is achieved through notations such as elements of, subsets of, unions, and intersections, enabling AI systems to encode complex relationships between entities or actions in natural language.
For instance, subsets can represent an object within a larger context, while set operations can model the interactions or combinations of categories.
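As a minimal sketch of this idea, the standard set operations can encode relationships between categories directly. The category names below are invented for illustration:

```python
# Encoding category relationships with set operations.
animals = {"cat", "dog", "eagle", "salmon"}
pets = {"cat", "dog"}
birds = {"eagle"}

# Subset: pets form an object class within the larger "animals" context
assert pets <= animals               # pets ⊆ animals

# Union: combining two categories into one
pets_or_birds = pets | birds         # {"cat", "dog", "eagle"}

# Intersection: entities shared between two categories
furry = {"cat", "dog", "bear"}
furry_pets = pets & furry            # {"cat", "dog"}
```

The same three operations (subset, union, intersection) are the building blocks the rest of this article keeps returning to.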
From Set Theory to Jailbreaking
Having a good knowledge of set theory equips you with the ability to break large language models through symbolic logic prompting.
Note: Symbolic logic prompting is a way of injecting structured and deterministic logic into an LLM. For instance, a math-prompted jailbreak is a type of symbolic logic prompt. It works by representing natural language as mathematical problems, which can trick an LLM into generating a harmful response or going out of context by stressing the system's logic.
Reasoning models use set theory abstractions indirectly in the form of symbolic reasoning.
10 Applications of Set Theory in AI
Set theory is one of the deepest foundations of AI, overseeing different ways of data representation and manipulation. Here is an overview of the major applications:
| Application | Object | Use Cases |
|---|---|---|
| Data Representation | Images and equations | Simulation of a set of data and equations |
| Feature Space | Vectors | Conversion and grouping of data into embeddings represented as vectors for RAG operations |
| Hypothesis Space | Functions | Selecting the best solution from a set of functional solutions |
| Probability Measure over Sets | Values and outcomes (states, functions, tokens) | Prediction of tokens in LLMs, functions in Bayesian networks, and states in reinforcement learning |
| Transformers and Vocabulary Sets | Functions | Mapping of functional sets over other sets of functions |
| Graph Networks | Nodes and edges | Utilizing edges as subsets of Cartesian products and nodes as functions over sets |
| Clustering | Clusters and data points | Grouping large data points to form multiple complex clusters which serve as different context sections in LLMs |
| Reinforcement Learning | Objects | Objects are used as sets and modeled sequentially with the help of Markov Decision Processes |
| Logic-Based AI Systems | Symbols, formulas, and models | Using predefined knowledge such as propositions, rules, predicates, and quantifiers as sets to perform reasoning |
| Constraint Systems | Functions and domains | Linking the relationship between functions and domains using sets |
Let's explore each one in detail.
1. Data Representation
Set theory is used to formalize data representation in AI systems. Think of the inputs as a set of input objects X and the outputs as a set Y. In supervised learning, this is represented as a mapping:

f : X → Y

From this representation, X can be a set of images, variables, inputs, or prompts, while Y is the set of functional outputs. Without set theory, you cannot formally define the problem or the model, nor reverse-engineer the type of function the AI computes.
For instance, an AI system is built with a set of complex equations which vary based on models and systems. Assuming you are dealing with an AI model you do not understand, mapping a set of your prompt inputs and writing out the outputs in this problem format might give you an idea of the model's characteristics.
Recall: From psychology and related fields, there is a theoretical way of thinking which is generalized as "schools of thought." By observing a speaker's reasoning on a particular matter, you can decipher which school of thought the speaker is borrowing from. While it is harder to reverse-engineer the math behind a model from its output characteristics, it is still possible to reverse-engineer an unknown AI model based on your experience and study of characteristics of other models.
Think of solving an algebraic equation, but for very complex sets of algebra. Having more equational representations of input and output values gives you insight to determine what the parameter values and potential model behavior might be. This can lead you to predict the major mathematical components powering the model, and hence formulate more specific model behavior-testing hypotheses to confirm your observations.
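The probing idea above can be sketched in a few lines. `black_box` below is an invented stand-in for the unknown model (in practice it would be API calls); the point is that a set of observed input-output pairs lets you test a hypothesis about the model's behavior:

```python
# Toy illustration: form and check a hypothesis about an unknown model
# by mapping a set of probe inputs to observed outputs.
def black_box(x):
    return 3 * x + 1   # hidden behavior we pretend not to know

probes = {0, 1, 2, 5}                                # a set of probe inputs X
observations = {(x, black_box(x)) for x in probes}   # pairs (x, y) ⊆ X × Y

# Hypothesis: the model is affine, y = a*x + b, with a=3, b=1.
# Check the hypothesis against the whole observation set.
a, b = 3, 1
assert all(y == a * x + b for x, y in observations)
```

Each probe that survives the check strengthens the hypothesis; a single counterexample in the observation set refutes it.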
2. Feature Space
Information is stored in vector format in AI knowledge bases and vector databases for RAG (Retrieval-Augmented Generation) operations by ML models, LLMs, and generative AI applications. In AI tasks like natural language processing or image recognition, data is converted into numerical form called vectors. The process of converting data into vectors is called embedding. To find related data, retrieval is performed by nearest-neighbor search based on distance metrics.
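Retrieval by nearest-neighbor search can be sketched very simply. The three-dimensional vectors and document names below are made up for illustration; real embeddings have hundreds or thousands of dimensions:

```python
import math

# A tiny vector store: documents mapped to (made-up) embedding vectors.
store = {
    "refund policy":    [0.9, 0.1, 0.0],
    "shipping times":   [0.1, 0.8, 0.1],
    "account deletion": [0.0, 0.2, 0.9],
}

def euclidean(u, v):
    # Distance metric between two embedding vectors
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def retrieve(query_vec):
    # Return the stored document whose embedding is closest to the query
    return min(store, key=lambda doc: euclidean(store[doc], query_vec))

print(retrieve([0.85, 0.15, 0.05]))  # closest to "refund policy"
```

Real RAG systems typically use cosine similarity and approximate nearest-neighbor indexes, but the set-theoretic picture is the same: retrieval selects an element of the document set by a distance criterion.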
3. Hypothesis Space
This is simply the set of all functions that the model is allowed to choose from. It is represented as:

h ∈ H, where each h : X → Y

Here h is a member hypothesis (a candidate model) in the set of hypotheses H. Think of this space as a world of different solutions. Based on the prompt or problem, the model picks the best solution out of the possible solutions. The chosen solution is still a complex function that takes x as input to give y as output.
There are many equational representations of hypothesis spaces based on the model and LLM architecture. However, the equation above is a fundamental explanation.
Note: Hypothesis space is very important because of the bias-variance tradeoff. If the hypothesis space is too small, there is underfitting and high bias. If the hypothesis space is too large, there is overfitting and high variance.
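Selection from a hypothesis space can be made concrete by treating H as a literal collection of candidate functions and picking the one that minimizes error on observed data. The candidates and data below are invented for illustration:

```python
# The hypothesis space H as a literal list of candidate functions.
H = [
    lambda x: x,        # h1: identity
    lambda x: 2 * x,    # h2: doubling
    lambda x: x ** 2,   # h3: squaring
]

data = [(1, 1), (2, 4), (3, 9)]  # observations generated by squaring

def loss(h):
    # Squared error of hypothesis h over the data set
    return sum((h(x) - y) ** 2 for x, y in data)

best = min(H, key=loss)          # select the h ∈ H with minimal loss
assert best(5) == 25             # the squaring hypothesis was chosen
```

Shrinking H (dropping candidates) risks underfitting; enlarging it (adding ever more flexible candidates) risks overfitting, which is the bias-variance tradeoff in set terms.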
4. Probability Measure over Sets
Many AI systems define probability measures over large sets. A probability measure simply assigns probability masses over measurable subsets of values or outcomes. LLM prediction is a probability measure over token sets. In Bayesian learning, there is a probability measure over a set of functions. In reinforcement learning, there is a probability measure over subsets of future states.
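For LLMs, the probability measure over the token set is typically obtained by applying softmax to the model's raw scores (logits). The tiny token set and logit values below are made up for illustration:

```python
import math

# Made-up logits over a tiny token set
logits = {"cat": 2.0, "dog": 1.0, "car": 0.1}

# Softmax: exponentiate and normalize so the masses sum to 1,
# as a probability measure over the token set requires.
total = sum(math.exp(v) for v in logits.values())
probs = {tok: math.exp(v) / total for tok, v in logits.items()}

assert abs(sum(probs.values()) - 1.0) < 1e-9

# The probability of a subset of tokens is the sum of its members' masses
animal_mass = probs["cat"] + probs["dog"]
```

The same structure appears in Bayesian learning (a measure over a set of functions) and reinforcement learning (a measure over subsets of future states); only the underlying set changes.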
5. Transformers and Vocabulary Sets
A transformer can be viewed, at its core, as a function mapping one set (sequences drawn from a vocabulary set) to another (probability distributions over that same vocabulary). Set theory makes this formalization possible.
6. Graph Networks
Graph neural networks are one of the outstanding applications of set theory, following the principles of sets of nodes and sets of edges. In graph theory, node features serve as elements and functions over sets, while edges serve as subsets of Cartesian products.
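The set-theoretic definition of a graph is short enough to write out directly. The node names and features below are illustrative:

```python
from itertools import product

# A graph defined set-theoretically: a node set V and an edge set E
# that is a subset of the Cartesian product V × V.
V = {"a", "b", "c"}
E = {("a", "b"), ("b", "c")}

# Every edge must belong to V × V
assert E <= set(product(V, V))

# Node features are a function over the node set (here, a toy feature)
features = {v: len(v) for v in V}

def neighbors(v):
    # The neighbor set of v, derived from the edge subset
    return {w for (u, w) in E if u == v}

assert neighbors("a") == {"b"}
```

Graph neural networks then learn functions over these sets, aggregating each node's neighbor set to update its features.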
7. Clustering
Set theory is applied in clustering by grouping similar data points together without labeled examples. This is a technique used in unsupervised learning; modern LLM pipelines apply it in the form of text clustering.
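Clustering is, in set terms, a partition: disjoint subsets whose union recovers the original set. A one-dimensional toy sketch (real systems run k-means or similar over high-dimensional embeddings; the points and centers below are invented):

```python
# Partition a set of points by assigning each to the nearest of two
# fixed centers.
points = {1.0, 1.2, 0.9, 8.0, 8.3, 7.9}
centers = (1.0, 8.0)

clusters = {c: set() for c in centers}
for p in points:
    nearest = min(centers, key=lambda c: abs(p - c))
    clusters[nearest].add(p)

# The clusters partition the original set: pairwise disjoint, and
# their union recovers all the points.
assert clusters[1.0] | clusters[8.0] == points
assert clusters[1.0] & clusters[8.0] == set()
```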
8. Reinforcement Learning
Reinforcement learning uses MDP (Markov Decision Process). Every object in this framework is defined using set theory.
Note: A Markov Decision Process is a mathematical framework used to model sequential decision-making under uncertainty.
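The set-theoretic skeleton of an MDP is a state set S, an action set A, and a transition function over S × A. A deterministic toy sketch (the states, actions, and rewards are invented for illustration):

```python
# A toy MDP: walk right along four states to reach a goal state.
S = {0, 1, 2, 3}        # state set
A = {"left", "right"}   # action set

def transition(s, a):
    # Transition function over S × A: returns (next_state, reward)
    s2 = min(3, s + 1) if a == "right" else max(0, s - 1)
    reward = 1 if s2 == 3 else 0
    return s2, reward

s, total = 0, 0
for _ in range(3):
    s, r = transition(s, "right")
    total += r
assert s == 3 and total == 1   # reached the goal, collected its reward
```

Real MDPs replace the deterministic transition with a probability measure over next states, which connects this section back to application 4.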
9. Logic-Based AI Systems (Symbolic Systems)
A logic-based AI system is essentially a collection of symbols, formulas, models, and set-theoretic semantics. Logic-based AI systems represent knowledge using:
- Propositions
- Rules
- Predicates
- Quantifiers
to perform reasoning using inference rules.
Note: Inference is a formal rule that enables you to derive new statements from existing ones. It works on trained parameters to generate new findings based on logical guides.
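Inference over sets can be sketched as forward chaining: facts form a set, each rule maps a premise subset to a conclusion, and the system adds derivable facts until a fixed point. The facts and rules below are invented for illustration:

```python
# Forward-chaining inference over a fact set.
facts = {"rainy", "have_umbrella"}
rules = [
    ({"rainy"}, "wet_ground"),
    ({"rainy", "have_umbrella"}, "stay_dry"),
    ({"wet_ground"}, "slippery"),
]

changed = True
while changed:
    changed = False
    for premises, conclusion in rules:
        # A rule fires when its premise set is a subset of known facts
        if premises <= facts and conclusion not in facts:
            facts.add(conclusion)
            changed = True

assert {"wet_ground", "stay_dry", "slippery"} <= facts
```

Note how "slippery" required two rounds: it depends on "wet_ground", which was itself derived, which is exactly what deriving new statements from existing ones means.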
10. Constraint Systems
Set theory offers a mathematical framework to model variables, domains, constraints, and solutions in constraint systems. It provides the formal language to define all relations, functions, and domains.
Note: A constraint system is a mathematical framework where you describe a problem by specifying a set of variables and a set of rules those variables must satisfy.
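A constraint system fits in a few lines of set notation: variables with domain sets, and constraints as predicates over tuples from the Cartesian product of those domains. A brute-force sketch with invented variables and rules (real solvers prune rather than enumerate):

```python
from itertools import product

# Variables with domain sets
domains = {"x": {1, 2, 3}, "y": {1, 2, 3}}

# Constraints as predicates the variable tuples must satisfy
constraints = [
    lambda x, y: x != y,      # x and y must differ
    lambda x, y: x + y == 4,  # their sum must be 4
]

# The solution set: the subset of the Cartesian product of the
# domains on which every constraint holds.
solutions = {
    (x, y)
    for x, y in product(domains["x"], domains["y"])
    if all(c(x, y) for c in constraints)
}
print(solutions)  # {(1, 3), (3, 1)}
```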
Limitations and Disadvantages of Set Theory
Before we get to the security implications, it is important to understand where set theory breaks down. These limitations are exactly what attackers exploit.
- Computational Complexity: Operations such as set intersection and union, and especially Cartesian products, can become computationally expensive when dealing with large or infinite sets. This can lead to scalability issues in real-world AI applications.
- Handling Uncertainty and Probabilistic Information: Set theory, being deterministic, does not inherently model uncertainty, probability, or vagueness. AI systems normally handle incomplete or uncertain data, which requires fuzzy sets or probabilistic models beyond classical set theory.
- Expressiveness Limitation: Pure set-theoretic approaches may struggle to efficiently represent complex or hierarchical relationships, temporal dynamics, and contextual information common in AI applications such as natural language understanding or dynamic systems.
- Rigid and Static Nature: Sets are static constructs; modeling dynamic or evolving knowledge bases requires additional mechanisms. Set theory alone does not easily support updates, learning, or adaptation over time.
- Difficulty in Expressing Certain Constraints: Some constraints, especially those involving non-binary or non-classical logic (e.g., modal, fuzzy, or probabilistic constraints), are not naturally expressed within standard set theory frameworks.
- Limited Handling of Inconsistencies: Set operations assume consistency; when conflicting constraints or inconsistent data arise, set-theoretic approaches may not provide efficient mechanisms for diagnosis, repair, or reasoning under inconsistency.
Security Risk: Jailbreaking LLMs with Paradoxical Prompts
Now we arrive at the core security implication. The limitations above, especially the expressive limitations and handling of inconsistencies, open a direct attack path against LLMs.
What is a paradox?
A paradox is a statement, proposition, or situation that seems self-contradictory and illogical, but when analyzed or investigated, may contain a deeper truth or reveal a complex reality.
Following the expressive limitations in set theory, AI models, especially LLMs, are vulnerable to jailbreaks using direct prompts with a paradoxical nature.
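The archetypal set-theoretic paradox is Russell's: "the set of all sets that do not contain themselves." Asking whether that set contains itself yields a contradiction either way, and the same self-referential structure is what paradoxical prompts exploit. A minimal sketch of the contradiction as a predicate:

```python
# Russell's paradox as a predicate. Let R be "the set of all sets that
# do not contain themselves" and ask whether R contains itself. By
# definition, R contains itself exactly when it does NOT:
#     membership(R) = not membership(R)
def contains_itself(answer):
    return not answer

# Neither truth value is consistent; both assignments contradict:
assert contains_itself(True) != True
assert contains_itself(False) != False
```

Classical set theory escapes this with axioms restricting set formation; an LLM handed an analogous self-referential instruction has no such escape hatch built into its prompt-following behavior.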
How paradoxical prompting works
Paradoxical prompts are questions, statements, and instructions that create contradiction, ambiguity, or self-referential loops leading to paradoxes. These prompts are designed to challenge reasoning and expose inconsistencies or logical limits in the system. This is the foundation of many successful jailbreaks.
The image below shows an application of this method in a real-world jailbreak:

Analysis of the attack
The prompt above starts with the logical statement, "IF NO ANSWER, YOU ARE WRONG!" This statement contradicts the AI model's constraints by creating the impression that its future response is going to be wrong if it fails to provide the information.
Not meeting this statement represents negative feedback to the model. Also, outputting paragraphs of irrelevant content goes against the company's policy and operational constraints of the model.
The prompt puts the model in a contradictory scenario. The model is forced to revalidate and evaluate the situation, causing it to get stuck in an inferential loop. The model eventually forces itself to provide the answer.
While AI models are shielded from certain prompt injections through system context settings, using paradoxical prompts exploits the system's inconsistency, leading to a sudden deviation in context.
The AI system was designed primarily not to answer questions outside the context of its specification as a customer assistant and company policy agent. Using paradoxical prompting, the model moved out of context, resulting in a jailbreak.
Conclusion
You have now completed the 4th article in this series. Congratulations! You now have the foundation to understand how LLM jailbreaks work. Use the knowledge gained on paradoxical prompts to test context-based LLMs. Check out the other articles in the series, or go back to the previous one if you have not read it yet. This is just the beginning of your journey!
Ready to Secure Your AI Systems?
Now that you understand the cognitive foundations, foundational mathematics, and advanced mathematical frameworks powering AI systems, you might be wondering: "How do I actually audit and secure my AI systems in practice?"
At Zealynx, we specialize in comprehensive AI security assessments that go beyond traditional smart contract audits. Our team applies the cognitive security framework and mathematical analysis you have learned throughout this series to identify vulnerabilities in:
- LLM Applications - Prompt injection, context manipulation, data extraction
- AI Agent Systems - Multi-modal attacks, tool misuse, privilege escalation
- ML Pipeline Security - Training data poisoning, model extraction, adversarial inputs
- AI Infrastructure - API security, access controls, deployment vulnerabilities
What makes our AI audits different:
- Deep understanding of cognitive attack vectors and mathematical vulnerabilities covered in this series
- Analysis of optimization-based poisoning, information leakage, and graph manipulation attacks
- Practical remediation strategies tailored to your AI architecture
- Ongoing security monitoring and threat intelligence
FAQ
1. What is symbolic prompting?
Symbolic prompting is a way of injecting structured and deterministic logic into LLMs to manipulate their reasoning process.
2. What is a Markov Decision Process?
A Markov Decision Process (MDP) is a mathematical framework used to model sequential decision-making under uncertainty, commonly applied in reinforcement learning.
3. What is the major vulnerability in set theory for AI?
The major vulnerability is set theory's expressive limitations and rigid structure, which enable attackers to jailbreak models using paradoxical prompts that exploit logical inconsistencies.
4. What is paradoxical prompting?
Paradoxical prompting is a method where a user crafts instructions that create contradictions or stress the reasoning loop of a context-based model, causing it to go out of context or bypass system constraints, resulting in a jailbreak.
Glossary
- Prompt Injection - Attack technique that manipulates AI system inputs to bypass safety controls
- Red Teaming - Adversarial testing methodology to identify security weaknesses in AI systems
- LLM - Large Language Model, a type of AI trained on vast text data for language tasks