國立臺灣大學電機資訊學院資訊工程學系 碩士論文
Department of Computer Science and Information Engineering College of Electrical Engineering and Computer Science
National Taiwan University Master Thesis
回顧雙變量邏輯之可滿足性
Revisiting the Satisfiability of Two Variable Logic
丁恩琳 En-Lin Ting
指導教授:陳偉松博士 Advisor: Tony Tan, Ph.D.
中華民國 107 年 7 月
July, 2018
國立臺灣大學碩士學位論文 口試委員會審定書
回顧雙變量邏輯之可滿足性
Revisiting the Satisfiability of Two Variable Logic
本論文係丁恩琳君 (R04922047) 在國立臺灣大學資訊工程學 系完成之碩士學位論文,於民國 107 年 7 月 10 日承下列考試委員 審查通過及口試及格,特此證明
口試委員:
所 長:
誌謝
這篇論文的成文,除了要感謝過去在此領域做出貢獻的研究者,更 要感謝許多我身邊的人。
首先,一定要感謝我的指導教授陳偉松老師。謝謝老師在我換指 導教授時的接納,謝謝他對學生的關心與幫助,更要謝謝他的悉心教 導,無論是在知識上、研究方法上或是待人處世上,我皆獲益良多。
這篇論文能完成,都要感謝陳偉松老師。
另外,也要感謝我之前的指導教授呂學一老師。雖然沒有能夠跟著 老師完成碩士論文,但他的指點與包容,令我由衷感謝,亦是難以報 答。
我要感謝碩士學位考試的口試委員蔡益坤老師及廖純中老師,感謝 他們花時間與心力針對我的論文提問討論,並且給予寶貴的意見。
還要謝謝在我攻讀碩士期間實驗室的同學學長學弟,以及修課時的 老師以及一起討論的同學們,在溝通交流中擴展了我的想法、啟發我 的思維並激發我的靈感。
最後要感謝我的家人,他們的關懷與鼓勵往往給予我動力,更是在 我遭遇挫折時成為我的支持與幫助。
謝謝你們。
Acknowledgements
This thesis would not have been possible if it had not been for the re- searchers who have devoted themselves to this area before us, and it would not have been possible without the help from many people around me.
First of all, I would like express my profuse gratitude towards my advisor Tony Tan, who accepted me as his student during my change of advisor. I am very thankful for his teachings, whether it be the specialized knowledge in the area, the methods of researching, or the mindset in general, for all of them help me greatly. The completion of this paper is all thanks to him.
Next, my sincere thanks goes to my previous advisor Hsueh-I Lu. Al- though I did not manage to finish my master thesis with him, I am truly grate- ful for his advising and understanding, and I am deeply indebted.
I would like to thank the committee members of my oral defense - Yih- Kuen Tsai and Churn-Jung Liau. I am very grateful for their time and effort to inquire and discuss about my thesis, and I value their helpful advices greatly.
I would also like to thank my fellow labmates, the teachers of the classes I have taken, and classmates who had discussion with me. The exchange of ideas with them broadened my horizons, improved my way of thinking, and provided me with inspirations.
Last but not least, I give my deep thanks to my family, whose loving care and encouragement often boost my motivation. They are always willing to lend me support and help whenever I am feeling frustrated.
Thank you all.
摘要
含計數量詞的二階邏輯 (C2) 有著許多應用,特別像是本體知識語
言的應用,例如用於語意網的 OWL。一個著名的結果是:C2 的可滿
足性問題可以在非確定性指數時間 (NEXPTIME) 內決定,而且這樣的 複雜度是最佳的。然而,目前已知的解決技巧較為複雜,通常必須去 猜測一個滿足目標式子結構或表示方法,而導致難以實作。
在這篇論文中,我們關注於具有反向關係封閉性並且是 Scott’s 正
規型態的 C2 句子 (RCS)。直觀上,如果一個句子 φ 是 Scott’s 正規型
態而且其使用的二元關係具有反向封閉性,φ 就屬於 RCS。我們基
於 Kopczyński 和 Tan 的技巧 [9],針對RCS 的可滿足性問題及有限
可滿足性問題提出一個新的決策程序,利用將 RCS 的式子轉換成
Presburger 的存在量化式來解決問題。雖然此方法的時間複雜度比最佳 時間高:2-NEXPTIME,但其有幾個優勢:
1. 刻劃出RCS 式子模型的特性,亦即任一 RCS 式子的模型皆是由
正則圖及二分正則圖組成。
2. 顯示出一RCS 式子的頻譜是否為有限是可決定的。
3. 此方法為解決可滿足性問題及有限可滿足性問題的簡單決策程 序。
當原式為 Scott’s 正規型態並且詞彙表固定時,我們演算法的時間複雜
度為 NEXPTIME。我們期待我們的結果能提供討論 C2 式子的另類技
巧,並擴展至其他諸多的本體知識語言。
關鍵字: 含計數量詞的二階邏輯;可滿足性;Presburger 算數;整數 規劃;正則圖
Abstract
Two variable logic with counting quantifiers (C2) has found many appli- cations, especially in ontology language such as OWL used in semantic web.
It is well known that the satisfiability problem for C2 is decidable in nonde- terministic exponential time (NEXPTIME), and the complexity is optimal.
However, the known techniques are quite complicated and they typically in- volve guessing a structure or a representation that satisfies the input formula, which can be hard to implement.
In this thesis, we consider a subclass of C2 formulas, which we call Re- versal closed C2 formulas in Scott’s normal form (RCS). Intuitively, a C2 formula φ is in RCS, if it is in Scott’s normal form and the binary relations used in φ are closed under reversal. We present a decision procedure for the satisfiability and finite satisfiability problems for RCS formulas, which is based on the technique by Kopczyński and Tan [9]. Our approach is by con- verting anRCS formula into an existential Presburger formula. Though the complexity is higher: 2-NEXPTIME (double exponential time), it has a few advantages:
1. It provides a characterization of models of RCS formulas, i.e., every model of anRCS formula is a collection of regular digraphs and bireg- ular graphs.
2. It implies the decidability of checking whether the spectrum of anRCS formula is infinite.
3. It gives simple decision procedures for satisfiability and finite satisfia-
bility problems.
When the input is in Scott’s normal form and the vocabulary is fixed, our algorithm yields time complexity NEXPTIME. We hope that our result can be used to provide an alternative technique to reason about C2 formula, thus many other ontology languages.
Keywords: two variable logic with counting quantifiers; satisfiability; Pres- burger arithmetics; integer programming; regular graphs
Contents
口試委員會審定書 iii
誌謝 v
Acknowledgements vii
摘要 ix
Abstract xi
1 Introduction 1
1.1 Related works . . . 1 1.2 Summary of contributions . . . 2 1.3 Outline . . . 3
2 Two variable logic with counting 5
2.1 First order logic (FO) . . . 5 2.2 Two variable logic with counting quantifiers (C2) . . . 7 2.2.1 The classRCS . . . . 8
3 Presburger arithmetics 11
3.1 Standard Presburger arithmetic . . . 11 3.2 Presburger arithmetic with infinity . . . 12 3.3 The satisfiability of Presburger formula . . . 13
4 Regular graphs 15
4.1 Definitions . . . 15
4.1.1 Biregular graphs . . . 15
4.1.2 Regular digraphs . . . 16
4.2 Presburger characterization of the existence of biregular graph . . . 17
4.2.1 1-type biregular graphs . . . 17
4.2.2 Proof of theorem 4.2.1 . . . 20
4.3 Presburger characterization of the existence of regular digraph . . . 23
5 Satisfiability ofRCS formulas via Presburger arithmetics 27 5.1 Constructing Presburger formula from anRCS sentence . . . 27
5.2 The preservation of satisfiability . . . 31
5.3 Complexity analysis . . . 36
6 Concluding remarks 39
Bibliography 41
Chapter 1 Introduction
Many areas in computer science and information technology utilize first-order logic (FO) and its variances. For example, the currently booming artificial intelligence research uses FO as the basis of knowledge and data representation. Typically, FO sentences are used to describe the knowledge, so it is important to check the consistency of a sentence. By Gödel’s completeness theorem, consistency and satisfiability are equivalent [4].
Formally, the satisfiability problem (SAT) for a class C ⊆ FO is defined as follows.
Given an input formula φ∈ C, decide whether there is a model that satisfies φ. The finite satisfiability problem (FIN-SAT) for C is to decide whether there is a finite model that satisfies φ.
However, the general FO is known to be undecidable [2, 1, 20, 18]. Hence, researchers are looking for restricted but decidable classes of FO. In this paper, we discuss one such class: the C2 logic, i.e., the FO formulas using only two variables but allowing counting quantifiers.
1.1 Related works
From the classical work of Church [2, 1], Turing [20] and later, Trakhtenbrot [18], the satisfiability problem of FO is known to be undecidable, and it is necessary to find com- promises in order to achieve more practical results. Some FO classes of interest are de- rived by restricting the number of variables. It was shown that the satisfiability problem
of FO with only two variables (FO2) is decidable [11], whereas the three variable class is undecidable [8].
Another widely discussed class is C2, which is a more generalized class of FO2 by allowing counting quantifiers. The decidability of C2 was first proved by Grädel, Otto and Rosen [7]. However, the proof is done by showing both the satisfiability problem and its complement are recursively enumerable. Thus, its complexity cannot be deduced.
The time complexity for both SAT and FIN-SAT problem is proved to be double ex- ponential time by Pacholski, Szwast and Tendera [13], and later to be NEXPTIME by Pratt-Hartmann [15]. An immediate implication from this is that C2 is NEXPTIME- complete, since FO2 is already known to be NEXPTIME-hard [10, 3]. As a side note, the algorithms proposed in both [13, 15] involve many non-trivial guessing that would be difficult to implement.
It is worth to note that FO2 has finite model property. More precisely, if an FO2 formula φ is satisfiable, then it is satisfied by a model with cardinality O(2|φ|) [5]. On the other hand, C2 lacks such property. There are some C2sentences that are only satisfiable by infinite structures. This is one such example:
ψ := ∀x∃≥2yE(x, y)∧ ∀x¬∃≥2yE(y, x)
Intuitively, ψ states that each vertex in the model has at least two out-going edges but has only one or none in-coming edge. Therefore, the satisfiability problem and finite satisfiability problem for C2 are not equivalent, unlike for FO2.
1.2 Summary of contributions
In this paper, we take another approach to the satisfiability and finite satisfiability problem for C2.
It has been proved that for a sentence ϕ of C2, there is a corresponding Presburger formula PREBϕ such that there exists a complete structure A with A |= ϕ if and only if PREBϕ(|A|) holds as long as |A| is finite [9]. However, the conversion to Presburger
formula make use of the conversion to QMLC and yields a sextuple exponential time complexity.
Here, we will utilize the Scott’s normal form to simplify the conversion from a C2 sentence to an existential Presburger formula to achieve a less complicated algorithm in terms of both the procedure and the time complexity under the assumption that the binary relations used in Scott’s normal form are closed under reversal. Moreover, we prove such conversion retains its properties even when we allow the structure to be infinite by allowing the Presburger formula to admit infinity∞ in the solution.
On inputRCS formula ϕ, our algorithm does the following.
1. Convert it into an instance of Linear Integer Programming (LIP).
2. Solve the LIP problem.
Step 1 is of non-deterministic double exponential time, while step 2 is of non-deterministic polynomial time (in the size of the input). So, overall, our algorithm for both SAT and FIN-SAT runs in 2-NEXPTIME.
1.3 Outline
We will go through some definitions related to C2 and RCS in chapter 2, and the def- initions and theorems regarding Presburger arithmetics and its extension to infinity are discussed in chapter 3. In chapter 4, we introduce some important tools for the conversion betweenRCS formulas and Presburger formulas. These tools primarily consist of regular and biregular graphs and their corresponding Presburger formula expressions. Our main result is derived in chapter 5, where we utilize the tools in chapter 4 to convert anRCS formula to a corresponding existential Presburger formula that preserves its satisfiability, and then solve the satisfiability problem for the Presburger formula with theorems derived from LIP in chapter 3. We conclude chapter 5 by analyzing the complexity for the algo- rithm. In chapter 6, we show some other results that can be inferred from our algorithm.
Chapter 2
Two variable logic with counting
In this chapter, we introduce the formal definition of two-variable logic with counting, which we denote by C2. We start by reviewing the syntax and semantics of first order logic in section 2.1. Then, in section 2.2 we formally define the class C2.
2.1 First order logic (FO)
We fix a setR of relation symbols. Each R ∈ R is associated with a positive integer, which is called its arity and denoted by ar(R). We also fix a set VAR of first order variables.
For simplicity, letR = {R1, . . . , Rk}. 1
The syntax of first order logic The syntax of first order logic sentence is defined in- ductively as follows:
• For any x, y ∈ VAR, x = y is an FO formula.
• If R∈ R is of arity n and x1, . . . , xn∈ VAR, then R(x1, . . . , xn) is an FO formula.
• If α and β are FO formulas, then so are¬α, α ∧ β and α ∨ β.
• If α is an FO formula and x∈ VAR, then ∃xα and ∀xα are FO formulas as well.
A formula is existential if it is of the form∃x1. . .∃xmψ(x1, . . . , xm) where ψ(x1, . . . , xm) is quantifier-free, and a formula is universal if it is of the form∀x1. . .∀xmψ(x1, . . . , xm).
1For the sake of simplicity, we do not consider function and constant symbols for the vocabulary.
A variable x is quantified if∀ or ∃ preceded x in the formula. Free variables are the variables that are not quantified. For instance, in the formula ∃x∀yR(x, y, z), x, y are quantified and z is a free variable. Formulas without free variables are called sentences.
The semantics of first order logic A structure is A = (A, RA1, . . . , RAk), where
• A is a set of elements, called the domain , or the universe of A.
• each RAi is a relation over A of arity ar(Ri), i.e., RAi ⊆ Aar(Ri).
Let A = (A, RA1, . . . , RkA) be a structure. A valuation in A is a mapping from VAR to A. An model is a pair (A, val) where A is a structure and val is a valuation.
Given an FO formula φ, and a model (A, val), we define (A, val) to be a model of φ, denoted by (A, val)|= φ, inductively as follows.
• (A, val)|= x = y, if and only if val(x) = val(y).
• (A, val)|= R(x1, . . . , xn), if and only if (val(x1), . . . , val(xn))∈ RA.
• (A, val)|= ¬α, if and only if it is not true that (A, val) |= α.
• (A, val)|= α ∧ β, if and only if (A, val) |= α and (A, val) |= β.
• (A, val)|= α ∨ β, if and only if (A, val) |= α or (A, val) |= β.
• (A, val)|= ∃x α, if and only if there is some a ∈ A such that (A, val′)|= α where val′ is the valuation defined as follows:
val′(z) =
val(z), if z ̸= x a, if z = x
• (A, val) |= ∀x α, if and only if for every a ∈ A, (A, val′) |= α where val′ is the valuation defined as follows:
val′(z) =
val(z), if z ̸= x a, if z = x
When φ is a sentence, i.e., there is no free variable in φ, we can omit the valuation and write A|= φ.
An FO formula φ is said to be satisfiable if φ has a model, and φ is finitely satisfiable if φ has a finite model. We define the following two problems.
SAT(FO)
Input: An FO formula φ
Task: Output T , if φ is satisfiable. Otherwise, output F .
FIN-SAT(FO)
Input: An FO formula φ
Task: Output T , if φ is finitely satisfiable. Otherwise, output F . It is well known that in general satisfiability problem of FO is undecidable.
Theorem 2.1.1 [2, 1, 20, 18] SAT(FO) is undecidable.
Theorem 2.1.2 [18] FIN-SAT(FO) is undecidable.
Therefore, we are not considering the general FO in this thesis.
2.2 Two variable logic with counting quantifiers (C
2)
The syntax of C2 is defined inductively as follows:
ϕ ::= z = z | R(z, z) | ¬ϕ | ϕ ∧ ϕ | ∃≥kz ϕ
where z ranges over x, y, R∈ R, and k is a nonnegative integer. Here, ∃≥kzϕ(z) seman- tically means there exists at least k instances of z’s such that ϕ(z) holds. Observe that∀ is well-defined, since∀xφ is equivalent to ¬∃x¬φ for any formula φ. We note that x and y can be reused, for example,∀x(∃y(∃xϕ1(x, y)))∧ ∀y(∀xϕ2(x, y)) where ϕ1(x, y) and ϕ2(x, y) are quantifier-free formulas is an instance of C2formula.
Similar to FO we define SAT(C2) and FIN-SAT(C2) below.
SAT(C2)
Input: A C2 formula φ
Task: Output T , if φ is satisfiable. Otherwise, output F . FIN-SAT(C2)
Input: A C2 formula φ
Task: Output T , if φ is finitely satisfiable. Otherwise, output F . Theorem 2.2.1 [7, 13, 15, 3, 10] SAT(C2) is NEXPTIME-complete.
Theorem 2.2.2 [7, 15] FIN-SAT(C2) is NEXPTIME-complete.
2.2.1 The class RCS
The following is a standard normalization lemma, which is often used in the decision procedures for FO2 and C2formulas [17, 6, 15].
Lemma 2.2.3 (Scott’s normal form) For every C2 sentence ϕ, there is a formula
ϕ∗ := (
∀x α)
∧(
∀x∀y (β ∨ x = y))
∧ ∧
1≤h≤p
∀x∃=Chy (fh(x, y)∧ x ̸= y) (2.1)
that can be constructed in polynomial time in the length of ϕ, and satisfies the following conditions:
(C1) α is quantifier-free and equality-free.
(C2) β is quantifier-free and equality-free.
(C3) p is a positive integer.
(C4) For any h∈ {1, . . . , p}, fhis a binary predicate and Chis a positive integer.
(C5) For any positive integer µ≥ K := max1≤h≤pCh,
ϕ has a model of size µ if and only if ϕ∗has a model of size µ.
A C2 sentence of the form 2.1 is called Scott’s normal form. In this thesis, we assume the set of binary relations used is closed under reversal. Formally, it is stated as follows.
Definition 2.2.4 A C2 formula φ is an RCS formula, if it is in Scott’s normal form as in (2.1) and for every binary relation fhappearing in∧
1≤h≤p∀x∃=Chy(fh(x, y)∧ x ̸= y), there is some h′such that fhis the reversal of fh′.
As mentioned before, the proofs in [7, 13, 15] are rather complicated and involve a lot of guessing. In this thesis, we will present decision procedures for SAT(C2) and FIN-SAT(C2) problems, when the input formulas are restricted toRCS formulas with an entirely different technique. As mentioned earlier, our approach yields a few advantages:
1. It provides a characterization of models ofRCS formulas, i.e., every model of an RCS formula is a collection of regular digraphs and biregular graphs.
2. It implies the decidability of checking whether the spectrum of anRCS formula is infinite.
3. It yields a simple decision procedures for satisfiability and finite satisfiability prob- lems.
Remark 2.2.5 It is worth stating that the satisfiability and finite satisfiability problems for three-variable logic are already undecidable. [8]
Chapter 3
Presburger arithmetics
In this chapter, we introduce the formal definition of Presburger arithmetic and its exten- sion with infinity. We will also discuss the satisfiability problem in both cases.
3.1 Standard Presburger arithmetic
We define the following structureN := ⟨N, +, ≤, 0⟩, where +, ≤, 0 are interpreted in the standard way. A formula on Presburger arithmetic is an FO formula over the vocabulary {+, ≤, 0}.
The satisfiability problem for Presburger arithmetic is defined as follows.
SAT(Presburger)
Input: A Presburger formula φ
Task: Output T , ifN |= φ. Otherwise, output F .
It is known that SAT(Presburger) is decidable[16, 12], and the algorithm given by Pres- burger has nonelementary time complexity. The following theorem states that, in fact, the problem SAT(P resburger) is elementary.
Theorem 3.1.1 [12] SAT(P resburger) with input length n can be decided in O(222cn) for some constant c > 1.
However, the result above is not efficient enough for our need. So, we turn into a subclass of Presburger formulas.
Theorem 3.1.2 When the input Presburger formula φ is restricted to existential formula, then SAT(P resburger) is in NP.
Theorem 3.1.2 follows from Papadimitriou’s result for LIP [14], and the detailed discus- sion can be found in section 3.3.
3.2 Presburger arithmetic with infinity
Presburger arithmetic can in fact be further extended to include infinity in its domain. Let N∞:=N ∪ {∞}. We denote the following structure N∞ :=⟨N∞, +,≤, 0⟩.
• The constant 0 is interpreted as the standard zero.
• The operator + onN is interpreted in the standard way, and when ∞ is involved, it is defined as follows.
For every a∈ N, a + ∞ = ∞ + a = ∞ + ∞ = ∞
• The relation≤ on N is interpreted in the standard way, and when ∞ is involved, it is defined as follows.
For every a∈ N, a ≤ ∞ and ∞ ≤ ∞
Notice that the definition above is consistent with our intuition on infinity. We now define the following problem.
SAT(Presburger-inf)
Input: A Presburger formula φ
Task: Output T , ifN∞|= φ. Otherwise, output F .
3.3 The satisfiability of Presburger formula
In this section, we will only consider the existential Presburger sentences, i.e., the Pres- burger sentences of the form:
ϕ = ∃X1∃X2. . .∃Xnφ(X1, X2, . . . , Xn)
where φ is quantifier-free. We will show that deciding whetherN |= ϕ or N∞ |= ϕ can be solved using the technique of linear integer programming (LIP).
We first recall the following theorem proved by Christos H. Papadimitriou [14].
Theorem 3.3.1 [14] Let A be an m× n matrix and b an m-vector such that the absolute value of every entry of A or b is no larger than a. Then if there exists a solution x∈ Nn for Ax = b, there is some y∈ {0, 1, . . . n(ma)2m+1}nsuch that Ay = b.
From theorem 3.3.1, we can obtain the following corollary.
Corollary 3.3.2 Let A be an m×n matrix and b an m-vector such that the absolute value of any entry of A or b is no larger than a. Then if there exists a solution x∈ Nnfor Ax≤ b, there is some y∈ {0, 1, . . . (n + m)(ma)2m+1}nsuch that Ay ≤ b.
Proof. Let Imdenoted the m× m identity matrix, in other words,
Im :=
1 0 . . . 0
0 1 0
... . .. ...
0 0 . . . 1
It follows immediately by letting
A′ :=
(
A Im )
and noting that A′x′ = b has a solution if and only if Ax ≤ b has the solution x where x consists of the first n entries of x′. By theorem 3.3.1, x′ exists if and only if there is
y′ ∈ {0, 1, . . . (n + m)(ma)2m+1}n+m such that A′y′ = b. Finally, we can conclude that if Ax≤ b has a solution, there exists y ∈ {0, 1, . . . (n + m)(ma)2m+1}nwhere y consists
of the first n entries of y′and Ay ≤ b.
Now we can prove the following theorem.
Theorem 3.3.3 Both SAT(Presburger) and SAT(Presburger-inf) are in NP when the in- put formula φ is restricted to existential formulas.
Proof. We will describe the polynomial time nondeterministic algorithm here.
Via nondeterminism, for each disjunction A∨ B in φ(X1, . . . , Xn), we can eliminate either A or B by guessing correctly, since φ(X1, . . . , Xn) is of quantifier free. Therefore, we can assume φ(X1, . . . , Xn) is of the form ϕ1(X1, . . . , Xn) ∧ . . . ∧ ϕk(X1, . . . , Xn) where each ϕℓ(X1, . . . , Xn) is a linear inequality, which can be converted into an LIP instance Ax≤ b.
For SAT(Presburger), by corollary 3.3.2 and nondeterminism, we can guess the value of each Xifor i∈ {1, . . . , n} from {0, 1, . . . (n+k)(na)2n+1} where a is the largest abso- lute values of the coefficients in all linear equations ϕ1(X1, . . . , Xn), . . . , ϕk(X1, . . . , Xn).
Then we check whether the guessed values satisfy all ϕ1(X1, . . . , Xn), . . . , ϕk(X1, . . . , Xn), and conclude the Presburger formula is satisfiable if so, not satisfiable otherwise.
For SAT(Presburger-inf), it works similarly.
Chapter 4
Regular graphs
In this chapter, we will introduce some tools that we are going to use later on. In particular, we introduce two classes of regular graphs: bipartite regular graphs (biregular graphs) and regular directed graphs (regular digraphs). Intuitively, biregular graphs are bipartite graphs where the degree of each vertex is already fixed, and regular digraphs are directed graphs where the in-degree and out-degree of each vertex are already fixed.
We will show how to construct the existential Presburger formulas that characterize the existence of biregular graphs and regular digraphs. We present the formal definitions in section 4.1. The construction of the Presburger formulas for biregular graphs can be found in section 4.2. A similar construction for regular digraphs can be found in section 4.3.
4.1 Definitions
Section 4.1.1 contains the definition of biregular graphs and section 4.1.2 contains the definition of regular digraphs.
4.1.1 Biregular graphs
A undirected graph G is an ℓ-type bipartite graph if G = (U, V, E1, E2, . . . , Eℓ) is a bipartite graph where U and V are the partitions of the vertices and E1, E2, . . . , Eℓ are the pairwise disjoint subset of U× V .
For any vertex u, the degree of u, denoted by deg(u), is the number of edges adjacent to u in G. The degree of u in the edge set Eiis denoted by degE
i(u) for any i ∈ {1, . . . , ℓ}.
Observe that for vertex u in G, deg(u) =∑
i=1,...,ℓdegE
i(u).
In the following, for any set S and d, e ∈ N, Sd×eis defined to be the set of all d× e matrices whose entries are in S. For C ∈ Nℓ×mand D ∈ Nℓ×n, an ℓ-type bipartite graph G = (U, V, E1, E2, . . . , Eℓ) is (C, D)-biregular if there exists some partitions U1∪. . .∪Um
of U and V1∪ . . . ∪ Vnof V such that:
• For every i = 1, . . . ℓ and j = 1, . . . , m, and for any vertex u∈ Uj, degE
i(u) = Ci,j.
• For every i = 1, . . . , ℓ and j = 1, . . . , n, and for any vertex v ∈ Vj, degE
i(v) = Di,j.
We say a (C, D)-biregular graph G is of size ( ¯M , ¯N ) if ¯M = (|U1|, . . . , |Um|) and N = (¯ |V1|, . . . , |Vn|), and we call U1∪ . . . ∪ Umand V1∪ . . . ∪ Vna witness of G.
Note that we do not restrict the graph to be finite. In which case, some entry in ¯M or N is infinite.¯
4.1.2 Regular digraphs
An ℓ-type directed graph (digraph) G = (V, E1, E2, . . . , Eℓ) is defined similarly, where E1, E2, . . . , Eℓ are pairwise disjoint directed edges. For convenience, we always assume that the set E1 ∪ E2∪ . . . ∪ Eℓ is asymmetric, i.e., if (u, v) ∈ E1 ∪ E2 ∪ . . . ∪ Eℓ, then (v, u) /∈ E1∪ E2∪ . . . ∪ Eℓ.
Similar to how we define the degree of a vertex in an undirected graph, we define the out-going degree and incoming degree of a vertex in a directed graph. The for- mal definition is as follows: For any vertex u, the out-going degree of u, denoted by out-deg(u), is the number of out-going edges from u, and the incoming degree of u, de- noted by in-deg(u), is the number of incoming edges to u. The out-going degree of u in the edge set Ei is denoted by out-degE
i(u), and the incoming degree of u in the edge set Eiis denoted by in-degE
i(u) for any i ∈ {1, . . . , ℓ}. Observe that for any vertex u in G, out-deg(u) =∑
i=1,...,ℓout-degEi(u) and in-deg(u) =∑
i=1,...,ℓin-degEi(u).
For C, D ∈ Nℓ×n, an ℓ-type digraph G = (V, E1, E2, . . . , Eℓ) is a (C, D)-regular- digraph if there exists a partition of V = V1 ∪ . . . ∪ Vn, such that for every i = 1, . . . , ℓ and j = 1, . . . , n, and for any vertex v∈ Vj, in-degE
i(v) = Ci,j and out-degE
i(v) = Di,j. We say a (C, D)-regular digraph G is of size ¯N if ¯N = (|V1|, . . . , |Vn|), and we call V1∪ . . . ∪ Vna witness of G.
Observe that ¯N has infinite entry if and only if the number of vertices in G is infinite.
4.2 Presburger characterization of the existence of bireg- ular graph
In this section, we will prove the following theorem.
Theorem 4.2.1 For every two matrices C ∈ Nℓ×mand D ∈ Nℓ×n, there is a (quantifier- free) Presburger formula BiREGC,D( ¯X, ¯Y ), where ¯X = (X1, . . . , Xm) and ¯Y = (Y1, . . . , Yn), such that the followings hold: For any ¯M ∈ Nm∞and ¯N ∈ Nn∞, there is an ℓ-type (C, D)- biregular graph of size ( ¯M , ¯N ) if and only if BiREGC,D( ¯M , ¯N ) holds.
As a matter of fact, in the finite case, i.e., when ¯M and ¯N are over N instead of N∞, theorem 4.2.1 has already been proven by Kopczyński and Tan [9]. Our goal is to extend it to infinite case. The proof is divided into parts. We discuss the case of one dimensional matrices in subsection 4.2.1 and extend it to all matrices in 4.2.2.
From now on, for any matrix or vector M , we denote the sum of all entries of M by
∑M , and we denote the maximum among the sums of the columns of M by MC(M ),
i.e., MC(M ) := maxj{∑
iMi,j}.
4.2.1 1-type biregular graphs
We will first prove the simpler case of theorem 4.2.1 where C and D both only have one row. We will make use of the following notation: For any m, n, ℓ ∈ N and any two
matrices C ∈ Nm×ℓand D∈ Nn×ℓ,
HC,D :=
( ¯M , ¯N )
∑M +¯ ∑N < 2ℓ¯ · MC(C) · MC(D) + 3ℓ and
there exists a (C, D)-biregular graph of size ( ¯M , ¯N )
(4.1),
Below is the special case of theorem 4.2.1, where the matrices consist of a single row.
Lemma 4.2.2 For every two vectors ¯c∈ N1×mand ¯d∈ N1×n, there is a (quantifier-free) Presburger formula BiREG¯c, ¯d( ¯X, ¯Y ), where ¯X = (X1, . . . , Xm) and ¯Y = (Y1, . . . , Yn), such that the followings hold: For any ¯M ∈ Nm∞ and ¯N ∈ Nn∞, there exists a (¯c, ¯d)- biregular graph of size ( ¯M , ¯N ) if and only if BiREG¯c, ¯d( ¯M , ¯N ) holds.
Proof. Let ¯c = (c1, . . . , cm), and ¯d = (d1, . . . , dn). Let I := {i | ci = 0} and J :=
{j | dj = 0}, i.e., the zero entries in ¯c and ¯d. Let ¯c′ and ¯X′ be ¯c and ¯X without entries in I, and ¯d′ and ¯Y′ be ¯d and ¯Y without entries in J . Observe that both ¯c′ and ¯d′do not have zero entry.
Now define BiREG¯c, ¯d( ¯X, ¯Y ) as follows.
BiREG¯c, ¯d( ¯X, ¯Y ) := ( ∑
X¯′+∑
Y¯′ ≥ 2 · MC(¯c′)· MC( ¯d′) + 3 ∧ ( ¯X′· ¯c′ = ¯Y′· ¯d′) )
∨ ( ∨
( ¯M , ¯N )∈H¯c′, ¯d′
X = ¯¯ M ∧ Y = ¯¯ N )
(4.2)
where
Hc¯′, ¯d′ =
( ¯M , ¯N )
∑M +¯ ∑N < 2¯ · MC(¯c′)· MC( ¯d′) + 3 and
there exists a (¯c′, ¯d′)-biregular graph of size ( ¯M , ¯N )
by definition in 4.1.
The set H¯c′, ¯d′ can be obtained by checking whether there exists a (¯c′, ¯d′)-biregular graph of size ( ¯M , ¯N ) for every ( ¯M , ¯N ) such that∑M +¯ ∑N < 2¯ ·MC(¯c′)·MC( ¯d′) + 3.
Note that BiREG¯c, ¯d( ¯X, ¯Y ) holds means the vertices corresponding to ¯X′and ¯Y′form a (¯c′, ¯d′) -biregular graph while the rest of the vertices can be arbitrary, since they do not have adjacent edges.
Claim 1 For any two vectors ¯M ∈ Nm∞ and ¯N ∈ Nn∞, there exists a (¯c, ¯d)-biregular graph of size ( ¯M , ¯N ) if and only if BiREG¯c, ¯d( ¯M , ¯N ) holds.
Proof of claim. By theorem 7.3 of Kopczyński and Tan [9], we know the claim is true when ¯M ∈ Nm and ¯N ∈ Nn. Let ¯M′ denote ¯M without entries in I and ¯N′ denote ¯N without entries in J . We can observe that if both ¯M′ and ¯N′ only have finite entries, the claim holds, since Xi for i ∈ I and Yj for j ∈ J can be arbitrary. So,without loss of generality, we can consider the case where there is some infinite entry in ¯M′.
To prove the “only if” direction, we assume there exists a (¯c, ¯d)-biregular graph of size ( ¯M , ¯N ). Observe that it is trivial that∑M¯′+∑N¯′ ≥ 2 · MC(¯c′)· MC( ¯d′) + 3 holds.
Finally, both ¯M′· ¯c′ and ¯N′ · ¯d′ are the number of edges in the biregular graph, implying M¯′ · ¯c′ = ¯N′· ¯d′ =∞. Hence, BiREG¯c, ¯d( ¯M , ¯N ) holds.
For the “if” direction, we first observe that ( ¯M′, ¯N′) /∈ Hc¯′, ¯d′ since ¯M′ has infinite entry. Therefore, in order for BiREG¯c, ¯d( ¯M , ¯N ) to hold, ¯M′· ¯c′ = ¯N′· ¯d′ =∞ must hold, and we can conclude there is some infinite entry in ¯N′ as well. Let U = U1∪ . . . ∪ Um
be a partition where|Ui| = Mi and let V = V1∪ . . . ∪ Vnbe a partition where|Vj| = Nj. Now we can construct a (¯c, ¯d) biregular graph recursively by repeating the steps below.
• Assume the vertices in each Ui and Vj are ordered, and we iterate through the sets U1, . . . , Um, V1, . . . , Vn.
• Suppose the set we are currently at is Ui. We find the first vertex u such that its current degree is less than ci. Since ¯N′has at least one infinite entry, say Nj =∞, we can always find a vertex v in Vj with its number of edges less than dj during the construction, and we connect u and v with an edge.
• Similarly, suppose the set we are currently at is Vj. We find the first vertex v such that its current degree is less than dj. Since ¯M′ has at least one infinite entry, say Mi =∞, we can always find a vertex u in Ui with its number of edges less than ci during the construction, and we connect u and v with an edge.
With such construction, we will achieve a (¯c, ¯d)-biregular graph in infinite steps.
By symmetry, the claim also holds for the case where there is some infinite entry in
N¯′. Thus the claim holds.
The claim concludes the proof for lemma 4.2.2.
4.2.2 Proof of theorem 4.2.1
By deleting any zero column ¯ci(or ¯di) of C (or D, respectively) and adding the constraint Xi ≥ 0 (or Yi ≥ 0) to the resulting Presburger formula, we can assume both C, D do not contain any zero-column.
The formal construction of BiREGC,D( ¯X, ¯Y ) is as follows. First, we define the char- acteristic function χ :N∗ → {0, 1}∗ whereN∗ :=∪
k≥1Nk and{0, 1}∗ :=∪
k≥1{0, 1}k.
χ(a1, . . . , ak) := (b1, . . . , bk), where bi = 0 if ai = 0 and bi = 1 otherwise.
Also, let ¯c1, ¯c2, . . . , ¯cℓ be the row vectors of C, and ¯d1, ¯d2, . . . , ¯dℓ be the row vectors of D. That is,
C =
¯ c1
¯ c2
...
¯ cℓ
and D =
d¯1 d¯2 ... d¯ℓ
.
Now we can construct BiREGC,D( ¯X, ¯Y ) inductively:
• When ℓ = 1,
BiREGC,D( ¯X, ¯Y ) := BiREGc¯1, ¯d1( ¯X, ¯Y ).
• When ℓ≥ 2,
BiREGC,D( ¯X, ¯Y ) := ∨
( ¯M , ¯N )∈HC,D
X = ¯¯ M ∧ Y = ¯¯ N (4.3)
∨ ∨
i=1,...,ℓ
(
X¯ · χ(¯ci) + ¯Y · χ( ¯di)≥ 2 · MC(C) · MC(D) + 3
∧ BiREGC\¯ci,D\ ¯di( ¯X, ¯Y )
∧ BiREG¯ci, ¯di( ¯X, ¯Y )
)
where C\ ¯ci denotes the matrix C without ¯ci, and D\ ¯di denotes the matrix D without d¯i. Observe that HC,Dis defined with equation 4.1 and can be greedily computed with the same method as before.
Then we shall prove that there exists an ℓ-type (C, D)-biregular graph of size ( ¯M , ¯N ) if and only if BiREGC,D( ¯M , ¯N ) holds by induction. First, the case for ℓ = 1 is equivalent to the statement in lemma 4.2.2. Now suppose the induction hypothesis holds when C and D have no more than ℓ− 1 rows. We will prove for the case ℓ.
• For the “if” direction, we assume BiREGC,D( ¯M , ¯N ) holds. If the first part of for- mula 4.3 holds, that is∨
( ¯M , ¯N )∈HC,D
X = ¯¯ M ∧ ¯Y = ¯N holds, the biregular graph exists trivially. Therefore, we only have to consider the case where
M¯ · χ(¯ci) + ¯N · χ( ¯di)≥ 2 · MC(C) · MC(D) + 3
∧
BiREGC\¯ci,D\ ¯di( ¯M , ¯N ) ∧ BiREG¯ci, ¯di( ¯M , ¯N )
holds for some i∈ {1, . . . , ℓ}.
We first consider the case where|Ei| ̸= ∞. By induction hypothesis, there exist a (C\¯ci, D\ ¯di)-biregular graph with edge partition E1∪. . .∪Ei−1∪Ei+1∪. . .∪Eℓand a (¯ci, ¯di)-biregular graph with edge partition Eiboth of size ( ¯M , ¯N ). The vertices of the two graphs can be merged one-to-one since they have the same size. However, the edge set{E1, . . . , Eℓ} may not be pairwise disjoint, as there may be some some edge in Ei∩ (E1∪ . . . ∪ Ei−1∪ Ei+1∪ . . . ∪ Eℓ), and suppose (u, v) is such an edge.
Observe that there are at most 2· MC(C) · MC(D) + 2 vertices that can be reached within 2 edges in E1 ∪ . . . ∪ Eℓ from either u or v, including u and v themselves.
Since ¯M · χ(¯ci) + ¯N · χ( ¯di) ≥ 2 · MC(C) · MC(D) + 3, there exists some edge (u′, v′) ∈ Ei such that (u, v′), (u′, v) /∈ E1 ∪ . . . ∪ Ei−1 ∪ Ei+1∪ . . . ∪ Eℓ. By removing (u, v), (u′, v′) from Eiand adding (u, v′), (u′, v) to Ei, the resulting edge set Ei′ will still be an edge partition for a (¯ci, ¯di)-biregular graph. Moreover,
|Ei∩ (E1∪ . . . ∪ Ei−1∪ Ei+1∪ . . . ∪ Eℓ)| > |Ei′∩ (E1∪ . . . ∪ Ei−1∪ Ei+1∪ . . . ∪ Eℓ)|
holds. Hence, by repeating the step of replacing the edges in Ei, we will eventually get a set Ei′′such that|Ei′′∩(E1∪. . .∪Ei−1∪Ei+1∪. . .∪Eℓ)| = 0 when |Ei| ̸= ∞.
This completes the construction of (C, D)-biregular graph for finite Ei.
Now, we assume|Ei| = ∞. Observe that in this case, ¯M· χ(¯ci) + ¯N · χ( ¯di) = ∞, and hence ¯M · χ(¯ci) = ¯N · χ( ¯di) =∞.
By induction hypothesis, we can let G = (U, V, E1, . . . , Ei−1, Ei+1, . . . , Eℓ) be a (C \ ¯ci, D\ ¯di)-biregular graph, and let U = U1∪ . . . ∪ Um and V = V1∪ . . . ∪ Vn
be a witness of G. We will construct the (C, D)-biregular graph recursively by repeating the steps below.
– Assume the vertices in each Ui′ and Vj′ are ordered, and then we can iterate through the sets U1, . . . , Um, V1, . . . , Vn.
– Suppose the set we are currently at is Ui′. We find the first vertex u such that its current degree in Eiis less than Ci,i′. Since ¯N· χ( ¯di) = ∞, ¯N has at least one infinite entry, say Nj′ =∞, such that Di,j′ > 0, so we can always find a vertex v in Vj′ with its number of edges less than Di,j′ during the construction, and we add (u, v) into Ei.
– Similarly, suppose the set we are currently at is Vj′. We find the first vertex v such that its current degree in Ei is less than Di,j′. Since ¯M · χ(¯ci) = ∞, M has at least one infinite entry, say M¯ i′ =∞, such that Ci,i′ > 0, so we can always find a vertex u in Ui′with its number of edges less than Ci,i′ during the
construction, and we add (u, v) into Ei.
By repeating the steps above, we will obtain a (C, D)-biregular graph over infinite steps.
• Finally, we prove the “ only if ” direction. Suppose there is a (C, D)-biregular graph G = (U, V, E1, . . . , Eℓ) of size ( ¯M , ¯N ). If∑M +¯ ∑N < 2ℓ¯ ·MC(C)·MC(D)+3ℓ, then ( ¯M , ¯N )∈ HC,Dand BiREGC,D( ¯M , ¯N ) holds trivially.
Consider the case where ∑M +¯ ∑N¯ ≥ 2ℓ · MC(C) · MC(D) + 3ℓ. Since C and D both do not have zero-column, we can find some i ∈ {1, . . . , ℓ} such that ¯M · χ(¯ci) + ¯N · χ( ¯di) ≥ 2 · MC(C) · MC(D) + 3. Also, we notice that G1 := (U, V, E1, . . . , Ei−1, Ei+1, . . . , Eℓ) is a (C\ ¯ci, D \ ¯di)-biregular graph and G2 := (U, V, Ei) is a (¯ci, ¯di)-biregular graph. Therefore, by induction hypothesis, BiREGC\¯ci,D\ ¯di( ¯M , ¯N ) and BiREG¯ci, ¯di( ¯M , ¯N ) both hold.
This completes our proof.
4.3 Presburger characterization of the existence of regu- lar digraph
Using similar technique, regular digraphs can be characterized by Presburger formulas as well as stated in the following theorem.
Theorem 4.3.1 For every two matrices C, D ∈ Nℓ×m, there is a (quantifier-free) Pres- burger formula REGC,D( ¯X), where ¯X = (X1, . . . , Xm), such that the following holds:
For any ¯M ∈ Nm, there exists a (C, D)-regular-digraph of size ¯M if and only if REGC,D( ¯M ) holds.
Before proving theorem 4.3.1, we first prove some auxiliary lemmas.
Lemma 4.3.2 Let ¯c, ¯d∈ N1×m be one-row vectors. For every ¯M ∈ Nm that satisfies the inequality 2·∑M¯ ≥ 2 · MC(¯c) · MC( ¯d) + 3, the following holds.