Open AccessArticle

Compaction of Church Numerals

Isamu Furuya

^*,†

and

Takuya Kida

^†

Graduate School/Faculty of Information Science and Technology, Hokkaido University, Kita 14 Nishi 9, Sapporo 060-0814, Japan

Author to whom correspondence should be addressed.

^†

These authors contributed equally to this work.

Algorithms 2019, 12(8), 159; https://doi.org/10.3390/a12080159

Submission received: 28 June 2019 / Revised: 5 August 2019 / Accepted: 5 August 2019 / Published: 8 August 2019

(This article belongs to the Special Issue Data Compression Algorithms and their Applications)

Download

Browse Figures

Review Reports Versions Notes

Abstract

In this study, we address the problem of compaction of Church numerals. Church numerals are unary representations of natural numbers on the scheme of lambda terms. We propose a novel decomposition scheme from a given natural number into an arithmetic expression using tetration, which enables us to obtain a compact representation of lambda terms that leads to the Church numeral of the natural number. For natural number n, we prove that the size of the lambda term obtained by the proposed method is

O ({({slog}_{2} n)}^{(log n / log log n)})

. Moreover, we experimentally confirmed that the proposed method outperforms binary representation of Church numerals on average, when n is less than approximately 10,000.

Keywords:

loss-less compression; lambda term; higher-order compression

1. Introduction

The goal of this study is to obtain a compact lambda term (

λ

-term) that leads to the Church numeral of a given natural number. Church numerals are unary representations of natural numbers on

λ

-terms. Let

C (n)

be the Church numeral for natural number n; then

C (n)

is as follows:

C (n) = λ f . λ x . \overset{n}{\overset{︷}{f (f \dots (f}} x) \dots) .

Namely, the length of the Church numeral increases linearly with n. We want a more compact representation for n in the scheme of

λ

-terms. We refer to this task as the compaction of Church numerals. Using a binary representation for n is a natural idea, which enables to obtain a

λ

-term whose size is

Θ (log n)

. Another idea is to decompose n into an arithmetic expression and to represent it as a corresponding

λ

-term of the expression. This may reduce the size of the

λ

-term. For example,

n = 500

can be decomposed into

5 \times 10 \times 10

. The

λ

-term corresponding to this expression is given as

(λ p . λ q . λ f . λ x . p (q (q f)) x) C (5) C (10)

, which is much smaller than

C (500)

. In this idea, tetration can be used for the decomposition. Tetration is the next hyperoperation after exponentiation; it is defined as iterated exponentiation, e.g., the third tetration of two

^{3} 2 = 2^{2^{2}} = 16

. Since a function for calculating

^{i} k

on Church numerals is written as a

λ

-term whose size is

O (i + k)

, the size of the

λ

-term for

n =^{i} k

achieves

o (log n)

In this paper, we propose the Recursive tetrational partitioning (RTP) method to decompose a natural number using tetration. We also present an algorithm to generate a compact

λ

-term by using RTP. Moreover, we prove that the size of the obtained

λ

-term is

O ({({slog}_{2} n)}^{log n / log log n})

, where

{slog}_{2} n

is super-logarithm, which is the inverse operation of tetration. It is slightly worse than

O (log n)

, but we also prove that the size of the

λ

-term achieves

O ({slog}_{2} n)

in some cases, and our experimental result shows that our method outperforms a binary representation on

λ

-term on average, when n is less than approximately 10,000.

The compaction of Church numerals can be applied to data compression. Kobayashi et al. [1] proposed a compression method called higher-order compression that uses extended

λ

-terms as the data model. Their method translates an input data to a compact

λ

-term and then encodes the obtained

λ

-term. When a pattern consecutively appears n times in the input, such a part can be represented as a

λ

-term with

C (n)

. Thus efficient compaction of

C (n)

will help for higher-order compression when the data contains many repetitive patterns.

Yaguchi et al. [2] recently proposed an efficient algorithm for higher-order compression. They utilized a simply typed

λ

-term for efficient modeling and encoding. Differing from Kobayashi et al.’s method where each context occurring more than once in an input is extracted, Yaguchi et al.’s method extracts the most frequent context up to a certain size. They state in [2] that their method often achieves better performance than a grammar compression [3] with regard to compression ratio. Of course, their method can also be applied to the compaction of

C (n)

, while it takes time for repeating context extraction. We confirmed via experiments that our method tends to produce more compact

λ

-terms for Church numerals than Yaguchi et al.’s method. Note that our method can be easily incorporated into Yaguchi et al.’s method.

Contributions: The primary contributions of this study are as follows.

For natural numbers, we propose a novel decomposition scheme called RTP, which enables to obtain a compact representation of $λ$ -terms that leads to the Church numerals of the numbers.
By incorporating RTP, we propose an algorithm to perform the compaction of $C (n)$ for natural number n. Moreover, we prove that the size of the $λ$ -terms constructed by the algorithm is $O ({({slog}_{2} n)}^{log n / log log n})$ .
We implemented the proposed algorithm and conducted comparative experiments. With regard to the sizes of the obtained $λ$ -terms, the results show that the method is superior to that of Yaguchi et al. [2] and is also superior to a binary representation on $λ$ -term on average, when n is less than approximately 10,000.

Related studies: Better modeling for an input data leads better compression ratio. Since it is rare that we know the exact model for the input well beforehand, we usually use a universal source coding [4] or assume an appropriate model on the input to express the structure inherent in it. Kieffer and Yang [3] proposed grammar compression that uses a context-free grammar as the model. Grammar compression is a compression scheme that translates an input text into a set of context-free grammar rules, before encoding the rules. In grammar compression, a smaller set of rules gives a better compression ratio. Finding the optimal grammar is an NP-hard problem [5] but several excellent heuristic algorithms have been proposed for its solution [6,7,8,9,10,11]. Obtaining a smaller grammar is of great importance since most existing algorithms that work on grammar-compressed texts have running time dependent on the grammar sizes (see the excellent surveys e.g., [12,13]). Due to information theory arguments, any compression cannot be better than a binary encoding on all numbers. However, as stated in [1] and the above, higher-order compression can be better for a certain subset of numbers, i.e., for

n =^{i} k

Composition: The rest of this paper is organized as follows. In Section 2, we review tetration, lambda notation, and Church numerals. In Section 3, we define the proposed RTP method and present the translation algorithm using RTP. We also prove the upper bound of the size of the

λ

-term produced by our algorithm. In Section 4, we discuss application to higher-order compression and present our experimental result. Conclusions are presented in Section 5.

Previously, one page abstract of this work has been published in the proceedings of Data Compression Conference 2016 [14]. This paper is the full version article of [14].

2. Preliminaries

2.1. Tetration and Super-Logarithm

Definition 1

(Tetration). For natural numbers φ and t, the t-th tetration of φ, denoted by

^{t} φ

, is recursively defined as follows:

\begin{matrix} ^{t} φ : = \{\begin{matrix} 1 & (i f t = 0), \\ φ^{(^{t - 1} φ)} & (o t h e r w i s e) . \end{matrix} \end{matrix}

For example,

^{1} 2 = 2^{^{0} 2} = 2^{1} = 2

^{2} 2 = 2^{^{1} 2} = 2^{2} = 4

^{3} 2 = 2^{^{2} 2} = 2^{4} = 16

, and

^{4} 2 = 2^{^{3} 2} = 2^{16} = 65536

The following corollary is easily induced from Definition 1.

Corollary 1.

For natural numbers φ and t, it holds that

{log}_{φ}^{t} φ =^{t - 1} φ

Definition 2

(Super-logarithm). Super-logarithm, denoted by

slog

, is the inverse operation of tetration, defined as

{slog}_{φ} (^{t} φ) = t

with natural numbers φ and t.

The iterated logarithm of n, denoted by

{log}^{*} n

, is also known as the number of times the logarithm function must be applied to n before it becomes less than or equal to 1. For positive numbers, the super-logarithm is essentially equivalent to the iterated logarithm, i.e., it holds that

{log}^{*} n = ⌈ {slog}_{e} n ⌉

for any

n > 0

. However, note that in the

O (.)

-notation the size of super-logarithm depends on its base, that is,

O ({slog}_{a} n) \neq O ({slog}_{b} n)

a \neq b

2.2. Lambda Terms

Definition 3

(Lambda terms (

λ

-terms)). Let x be a variable. Then, lambda term (λ-term) M is defined inductively as follows:

\begin{matrix} M : : = x | λ x . M | M_{1} M_{2} . \end{matrix}

We call

λ

-terms formed

λ x . M

and

M_{1} M_{2}

λ-abstraction and functional application, respectively. Let

V_{1}

and

V_{2}

be the sets of the variables occurring in

M_{1}

and

M_{2}

, respectively, then, we identify

M_{1}

with

M_{2}

if there is an isomorphism from

V_{1}

V_{2}

M [x_{1} : = M_{1}, x_{2} : = M_{2}, \dots]

denotes the

λ

-term such that all occurrences of

x_{1}, x_{2}, \dots

in M are replaced by

M_{1}, M_{2}, \dots

, respectively.

We show in parentheses “(“ and “)” the precedence of combining

λ

-terms. However, for simplicity, we omit these parentheses based on standard omission rules, that is, we consider that functional applications are left-associative and prior to

λ

-abstractions. For example, we consider

M_{1} M_{2} M_{3}

(M_{1} M_{2}) M_{3}

rather than

M_{1} (M_{2} M_{3})

, and consider

λ x . M_{1} M_{2}

λ x . (M_{1} M_{2})

rather than

(λ x . M_{1}) M_{2}

Definition 4

(β-reduction). Let x be a variable, and

M_{1}

and

M_{2}

be λ-terms. Then,

β

-reduction is a relation of λ-terms, denoted by using

⟶_{β}

, such that

(λ x . M_{1}) M_{2} ⟶_{β} M_{1} [x : = M_{2}]

. We write

⟶_{β}^{*}

for the reflexive transitive closure of

⟶_{β}

For example,

(λ x . M_{1} x x) M_{2} ⟶_{β} M_{1} M_{2} M_{2}

because the right-side of

⟶_{β}

(M_{1} x x) [x : = M_{2}]

. Similarly,

(λ x . λ y . x y y) M_{1} M_{2} ⟶_{β}^{*} M_{1} M_{2} M_{2}

because

(λ x . λ y . x y y) M_{1} M_{2} ⟶_{β} (λ y . M_{1} y y) M_{2} ⟶_{β} M_{1} M_{2} M_{2}

We define the sizes of

λ

-terms as follows. The definition can be found in [1].

Definition 5

(Sizes of λ-terms). Let x be a variable and M be a λ-term. Then, we denote the size of the λ-term by

# M

, and inductively define the size of each λ-term as follows:

\begin{matrix} # x = 1, # (λ x . M) = # M + 1, # (M_{1} M_{2}) = # M_{1} + # M_{2} + 1 . \end{matrix}

For

λ

-terms

M_{1}

and

M_{2}

, we say that

M_{2}

is more compact than

M_{1}

# M_{2} < # M_{1}

. Compaction of a

λ

-term

M_{1}

means to find a

λ

-term

M_{2}

such that

M_{1} N ⟶_{β}^{*} M^{'}

and

M_{2} N ⟶_{β}^{*} M^{'}

for every N with a

λ

-term

M^{'}

, and

# M_{2} < # M_{1}

In this paper, we assume Word RAM as the computational model. Thus, we ignore the size of names and pointers when we discuss size and space.

2.3. Church Numerals

It is known that natural numbers are represented as Church numerals on

λ

-terms. Also, some

λ

-terms enabling arithmetic operations on Church numerals are provided.

Definition 6

(Church numerals). For natural number n, its Church numeral, denoted by

C (n)

, is

\begin{matrix} C (n) : = λ f . λ x . \overset{n}{\overset{︷}{f (f \dots (f}} x) \dots) . \end{matrix}

Corollary 2

(Arithmetic operations on Church numerals). Let

n_{1}

and

n_{2}

be natural numbers. Then we obtain three λ-terms

Add (n_{1}, n_{2})

Mul (n_{1}, n_{2})

, and

Tet (n_{1}, n_{2})

, such that

Add (n_{1}, n_{2}) ⟶_{β}^{*} C (n_{1} + n_{2})

Mul (n_{1}, n_{2}) ⟶_{β}^{*} C (n_{1} \cdot n_{2})

, and

Tet (n_{1}, n_{2}) ⟶_{β}^{*} C (^{n_{2}} n_{1})

, respectively, as follows:

\begin{matrix} Add (n_{1}, n_{2}) & = & (λ p . λ q . λ f . λ x . p f (q f x)) C (n_{1}) C (n_{2}), \\ Mul (n_{1}, n_{2}) & = & (λ p . λ q . λ f . λ x . p (q f) x) C (n_{1}) C (n_{2}), \\ Tet (n_{1}, n_{2}) & = & (λ p . λ f . λ x . \overset{n_{2}}{\overset{︷}{p p \dots p}} f x) C (n_{1}) . \end{matrix}

As can be seen in Corollary 2, in each

λ

-term, one

λ

-abstraction occurs first, and one or two Church numerals follow it. We call the former

λ

-abstraction and the following Church numerals function term and argument terms, respectively. For example, for

Add (n_{1}, n_{2})

(λ p . λ q . λ f . λ x . p f (q f x))

is the function term and the following

C (n_{1})

and

C (n_{2})

are the argument terms.

2.4. Binary Expression of Natural Numbers on $λ$ -Terms

Instead of Church numerals, we can represent natural numbers on

λ

-terms based on binary representation as follows [15].

Definition 7

(Binary representation on λ-terms). For natural number n, assume that its binary representation is

B_{k} B_{k - 1} \dots B_{2} B_{1}

with

B_{i} \in {0, 1}

for

1 \leq i \leq k

. Then, the binary expression of n on λ-terms, denoted by

B (n)

, is

\begin{matrix} B (n) : = b_{B_{1}} (b_{B_{2}} \dots (b_{B_{k - 1}} (b_{B_{k}} (λ x . x))) \dots), \end{matrix}

where

b_{0} = λ p . λ f . λ x . p f (p f x)

and

b_{1} = λ p . λ f . λ x . f (p f (p f x))

For example, the binary expression of 57 is 111001, then its binary expression on

λ

-terms is

b_{1} (b_{0} (b_{0} (b_{1} (b_{1} (b_{1} (λ x . x))))))

Note that

B (n) ⟶_{β}^{*} C (n)

and

# B (n) = Θ (log n)

follows Definition 7, while

# C (n) = Θ (n)

3. Proposed Method

3.1. Basic Idea

Let n be a natural number. As stated in Section 1, we can reduce

# C (n)

by using decomposition of n into an arithmetic expression. For the example, for

n = 500

, we can decompose it to an arithmetic expression

5 \cdot 10 \cdot 10

(more precisely,

(5 \cdot 10) \cdot 10

). A corresponding

λ

-term M of the expression is obtained as

M = (λ p . λ q . λ f . λ x . p (q f) x) ((λ p . λ q . λ f . λ x . p (q f) x) C (5) C (10)) C (10)

, and its size is 85, while

# C (500) = 1003

. Here,

M ⟶_{β}^{*} C (500)

. Moreover, we can make M more compact by combining two

λ

-abstractions into a single

λ

-abstraction, such as

(λ p . λ q . λ f . λ x . p (q (q f)) x)

C (5) C (10)

. This also generates

C (500)

through

β

-reduction similar to M, while its size is just 51.

We perform compaction of

C (n)

in the following three steps.

Step 1:: Decompose n into a special form of arithmetic expression, called TAE.
Step 2:: Translate the arithmetic expression into a $λ$ -term having the form of $M C (φ^{*})$ .
Step 3:: Apply the above Step 1 and 2 for the natural number $φ^{*}$ recursively.

For Step 1, we introduce TAEs in Section 3.2. For Step 2, we describe a translation method in Section 3.3. There are many ways to achieve decomposition of a natural number into a TAE, and the size of the translated

λ

-term changes depending on the expression. Then, we consider how we effectively obtain a TAE such that the translated

λ

-term of the TAE is compact. We propose RTP in Section 3.4, which is a heuristic approach decomposing a natural number into a TAE. Finally, for Step 3, we discuss further recursive compaction of Church numerals in Section 3.5.

3.2. Tetrational Arithmetic Expression (TAE)

We define tetrational arithmetic expression (TAE), denote by E, as a special form of arithmetic expression.

Definition 8

(Tetrational arithmetic expressions (TAEs)). Let N be an arbitrary natural number. Then, tetrational arithmetic expression (TAE) E is inductively defined as follows:

\begin{matrix} E : : = N | E + E | E \cdot E |^{N_{2}} N_{1} . \end{matrix}

If the evaluated result of a TAE E is a natural number n, we write

E [n]

for the TAE. Especially, let

E_{φ}

denote a restricted form of TAE with a natural number φ, defined as

\begin{matrix} E_{φ} : : = φ | E_{φ} + E_{φ} | E_{φ} \cdot E_{φ} |^{t} φ, \end{matrix}

where t is an arbitrary natural number.

For example,

2 \cdot 3 +^{2} 2

is a TAE, which is written as

E [10]

because its evaluated result is equal to 10. Similarly,

^{2} 2 \cdot 2 + 2

is also a TAE written as

E_{2} [10]

Informally speaking, reducing the number of natural numbers occurring in TAEs is effective for compaction of corresponding

λ

-terms, since it enables to reduce the number of the kinds of argument terms in the

λ

-terms; note that we can transform

(λ p . λ q . λ f . λ x . p f (q f x))

C (n_{1}) C (n_{1})

(λ p . λ f . λ x . p f (p f x))

C (n_{1})

, for example. From this point of view, it is expected that the corresponding

λ

-term of

E_{φ} [n]

becomes compact when n is a multiple of

φ

. Let

r = n mod φ

and

\bar{n} = n - r

. Thus, we consider TAE

E_{φ} [\bar{n}] + r

instead of

E_{φ} [n]

itself.

3.3. Translation from TAE to $λ$ -Term

Here, we consider how we translate an TAE

E_{φ} [\bar{n}]

to a corresponding

λ

-term. A simple method is to translate each arithmetic operation occurring in

E_{φ} [\bar{n}]

by using

λ

-terms stated in Corollary 2. For example, for

E_{2} [200] =^{3} 2 \cdot (^{2} 2 \cdot 2 +^{2} 2) +^{2} 2 \cdot 2

, the translated

λ

-term is obtained as

\begin{matrix} Add (Mul (Tet (2, 3), Add (Mul (Tet (2, 2), 2), Tet (2, 2))), Mul (Tet (2, 2), 2)) . \end{matrix}

As seen above, the corresponding

λ

-term seems to become long. However, we can translate the TAE to a compact

λ

-term.

Definition 9

(Corresponding functional

λ

-terms (CFLT)). Let n and φ be natural numbers. We say a λ-term M is a corresponding functional λ-term (CFLT) of TAE

E [n]

if both of the following conditions hold for M:

1.: M has the form of $(λ p . λ f . λ x . \hat{M}) C (φ)$ , where $\hat{M}$ is a λ-term that contains no Church numerals.
2.: $M ⟶_{β}^{*} C (n)$ .

For a given TAE E, We denote a CFLT of E by

Λ (E)

For example, for

E_{2} [200] =^{3} 2 \cdot (^{2} 2 \cdot 2 +^{2} 2) +^{2} 2 \cdot 2

, one of its CFLTs is

\begin{matrix} Λ (E_{2} [200]) = (λ p . λ f . λ x . p p p (λ y . p p (λ z . p f z) (p p f y)) (p p (λ w . p f w) x)) C (2) . \end{matrix}

(1)

The size of the above (1) is 48, which is much smaller than

# C (200) = 403

For TAE

E_{φ}

, the following lemma holds.

Lemma 1.

For any TAE

E_{φ}

, its CFLT exists.

Proof.

We prove the lemma by induction. Considering

λ

-terms stated in Corollary 2, for

E_{φ} = φ

and

E_{φ} =^{t} φ

, their CFLTs are obtained as

\begin{matrix} Λ (φ) & = & (λ p . λ f . λ x . p f x) C (φ) and \\ Λ (^{t} φ) & = & (λ p . λ f . λ x . \overset{t}{\overset{︷}{p p \dots p}} f x) C (φ), \end{matrix}

respectively.

Let

E_{φ}^{(1)}

and

E_{φ}^{(2)}

be TAEs. Then, we assume that there are CFLTs of

E_{φ}^{(1)}

and

E_{φ}^{(2)}

. Moreover, by Definition 9, we assume that

Λ (E_{φ}^{(1)}) = (λ p . λ f . λ x . {\hat{M}}^{(1)}) C (φ)

and

Λ (E_{φ}^{(2)}) = (λ p . λ f . λ x . {\hat{M}}^{(2)}) C (φ)

. Here, CFLTs of

E_{φ} + E_{φ}

and

E_{φ} \cdot E_{φ}

are obtained as

\begin{matrix} Λ (E_{φ}^{(1)} + E_{φ}^{(2)}) & = & (λ p . λ f . λ x . (λ y . {\hat{M}}^{(1)} [x : = y]) ((λ y . {\hat{M}}^{(2)} [x : = y]) x)) C (φ) a n d \\ Λ (E_{φ}^{(1)} \cdot E_{φ}^{(2)}) & = & (λ p . λ f . λ x . (λ g . λ y . {\hat{M}}^{(1)} [f : = g, x : = y]) (λ y . {\hat{M}}^{(2)} [x : = y]) x) C (φ), \end{matrix}

respectively. □

For compaction, next we introduce simplification of

λ

-terms.

Definition 10

(Simplification of λ-terms). Let x be a variable,

M_{1}

and

M_{2}

be λ-terms, and

{\tilde{⟶}}_{β}

be a special β-reduction from

(λ x . M_{1}) M_{2}

M_{1} [x : = M_{2}]

which defined only if at least one of the following holds:

1.: x occurs in $M_{1}$ only once (in other words, x is linear in $M_{1}$ ),
2.: $# M_{2} = 1$ .

Then, simplification is reflexive transitive closure of

{\tilde{⟶}}_{β}

By using simplification for

(λ x . M_{1}) M_{2}

, we can reduce the size of the

λ

-term since

# M_{1} [x : = M_{2}] = # M_{1} + # M_{2} - 1

(if x occurs in

M_{1}

only once) or

# M_{1} [x : = M_{2}] = # M_{1}

(if

# M_{2} = 1

) holds, while

# ((λ x . M_{1}) M_{2}) = # M_{1} + # M_{2} + 2

From the above discussion, we can translate a given TAE

E_{φ}

into a compact

λ

-term by applying simplification after translating

E_{φ}

into the CFLT in the manner of the proof of Lemma 1. We show the translating algorithm in Algorithm 1. The time complexity of Algorithm 1 depends on the total number of the occurrences of addition and multiple in the input TAE.

Algorithm 1 Translation from

E_{φ}

to its CFLT.

Input:

E_{φ}

Output:

Λ (E_{φ})

1: function translate(

E_{φ}

)

2: if

E_{φ} = φ

then

3: return

(λ p . λ f . λ x . p f x)

4: else if

E_{φ} =^{t} φ

then

5: return

(λ p . λ f . λ x . {\overset{︷}{p p \dots p}}^{t} f x)

6: else if

E_{φ} = E_{φ}^{(1)} + E_{φ}^{(2)}

then

(λ p . λ f . λ x . {\hat{M}}^{(1)}) : =

translate(

E_{φ}^{(1)}

)

(λ p . λ f . λ x . {\hat{M}}^{(2)}) : =

translate(

E_{φ}^{(2)}

)

9: return

(λ p . λ f . λ x . (λ y . {\hat{M}}^{(1)} [x : = y]) ((λ y . {\hat{M}}^{(2)} [x : = y]) x))

10: else if

E_{φ} = E_{φ}^{(1)} \cdot E_{φ}^{(2)}

then

11:

(λ p . λ f . λ x . {\hat{M}}^{(1)}) : =

translate(

E_{φ}^{(1)}

)

12:

(λ p . λ f . λ x . {\hat{M}}^{(2)}) : =

translate(

E_{φ}^{(2)}

)

13: return

(λ p . λ f . λ x . (λ g . λ y . {\hat{M}}^{(1)} [f : = g, x : = y]) (λ y . {\hat{M}}^{(2)} [x : = y]) x)

14: end if

15: end function

16:

17:

M : =

translate(

E_{φ}

)

18:

M^{*} : =

the

λ

-term generated by simplification of M

19: return

(M^{*} C (φ))

3.4. Recursive Tetrational Partitioning (RTP)

We consider how we obtain an effective

E_{φ} [\bar{n}]

for given

\bar{n}

and

φ

in the sense that the CFLT becomes more compact.

E_{φ} [\bar{n}]

is more effective if it consists of fewer arithmetic operations because the size of the CFLT increases with the number of arithmetic operations occurring in it. For example, for

\bar{n} = 12

and

φ = 3

3 \cdot 3 + 3

is more effective than

3 + 3 + 3 + 3

. We propose recursive tetrational partitioning (RTP), which is a heuristic method to find such an effective TAE for given

\bar{n}

and

φ

Definition 11

(Recursive tetrational partitioning (RTP)). Let n and

φ \geq 2

be natural numbers, and

r = n mod φ

. Then, recursive tetrational partitioning (RTP) is a method of decomposing n into

E_{φ} [n - r] + r

. Let

T_{φ} [n]

denote a TAE generated by RTP, then RTP recursively decompose n in the manner described as follows:

\begin{matrix} T_{φ} [n] : = \{\begin{matrix} n & (i f n \leq φ), \\ ^{k} φ \cdot T_{φ} [p_{k} - r_{k}] + (\overset{r_{k}}{\overset{︷}{^{k} φ + \dots +^{k} φ}}) \\ + \dots +^{1} φ \cdot T_{φ} [p_{1} - r_{1}] + (\overset{r_{1}}{\overset{︷}{^{1} φ + \dots +^{1} φ}}) + r & (o t h e r w i s e), \end{matrix} \end{matrix}

where k is the maximum natural number such that

^{k} φ \leq n

, and for

1 \leq i \leq k

p_{i}

is the natural number such that

0 \leq p_{i} \cdot^{i} φ <^{i + 1} φ

and

r_{i} = p_{i} mod φ

. Here, if

p_{i} - r_{i} = 0

or 1, we omit the corresponding terms or coefficients in the generated TAE, respectively.

For example, for

n = 201

and

φ = 2

T_{φ} [n] = T_{2} [201] =^{3} 2 \cdot (^{2} 2 \cdot 2 +^{2} 2) +^{2} 2 \cdot 2 + 1

. In Definition 11, note that each coefficient

(p_{i} - r_{i})

^{i} φ

is a multiple of

φ

, and then, each

T_{φ} [p_{i} - r_{i}]

generates just

E_{φ} [p_{i} - r_{i}]

without any remainder.

Theorem 1.

For given natural numbers n and

φ \geq 2

T_{φ} [n]

is uniquely determined.

Proof.

We prove the statement by induction. For

n \leq φ

, the statement clearly holds. Otherwise, let k be the maximum natural number such that

^{k} φ \leq n

\bar{n} = n - (n mod φ)

, and

P = {p_{k}, p_{k - 1}, \dots, p_{1}}

be a set such that

\begin{matrix} \sum_{i = 1}^{k} p_{i} \cdot^{i} φ = \bar{n} . \end{matrix}

(2)

Then, it is sufficient to prove that P is uniquely determined for

\bar{n}

since k,

\bar{n}

, and each

r_{i}

are uniquely determined from n,

φ

, and P. By Definition 11,

0 \leq p_{i} \cdot^{i} φ <^{i + 1} φ

holds for

1 \leq i \leq k

. We assume that

p_{k}

is not the maximum natural number such that

p_{k} \cdot^{k} φ \leq \bar{n}

, then

\begin{matrix} \bar{n} - p_{k} \cdot^{k} φ \geq^{k} φ + (\bar{n} mod^{k} φ) \end{matrix}

(3)

holds. On the other hand,

\begin{matrix} \sum_{i = 1}^{k - 1} p_{i} \cdot^{i} φ < \sum_{i = 2}^{k}^{i} φ \leq^{k} φ + (\bar{n} mod^{k} φ) \end{matrix}

(4)

also holds. By (3) and (4),

\sum_{i = 1}^{k} p_{i} \cdot^{i} φ < \bar{n}

holds. However, this contradicts to (2). Therefore,

p_{k}

must be the maximum one. Similar discussion holds for

p_{k - 1}, p_{k - 2}, \dots, p_{2}

, that is, for

2 \leq i \leq k - 1

p_{i}

is the maximum natural number such that

p_{i} \cdot^{i} φ \leq \bar{n} mod^{i + 1} φ

. By (2),

p_{1}

is unique if

p_{k}, p_{k - 1} \dots, p_{2}

are unique. Hence, P is uniquely determined for

\bar{n}

and k. □

Theorem 2.

For given natural numbers n and

φ \geq 2

, there exists a CFLT of

T_{φ} [n]

Proof.

By Definition 11,

T_{φ} [n]

has form of

E_{φ} [\bar{n}] + r

, where

r = n mod φ

and

\bar{n} = n - r

. By Lemma 1, for

E_{φ} [\bar{n}]

, there is a CFLT

Λ (E_{φ} [\bar{n}]) = (λ p . λ f . λ x . \hat{M}) C (φ)

. Then, a CFLT of

T_{φ} [n]

is obtained as

\begin{matrix} Λ (T_{φ} [n]) = (λ p . λ f . λ x . \hat{M} [x : = (\overset{r}{\overset{︷}{f (f \dots (f}} x) \dots))]) C (φ) . \end{matrix}

□

For the size of

Λ (T_{φ} [n])

, the following two lemmas hold.

Lemma 2.

For a given

T_{φ} [n]

with natural numbers n and

φ \geq 2

, let

N_{a}

and

N_{m}

be the numbers of addition and multiplication occurring in

T_{φ} [n]

, respectively. Also, for tetration

^{n_{2}} n_{1}

, we call

n_{2}

second argument of tetration, then let

N_{t}

be the sum of second arguments of all tetration occurring in

T_{φ} [n]

. Then,

\begin{matrix} # Λ (T_{φ} [n]) \leq 14 N_{a} + 8 N_{m} + 2 N_{t} + 2 r + 2 φ + 6 \end{matrix}

holds with

r = n mod φ

Proof.

By Definition 9, we assume that

Λ (E_{φ}^{(i)}) = (λ p . λ f . λ x . {\hat{M}}^{(i)}) C (φ)

. Then, the following holds:

\begin{matrix} # Λ (^{t} φ) & = & # ((λ p . λ f . λ x . \overset{t}{\overset{︷}{p p \dots p}} f x) C (φ)) \\ = & 2 t + 7 + # C (φ), \end{matrix}

(5)

\begin{matrix} # Λ (^{t} φ \cdot E_{φ}^{(1)}) & = & # ((λ p . λ f . λ x . \overset{t}{\overset{︷}{p p \dots p}} (λ y . {\hat{M}}^{(1)} [x : = y]) x) C (φ)) \end{matrix}

(6)

\begin{matrix} = & # {\hat{M}}^{(1)} + 2 t + 7 + # C (φ), \end{matrix}

(7)

\begin{matrix} # Λ (E_{φ}^{(1)} + E_{φ}^{(2)}) & = & # ((λ p . λ f . λ x . (λ y . {\hat{M}}^{(1)} [x : = y]) ((λ y . {\hat{M}}^{(2)} [x : = y]) x)) C (φ)) \\ = & # {\hat{M}}^{(1)} + # {\hat{M}}^{(2)} + 9 + # C (φ) . \end{matrix}

(8)

By Definition 11,

T_{φ} [n]

has the form of

E_{φ} + r

, where

E_{φ}

is TAE inductively defined as

E_{φ} : : =^{t} φ |^{t} φ \cdot E_{φ} | E_{φ} + E_{φ}

with natural number t. Therefore, by Equations (5), (7), and (8),

\begin{matrix} # Λ (E_{φ}) & \leq & 6 N_{a} + 4 N_{m} + 2 N_{t} + 3 + 4 \sum_{i} {\hat{M}}^{(i)} + # C (φ) \\ = & 6 N_{a} + 4 N_{m} + 2 N_{t} + 3 + 4 (2 N_{a} + N_{m}) + # C (φ) \\ = & 14 N_{a} + 8 N_{m} + 2 N_{t} + 3 + # C (φ) . \end{matrix}

(9)

holds. Note that, in (5), (7), and (8), the constant size 3 arise from

λ p . M

λ f . M

, and functional application of

λ

-abstraction and Church numeral

C (φ)

, which do not appear in higher level

λ

-term as

{\hat{M}}^{(i)}

. Also note that

{\overset{︷}{p p \dots p}}^{t}

occurring in (6) is the

λ

-term generated by simplification. Then,

\begin{matrix} # Λ (T_{φ} [n]) & \leq & 14 N_{a} + 8 N_{m} + 2 N_{t} + 2 r + 3 + # C (φ) \\ = & 14 N_{a} + 8 N_{m} + 2 N_{t} + 2 r + 2 φ + 6 \end{matrix}

follows (9). □

Lemma 3.

For a given natural number

n > 8

, there is a natural number φ such that

2 \leq φ \leq ⌊ \sqrt{n} ⌋

and

# Λ (T_{φ} [n]) < # C (n)

Proof.

Let

φ = ⌊ \sqrt{n} ⌋

and

r = n mod φ

. By Definition 11,

T_{φ} [n]

has the form of

φ \cdot φ + r

. Then, since

r < φ

\begin{matrix} # Λ (T_{φ} [n]) & = & # Λ (φ \cdot φ + r) \\ = & # ((λ p . λ f . λ x . p (p f) (\overset{r}{\overset{︷}{f (f \dots (f}} x) \dots))) C (φ)) \\ = & 2 r + 2 φ + 14 \\ < & 4 φ + 14 \\ \leq & 4 \sqrt{n} + 14 \end{matrix}

(10)

holds. By (10), the statement holds for

n > 15

because

# C (n) = 2 n + 3

. For

8 < n \leq 15

, the statement holds when

φ = 3

. Table 1 shows n,

# C (n)

, and

# Λ (T_{φ} [n])

in such cases. □

3.5. Further Compaction

Let

Λ (T_{φ} [n]) = M C (φ)

. Hereinafter, we call M and

C (φ)

function term and argument term, respectively, similarly to the

λ

-terms stated in Corollary 2. Moreover, for a given natural number n,

φ^{*}

denotes a natural number such that

φ^{*} \geq 2

and

# Λ (T_{φ^{*}} [n])

is smaller than any other

# Λ (T_{φ} [n])

, where

φ \neq φ^{*}

Lemma 3 implies that there is room for more compaction of

Λ (T_{φ^{*}} [n])

φ^{*} > 8

, since there is a natural number

φ_{1}^{*} \geq 2

such that

# Λ (T_{φ_{1}^{*}} [φ^{*}]) < # C (φ^{*})

. This holds recursively, that is, we can obtain a compact

λ

-term

L (n)

such that

L (n) ⟶_{β}^{*} C (n)

and

# L (n) \leq # Λ (T_{φ^{*}} [n])

\begin{matrix} L (n) : = \{\begin{matrix} Λ (T_{φ^{*}} [n]) & (i f φ^{*} \leq 8), \\ M L (φ^{*}) & (i f φ^{*} > 8), \end{matrix} \end{matrix}

where M is the function term of

Λ (T_{φ^{*}} [n])

. We show an algorithm generating

L (n)

for a given n in Algorithm 2. In line 9, we use Algorithm 1 to obtain

Λ (E_{φ})

. Assume that

O (α)

is the time complexity of Algorithm 1, then as shown in Algorithm 2,

O (α \sqrt{n})

time is required to obtain

φ^{*}

once.

For the size of

L (n)

, the following theorem holds.

Theorem 3.

For a given natural number n if there are φ and t such that

^{t} φ = n

, then

# L (n) = O ({slog}_{φ} n)

Proof.

By Definition 11 and Lemma 3,

\begin{matrix} # L (n) & \leq & # Λ (T_{φ} [n]) \\ = & # ((λ p . λ f . λ x . \overset{t}{\overset{︷}{p p \dots p}} f x) C (φ)) \\ \leq & # ((λ p . λ f . λ x . \overset{t}{\overset{︷}{p p \dots p}} f x) C (8)) \\ = & 2 t + 26 . \end{matrix}

holds. Since

t = {slog}_{φ} n

, the statement holds. □

As seen in the above Theorem 3, by using proposed method, we can obtain

L (n)

which is a tetrationally compact expression of

C (n)

for some n. This is properly smaller than

B (n)

which we introduced in Definition 7. However, Theorem 3 refers to only special cases of n. For general n, the following theorem holds.

Algorithm 2 Generation of

L (n)

for a given natural number n.

Input: n

Output:

L (n)

1: function generate(n)

2: if

n \leq 8

then

3: return

C (n)

4: else

φ : = 2

φ^{*} : = φ

minsize : = \infty

M : = NULL

6: while

φ \leq ⌊ \sqrt{n} ⌋

r : = n mod φ

E_{φ} + r : = T_{φ} [n]

((λ p . λ f . λ x . \hat{M}) C (φ)) : = Λ (E_{φ})

10:

Λ (T_{φ} [n]) : = ((λ p . λ f . λ x . \hat{M} [x : = ({\overset{︷}{f (f \dots (f}}^{r} x) \dots))]) C (φ))

11: if

# Λ (T_{φ} [n]) < minsize

then

12:

φ^{*} \leftarrow φ

13:

minsize \leftarrow # Λ (T_{φ} [n])

14:

M \leftarrow

the function term of

Λ (T_{φ} [n])

15: end if

16:

φ \leftarrow φ + 1

17: end while

18:

M_{\arg} : =

generate(

φ^{*}

)

19: return

(M M_{\arg})

20: end if

21: end function

22:

23: return generate(n)

Theorem 4.

For a given natural number n,

# L (n) = O ({({slog}_{2} n)}^{\frac{log n}{log log n}})

We need the following lemma to prove Theorem 4.

Lemma 4.

For a given natural number n, let

M_{φ^{*}}

and

M_{2}

be function terms of

Λ (T_{φ^{*}} [n])

and

Λ (T_{2} [n])

, respectively. Then,

# M_{φ^{*}} \leq # M_{2}

holds.

Proof.

By Definition 5,

# Λ (T_{φ^{*}} [n]) = # M_{φ^{*}} + # C (φ^{*}) + 1

and

# Λ (T_{2} [n]) = # M_{2} + # C (2) + 1

holds. Therefore, by definition of

φ^{*}

, the following holds:

\begin{matrix} # Λ (T_{φ^{*}} [n]) \leq # Λ (T_{2} [n]) & \Leftrightarrow & # M_{φ^{*}} + # C (φ^{*}) + 1 \leq # M_{2} + # C (2) + 1 \\ \Leftrightarrow & # M_{φ^{*}} - # M_{2} \leq # C (2) - # C (φ^{*}) \\ \Leftrightarrow & # M_{φ^{*}} - # M_{2} \leq 0 . \end{matrix}

The statement follows the above. Note that, by Definition 11,

T_{φ} [n]

is only defined for

φ \geq 2

. □

Now we can prove Theorem 4.

Proof of Theorem 4.

By the definition of

L (n)

, we assume that

L (n) = M_{φ^{*}} L (φ^{*})

, where

M_{φ^{*}}

is the function term of

Λ (T_{φ^{*}} [n])

. By Definition 5, the following holds:

\begin{matrix} # L (n) = # M_{φ^{*}} + # L (φ^{*}) + 1 . \end{matrix}

(11)

First, we consider the size of

M_{φ^{*}}

. For a given natural number n, we inductively define a TAE

{\hat{T}}_{2} (n)

as follows:

\begin{matrix} {\hat{T}}_{2} (n) : = \{\begin{matrix} n & (if n \leq 2), \\ ^{k} 2 \cdot {\hat{T}}_{2} (p_{k} - r_{k}) +^{k} 2 + \dots +^{1} 2 \cdot {\hat{T}}_{2} (p_{1} - r_{1}) +^{1} 2 + 2 & (otherwise) . \end{matrix} \end{matrix}

where k is the maximum natural number such that

^{k} 2 \leq n

, and for

1 \leq i \leq k

, each

p_{i}

is the natural number such that

0 \leq p_{i} \cdot^{i} 2 <^{i + 1} 2

and

r_{i} = p_{i} mod 2

. If

p_{i} = 0

or 1, we omit the corresponding terms or coefficients in the TAE, similarly to

T_{2} [n]

. While

{\hat{T}}_{2} (n)

is not

E_{2} [n]

(since there is a case that the evaluated result of

{\hat{T}}_{2} (n)

is not equal to n),

{\hat{T}}_{2} (n)

is clearly a TAE

E_{φ}

. Therefore, by Lemma 1, a CFLT

Λ ({\hat{T}}_{2} (n))

exists. By Lemma 2, the upper bound of

# Λ ({\hat{T}}_{2} (n))

depends on the number of the arithmetic operations occurring in

{\hat{T}}_{2} (n)

. Let

N_{a}

and

N_{m}

be the numbers of addition and multiplication occurring in

T_{2} [n]

, respectively. Also, for tetration

^{n_{2}} n_{1}

, we call

n_{2}

second argument of tetration, then let

N_{t}

be the sum of second arguments of all tetration occurring in

T_{2} [n]

. Let us denote

{\hat{N}}_{a}

{\hat{N}}_{m}

, and

{\hat{N}}_{t}

similarly for

{\hat{T}}_{2} (n)

. Here,

N_{a} \leq {\hat{N}}_{a}

N_{m} \leq {\hat{N}}_{m}

, and

N_{t} \leq {\hat{N}}_{t}

hold, since in

T_{2} [n]

, no terms increase compared to

{\hat{T}}_{2} (n)

while the terms

^{i} 2

following

^{i} 2 \cdot {\hat{T}}_{2} (n)

is reduced if

r_{i} = 0

. Then,

\begin{matrix} # Λ (T_{2} [n]) & \leq & max (# Λ ({\hat{T}}_{2} (n))) \\ \leq & 14 {\hat{N}}_{a} + 8 {\hat{N}}_{m} + 2 {\hat{N}}_{t} + 11 . \end{matrix}

follows Lemma 2. Note that

(n mod 2) \leq 1

. Let

\hat{N} (n) = 14 {\hat{N}}_{a} + 8 {\hat{N}}_{m} + 2 {\hat{N}}_{t} + 11

, then by the above,

\begin{matrix} # Λ (T_{2} [n]) & \leq & \sum_{i = 1}^{k} (14 \cdot 2 + 8 \cdot 1 + 2 \cdot 2 i + \hat{N} (p_{i} - r_{i}) + 11) \\ \leq & 2 k^{2} + 13 k + k \hat{N} (p_{k} - r_{k}) . \end{matrix}

(12)

holds. We assume that

ρ

is the depth of the recursion in

{\hat{T}}_{2} (n)

. Then,

k \hat{N} (p_{k} - r_{k}) = O (k^{ρ})

holds.

Therefore, by (12),

\begin{matrix} # Λ (T_{2} [n]) = O (k^{ρ}) . \end{matrix}

(13)

holds. Here,

{(^{k} 2)}^{ρ} \leq n \Leftrightarrow ρ log^{k} 2 \leq log n

holds. Thus, by Corollary 1,

\begin{matrix} ρ \leq \frac{log n}{log^{k} 2} < \frac{log n}{log^{({slog}_{2} n) - 1} 2} = \frac{log n}{log log n} \end{matrix}

holds. By Definition 2,

k \leq {slog}_{2} n

. Then,

\begin{matrix} # Λ (T_{2} [n]) = O ({({slog}_{2} n)}^{\frac{log n}{log log n}}) \end{matrix}

(14)

follows (13). By Lemma 4,

# M_{φ^{*}}

is bounded by

Λ (T_{2} [n])

since the function term of

Λ (T_{2} [n])

is smaller than

Λ (T_{2} [n])

itself. Then, by (14), we obtain

\begin{matrix} # M_{φ^{*}} = O ({({slog}_{2} n)}^{\frac{log n}{log log n}}) . \end{matrix}

(15)

Second, we consider the size of

L (φ^{*})

. If

φ^{*} \leq 8

, it is constant since

L (φ^{*}) = C (φ^{*})

and

# C (φ^{*}) \leq C (8) = 19

. Otherwise,

# L (φ^{*}) \leq Λ (T_{φ_{1}^{*}} [φ^{*}])

holds with a natural number

φ_{1}^{*}

. By the definition of

L (n)

, it recursively holds until

φ_{i}^{*} \leq 8

. Thus, by (11) and (15),

\begin{matrix} # L (φ^{*}) = O ({({slog}_{2} φ^{*})}^{\frac{log φ^{*}}{log log φ^{*}}}) + O ({({slog}_{2} φ_{1}^{*})}^{\frac{log φ_{1}^{*}}{log log φ_{1}^{*}}}) + \dots + O ({({slog}_{2} φ_{m}^{*})}^{\frac{log φ_{m}^{*}}{log log φ_{m}^{*}}}) + m + C \end{matrix}

(16)

holds, where C is a constant and m is a natural number such that

φ_{m}^{*} \leq 8

. Here, by Lemma 2,

φ^{*} \leq ⌊ \sqrt{n} ⌋, φ_{1}^{*} \leq ⌊ \sqrt{φ^{*}} ⌋, φ_{2}^{*} \leq ⌊ \sqrt{φ_{1}^{*}} ⌋, \dots, φ_{m}^{*} \leq ⌊ \sqrt{{φ^{*}}_{m - 1}} ⌋

holds, then the recursion depth m is

O (log n)

. Hence,

\begin{matrix} # L (φ^{*}) = O ({({slog}_{2} φ^{*})}^{\frac{log φ^{*}}{log log φ^{*}}}) + O (log n) + C \end{matrix}

(17)

follows (16). Note that

\sum_{i = 1}^{\infty} F (x_{i}) < 2 F (x_{1})

holds for function

F (x)

and natural number x such that

F (x_{i}) \geq 0

and

x_{i + 1} \leq \sqrt{x_{i}}

for

1 \leq i

As a result, by (12), (15), and (17), we obtain

\begin{matrix} # L (n) & = & O ({({slog}_{2} n)}^{\frac{log n}{log log n}}) + O ({({slog}_{2} φ^{*})}^{\frac{log φ^{*}}{log log φ^{*}}}) + O (log n) + C \\ = & O ({({slog}_{2} n)}^{\frac{log n}{log log n}}) \end{matrix}

and the statement holds. □

4. Application to Higher-Order Compression and Comparative Experiments

We implemented our method stated in Section 3, and conducted an experiment compared to the binary expression on

λ

-terms stated in Definition 7 and

λ

-terms generated by using the method stated in [2] (we call the method YKS).

Let

B (n)

denote the binary expression of n on

λ

-terms. As seen in Section 3, our proposed method generates a compact

λ

-term

L (n)

for a given

C (n)

. While

C (n)

has the size of

Θ (n)

, the size of the

λ

-term

L (n)

achieves

O ({slog}_{2} n)

for some n, as shown in Theorem 3. However, as shown in Theorem 4, the size of

L (n)

becomes

O ({({slog}_{2} n)}^{log n / log log n})

for general n, which is greater than that of

B (n)

Θ (log n)

. On the other hand, YKS is a method of higher-order compression, which generates an extended

λ

-term for a given text. Here, a text is an element of

Σ^{*}

, where

Σ

is an alphabet. Let

L_{YKS} (n)

be the generated extended

λ

-term by using YKS for a given text

a^{n} c

with

a, c \in Σ

. Then, we can find that the size of

L_{YKS} (n)

achieves

O ({slog}_{2} n)

for some n, similarly to

L (n)

We define extended λ-terms following to [2]. We show that the application of

B (n)

and

L (n)

to extended

λ

-terms is directly possible.

Definition 12

(Extended λ-terms and their sizes). Let x be a variable and

a \in Σ

. Then, extended λ-term

\tilde{M}

is defined inductively as follows:

\begin{matrix} \tilde{M} : : = a | x | λ x . \tilde{M} | {\tilde{M}}_{1} {\tilde{M}}_{2} . \end{matrix}

The size of

\tilde{M}

, denoted by

# \tilde{M}

, is inductively defined as follows:

\begin{matrix} # a = 1, # x = 1, # (λ x . \tilde{M}) = # \tilde{M} + 1, # ({\tilde{M}}_{1} {\tilde{M}}_{2}) = # {\tilde{M}}_{1} + # {\tilde{M}}_{2} + 1 . \end{matrix}

Similar to normal

λ

-terms, we define

β

-reduction, variable replacement notation

{\tilde{M}}_{1} [x : = {\tilde{M}}_{2}]

, and simplification on extended

λ

-terms. Clearly, we can regard any normal

λ

-terms as extended

λ

-terms.

In higher-order compression, a given text

a^{n} c

is regarded as an extended

λ

-term

\tilde{S} (n) = ({\overset{︷}{a (a \dots (a}}^{n} c) \dots))

, then we generate a compact extended

λ

-term

\tilde{M}

such that

\tilde{M} ⟶_{β}^{*} \tilde{S} (n)

. From

B (n)

and

L (n)

, we can obtain extended

λ

-terms

\tilde{B} (n)

and

\tilde{L} (n)

such that

\tilde{B} (n) ⟶_{β}^{*} \tilde{S} (n)

and

\tilde{L} (n) ⟶_{β}^{*} \tilde{S} (n)

, respectively, as follows. By Definition 7 and the definition of

L (n)

, we assume that

B (n) = (λ p . λ f . λ x . {\hat{M}}_{B}) M_{B}^{'}

and

L (n) = (λ p . λ f . λ x . {\hat{M}}_{L}) M_{L}^{'}

. Then,

\tilde{B} (n)

and

\tilde{L} (n)

are obtained as

\begin{matrix} \tilde{B} (n) = (λ p . {\hat{M}}_{B} [f : = a, x : = c]) M_{B}^{'} and \tilde{L} (n) = (λ p . {\hat{M}}_{L} [f : = a, x : = c]) M_{L}^{'}, \end{matrix}

respectively. Note that

B (n) a c ⟶_{β}^{*} \tilde{S} (n)

and

L (n) a c ⟶_{β}^{*} \tilde{S} (n)

holds, since

B (n) ⟶_{β}^{*} C (n)

L (n) ⟶_{β}^{*} C (n)

, and

C (n) a c ⟶_{β}^{*} \tilde{S} (n)

. Here,

\tilde{B} (n)

and

\tilde{L} (n)

are the results of simplification for

B (n) a c

and

L (n) a c

, respectively.

Figure 1 and Figure 2 show the experimental results. In both of them, “Binary”, “YKS”, and “Proposed” denote the methods using the binary expression on

λ

-terms, YKS, and ours, respectively. The horizontal axis shows the repetition number n of given text

a^{n} c

. Then, we compare the sizes of the generated extended

λ

-terms by using these three methods, for the given

a^{n} c

In Figure 1, the vertical axis shows the sizes of

\tilde{B} (n)

L_{YKS} (n)

, and

\tilde{L} (n)

. The inequality

# \tilde{L} (n) \leq # \tilde{B} (n)

holds in 5187 out of 10,000 cases within the range

[1, 10000]

. The average ratio

# \tilde{L} (n) / # \tilde{B} (n)

for the range is approximately 0.9962. Similarly, the inequality

# \tilde{L} (n) \leq # L_{YKS} (n)

holds in 5959 cases, and the average ratio

# \tilde{L} (n) / # L_{YKS} (n)

is approximately 0.9321.

In Figure 2, the vertical axis shows the ratio (the average size of cumulative sum of extended

λ

-terms from 1 to n)

/ # \tilde{B} (n)

. That is, Figure 2 shows how

# L_{YKS} (n)

and

# \tilde{L} (n)

increase compared to

# \tilde{B} (n)

on average. As can be seen,

# \tilde{L} (n)

tends to be greater than

# \tilde{B} (n)

when n is greater than 10,000. We consider that it is consistent with the theoretical upper bound

O ({({slog}_{2} n)}^{log n / log log n})

, stated in Theorem 4.

5. Conclusions

In this paper, we addressed the problem of compaction of Church numerals. For given natural number n, by using proposed RTP, we decompose n into an arithmetic expression, which enables to obtain a compact

λ

-term leading to the Church numeral of n. We proved that the size of the obtained

λ

-term becomes

O ({({slog}_{2} n)}^{log n / log log n})

. Moreover, we experimentally confirmed that the

λ

-terms produced by our method tend to be smaller than binary expressions on

λ

-terms on average, when the given number is less than approximately 10,000. We have not proved the lower bound of the size in the worst case; it is our future work.

The compaction of Church numerals can be applied to higher-order compression, that uses extended

λ

-terms as the data model. In the procedure of higher-order compression, a repetitive part in the input can be represented as a

λ

-term with

C (n)

. Thus, efficient compaction of

C (n)

will help for improving the compression performance of higher-order compression.

In data compression, data models are finally encoded in bit sequences. For bit encoding of

λ

-terms, Tromp [16] proposed a method for untyped

λ

-terms. Very recently, Takeda et al. [17] proposed an efficient encoding scheme for simply-typed

λ

-terms. Finding an efficient bit encoding for our method is also our future work.

Author Contributions

Conceptualization, I.F. and T.K.; Software, I.F.; Writing—original draft, I.F. and T.K.; Writing—review & editing, I.F. and T.K.

Funding

This work was supported by JSPS KAKENHI Grant Number JP15K00002, JP18K11149, JP18K19771 and JST CREST Grant Number JPMJCR1402, Japan.

Acknowledgments

The authors would like to thank Ayumi Shinohara and his colleagues for providing the source code for higher-order compression.

Conflicts of Interest

The authors declare no conflict of interest.

References

Kobayashi, N.; Matsuda, K.; Shinohara, A.; Yaguchi, K. Functional programs as compressed data. High. Order Symb. Comput. 2012, 25, 39–84. [Google Scholar] [CrossRef]
Yaguchi, K.; Kobayashi, N.; Shinohara, A. Efficient algorithm and coding for higher-rder compression. In Proceedings of the 2014 Data Compression Conference (DCC2014), Snowbird, UT, USA, 26–28 March 2014; p. 434. [Google Scholar]
Kieffer, J.C.; Yang, E.H. Grammar-based codes: A new class of universal lossless source codes. IEEE Trans. Inf. Theory 2000, 46, 737–754. [Google Scholar] [CrossRef]
Cover, T.M.; Thomas, J.A. Elements of Information Theory (Wiley Series in Telecommunications and Signal Processing); Wiley-Interscience: New York, NY, USA, 2006. [Google Scholar]
Charikar, M.; Lehman, E.; Liu, D.; Panigrahy, R.; Prabhakaran, M.; Sahai, A.; Shelat, A. The smallest grammar problem. IEEE Trans. Inf. Theory 2005, 51, 2554–2576. [Google Scholar] [CrossRef]
Larsson, N.J.; Moffat, A. Off-line dictionary-based compression. Proc. IEEE 2000, 88, 1722–1732. [Google Scholar] [CrossRef]
Maruyama, S.; Tanaka, Y.; Sakamoto, H.; Takeda, M. Context-sensitive grammar transform: Compression and pattern matching. In Proceedings of the String Processing and Information Retrieval (SPIRE2008), Melbourne, Australia, 10–12 November 2008; pp. 27–38. [Google Scholar]
Masaki, T.; Kida, T. Online grammar transformation based on re-pair algorithm. In Proceedings of the Data Compression Conference (DCC2016), Snowbird, UT, USA, 30 March–1 April 2016; pp. 349–358. [Google Scholar]
Nevill-Manning, C.G.; Witten, I.H.; Maulsby, D.L. Compression by induction of hierarchical grammars. In Proceedings of the Data Compression Conference (DCC’94), Snowbird, UT, USA, 29–31 March 1994; pp. 244–253. [Google Scholar]
Takabatake, Y.; Sakamoto, H. A space-optimal grammar compression. In Proceedings of the 25th Annual European Symposium on Algorithms (ESA 2017), Vienna, Austria, 4–6 September 2017; Leibniz International Proceedings in Informatics (LIPIcs). Volume 87, pp. 67:1–67:15. [Google Scholar] [CrossRef]
Ohno, T.; Goto, K.; Takabatake, Y.; I, T.; Sakamoto, H. LZ-ABT: A practical algorithm for α-balanced grammar compressio. In Proceedings of the 29th International Workshop on Combinatorial Algorithms (IWOCA 2018), Singapore, 16–19 July 2018; Springer: Singapore, 2018; pp. 323–335. [Google Scholar]
Rytter, W. Grammar compression, LZ-encodings, and string algorithms with implicit input. In Automata, Languages, and Programming (ICALP2004); Díaz, J., Karhumäki, J., Lepistö, A., Sannella, D., Eds.; Springer: Berlin/Heidelberg, Germany, 2004; pp. 15–27. [Google Scholar]
Lohrey, M. Algorithmics on SLP-compressed strings: A survey. Groups Complex. Cryptol. 2012, 4, 241–299. [Google Scholar] [CrossRef]
Furuya, I.; Kida, T. Compaction of church numerals for higher-order compression. In Proceedings of the Data Compression Conference (DCC2018), Snowbird, UT, USA, 27–30 March 2018; p. 408. [Google Scholar]
Mogensen, T.A. An investigation of compact and efficient number representations in the pure lambda calculus. In Revised Papers from the 4th International Andrei Ershov Memorial Conference on Perspectives of System Informatics (PSI ’02): Akademgorodok, Novosibirsk, Russia; Springer-Verlag: London, UK, 2001; pp. 205–213. [Google Scholar]
Tromp, J. Binary lambda calculus and combinatory logic. In Kolmogorov Complexity and Applications; Hutter, M., Merkle, W., Vitanyi, P.M., Eds.; Number 06051 in Dagstuhl Seminar Proceedings; Internationales Begegnungs- und Forschungszentrum fuer Informatik (IBFI): Schloss Dagstuhl, Germany, 2006. [Google Scholar]
Takeda, K.; Kobayashi, N.; Yaguchi, K.; Shinohara, A. Compact bit encoding schemes for simply-typed lambda-terms. In Proceedings of the 21st ACM SIGPLAN International Conference on Functional Programming, Nara, Japan, 18–24 September 2016; Volume 51, pp. 146–157. [Google Scholar]

Figure 1. Term size for integer n.

Figure 2. Average ratio to binary expression.

Table 1. n,

# C (n)

, and

# Λ (T_{φ} [n])

for

8 < n \leq 15

and

φ = 3

Table 1. n,

# C (n)

, and

# Λ (T_{φ} [n])

for

8 < n \leq 15

and

φ = 3

n	$# C (n)$	$# Λ (T_{φ} [n])$	n	$# C (n)$	$# Λ (T_{φ} [n])$
9	21	20	13	29	26
10	23	22	14	31	28
11	25	24	15	33	28
12	27	24

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Furuya, I.; Kida, T. Compaction of Church Numerals. Algorithms 2019, 12, 159. https://doi.org/10.3390/a12080159

AMA Style

Furuya I, Kida T. Compaction of Church Numerals. Algorithms. 2019; 12(8):159. https://doi.org/10.3390/a12080159

Chicago/Turabian Style

Furuya, Isamu, and Takuya Kida. 2019. "Compaction of Church Numerals" Algorithms 12, no. 8: 159. https://doi.org/10.3390/a12080159

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Compaction of Church Numerals

Abstract

1. Introduction

2. Preliminaries

2.1. Tetration and Super-Logarithm

2.2. Lambda Terms

2.3. Church Numerals

2.4. Binary Expression of Natural Numbers on $λ$ -Terms

3. Proposed Method

3.1. Basic Idea

3.2. Tetrational Arithmetic Expression (TAE)

3.3. Translation from TAE to $λ$ -Term

3.4. Recursive Tetrational Partitioning (RTP)

3.5. Further Compaction

4. Application to Higher-Order Compression and Comparative Experiments

5. Conclusions

Author Contributions

Funding

Acknowledgments

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

Article Menu

Compaction of Church Numerals

Abstract

1. Introduction

2. Preliminaries

2.1. Tetration and Super-Logarithm

2.2. Lambda Terms

2.3. Church Numerals

2.4. Binary Expression of Natural Numbers on λ -Terms

3. Proposed Method

3.1. Basic Idea

3.2. Tetrational Arithmetic Expression (TAE)

3.3. Translation from TAE to λ -Term

3.4. Recursive Tetrational Partitioning (RTP)

3.5. Further Compaction

4. Application to Higher-Order Compression and Comparative Experiments

5. Conclusions

Author Contributions

Funding

Acknowledgments

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

2.4. Binary Expression of Natural Numbers on $λ$ -Terms

3.3. Translation from TAE to $λ$ -Term