-
Notifications
You must be signed in to change notification settings - Fork 165
/
haskell.tex
141 lines (116 loc) · 5.11 KB
/
haskell.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
\section{Haskell Code Generation}
The process the BNF Converter uses to generate Haskell code is quite straightforward. Here we will only present an overview of this process, for comparison with the methods used for Java and C. For a more complete look at this process see the documentation on the BNF Converter Homepage \cite{bnfcsite}.
\shortsection{The Abstract Syntax}
Consider the example grammar given in Section \ref{example}.
The Haskell abstract syntax generated by the
BNF Converter, shown in Figure \ref{fig:haskell}A, is essentially what a Haskell programmer would write by hand, given the close relationship between a declarative grammar and Haskell's algebraic data types.
\shortsection{The Lexer and Parser}
The BNF Converter generates lexer and parser specifications for the Alex \cite{alex} and Happy \cite{happy} tools. The lexer file (omitted for space considerations) consists mostly of standard rules for literals and identifiers, but has
rules added for reserved words and symbols (i.e.\ terminals
occurring in the grammar), regular expressions defined with the token pragma, and comments.
The Happy specification (Figure \ref{fig:haskell}B) has a large number of token definitions,
followed by parsing rules corresponding closely to the source BNF rules. Note the left-recursive list transformation, as defined in Section \ref{leftrec}.
\shortsection{The Pretty Printer and Case Skeleton}
The pretty printer consists of a Haskell class {\tt Print} with instances
for all generated data types, taking precedence into account. The class method
{\tt prt}\
generates a list of strings for a syntax tree of any type (Figure \ref{fig:haskell}C).
The list of strings is then put in layout (indentation, newlines) by a \textit{rendering}
heuristic, which is generated independently of the grammar. This function is designed to make C-like languages look good by default, but it is written with easy modification in mind.
The case skeleton (Figure \ref{fig:haskell}D) is a simple traversal of the abstract syntax tree representation that can be used as a template when defining the compiler
back end, e.g.\ type checker and code generator. The same methodology is
also used to generate the pretty printer. The case branches in the skeleton
are initialized to fail, and the user can simply replace them with something more interesting.
\shortsection{The Makefile and Test Bench}
The generated test bench file can be loaded in the Haskell interpreter \texttt{hugs} to
run the parser and the pretty printer on terminal or file input.
If parsing succeeds the test functions display a syntax tree,
and the pretty printer linearization. Otherwise an error message is displayed.
A simple makefile is created to run Alex on the lexer, Happy on the parser, and
LaTeX on the document, by simply typing {\tt make}. The {\tt make clean}
command removes the generated files.
\shortsection{Translation Summary}
Overall, it is easy to represent an LBNF grammar as a Haskell data
type---a straightforward translation between source productions and algebraic data types. Language implementors have long known that the similarities between algebraic data types and grammar specifications make functional programming a good choice for compilers.
\begin{figure}
\begin{boxedminipage}[t]{\textwidth}
\begin{minipage}[l]{0.5\textwidth}
\textbf{A. Abstract Syntax}
\scriptsize
\begin{verbatim}
data PROGRAM = PROGRAM [EXP]
deriving (Eq, Show)
data EXP =
EOr EXP EXP
| EAnd EXP EXP
| ETrue
| EFalse
| EVar
deriving (Eq, Show)
\end{verbatim}
\normalsize
\textbf{B. Happy Parser}
\scriptsize
\begin{verbatim}
PROGRAM :: { PROGRAM }
PROGRAM : ListEXP { PROGRAM (reverse $1) }
EXP :: { EXP }
EXP : EXP '||' EXP1 { EOr $1 $3 }
| EXP1 { $1 }
EXP1 :: { EXP }
EXP1 : EXP1 '&&' EXP2 { EAnd $1 $3 }
| EXP2 { $1 }
EXP2 :: { EXP }
EXP2 : 'true' { ETrue }
| 'false' { EFalse }
| Ident { EVar $1 }
| '(' EXP ')' { $2 }
ListEXP :: { [EXP] }
ListEXP : {- empty -} { [] }
| ListEXP EXP ';' { flip (:) $1 $2 }
\end{verbatim}
\normalsize
\end{minipage}
\hfill
\begin{minipage}[r]{0.5\textwidth}
\textbf{C. Pretty Printer}
\scriptsize
\begin{verbatim}
instance Print PROGRAM where
prt i e = case e of
PROGRAM exp -> prPrec i 0 (concat [prt 0 exp])
instance Print EXP where
prt i e = case e of
EOr exp0 exp -> prPrec i 0
(concat [prt 0 exp0 , ["||"] , prt 1 exp])
EAnd exp0 exp -> prPrec i 1
(concat [prt 1 exp0 , ["&&"] , prt 2 exp])
ETrue -> prPrec i 2 (concat [["true"]])
EFalse -> prPrec i 2 (concat [["false"]])
EVar id -> prPrec i 2 (concat [prt 0 id])
prtList es = case es of
[] -> (concat [])
x:xs ->
(concat [prt 0 x , [";"] , prt 0 xs])
\end{verbatim}
\normalsize
\textbf{D. Case Skeleton}
\scriptsize
\begin{verbatim}
transPROGRAM :: PROGRAM -> Result
transPROGRAM x = case x of
PROGRAM exp -> failure x
transEXP :: EXP -> Result
transEXP x = case x of
EOr exp0 exp -> failure x
EAnd exp0 exp -> failure x
ETrue -> failure x
EFalse -> failure x
EVar id -> failure x
\end{verbatim}
\hfill
\end{minipage}
\end{boxedminipage}
\caption{Haskell source code fragments generated from Figure \ref{fig:source}}
\label{fig:haskell}
\end{figure}