aboutsummaryrefslogblamecommitdiff
path: root/cibic.tex
blob: fdcd47f24ebd0a4d2ca6ea519da0bba841377792 (plain) (tree)
1
2
3
4
5
6
7
8
9
10





                                      
                                                             


                                          
                                      
                                                   

                                               
 

                            
             
                                   

            




                                           













                                                                             













                                                                                
              
                





                                                                              











                                                                                
                  
              
                                                       




















                                                          
             






































                                                                                                






                                                                            
 










                                                                                
 







                                                                                
 

                                                                       





















































                                                                                                          

                                                                               
                                                                              

                                                                          
                                                                           









                                                                               
                 


                                                        












                     



                                       















                                                       
              
            








                                                                               
                                







                                                                               
                                     

















                                                                               
                                                







                                                                                                  
                   


                                                                            
          












                                                           
                  
             





                                                                                
 










                                                       
 






























































































































































































































                                                                                        






















                                                       
                                          















                                                                               
                          









                                                                                
                                





















                                                                                

                                      
\documentclass[10pt, a4paper]{article}

\usepackage{xeCJK, fontspec}
\usepackage{amsmath, verbatim}
\usepackage{setspace, float}
\usepackage{graphicx, wrapfig}
\usepackage{rotating, enumerate, minted, fancyvrb, longtable}
\usepackage[left=1in, right=1in]{geometry}
\usepackage[font=small]{caption}

\defaultfontfeatures{Mapping=tex-text}
\setCJKmainfont[BoldFont={STHeiti}]{Adobe Song Std}
%\XeTeXlinebreaklocale "zh"
%\XeTeXlinebreakskip = 0pt plus 1pt minus 0.1pt

%\setlength{\parindent}{2em}
%\setlength{\parskip}{1em}
\makeatletter
\def\verbatim@font{\ttfamily\small}
\makeatother

%\makeatletter
%\let\@afterindentrestore\@afterindentfalse
%\let\@afterindentfalse\@afterindenttrue
%\@afterindenttrue
%\makeatother

\xeCJKsetup{CJKglue=\hspace{0pt plus .08 \baselineskip }}

\newcommand{\ud}{\mathrm{d}}
\usemintedstyle{colorful}
\begin{document}
\title{\Large{CIBIC: C Implemented Bare and Ingenuous Compiler}}

\author{\textsc{尹茂帆 Ted Yin}\\ \\\multicolumn{1}{p{.7\textwidth}}
    {\centering\emph{F1224004 5120309051\\
Shanghai Jiao Tong University ACM Class}}}
\maketitle
\begin{abstract}
    This report presents the design and features of a simple C compiler which
    generates MIPS assembly code. Although not all the language requirements and
    features are implemented according to the standard, it still supports major
    C features, such as basic types (void, int, char), basic flow control syntax
    (if, while-loop, for-loop), user-defined types (aka. typedef), functions,
    pointers (including function pointers), struct/union (may be nested), etc.
    Besides, it makes use of SSA (Single Static Assignment) form for the IR
    (Intermediate Representation) and optimization. The whole compiler is
    written in pure C, obeying C89/C99 standard. The report first introduces the
    lexer and parser generation, then focuses on the data structures being used
    in the AST (Abstract Syntax Tree) and symbol tables, and makes a conclusion
    on the supported syntactical features. Next, the report shows the
    intermediate representation design and claims the techniques that have been
    used in the project. Finally, various optimization techniques are presented
    and discussed, some are accompanied with code snippets.
\end{abstract}
\tableofcontents
\section{Lexer and Parser}
CIBIC makes good use of existing lexer and parser generators. It uses Flex to
generate lexer while using Bison as parser generator. Both generators read
boilerplate text which contains C code fragments to be filled in the generated
lexer/parser. The best practice, I suggest, is to avoid embedding too much
concrete code in the boilerplate. Instead, we shall write well-designed
functions in a separate file (``\texttt{ast.c}'' in CIBIC) and invoke functions
in the boilerplate. The reason is that it is almost not practical nor convenient
to debug the generated lexer or parser. Using the seperation method, we can set
up breakpoints in the seperate file easily and debug on the fly. The
declarations of all functions that may be invoked during parsing can be found in
``\texttt{ast.h}''.

It requires a little more effort to track the location of each token using Flex.
Luckily, Flex provides with a macro called ``\texttt{YY\_USER\_ACTION}'' to let
the user do some extra actions after each token is read. So I maintained a
global variable ``\texttt{yycolumn}'' to keep the current column, a global char
array ``\texttt{linebuff}'' to buffer the text being read for error report.
\begin{listing}[H]
    \centering
    \RecustomVerbatimEnvironment{Verbatim}{BVerbatim}{}
    \begin<