CCgenStructure
The compiler for Whiley is structured as a collection of modules, with the backend code emitters under the main compiler module. The C code is produced by the module wycc (and the Java byte code is produced by wyjc). The wycc module contains a couple of very small Java files that glue things together (WyccMain and CFile) and one huge file that handles almost all of the heavy lifting (Wyil2CBuilder), which I refer to simply as Builder. The test suite for wycc should really (****) be shared with all the other backends, but it is implemented as a copy that has drifted a little out of sync.
- Builder (Wyil2CBuilder) is just that one class, with one child class (Method) that stores knowledge about the subroutines (Functions Or Methods, FOMs).
- CFile provides a mock structured buffering of the file of C code produced (similar to the structured buffering used to handle the Java byte code produced by wyjc).
- WyccMain (as it says in its comments) provides all of the necessary plumbing to process command-line options, construct an appropriate pipeline, and then instantiate the Whiley Compiler to generate class files. This is the file that defines the options to the Whiley Compiler that specifically control the generation of C code (e.g. "-no_floats").
One level above the modules (closer to the file system root), there are numerous directories for related scripts, configuration files, and libraries. The Whiley compiler is invoked by several different scripts under "bin/", with "bin/wycc" supporting, separately and in combination, compiling Whiley sources, generating C code, compiling the C code, linking the resulting object files with support libraries, and executing the resulting program.
Add the bin directory to your PATH environment variable, and invoke "wycc -h" to get the help message. This will provide a block of text that ought to enable you to figure out which options and what parameters you need to provide to get what you want. This utility handles all aspects, and various combinations, of invoking the following:
- the whiley compiler to generate a file of C code,
- gcc to compile the C translation of your program into a .o file,
- gcc to link your .o file with the runtime library support and produce an a.out file, and
- your finished a.out executable.
While bin/wycc invokes gcc to perform static linking of your code with standard components, further logical linking is performed at runtime. This was a solution (maybe a crude one) to an incompatibility between Whiley and C, namely that C does not (at least the C that I know) support routine name overloading while Whiley does.
The runtime library provides the definition of "main()" that gets control at the start of execution (a standard convention of C code). However, some code in your files really does need to get control even before "main()" (to register type information and routines to call). To accommodate this, each .c file produced by the Whiley compiler has one static routine tagged as a constructor (a feature needed by C++), and each constructor globally registers two local routines (B and D) for call-backs (by main into the file). Thus, "main()" first calls the "B" initialiser routine from each file, then calls the "D" initialiser routine from each file, and then calls all of the routines named "main" from all the Whiley files (provided that each has a different type signature; two routines with the same name and the same type signature will cause an execution error). When main() finishes, the program exits normally.
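The two-phase start-up described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the names `wy_register_file`, `wy_run_initialisers`, and `wy_trace` are hypothetical and are not taken from the wycc runtime, both "generated files" are folded into one translation unit, and the GCC/Clang `__attribute__((constructor))` extension stands in for whatever tagging the generated code actually uses.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_FILES 16

typedef void (*init_fn)(void);

/* Hypothetical global registry kept by the runtime library. */
static init_fn b_inits[MAX_FILES];
static init_fn d_inits[MAX_FILES];
static int file_count;

/* Records the order of calls so the two phases can be observed. */
char wy_trace[16];
static int trace_len;

static void note(char phase) {
    if (trace_len < (int)sizeof wy_trace - 1)
        wy_trace[trace_len++] = phase;
}

/* Called by each file's constructor, before the runtime's main() starts. */
void wy_register_file(init_fn b, init_fn d) {
    assert(file_count < MAX_FILES);
    b_inits[file_count] = b;
    d_inits[file_count] = d;
    file_count++;
}

/* --- what the first generated .c file might contribute --- */
static void file1_b(void) { note('B'); }  /* would publish resources */
static void file1_d(void) { note('D'); }  /* would query the registries */
__attribute__((constructor))
static void file1_ctor(void) { wy_register_file(file1_b, file1_d); }

/* --- what a second generated .c file might contribute --- */
static void file2_b(void) { note('B'); }
static void file2_d(void) { note('D'); }
__attribute__((constructor))
static void file2_ctor(void) { wy_register_file(file2_b, file2_d); }

/* What the runtime's main() would do first: every B, then every D. */
int wy_run_initialisers(void) {
    for (int i = 0; i < file_count; i++) b_inits[i]();
    for (int i = 0; i < file_count; i++) d_inits[i]();
    printf("init order: %s\n", wy_trace);
    return file_count;
}
```

Calling `wy_run_initialisers()` once from `main()` prints `init order: BBDD`: every file's B routine runs before any file's D routine, whatever order the constructors themselves ran in.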
Each "B" initialiser routine globally registers resources; each "D" initialiser routine then queries the global registries to populate a registry local to that file. The resources currently include record definitions and FOM (Function Or Method) name-signature-address tuples.
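A registry of name-signature-address tuples is what lets overloaded Whiley routines coexist in C: the lookup key is the pair (name, signature), not the name alone. The following is a minimal sketch, not the wycc implementation. The function names (`fom_register`, `fom_lookup`, `demo_b`, `demo_d`, `demo_call`), the fixed-size table, the single `long (*)(long)` routine type, and the signature strings `"int"` and `"bool"` are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FOMS 64

typedef long (*fom_fn)(long);

struct fom_entry {
    const char *name;  /* Whiley routine name */
    const char *sig;   /* type signature (placeholder strings here) */
    fom_fn addr;       /* address of the C routine */
};

static struct fom_entry global_foms[MAX_FOMS];
static int global_fom_count;

/* Look up a routine by name AND signature; NULL if not registered. */
fom_fn fom_lookup(const char *name, const char *sig) {
    for (int i = 0; i < global_fom_count; i++)
        if (strcmp(global_foms[i].name, name) == 0 &&
            strcmp(global_foms[i].sig, sig) == 0)
            return global_foms[i].addr;
    return NULL;
}

/* Returns 0 on success, -1 if the name/signature pair is already taken
 * or the table is full. */
int fom_register(const char *name, const char *sig, fom_fn addr) {
    if (fom_lookup(name, sig) != NULL || global_fom_count >= MAX_FOMS)
        return -1;
    global_foms[global_fom_count++] = (struct fom_entry){ name, sig, addr };
    return 0;
}

/* Two overloads of one Whiley name, emitted as distinct C routines. */
static long f_int(long x)  { return x + 1; }
static long f_bool(long x) { return !x; }

/* "B" phase: publish this file's routines globally. */
int demo_b(void) {
    return fom_register("f", "int", f_int) + fom_register("f", "bool", f_bool);
}

/* "D" phase: resolve the routines this file calls into a local slot. */
static fom_fn local_f_int;
int demo_d(void) {
    local_f_int = fom_lookup("f", "int");
    return local_f_int != NULL;
}

/* A call site then goes through the file-local slot. */
long demo_call(long x) { return local_f_int(x); }
```

Registering a second routine with the same name and signature fails here with a return code; in the real runtime the same situation is reported as an execution error.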
A major complication of this system is that there are multiple separate type systems and they are not fully compatible. C has a very limited type system that is only minimally expandable. Whiley has its own distinctive type system, which happens to be a bit more like the Java type system than the C system.
The runtime environment for wycc relies on a system of "boxing" values to ensure that routines both know how values are represented and can accept multiple types of values. The header file "box.h" describes all the box types in use by the system. Box types are defined not only by the nature of the data but also by the implementation details. For example, the box types include Wy_Int and Wy_WInt to encode integer values, with the former being small enough to fit in the space of an address and the latter being unbounded; the system converts between the two in a fashion that is intended to be totally transparent (and undetectable from Whiley source code). **** The current implementation only allows for lists of boxed members, but efficiencies can be had by defining boxes for lists of naked fundamental types, as Wy_String currently is for Wy_Char. Boxes also include reference counts to permit shared instances and garbage collection.
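To make the idea of a reference-counted box concrete, here is a small sketch. It does not reproduce the layout in "box.h": the struct, the `BOX_INT` and `BOX_STRING` tags, and the function names are invented for illustration, and the unbounded Wy_WInt case is left out entirely. It only shows a type tag, a count, and a payload that is either an address-sized integer or a pointer to heap data.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical box kinds; the real list lives in box.h. */
enum box_kind { BOX_INT, BOX_STRING };

struct box {
    enum box_kind kind;  /* how the payload is represented */
    int refs;            /* number of live references to this box */
    union {
        intptr_t small;  /* integer that fits in the space of an address */
        char *chars;     /* heap buffer of unboxed characters */
    } u;
};

/* Box a small integer; the new box starts with one reference. */
struct box *box_int(intptr_t v) {
    struct box *b = malloc(sizeof *b);
    if (b == NULL) return NULL;
    b->kind = BOX_INT;
    b->refs = 1;
    b->u.small = v;
    return b;
}

/* Box a copy of a C string. */
struct box *box_string(const char *s) {
    struct box *b = malloc(sizeof *b);
    if (b == NULL) return NULL;
    size_t n = strlen(s) + 1;
    b->u.chars = malloc(n);
    if (b->u.chars == NULL) { free(b); return NULL; }
    memcpy(b->u.chars, s, n);
    b->kind = BOX_STRING;
    b->refs = 1;
    return b;
}

/* Share an existing box instead of copying it. */
struct box *box_retain(struct box *b) {
    b->refs++;
    return b;
}

/* Drop one reference; frees the box when none remain.
 * Returns the number of references still outstanding. */
int box_release(struct box *b) {
    int left = --b->refs;
    if (left == 0) {
        if (b->kind == BOX_STRING) free(b->u.chars);
        free(b);
    }
    return left;
}
```

A routine that receives a `struct box *` inspects `kind` before touching the payload, which is how one C signature can accept several Whiley types.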
The runtime environment sometimes needs to be aware of the original Whiley type instead of or in addition to the box type. Builder converts Whiley types (from the tree structure internal to the compiler) into a dense (terse) representation that may be hard on the eyes but is neatly regular for parsing (converting to a tree structure internal to the runtime library). The runtime structure is described near the beginning of wycc_lib.c (look for "struct type_desc"). Essential to this scheme is a dictionary mapping the dense string representations to a token (smallish integer) and an array of type_desc structures (indexed by those tokens). All the fundamental data types are "leaf" types and occupy the beginning of the array (done by library initialisation). Composite types are added to the array and dictionary as they are encountered, as they might be in registering FOM signatures, record definitions, or conversion requests.
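The dictionary-plus-array scheme can be sketched as follows. Everything specific here is an assumption: the struct is deliberately named `sketch_type_desc` so it is not mistaken for the real `struct type_desc` in wycc_lib.c, the dense strings (`"i"`, `"b"`, and a `"["` prefix meaning "list of") are an invented mini-grammar rather than the encoding Builder emits, and the dictionary is a linear search for brevity.

```c
#include <assert.h>
#include <string.h>

#define MAX_TYPES 128
#define MAX_DENSE 32

struct sketch_type_desc {
    char dense[MAX_DENSE];  /* dense string form, also the dictionary key */
    int is_leaf;            /* 1 for fundamental types */
    int child;              /* token of the element type, -1 for leaves */
};

static struct sketch_type_desc types[MAX_TYPES];
static int type_count;

/* Dictionary lookup: dense string -> token, or -1 if unknown. */
static int type_find(const char *dense) {
    for (int i = 0; i < type_count; i++)
        if (strcmp(types[i].dense, dense) == 0)
            return i;
    return -1;
}

/* Append a descriptor; its array index is its token. */
static int type_add(const char *dense, int is_leaf, int child) {
    if (type_count >= MAX_TYPES || strlen(dense) >= MAX_DENSE)
        return -1;
    strcpy(types[type_count].dense, dense);
    types[type_count].is_leaf = is_leaf;
    types[type_count].child = child;
    return type_count++;
}

/* Library initialisation: leaf types occupy the start of the array. */
void types_init(void) {
    type_count = 0;
    type_add("i", 1, -1);  /* token 0: int  (invented encoding) */
    type_add("b", 1, -1);  /* token 1: bool (invented encoding) */
}

/* Return the token for a dense string, adding composite types
 * (and their components) the first time they are encountered. */
int type_token(const char *dense) {
    int tok = type_find(dense);
    if (tok >= 0)
        return tok;
    if (dense[0] == '[' && dense[1] != '\0') {
        int child = type_token(dense + 1);
        if (child < 0)
            return -1;
        return type_add(dense, 0, child);
    }
    return -1;  /* not a known leaf and not a parsable composite */
}

/* Token of the element type of a composite, -1 for leaves. */
int type_child(int tok) { return types[tok].child; }
```

Because tokens are array indices, comparing two types at runtime is an integer comparison, and walking from a composite to its components is a pair of array reads.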