The compiler contains a built-in parser for a restricted subset of C and C++ that makes it easy to generate foreign variable declarations, procedure bindings and C++ class wrappers. The parser is invoked via the declaration specifier foreign-parse, which extracts binding information and generates the necessary code. An example:
(declare
  (foreign-declare "
#include <math.h>
#define my_pi 3.14
")
  (foreign-parse "extern double sin(double);") )
(print (sin 3.14))
The parser would generate code that is equivalent to
(declare
  (foreign-declare "
#include <math.h>
#define my_pi 3.14
") )
(define-foreign-variable my_pi float "my_pi")
(define sin (foreign-lambda double "sin" double))
Note that the read syntax #>! ... <# and #>? ... <# provide a somewhat simpler way of using the parser. The example above could alternatively be expressed as
#>!
#define my_pi 3.14
extern double sin(double);
<#
(print (sin 3.14))
Another example, here using C++. Consider the following class:
// file: foo.h
class Foo {
 private:
  int x_;
 public:
  Foo(int x);
  void setX(int x);
  int getX();
};
To generate a wrapper class that provides generic functions for the constructor and the setX and getX methods, it is enough to parse the header from a Scheme file such as the following:
; file: test-foo.scm
#>!
#include "foo.h"
<#
(define x (make <Foo> 99))
(print (getX x))              ; prints "99"
(setX x 42)
(print (getX x))              ; prints "42"
(destroy x)
Provided the file foo.o contains the implementation of the class Foo, the given example could be compiled like this (assuming a UNIX-like environment):
% csc test-foo.scm foo.o -c++
Here is another example, a minimal "Hello world" application for Qt. It shows the three different ways of embedding C/C++ code in Scheme:
; compile like this:
; csc hello.scm -c++ -C -I$QTDIR/include -L "-L$QTDIR/lib -lqt"

; Include into generated code, but don't parse:
#>
#include <qapplication.h>
#include <qpushbutton.h>
<#

; Parse but don't embed: we only want wrappers for a few classes:
#>?
class QWidget
{
public:
  void resize(int, int);
  void show();
};

class QApplication
{
public:
  QApplication(int, char **);
  ~QApplication();
  void setMainWidget(QWidget *);
  void exec();
};

class QPushButton : public QWidget
{
public:
  QPushButton(char *, QWidget *);
  ~QPushButton();
};
<#

(define a (apply make <QApplication> (receive (argc+argv))))
(define hello (make <QPushButton> "hello world!" #f))
(resize hello 100 30)
(setMainWidget a hello)
(show hello)
(exec a)
(destroy hello)
(destroy a)
The parser will generally perform the following functions:
1) Translate macros, enum definitions and constants into define-foreign-variable or define-constant forms
2) Translate function prototypes into foreign-lambda forms
3) Translate variable declarations into accessor procedures
4) Handle basic preprocessor operations
5) Translate simple C++ class definitions into TinyCLOS wrapper classes and methods
Basic token-substitution of macros defined via #define is performed. The preprocessor commands #ifdef, #ifndef, #else, #endif, #undef and #error are handled. The preprocessor commands #if and #elif are not supported and will signal an error when encountered by the parser, because C expressions (even if constant) are not parsed. The preprocessor command #pragma is allowed but will be ignored.
During processing of foreign-parse declarations the macro CHICKEN is defined (similar to the C compiler option -DCHICKEN).
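Because CHICKEN is defined only while foreign-parse declarations are processed, a source file can use it to hide constructs that the parser does not support. The following is a minimal sketch of that idea, not a documented recipe: the names sum_ints and sum2 are hypothetical, and it assumes that the parser skips everything inside an inactive #ifndef branch and that the C compiler itself is run without CHICKEN defined.

```c
#ifndef CHICKEN
/* Hidden from the parser: variadic functions are not supported by it.
   The C compiler (assumed to run without CHICKEN defined) still sees this. */
#include <stdarg.h>

static int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;
    int i;

    va_start(ap, count);
    for (i = 0; i < count; ++i)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}
#endif

/* Visible to both the parser and the C compiler: a fixed-arity
   entry point that the parser can translate into a binding. */
int sum2(int a, int b)
{
    return sum_ints(2, a, b);
}
```

Only sum2 would be offered to the parser for wrapping; the variadic helper stays a private detail of the C side.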
Macro and type definitions are available in subsequent foreign-parse forms. Each declared C variable generates a procedure of zero or one argument with the same name as the variable. When called with no arguments, the procedure returns the current value of the variable. When called with one argument, it sets the variable to the value of that argument. Structs are not supported. C and C++ style comments are supported.
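Since structs are not supported but the grammar given further down accepts "struct" ID both as a declaration and as a type name, one C idiom that may fit the parser is the opaque handle: only a forward declaration and pointer-taking functions are exposed. The sketch below is an illustration under that assumption; struct counter and the counter_* functions are hypothetical names, and the implementation is hidden behind #ifndef CHICKEN on the assumption that the parser skips inactive branches.

```c
/* What the parser is meant to see: a forward declaration and prototypes.
   According to the type tables, "struct counter *" falls under the
   generic TYPE * row. */
struct counter;
struct counter *counter_new(int start);
int counter_next(struct counter *c);
void counter_free(struct counter *c);

#ifndef CHICKEN
/* Implementation, hidden from the parser. In a real project this would
   normally live in a separate .c file. */
#include <stdlib.h>

struct counter { int value; };

struct counter *counter_new(int start)
{
    struct counter *c = malloc(sizeof *c);
    if (c != NULL)
        c->value = start;
    return c;
}

/* Returns the current value, then advances the counter by one. */
int counter_next(struct counter *c)
{
    return c->value++;
}

void counter_free(struct counter *c)
{
    free(c);
}
#endif
```

The struct layout never reaches the parser, so the Scheme side only ever handles the pointer.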
Function, member-function and constructor/destructor definitions may be preceded by the __callback qualifier, which marks the function as performing a callback into Scheme. If a wrapped function calls back into Scheme code and __callback has not been given, very strange and hard-to-debug problems will occur.
Constants (as declared by #define or enum) are not visible outside of the current compilation unit unless the export_constants pseudo declaration has been used.
When given the option -ffi, Chicken will compile a C/C++ file in "Scheme" mode; that is, it wraps the C/C++ source inside #>! ... <# and compiles it while generating Scheme bindings for exported definitions.
Keep in mind that this is not a fully general C/C++ parser. Taking an arbitrary header file and feeding it to Chicken will in most cases not work or will generate ridiculous amounts of code. This FFI facility is for carefully written header files and for declarations directly embedded into Scheme code.
Using the __declare(DECL, VALUE) form, pseudo declarations can be embedded into processed C/C++ code to provide additional control over the wrapper generation. Pseudo declarations will be ignored when processed by the system's C/C++ compiler.
abstract [values: <string>] Marks the C++ class given in <string> as being abstract, i.e. no constructor will be defined.
export_constants [values: yes, 1, no, 0] Define a global variable for constant-declarations (as with #define or enum), making the constant available outside the current compilation unit. Use the values yes/1 for switching constant export on, or no/0 for switching it off.
full_specialization [values: yes, 1, no, 0] Enables "full specialization" mode. In this mode all wrappers for functions, member functions and static member functions are created as fully specialized TinyCLOS methods. This can be used to handle overloaded C++ functions properly. Only a certain set of foreign argument types can be mapped to TinyCLOS classes, as listed in the following table:
char | <char> |
bool | <bool> |
c-string | <string> |
unsigned-char | <exact> |
[unsigned-]int | <exact> |
[unsigned-]short | <exact> |
[unsigned-]long | <integer> |
[unsigned-]integer | <integer> |
float | <inexact> |
double | <inexact> |
(enum _) | <exact> |
(const T) | (as T) |
(function ...) | <pointer> |
c-pointer | <pointer> |
(pointer _) | <pointer> |
(c-pointer _) | <pointer> |
u8vector | <u8vector> |
s8vector | <s8vector> |
u16vector | <u16vector> |
s16vector | <s16vector> |
u32vector | <u32vector> |
s32vector | <s32vector> |
f32vector | <f32vector> |
f64vector | <f64vector> |
All other foreign types are specialized as <top>.
Full specialization can be enabled globally, or only for a section of code by enclosing that section as follows:
__declare(full_specialization, yes)
...
int foo(int x);
int foo(char *x);
...
__declare(full_specialization, no)
prefix [values: <string>, no, 0] Sets a prefix that should be added to all generated Scheme identifiers. For example,
__declare(prefix, "mylib:")
#define SOME_CONST 42
would generate the following code:
(define-constant mylib:SOME_CONST 42)
To switch prefixing off, use the values no or 0.
rename [value: <string>] Defines to what a certain C/C++ name should be renamed. The value for this declaration should have the form "<c-name>;<scheme-name>", where <c-name> specifies the C/C++ identifier occurring in the parsed text and <scheme-name> gives the name used in generated wrapper code.
scheme [value: <string>] Embeds the Scheme expression <string> in the generated Scheme code.
substitute [value: <string>] Declares a name substitution for all generated Scheme identifiers. The value for this declaration should be a string containing a regular expression and a replacement string (separated by the ; character):
__declare(substitute, "^SDL_;sdl:")
extern void SDL_Quit();
generates
(define sdl:Quit (foreign-lambda void "SDL_Quit"))
transform [values: <string>] Defines an arbitrary transformation procedure for names that match a given regular expression. The value should be a string containing a regular expression and a Scheme expression that evaluates to a procedure of one argument. If the regex matches, the procedure will be called at compile time with the match-result (as returned by string-match) and should return a string with the desired transformations applied:
(require-for-syntax 'srfi-13)
#>!
__declare(transform, "([A-Z]+)_(.*);(lambda (x) (string-append (cadr x) \"-\" (string-downcase (caddr x))))")
int FOO_Bar(int x) { return x * 2; }
<#
(print (FOO-bar 33))
type [value: <string>] Declares a foreign type transformation, similar to define-foreign-type. The value should be a list of two to four items, separated by the ; character: a C typename, a Scheme foreign type specifier and optional argument- and result-value conversion procedures.
;;;; foreign type that converts to unicode (assumes 4-byte wchar_t):
;
; - Note: this is rather kludgy and is only meant to demonstrate the `type'
;         pseudo-declaration

(declare (uses srfi-4))

(define mbstowcs (foreign-lambda int "mbstowcs" nonnull-u32vector c-string int))

(define (str->ustr str)
  (let* ([len (string-length str)]
         [us (make-u32vector (add1 len) 0)] )
    (mbstowcs us str len)
    us) )

#>!
__declare(type, "unicode;nonnull-u32vector;str->ustr")

static void foo(unicode ws)
{
  printf("\"%ls\"\n", ws);
}
<#

(foo "this is a test!")
The parser understands the following grammar:
PROGRAM = PPCOMMAND
        | DECLARATION ";"

PPCOMMAND = "#define" ID [TOKEN ...]
          | "#ifdef" ID
          | "#ifndef" ID
          | "#else"
          | "#endif"
          | "#undef" ID
          | "#error" TOKEN ...
          | "#include" INCLUDEFILE
          | "#pragma" TOKEN ...

DECLARATION = FUNCTION
            | VARIABLE
            | ENUM
            | TYPEDEF
            | CLASS
            | CONSTANT
            | "struct" ID
            | "__declare" "(" PSEUDODECL "," <tokens> ")"

INCLUDEFILE = "\"" ... "\""
            | "<" ... ">"

FUNCTION = ["__callback"] [STORAGE] TYPE ID "(" TYPE [ID] "," ... ")" [CODE]
         | ["__callback"] [STORAGE] TYPE ID "(" "void" ")" [CODE]

VARIABLE = [STORAGE] TYPE ID ["=" INITDATA]

STORAGE = "extern" | "static" | "volatile" | "inline"

CONSTANT = "const" TYPE ID "=" INITDATA

PSEUDODECL = "export_constants"
           | "prefix"
           | "substitute"
           | "abstract"
           | "type"
           | "scheme"
           | "rename"
           | "transform"
           | "full_specialization"

ENUM = "enum" "{" ID ["=" NUMBER] "," ... "}"

TYPEDEF = "typedef" TYPE ID

TYPE = ["const"] BASICTYPE [("*" ... | "&" | "<" TYPE "," ... ">" | "(" "*" [ID] ")" "(" TYPE "," ... ")")]

BASICTYPE = ["unsigned" | "signed"] "int"
          | ["unsigned" | "signed"] "char"
          | ["unsigned" | "signed"] "short"
          | ["unsigned" | "signed"] "long"
          | "float"
          | "double"
          | "void"
          | "bool"
          | "__bool"
          | "__scheme_value"
          | "__fixnum"
          | "struct" ID
          | "union" ID
          | "enum" ID
          | ID

CLASS = "class" ID [":" [QUALIFIER] ID "," ...] "{" MEMBER ... "}"

MEMBER = [QUALIFIER ":"] ["virtual"] (MEMBERVARIABLE | CONSTRUCTOR | DESTRUCTOR | MEMBERFUNCTION)

MEMBERVARIABLE = TYPE ID ["=" INITDATA]

MEMBERFUNCTION = ["__callback"] ["static"] TYPE ID "(" TYPE [ID] "," ... ")" ["const"] ["=" "0"] [CODE]
               | ["__callback"] ["static"] TYPE ID "(" "void" ")" ["const"] ["=" "0"] [CODE]

CONSTRUCTOR = ["__callback"] ["explicit"] ID "(" TYPE [ID] "," ... ")" [BASECONSTRUCTORS] [CODE]

DESTRUCTOR = ["__callback"] "~" ID "(" ["void"] ")" [CODE]

QUALIFIER = ("public" | "private" | "protected")

NUMBER = <a C integer or floating-point number, in decimal, octal or hexadecimal notation>

INITDATA = <everything up to end of chunk>

BASECONSTRUCTORS = <everything up to end of chunk>

CODE = <everything up to end of chunk>
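As a rough illustration of the grammar, the following C fragment tries to stay within the listed productions while still being ordinary C. It is a sketch, not a conformance test: all names are hypothetical, and whether a given parser version accepts each form exactly as written is an assumption.

```c
#define BUF_SIZE 64                 /* PPCOMMAND: "#define" ID TOKEN       */

enum { RED, GREEN = 5, BLUE };      /* ENUM, one entry with "=" NUMBER     */

typedef unsigned int flags;         /* TYPEDEF = "typedef" TYPE ID         */

const double scale = 2.5;           /* CONSTANT = "const" TYPE ID "=" ...  */

int counter = 0;                    /* VARIABLE with INITDATA              */

/* FUNCTION with one TYPE ID argument and a CODE body. */
double scaled(double x) { return x * scale; }

/* FUNCTION using a typedef'd name (BASICTYPE = ID). */
flags both(flags a, flags b) { return a | b; }

/* FUNCTION with the "(" "void" ")" argument list. */
int bump(void) { return ++counter; }

/* FUNCTION with STORAGE "static". */
static int negate(int v) { return -v; }

/* FUNCTION whose first argument is a function-pointer TYPE. */
int apply(int (*fn)(int), int v) { return fn(v); }
```

Constructs outside the grammar, such as struct bodies, array declarators at file scope or variadic argument lists, are deliberately left out.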
The following table shows how argument-types are translated:
[unsigned] char | char |
[unsigned] short | [unsigned-]short |
[unsigned] int | [unsigned-]integer |
[unsigned] long | [unsigned-]long |
float | float |
double | double |
bool | int |
__bool | int |
__fixnum | int |
__scheme_value | scheme-object |
char * | c-string |
signed char * | s8vector |
[signed] short * | s16vector |
[signed] int * | s32vector |
[signed] long * | s32vector |
unsigned char * | u8vector |
unsigned short * | u16vector |
unsigned int * | u32vector |
unsigned long * | u32vector |
float * | f32vector |
double * | f64vector |
CLASS * | (instance CLASS <CLASS>) |
TYPE * | (pointer TYPE) |
TYPE & | (ref TYPE) |
TYPE<T1, ...> | (template TYPE T1 ...) |
TYPE1 (*)(TYPE2, ...) | (function TYPE1 (TYPE2 ...)) |
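To make the argument table more concrete, here are a few C functions whose comments restate the rows that would apply to each parameter. The function names are hypothetical and the comments only repeat the table above; they are not output captured from the parser.

```c
/* double *  -> f64vector
   int       -> integer   */
double sum_f64(double *v, int n)
{
    double total = 0.0;
    int i;
    for (i = 0; i < n; ++i)
        total += v[i];
    return total;
}

/* unsigned char *  -> u8vector
   int              -> integer  */
int count_nonzero(unsigned char *bytes, int n)
{
    int count = 0;
    int i;
    for (i = 0; i < n; ++i)
        if (bytes[i] != 0)
            ++count;
    return count;
}

/* char *  -> c-string */
int length_of(char *s)
{
    int len = 0;
    while (s[len] != '\0')
        ++len;
    return len;
}
```

On the Scheme side, sum_f64 would then presumably be called with an f64vector and an exact integer, count_nonzero with a u8vector, and length_of with a string.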
The following table shows how result-types are translated:
void | void |
[unsigned] char | char |
[unsigned] short | [unsigned-]short |
[unsigned] int | [unsigned-]integer |
[unsigned] long | [unsigned-]long |
float | float |
double | double |
bool | bool |
__bool | bool |
__fixnum | int |
__scheme_value | scheme-object |
char * | c-string |
CLASS * | (instance CLASS <CLASS>) |
TYPE * | (pointer TYPE) |
TYPE & | (ref TYPE) |
TYPE<T1, ...> | (template TYPE T1 ...) |
TYPE1 (*)(TYPE2, ...) | (function TYPE1 (TYPE2 ...)) |
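One difference between the two tables is that the result table has no vector rows, so a pointer to a number that maps to a vector type as an argument only matches the generic TYPE * row as a result. The small C sketch below illustrates this reading of the tables; the function names are hypothetical and the comments are an interpretation of the rows, not parser output.

```c
/* As an argument, "int *" is listed as s32vector. */
int first_of(int *v)
{
    return v[0];
}

/* As a result, "int *" only matches TYPE * -> (pointer TYPE).
   The array is a static local so that no array declarator
   appears at file scope, where the grammar has no rule for it. */
int *table_start(void)
{
    static int table[3] = { 10, 20, 30 };
    return table;
}

/* "char *" is listed as c-string in both tables. */
char *greeting(void)
{
    return "hello";
}
```

Under this reading, the value returned by table_start would arrive in Scheme as a foreign pointer rather than as an s32vector, so it could not be handed straight back to first_of without a conversion.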
Foreign variable definitions for macros are not exported from the current compilation unit, but definitions for C variables and functions are.
foreign-parse does not embed the text into the generated C file; use foreign-declare for that (or, even better, use the #>! ... <# syntax, which does both).
Functions with a variable number of arguments are not supported.
Each C++ class defines a TinyCLOS class, which is a subclass of <c++-object>. Instances of this class contain a single slot named this, which holds a pointer to a heap-allocated C++ instance. The name of the TinyCLOS class is obtained by putting the C++ classname between angled brackets (<...>). TinyCLOS classes are not seen by C++ code.
The C++ constructor is invoked by the initialize generic, which accepts as many arguments as the constructor. If no constructor is defined, a default constructor taking no arguments will be provided. To allow creating class instances from pointers created in foreign code, the initialize generic will optionally accept an argument list of the form 'this POINTER, where POINTER is a foreign pointer object. This will create a TinyCLOS instance for the given C++ object.
To release the storage allocated for a C++ instance, invoke the destroy generic.
Static member functions are wrapped in a Scheme procedure named <class>::<member>.
Member variables are ignored.
Operator functions and default arguments are not supported.
Exceptions must be explicitly handled by user code and may not be thrown beyond an invocation of C++ by Scheme code.