The MARST package is intended to translate Algol 60 programs to the C programming language.
The processing scheme is as follows:
                  Algol 60 source program
                             |
                             V
                      +-------------+
                      |    MARST    |
                      +-------------+
                             |
                             V
                       C source code
                             |
                             V
                      +-------------+
       algol.h ------>| C compiler  |<------ Standard headers
                      +-------------+
                             |
                             V
                        Object code
                             |
                             V
                      +-------------+
       ALGLIB  ------>|   Linker    |<------ Standard libraries
                      +-------------+
                             |
                             V
                      +-------------+
    Input data ------>| Executable  |-------> Output data
                      +-------------+
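where algol.h is the header file supplied to the C compiler and ALGLIB is the library supplied to the linker, as shown in the scheme.
The file name of this library may differ between distributions. For example, in the GNU/Linux distribution it is named libalgol.a.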
To install the MARST package under GNU/Linux, use the standard installation procedure. (For details, see the file INSTALL.)
As a result of the installation, four components should be installed:
marst (usually in /usr/local/bin),
macvt (usually in /usr/local/bin),
algol.h (usually in /usr/local/include and/or in /usr/include), and
libalgol.a (usually in /usr/local/lib).
To invoke the MARST translator, the following syntax is used:
marst [options...] [filename]
Options:
If this option is set, the translator emits the elementary syntactic units (so-called sections) of the source Algol 60 program to the output C code as comments.
This option is useful for locating a syntax error more precisely. For example, Algol 60 has three kinds of comments (ordinary comments, comments of the end-end type, and extended parameter delimiters). It is therefore easy to make a mistake, for example by forgetting the semicolon between an end bracket and the next statement.
The -e nnn option sets the maximal error allowance. The translator stops after the specified number of errors has been detected. The value of nnn should be in the range from 0 to 255. If this option is not specified, -e 0 is used by default, which means that translation continues until the end of the input file.
The -l nnn option sets the desired line width for the output C code produced by the translator. The value of nnn should be in the range from 50 to 255. If this option is not specified, -l 72 is used by default.
Note that the actual line width may occasionally be greater than nnn, since the translator cannot break the output text at an arbitrary place, but this happens very rarely.
If the -o option is not set, the translator writes to standard output by default.
By default the translator writes the date and time of translation to the output C code.
By default, during translation the translator displays warning messages about potential errors and non-standard features used in the source Algol 60 program.
To translate a source Algol 60 program, it should be prepared in a text file, and the name of this file should be specified on the command line.
If the name of the input text file is not specified, the translator reads standard input instead. *Note* that the translator reads the input file *twice*, so it must be a regular file, not a pipe, terminal input, etc. Hence, if standard input is used, it should be redirected from a regular file.
In one run the translator can process only one input text file.
The following example shows how the MARST package is used in most cases.
First we prepare a source Algol 60 program, for example, in a text file named sample.alg:
begin
   outstring(1, "Hello, world\n")
end
Now we translate this program to the C programming language:
marst sample.alg -o sample.c
and get a text file named sample.c, which we then compile and link in the usual way (remembering to link the algol and math libraries):
gcc sample.c -lalgol -lm -o sample
Finally, we run the executable:
./sample
and see what we get. That's all.
For more examples see directory 'ex' in the distribution.
The input language of the MARST translator is a hardware representation of the reference language Algol 60 as described in the following IFIP document:
Modified Report on the Algorithmic Language ALGOL 60. The Computer Journal, Vol. 19, No. 4, Nov. 1976, pp. 364-379. (This document is an official IFIP standard document. It is *not* a part of the MARST package.)
Note that there are some differences between the Revised Report and the Modified Report, since the latter is the result of applying the following IFIP document to the Revised Report:
R. M. De Morgan, I. D. Hill, and B. A. Wichmann. A Supplement to the ALGOL 60 Revised Report. The Computer Journal, Vol. 19, No. 3, 1976, pp. 276-288. (This document is an official IFIP standard document. It is *not* a part of the MARST package.)
A source Algol 60 program should be coded as an ordinary text file using the ASCII character set.
Basic symbols should be coded as follows.
Basic symbol        Hardware representation
-----------------------------------------------
a, b, ..., z        a, b, ..., z
A, B, ..., Z        A, B, ..., Z
0, 1, ..., 9        0, 1, ..., 9
+                   +
-                   -
x                   *
/                   /
integer division    %
exponentiation      ^ (or **)
<                   <
not greater         <=
=                   =
not less            >=
>                   >
not equal           !=
equivalence         ==
implication         ->
or                  |
and                 &
not                 !
,                   ,
.                   .
ten (10)            # (pound sign)
:                   :
;                   ;
:=                  :=
(                   (
)                   )
[                   [
]                   ]
opening quote       "
closing quote       "
array               array
begin               begin
Boolean             Boolean (or boolean)
comment             comment
do                  do
else                else
end                 end
false               false
for                 for
go to               go to (or goto)
if                  if
integer             integer
label               label
own                 own
procedure           procedure
real                real
step                step
string              string
switch              switch
then                then
true                true
until               until
value               value
while               while
Any symbol can be surrounded by any number of white-space characters (i.e. by blanks, HT, CR, LF, FF, or VT), but a multi-character symbol should contain no white-space characters. Moreover, a letter sequence is recognized as a keyword if and only if there is no letter or digit immediately preceding or following that sequence (except for the keyword 'go to', which may contain zero or more blanks between 'go' and 'to').
Examples
... 123 then abc ...    'then' will be recognized as 'then' symbol
... 123then abc ...     'then' will be recognized as letters 't',
... 123 thenabc ...     'h', 'e', 'n', but not as 'then' symbol
... th en ...           'th en' will be recognized as letters 't', 'h', 'e', 'n'
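The keyword rule above can be sketched in C. This is only an illustration of the stated rule, not code taken from MARST; the function name is_keyword_at is hypothetical, and the special case of 'go to' with embedded blanks is deliberately left out.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of MARST): returns 1 if keyword `kw`
   occurs in `text` at offset `pos` with no letter or digit immediately
   before or after it, and 0 otherwise. The comparison is case sensitive.
   The 'go to' exception described in the text is not handled. */
int is_keyword_at(const char *text, size_t pos, const char *kw)
{
    size_t len = strlen(kw);
    size_t text_len = strlen(text);

    if (pos > text_len || text_len - pos < len)
        return 0;                              /* keyword does not fit */
    if (strncmp(text + pos, kw, len) != 0)
        return 0;                              /* letters differ */
    if (pos > 0 && isalnum((unsigned char)text[pos - 1]))
        return 0;                              /* glued to preceding symbol */
    if (isalnum((unsigned char)text[pos + len]))
        return 0;                              /* glued to following symbol */
    return 1;                                  /* text[pos + len] may be '\0' */
}
```

With this sketch, is_keyword_at("123 then abc", 4, "then") returns 1, while is_keyword_at("123then abc", 3, "then") and is_keyword_at("123 thenabc", 4, "then") both return 0, matching the examples.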
Note that identifiers and numbers can contain white-space characters. This may be used if an identifier is the same as a keyword. For example, the identifier 'label' should be coded as 'la bel' or 'lab el'. Note also that white-space characters are not significant (except within strings), so 'abc' and 'a b c' denote the same identifier.
All letters are case sensitive (excluding the first 'b' in keyword 'Boolean'). This means that 'abc' and 'ABC' are different identifiers, and 'Then' will not be recognized as a keyword 'then'.
Any identifier or number may contain up to 100 characters (excluding internal white-space characters).
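The treatment of white-space inside identifiers and the 100-character limit can be illustrated with a small C sketch. The names normalize_name and MAX_NAME_LEN are assumptions made for this example; how MARST itself stores identifiers is not described here.

```c
#include <ctype.h>
#include <stddef.h>

#define MAX_NAME_LEN 100   /* limit stated in the text */

/* Hypothetical helper (not part of MARST): copies `src` to `dst` with all
   white-space characters removed, so that "a b c" and "abc" give the same
   result. Returns the number of significant characters, or -1 if there
   are more than MAX_NAME_LEN of them. `dst` must have room for
   MAX_NAME_LEN + 1 bytes and is always left NUL-terminated. */
int normalize_name(const char *src, char *dst)
{
    int n = 0;

    for (; *src != '\0'; src++) {
        if (isspace((unsigned char)*src))
            continue;                  /* blanks, HT, CR, LF, FF, VT */
        if (n == MAX_NAME_LEN) {
            dst[n] = '\0';
            return -1;                 /* too many significant characters */
        }
        dst[n++] = *src;
    }
    dst[n] = '\0';
    return n;
}
```

Under these assumptions, both "label" and "la bel" normalize to the same five characters; the keyword check would have to run on the original text, before white-space is dropped.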
Quoted strings are coded in C style. For example:
outstring(1, "This\tis a string\n");
outstring(1, "This\tis a st"   "ring\n");
outstring(1, "This\tis all one st"
   "ring\n");
Within a string (i.e. between the double quotes enclosing the string body) escape sequences may be used (such as \t and \n in the example above). A double quote and a backslash within a string should be coded as \" and \\. Any number of white-space characters is allowed between the parts of a string.
Apart from the coding of strings and the limit on the length of identifiers and numbers, there are no differences between the syntax of the reference language and the syntax of the MARST input language.
All input/output is performed by standard Algol 60 procedures.
This implementation allows up to 16 I/O channels, numbered 0, 1, ..., 15. Channel 0 is always connected to 'stdin', so only input from this channel is allowed. Channel 1 is always connected to 'stdout', so only output to this channel is allowed. The other channels allow both input and output.
(The standard procedure 'fault' uses channel number 'sigma', which is not available to the programmer. This latent channel is always connected to 'stderr'.)
Before Algol program startup all channels (excluding channels 0 and 1) are disconnected, i.e. no files are assigned to them.
If an input (an output) is required from (to) the channel n then the following happens:
1) If the channel n is connected for output (input) then the I/O routine closes a file assigned to the channel making it disconnected.
2) If the channel n is disconnected then the I/O routine opens a file in 'read' ('write') mode and assigns this file to the channel making it connected for input (output).
3) Finally, the I/O routine performs input from (output to) the channel; if an end-of-file has been detected on input then I/O routine signals an error condition.
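The three steps above amount to a small state machine per channel. The following C sketch shows one way such logic could look; the types and the function channel_prepare are hypothetical and are not taken from the MARST library, and the fixed channels 0 and 1 are not modelled.

```c
#include <stdio.h>

/* Hypothetical channel states, following the terminology of the text. */
enum chan_state { DISCONNECTED, CONNECTED_INPUT, CONNECTED_OUTPUT };

struct channel {
    enum chan_state state;
    FILE *fp;              /* file assigned to the channel, if any */
};

/* Prepares channel `ch` for an operation in direction `want`
   (CONNECTED_INPUT or CONNECTED_OUTPUT), using file `name` if the
   channel has to be opened. Returns 0 on success and -1 if the file
   cannot be opened. */
int channel_prepare(struct channel *ch, enum chan_state want,
                    const char *name)
{
    /* Step 1: connected for the opposite direction, so close the file
       and make the channel disconnected. */
    if (ch->state != DISCONNECTED && ch->state != want) {
        fclose(ch->fp);
        ch->fp = NULL;
        ch->state = DISCONNECTED;
    }
    /* Step 2: disconnected, so open the file in read or write mode and
       make the channel connected for input or output. */
    if (ch->state == DISCONNECTED) {
        ch->fp = fopen(name, want == CONNECTED_INPUT ? "r" : "w");
        if (ch->fp == NULL)
            return -1;
        ch->state = want;
    }
    /* Step 3, the actual input or output (including the end-of-file
       check on input), is left to the caller. */
    return 0;
}
```

A consequence visible in this sketch is that switching a channel from output to input reopens the file from its beginning, so data written earlier can be read back.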
To determine the name of the file to be assigned to channel n, the I/O routine checks the environment variable named "FILE_n". If this variable exists, its value is used as the filename; otherwise its name (i.e. the character string "FILE_n") is used as the filename.
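The lookup of the file name can be sketched with the standard C function getenv. The function channel_filename below is a hypothetical illustration, not the routine used by the MARST library.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not part of MARST): stores in `buf` the name of
   the file to be assigned to channel `n`. The environment variable
   FILE_n is consulted first; if it does not exist, the literal string
   "FILE_n" is used as the file name. Returns `buf`. */
const char *channel_filename(int n, char *buf, size_t size)
{
    char var[32];
    const char *value;

    snprintf(var, sizeof var, "FILE_%d", n);   /* e.g. "FILE_3" */
    value = getenv(var);                       /* NULL if not set */
    snprintf(buf, size, "%s", value != NULL ? value : var);
    return buf;
}
```

For example, if the program is started with FILE_3=data.txt in its environment, channel 3 is associated with data.txt, whereas channel 4 with no FILE_4 variable is associated with a file literally named FILE_4 in the current directory.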
The MARST translator supports some extensions of the reference language to make the package more convenient to the programmer.
The possibility of modular programming is illustrated by the following example:
First file                    Second file
----------------------------------------------------
procedure one(a, b);          procedure one(a, b);
value a, b; real a, b;        value a, b; real a, b;
begin                         code;
   ...
end;                          procedure two(x, y);
                              value x, y; real x, y;
procedure two(x, y);          code;
value x, y; real x, y;
begin                         begin
   ...                           <main program>
end;                          end
The procedures 'one' and 'two' in the first file are called precompiled procedures. Declarations of these procedures should be placed outside the block or compound statement that represents the program. The procedures 'one' and 'two' in the second file are called code procedures; they have the keyword 'code' instead of a statement representing the procedure body. Declarations of code procedures should also be placed outside the program block or compound statement.
This mechanism allows precompiled procedures to be translated independently of the main program (and precompiled procedures may also be written in any other C-compatible programming language). The programmer may assume that, directly before program execution, the declarations of all precompiled procedures are placed into the file containing the main program (the second file in the example above) in place of the declarations of the corresponding code procedures. (Of course, this is nothing new for C programmers.)
Pseudo procedure inline has the following (implicit) heading:
procedure inline(str); string str;
Any procedure statement that calls the inline procedure is translated into code consisting of the string 'str' with its enclosing quotes removed.
Here is an example:
Source program                     Output C code
------------------------------------------------
. . .                              . . .
a := 1;                            dsa_0->a_5 = 1;
b := 2;                            dsa_0->b_8 = 2;
inline("printf(\"OK\");");         printf("OK");
c := 3;                            dsa_0->c_4 = 3;
. . .                              . . .
The procedure statement inline may be used as an ordinary Algol statement anywhere in the program.
Pseudo procedure print is intended mainly for test printing (since standard Algol input/output is beneath criticism). This procedure has an unspecified heading with a variable parameter list.
Here is an example:
real a, b; integer c; Boolean d;
array u, v[1:10], w[-5:5,-10:10];
. . .
print(a, b, u);
print(c);
. . .
print("test shot", (a+b)*c, !d | u[1] > v[1], u, v, w);
. . .
Each actual parameter passed to the procedure print is output to standard channel 1 (stdout) in printable form.
MACVT is the Algol converter utility. It is an auxiliary program intended to convert Algol 60 programs from some other representation to the MARST representation. Such a conversion is necessary when existing Algol programs have to be adjusted so that they can be translated with MARST.
MACVT is not a translator itself. It just reads the original code of an Algol 60 program from the input text file, converts each basic symbol to the MARST representation (see Input Language), and writes the resulting code to the output text file. It is assumed that the output code produced by MACVT will then be translated by MARST in the usual way. Note that MACVT performs no syntax checking.
The input language understood by MACVT differs from the MARST input language only in the representation of basic symbols. (Note that in this sense the MARST input language is a subset of the MACVT input language.)
The representation of basic symbols implemented in MACVT is based mainly on the Algol 60 compiler, well known in the 1960s, that IBM Corp. developed first for the IBM 7090 and later for the System/360. This representation may be considered an unofficial standard, because it was widely used at the time when Algol 60 was in active use.
To invoke the MACVT converter, the following syntax is used:
macvt [options...] [filename]
Options:
This option is used by default unless another representation is chosen. It assumes that the input Algol 60 program is coded using the classic representation: all white-space characters are non-significant (except in quoted strings) and every keyword should be enclosed in apostrophes. For details see below.
If this option (-free-coding) is set, keywords need not be enclosed in apostrophes. In this case, however, white-space characters should not be used within multi-character basic symbols. See below for details.
If this option is set, all letters (except in comments and strings) are converted to lower case, i.e. conversion is case-insensitive.
This option (-more-free) is the same as -free-coding, but additionally the keywords for arithmetic, logical, and relational operators can be coded without apostrophes. For details see below.
If this option is not set then the converter uses standard output by
default.
This option (-old-sc) allows the converter to recognise the digraph ., as a semicolon (including its use to terminate a comment sequence).
This option (-old-ten) allows the converter to recognise a single apostrophe (when it is followed by +, -, or a digit) as the ten symbol.
To convert a source Algol 60 program, it should be prepared in a text file, and the name of this file should be specified on the command line.
If the name of the input text file is not specified, the converter reads standard input by default.
In one run the converter can process only one input text file.
In the table shown below one or more valid representations are given for each basic symbol.
Basic symbol        Extended hardware representation
-----------------------------------------------------------
a, b, ..., z        a, b, ..., z
A, B, ..., Z        A, B, ..., Z
0, 1, ..., 9        0, 1, ..., 9
+                   +
-                   -
x                   *
/                   /
integer division    %     '/'    'div'
exponentiation      ^     **     'power'    'pow'
<                   <     'less'
not greater         <=    'notgreater'
=                   =     'equal'
not less            >=    'notless'
>                   >     'greater'
not equal           !=    'notequal'
equivalence         ==    'equiv'
implication         ->    'impl'
or                  |     'or'
and                 &     'and'
not                 !     'not'
,                   ,
.                   .
ten (10)            #     '      '10'
:                   :     ..
;                   ;     .,
:=                  :=    .=     ..=
(                   (
)                   )
[                   [     (/
]                   ]     /)
opening quote       "     `
closing quote       "     '
array               'array'
begin               'begin'
Boolean             'boolean'
code                'code'
comment             'comment'
do                  'do'
else                'else'
end                 'end'
false               'false'
for                 'for'
go to               'goto'
if                  'if'
integer             'integer'
label               'label'
own                 'own'
procedure           'procedure'
real                'real'
step                'step'
string              'string'
switch              'switch'
then                'then'
true                'true'
until               'until'
value               'value'
while               'while'
Remarks
1. The classic (apostrophized) form of keywords and some other basic symbols is allowed in any representation (i.e. classic as well as free).
2. In the classic representation all white-space characters (except in comments and quoted strings) are ignored everywhere.
3. A basic symbol coded in apostrophes may contain white-space characters, which are ignored. In addition, its letters are case-insensitive.
4. A basic symbol may be coded in free form (without apostrophes) only if the free representation (-free-coding) is used.
5. In the free representation a multi-character basic symbol should contain no white-space characters.
6. The free form of keywords that denote arithmetic, logical, or relational operators (e.g. greater instead of 'greater') is allowed only if the more free representation (-more-free) is used.
7. A single apostrophe is recognised as the ten symbol only if the -old-ten option is specified on the command line. (Note that in this case '10' will no longer be recognised as the ten symbol.)
8. The digraph ., is recognised as a semicolon only if the -old-sc option is specified on the command line.
9. If the opening quote is coded as ", the closing quote should also be coded as ". If the opening quote is coded as `, the closing quote should be coded as '. (For the coding of strings, see Section 5.)
Finally, it should be noted that MACVT copies comments and white-space characters to the output text in order to keep the original formatting of the input text.