This manual is for C-INTERCAL version 0.29. It does not replace the old groff manual, nor is it designed to be read in conjunction with it; instead, it serves a different purpose, of providing information useful to users of C-INTERCAL (unlike the other manual, it is not derived from the original INTERCAL-72 manual).
Copyright © 2007 Alex Smith.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts. A copy of the license is included in the section entitled “GNU Free Documentation License.”
This is the Revamped Instruction Manual for C-INTERCAL (this version is distributed with C-INTERCAL version 0.29). It is divided into four parts.
The first part is about the C-INTERCAL compiler ick, and how to use it. It covers installing the compiler, using the compiler, what error and warning messages are produced by the compiler, and some information on how to use the debugger.
The second part is about the core INTERCAL language, invented in 1972, and some other commands since then which don’t feel like they’re extensions. (This is a pretty arbitrary distinction, but people who write the documentation are entitled to arbitrary distinctions. The manual’s licensed under a license that lets you change it (see Copying), so if you disagree you can move the commands from section to section yourself.) Mostly only commands that are implemented in C-INTERCAL are covered here (if you’re interested in the other commands implemented in other compilers, read CLC-INTERCAL’s documentation). However, a comprehensive guide to portability of these commands between C-INTERCAL and other INTERCAL compilers is given.
The third part covers the INTERCAL extensions and dialects that are implemented by C-INTERCAL, such as TriINTERCAL and Threaded INTERCAL. Again, extensions and dialects not implemented have been mostly left out.
The final part contains appendices (which were known as ‘tonsils’ in the original INTERCAL manual), such as character sets used by INTERCAL, programs other than ick in the C-INTERCAL distribution, information on how to read and update the list of optimizer idioms used by the compiler, and the copyright.
The C-INTERCAL distribution is distributed in source code form; this means that before using it, you first have to compile it. Don’t worry: if you have the right software, it’s not at all difficult. Most Linux-based and UNIX-based computers are likely to have the software needed already; the software needed to compile source-distributed packages is also readily available for free for other operating systems. The following instructions will help you install the distribution in a method appropriate for your system.
C-INTERCAL distributions have been stored in many different places over time; it can sometimes be hard to make sure that you are finding the most recent version. In order to make sure that you have the most recent version, the easiest way is to look at the alt.lang.intercal newsgroup; all releases of the C-INTERCAL compiler ought to be announced there. (If you are interested in what other INTERCAL compilers are available, it may also be worth looking there.) If you don’t have access to a newsreader, your newsreader doesn’t cover that newsgroup, or the distance between releases has been too large for your news server to keep the message, it’s likely that you can find the announcement in an archive on the World Wide Web; at the time of writing (2007), the archives of the newsgroup are stored by Google Groups, and a search for ‘alt.lang.intercal’ there should tell you where to find a copy.
If you’re looking for the latest version, note that the number after the dot represents the major version number; you want to maximise this in favour of the number before the dot, which is the bugfix level within a major version. (Major versions are released as version 0.whatever; if a new version comes out that fixes bugs but adds no new features, nowadays its number will be of the form 1.whatever, with the same major number. This has not always been the case, though.)
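As an illustration of the numbering rule above, the following shell function compares two version strings by the number after the dot first, and only then by the number before it. The function name ick_version_newer is made up for this sketch and is not part of the distribution; it assumes both arguments have the plain form N.M.

```sh
# Hypothetical helper, not part of C-INTERCAL: succeeds if version $1 is
# newer than version $2. Versions are written BUGFIX.MAJOR (e.g. 0.29, 1.29).
ick_version_newer() {
  a_major=${1#*.}; a_fix=${1%%.*}
  b_major=${2#*.}; b_fix=${2%%.*}
  if [ "$a_major" -ne "$b_major" ]; then
    # The number after the dot (major version) decides first.
    [ "$a_major" -gt "$b_major" ]
  else
    # Same major version: the number before the dot (bugfix level) decides.
    [ "$a_fix" -gt "$b_fix" ]
  fi
}
```

For example, ick_version_newer 0.30 1.29 succeeds, because major version 30 beats major version 29 whatever the bugfix level.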
C-INTERCAL is distributed in compressed pax format; for instance, you may find it as a ‘.pax.lzma’ file if you have the unlzma decompression program (this is advised, as it’s the smallest); ‘.pax.bz2’ is larger and ‘.pax.gz’ is larger still. Most computers can unpack files in this format, even if they don’t realise it, because pax is forwards-compatible with tar; try renaming the extension from ‘.pax’ to ‘.tar’ after decompressing to see if you have a program that can unpack it. (If you’re wondering why such an apparently non-standard format is being used, this is actually a case where C-INTERCAL is being perfectly nonstandard by conforming to the standards; tar is no longer specified by POSIX, and pax is its replacement. It’s just that pax never really caught on.)
It doesn’t matter where you extract the distribution file to: it’s best if you don’t put it anywhere special. If you aren’t an administrator, you should extract the file to somewhere in your home directory (Linux or UNIX-like systems) or to your My Documents directory (recent versions of Windows; if you’re using an older version, then you are an administrator, or at least have the same privileges, and can extract it anywhere). Some commands that you might use to extract it:
unlzma ick-0-29.pax.lzma
tar xvf ick-0-29.pax
or
bunzip2 ick-0-29.pax.bz2
tar xvf ick-0-29.pax
or
gunzip ick-0-29.pax.gz
tar xvf ick-0-29.pax
On most UNIX-based and Linux-based systems, tar will be available to unpack the installation files once they’ve been uncompressed with gunzip. (I’ve heard that some BSD systems have pax itself to unpack the files, although I have not been able to verify this; some Linux distributions also have pax in their package managers. Both tar and pax should work fine, though.) gunzip is also likely to be available (bunzip2 and unlzma are less likely, but use those versions if you have them to save on your bandwidth); if it isn’t, you will need to download a copy from the Internet.
tar xzvf ick-0-29.pax.gz
or
tar xjvf ick-0-29.pax.bz2
If you are using the GNU version of tar (which is very likely on Linux), you can combine the two steps into one as shown here, except when using the lzma-compressed version.
djtar -x ick-0-29.pax.gz
On a DOS system, you will have to install DJGPP anyway to be able to compile the distribution, and once you’ve done that you will be able to use DJGPP’s decompressing and unpacking utility to extract the files needed to install the distribution. (You will need to type this at the command line; on Windows 95 and later, try choosing Run... from the start menu then typing cmd (or command if that fails) in the dialog box that opens to get a command prompt, which you can exit by typing exit. After typing any command at a command line, press RET to tell the shell to execute that command.)
If you’re running a Windows system, you could always try double-clicking on the ick-0-29.pax.gz file; renaming it to have the extension ‘.tgz’ first is likely to give the best results. It’s quite possible that you’ll have a program installed that’s capable of decompressing and unpacking it. Unfortunately, I can’t guess what program that might be, so I can’t give you any instructions for using it.
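The two-step extraction described above (decompress, then unpack with tar) can be sketched as a shell function that picks the decompressor from the file extension. This is only an illustration: the function name ick_unpack_plan is hypothetical, and it prints the commands rather than running them, so you can check them before typing them yourself.

```sh
# Hypothetical helper, not part of C-INTERCAL: print the commands that would
# unpack a distribution archive, chosen by its extension. Nothing is executed.
ick_unpack_plan() {
  case "$1" in
    *.pax.lzma) printf '%s\n' "unlzma $1" ;;
    *.pax.bz2)  printf '%s\n' "bunzip2 $1" ;;
    *.pax.gz)   printf '%s\n' "gunzip $1" ;;
    *) printf '%s\n' "unrecognised archive name: $1" >&2; return 1 ;;
  esac
  # Each decompressor strips the final extension, leaving the .pax file,
  # which tar can then unpack.
  printf '%s\n' "tar xvf ${1%.*}"
}
```

Running ick_unpack_plan ick-0-29.pax.bz2 prints the bunzip2 command followed by the tar command for ick-0-29.pax.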
Whatever method you use, you should end up with a newly created directory called ick-0.29; this is your main installation directory, where all the processing done by the installation will be carried out. You will need to have that directory as the current directory during install (at the command prompt in all the operating systems I know, you can set the current directory by typing cd ick-0.29).
There are scripts included in the distribution to automate the process of installing, in various ways. The simplest method of installing on most operating systems (on DOS, see Installation on DOS) is to use the following routine:
1. Run configure. Although building in the distribution directory works, it is recommended that you build elsewhere; create a directory to build in (using mkdir on most operating systems), then run configure from inside that directory. For instance, you could do this from inside the main installation directory:
mkdir build
cd build
../configure
to build in a subdirectory of the distribution called “build”. You also specify where you want the files to be installed at this stage; the default of ‘/usr/local’ is good for many people, but you may want to install elsewhere (in particular, if you want to test out C-INTERCAL without installing it, create a new directory somewhere you own and specify that as the place to install it, so the install will actually just copy the files into the right structure for use instead of installing them). To specify a location, give the option --prefix=location to configure; for instance, configure --prefix=/usr would install in /usr.
2. Run make. The Makefile will be set up for your version of make, and to automatically recompile only what needs compiling (it will even recompile the build system if you change that).
3. Run make install. (This is the only step that needs root/administrator permissions; so on a system that uses sudo to elevate permissions, for instance, write it as sudo make install if you’re installing into a directory that you can’t write to as a non-administrative user.) This step is optional; if you do not install C-INTERCAL, you can still run it by directly referencing the exact location of the ick command.
On all systems, it’s worth just trying this to see if it works. This requires a lot of software on your computer to work, but all of it is standard on Linux and UNIX systems. The first command is a shell script which will analyse your system and set settings accordingly; it will explain what it’s doing and what settings it detected, and create several files in the installation directory to record its results. (This is a configure script produced by GNU autoconf; its autoconf source code is available in the file configure.ac.) The second command actually compiles the source code to produce binaries; this takes the longest of any of the steps. You will see all the commands that it’s running as it runs them. The third command will copy the files it’s compiled to appropriate shared locations on your system so that anyone on the system can just use ick.
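Putting the three commands together with the recommended separate build directory gives a sequence like the one printed by the sketch below. The function name ick_build_plan is hypothetical; it only prints the commands (for review, or for pasting into a shell started in the ick-0.29 directory) and takes the install prefix as an optional argument.

```sh
# Hypothetical helper, not part of C-INTERCAL: print the out-of-tree build
# sequence for a given install prefix. Nothing is executed.
ick_build_plan() {
  prefix=${1:-/usr/local}   # /usr/local is the default mentioned above
  cat <<EOF
mkdir build
cd build
../configure --prefix=$prefix
make
make install
EOF
}
```

For a trial install that needs no administrator rights, something like ick_build_plan "$HOME/ick-test" shows the commands with a prefix inside your home directory.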
There may be various factors that prevent this simple installation method working. On a system not based on UNIX or Linux, you may find that you don’t have some of the software required to run this (for instance, you may be missing the shell sh, and not have the shell bash which can emulate it, and so be unable to run configure, which depends on one of those shells being available) and so this method won’t work for you. In such cases, one solution may be to install all the software required; the GNU project has a version of all the commands required, for instance, and there may be ports available for your operating system. However, the only software absolutely required is a C compiler (C-INTERCAL was designed to work with gcc and is tested mostly with that compiler, but in theory it should work with other C compilers too, and this is tested on occasion) and the associated software needed to compile C files to object files and executables, combine object files into libraries, etc.; but this requires trying to do the build by hand, so it’s generally easier just to install a UNIX-like shell and associated tools.
Another possibility that might stop this process working is if your version of the relevant software is incompatible with the GNU versions that were used for testing. For instance, I have come across proprietary versions of lex that need directives in the source file to say in advance how much memory the lexer-generator needs to allocate. In such cases, pay attention to the error messages you’re getting; normally they will suggest trivial modifications to the source files that will cause the compilation to work again.
Some Linux and UNIX systems (notably Debian and Ubuntu) don’t have the required files for compilation installed by default. To install them, just download and install the required packages: for Ubuntu at the time of writing, they are ‘binutils’, ‘cpp’, ‘gcc’, ‘libc6-dev’, ‘make’ to compile C-INTERCAL, and if you want to modify it, you may also need ‘autoconf’, ‘automake’, ‘bison’, and ‘flex’. For debugging help, you may also want ‘gdb’, and to recompile the documentation, you may need ‘groff’, ‘texlive’, ‘texinfo’, and ‘tidy’.
If you’re trying to do something unusual, you probably want to set some of the settings yourself rather than letting the compilation process guess everything. In this case, use configure --help to view the options that you can set on configure; a wide range of settings is available there, and one of them may be what you want.
On DOS-based systems, it’s possible to install C-INTERCAL via compiling it using DJGPP, a free DOS development system. (You can obtain DJGPP via its homepage, at http://www.delorie.com/djgpp/.) The process for installing it works like this:
You might want to install other packages, particularly GNU Bison and GNU Flex, in order to be able to rebuild certain parts of the compiler if you change them. This is not necessary to simply be able to run C-INTERCAL without changing it, though.
In a bash session, change to the ‘buildaux’ subdirectory of your main C-INTERCAL directory, and run the command build-dj.sh. This will run the entire C-INTERCAL build system, and hopefully end up with executables you can run in the ‘build’ subdirectory that will be created in your main C-INTERCAL directory.
If you wish, run make install from the new ‘build’ subdirectory. It will run just fine in-place without a need to install, though, if you prefer.
It may happen that you decide to uninstall C-INTERCAL after installing it; this may be useful if you want to test the installation system, or change the location you install programs, or for some reason you don’t want it on your computer. It’s worth uninstalling just before you install a new version of C-INTERCAL because this will save some disk space; you cannot install two versions of C-INTERCAL at once (at least, not in the same directory; but you can change the --prefix of one of the installations to get two versions at once).
If you installed C-INTERCAL using make install, you can uninstall it by using make uninstall from the installation directory, assuming that it still exists. If you can’t use that method for some reason, you can uninstall it by deleting the files ick and convickt where your computer installs binaries (with an extension like ‘.exe’ added if that’s usual for binaries on your operating system), libick.a, libickmt.a, libickec.a, and libyuk.a where your computer installs libraries, and the subdirectories ick-0.29 in the places where your computer installs data files and include files, and their contents.
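As a rough aid for the manual method, the sketch below lists the paths named above for a given prefix. It is an assumption, not something taken from the build system, that binaries, libraries, data files and include files live in the bin, lib, share and include subdirectories of the prefix; your configure settings may have put them elsewhere, so check each path before deleting anything. The function name ick_installed_paths is hypothetical.

```sh
# Hypothetical helper, not part of C-INTERCAL: list the files and directories
# a manual uninstall would remove. The bin/lib/share/include layout under the
# prefix is an assumption; the real locations are decided by configure.
ick_installed_paths() {
  prefix=${1:-/usr/local}
  for bin in ick convickt; do
    printf '%s\n' "$prefix/bin/$bin"
  done
  for lib in libick.a libickmt.a libickec.a libyuk.a; do
    printf '%s\n' "$prefix/lib/$lib"
  done
  printf '%s\n' "$prefix/share/ick-0.29" "$prefix/include/ick-0.29"
}
```

The output is a plain list, one path per line, so it can be compared against what is really on disk before anything is removed.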
You can go further than uninstalling. Running make clean will delete any files created by compilation; make distclean will delete those files, and also any files created by configuring. It’s probably a wise idea to uninstall before doing a distclean, though, as otherwise information needed to uninstall will be deleted, as that information is generated by configure. You can go even further and use make veryclean, which will delete not only files created by configuring, but the entire build system; doing so is not recommended unless you have some method of rebuilding the build system from its original sources (a script to do this is provided in repository versions of C-INTERCAL, because the generated part of the build system is not stored in the repository).
If you can’t get C-INTERCAL to install at all, or something goes wrong when you’re using it, reporting a bug is probably a good idea. (This is still important even if you figure out how to fix it, and the information isn’t in the manual, because the fix can be added to the source code if possible, or at least to the manual, to benefit future users.) For general help, you may want to post to the alt.lang.intercal news group; to report a bug or submit a patch, email the person who released the most recent C-INTERCAL version (which you can determine by looking at that newsgroup).
If you do find a bug (either the compiler not behaving in the way you’d expect, or if you find a way to cause E778 (see E778) without modifying the source code), it helps a lot if you can submit a bug report explaining what causes it. If you’re not sure, say that; it helps if you give examples of input, command line options, etc. that cause the bug. There are several debug options (see Debug Options) that you can use to help pin down a bug if you’re interested in trying to solve the problem yourself; looking at the output C code can also help pin down a bug if the compiler gets that far.
Information that should be given in a bug report is what you expect to happen, what actually happens, what input and command line options you gave to the compiler, what operating system you’re using, any ideas you might have as to what the problem is, and any appropriate debug traces (for instance, -H (see -H) output if you think the bug is in the optimizer). Core dumps aren’t portable between systems, so don’t send those; however, if you’re getting an internal error and can dump core with -U (see -U), it helps if you can load a debugger (such as gdb) on the core dump, use the debugger to produce a backtrace, and send that backtrace.
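If the debugger is gdb, a backtrace can be produced non-interactively with its -batch and -ex options. The sketch below only prints the command line to run; the function name ick_backtrace_cmd is hypothetical, and the name and location of the core file vary between systems, so both paths are passed in as arguments.

```sh
# Hypothetical helper, not part of C-INTERCAL: print a gdb command line that
# loads a core dump and prints a backtrace without an interactive session.
# $1: path to the ick binary that crashed, $2: path to the core file.
ick_backtrace_cmd() {
  printf 'gdb -batch -ex bt %s %s\n' "$1" "$2"
}
```

Redirecting the output of the printed command to a file gives you something that can be attached to a bug report in place of the core dump itself.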
If you figure out how to solve the bug yourself, and want to submit the patches to help other users (this also carries the advantage that your patches will then be maintained along with the rest of the distribution, and that you won’t have to reapply them every time you upgrade to a newer version of C-INTERCAL), you must first agree to license your code under the same license as the code that surrounds it (normally, that’s the GNU General Public License, but if you submit a patch to a file with a different license, like this manual (yes, documentation patches are useful too), you must agree to that license). You will be credited for the patch in the source code unless you specifically ask not to be or you don’t give your name (in both these cases, you must release the code into the public domain so that it can be incorporated without the attribution requirement). Preferably, patches should be submitted in the format created by the command diff -u; this command is likely to be available on UNIX and Linux systems and versions are also available for DOS and Windows (including a DJGPP port of the GNU version). If you can’t manage that, just submit your new code with enough lines of old code around it to show where it’s meant to go, and a description of approximately where in the file it was. Patches should be submitted by email to the person who most recently released a version of C-INTERCAL.
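A patch in the preferred format can be made by keeping an untouched copy of the file and comparing it with your edited version. The wrapper below is a small sketch (the name make_patch and its argument order are made up here); its only subtlety is that diff exits with status 1 when the files differ, which is the normal case when making a patch and should not be treated as a failure.

```sh
# Hypothetical helper, not part of C-INTERCAL: write a unified diff between
# an original file ($1) and a modified copy ($2) to the patch file ($3).
make_patch() {
  # diff exits with 0 if the files are identical, 1 if they differ and 2 on
  # real trouble (such as a missing file); only the last is an error here.
  diff -u "$1" "$2" > "$3" || [ $? -eq 1 ]
}
```

The resulting file has lines starting with - for removed text and + for added text, with a few lines of unchanged context around each change.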
If you have a suggestion for a new feature, it makes sense to first discuss it on the alt.lang.intercal news group; other INTERCAL compiler maintainers may also want to implement that feature. If you have developed code to implement that feature in C-INTERCAL, you can submit it the same way that you would submit a patch for a bug.
Due to the licensing conditions of C-INTERCAL, you are allowed to release your own version or distribution if you want to. In such cases, it’s recommended that you follow the following guidelines:
Run make distcheck, which will make the distribution paxballs, and rename them to have the correct extensions (Automake thinks they’re tarballs, so will use ‘.tar’ rather than ‘.pax’, and you have to fix this by hand). make distcheck will also perform some sanity checks on the build system of the resulting paxball, which will help to ensure that nothing important is missing from it; and some regression tests on a version of C-INTERCAL built from the distribution paxball itself, to prove that it runs correctly and produces plausible output. (A failure of the regression checks will not stop the build, but should stop you distributing the resulting compiler.)
All operations on INTERCAL source code available in C-INTERCAL, other than the conversion from one character set to another, are currently carried out by the compiler ick.
The syntax is
ick -options inputfile
(Options can be given preceded by separate hyphens, or all in a row after one hyphen, or a mixture; they’re all single characters.) By default, this compiles one INTERCAL program given as the input file directly to an executable without doing anything fancy; usually you will want to give options, which are described below.
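To make the equivalence between the two ways of writing options concrete, the sketch below expands options given in a row after one hyphen into separately hyphenated ones. It is purely an illustration of the rule and not how ick itself parses its command line; the function name ick_split_options is hypothetical, and the example uses only the -a and -e options mentioned elsewhere in this manual.

```sh
# Hypothetical helper, not part of C-INTERCAL: rewrite clustered single-letter
# options (-ae) as separate ones (-a -e); other arguments pass through as-is.
ick_split_options() {
  out=
  for arg in "$@"; do
    case "$arg" in
      -?*)
        rest=${arg#-}
        while [ -n "$rest" ]; do
          tail=${rest#?}               # everything after the first character
          out="$out -${rest%"$tail"}"  # the first character on its own
          rest=$tail
        done ;;
      *) out="$out $arg" ;;
    esac
  done
  printf '%s\n' "${out# }"
}
```

Both ick_split_options -ae prog.i and ick_split_options -a -e prog.i print the same line, which is the point: the two spellings ask for the same thing.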
The following command-line options to ick affect what dialect of the INTERCAL language is compiled by the compiler; you may need to set one or more of these options if your input is not the default C-INTERCAL but instead some other language like INTERCAL-72 or CLC-INTERCAL, or just because you like certainty or like being different with respect to your output. Note that there is no command-line option corresponding to TriINTERCAL (or the base 4-7 versions); instead, the numeric base to use is determined by looking at the filename extension (‘.i’ for base 2, the default, or ‘.3i’ to ‘.7i’ for the base 3-7 versions).
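The extension rule can be written down as a small shell function. This is a sketch of the mapping only, with a made-up name (ick_base_for); it does not reproduce whatever checks the compiler itself performs on filenames.

```sh
# Hypothetical helper, not part of C-INTERCAL: print the numeric base implied
# by a source filename's extension, or fail if the extension is not one of
# .i, .3i, .4i, .5i, .6i or .7i.
ick_base_for() {
  case "$1" in
    *.[3-7]i)
      stem=${1%i}                    # drop the trailing "i"
      printf '%s\n' "${stem##*.}" ;; # what follows the last dot is the base
    *.i) printf '%s\n' 2 ;;          # plain .i is base 2, the default
    *) return 1 ;;
  esac
}
```

So ick_base_for hello.i prints 2 and ick_base_for hello.3i prints 3, while a name such as hello.8i is rejected.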
If this option is not given, there is a small chance that a random bug appears in the compiler, which causes the programs it creates to manifest a bug that causes error E774 (see E774). Giving the option means that this bug will not happen. (You may wonder why this bug was preserved; it is in fact a bug that was carefully preserved since the days of INTERCAL-72, in this case, but the option to turn it off is available as a workaround. (There are no plans to fix this or any of the other carefully preserved bugs any time soon, because that would kind of defeat the point of having preserved them.) Interestingly, the INTERCAL-72 compiler documentation mentions a similar command-line option that is a workaround for the same bug.)
This option needs to be given to allow any multithreading or backtracking commands or identifiers to be used. (Unlike with other language features, this is not autodetected because it’s legal to have a program with multiple COME FROM (see COME FROM) commands aiming at the same line even when it isn’t multithreaded, in which case the commands cause error E555 (see E555) when that line is encountered (with the usual caveats about both commands having to be active at the time).) Attempts to use non-COME FROM multithreading or backtracking commands without this option produce error E405 (see E405).
This option makes it possible to link non-INTERCAL programs with INTERCAL programs; instead of giving INTERCAL programs only on the command line, give one INTERCAL program, followed by any number of programs in other languages that have been written to be able to link to INTERCAL programs. It also allows expansion libraries to be specified on the command line, after the INTERCAL program (expansion libraries are given with no extension). For more information, see External Calls. Also, both the -a and -e options must be set to use CREATEd operators (regardless of whether external calls are used or not).
This option causes the system library to never be linked; this option is only useful if your program references a line number in the range 1000 to 1999, contains no line numbers in that range, and yet still doesn’t want the system library to be linked in; therefore, it is mostly useful with -e when adding in a custom replacement system library written in a non-INTERCAL language, especially the expansion library syslibc (a system library replacement written in C).
This option tells the compiler to treat the source code as INTERCAL-72; as a result, any language constructs that are used but weren’t available in 1972 will trigger error E111 (see E111).
This option allows the CREATE statement (see CREATE) to be used. Note that enabling it carries a run-time penalty, as it means that operand overloading code has to be generated for every variable in the program. (This option is not necessarily needed for the external call version of CREATE to work, but the external call version has fewer features without it.) Note that -e (see -e) also needs to be set to be able to CREATE operators.
It is possible to write INTERCAL code sufficiently tortuous that it ends up assigning to a constant. Generally speaking, this isn’t what you wanted to do, so the compiler will kindly cause an error (E277; see E277) that stops the insanity at that point. However, at the cost of a significant amount of performance, you can give this option to tell the compiler to simply change the constant and keep on going anyway. (Note that unlike CLC-INTERCAL, this only changes uses of the constant preceded by # in your program, not things like line numbers; you want Forte for that.) This option also allows you to write arbitrary expressions on the left of an assignment statement if you wish.
When this option is given, the generated programs will write the number 4 as ‘IIII’ rather than ‘IV’, in case you’re writing a clock program.
This tells the compiler to treat the input as PIC-INTERCAL (see PIC-INTERCAL) rather than ordinary C-INTERCAL input, and generate PIC output code accordingly. There are a lot of options that are incompatible with this, as well as many language features, due to the limited memory available on a PIC. If you get error E256 (see E256), you have this option given when it shouldn’t be; likewise, if you get error E652 (see E652), you should be using this option but aren’t. (A few simple programs are C-INTERCAL/PIC-INTERCAL polyglots, but such programs are incapable of doing input or output, meaning that they aren’t particularly useful.)
The C-INTERCAL and CLC-INTERCAL compilers use different notation for various things, sometimes to the extent where the same notation is legal in both cases but has a different meaning. As this is the C-INTERCAL compiler, it rather guessably uses its own notation by default; however, the CLC-INTERCAL notation can be used as the default instead using this option. (In most situations where there isn’t an ambiguity about what something means, you can use the ‘wrong’ syntax freely.) The option causes ambiguous characters like ? to be interpreted with Princeton rather than Atari meanings.
This option causes some constructs with different meanings in C-INTERCAL and CLC-INTERCAL to use the CLC-INTERCAL meaning rather than the C-INTERCAL meaning. At present, it affects the abstention of a GIVE UP (see GIVE UP) command by line number, which is possible as long as this switch isn’t given; reading through the INTERCAL-72 manual, there are a lot of things that imply that this probably wasn’t intended to be possible, but as far as I can tell that manual doesn’t actually say anywhere that this particular case is disallowed, even though it rules out all other similar cases. It also causes I/O on array variables to be done in CLC-INTERCAL’s extended Baudot syntax, rather than using the Turing Tape method.
Sometimes things will go wrong with your program, or with the way ick was installed. There may even be unknown bugs in ick itself (if you find one of these, please report it). The following options are used to debug the whole system on various levels.
If you think that something has gone wrong with the parser, or you want to see how your program is being parsed, you can give this option on the command line. All the debug output produced by the parser and lexical analyser will be output.
This option allows debugging of the final executable at the C code level. Any C code generated will be left in place, and the -g option will be given to the C compiler that’s used to compile the code, so all the information needed for a C debugger to be used on the executable will be present there.
These options allow debugging of the optimiser, or produce output helpful for understanding how your program has been optimised. -h produces a summary of what optimiser rules were used, the initial expression and what it was optimised to; -H produces a more expanded view that shows each intermediate step of optimisation, and -hH shows the same output as -H, but written completely using C syntax (the other options output in a strange mix of INTERCAL and C).
This option turns on generation of warnings (see Warnings). To make sure that they aren’t actually useful, or are only marginally useful, the warning generator is far too sensitive, and there is no way to decide which warnings are given and which ones aren’t; you either get all of them or none.
This option causes the program to run immediately after being compiled, and profiles the resulting program to identify performance bottlenecks, etc. The usefulness of this depends on the resolution of the timers on the computer and operating system; DOS, in particular, is really bad with timer resolution. The output will be saved in a file called yuk.out when the program finishes running. It’s legal to turn on both the profiler and the interactive debugger at the same time, but if you do this the profiler will also identify bottlenecks in the person typing in commands to step through the program! The profiler will, in fact, identify all the timings that particular commands in the program take; so WRITE IN instructions will often show up as taking a long time due to their need to wait for input.
This option causes the produced program to support the printflow option fully; when this option is not given, printflow will in most cases have partial or no support (except in multithreaded programs, where this option is redundant), because not all the code needed for it will be included in the program to save space.
When you are getting problems with finding files – for instance, the compiler can’t find the skeleton file (see E999) or the system library (see E127) – this option will let you know, on standard error, where the compiler is looking for files. This may hopefully help you pin down where the file-finding problems are coming from, and also offers the option of simply placing copies of the files where the compiler is looking as a last resort.
This is the main debugging option: it loads yuk, an interactive INTERCAL debugger with ability to step through the program, set breakpoints, view and modify variables, etc. See yuk.
This option causes the command line to be displayed for all calls to other programs that ick makes (mostly to gcc); it is therefore useful for debugging problems with the command lines used when using the external calls system (see External Calls).
The internal error E778 (see E778) should never happen. However, there are all sorts of potential problems that may come up, and if part of the code detects something impossible, or more usually when the operating system detects things have got too insane and segfaults, normally this error will just be generated and that’s that. (I most often get this when I’ve been writing a new section of code and have made a mistake; hopefully, all or at least most of these errors are fixed before release, though.) If you want more information as to what’s going on, you can give the -U option, which will cause the compiler to raise an abort signal when an internal error happens. This can generally be caught by a debugger that’s being run on ick itself at the time; on many systems, it will also cause a core dump.
These options allow you to control how far to compile (all the way to an executable, or only to C, etc.), and where the output will be created. Note that the output options may change depending on the other options selected; for instance, many of the debug options will prevent the code being compiled all the way to an executable.
By default, the original INTERCAL code will be compiled all the way to an executable, and the intermediate C and object files produced will be deleted. Giving this option causes the compiler to stop when it has finished producing the C file, leaving the C file there as the final output of the compiler. (Its filename is the same as the source file, but with ‘.c’ as its extension/suffix rather than the source file’s extension.) Without this option, an executable will be produced with the extension changed to whatever’s appropriate for the system you are on (or omitted entirely if that’s appropriate for the system).
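The naming rule for the C output file can be sketched as follows. This is only an illustration in Python, not code taken from ick; the helper name `c_output_name` is hypothetical, and the sketch assumes that only the final extension of the source filename is swapped for ‘.c’.

```python
from pathlib import PurePath


def c_output_name(source: str) -> str:
    """Return the source filename with its final extension replaced by '.c'.

    Hypothetical helper that mirrors the rule described in the text;
    it does not reproduce how ick itself builds the name.
    """
    return str(PurePath(source).with_suffix(".c"))


print(c_output_name("hello.i"))
```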
This option also places verbose comments in the output C file.
This option causes the compiler to progress no further than producing the C output file, but instead of writing it to a file writes it directly to standard output. This might occasionally be useful when using ick as part of a pipe; it can also be useful to see how far the compiler gets with compiling code before an error happens, when you’re trying to track down an error.
There are various command line options that can be used to tell ick whether and in what ways to optimize code.
This option requests the compiler to attempt to analyse the flow of the program and optimize accordingly; for instance, it will detect which commands can’t possibly be ABSTAINED from and refrain from generating code to check the abstention status of those commands.
This option tells the compiler to optimize the output for speed. This is done to crazy extremes; the compiler may take several hours/days analysing the program in some cases and still not come up with an improvement. It turns on all the other optimizer options. Note that not all systems accept this option, because it sometimes outputs a shell script disguised as an executable rather than an actual executable.
This option tells the compiler to apply optimizer idioms to the expressions in the code given, when appropriate. The list of idioms is stored in the file src/idiotism.oil; note that it is compiled into the compiler, though, so you will have to rebuild and reinstall the compiler if you change it. For more information about changing the list of idioms, see Optimizer Idiom Language.
Some options just can’t be classified.
If this option is given, the compiler doesn’t run at all, but instead prints a set of instructions for using it, explaining which options are available on the system you’re on and which options conflict with which other options.
Once the compiler runs and produces an output executable, that executable itself will accept a range of options that control the way it runs. None of these options have to be used; a default value will be assumed if they aren’t.
Whether ‘+’ or ‘-’ is given at the start of this option, it will cause the program to print out what options are available and what state they are in. It will then cause the program to exit via an internal error.
If the ‘+’ version of this is given (rather than the default ‘-’), then the program will print a message explaining that you are a wimp (the mode itself is known as wimpmode), and for the rest of execution will input in Arabic numerals (‘123’ rather than ‘ONE TWO THREE’) and likewise will output in Arabic numerals rather than Roman numerals (such as ‘CXXIII’). True INTERCAL programmers should rarely have to use this mode.
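The two numeral styles mentioned above can be illustrated with a small sketch. It is not the runtime's own conversion code: the function names are made up for this example, the word used for the digit 0 is an assumption, and the Roman conversion only covers 1 to 3999 using plain letters.

```python
# Digit names for spelt-out numbers; "ZERO" for 0 is an assumption of this sketch.
DIGIT_NAMES = ["ZERO", "ONE", "TWO", "THREE", "FOUR",
               "FIVE", "SIX", "SEVEN", "EIGHT", "NINE"]

# Value/symbol pairs in descending order, including the subtractive forms.
ROMAN = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
         (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
         (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]


def spell_digits(n: int) -> str:
    """Spell a non-negative number digit by digit, as in non-wimpmode input."""
    return " ".join(DIGIT_NAMES[int(d)] for d in str(n))


def to_roman(n: int) -> str:
    """Convert 1..3999 to a Roman numeral (larger values are out of scope here)."""
    if not 0 < n < 4000:
        raise ValueError("this sketch only handles 1..3999")
    out = []
    for value, symbol in ROMAN:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)


print(spell_digits(123))  # ONE TWO THREE
print(to_roman(123))      # CXXIII
```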
This option does not actually appear to do anything.
This option causes standard output to be flushed whenever any characters are output when the ‘+’ version is used, rather than on each newline (the default ‘-’ version). It is most useful for more responsive pipes when outputting binary data, and also useful for debugging very slow programs.
The usual debugging methods don’t work with multithreaded or backtracking programs. This option exists to give at least a slim chance of working out what is going on with them. It causes the program to print the line number of the command it thinks it may be executing next (i.e. the line number that would be printed if that line had an error) immediately after executing each command, and also an internal identifier for the thread that that command was in. It also prints a trace of what parts of the multithreader are being activated; so for instance, it will tell you when a thread is being forked into multiple threads or when a choicepoint has been deleted. Note that the -w option (see -w) must be given to gain full support for flow printing in non-multithreaded non-backtracking programs, because otherwise the required code to print this information will not be generated.
This option is occasionally capable of doing something, but is deliberately undocumented. Normally changing it will have no effect, but changing it is not recommended.
Various environment variables can be set to affect the operation of ick.
Variable | Meaning |
---|---|
ICKINCLUDEDIR, ICKLIBDIR, ICKSYSDIR, ICKCSKELDIR | These four environment variables suggest locations in which ick should look to find various files that it needs: the skeleton file, system library, C header files and libraries that it needs, constant-output optimiser, and the GNU General Public License (which the debugger needs to be able to display on demand for legal reasons). |
CC | The name of a C compiler to use (defaults to gcc; C-INTERCAL has recently been tested only with gcc and clang). This option has no effect on DJGPP, where gcc is always used. |
ICKTEMP, TMPDIR, TEMP, TMP | On DJGPP, ick creates temporary files to pass options to gcc as a method of getting around the limit on the length of a command line that can sometimes affect DOS programs. These four environment variables are tried (in this order) to determine a location for the temporary file; if none of them are set, the current directory is used. |
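The lookup order for the temporary-file location can be sketched like this. It is an illustration only: the function name `pick_temp_dir` is hypothetical, and treating an empty value the same as an unset variable is an assumption of the sketch rather than documented behaviour.

```python
import os

# Variables are tried in the order given in the table above.
TEMP_VARS = ("ICKTEMP", "TMPDIR", "TEMP", "TMP")


def pick_temp_dir(env=None) -> str:
    """Return the first configured temporary directory, or '.' if none is set."""
    env = os.environ if env is None else env
    for name in TEMP_VARS:
        value = env.get(name)
        if value:  # assumption: an empty string counts as "not set"
            return value
    return "."  # fall back to the current directory


print(pick_temp_dir({"TMP": "/t", "TMPDIR": "/d"}))
```

Passing a plain dictionary instead of the real environment makes the order easy to check: in the call above, TMPDIR wins over TMP because it is tried earlier.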
Things may go wrong, either during the compilation or the execution of your program. Note that some things that would be compile-time errors in many other languages – such as syntax errors – are in fact run-time errors in INTERCAL.
Errors and warnings appear as an error code starting with ‘ICL’, followed by a three digit number, followed by ‘I’ for an error or ‘W’ for a warning. However, they will be notated here as ‘E000’, etc., to save space and because consistency was never a strong point of INTERCAL. This is followed by a text description of the error, and a hint as to the location of the error. This is not the line on which the error occurred, but rather the line on which the next command to be executed is. To add to the fun, the calculation of the next command to be executed is done at compile-time rather than runtime, so it may be completely wrong due to things like abstention on COME FROMs or computed COME FROMs. The moral of this story is that, if you really want to know where the error is, use a debugger. Note also that if the error happens at compile-time, there is no guarantee that the line number given makes any sense at all. Some errors don’t give next line numbers, mostly those for which it doesn’t make logical sense, such as E633 (see E633). After this is a suggestion to correct (or reconsider) the source code and to resubnit it. (This typo has been carefully preserved for over a decade.)
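For scripts that need to match compiler output against this manual, the code format described above can be translated mechanically. The following is a minimal sketch; the function name `manual_notation` is invented for this example, and the ‘W’ prefix for warnings is assumed by analogy with the ‘E’ prefix used for errors.

```python
import re

# 'ICL', three digits, then 'I' (error) or 'W' (warning), as described above.
CODE_RE = re.compile(r"ICL(\d{3})([IW])")


def manual_notation(code: str) -> str:
    """Translate a code such as 'ICL129I' into the short form 'E129'."""
    match = CODE_RE.fullmatch(code)
    if match is None:
        raise ValueError(f"not an INTERCAL error or warning code: {code!r}")
    number, kind = match.groups()
    prefix = "E" if kind == "I" else "W"  # 'W' prefix is an assumption
    return prefix + number


print(manual_notation("ICL129I"))  # E129
```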
This is a list of the error messages that might be produced during the compilation or execution of an INTERCAL program.
This is an unusual error; it’s what’s printed when a syntax error is encountered at runtime, in a situation in which it would be executed. (An ABSTAINed syntax error, for instance, would not be executed; this is one of the mechanisms available for writing comments.) The text of the error message is simply the statement that couldn’t be decoded.
DO YOU EXPECT ME TO FIGURE THIS OUT?
This error occurs when there is an attempt to use a constant with a value outside the onespot range; it’s a compile-time error.
PROGRAMMER IS INSUFFICIENTLY POLITE
The balance between various statement identifiers is important. If fewer than approximately one fifth of the statement identifiers used are the polite versions containing PLEASE, that causes this error at compile time.
PROGRAMMER IS OVERLY POLITE
Of course, the same problem can happen in the other direction; this error is caused at compile time if more than about one third of the statement identifiers are the polite form.
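The two politeness checks can be pictured as a single range test. The sketch below is illustrative only: the text gives the thresholds as "approximately one fifth" and "about one third", so the exact cut-offs used here, the handling of an empty program, and the function name `politeness_verdict` are all assumptions and may not match what the compiler really does.

```python
def politeness_verdict(total: int, polite: int) -> str:
    """Classify a program by the share of statement identifiers using PLEASE.

    total  -- number of statement identifiers in the program
    polite -- how many of them are the polite form
    """
    if total == 0:
        return "OK"  # assumption: nothing to judge in an empty program
    if polite * 5 < total:  # fewer than one fifth are polite
        return "PROGRAMMER IS INSUFFICIENTLY POLITE"
    if polite * 3 > total:  # more than one third are polite
        return "PROGRAMMER IS OVERLY POLITE"
    return "OK"


for polite in (1, 3, 4):
    print(polite, politeness_verdict(10, polite))
```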
COMMUNIST PLOT DETECTED, COMPILER IS SUICIDING
This error happens when you give the -t option (see -t) but you use a language construct that wasn’t available in INTERCAL-72. If this happens, then either there’s a mistake in the program that prevents it being INTERCAL-72 or you shouldn’t be compiling it as INTERCAL-72 in the first place.
PROGRAM HAS DISAPPEARED INTO THE BLACK LAGOON
There is a hard limit of 80 NEXTs at a time; this is to discourage excessive use of NEXTING for things like recursion. (Recursive programs are entirely legal; you simply have to figure out how to do it with computed COME FROM instead. (For the record, it is possible. (Using lots of nested brackets when talking about recursion is great (yay!).))) Another problem with writing the source code that can cause this error is a failure to properly FORGET the entry on the NEXT stack created when trying to simulate a goto.
SAYING ’ABRACADABRA’ WITHOUT A MAGIC WAND WON’T DO YOU ANY GOOD
Your program asked to include a system library (by specifying a line number in a magic range without including a line with that number), but due to installation problems the compiler couldn’t find the system library to include. You could try using the -u (see -u) option to see where the compiler’s looking; that may give you an idea of where you need to copy the system library so that the compilation will work. This error happens at compile time and doesn’t give a next command line number.
PROGRAM HAS GOTTEN LOST
This error happens at compile time when the compiler can’t figure out where a NEXT command is actually aiming (normally due to a typo in either the line label given or the line label on the line aimed for). The logic behind this error means that the next line to be executed is unknown (after all, that’s the whole point of the error) and is therefore not given. The -e command-line option (see -e) makes this error into a run-time error, because it allows NEXT commands to dynamically change targets at runtime, as well as line labels to dynamically change values, and thus the error is impossible to detect at compile time.
I WASN’T PLANNING TO GO THERE ANYWAY
This error happens at compile time when an ABSTAIN or REINSTATE references a non-existent target line. This generally happens for much the same reasons as E129 (see E129).
YOU MUST LIKE THIS LABEL A LOT!
At present, it’s impossible to have more than one line with the same line number. That would make NEXT act too much like COME FROM in reverse to be interesting. This error happens at compile time. (For inconsistency, it is possible to have multiple lines with the same number as long as at most one of them is in an INTERCAL program (the others have to be in programs in other languages included via the external calls system). The resulting behaviour is entirely inconsistent with the rest of the language, though, for what I hope are obvious reasons.)
SO! 65535 LABELS AREN’T ENOUGH FOR YOU?
Legal values for line labels are 1 to 65535 (certain subranges are reserved for system and expansion libraries). This error comes up if you use nonpositive or twospot values for a line label.
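The two line-label rules just described (labels must be unique, and must lie between 1 and 65535) can be sketched as a simple check over the labels in a program. This is an illustration, not the compiler's algorithm; `check_labels` and the constant names are hypothetical, and reserved subranges are not modelled.

```python
from collections import Counter

MAX_LABEL = 65535
E_RANGE = "SO! 65535 LABELS AREN'T ENOUGH FOR YOU?"
E_DUPLICATE = "YOU MUST LIKE THIS LABEL A LOT!"


def check_labels(labels):
    """Return (label, message) pairs for every rule violation found."""
    problems = []
    # Range check: labels must be positive onespot values.
    for label in labels:
        if not 1 <= label <= MAX_LABEL:
            problems.append((label, E_RANGE))
    # Uniqueness check: each label may appear on at most one line.
    for label, count in Counter(labels).items():
        if count > 1:
            problems.append((label, E_DUPLICATE))
    return problems


print(check_labels([10, 70000, 10]))
```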
NOTHING VENTURED, NOTHING GAINED
You used a variable that isn’t actually in your program. Failing that (which, contrary to previous versions of this manual, is indeed possible in the present version of C-INTERCAL, although I’m not telling how; a hint: what mechanism in C-INTERCAL allows for a computed variable number?), you specified an illegal number for a variable (legal numbers are positive and onespot). This error happens at compile time, at least for illegal variable numbers.
BUMMER, DUDE!
In INTERCAL, you’re allowed to STASH as much as you like; this makes the language Turing-complete and allows for unlimited recursion when combined with computed COME FROM in the right way. Unfortunately, real computers aren’t so idealised; if you manage to write a program so memory-intensive that the computer runs out of memory to store stashes, it causes this error at runtime. To fix this error, you either have to simplify the program or upgrade your computer’s memory, and even then that will only help to some extent.
ERROR HANDLER PRINTED SNIDE REMARK
Arrays have to be large enough to hold at least one element; you tried to dimension an array which isn’t large enough to hold any data. This error happens at run time.
VARIABLES MAY NOT BE STORED IN WEST HYPERSPACE
This error happens at run time when the subscripts given to an array are inconsistent with the way the array was dimensioned, either because there were the wrong number of subscripts or because a subscript was too large to fit in the array. It can also happen when a multidimensional array is given to a command, such as WRITE IN, that expects it to be monodimensional.
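The two array errors above can be modelled with a toy array type. This is a sketch for illustration, not the runtime's data structure: the class name `IntercalArray` is invented, and 1-based subscripts and the rejection of subscripts below 1 are assumptions of the sketch.

```python
class IntercalArray:
    """Toy model of a dimensioned array with the checks described above."""

    def __init__(self, *dims):
        # An array must be able to hold at least one element.
        if not dims or any(d < 1 for d in dims):
            raise RuntimeError("ERROR HANDLER PRINTED SNIDE REMARK")
        self.dims = dims
        size = 1
        for d in dims:
            size *= d
        self.data = [0] * size

    def _offset(self, subscripts):
        # Wrong number of subscripts, or a subscript outside its dimension.
        if len(subscripts) != len(self.dims) or any(
            not 1 <= s <= d for s, d in zip(subscripts, self.dims)
        ):
            raise RuntimeError("VARIABLES MAY NOT BE STORED IN WEST HYPERSPACE")
        offset = 0
        for s, d in zip(subscripts, self.dims):
            offset = offset * d + (s - 1)  # assumption: subscripts start at 1
        return offset

    def get(self, subscripts):
        return self.data[self._offset(subscripts)]

    def set(self, subscripts, value):
        self.data[self._offset(subscripts)] = value


grid = IntercalArray(2, 3)
grid.set((2, 3), 42)
print(grid.get((2, 3)))  # 42
```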
I’VE FORGOTTEN WHAT I WAS ABOUT TO SAY
This run-time error message is caused by the compiler running out of memory whilst trying to do I/O; at present, it can only happen during CLC-INTERCAL-style I/O.
THAT’S TOO HARD FOR MY TINY BRAIN
Some commands simply aren’t available in PIC-INTERCAL. I mean, PICs generally have less than a kilobyte of memory; you’re not going to be able to use some of the more confusing language features with that sort of resource limitation. The solution is to replace the affected command, or to not give the -P option (see -P) if you didn’t mean to compile as PIC-INTERCAL in the first place.
DON’T BYTE OFF MORE THAN YOU CAN CHEW
This error happens when there is an attempt to store a twospot value in a onespot variable. The actual size of the value is what matters when counting its spots; so you can store the output of a mingle in a onespot variable if it happens to be less than or equal to 65535, for instance. (This is not necessarily the case in versions of INTERCAL other than C-INTERCAL, though, so you have to be careful with portability when doing this.)
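To see why the size of the value matters rather than the operator that produced it, here is a small model of a base-2 mingle feeding a onespot assignment. It is a sketch under stated assumptions: the function names are invented, and the bit order (the left operand supplies the more significant bit of each pair) is an assumption about how the operator behaves, since this section does not define it.

```python
ONESPOT_MAX = 0xFFFF  # 65535


def mingle(a: int, b: int) -> int:
    """Interleave the bits of two 16-bit operands into one 32-bit value."""
    if not (0 <= a <= ONESPOT_MAX and 0 <= b <= ONESPOT_MAX):
        raise ValueError("mingle operands must be onespot values")
    result = 0
    for bit in range(16):
        result |= ((b >> bit) & 1) << (2 * bit)       # right operand: even bits
        result |= ((a >> bit) & 1) << (2 * bit + 1)   # left operand: odd bits
    return result


def store_onespot(value: int) -> int:
    """Accept the value only if it fits in 16 bits, whatever produced it."""
    if not 0 <= value <= ONESPOT_MAX:
        raise RuntimeError("DON'T BYTE OFF MORE THAN YOU CAN CHEW")
    return value


print(store_onespot(mingle(255, 255)))  # 65535
```

A mingle of 255 with 255 happens to fit, so the assignment succeeds; a mingle of 256 with 0 would not.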
YOU CAN ONLY DISTORT THE LAWS OF MATHEMATICS SO FAR
Reverse assignments are not always mathematically possible. Also, sometimes they require changing the value of a constant; this is only legal if you specifically specified that it was legal by using the -v option. In the case of an impossible reverse assignment (including a situation in which operand overloading causes a reverse assignment to happen), this error happens at runtime.
This error can also come up when a scalar variable is overloaded to an array (which doesn’t make sense, but could happen if someone exploited bugs in the CREATE statement (see CREATE)), and an attempt is made to read or assign to that variable. (Subscripting a scalar variable is a syntax error, so there is no use for doing such an overload anyway.)
THAT MUCH QUOTATION AMOUNTS TO PLAGIARISM
There is a limit of 3200 on the number of nested spark/ears groups allowed. If you somehow manage to exceed that limit, that will cause this error. Try breaking the expression up into smaller expressions. (The limit is trivial to increase by changing SENESTMAX in ick.h; if you ever actually come across a program that hits the limit but wasn’t designed to, just email the maintainer to request a higher limit.)
YOU CAN’T HAVE EVERYTHING, WHERE WOULD YOU PUT IT?
Your program references so many variables that the compiler couldn’t cope. This error is unlikely to ever happen; if it does, try reducing the number of variables you use by combining some into arrays. This is a compile-time error.
THAT’S TOO COMPLEX FOR ME TO GRASP
This is another compile-time error that’s unlikely to ever happen; this one signifies the compiler itself running out of memory trying to compile your program. The only solutions to this are to simplify your program, or to make more memory available to the compiler.
I’M ALL OUT OF CHOICES!
Your program asked that a choicepoint be backtracked to or removed, but there aren’t any choicepoints at the moment. This runtime error usually indicates a logic mistake in your program. In backtracking programs translated from other backtracking languages, this indicates that the program has failed.
PROGRAM REJECTED FOR MENTAL HEALTH REASONS
Your program used a construct that only makes sense when multithreading or backtracking (WHILE, MAYBE, GO BACK, or GO AHEAD), but you didn’t specify the -m option (see -m). If you meant to write a multithreaded or backtracking program, just give that option; if you didn’t, be careful what words you use in comments! This error happens at compile-time.
THROW STICK BEFORE RETRIEVING!
In order to RETRIEVE a variable, it has to be STASHed first; if it isn’t, then this error happens at runtime.
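The relationship between the two commands amounts to a per-variable stack. The following toy model shows where the error comes from; the class and method names are made up for this sketch and do not correspond to anything in the C-INTERCAL runtime.

```python
class Variable:
    """Toy model of a variable with its own stash stack."""

    def __init__(self, value=0):
        self.value = value
        self._stash = []

    def stash(self):
        # Save a copy of the current value; the variable keeps its value.
        self._stash.append(self.value)

    def retrieve(self):
        # Restoring with nothing saved is the run-time error described above.
        if not self._stash:
            raise RuntimeError("THROW STICK BEFORE RETRIEVING!")
        self.value = self._stash.pop()


v = Variable(5)
v.stash()
v.value = 9
v.retrieve()
print(v.value)  # 5
```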
IT CAME FROM BEYOND SPACE
A COME FROM aiming at a line label (as opposed to a computed COME FROM, which is allowed to be pointing at a nonexistent line) must point to a valid line label. The same applies to NEXT FROM. This error happens at compile time if a nonexistent line label is found in one of these contexts.
YOU WANT MAYBE WE SHOULD IMPLEMENT 64-BIT VARIABLES?
This error is like E275 (see E275), but applies when an attempt is made at runtime to store a threespot value (or even a fourspot or morespot value) in a twospot variable, or a threespot or greater value is produced as an intermediate during a calculation (for instance by a mingle operation). No values above twospot are allowed at any point during an INTERCAL program; if you want to process higher numbers, you have to figure out a different way of storing them.
BETTER LATE THAN NEVER
Oops! The compiler just noticed that it had a buffer overflow. (Normally programs catch buffer overflows before they happen; C-INTERCAL catches them just afterwards instead.) This only happens on systems which don’t have a modern C standard library. Try using shorter or fewer filenames on the command line, to reduce the risk of such an overflow.
FLOW DIAGRAM IS EXCESSIVELY CONNECTED
Aiming two COME FROMs at the same line only makes sense in a multithreaded program. In a non-multithreaded program, doing that will cause this error at compile time (if neither COME FROM is computed) or at run time (if the command that has just finished running is simultaneously the target of two or more COME FROMs). This either indicates an error in your program or that you’ve forgotten to use the -m option (see -m) if you are actually trying to split the program into two threads.
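The compile-time half of this check can be pictured as grouping non-computed COME FROMs by target. This is only a sketch of the idea, not the compiler's implementation; the function name and the `(source_line, target_label)` input format are assumptions made for the example.

```python
from collections import defaultdict


def overconnected_targets(come_froms):
    """Return {target_label: [source_lines]} for labels aimed at more than once.

    come_froms -- iterable of (source_line, target_label) pairs describing
                  the non-computed COME FROMs in a program (hypothetical format)
    """
    by_target = defaultdict(list)
    for source_line, target_label in come_froms:
        by_target[target_label].append(source_line)
    return {t: lines for t, lines in by_target.items() if len(lines) > 1}


print(overconnected_targets([(3, 10), (7, 10), (9, 20)]))
```

Computed COME FROMs cannot be handled this way, because their targets are only known while the program runs; that is why the text above mentions a run-time form of the error as well.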
I DO NOT COMPUTE
The program asked for input, but for some reason it wasn’t available. (This is a runtime error, obviously.) The error may happen because the input is being piped in from a command or file which has reached end-of-file, or because the user typed CTRL-D (UNIX/Linux) or CTRL-Z (DOS/Windows) while the program was trying to WRITE IN some data.
WHAT BASE AND/OR LANGUAGE INCLUDES string?
When reading spelt-out-digit input, the input didn’t seem to be a valid digit in English, Sanskrit, Basque, Tagalog, Classical Nahuatl, Georgian, Kwakiutl, Volapük, or Latin. This seems to have languages covered pretty well; what on earth were you using, or did you just make a spelling mistake?
ERROR TYPE 621 ENCOUNTERED
The compiler encountered error E621 (see E621). This happens at runtime when the program requests that no entries are removed from the NEXT stack (which is possible), but that the last entry removed should be jumped to (which given the circumstances isn’t, because no entries were removed).
THE NEXT STACK RUPTURES. ALL DIE. OH, THE EMBARRASSMENT!
When an attempt is made to RESUME past the end of the NEXT stack, the program ends; however, this causes the program to end in a manner other than via GIVE UP or DON'T TRY AGAIN, so an error message must be printed, and this is that error message.
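The NEXT-stack errors in this list (the 80-entry limit, resuming by zero, and resuming past the end of the stack) can be seen together in a toy model. It is a sketch, not the runtime's code: the class name is invented, the default `limit=80` is taken from the figure quoted for the "BLACK LAGOON" error without claiming to know exactly which entry triggers it, and letting FORGET silently empty a too-short stack is an assumption.

```python
class NextStack:
    """Toy model of the NEXT stack and the errors associated with it."""

    def __init__(self, limit=80):
        self.limit = limit  # assumption: at most `limit` entries may be held
        self.entries = []

    def next(self, return_line):
        if len(self.entries) >= self.limit:
            raise RuntimeError("PROGRAM HAS DISAPPEARED INTO THE BLACK LAGOON")
        self.entries.append(return_line)

    def forget(self, count):
        # Drop up to `count` entries without jumping anywhere.
        del self.entries[max(0, len(self.entries) - count):]

    def resume(self, count):
        if count == 0:
            # Nothing removed, so there is no "last entry removed" to jump to.
            raise RuntimeError("ERROR TYPE 621 ENCOUNTERED")
        if count > len(self.entries):
            raise RuntimeError(
                "THE NEXT STACK RUPTURES. ALL DIE. OH, THE EMBARRASSMENT!")
        target = self.entries[-count]  # the last entry to be removed
        del self.entries[-count:]
        return target


stack = NextStack()
for line in (10, 20, 30):
    stack.next(line)
print(stack.resume(2))  # 20
```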
PROGRAM FELL OFF THE EDGE
You can’t just let execution run off the end of the program. At least, that is, if it doesn’t end with TRY AGAIN. An attempt to do that causes this error at runtime. Note that if your program references the system library, then it counts as being appended to your program and so the program will run into the first line of the system library rather than cause this error. As it happens, the first line of the system library is a syntax error, so doing this will cause E000 (see E000) with the error text ‘PLEASE KNOCK BEFORE ENTERING’. There isn’t a next statement to be executed with E633, so the next statement won’t be given in the error message.
HOW DARE YOU INSULT ME!
The PIN command doesn’t make much sense for anything bigger than a PIC; using it in a non-PIC program causes this error at compile-time. Try using the normal input and output mechanisms instead. This error may also be a clue that you are trying to compile a PIC-INTERCAL program without giving the -P option (see -P).
COMPILER HAS INDIGESTION
There isn’t a limit on the length of an input program other than your computer’s memory; if your computer does run out of memory during compilation, it causes this error. This error can also be caused if too many input files are specified on the command line; if you suspect this is the problem, split the compilation into separate compilations if you can, or otherwise you may be able to concatenate together your input files into larger but fewer files. Yet another potential cause of this error is if a line in an input program is too long; sensible line-wrapping techniques are encouraged.
RANDOM COMPILER BUG
No compiler is perfect; sometimes errors just happen at random. In this case, the random error is E774. If you don’t like the idea that your program may be shot down by a random compiler bug, or you are doing something important, you can use the -b option (see -b) to prevent this bug happening. (You may wonder why this bug is in there at all if it’s so easily prevented. The answer is that such a bug was present in the original INTERCAL-72 compiler, which also had an option to turn the bug off. It’s also a reward for people who actually read the manual.)
A SOURCE IS A SOURCE, OF COURSE, OF COURSE
You specified a file to compile on the command line, but the compiler couldn’t find or couldn’t open it. This is almost certainly because you made a typo specifying the file.
UNEXPLAINED COMPILER BUG
This should never come up, either at compile time or at run time. It could come up at either when an internal check by the compiler or the runtime libraries realises that something has gone badly wrong; mistakes happen, and in such cases the mistake will have been detected. (If this happens at compile time you can use the -U option (see -U) to cause the compiler to send an abort signal – which normally causes a core dump – when the error happens, to help debug what’s causing it.) More often, this error comes up when the operating system has noticed something impossible, like an attempt to free allocated memory twice or to write to a null pointer, and tells the compiler an error has occurred, in which case the same response of putting up this error happens. The point is that in all cases this error indicates a bug in the compiler (even if it happens at run time); in such cases, it would be very helpful if you figure out what caused it and send a bug report (see Reporting Bugs).
ARE ONE-CHARACTER COMMANDS TOO SHORT FOR YOU?
This is a debug-time error caused when you give too much input to the debugger when all it wanted was to know what you wanted to do next.
PROGRAM IS TOO BADLY BROKEN TO RUN
There’s a limit to how many breakpoints you can have in a program; you’ve broken the limit and therefore broken the debugger. This is a debug-time error.
I HAVE NO FILE AND I MUST SCREAM
The output file couldn’t be written, maybe because the disk is full or because there’s already a read-only file with the same name. This is a compile-time error.
HELLO? CAN ANYONE GIVE ME A HAND HERE?
This error occurs at compile-time if a file type was requested for which the required libraries are unavailable. (Support for Funge does not ship with the compiler; instead, you need to generate the library yourself from the cfunge sources. For more information, see Creating the Funge-98 Library.)
FLAG ETIQUETTE FAILURE BAD SCOUT NO BISCUIT
This error occurs at runtime if an INTERCAL program was passed an unknown option flag.
YOU HAVE TOO MUCH ROPE TO HANG YOURSELF
There is no limit on the number of threads or choicepoints that you can have in a multithreaded or backtracking program (in a program that isn’t multithreaded or backtracking, these are obviously limited to 1 and 0 respectively). However, your computer may not be able to cope; if it runs out of memory in the multithreader, it will cause this error at runtime.
I GAVE UP LONG AGO
TRY AGAIN has to be the last command in a program, if it’s there at all; you can’t even follow it by comments, not even if you know in advance that they won’t be REINSTATEd. This error happens at compile time if a command is found after a TRY AGAIN.
NOCTURNAL EMISSION, PLEASE LAUNDER SHEETS IMMEDIATELY
This error should never happen, and if it does indicates a compiler bug. It means the emitter function in the code degenerator has encountered an unknown opcode. Please send a copy of the program that triggered it to the INTERCAL maintainers.
DO YOU REALLY EXPECT ME TO HAVE IMPLEMENTED THAT?
Some parts of the code haven’t been written yet. There ought to be no way to cause those to actually run; however, if you do somehow find a way to cause them to run, they will cause this error at compile time.
ILLEGAL POSSESSION OF A CONTROLLED UNARY OPERATOR
Some operators (such as whirlpool (@) and sharkfin (^)) only make sense in TriINTERCAL programs, and some have a minimum base in which they make sense. This error happens at compile-time if you try to use an operator that conflicts with the base you’re in (such as using TriINTERCAL operators in an INTERCAL program in the default base 2).
EXCUSE ME, YOU MUST HAVE ME CONFUSED WITH SOME OTHER COMPILER
This error occurs just before compile-time if a file is encountered on the command line that C-INTERCAL doesn’t recognise. (If this error occurs due to a ‘.a’, ‘.b98’, ‘.c’, ‘.c99’, or ‘.c11’ file, then you forgot to enable the external calls system using -e (see -e).)
NO SKELETON IN MY CLOSET, WOE IS ME!
The skeleton file ick-wrap.c or pickwrap.c is needed to be able to compile INTERCAL to C. If the compiler can’t find it, it will give this error message. This indicates a problem with the way the compiler has been installed; try using the -u option (see -u) to find out where it’s looking (you may be able to place a copy of the skeleton file in one of those places).
This is a list of the warnings stored in the warning database. Warnings only come up when the -l option (see -l) is given; even then, some of the warnings are not currently implemented and therefore will never come up.
DON’T TYPE THAT SO HASTILY
The positional precedence rules for unary operators are somewhat complicated, and it’s easy to make a mistake. This warning is meant to detect such mistakes, but is not currently implemented.
THAT WAS MEANT TO BE A JOKE
If an INTERCAL expression has been translated from another language such as C, the optimiser is generally capable of translating it back into something similar to the original, at least in base 2. When after optimisation there are still INTERCAL operators left in an expression, then this warning is produced. (Therefore, it’s likely to come up quite a lot if optimisation isn’t used!) The system library produces some of these warnings (you can tell if a warning has come up in the system library because you’ll get a line number after the end of your program).
THAT RELIES ON THE NEW WORLD ORDER
This warning comes up whenever the compiler recognises that you’ve added some code that didn’t exist in INTERCAL-72. This allows you to check whether your code is valid INTERCAL-72 (although -t (see -t) is more useful for that); it also warns you that code might not be portable (because INTERCAL-72 is implemented by most INTERCAL compilers, but more recent language features may not be).
SYSLIB IS OPTIMIZED FOR OBUSCATION
There is an idiom used in the system library that does a right-shift by selecting alternate bits from a twospot number and then mingling them the other way round. A rightshift can much more easily be done with a single rightshift, so this is a silly way to do it, and this warning warns that this idiom was used. However, the present optimizer is incapable of recognising whether this problem exists or not, so the warning is not currently implemented.
YOU CAN’T EXPECT ME TO CHECK BACK THAT FAR
It’s an error to assign a twospot value (a value over 65535) to a onespot variable, or to use it as an argument to a mingle. If the optimizer can’t guarantee at compile time that there won’t be an overflow, it issues this warning. (Note that this doesn’t necessarily mean there’s a problem — for instance, the system library generates some of these warnings — only that the optimiser couldn’t work out for sure that there wasn’t a problem.)
WARNING HANDLER PRINTED SNIDE REMARK
Your code looks like it’s trying to assign 0 to an array, giving it no dimension; this is an error. This warning is produced at compile time if it looks like a line in your code will cause this error, but it isn’t necessarily an error because that line of code might never be executed.
FROM A CONTRADICTION, ANYTHING FOLLOWS
It’s sometimes impossible to reverse an assignment (a reverse assignment can happen if the -v option (see -v) is used and an expression is placed on the left of an assignment, or in operand overloading); if the compiler detects that a reversal failure is inevitable, it will cause this warning. Note that this doesn’t always cause an error, because the relevant code might never be executed.
THE DOCUMENTOR IS NOT ALWAYS RIGHT
There is no way to get this warning to come up; it isn’t even written anywhere in C-INTERCAL’s source code, is not implemented by anything, and there are no circumstances in which it is even meant to come up. It is therefore not at all obvious why it is documented.
KEEP LOOKING AT THE TOP BIT
C-INTERCAL uses a slightly different typing mechanism to some other INTERCAL compilers; types are calculated at compile time rather than run time. This only makes a difference in some cases involving unary operators. It’s impossible to detect at compile time for certain whether such a case has come up or not, but if the compiler or optimizer thinks that such a case might have come up, it will issue this warning.
WARNING TYPE 622 ENCOUNTERED
Your code looks like it’s trying to resume by 0; this is an error. This warning is produced at compile time if it looks like a line in your code will cause this error, but it isn’t necessarily an error because that line of code might never be executed.
The C-INTERCAL distribution contains a runtime debugger called ‘yuk’. Unlike most other debuggers, it is stored as object code rather than as an executable, and it is compiled into the code rather than operating on it. To debug code, add -y (see -y) to the command line of ick when invoking it; that tells it to compile the debugger into the code and then execute the resulting combination. (The resulting hybrid debugger/input executable is deleted afterwards; this is to prevent it being run by mistake, and to prevent spreading the debugger’s licence onto the code it was compiled with.)
yuk can also be used as a profiler using the -p option (see -p); this produces a file yuk.out containing information on how much time was spent running each command in your program, and does not prompt for debugger commands.
Note that some command line arguments are incompatible with the debugger, such as -m and -f. In particular, this means that multithreaded programs and programs that use backtracking cannot be debugged using this method; the +printflow option (see +printflow) to a compiled program may or may not be useful for debugging multithreaded programs.
When the debugger starts, it will print a copyright message and a message on how to access online help; then you can enter commands to run/debug the program. The debugger will show a command prompt, ‘yuk007 ’, to let you know you can input a command.
Here are the commands available. Commands are single characters followed by newlines, or followed by a line number (in decimal) and a newline or a variable name (a ‘.’, ‘,’, ‘:’ or ‘;’ followed by a number in decimal; note that some commands only allow onespot and twospot variables as arguments).
Command | Description |
---|---|
aLINE | All non-abstained commands on line LINE become abstained from once. |
bLINE | A breakpoint is set on line LINE. The breakpoint causes execution with ‘c’ to stop when it is reached. |
c | The program is run until it ends (which also ends the debugger) or a breakpoint is reached. |
dLINE | Any breakpoint that may be on line LINE is removed. |
eLINE | An explanation of the main expression in each command on line LINE is printed to the screen. The explanation is in the same format as the format produced by -h (see -h) and shows what the optimiser optimised the expression to (or the original expression if the optimiser wasn’t used). |
fLINE | Removes the effect of the ‘m’ command on line LINE. |
gLINE | Causes the current command to be the first command on LINE (if not on that line already) or the next command on LINE, as if that line was NEXTed to and then that NEXT stack item was forgotten. |
h | Lists 10 lines either side of the current line; if there aren’t 10 lines to one or the other side of the current line, instead more lines will be shown on the other side to compensate, if available. |
iVAR | Causes variable VAR to become IGNOREd, making it read-only. |
jVAR | Causes variable VAR to become REMEMBERed, making it no longer read-only. |
k | Continues executing commands until the NEXT stack is the same size or smaller than it was before. In other words, if the current command is not a NEXT and doesn’t have a NEXT FROM aiming at it, one command is executed; but if a NEXT does happen, execution will continue until that NEXT returns or is forgotten. A breakpoint or the end of the program also ends this. |
lLINE | Lists 10 lines of source code either side of line LINE, the same way as with ‘h’, but using a line stated in the command rather than the current line. |
mLINE | Produces a message onscreen every time a command on line LINE is executed, but without interrupting the program. |
n | Shows the NEXT stack on the screen. |
o | Continues executing commands until the NEXT stack is smaller than it was before. If you are using NEXTs like procedures, then this effectively means that the procedure will run until it returns. A breakpoint or the end of the program also ends this. |
p | Displays the value of all onespot and twospot variables. |
q | Aborts the current program and exits the debugger. |
rLINE | Reinstates once all abstained commands on line LINE. |
s | Executes one command. |
t | Continues execution until the end of the program or a breakpoint: each command that executes is displayed while this command is running. |
uLINE | Continues execution of the program until just before a command on line LINE is run (or a breakpoint or the end of the program). |
vVAR | Adds a ‘view’ on variable VAR (which must be onespot or twospot), causing its value to be displayed on the screen whenever a command is printed on screen (for instance, because the command has just been stepped past, or due to the ‘m’ or ‘t’ commands). |
w | Displays the current line and current command onscreen. |
xVAR | Removes any view and any action that may be associated with it on variable VAR (which must be onespot or twospot). |
yVAR | Adds a view on variable VAR; also causes a break, as if a breakpoint was reached, whenever the value of that variable changes. |
zVAR | Adds a view on variable VAR; also causes a break, as if a breakpoint was reached, whenever that variable’s value becomes 0. |
VAR | A onespot or twospot variable written by itself prints out the value of that variable. |
<VAR | WRITEs IN a new value for variable VAR. Note that input must be in the normal ‘ONE TWO THREE’ format; input in any other format will cause error E579 (see E579) and as that is a fatal error, the debugger and program it’s debugging will end. |
* | Displays the license conditions under which ick is distributed. |
? | Displays a summary of what each command does. (‘@’ does the same thing.) |
While the code is executing (for instance, during a ‘c’, ‘k’, ‘o’, ‘t’ or ‘u’ command), it’s possible to interrupt it with CTRL-C (on UNIX/Linux) or CTRL-BREAK (on Windows/DOS); this will cause the current command to finish running and the debugger prompt to come back up.
INTERCAL programs consist of a list of statements. Execution of a program starts with its first statement; generally speaking each statement runs after the previous statement, although many situations can change this.
Whitespace is generally insignificant in INTERCAL programs; it cannot be added in the middle of a keyword (unless the keyword contains whitespace itself) or inside a decimal number, but it can be added more or less anywhere else, and it can be removed from anywhere in the program as well.
An INTERCAL statement consists of an optional line label, a statement identifier, an optional execution chance, the statement itself (see Statements), and optionally ONCE or AGAIN.
The history of INTERCAL is plagued with multiple syntaxes and character sets. The result has settled down with two versions of the syntax; the original Princeton syntax, and the Atari syntax (which is more suited to the operating systems of today).
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
some versions | version 0.18+ | all versions | no |
The original INTERCAL-72 compiler was the Princeton compiler, which introduced what has become known as the Princeton syntax for INTERCAL; this is the syntax used in the original manual, for instance, and can be considered to be the ‘original’ or ‘official’ INTERCAL syntax. It is notable for containing various characters not found in some character sets; for instance, it writes the operator for mingle as a cent sign (known as ‘change’). The other operator that often causes problems is the bookworm operator ‘V’, backspace, ‘-’, which is used for exclusive-or; the backspace can cause problems on some systems (which was probably the original intention). This syntax is also the default syntax in the CLC-INTERCAL compiler, which is the de facto standard for expanding the Princeton syntax to modern INTERCAL features that are not found in INTERCAL-72; however, it does not appear to have been used as the default syntax in any other compilers. Nowadays, there are other ways to write the required characters than using backspace; for instance, the cent sign appears in Latin-1 and UTF-8, and there are various characters that approximate bookworms (for instance, CLC-INTERCAL uses the Latin-1 yen symbol for this, which just to make things confusing, refers to a mingle in modern Atari syntax).
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
some versions | yes | version 0.05+ | yes |
The other main syntax is the Atari syntax, so called because it was originally described in notes about an “Atari implementation” added to the paper INTERCAL-72 manual when it was softcopied in 1982. These notes describe a never-completed compiler implementation for 6502 by Mike Albaugh and Karlina Ott; it was meant to use the Atari 800 cartridge and screen editor, but that portion was never written. The syntax was designed to work better on ASCII-based systems, by avoiding the change character (although it can still be written as ‘c’, backspace, ‘/’, which the Atari compiler documentation claims that the Princeton compiler supported) in favour of a ‘big money’ character (‘$’), and using the ‘what’ (‘?’) as an alternative character for exclusive-or. This is the syntax that C-INTERCAL and J-INTERCAL have always used, and is the one most commonly used for communicating INTERCAL programs on Usenet and other similar fora (where ASCII is one of the most reliable-to-send character sets). It is also the syntax used for examples in this manual, for much the same reason. The Atari syntax for constructs more modern than INTERCAL-72 is normally taken to be that used by the C-INTERCAL compiler, because it is the only Atari-syntax-based compiler that contains non-INTERCAL-72 constructs that actually need their own notation.
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
yes | all versions | all versions | all versions |
The first part of an INTERCAL statement is a line label that specifies what its line number is. This is optional; it’s legal to have a statement without a line number, although that prevents other commands referring to it by number. Line numbers must be constants, and unique within the program. However, they do not have to be in order; unlike some other languages with line numbers, a line with a higher number can come earlier in the program than a line with a lower number, and the numbers don’t affect the order in which commands are executed.
A line label is an integer expressed in decimal within a wax/wane pair (‘(’ and ‘)’). For instance, this is a valid line label:
(1000)
Note that line numbers from 1000 to 1999 are used by the system library, so using them within your own programs may produce unexpected errors if the system library is included. Apart from this, line numbers from 1 to 65535 are allowed.
It has become reasonably customary for people writing INTERCAL libraries to pick a range of 1000 line numbers (for instance, 3000 to 3999) and stick to that range for all line numbers used in the program (apart from when calling the system library), so if you want to write an INTERCAL library, it may be a good idea to look at the existing libraries (in the pit/lib directory in the C-INTERCAL distribution) and choose a range of numbers that nobody else has used. If you aren’t writing a library, it may be a good idea to avoid such number ranges (that is, use only line numbers below 1000 or very high numbers that are unlikely to be used by libraries in the future), so that you can easily add libraries to your program without renumbering in the future.
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
yes | all versions | all versions | all versions |
After the line label (if a statement has one) comes the statement identifier, which marks where the statement starts. Either the label or the statement identifier, whichever one comes first, marks where the preceding statement finishes.
The main statement identifier is DO. It also has two synonyms, PLEASE and PLEASE DO; these synonyms are the ‘polite’ forms of statement identifiers. Although the three identifiers have the same meaning, using either polite or non-polite identifiers too much can cause an error; the correct proportion is approximately 3 non-polite identifiers for every polite identifier used. None of these identifiers actually does anything else apart from marking where the statement starts; they leave the statements in the default ‘reinstated’ state.
Adding NOT or N'T to the end of any of these identifiers, to create a statement identifier such as DO NOT or PLEASE DON'T, also creates a valid statement identifier. These differ in meaning from the previous set of identifiers, though; they cause the statement they precede to not be executed by default; that is, the command will be skipped during execution (this is known as the ‘abstained’ state). This applies even if the command in question is in fact a syntax error, which makes this a useful method of writing comments. One common idiom is to write code like this:
PLEASE NOTE: This is a comment.
The statement identifier (PLEASE NOT) is the only part of this statement that is valid INTERCAL; however, because the statement identifier is in the negated form that contains NOT, the syntax error won’t be executed, and therefore this is a valid statement. (In INTERCAL, syntax errors happen at runtime, so a program containing a statement like DOUBT THIS WILL WORK will still compile, and will not end due to the syntax error unless that statement is actually executed. See E000.)
The ABSTAIN and REINSTATE statements can override the NOT or otherwise on a statement identifier; see ABSTAIN.
In backtracking programs, MAYBE is also a valid statement identifier; see MAYBE. It comes before the other keywords in the statement identifier, and an implicit DO is added if there wasn’t one already in the statement identifier (so MAYBE, MAYBE DO, MAYBE DON'T, MAYBE PLEASE, and so on are all valid statement identifiers).
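The label and identifier rules above can be sketched as a small Python function. This is only an illustration under stated assumptions: `parse_header` and its result keys are hypothetical names, it is not how ick's parser works, and it strips all whitespace first (which the language permits, but which also squashes the text of comments).

```python
import re

# Hypothetical sketch: split the line label and statement identifier
# off the front of one statement. Not taken from the ick sources.
HEADER = re.compile(
    r"(?:\((\d+)\))?"          # optional line label inside a wax/wane pair
    r"(MAYBE)?"                # optional MAYBE (backtracking programs)
    r"(PLEASEDO|PLEASE|DO)?"   # DO and its polite synonyms
    r"(NOT|N'T)?"              # negation: the statement starts abstained
)

def parse_header(statement):
    text = re.sub(r"\s+", "", statement)  # whitespace is insignificant here
    m = HEADER.match(text)                # always matches: every part is optional
    label, maybe, ident, negation = m.groups()
    if maybe is None and ident is None:
        raise ValueError("no statement identifier found")
    if label is not None and not 1 <= int(label) <= 65535:
        raise ValueError("line numbers must be from 1 to 65535")
    return {
        "label": int(label) if label is not None else None,
        "maybe": maybe is not None,
        "polite": ident is not None and ident.startswith("PLEASE"),
        "abstained": negation is not None,
        "rest": text[m.end():],           # whatever follows the identifier
    }

comment = parse_header("PLEASE NOTE: This is a comment.")
print(comment["polite"], comment["abstained"], comment["rest"])
```

Note how the comment idiom falls out of the rules: PLEASE NOTE is read as PLEASE NOT followed by a statement that begins with E.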
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
yes | all versions | version 0.02+ | all versions |
It’s possible to specify that a command should be run only a certain proportion of the time, at random. This is a rarely used feature of INTERCAL, although it is the only way to introduce randomness into a program. (The C-INTERCAL compiler approximates this with pseudorandomness.) An execution chance specification comes immediately after the statement identifier, but before the rest of the statement, and consists of a double-oh-seven (%) followed by an integer from 1 to 99 inclusive, written in decimal; this gives the percentage chance of the statement running. The execution chance only acts to prevent a statement running when it otherwise would have run; it cannot cause a statement that would otherwise not have run to run. For instance, the statement DO %40 READ OUT #1 has a 40% chance of writing out ‘I’, but the statement DON'T %40 READ OUT #1 has no chance of writing out ‘I’ or anything else, because the N'T prevents it running and the double-oh-seven cannot override that.
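As a rough model of the rule above, here is a Python sketch of the decision made each time a statement is reached. The function name `will_run` and its parameters are made up for illustration, and the use of Python's `random` module is an assumption; it says nothing about which pseudorandom generator C-INTERCAL uses.

```python
import random

def will_run(abstained, chance_percent=None, rng=random):
    """Decide whether one encounter of a statement actually executes it."""
    if abstained:
        return False   # the double-oh-seven cannot override NOT or N'T
    if chance_percent is None:
        return True    # no execution chance was given
    if not 1 <= chance_percent <= 99:
        raise ValueError("execution chance must be from 1 to 99")
    return rng.random() * 100 < chance_percent

rng = random.Random(1972)  # arbitrary seed, only to make the sketch repeatable
trials = 10_000
runs = sum(will_run(False, 40, rng) for _ in range(trials))
print(runs / trials)       # roughly 0.4
print(any(will_run(True, 40, rng) for _ in range(trials)))  # False
```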
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
no | version 0.25+ | no | no |
The last part of a statement is an optional ONCE or AGAIN. ONCE specifies that the statement is self-abstaining or self-reinstating (this will be explained below); AGAIN specifies that the statement should behave like it has already self-reinstated or self-abstained. Whether the behaviour is self-abstention or self-reinstatement depends on whether the statement was initially abstained or not; a ONCE on an initially reinstated statement or AGAIN on an initially abstained statement indicates a self-abstention, and a ONCE on an initially abstained statement or AGAIN on an initially reinstated statement indicates a self-reinstatement.
The first time a self-abstaining statement is encountered, it is executed as normal, but the statement is then abstained from and therefore will not run in future. Likewise, the first time a self-reinstating statement is encountered, it is not executed (as is normal for an abstained statement), but then becomes reinstated and will run in future. In each of these cases, the ONCE effectively changes to an AGAIN; the ONCE only happens once, as might be expected.
REINSTATING a currently abstained self-abstaining statement or ABSTAINING (that is, with the ABSTAIN or REINSTATE commands) a currently reinstated self-reinstating statement causes the AGAIN on the statement to change back into a ONCE, so the statement will again self-abstain or self-reinstate. Likewise, REINSTATING a currently abstained self-reinstating statement or ABSTAINING a currently reinstated self-abstaining statement causes its ONCE to turn into an AGAIN.
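The rules above amount to a small state machine, which can be sketched in Python as follows. This is a toy model built only from the description in this section; the class `Statement` and its attributes are hypothetical and do not mirror ick's internals. One assumption is marked in the code: abstaining from an already abstained statement (or reinstating an already reinstated one) is treated as having no effect.

```python
class Statement:
    """Toy model of one statement's abstention state."""

    def __init__(self, abstained=False, once=None):
        self.abstained = abstained  # True after NOT/N'T or an ABSTAIN
        self.once = once            # True = ONCE, False = AGAIN, None = neither

    def encounter(self):
        """Return True if the statement runs on this encounter."""
        runs = not self.abstained
        if self.once:               # the ONCE fires, then turns into AGAIN
            self.abstained = not self.abstained
            self.once = False
        return runs

    def _set(self, abstained):
        if self.abstained == abstained:
            return                  # assumption: no state change, no effect
        self.abstained = abstained
        if self.once is not None:
            self.once = not self.once   # ONCE and AGAIN swap over

    def abstain(self):
        self._set(True)

    def reinstate(self):
        self._set(False)

s = Statement(abstained=False, once=True)     # like DO ... ONCE
print([s.encounter() for _ in range(3)])      # [True, False, False]
s.reinstate()                                 # AGAIN changes back into ONCE
print([s.encounter() for _ in range(2)])      # [True, False]
```

Keeping a single `once` flag works because, in all four cases described above, an ABSTAIN or REINSTATE that changes the statement's state also swaps ONCE and AGAIN.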
Historical note: ONCE was devised by Malcom Ryan as a method of allowing synchronisation between threads in a multithreaded program (ONCE is atomic with the statement it modifies, that is, there is no chance that threads will change between the statement and the ONCE). AGAIN was added to Malcom Ryan’s Threaded Intercal standard on the suggestion of Kyle Dean, as a method of adding extra flexibility (and to allow the ONCEs to happen multiple times, which is needed to implement some multithreaded algorithms).
Many INTERCAL statements take expressions as arguments. Expressions are made up out of operands and operators between them. Note that there is no operator precedence in INTERCAL; different compilers resolve ambiguities different ways, and some versions of some compilers (including the original INTERCAL-72 compiler) will cause error messages on compiling or executing an ambiguous expression, so it’s safest to fully group each expression.
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
yes | all versions | all versions | all versions |
The basic operands in INTERCAL are constants and variables. These together make up what in other languages are known as ‘lvalues’, that is, operands to which values can be assigned. (Constants can also be lvalues in INTERCAL, but by default C-INTERCAL turns this off because it carries an efficiency penalty and can be confusing; this can be turned on with the -v option (see -v).)
Constants can have any integer value from 0 to 65535 inclusive; higher values (up to 4294967295) can be generated in programs, but cannot be specified literally as constants. (The usual way to work around this limitation is to interleave two constants together; see Mingle.) A constant is written as a mesh (#) followed by a number in decimal. At the start of the program, all constants have the same value as the number that identifies them; for instance, #100 has 100 as its value, and it’s strongly advised not to change the value of a constant during the execution of a program.
There are four types of variable: 16-bit and 32-bit unsigned integers, and arrays of 16-bit and 32-bit unsigned integers. These are represented with a spot, twospot, tail, and hybrid (‘.’, ‘:’, ‘,’, and ‘;’) respectively. For this reason, integers within the range 0 to 65535 inclusive are known as ‘onespot numbers’, and integers within the range 0 to 4294967295 inclusive are known as ‘twospot numbers’; variables with those ranges are known as onespot and twospot variables. (Note that arrays did not work in C-INTERCAL before version 0.7.)
Variables are represented with a character representing their data type, followed by an integer from 1 to 65535 inclusive, written in decimal; leading zeros make no difference, so for instance .1 and .001 are the same variable, onespot number 1. Non-array variables don’t need to be declared before they are used; they automatically exist in any program that uses them. Array variables need to be dimensioned before they are used, by assigning dimensions to them; see Calculate.
Because there are no operator precedences in INTERCAL, there are various solutions to specifying what precedences actually are.
All known versions of INTERCAL accept the INTERCAL-72 grouping rules. These state that it’s possible to specify that an operator takes precedence by grouping it inside sparks (') or rabbit-ears ("), the same way as wax/wane pairs (parentheses) are used in other programming languages. INTERCAL-72 and earlier C-INTERCAL versions demanded that expressions were grouped fully like this, and this practice is still recommended because it leads to portable programs and is easier to understand. Whether sparks or rabbit-ears (often called just ‘ears’ for short) are used normally doesn’t matter, and programmers can use one or the other for clarity or for aesthetic appeal. (One common technique is to use just sparks at the outermost level of grouping, just ears at the next level, just sparks at the next level, and so on; but expressions like ''#1~#2'~"#3~#4"'~#5 are completely unambiguous, at least to the compiler.)
There are, however, some complicated situations involving array subscripting where it is necessary to use sparks and ears at alternate levels, if you want to write a portable program. This limitation is in C-INTERCAL to simplify the parsing process; INTERCAL-72 has the same limitation, probably for the same reason. Compare these two statements:
DO .1 <- ,3SUB",2SUB.1".2
DO .1 <- ,3SUB",2SUB.1".2~.3"".4
The problem is that in the first statement, the ears close a group, and in the second statement, the ears open a group, and it’s impossible to tell the difference without unlimited lookahead in the expression. Therefore, in similar situations (to be precise, in situations where a group is opened inside an array subscript), it’s necessary to use the other grouping character to the one that opened the current group if you want a portable program.
One final comment about sparks and rabbit-ears; if the next character in the program is a spot, as often happens because onespot variables are common choices for operands, a spark and the following spot can be combined into a wow (!). Unfortunately, the rabbit-ear/spot combination has no one-character equivalent in any of the character sets that C-INTERCAL accepts as input (UTF-8, Latin-1, and ASCII-7) as none of these contain the rabbit character, although the Hollerith input format that CLC-INTERCAL can use does.
The precedence rules used by CLC-INTERCAL for grouping when full grouping isn’t used are simple to explain: the largest part of the input that looks like an expression is taken to be that expression. The main practical upshot of this is that binary operators right-associate; that is, .1~.2~.3 is equivalent to .1~'.2~.3'. C-INTERCAL versions 0.26 and later also right-associate binary operators so as to produce the same results as CLC-INTERCAL rules in this situation, but as nobody has yet tried to work out what the other implications of CLC-INTERCAL rules are they are not emulated in C-INTERCAL, except possibly by chance.
In INTERCAL-72 and versions of C-INTERCAL before 0.26, unary operators were always in the ‘infix’ position. (If you’re confused about how you can have an infix unary operator: they go one character inside a group that they apply to, or one character after the start of a constant or variable representation; so for instance, to portably apply the unary operator & to the variable :1, write :&1, and to portably apply it to the expression '.1~.2', write '&.1~.2'.) CLC-INTERCAL, and versions of C-INTERCAL from 0.26 onwards, allow the ‘prefix’ position of a unary operator, which is just before whatever it applies to (as in &:1). This leads to ambiguities as to whether an operator is prefix or infix. The portable solution is, of course, to use only infix operators and fully group everything, but when writing for recent versions of C-INTERCAL, it’s possible to rely on its grouping rule, which is: unary operators are interpreted as infix where possible, but at most one infix operator is allowed to apply to each variable, constant, or group, and infix operators can’t apply to anything else. So for instance, the C-INTERCAL '&&&.1~.2' is equivalent to the portable '&"&.&1"~.2' (or the more readable version of this, "&'"&.&1"~.2'", which is also portable). If these rules are counter-intuitive to you, remember that this is INTERCAL we’re talking about; note also that this rule is unique to C-INTERCAL, at least at the time of writing, and in particular CLC-INTERCAL is likely to interpret this expression differently.
Operators are used to operate on operands, to produce more complicated expressions that actually calculate something rather than just fetch information from memory. There are two types of operators, unary and binary operators, which operate on one and two arguments respectively. Binary operators are always written between their two operands; to portably write a unary operator, it should be in the ‘infix’ position, one character after the start of its operand; see Prefix and infix unary operators for the full details of how to write unary operators portably, and how else you can use them if you aren’t aiming for portability. This section only describes INTERCAL-72 operators; many INTERCAL extensions add their own operators.
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
yes | all versions | all versions | all versions |
Mingle, or interleave, is one of the two binary operators in INTERCAL-72. However, different INTERCAL compilers represent it in different ways, so it is impossible to write a mingle in a program completely portably, because it differs between Princeton and Atari syntax, and worse, the sequence of character codes needed to represent it in each syntax has varied from compiler to compiler.
The original INTERCAL-72 compiler (the Princeton compiler) used the ‘change’ (cent) character for a mingle, represented as ‘c’, backspace, ‘/’. (By the way, this is still the most portable way to write a mingle; both C-INTERCAL and CLC-INTERCAL accept it, at least if a lowercase ‘c’ is used, the Atari compiler was to accept it, and its documentation claimed that the Princeton compiler also accepted it; CLC-INTERCAL also accepts a capital ‘C’ before the backspace and ‘/’, and allows ‘|’ rather than ‘/’.) The uncompleted Atari compiler intended to use a ‘big money’ character ($) as the mingle character; this character is also the only one accepted for mingle by the J-INTERCAL compiler.
C-INTERCAL originally also used the $ character for mingle, and this character is the one most commonly seen in existing C-INTERCAL programs, and most often used when giving examples of INTERCAL on Usenet, because it exists in the ASCII-7 character set, and because it doesn’t contain control characters. From version 0.18 of C-INTERCAL onwards, various other units of currency (change, quid, and zlotnik if Latin-1 is used as the input, and euro if Latin-9 is used as the input) are accepted; from version 0.20 onwards, in addition to the Latin-1 characters, all the currency symbols in Unicode are accepted if UTF-8 is used as the input format. CLC-INTERCAL has always used the change character (either the Latin-1 version or the version that contains a backspace) for mingle. In this manual, mingle will be represented as $, but it’s important to bear in mind that this character is not the most portable choice.
The mingle operator should be applied to two operands or expressions. To be portable, the operands must both be onespot expressions, that is expressions which have a 16-bit result; C-INTERCAL relaxes this rule slightly and only requires that the result be in the onespot range. (This is because the data type of a select operator’s value is meant to be determined at runtime; C-INTERCAL determines all data types at compile time, so has to guess a 32-bit result for a select with a 32-bit type as its right operand even when the result might actually turn out to be of a 16-bit type, and so this behaviour prevents an error when a select operation returns a value with a 16-bit data type and is used as an argument to a mingle.) The result is a 32-bit value (that is, it is of a 32-bit data type, even if its value fits into the onespot range), which consists of bits alternated from the two arguments; to be precise, its most significant bit is the most significant bit of its first argument, its second most significant bit is the most significant bit of its second argument, its third most significant bit is the second most significant bit of its first argument, and so on until its least significant bit, which is the least significant bit of its second argument.
One of the most common uses of interleaving is to create a constant with a value greater than 65535; for instance, 65536 is #0$#256. It is also commonly used in expressions that need to produce 32-bit results; except in some simple cases, this is usually coded by calculating separately the odd-numbered and even-numbered bits of the result, and mingling them together at the end. It is also used in expressions that need to left-shift values or perform similar value-increasing operations, as none of the other operators can easily do this; and mingle results are commonly used as the argument to unary binary logic operators, because this causes them to behave more like the binary logic operators found in some other languages.
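For readers who find the bit-by-bit description easier to follow as code, here is a minimal Python model of mingle. The function name `mingle` is simply a convenient label for this sketch; it illustrates the arithmetic only and is not taken from the C-INTERCAL sources.

```python
def mingle(a, b):
    """Interleave two 16-bit values into one 32-bit value."""
    if not (0 <= a <= 0xFFFF and 0 <= b <= 0xFFFF):
        raise ValueError("both operands must be in the onespot range")
    result = 0
    for i in range(16):
        result |= ((a >> i) & 1) << (2 * i + 1)  # first operand: odd bit positions
        result |= ((b >> i) & 1) << (2 * i)      # second operand: even bit positions
    return result

print(mingle(0, 256))           # 65536, matching #0$#256
print(hex(mingle(0xFFFF, 0)))   # 0xaaaaaaaa
print(mingle(65535, 65535))     # 4294967295, the largest twospot number
```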
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
yes | all versions | all versions | all versions |
The select operator is one of the two binary operators in INTERCAL-72; unlike mingle, every known implementation of INTERCAL ever has used the sqiggle character (~) as the representation of the select operator, meaning that writing it portably is easy.
The select operator takes two arguments, which can be of either datatype (that is, 16- or 32-bit). It returns a value made by selecting certain bits of its first operand indicated by the second operand, and right-justifying them. It ignores all the bits of the first operand where the second operand has a 0 as the corresponding bit, that is, deletes them from a copy of the operand’s value; the bits that are left are squashed together towards the least-significant end of the number, and the result is filled with 0s to make it up to 16 or 32 bits. (In INTERCAL-72 the minimum multiple of 16 bits possible that the result fits into is chosen, although if :1 has the value 131071 (in hex, 1FFFF) the expression #21~:1 produces a 32-bit result because 17 bits were selected, even though many of the leading bits were zeros; in C-INTERCAL the data type of the result is the same as of the right operand of the select, so that it can be determined at compile time, and so using a unary binary logic operator on the result of select when the right operand has a 32-bit type is nonportable and not recommended.) As an example, #21~:1 produces 21 as its result if :1 has the value 131071, 10 as its result if :1 has the value 30 (1E in hex; the least significant bit of 21 is removed because it corresponds to a 0 in :1), and 7 as its result if :1 has the value 21 (because three bits in 21 are set, and those three bits from 21 are therefore selected by 21).
Select is used for right-shifts, to select every second bit from a number (either to produce what will eventually become an argument to mingle, or to interpret the result of a unary binary logic operator, or occasionally both), to test if a number is zero or not (by selecting it from itself and selecting 1 from the result), in some cases as a limited version of bitwise-and (that only works if the right operand is 1 less than a power of 2), and for many other purposes.
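The behaviour of select, and a few of the uses just listed, can be modelled in a few lines of Python. The names `select` and `is_nonzero` are hypothetical labels for this sketch; it models values only and ignores the question of whether the result has a 16-bit or 32-bit type.

```python
def select(value, mask):
    """Keep the bits of value where mask has a 1, packed at the low end."""
    result = 0
    out = 0                                  # next free position in the result
    for i in range(32):
        if (mask >> i) & 1:
            result |= ((value >> i) & 1) << out
            out += 1
    return result

print(select(21, 0x1FFFF))   # 21
print(select(21, 30))        # 10
print(select(21, 21))        # 7

def is_nonzero(x):
    """Select x from itself, then select 1 from the result."""
    return select(select(x, x), 1)

print(is_nonzero(0), is_nonzero(4096))   # 0 1
print(select(26, 0xFFFE))                # 13, a right shift by one bit
print(select(29, 7))                     # 5, the same as 29 & 7
```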
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
yes | all versions | all versions | all versions |
There are three unary operators in INTERCAL-72, each of which carries out a binary logic operation on adjacent bits of the number. The operators are and, or, and exclusive or; and and or are represented by an ampersand (&) and book (V) respectively, and exclusive or has the same notational problems as mingle, as it differs between Princeton and Atari syntax. It was represented by a bookworm, written ‘V’, backspace, ‘-’, in the Princeton INTERCAL-72 implementation, and this is still the most portable way to write it (C-INTERCAL and CLC-INTERCAL accept it). The Atari implementation of INTERCAL-72 wrote it with a what (?), and this is the representation originally used by C-INTERCAL (and still accepted), the only representation accepted by J-INTERCAL, the one most commonly used on Usenet, and the one used in this manual (although again, it’s worth pointing out that this isn’t portable). CLC-INTERCAL approximates a bookworm with the yen character, which being a currency character is one of the possible representations for mingle in C-INTERCAL; C-INTERCAL uses the rather confusing method of interpreting a yen character as exclusive-or if input in Latin-1 but as mingle if input in UTF-8. (This usually does the right thing, because CLC-INTERCAL doesn’t support UTF-8.) In the same way, CLC-INTERCAL has a C-INTERCAL compatibility option to allow the use of ? for exclusive-or.
The operators take each pair of consecutive bits in their
arguments (that is, the least significant with the second least
significant, the second least significant with the third least
significant, the third least significant with the fourth least
significant, and so on, with the pair consisting of the most
significant and least significant being used to calculate the
most significant bit of the result), and perform an appropriate
logic operation on them; and sets a bit of the result if and
only if both bits in the pair were set, or sets each bit
corresponding to each pair where either bit was set, and
exclusive or sets a bit if and only if the bits in the pair had different values (that is, one was set, but not both). So for instance, #&26 is 8 (26 is 1A in hexadecimal or 11010 in binary, and the only adjacent pair with both bits set is the pair worth 8 and 16); #V26 is 31 (11111 in binary), and #?26 is 23 (10111 in binary).
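The pairing rule can be modelled as combining a number with a copy of itself rotated right by one bit. This is a minimal Python sketch under that reading of the paragraph above; the function name `unary` and its `width` parameter are hypothetical, and 16 bits is assumed as the default operand size.

```python
def unary(op: str, x: int, width: int = 16) -> int:
    """Model of the unary operators &, V and ? on a `width`-bit value.

    Bit i of the result combines bit i and bit i+1 of x; the most
    significant bit of the result combines the most and least significant
    bits of x. That is the same as combining x with x rotated right by one.
    """
    mask = (1 << width) - 1
    x &= mask
    rotated = ((x >> 1) | ((x & 1) << (width - 1))) & mask
    if op == "&":
        return x & rotated
    if op == "V":
        return x | rotated
    if op == "?":
        return x ^ rotated
    raise ValueError("unknown unary operator: " + op)
```

For 26 (binary 11010) this model gives 8 for and, 31 for or, and 23 for exclusive or; applying exclusive or to 1 gives 32769, which shows the wrap-around from the least significant to the most significant bit.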
The most commonly seen use for these operators is to carry out bitwise ands, ors, and exclusive ors between two different 16-bit expressions, by mingling them together, applying a unary binary logic operator, and selecting every second bit of the result; such code often results due to people thinking in terms of some other language when writing INTERCAL, but is still often useful. (Historically, the first idiom added to the optimizer, apart from constant folding, was the mingle/unary/select sequence.) There are more imaginative uses; one impressive example is the exclusive or in the test for greater-than from the original INTERCAL-72 system library:
DO :5 <- "'?":1~'#65535$#0'"$":2~'#65535$#0'"'
    ~'#0$#65535'"$"'?":1~'#0$#65535'"$":2~'#0$
    #65535'"'~'#0$#65535'"
DO .5 <- '?"'&"':2~:5'~'"'?"'?":5~:5"~"#65535~
    #65535"'~'#65535$#0'"$#32768'~'#0$#65535'"
    $"'?":5~:5"~"#65535$#65535"'~'#0$#65535'"'
    "$"':5~:5'~#1"'~#1"$#2'~#3
The first statement works out the value of :1 bitwise exclusive or :2; the second statement then works out whether the most significant set bit in :5 (that is, the most significant bit that differs between :1 and :2) corresponds to a set bit in :2 or not. In case that’s a bit too confusing to read, here’s the corresponding optimizer idiom (in OIL):
((_1~:2)~((?32(:2~:2))^#2147483648))->(_1>(:2^_1))
(Here, the ^ refers to a bitwise exclusive or, an operation found in OIL but not in INTERCAL, which is why the INTERCAL version is so much longer.) The INTERCAL version also has some extra code to check for equality and to produce 1 or 2 as the output rather than 0 or 1.
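The mingle/unary/select idiom described above, for combining two 16-bit values bit by bit, can be sketched in Python as follows. This is an illustrative model rather than compiler code: all function names are hypothetical, and it assumes that mingle places the bits of its first operand in the more significant position of each pair (so that '#0$#65535' has every second bit set, starting from the least significant).

```python
def mingle(a: int, b: int) -> int:
    """Interleave two 16-bit values; bits of `a` land in the odd positions."""
    result = 0
    for i in range(16):
        result |= ((b >> i) & 1) << (2 * i)
        result |= ((a >> i) & 1) << (2 * i + 1)
    return result


def select(a: int, b: int) -> int:
    """Keep the bits of `a` where `b` has a 1, packed to the right."""
    result, out_pos, pos = 0, 0, 0
    while b >> pos:
        if (b >> pos) & 1:
            result |= ((a >> pos) & 1) << out_pos
            out_pos += 1
        pos += 1
    return result


def unary32(op: str, x: int) -> int:
    """32-bit unary and (&), or (V), exclusive or (?) on adjacent bits."""
    rotated = ((x >> 1) | ((x & 1) << 31)) & 0xFFFFFFFF
    return {"&": x & rotated, "V": x | rotated, "?": x ^ rotated}[op]


EVEN_BITS = mingle(0, 65535)  # models '#0$#65535'


def bitwise(op: str, a: int, b: int) -> int:
    """Mingle, apply a unary operator, then select every second bit."""
    return select(unary32(op, mingle(a, b)), EVEN_BITS)
```

Each even-numbered bit of the unary result combines one bit of `b` with the corresponding bit of `a`, which is why selecting every second bit recovers the ordinary bitwise result.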
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
yes | version 0.7+ | all versions | all versions |
In order to access the elements of an array, either to read or write the array, it is necessary to use the array subscript operator SUB. Note that an array element is not a variable, so it is not accepted as an argument to statements like IGNORE; however, it can be assigned to.
The syntax for an array element is the array, followed by the keyword SUB, followed by an expression for the element number in the array. In the case of a multidimensional array, more than one expression is given after the keyword SUB to give the location of the element in each of the array’s dimensions. The first element in an array or array dimension is numbered 1.
For instance, this is a legal (but not particularly useful) INTERCAL program with no syntax errors that shows some of the syntaxes possible with array subscripting:
PLEASE ,1 <- #2
DO .1 <- #2
DO ,1 SUB .1 <- #1
DO ,1 SUB #1 <- ,1 SUB #2
PLEASE ;1 <- #2 BY #2
DO ;1 SUB #1 #2 <- ,1 SUB ,1 SUB .1
DO READ OUT ;1SUB#1.1
DO GIVE UP
Grouping can get complicated when nested array subscripting is used, particularly with multiple subscripts. It is the programmer’s job to write an unambiguous statement, and also obey the extra grouping rules that apply to array subscripts; see Grouping Rules.
There is a wide range of statements available to INTERCAL programs; some identifiably belong to a particular variant or dialect (such as Backtracking INTERCAL), but others can be considered to be part of the ’core language’. The statements listed here are those that the C-INTERCAL compiler will accept with no compiler switches to turn on particular dialect options. Note that many statements have slightly different effects in different implementations of INTERCAL; known incompatibilities are listed here, but it’s important to check your program on multiple compilers when attempting to write a portable program.
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
yes | version 0.15+ | all versions | all versions |
One of the more commonly-used commands in INTERCAL is the syntax error. A properly-written syntax error looks nothing like any known INTERCAL command; a syntax error that looks vaguely like a command but isn’t may confuse C-INTERCAL before version 0.28, and possibly other compilers, into bailing out at compile time in some situations (this is known as a ‘serious syntax error’), and so is not portable. For other syntax errors, though, the semantics are easily explained: there is a run-time error whenever the syntax error is actually executed, and the line containing the syntax error is used as the error message.
One purpose of this is to allow your programs to produce their own custom errors at run time; however, it’s very important to make sure that they start and end in the right place, by manipulating where statement identifiers appear. Here’s a correct example from the system library:
DOUBLE OR SINGLE PRECISION ARITHMETIC OVERFLOW
This is a valid INTERCAL command that produces an error when run (note the DO at the start, which is taken as the statement identifier). An even more common use is to produce an initially abstained syntax error by using an appropriate statement identifier, for instance
PLEASE NOTE THAT THIS IS A COMMENT
This would produce an error if reinstated somehow, but assuming
that this isn’t done, this is a line of code that does
nothing, which is therefore equivalent to a comment in other
programming languages. (The initial abstention is achieved with the statement identifier PLEASE NOT; the extra E causes the command to be a syntax error, and this particular construction is idiomatic.)
Referring to the set of all syntax errors in a program (or the set of all commands of any other given type) is achieved with a special keyword known as a ‘gerund’; gerund support for syntax errors is reasonably recent, and only exists in CLC-INTERCAL (version 1.-94.-3 and later, with COMMENT, COMMENTS, or COMMENTING), and C-INTERCAL (COMMENT in version 0.26 and later, and also COMMENTS and COMMENTING in version 0.27 and later). Therefore, it is not portable to refer to the set of all syntax errors by gerund; using a line label is a more portable way to refer to an individual syntax-error command.
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
yes | all versions | all versions | all versions |
At present, the only INTERCAL command that contains no keywords (apart from the statement identifier and possibly ONCE or AGAIN) is what is known as the ‘calculate’ command. It is used to
assign values to variables, array elements, and arrays; assigning
a value to an array changes the number of elements that that
array can hold, and causes the values of all elements previously
in that array to be lost. The syntax of a calculate command is as
follows:
DO .1 <- ':2~:3'~#55
That is, the command is written as a variable or array element, then the <- operator (known as an ‘angle-worm’ and pronounced ‘gets’), then an expression to assign to it. In the special case when an array is being dimensioned by assigning a value to it, the expression can contain the keyword BY to cause the array to become multidimensional; so for a 3 by 4 by 5 array, it would be
possible to write
DO ,1 <- #3 BY #4 BY #5
The calculate command always evaluates the expression, even if for some reason the assignment can’t be done (for instance, if the variable being assigned to is read-only); this is important if the expression has side-effects (for instance, giving an overflow error). If the variable does happen to be read-only, there is not an error; the expression being assigned to it is just evaluated, with the resulting value being discarded.
The gerund to refer to calculations is CALCULATING; however, if you are planning to use this, note that a bug in older versions of C-INTERCAL means that assignments to arrays are not affected by this gerund before version 0.27. CLC-INTERCAL from 1.-94.-4 onwards, and C-INTERCAL from 0.26 onwards, allow arbitrary expressions on the left hand side of an assignment (C-INTERCAL only if the -v option is used); for more information on how such ‘reverse assignments’ work, see Operand Overloading.
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
yes | all versions | see text | all versions |
The only flow-control commands in INTERCAL-72 were NEXT, RESUME, and FORGET; together these manipulate a stack of locations in the program
known as the ‘NEXT stack’. Although all
INTERCAL compilers have implemented these,
from CLC-INTERCAL version 0.05 onwards
CLC-INTERCAL has considered them obsolete, and
therefore a special command-line switch needs to be used to
enable them. (They are still the most portable flow-control
commands currently available, though, precisely because
INTERCAL-72 implements nothing else.) Note that
there is a strict limit of 80 locations on the NEXT stack,
enforced by all known INTERCAL compilers; this
helps to enforce good programming style, by discouraging
NEXT-stack leaks (which are otherwise quite easy to write).
Here are examples to show the syntax of these three statements:
DO (1000) NEXT
DO FORGET '.1~.1'~#1
DO RESUME .5
The NEXT command takes a line label as its argument (unlike most other INTERCAL commands, it comes after its argument rather than before); both FORGET and RESUME take expressions. (CLC-INTERCAL from version 0.05 onwards also allows an expression in NEXT, rather than a label, to give a computed NEXT, but this behaviour was not implemented in other compilers, and is deprecated in CLC-INTERCAL along with noncomputed NEXT; if computed NEXT is ever implemented in C-INTERCAL, it will likely likewise be deprecated upon introduction.) (Update: it was implemented in C-INTERCAL version 0.28, but only as part of the external calls system, so it cannot be used in ordinary programs; a sample expansion library gives in-program access to a limited form of computed NEXT, but should probably not be used.) Running a NEXT causes the program control to transfer to the command whose line label is referenced, and also saves the location in the program immediately after the NEXT command on the top of the NEXT stack.
In order to remove items from the NEXT stack, to prevent it filling up (which is what happens with a naive attempt to use the NEXT command as an equivalent to what some other languages call GOTO), it is possible to use the FORGET or RESUME commands. They each remove a number of items from the NEXT stack equal to their argument; RESUME also transfers control flow to the last location removed from the NEXT stack this way. Trying to remove no items, or more items than there are in the stack, does not cause an error when FORGET is used (no items or all the items are removed respectively); however, both of these cases are errors in a RESUME statement.
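The rules for the three commands can be summarised with a toy model of the NEXT stack. The Python sketch below is illustrative only: the class and method names are hypothetical, the error type stands in for whatever run-time error a real compiler reports, and the limit is read as "at most 80 entries held at once", which is an assumption about how the limit of 80 is counted.

```python
class NextStackError(Exception):
    """Stands in for the run-time error a real compiler would raise."""


class NextStack:
    LIMIT = 80  # assumed reading of the limit: at most 80 entries held

    def __init__(self):
        self.entries = []

    def next(self, return_location):
        """NEXT: save the location just after the NEXT command."""
        if len(self.entries) >= self.LIMIT:
            raise NextStackError("NEXT stack overflow")
        self.entries.append(return_location)

    def forget(self, count):
        """FORGET: drop `count` entries; zero or too many is not an error."""
        if count > 0:
            del self.entries[-count:]  # clamps to the whole stack if too big

    def resume(self, count):
        """RESUME: drop `count` entries and return the last one removed."""
        if count == 0 or count > len(self.entries):
            raise NextStackError("cannot RESUME by %d" % count)
        target = self.entries[-count]
        del self.entries[-count:]
        return target
```

The return value of `resume` models the location to which control is transferred; a real program would continue executing from there.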
Traditionally, boolean values in INTERCAL programs have been stored using #1 and #2 as the two logic levels. This is because the easiest way to implement an if-like construct in INTERCAL-72 is by NEXTING, then NEXTING again, then RESUMING either by 1 or 2 according to an expression, and then if the expression evaluated to 1 FORGETTING the remaining NEXT stack entry. By the way, the previous sentence also explained what the appropriate gerunds are for NEXT, RESUME, and FORGET.
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
yes | all versions | all versions | all versions |
The NEXT stack is not the only stack available in an
INTERCAL program; each variable used in the
program also has its own stack, which holds values of the same
type as the variable. The STASH command pushes a variable’s value onto that variable’s stack; RETRIEVE can be used in the same way to pop the top element of a variable’s stack to replace that variable’s value. The syntax is the same as most other INTERCAL commands, with the word STASH or RETRIEVE followed by the variable or variables to stash or retrieve:
DO STASH .1 + ;2
DO RETRIEVE ,3
Note that it is possible to stash or retrieve multiple variables
at once, by listing their names separated by intersections (+); it’s even possible to stash or retrieve a
variable twice in the same statement.
It is not entirely clear how RETRIEVE interacts with IGNORE in historical INTERCAL-72 compilers; the three modern INTERCAL compilers
all use different rules for the interaction (and the
C-INTERCAL maintainers recommend that if anyone
decides to write their own compiler, they choose yet another
different rule so that looking at the interaction (the so-called
‘ignorret test’) can be used as a method of
determining which compiler is running):
RETRIEVE
simply allows a change to
its value despite the read-only status.
The appropriate gerunds for STASH and RETRIEVE are STASHING and RETRIEVING respectively.
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
yes | all versions | all versions | all versions |
Variables in INTERCAL can be either read-write
or read-only. At the start of a program, all variables are
read-write, but this status can be changed dynamically during execution of a program using the IGNORE and REMEMBER statements (whose gerunds are IGNORING and REMEMBERING respectively). The syntax is the same as for STASH and RETRIEVE: the command’s name followed by an intersection-separated list of variables. For instance:
DO IGNORE .4
DO REMEMBER ,4 + ;5
Using the IGNORE statement sets a variable to be read-only (or does nothing if it’s read-only already); REMEMBER sets it to be read-write. Any attempt to assign to a read-only variable silently fails. One place that
this is used is in the system library; instead of not assigning
to a variable in certain control flow paths, it instead sets it
to be read-only so that subsequent assignments don’t change
its value (and sets it to be read-write at the end, which
succeeds even if it was never set read-only in the first place);
the advantage of this is that it doesn’t need to remember
what flow path it’s on except in the variable’s
ignorance status.
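The read-only behaviour can be pictured with a small model. This Python sketch is an illustration under stated assumptions, not compiler code: the class `Variable` and its method names are hypothetical, and the expression is passed as a callable so that the sketch can show it being evaluated even when the assignment is discarded (as the calculate command is documented to do).

```python
class Variable:
    """Toy model of an INTERCAL variable with an ignorance status."""

    def __init__(self, value=0):
        self.value = value
        self.read_only = False

    def ignore(self):
        """IGNORE: make the variable read-only (no effect if it already is)."""
        self.read_only = True

    def remember(self):
        """REMEMBER: make the variable read-write (always succeeds)."""
        self.read_only = False

    def assign(self, expression):
        """Calculate: the expression is always evaluated, but the result is
        silently thrown away while the variable is read-only."""
        value = expression()
        if not self.read_only:
            self.value = value
```

A routine in the style of the system library could call `ignore()` on one control flow path, perform its later assignments unconditionally, and finish with `remember()` on every path.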
The interaction between IGNORE and RETRIEVE was never defined very clearly, and is in fact different in C-INTERCAL,
CLC-INTERCAL and J-INTERCAL; for more
details, see RETRIEVE.
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
yes | all versions | all versions | all versions |
The statement identifier of a statement determines whether
it’s in an abstained or reinstated state at the start of a
program; these states determine whether the statement runs at all
when it’s encountered. It is, however, possible to change
this state dynamically during a program’s execution, and
the statements to do this are rather appropriately named ABSTAIN and REINSTATE. There are two forms of each, one which takes a single line label (which must be
constant in most compilers, but can instead be an expression in
recent CLC-INTERCAL versions), and one which takes
an intersection-delimited list of gerunds. They look like this:
DO ABSTAIN FROM ABSTAINING + REINSTATING
DO ABSTAIN FROM (10)
DO REINSTATE CALCULATING
DO REINSTATE (22)
(This also illustrates the gerunds used for these commands; note that ABSTAINING from REINSTATING is generally a bad idea!) The line referenced, or every command represented by any gerund referenced, is reinstated or abstained as appropriate (effectively changing the DO to DON’T (or PLEASE to PLEASE DON’T, etc.), or vice versa). Using these forms of ABSTAIN and/or REINSTATE won’t abstain from a command that’s already abstained, or reinstate a command that’s already reinstated.
There is a strange set of restrictions on ABSTAIN and REINSTATE that has existed since INTERCAL-72; historically such restrictions have not always been implemented, or have not been implemented properly. They together define an unusual interaction of ABSTAIN and GIVE UP (note, for instance, that there isn’t a gerund for GIVE UP). The wording used in the INTERCAL-72 manual is:
[...] the statement DO ABSTAIN FROM GIVING UP is not accepted, even though DON’T GIVE UP is. [...] DO REINSTATE GIVING UP is invalid, and attempting to REINSTATE a GIVE UP statement by line label will have no effect. Note that this insures that DON’T GIVE UP will always be a "do-nothing" statement.
This restriction was not implemented at all in the only CLC-INTERCAL version before 0.02 (i.e. version 0.01), or in C-INTERCAL versions before 0.26. The restriction was implemented in C-INTERCAL version 0.26 and CLC-INTERCAL versions 0.02 and later as “GIVE UP cannot be REINSTATED or ABSTAINED FROM”; however, this is not strictly the same as the definition used by INTERCAL-72 (C-INTERCAL still uses this definition in CLC-INTERCAL compatibility mode). The J-INTERCAL implementation of this restriction is to make REINSTATING or ABSTAINING from a line label that refers to a GIVE UP statement a compile-time error, but this does not fit the INTERCAL-72 definition either. The definition adopted with version 0.27 and later of C-INTERCAL, which is hopefully correct, is to allow abstaining from a GIVE UP statement by line number but to rule out the other three cases (reinstating by line number silently fails, reinstating or abstaining by gerund is impossible because there is no gerund).
As well as CLC-INTERCAL’s extension to abstain/reinstate by computed line number, there is also (since version 0.25) a C-INTERCAL-specific extension to ABSTAIN, also known as ‘computed abstain’, but with a different syntax and different semantics. It’s written like an ordinary ABSTAIN, but with an expression between the words ABSTAIN and FROM, for instance:
DO ABSTAIN #1 FROM (1000)
DO ABSTAIN .2 FROM WRITING IN
Unlike non-computed ABSTAIN, this form allows a command to be abstained from even if it’s already been
abstained from; so if the first example command is run and line
(1000) is already abstained, it becomes
‘double-abstained’. The number of times the statement
is abstained from is equal to the number of times it was already
abstained from, plus the expression (whereas with non-computed
abstain, it ends up abstained once if it wasn’t abstained
at all, and otherwise stays at the same abstention status).
Reinstating a statement always de-abstains it exactly once; so
double-abstaining from a statement, for instance, means it needs
to be reinstated twice before it will actually execute.
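One way to picture these rules is as a per-statement counter. The following Python sketch is a model of the description above rather than of the compiler's internals; the class `Statement` and its method names are hypothetical.

```python
class Statement:
    """Toy model of a statement's abstention status as a counter."""

    def __init__(self, initially_abstained=False):
        # DON'T / PLEASE DON'T statements start out abstained once.
        self.abstain_count = 1 if initially_abstained else 0

    def abstain(self):
        """Non-computed ABSTAIN: abstained once if not abstained at all,
        otherwise the status is unchanged."""
        if self.abstain_count == 0:
            self.abstain_count = 1

    def abstain_computed(self, times):
        """Computed ABSTAIN: add the expression's value to the count."""
        self.abstain_count += times

    def reinstate(self):
        """REINSTATE: undo exactly one abstention, if there is one."""
        if self.abstain_count > 0:
            self.abstain_count -= 1

    @property
    def runs(self):
        """The statement executes only when it is not abstained at all."""
        return self.abstain_count == 0
```

With this model, a statement that has been double-abstained needs two calls to `reinstate` before `runs` becomes true again.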
There are many uses for ABSTAIN (both the computed and non-computed versions) and REINSTATE, especially when interacting with ONCE and AGAIN (see ONCE and AGAIN); the computed
version, in particular, is a major part of a particular concise
way to write conditionals and certain kinds of loops. They also
play an important role in multithreaded programs.
The READ OUT and WRITE IN commands are the output and input commands in INTERCAL; they allow communication between the program and its user. There was a numeric I/O mechanism implemented in INTERCAL-72, and it (or a trivial variant of it) has likewise been implemented in all more modern variants. However, it had some obvious deficiencies (such as not being able to read its own output) which meant that other methods of I/O were implemented in C-INTERCAL and CLC-INTERCAL.
The syntax of READ OUT and WRITE IN is the same in all cases: the name of the command followed by an
intersection-separated list of items; the form of each item, the
compiler you are using, and its command line arguments together
determine what sort of I/O is used, which can be different for
different elements in the list.
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
yes | all versions | see text | all versions |
INTERCAL-72 had its own versions of I/O commands;
these commands are available in all modern
INTERCAL compilers as well (but
CLC-INTERCAL implements output slightly
differently). To distinguish INTERCAL-72 input and output from the other more modern types of I/O, the READ OUT and WRITE IN commands must take one of the following values: a onespot or twospot variable, a single element of a tail or hybrid array, or (in the case of READ OUT) a constant, meaning that these are some examples of the possible forms:
READ OUT .1
READ OUT ;2 SUB .3:4
READ OUT #3
WRITE IN :4
WRITE IN ,5 SUB #6
The statements do what you would expect; READ OUT outputs its argument to the user, and WRITE IN inputs a number from the user and assigns it to the variable or array element referenced. (If the variable, or the array that contains the array element, happens to be read-only, the input or output still happens, but WRITE IN silently skips the assignment and throws away the input.) The formats used for input and output are, however,
different from each other and from the formats used by most
mainstream languages.
Input is achieved by writing a number in decimal, one digit at
a time, with each digit written out as a word; so to input the number 12345, a user would have to type ONE TWO THREE FOUR FIVE as input (if they were using English, the most portable choice of language). In INTERCAL-72 only
English is accepted as a language, but other compilers accept
other languages in addition. C-INTERCAL from
version 0.10 onwards accepts English, Sanskrit, Basque,
Tagalog, Classical Nahuatl, Georgian, and Kwakiutl; also
Volapük from version 0.11 onwards, and Latin from version 0.20
onwards. J-INTERCAL accepts the same languages,
except with Esperanto instead of Latin; from version 0.05 of
CLC-INTERCAL onwards, the same list of languages
as C-INTERCAL is supported (apart from Latin,
which was added in version 1.-94.-8), plus Scottish Gaelic.
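The digit-by-digit input format can be sketched in Python as follows. This is only an illustration of the format described above: the function names are hypothetical, only the ten English digit names ZERO to NINE in upper case are handled, and any alternative spellings or other languages that a particular compiler accepts are left out.

```python
DIGIT_WORDS = ["ZERO", "ONE", "TWO", "THREE", "FOUR",
               "FIVE", "SIX", "SEVEN", "EIGHT", "NINE"]
DIGIT_VALUES = {word: value for value, word in enumerate(DIGIT_WORDS)}


def parse_digit_words(line: str) -> int:
    """Turn 'ONE TWO THREE' into 123, one spelled-out digit at a time."""
    words = line.split()
    if not words:
        raise ValueError("no digits given")
    number = 0
    for word in words:
        number = number * 10 + DIGIT_VALUES[word]  # KeyError if not a digit
    return number


def spell_digits(number: int) -> str:
    """Inverse of parse_digit_words: what a user would have to type."""
    return " ".join(DIGIT_WORDS[int(digit)] for digit in str(number))
```

For example, `spell_digits(12345)` gives the text a user would type for the number 12345, and `parse_digit_words` turns that text back into the number.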
The format that output can be read in is a modified form of Roman numerals, known as ‘butchered’ Roman numerals. INTERCAL-72, C-INTERCAL and J-INTERCAL do this the same way; CLC-INTERCAL is somewhat different. The characters ‘I’, ‘V’, ‘X’, ‘L’, ‘C’, ‘D’, and ‘M’ mean 1, 5, 10, 50, 100, 500 and 1000 respectively, placing a lower-valued letter after a higher-valued letter adds them, and placing a lower-valued letter before a higher-valued letter subtracts it from the value; so ‘XI’ is 11 and ‘IX’ is 9, for instance. In INTERCAL-72, C-INTERCAL, and J-INTERCAL, a bar over a numeral multiplies its value by 1000, and writing a letter in lowercase multiplies its value by 1000000; however, CLC-INTERCAL uses lowercase to represent multiplication by 1000 and for multiplication by 1000000 writes a backslash before the relevant numeral.
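The additive and subtractive rules, and the two conventions for lowercase letters, can be tried out with a short evaluator. The Python sketch below is an illustration only: the function name and the `lowercase_factor` parameter are hypothetical, and neither overbars nor CLC-INTERCAL's backslash prefix are modelled, because a plain string cannot carry a bar and the exact layout of real output is not reproduced here.

```python
LETTER_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50,
                 "C": 100, "D": 500, "M": 1000}


def roman_value(numeral: str, lowercase_factor: int = 1_000_000) -> int:
    """Evaluate a Roman numeral written with the letters above.

    lowercase_factor is 1000000 for the C-INTERCAL convention and 1000
    for the CLC-INTERCAL convention.
    """
    values = []
    for letter in numeral:
        base = LETTER_VALUES[letter.upper()]
        values.append(base * lowercase_factor if letter.islower() else base)
    total = 0
    for position, value in enumerate(values):
        # A lower-valued letter directly before a higher-valued one is
        # subtracted; every other letter is added.
        if position + 1 < len(values) and value < values[position + 1]:
            total -= value
        else:
            total += value
    return total
```

Under this model the same string can denote different numbers in different compilers: a lowercase 'i' followed by 'I' is 1000001 with the default factor but 1001 with `lowercase_factor=1000`.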
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
no | version 0.07+ | version 0.05+ | no |
C-INTERCAL’s method of character-based
(rather than numeric) input and output is known as the Turing
Tape method; it is a binary (character-set-agnostic)
input/output mechanism. To specify that C-INTERCAL-style I/O is being used, an array must be used as the argument to READ OUT or WRITE IN; as the syntax is the same as for CLC-INTERCAL’s I/O, command-line arguments
and the capabilities of the version of the compiler being used
serve to distinguish the two mechanisms.
The character-based input writes as many characters into a tail
or hybrid array as will fit, one character in each element. The
number that’s written into the array is not the character
code, though, but the difference between the character code and
the previous character code, modulo 256. (To be precise, the
code is the new character minus the previous character, or 256
minus (the previous character minus the new character) if the
previous character had a higher character code; the
’previous character’ is the previous character from
the input, not the previous character written into the array.)
End-of-file causes 256 to be written into the array. The
concept is that of a circular tape containing all the
characters, where the program measures how many spaces it needs
to move along the tape to reach the next character. The
’previous character’ starts at 0, but is preserved throughout the entire program, even from one WRITE IN to the next.
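The input side of the Turing Tape method can be modelled in a few lines of Python. This is a sketch of the description above, not the compiler's code: the class and method names are hypothetical, and filling every remaining element with 256 once end-of-file has been reached is an assumption (the text only says that end-of-file causes 256 to be written).

```python
class TuringTapeInput:
    """Toy model of C-INTERCAL character input into an array."""

    def __init__(self, data: bytes):
        self.data = data
        self.position = 0
        self.previous = 0  # kept for the whole program run

    def write_in(self, size: int) -> list:
        """Model of WRITE IN on an array with `size` elements."""
        elements = []
        for _ in range(size):
            if self.position >= len(self.data):
                elements.append(256)  # end-of-file marker (assumed to repeat)
                continue
            code = self.data[self.position]
            self.position += 1
            # Distance moved along the circular tape since the last character.
            elements.append((code - self.previous) % 256)
            self.previous = code
        return elements
```

Because `previous` lives on the object rather than inside `write_in`, the difference for the first character of one call is measured from the last character read by the preceding call.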
Character-based output uses a similar model, but conceptually the output device moves on the inside of the tape, rather than on the outside. Therefore, the character that is actually output is the bit-reversal of the difference between the last character output before it was bit-reversed and the number found in the array (subtracting in that order, and adding 256 if the result is negative). (Rather than trying to parse the previous sentence, you may find it easier to look either at the source code to the compiler if you have it (the relevant part is binout in src/cesspool.c) or at some example C-INTERCAL programs that do text-based I/O.)
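The output rule is easier to follow as code. The Python sketch below models the sentence above; it is not the binout routine itself, the names are hypothetical, and the `encode` helper (which works out the array contents needed to print a given text) is an addition for illustration that simply inverts the same rule.

```python
def reverse_bits8(value: int) -> int:
    """Reverse the order of the 8 bits in a byte."""
    result = 0
    for bit in range(8):
        result = (result << 1) | ((value >> bit) & 1)
    return result


class TuringTapeOutput:
    """Toy model of C-INTERCAL character output from an array."""

    def __init__(self):
        self.last = 0  # last character output, before bit-reversal

    def read_out(self, elements) -> bytes:
        output = bytearray()
        for element in elements:
            # Subtract the array element from the previous position,
            # wrapping around the tape, then bit-reverse to get the byte.
            self.last = (self.last - element) % 256
            output.append(reverse_bits8(self.last))
        return bytes(output)


def encode(text: bytes, last: int = 0) -> list:
    """Array values that make read_out print `text` from state `last`."""
    elements = []
    for code in text:
        target = reverse_bits8(code)
        elements.append((last - target) % 256)
        last = target
    return elements
```

In this model, an array holding 126 followed by 64 prints the two characters "AB" when output starts from the initial state.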
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
no | see text | all versions | no |
There are also two CLC-INTERCAL-specific I/O mechanisms. These are Baudot-based text I/O (which is also implemented from C-INTERCAL version 0.27 onwards), and CLC-INTERCAL generalised binary I/O (not implemented in C-INTERCAL).
Baudot text-based I/O is specified by using a tail array as an argument to WRITE IN or READ OUT. (A tail array can also be used to specify C-INTERCAL-style Turing Tape I/O. In order to determine which is used: both C-INTERCAL and CLC-INTERCAL use their own sort of I/O unless a command-line argument instructs them to use the other.) In the case of WRITE IN, one line of input is requested from the user (C-INTERCAL requires this to be
input in Latin-1, and will then automatically convert it;
CLC-INTERCAL gives the option of various character
sets for this input as command-line options); the final newline
is removed from this line, then it is converted to extended
Baudot and stored in the tail array specified (causing an error
if the array is too small). Because Baudot is only a 5-bit
character set, each element is padded to 16 bits;
CLC-INTERCAL pads with zeros,
C-INTERCAL pads with random bits. Trying to input
at end-of-file will act as if the input were a blank line.
READ OUT is the reverse; it interprets the array as extended Baudot and converts it to an appropriate character set (Latin-1 for C-INTERCAL, or whatever was specified on the command line for CLC-INTERCAL), which is output to the user, followed by a newline. Note that the Baudot is often longer than the corresponding character in other character sets due to the need to insert shift codes; for information on the extended Baudot character set, see Character Sets.
Generalised binary I/O is specified using a hybrid array as an argument to WRITE IN or READ OUT.
Input works by reading in a number of bytes equal to the length
of the array (without trying to interpret them or translating
them to a different character set), prepending a byte with 172
to the start, padding each byte to 16 bits with random data,
then replacing each pair of consecutive bytes (that is, the
first and second, the second and third, the third and fourth,
and so on) with (the first element selected from the second
element) mingled with (the complement of the first element
selected from the complement of the second element). Output is
the exact opposite of this process. End-of-file reads a 0,
which is padded with 0s rather than random data; if a
non-end-of-file 0 comes in from the data, its padding will
contain at least one 1. Any all-bits-0-even-the-padding being
read out will be skipped.
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
yes | all versions | all versions | all versions |
The GIVE UP command causes the program to end (or, in a multithreaded program, causes the current thread to end). It is written simply as GIVE UP. There is not much else to say about it, except to mention that it is the only way to end the program without an error unless the last line of the program is TRY AGAIN, and that it has an unusual interaction with ABSTAIN; for details of this, see ABSTAIN. (Going past the last command in the program is an error.)
There is no gerund for GIVE UP; in particular, GIVING UP is a syntax error.
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
no | version 0.25+ | no | no |
The TRY AGAIN command is a very simple command with many limitations; its effect is to place the entire program in a loop. If it exists, it must be the very last command in the program (it cannot even be followed by syntax errors), and it causes execution of the program to go back to the first command. If the TRY AGAIN command is abstained or for some other reason doesn’t execute when reached, it exits the program without the error that would usually be caused by going past the last line of code.
The gerund for TRY AGAIN is TRYING AGAIN.
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
no | see text | see text | see text |
The COME FROM statement (incidentally also invented in 1972, but not in connection with INTERCAL) is the main control-flow command in CLC-INTERCAL (which deprecates NEXT), and one of two main control flow structures in other modern INTERCAL compilers. It takes either a label or an expression as its argument; these forms are noncomputed COME FROM and computed COME FROM.
Noncomputed COME FROM was implemented in version 0.5 of C-INTERCAL, but did not conform to modern-day semantics until version 0.7; it is available in every version of CLC-INTERCAL and J-INTERCAL. Computed COME FROM support is available in every version of CLC-INTERCAL and in C-INTERCAL from version 0.25 onwards, but not in J-INTERCAL; the variant NEXT FROM of COME FROM is available from CLC-INTERCAL version 1.-94.-8 and C-INTERCAL version 0.26 (both computed and noncomputed). C-INTERCAL and CLC-INTERCAL also have a from-gerund form of COME FROM and NEXT FROM, which was also implemented from CLC-INTERCAL version 1.-94.-8 and C-INTERCAL version 0.26.
The basic rule of COME FROM is that if a COME FROM statement references another statement, whenever that statement is reached, control flow will be transferred to the COME FROM after that statement finishes executing. (NEXT FROM is identical except that in addition to the COME FROM behaviour, the location immediately after the statement that was nexted from is saved on the NEXT stack, in much the same way as if the statement being nexted from was itself a NEXT.)
Here are examples of noncomputed, computed, and from-gerund COME FROM:
DO COME FROM (10)
DO COME FROM #2$'.1~#1'
DO COME FROM COMING FROM
(The last example is an infinite loop. If it said DO NEXT FROM NEXTING FROM, it would not be an infinite loop because the NEXT stack would overflow and cause an error. This also establishes the gerunds used for COME FROM and NEXT FROM.)
There are some things to be careful with involving COME FROM and NEXT FROM. First, if the statement come from or nexted from happens to be a NEXT, the NEXT doesn’t count as ’finishing executing’ until the NEXT stack entry created by the NEXT is RESUMEd to. In particular, this means that if FORGET is used to remove the entry, or a RESUME with a large argument resumes a lower entry, the COME FROM doesn’t steal execution at all.
Second, you may be wondering what happens if two COME FROMs or NEXT FROMs aim at the same line. In a non-multithreaded program (whether a program is multithreaded or not is determined by a compiler option for those compilers that support it), this is an error; but it is only an error if the statement that they both point to finishes running, and both COME FROMs or NEXT FROMs try to execute as a result (they might not if, for instance, one is abstained or has a double-oh-seven causing it not to run some of the time). If both COME FROMs or NEXT FROMs are noncomputed, however, a compiler can (but does not have to) give a compile-time error if two COME FROMs or NEXT FROMs share a label, and so that situation should be avoided in portable code. (If it is wanted, one solution that works for C-INTERCAL and CLC-INTERCAL is to use computed COME FROMs or NEXT FROMs with a constant expression.)
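The run-time rule for several COME FROMs or NEXT FROMs aiming at one line can be modelled in a few lines of Python. This is only a sketch of the rule as described above, not of how any compiler implements it; the names resolve_come_from and ComeFromCollision are invented for the illustration, and the special case of a NEXT that has not yet been RESUMEd to is not modelled.

```python
class ComeFromCollision(Exception):
    """Several statements tried to take control in a non-multithreaded program."""


def resolve_come_from(candidates, multithreaded=False):
    """Decide who takes control after a statement finishes executing.

    `candidates` is a list of (label, will_execute) pairs, one for each
    COME FROM or NEXT FROM aiming at the statement.  `will_execute` is
    False when that statement is abstained or loses its execution-chance
    roll this time round.  Returns the labels that take control; an empty
    list means execution simply falls through to the next statement.
    """
    firing = [label for label, will_execute in candidates if will_execute]
    if len(firing) > 1 and not multithreaded:
        # Only an error when more than one of them really tries to run.
        raise ComeFromCollision(f"{len(firing)} statements tried to take control")
    return firing
```

Two statements aiming at the same line are harmless in this model as long as at most one of them actually fires, which matches the description above.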
Some programming support libraries may be included automatically at the end of your program by the C-INTERCAL compiler. While such a convenience feature might be judged not in the spirit of INTERCAL, the required level of perversity is arguably restored by the way inclusion is triggered: whenever your program refers to a line from specified magic ranges without defining any line in those ranges in the program.
(CLC-INTERCAL does not have this feature, but it is trivial to concatenate a copy of any desired library onto the end of the program.)
The following table maps magic line ranges to system libraries. Descriptions of the libraries follow it.
From | To | Versions | Description |
---|---|---|---|
(1000) | (1999) | All | Basic System Library |
(5000) | (5999) | >=0.29 | Floating Point Library |
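The inclusion rule can be sketched in Python as follows. The function name libraries_to_include and the idea of passing in sets of line labels are assumptions made for the illustration; the sketch also ignores the compiler version column, so it behaves like a version in which both ranges are active.

```python
# Magic line ranges from the table above: (first, last, library name).
MAGIC_RANGES = [
    (1000, 1999, "syslib"),
    (5000, 5999, "floatlib"),
]


def libraries_to_include(referenced, defined):
    """Return the libraries that would be appended to a program.

    `referenced` holds the line labels the program refers to (for
    instance as NEXT targets); `defined` holds the labels it defines.
    """
    included = []
    for first, last, name in MAGIC_RANGES:
        refers = any(first <= label <= last for label in referenced)
        defines = any(first <= label <= last for label in defined)
        # A library is pulled in only if its range is used but not defined.
        if refers and not defines:
            included.append(name)
    return included
```

Defining even one line in a magic range suppresses inclusion for that whole range in this model, which is what lets a program supply its own replacement routines.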
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
yes | all versions | no | all versions |
INTERCAL has a system library, called ‘syslib’ (versions for bases other than 2 will have a numeric suffix on the name).
The intention of the system library is to provide a range of useful capabilities, like multiplication, that can otherwise be hard to write in INTERCAL. System library routines are used by NEXTING to their line number (see NEXT), where they will make changes to certain variables depending on certain other variables (depending on which routine is called), and RESUME back to the original program. As the system library is itself written in INTERCAL, there are some restrictions that need to be obeyed for calls to it to be guaranteed to work; none of the variables it uses (.1 to .6 and :1 to :5) should be read-only or overloaded (although the value of any variables that aren’t mentioned in the routine’s description will be preserved by the routine), and none of the lines in it should have their abstention status changed by lines outside it (this can happen with blatant infractions like DO ABSTAIN FROM (1500) or more subtle problems like gerund-abstention) or have COME FROMs or NEXT FROMs aiming at them.
The system library is currently available in all bases from 2 to 7 (see TriINTERCAL), but not every command is available in every base, and of the three compilers listed above that have the system library, C-INTERCAL is the only one that ships versions of it for bases other than 2. (The table below was originally based on the INTERCAL-72 manual, but has had extra information added for bases other than 2.) Here, “overflow checked” means that #1 is assigned to .4 if there is not an overflow, and #2 is assigned to .4 if there is; “overflow captured” means that if there is overflow, the digit that overflowed is stored in the variable referenced. In all cases, division by 0 returns 0.
Line | Description | Bases |
---|---|---|
(1000) | .3 <- .1 plus .2, error exit on overflow | 2, 3, 4, 5, 6, 7 |
(1009) | .3 <- .1 plus .2, overflow checked | 2, 3, 4, 5, 6, 7 |
(1010) | .3 <- .1 minus .2, no action on overflow | 2, 3, 4, 5, 6, 7 |
(1020) | .1 <- .1 plus #1, no action on overflow | 2, 3, 4, 5, 6, 7 |
(1030) | .3 <- .1 times .2, error exit on overflow | 2, 3, 4, 5, 6, 7 |
(1039) | .3 <- .1 times .2, overflow checked | 2, 3, 4, 5, 6, 7 |
(1040) | .3 <- .1 divided by .2 | 2, 3, 4, 5, 6, 7 |
(1050) | .2 <- :1 divided by .1, error exit on overflow | 2, 3, 4, 5, 6, 7 |
(1200) | .2 <- .1 times #2, overflow captured in .3 | 4, 6 |
(1210) | .2 <- .1 divided by #2, one digit after the quartic or sextic point stored in .3 | 4, 6 |
(1500) | :3 <- :1 plus :2, error exit on overflow | 2, 3, 4, 5, 6, 7 |
(1509) | :3 <- :1 plus :2, overflow checked | 2, 3, 4, 5, 6, 7 |
(1510) | :3 <- :1 minus :2, no action on overflow | 2, 3, 4, 5, 6, 7 |
(1520) | :1 <- .1 concatenated with .2 | 2, 3, 4, 5, 6, 7 |
(1530) | :1 <- .1 times .2 | 2, 3, 4, 5, 6, 7 |
(1540) | :3 <- :1 times :2, error exit on overflow | 2, 3, 4, 5, 6, 7 |
(1549) | :3 <- :1 times :2, overflow checked | 2, 3, 4, 5, 6, 7 |
(1550) | :3 <- :1 divided by :2 | 2, 3, 4, 5, 6, 7 |
(1700) | :2 <- :1 times #2, overflow captured in .1 | 4, 6 |
(1710) | :2 <- :1 divided by #2, one digit after the quartic or sextic point stored in .1 | 4, 6 |
(1720) | :2 <- :1 times the least significant digit of .1, overflow captured in .2 | 5, 7 |
(1900) | .1 <- uniform random number from #0 to #65535 | 2, 3, 4, 5, 6, 7 |
(1910) | .2 <- normal random number from #0 to .1, with standard deviation .1 divided by #12 | 2, 3, 4, 5, 6, 7 |
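As a rough illustration of the “overflow checked” and division-by-zero conventions, here is a Python model of routines (1009) and (1040) in base 2. It is a sketch only: the function names are invented, and two details are assumptions that the table does not state, namely that .3 keeps the low 16 bits of the sum when an overflow occurs and that division discards the remainder.

```python
ONESPOT_MAX = 0xFFFF  # largest value of a 16-bit onespot variable in base 2


def add_overflow_checked(dot1, dot2):
    """Model of (1009): returns (.3, .4) for the inputs .1 and .2."""
    total = dot1 + dot2
    overflowed = total > ONESPOT_MAX
    dot3 = total & ONESPOT_MAX     # assumption: the low 16 bits are kept
    dot4 = 2 if overflowed else 1  # #2 on overflow, #1 otherwise
    return dot3, dot4


def divide(dot1, dot2):
    """Model of (1040): .3 <- .1 divided by .2, with x / 0 defined as 0."""
    # Assumption: the quotient is truncated to an integer.
    return 0 if dot2 == 0 else dot1 // dot2
```

A caller would test .4 after the routine returns, much as add_overflow_checked(65535, 1) reports an overflow through its second result here.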
If you happen to be using base 2, and are either using the external call system (see External Calls) or are willing to use it, it is possible to use a version of the system library written in C for speed, rather than the default version (which is written in INTERCAL). To do this, use the command line options -eE (before the INTERCAL file), and syslibc (at the end of the command line).
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
yes | all versions | no | all versions |
INTERCAL also has a floating-point library, called ‘floatlib’, presently available only in base 2. It is used by several of the demonstration programs shipped with the distribution. In versions after 0.28 it is included automatically at the end of your program by the compiler whenever your program refers to a line from (5000) to (5999) without defining any line in that range in the program.
Here is a summary of routines in floatlib.i:
Line | Description |
---|---|
(5000) | :3 <- :1 plus :2 |
(5010) | :3 <- :1 minus :2 |
(5020) | :2 <- the integer part of :1, :3 <- the fractional part of :1 |
(5030) | :3 <- :1 times :2 |
(5040) | :3 <- :1 divided by :2 |
(5050) | :3 <- :1 modulo :2 |
(5060) | :2 <- :1 cast from a two’s-complement integer into a floating-point number |
(5070) | :2 <- :1 cast from a floating-point number into the nearest two’s-complement integer |
(5080) | :2 <- :1 cast from a floating-point number into a decimal representation in scientific notation |
(5090) | :2 <- :1 cast from a decimal representation in scientific notation into a floating-point number |
(5100) | :2 <- the square root of :1 |
(5110) | :2 <- the natural logarithm of :1 |
(5120) | :2 <- e to the power of :1 (the exponential function) |
(5130) | :3 <- :1 to the power of :2 |
(5200) | :2 <- sin :1 |
(5210) | :2 <- cos :1 |
(5220) | :2 <- tan :1 |
(5400) | :1 <- uniform random number between zero and one exclusive |
(5410) | :2 <- :1 times phi |
(5419) | :2 <- :1 divided by phi |
Note: All of the above routines except (5020), (5060), (5080), (5200), (5210), and (5400) also modify .5 as follows: .5 will contain #3 if the result overflowed or if the arguments were out of domain, #2 if the result underflowed, #1 otherwise. (See below.)
The INTERCAL floating-point library uses the IEEE format for 32-bit floating-point numbers: bit 31 is the sign bit (1 being negative), bits 30 through 23 hold the exponent with a bias of 127, and bits 22 through 0 contain the fractional part of the mantissa with an implied leading 1. In mathematical notation:
‘N = (1.0 + fraction) * 2^(exponent - 127) * (-1)^sign’
Thus the range of floating-point magnitudes is, roughly, from 5.877472*10^-39 up to 6.805647*10^38, positive and negative. Zero is specially defined as all bits 0. (Actually, to be precise, zero is defined as bits 30 through 0 all being 0. Bit 31 can be 1 to represent negative zero, which the library generally treats as equivalent to zero, though don’t hold me to that.)
Note that, contrary to the IEEE standard, exponents 0 and 255 are not given special treatment (besides the representation for zero). Thus there is no representation for infinity or not-a-numbers, and there is no gradual underflow capability. Conformance with widely-accepted standards was not considered to be a priority for an INTERCAL library. (The fact that the general format conforms to IEEE at all is due to sheer pragmatism.)
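The layout described above can be checked with a small Python decoder. It is a sketch of the format as documented here rather than code taken from floatlib; the name decode_floatlib is invented, and negative zero is simply returned as 0.0.

```python
def decode_floatlib(word):
    """Interpret a 32-bit twospot value using the layout described above."""
    sign = (word >> 31) & 1
    exponent = (word >> 23) & 0xFF
    fraction_bits = word & 0x7FFFFF
    if exponent == 0 and fraction_bits == 0:
        return 0.0  # zero; bit 31 may be set for negative zero
    fraction = fraction_bits / 2**23
    # Exponents 0 and 255 get no special treatment, unlike standard IEEE.
    return (1.0 + fraction) * 2.0 ** (exponent - 127) * (-1) ** sign
```

Feeding it the smallest and largest nonzero bit patterns, 0x00000001 and 0x7FFFFFFF, reproduces the approximate magnitude limits quoted above.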
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
no | version 0.7+ | version 1.-94.-8+ | no |
One extension to INTERCAL that is implemented by both C-INTERCAL and CLC-INTERCAL is known as TriINTERCAL, and extends INTERCAL to bases other than binary. Unlike ordinary INTERCAL programs, which have the extension ‘.i’, TriINTERCAL programs in bases from 3 to 7 (the only allowed bases) have extensions from ‘.3i’ to ‘.7i’ respectively.
The change of numeric base only affects expressions, and in particular the behaviour of operators, and the range of variables. (The onespot and twospot ranges become the highest number of trits or other digits in the base required that fit inside the binary ranges, so for instance, the maximum value of a onespot variable in ternary is 59048, or 3 to the power 10 minus 1.) Interleave/mingle is the simplest to explain; it alternates digits just as it alternated bits in binary. The other operators all change, as follows:
Exclusive-or is replaced by two operators (? and ^). (In Princeton syntax, these are the bookworm or yen sign, and a spike (|).) The two operators do the same thing in binary, but differ in higher bases. (Nevertheless, it is an error to use a sharkfin in binary, because it is a so-called ‘controlled unary operator’, as are the rest of the new operators defined in this section, which has a lower limit on which base it is allowed in.) Instead of doing the exclusive-or operation, the digits being combined are either subtracted or added; if the result is out of range for the base being used, the base is added to or subtracted from the result until it is in range. (For the subtraction, the digit that was less significant is subtracted from the digit that was more significant in any given pair of digits, except for the subtraction between the most and least significant digits, where the most significant digit is subtracted from the least.)
The way to think of it is this: in base 2, an AND gives the result 0 if either argument is a 0, and otherwise a 1, and likewise, an OR gives the result 1 if either argument is a 1, and otherwise a 0; they could be said to favour 0 over 1 and 1 over 0 respectively. In base 3, AND favours 0 over 2 over 1, OR favours 2 over 1 over 0, and BUT favours 1 over 0 over 2. (The symbol for BUT is @ (a ‘whirlpool’, which is another name for the BUT operation) in Atari syntax, and ? in Princeton syntax.) The pattern continues: in base 4, AND favours 0 over 3 over 2 over 1, BUT favours 1 over 0 over 3 over 2, 2BUT (written 2@ or 2?) favours 2 over 1 over 0 over 3, and OR favours 3 over 2 over 1 over 0. (This can be extended to higher bases following the same pattern, introducing operators 3@ or 3?, etc., to favour 3, etc., when neither AND (which always favours 0) nor OR (which favours the highest digit in the base) is available.) All the whirlpool operators are controlled unary operators, which are legal only when the base contains the favoured digit and they aren’t redundant with AND or OR.
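The digit-level rules above can be modelled in Python. This is a sketch under stated assumptions, not the compiler's implementation: all function names are invented, the operators are identified by what they do rather than by their symbols, and the result for each pair of digits is assumed to land at the position of the pair's less significant digit, as happens with the binary unary operators.

```python
def digits_of(value, base, width):
    """Little-endian list of the `width` digits of `value` in `base`."""
    return [(value // base**i) % base for i in range(width)]


def from_digits(digits, base):
    return sum(d * base**i for i, d in enumerate(digits))


def unary(value, base, width, combine):
    """Combine each digit with its more significant neighbour, wrapping
    round so that the most significant digit is paired with the least."""
    d = digits_of(value, base, width)
    out = [combine(d[(i + 1) % width], d[i]) for i in range(width)]
    return from_digits(out, base)


def subtract_without_borrow(base):
    # The second digit of the pair is subtracted from the first.
    return lambda first, second: (first - second) % base


def add_without_carry(base):
    return lambda first, second: (first + second) % base


def favouring(k, base):
    """Digit operation that favours k, then k-1, and so on, wrapping below 0.
    k = 0 behaves as AND, k = base - 1 as OR, anything between as a BUT."""
    rank = lambda digit: (k - digit) % base
    return lambda first, second: min(first, second, key=rank)
```

In base 2 both arithmetic variants collapse to exclusive-or, so unary(77, 2, 16, subtract_without_borrow(2)) and unary(77, 2, 16, add_without_carry(2)) give the same answer, while in base 3 favouring(1, 3) reproduces the “1 over 0 over 2” ordering given for BUT.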
Note that the base doesn’t affect anything other than variable ranges and expressions; in particular, it doesn’t affect the bit-reversal used by Turing Tape I/O. (The tape still has characters written on it in binary, even though the program uses a different base.)
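The onespot ranges and the digit-wise mingle described earlier in this section can be reproduced with a short Python sketch. The function names are invented for the illustration, and the digit order used by mingle (the left operand supplying the more significant digit of each pair) is an assumption carried over from binary mingle.

```python
def onespot_width(base):
    """Largest number of base-`base` digits whose full range fits in 16 bits."""
    width = 0
    while base ** (width + 1) <= 2**16:
        width += 1
    return width


def onespot_max(base):
    return base ** onespot_width(base) - 1


def mingle(left, right, base):
    """Interleave the onespot digits of `left` and `right`."""
    result = 0
    for i in range(onespot_width(base)):
        l = (left // base**i) % base
        r = (right // base**i) % base
        # Each pair of digits occupies two adjacent positions in the result.
        result += (l * base + r) * base ** (2 * i)
    return result
```

For base 3 this gives a width of 10 digits and a maximum of 59048, matching the figure quoted above.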
The multithreading and backtracking extensions to INTERCAL were originally invented by Malcolm Ryan, who implemented COME FROM-based multithreading as a modified version of C-INTERCAL, known as Threaded INTERCAL, but did not implement backtracking. (The same functionality is implemented in C-INTERCAL today, but with different code. Most likely, this means that the original code was better.) He also invented the original version of Backtracking INTERCAL, but did not implement it; the only known implementation is the C-INTERCAL one. A different version of multithreading, using WHILE, was implemented as part of CLC-INTERCAL (like all extensions first available in CLC-INTERCAL, it is most likely due to Claudio Calvelli) and then added to C-INTERCAL, although its implications were not noticed for some time afterwards.
So nowadays, three freely-mixable threading-like extensions to INTERCAL exist, all of which are implemented in C-INTERCAL. (A fourth, Quantum INTERCAL, is implemented in CLC-INTERCAL but not C-INTERCAL, and so will not be discussed further here.) If you’re wondering about the description of backtracking as a threading-like extension, it’s implemented with much of the same code as multithreading in C-INTERCAL, because the INTERCAL version can be seen as roughly equivalent to multithreading where the threads run one after another rather than simultaneously. (This conceptualisation is probably more confusing than useful, though, and is also not strictly correct. The same could probably be said about INTERCAL as a whole, for that matter.)
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
no | version 0.25+ | version 0.05+ | no |
The original multithreading implementation worked by giving a new meaning to what was previously an error condition. If, in a multithreaded program (a program is marked as multithreaded using options to a compiler), two or more COME FROMs or NEXT FROMs (or a mixture of these) attempt to steal control simultaneously, the original thread splits into multiple threads, one for each of the commands trying to take control, and a different command gains control of the program in each case.
From then on, all the threads run simultaneously. The only thing shared between threads (apart from the environment in which they run) is the abstained/reinstated status of each command; everything else is separate. This means, for instance, that it’s possible to change the value of a variable in one thread, and it will not affect the corresponding variable in other threads created this way. Likewise, there is a separate NEXT stack in each thread; if both a COME FROM and a NEXT FROM aim at the same line, for instance, the NEXT FROM thread will end up with a NEXT stack entry that isn’t in the COME FROM thread, created by the NEXT FROM itself. This is known as unwoven thread creation; none of the threads created this way are ‘woven’ with any of the other threads created this way. (Whether threads are woven depends on how they were created.) If the thread being split was itself woven with other threads, exactly one of the resulting threads after the split is woven with the threads that the original thread was woven to, but the rest will not be woven to anything. (If that seems a somewhat unusual rule: well, this is INTERCAL.)
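What an unwoven split shares and what it copies can be sketched in Python. The Thread class and split function are invented for the illustration and are not taken from the compiler; the rule about which resulting thread stays woven with earlier threads is not modelled.

```python
import copy


class Thread:
    def __init__(self, position, variables, next_stack, abstained):
        self.position = position      # label the thread will run next
        self.variables = variables    # private to each unwoven thread
        self.next_stack = next_stack  # private to each thread
        self.abstained = abstained    # one table shared by every thread


def split(thread, takers, after_origin):
    """Create one unwoven thread per statement that tried to take control.

    `takers` is a list of (kind, label) pairs with kind "COME FROM" or
    "NEXT FROM"; `after_origin` is the location just after the statement
    being come from, which a NEXT FROM saves on its own NEXT stack.
    """
    threads = []
    for kind, label in takers:
        next_stack = list(thread.next_stack)
        if kind == "NEXT FROM":
            next_stack.append(after_origin)
        threads.append(Thread(label, copy.deepcopy(thread.variables),
                              next_stack, thread.abstained))
    return threads
```

After a split, assigning to a variable in one resulting thread leaves the others untouched, whereas a change to the abstention table is visible to all of them.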
In C-INTERCAL, there are other guarantees that can be made about unwoven threads (that is, threads not woven to any other thread). In particular, they can all be guaranteed to run at approximately the same speed; to be more precise, the number of commands that have been given the chance to execute in any given thread will not differ by more than 2 from the number of commands that have been given the chance to execute in any other thread that was created at the same time. (However, COME FROMs and NEXT FROMs can make this relationship less precise; it is unspecified (in the technical sense that means the compiler can choose any option it likes and change its mind on a whim without telling anyone) whether a COME FROM or NEXT FROM aiming at the current command counts towards the command total or not, thus causing the relationship to become weaker the more of them have the chance to execute. In versions of C-INTERCAL from 0.27 onwards, there is a third guarantee: that if a COME FROM comes from itself, it will actually give other threads at least some chance to run, at some speed, by counting itself as a command every now and then; previously this requirement didn’t exist, meaning that a COME FROM could block all threads if it aimed for itself due to the speed restrictions and the fact that COME FROMs need not count towards the total command count.) Also, all commands, including any ONCE or AGAIN attached to the command, are atomic; this means that it’s impossible for another thread to conflict with what the command is doing. (In a departure from the usual INTERCAL status quo, these guarantees are somewhat better than in most other languages that implement threading, amusingly continuing to leave INTERCAL with the status of being unlike any other mainstream language.)
The only way to communicate between unwoven threads is by changing the abstention status of commands; this always affects all threads in the program, whether woven or not. (The combination of ABSTAIN and ONCE is one way to communicate atomically, due to the atomic nature of ONCE.)
If there are at least two threads, the GIVE UP command ends the current thread, rather than the current program.
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
no | version 0.27+ | version 0.05+ | no |
The WHILE command (which is not strictly speaking a command, but more a sort of metacommand that joins commands) is a second method of achieving multithreading. (In CLC-INTERCAL, there are at least two other meanings for WHILE, but only the one implemented in C-INTERCAL is discussed here.) The syntax is somewhat unusual, and consists of two commands separated by the WHILE keyword, but sharing the statement identifier, execution chance, and any ONCE/AGAIN keyword that may be present. For instance:
(1) DO COME FROM ".2~.2"~#1 WHILE :1 <- "'?.1$.2'~'"':1/.1$.2'~#0"$#65535'"$ "'"'&.1$.2'~'#0$#65535'"$#0'~#32767$#1"
(OK, maybe that’s an unnecessarily complicated example, and maybe it shouldn’t have included the / operator which is part of another INTERCAL extension (see Operand Overloading). Still, I thought that maybe you’d want to see how addition can be implemented in INTERCAL.)
A WHILE command starts two threads (the original thread that ran that command and a new one), one of which runs the command to the left of the WHILE and one of which runs the command to the right. Any line number applies to the left-hand command, not the WHILE as a whole, which is a metalanguage construct. NEXTING FROM, ABSTAINING FROM or similar behaviour with respect to the WHILE itself is impossible, although it’s certainly possible to abstain from either of its operands (and abstaining from the left operand has much the same effect as abstaining from the WHILE itself; the right-hand thread deliberately takes a bit of time to get started just so that this behaviour happens). The right-command thread starts just before the left command is run (so NEXTING, etc., directly to the left command will not start that loop in the first place); if the right command finishes (which may be almost immediately for something like a calculate command, or take a long time for something like NEXT), that thread loops and reruns it as long as the left command has not finished; COMING FROM the left command, or a NEXT/NEXT FROM from/aiming at it, doesn’t count as finishing that command until it is RESUMEd back to (if possible; if it’s come from, that command can never end and the right-hand loop will continue forever, or until it GIVEs UP or the loop ends due to the command ending later in another thread).
A WHILE command itself exists across all threads of a multithreaded program in a way; for each left-hand command that ends (in any thread), the next time a right-hand command of the same WHILE ends it will cause the thread it’s looping in to end, regardless of whether that thread corresponds to the thread in which the left-hand command ended. (As well as a right-hand command ending, there’s also the possibility that it never got started; there is a delay before the right-hand command runs during which a left-hand command ending can prevent the right-hand thread starting in the first place; this counts as the same sort of event as terminating a right-hand loop, and can substitute for it anywhere a right-hand command ending is mentioned.) There is one exception, in that if two or more left-hand commands end in a space of time in which no right-hand commands for that WHILE end, they together only cause one right-hand command to end. (What, did you expect the logical and orthogonal behaviour?)
The two threads produced by a WHILE (the original thread and a new copy of it) have more in common than ordinary INTERCAL threads created by COME FROM; ordinary threads share only ABSTAIN/REINSTATE information, whereas the WHILE-produced threads count as ‘woven’ threads which also share variables and stashes. (They still have separate instruction pointers, separate instruction pointer stacks, such as the NEXT stack, and separate choicepoint lists. Overloading information is shared, though.) Being woven is a relationship between two or more threads, rather than an attribute of a thread, although a thread can be referred to as being unwoven if it is not woven to any other thread.
Ordinary multithreading cannot create woven threads. When threads are created by multiple COME FROMs from an original thread, which was woven with at least one other thread, one of the resulting threads counts as the ‘original’ thread and remains woven; the rest are ‘new’ threads which initially start out with the same data as the original, but are not woven with anything. Backtracking in a thread (see Backtracking) causes it to unweave with any threads it may be woven with at the time (so the data in the thread that backtracks is set back to the data it, and the threads it was woven with at the time, had at the time of the MAYBE, but the other threads continue with the same data as before). The only way to cause three or more threads to become woven is with a new WHILE inside one of the threads that is already woven, which causes all the new threads to be woven together (the weaving relationship is transitive).
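One way to picture weaving is as several threads holding a reference to the same data store. The Python sketch below follows that picture; the class and function names are invented, the store stands in for variables, stashes and overloading information together, and nothing here is claimed to mirror the compiler's actual data structures.

```python
import copy


class WeavableThread:
    def __init__(self, store, next_stack=()):
        self.store = store                  # same object in all woven threads
        self.next_stack = list(next_stack)  # always separate per thread


def start_while(thread):
    """WHILE: the new thread is woven with the thread that ran it."""
    return WeavableThread(thread.store, thread.next_stack)


def split_by_come_from(thread):
    """COME FROM split: a 'new' thread starts from a copy of the data."""
    return WeavableThread(copy.deepcopy(thread.store), thread.next_stack)


def unweave(thread):
    """What backtracking does first: stop sharing data with other threads."""
    thread.store = copy.deepcopy(thread.store)


def woven(a, b):
    return a.store is b.store
```

Because weaving is identity of the store, a WHILE run inside an already woven thread automatically yields a thread woven with all the others, which is the transitivity mentioned above.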
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
no | version 0.25+ | no | no |
A somewhat unusual threading construct that’s available is backtracking. In case you haven’t come across it before (the concept exists in other languages but is implemented differently and usually in a less general way), the basic idea is that instead of executing or not executing a command, you can MAYBE execute a command. This causes the command to be executed, but also creates a dormant thread in which the command wasn’t executed; at any time later, the program can either decide that it liked the consequences of the command and GO AHEAD and get rid of the dormant thread, or decide that it didn’t like the consequences of the command and GO BACK to the dormant thread, discarding the current one. The dormant thread is more commonly called a ‘choicepoint’, that is, a point at which a choice was made but a different choice can still be made, and is generally not thought of as a thread at all by most programmers. (In case you’re wondering: dormant threads are always unwoven.)
To create a choicepoint, the statement identifier MAYBE is used, rather than the more usual DO or PLEASE. (Combination statement identifiers are still allowed, but must be in the order MAYBE PLEASE DO NOT with optionally some parts omitted, or different versions of NOT used, or both.) Here’s an example:
MAYBE DON'T GIVE UP
When a command whose statement identifier contains MAYBE is reached, it is executed or not executed as normal, but in addition a choicepoint is created containing the program as it is at that time. Only ABSTAIN and REINSTATE, which always affect all threads in a program (even choicepoints), can alter the values stored in the choicepoint; so in this way, a choicepoint is also somewhat similar to the concept of a continuation in other languages. The choicepoint is placed on a choicepoint stack, which is maintained separately for each thread in much the same way that stashes and the NEXT stack are.
The choicepoint does not actually do anything immediately, but if the program doesn’t like the look of where it’s ended up, or it decides to change its mind, or just wants to try all the possibilities, it can call the GO BACK command (which has no arguments, and is just the statement identifier, optional execution chance, GO BACK, and optional ONCE or AGAIN). This causes the current thread to unweave from all other threads and then replace itself with the thread created by the choicepoint on top of the choicepoint stack. The difference is that this time, the abstention or reinstatement status of the command that was modified with MAYBE is temporarily reversed for determining whether it runs or not (this reversal only lasts immediately after the GO BACK, and does not affect uses of the command in other threads or later in the same thread), so unless it has been ABSTAINed or REINSTATEd in the meantime it will run if and only if it wasn’t run the first time. The choicepoint stack’s top entry is replaced by a ‘stale’ choicepoint that definitely isn’t a thread; attempting to GO BACK to a stale choicepoint instead causes the stale choicepoint to be deleted and the program to continue executing. (This is what gives INTERCAL’s backtracking greater flexibility in some ways than some other languages; to get backtracking without the stale choicepoints having an effect, simply run COME FROM the GO BACK as the previous statement.)
Note, though, that when a thread splits into separate threads (whether woven or unwoven), the choicepoint stack doesn’t split completely, but remains joined at the old top of stack. The two choicepoint stacks can add and remove items independently, but an attempt to GO BACK to before the current thread split off from any other threads that are still running instead causes the current thread to end, although it will GO BACK as normal if all other threads that split off from it or that it split off from since the top choicepoint of the stack was created have ended since. This means that it’s possible to backtrack past a thread splitting and get the effect of the thread unsplitting, as long as both resulting threads backtrack; this is another way in which INTERCAL’s backtracking is more flexible than that of some other languages.
If, on the other hand, a program decides that it likes where it is and doesn’t need to GO BACK, or it wants to GO BACK to a choicepoint lower down the stack while skipping some of the ones nearer the top of the stack, it can run the GO AHEAD command, which removes the top choicepoint on the stack, whether it’s a genuine choicepoint or just a stale one.
Both GO AHEAD and GO BACK cause errors if an attempt is made to use them when the choicepoint stack is empty.
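The life cycle of a choicepoint (created by MAYBE, consumed by GO BACK and left stale, discarded by GO AHEAD) can be sketched in Python for a single thread. The Backtracker class is invented for the illustration; it does not model the temporary reversal of the MAYBE command's abstention status, the abstention information shared with choicepoints, or stacks that remain joined after a thread split.

```python
import copy

STALE = object()  # marker left behind once a choicepoint has been used


class BacktrackError(Exception):
    pass


class Backtracker:
    def __init__(self, state):
        self.state = state      # stand-in for the thread's data
        self.choicepoints = []  # top of the stack is the end of the list

    def maybe(self):
        """MAYBE: remember the program as it is right now."""
        self.choicepoints.append(copy.deepcopy(self.state))

    def go_back(self):
        """GO BACK: returns True if the thread was actually rewound."""
        if not self.choicepoints:
            raise BacktrackError("GO BACK with an empty choicepoint stack")
        top = self.choicepoints[-1]
        if top is STALE:
            self.choicepoints.pop()  # delete it and carry on executing
            return False
        self.state = top
        self.choicepoints[-1] = STALE
        return True

    def go_ahead(self):
        """GO AHEAD: drop the top entry, whether genuine or stale."""
        if not self.choicepoints:
            raise BacktrackError("GO AHEAD with an empty choicepoint stack")
        self.choicepoints.pop()
```

In this model a second GO BACK in a row does not rewind anything: it only removes the stale entry, which is the behaviour that the text credits for the extra flexibility.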
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
no | version 0.26+ | version 0.05+ | no |
(Operand overloading in C-INTERCAL is nowhere near as advanced as it is in CLC-INTERCAL. This chapter only explains the partial implementation used by C-INTERCAL; for a full implementation, see CLC-INTERCAL and its documentation.)
Operand overloading is a method of using a onespot or twospot variable as a substitute for an expression. When a variable is overloaded to an expression (which could be another variable, or something more complex), any uses of that variable cause the expression to be substituted instead.
At the beginning of the program, all variables stand for themselves; so .1 really does mean .1, for instance. The meaning of a variable can be overloaded using the slat operator (/), which is the same in both Princeton and Atari syntax: it is a binary operator whose left argument must be a onespot or twospot variable and whose right argument can be any expression. The slat operator returns the true value of its left argument, but as a side effect, changes the meaning of its left argument to be its right argument. Here is an example:
DO .1 <- .2/'.3~.4'
The example causes .2’s true value to be assigned to .1 (unless of course .1 is read-only), but also causes .2 from then on to actually mean '.3~.4', except when it’s the left operand of a slat operator. So for instance, DO .1 <- .2 would actually assign '.3~.4' to .1. Somewhat confusingly, this also works in the other direction; DO .2 <- .1 would assign .1 to '.3~.4', which would have the effect of changing the values of .3 and .4 so that '.3~.4' had the correct value, or throw an error if it couldn’t manage this. (The general rule in this case is that any variable or constant in the expression that overloads the variable is at risk of being changed; this is known as a ‘reverse assignment’. Code like DO .1 <- .1/#1 is entirely capable of changing the value of #1, although to protect new INTERCAL users C-INTERCAL will refuse to carry out operations that change the value of constants unless a command-line switch (see -v) is used to give it permission. In C-INTERCAL, changing the value of a constant only changes meshes with that value, but in CLC-INTERCAL it can also change non-mesh uses of that constant, so doing so is not portable anyway.)
When multiple overloading rules are in effect, they are all applied; overloading .1 to '.2~.3' and .2 to '.3$.4' will cause .1 to refer to ''.3$.4'~.3'. However, this expansion stops if this would cause a loop; to be precise, overloading is not expanded if the expansion is nested within the same expansion at a higher level (so .1/.2 and .2/.1 together cause .1 to expand to .2, which expands to .1, which cannot expand any further). In C-INTERCAL, the expression on the right hand side of a slat is not evaluated and not expanded by operand overloading.
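The expansion rule, including the way it stops at a loop, can be expressed as a short recursive function in Python. This is a sketch of the rule as worded above; the representation of expressions as strings and (operator, left, right) tuples, and the function name expand, are assumptions made for the illustration.

```python
def expand(expr, rules, active=frozenset()):
    """Apply overloading rules to an expression.

    An expression is either a variable or constant name (a string) or
    an (operator, left, right) tuple.  `rules` maps an overloaded
    variable to the expression it currently stands for; `active` holds
    the variables whose expansion we are already nested inside.
    """
    if isinstance(expr, tuple):
        op, left, right = expr
        return (op, expand(left, rules, active), expand(right, rules, active))
    if expr in rules and expr not in active:
        return expand(rules[expr], rules, active | {expr})
    return expr  # not overloaded, or expanding again would loop
```

With the rules from the text, expand(".1", {".1": ("~", ".2", ".3"), ".2": ("$", ".3", ".4")}) yields the tuple form of ''.3$.4'~.3', and the mutually overloaded pair .1/.2 and .2/.1 settles back on .1.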
STASHING a variable causes its overloading information to be stashed too; RETRIEVING it causes its overload rule to also be retrieved from the stash (or any overload rule on the variable to be removed if there wasn’t one when the variable was stashed).
Overloading a onespot variable to a twospot variable or vice versa is possible, but the results are unlikely to be predictable, especially if a onespot variable is used to handle a twospot value. Possible outcomes include truncating the value down to the right bitwidth, throwing an error if a value outside the onespot range is used, and even temporarily handling the entire twospot value as long as it doesn’t end up eventually being assigned a value greater than twospot.
Note that reverse assignments can cause unpredictable behaviour if an attempt is made to reverse-assign the same variable twice in the same expression. In particular, sequences of commands like
DO .1 <- .2/'.3$.3'
DO .2 <- #6
are liable to succeed assigning garbage to .3 rather than failing as they ought to do, and likewise any situation where a variable is reverse-assigned twice in the same expression may assign garbage to it. This behaviour is seen as unsatisfactory, though, and plans exist to improve it for future versions.
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
no | version 0.26+ | no | no |
PIC-INTERCAL is a simplified version of INTERCAL designed especially for embedded systems; it aims to minimise code and data usage by INTERCAL programs so that they can fit on devices whose memory is measured in bytes rather than megabytes. (It is named after the first microcontroller for which code was successfully generated, and which influenced the choices of commands, the PIC16F628 manufactured by Microchip, and is most likely to be portable to other microcontrollers in the same range.) C-INTERCAL only compiles as far as C code when producing PIC-INTERCAL; it is up to the user to find the appropriate cross-compiler to translate this C into the relevant dialect of machine code. (Two header files in the distribution, src/pick1.h and src/pick2.h, don’t have any effect on the compiler but are referenced by the generated code, and the intent is for the user to change them to suit the behaviour of the PIC compiler used, because these are not as standardised as C compilers for everyday systems.)
There are several restrictions on PIC-INTERCAL programs:
ABSTAIN and REINSTATE still work, but cannot be computed ABSTAINs, and will not necessarily work when used to affect the system library or calls to it.
READ OUT and WRITE IN don’t work. (See below for a replacement.)
COME FROM and NEXT FROM must aim at a label, not an expression or gerund.
In order to provide I/O capabilities, a new command
PIN
is available. It controls up to 16 I/O pins on
the PIC or other embedded system; an I/O pin is capable of
receiving or sending voltages to an electrical or electronic
circuit. This explanation assumes that the device being
controlled is a PIC16F628A, and therefore has its pins in two
blocks of 8 named ‘PORTA’ and
‘PORTB’; for other microcontrollers,
adapting the code in src/pick1.h is likely to be
necessary to tell the compiler how to control the I/O pins, and
the way in which this is done will determine which I/O pins
the program will end up being able to communicate with.
The PIN
command takes one twospot variable as its
argument, like this:
DO PIN :2
The twospot variable is conceptually divided into 4 blocks of 8
bits. The highest two blocks control the directions of the pins
in PORTB
(most significant block) and
PORTA
(second most significant block); a 1 on any
bit means that the corresponding I/O pin should be set to send
data, and a 0 means that it should be set to receive data. The
lower two blocks control the values on the pins that are sending
(and are ignored for receiving pins); the second least
significant block controls PORTB
and the least
significant block controls PORTA
, with a 1 causing
the program to set the output voltage to that of the
microcontroller’s negative voltage supply rail, and a 0
causing the program to set the output voltage to that of the
microcontroller’s positive voltage supply rail. (These
voltages may vary on other systems; consult your system’s
datasheet and the changes you made to the header files.) After
setting the pins, the PIN
command then reads them as
part of the same operation, this time setting the values of the
lower blocks that are receiving, rather than setting the pins
from the lower blocks that are sending. However, 1 and 0 bits on
all bits of the twospot variable have the opposite meaning when
doing this, so that 1 means receiving/positive voltage rail and 0
means sending/negative voltage rail. There is no way to input
without output, or vice versa, but it’s trivial to just
send the same output again (which has no effect, because the
voltage on sending pins is maintained at the same level until it
is changed), or to ignore the input received.
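As a rough illustration of the bit layout just described, the following C sketch packs and unpacks a PIN word. The function names are invented for this example and are not part of C-INTERCAL; which bit within a block corresponds to which physical pin is not stated above, so the sketch treats each block as an opaque byte.

```c
#include <stdint.h>

/* Hypothetical helpers; only the block layout is taken from the text. */

/* Build the twospot value handed to PIN.
   dir_b, dir_a: a 1 bit marks a pin that should send, a 0 bit one that
                 should receive.
   val_b, val_a: levels for sending pins; 1 selects the negative supply
                 rail and 0 the positive supply rail. */
uint32_t pin_pack(uint8_t dir_b, uint8_t dir_a, uint8_t val_b, uint8_t val_a)
{
    return ((uint32_t)dir_b << 24)   /* most significant block: PORTB direction */
         | ((uint32_t)dir_a << 16)   /* next block: PORTA direction */
         | ((uint32_t)val_b << 8)    /* next block: PORTB values */
         |  (uint32_t)val_a;         /* least significant block: PORTA values */
}

/* Pull the two value blocks back out of a word, for example after PIN
   has filled in the bits belonging to receiving pins. */
uint8_t pin_portb_values(uint32_t word) { return (uint8_t)((word >> 8) & 0xFFu); }
uint8_t pin_porta_values(uint32_t word) { return (uint8_t)(word & 0xFFu); }
```

For instance, pin_pack(0xFF, 0x00, 0x01, 0x00) describes a word in which every PORTB pin is sending, every PORTA pin is receiving, and only the lowest PORTB value bit is 1. Remember that bits read back for receiving pins use the opposite sense, with 1 meaning the positive rail.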
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
no | version 0.28+ | see text | no |
The CREATE
command allows the creation of new syntax
at runtime. CLC-INTERCAL has had such a command
since 1.-94.-8, but its syntax is completely different and
incompatible with the C-INTERCAL version, and so is
not documented here (see the CLC-INTERCAL
documentation for more details). The C-INTERCAL
version is only defined if the -a
option is used on
the command line (and a runtime error E000 otherwise), because it
forces the operand overloading code to be introduced and so slows
down every variable access in the program.
The syntax of the CREATE
command is to write
CREATE
, then a line label, then anything. OK, well
not quite anything; you’re restricted to syntax that is
supported by the ‘just-in-case’ compiler that runs on
comments at compile time just in case they gain a meaning later
(see below). The anything provides an example statement to
CREATE
; statements which look the same (but may
differ in details) are created. Typical syntax for a
CREATE
statement would therefore look something like
this:
DO CREATE (5) SWITCH .1 WITH .2
There is also computed CREATE
, working identically
to ordinary CREATE
except that the line number is
taken from an expression and the created command must start with
a letter (to avoid an ambiguity if the expression giving the line
label happens to be an array reference), with a syntax like this:
DO CREATE .5 SWITCH .1 WITH .2
Here, a new SWITCH WITH
statement (there is no such
statement in INTERCAL normally) is being
created. This command makes it possible to do this:
DO SWITCH .3 WITH .4
Normally that line would be an error (E000) due to being unrecognised, but having been CREATEd, it’s now a real statement. (The gerund to affect created statements is COMMENT, just like before they were created; the gerund to affect CREATE itself is CREATION (CREATING is also allowed, but not as elegant).) When the created statement is encountered, it NEXTs to line (5), the line number specified in the CREATE statement. In order for the code there to be able to affect the variables mentioned in the statement, the variables :1601 (for the first variable or expression mentioned), :1602 (for the second variable or expression mentioned), and so on, are STASHed and then overloaded to the respective expressions or variables mentioned in the created command; so :1601 has been overloaded to mean .3 and :1602 has been overloaded to mean .4 at this point. Then, the code at (5) runs; if it returns via a RESUME #1, :1601 and :1602 will be RETRIEVEd automatically and the program will continue from after the created statement. (If you do not resume to that point, say if you’re creating a flow control statement, you’ll have to deal with the stashes for the 1600-range variables yourself.)
So what syntax is available in created statements? All the
capital letters except ‘V’ (which is an
operator in INTERCAL) are available and can be
used freely and as many times as desired; they match themselves
literally. However, they are not allowed to spell an
INTERCAL keyword at any point (so watch out
for DO
and FROM
, for instance).
Whitespace is allowed, but is ignored (both in the
CREATE
template statement, and in the code being
created; so DO SW ITCH :8 WITH :50
will also have
been created). Then, there are three groups of matchable data:
scalar variables (onespot or twospot variables, as used in the
examples above) match other scalar variables, array elements
(like ,4 SUB '.5~.6'
) match other array elements,
and other expressions match other expressions. Two
matchable data may not appear consecutively in a created command,
but must be separated by at least one capital letter (to prevent
array-subscript-related ambiguities; remember that the
just-in-case compiler has to compile these statements at compile
time without knowing what they are). The actual expressions used
in the CREATE
statement don’t matter;
they’re just examples for the runtime to match against.
It is also possible (from C-INTERCAL version 0.29 onwards) to create new operators. Such operators are always binary operators (that is, they take two arguments and parse like mingle or select), and always return 32-bit results. There are three types of legal names for such operators, all of which are treated equivalently: lowercase letters, punctuation marks otherwise unused in INTERCAL, and overstrikes consisting of a character, a backspace, and another character (apart from overstrikes already used for built-in INTERCAL operators). The syntax for creating an operator looks like one of these:
DO CREATE (5) x
DO CREATE .5 =
The arguments to the operator will be overloaded onto :1601 and :1602 (which are, like with CREATEd statements, stashed before the overloading happens), and the return value is read from :1603 (which is stashed, then overloaded to itself). All three of these variables are retrieved again after the operator finishes evaluating.
Note that it is a very unwise idea to use a CREATEd operator in the expression for a computed COME FROM or NEXT FROM, because this always leads to an infinite regress; whenever any line label is reached (including the line label that the CREATE statement pointed at), the expression needs to be evaluated in order to determine whether to COME FROM that point, which in turn involves evaluating lines which have labels.
Some other points: a newer CREATE
statement
supersedes an older CREATE
statement if they give
equivalent templates, multiple CREATE
statements may
aim at the same line (this is the recommended technique for
creating a statement that can handle expressions even if
they’re array elements or variables; you do this by
specifying multiple templates in multiple CREATE
statements), and strange things happen if a twospot variable in
the 1600-range is used as an argument to a created statement
itself (because of the stash/retrieve, such a variable can
usually be read, but may not always be able to be written without
the data being lost).
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
no | version 0.28+ | no | no |
C-INTERCAL has a feature allowing INTERCAL and non-INTERCAL code to be mixed. This is achieved by causing the non-INTERCAL programs to participate in the INTERCAL line-numbering model. The same feature allows expansion libraries to be linked into the code.
To create a combined program containing
INTERCAL and non-INTERCAL
code, use ick
as the compiler as normal, but specify
both the INTERCAL and
non-INTERCAL source files on the command line,
and use the -e command-line option. ick
will invoke other compilers as necessary, after modifying the
source files accordingly. At present, external calls are only
supported to and from C and Funge-98.
In each case, it will be the INTERCAL program
that is invoked first. (This means that it is impossible to link
together more than one INTERCAL program, but
you probably don’t want to, because concatenating the
programs is likely to have a similar effect.) You can get the
INTERCAL program to NEXT
to the
non-INTERCAL program immediately, or the
non-INTERCAL program to COME FROM
or NEXT FROM
the INTERCAL program
immediately, to obtain the effect of running the
non-INTERCAL program first.
Note that external calls are incompatible with PIC-INTERCAL and
with multithreading; note also that you must use gcc
as your compiler, and GNU cpp and ld, for them
to work in the current version of C-INTERCAL.
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
no | version 0.28+ | no | no |
Linking C and INTERCAL programs is achieved
by placing various constructs into the C programs that are
equivalent to various INTERCAL constructs.
It is possible to simulate a line with a label and a dummy
command (which serves as a COME FROM
suckpoint and
NEXT
target), a command with a line label,
NEXT
, RESUME
, and
FORGET
, and COME FROM
and NEXT
FROM
. Onespot and twospot variables are accessible from
inside the C program, where they can be read and written;
however, the INTERCAL program cannot access
any variables inside the C program that weren’t part of
the INTERCAL program originally.
To prevent various logical impossibilities, there are restrictions on where these can be used and what preparation is needed before they are used. Also, the semantics are not always exactly what you might expect for technical reasons.
It should be observed that the INTERCAL link
intrudes on the user namespace. To prevent possible namespace
clashes, no identifiers starting with ick_
or
ICK_
should be used anywhere in the linked C
program for any reason, except where specified in this manual.
For a C program to be connected to an INTERCAL program, it needs to be marked with the correct header file, and needs to have functions marked for communication with the INTERCAL program.
#include <ick_ec.h>
The header file ick_ec.h must be included
using the preprocessor in any file which uses any of the
INTERCAL external call functions,
variables, or macros. (Note that this file may not
necessarily exist, or exist in the usual place;
ick
will deal with making sure the correct
header file is included.) This will include
stdint.h (a standard C header file, which
must exist on your system), so that you can access
INTERCAL variables (the
INTERCAL types onespot, twospot, tail,
hybrid correspond to the C types uint16_t, uint32_t,
uint16_t*, uint32_t* respectively); also, it will provide
the prototypes for all the functions and definitions for
all the macros needed to use the external calls system
with C.
ICK_EC_FUNC_START
ICK_EC_FUNC_END
Many of the INTERCAL interface macros
(ick_linelabel
, ick_comefrom
,
and ick_nextfrom
) make it possible to jump
from an INTERCAL program to a C
program. Because C doesn’t allow jumping into the
middle of a function, there has to be some way to create
a block of code which can be jumped into. This
is what these two macros achieve.
This declaration and definition:
ICK_EC_FUNC_START(identifier)
{
  /* code goes here */
}
ICK_EC_FUNC_END
is equivalent to this:
void identifier(void)
{
  /* code goes here */
}
except that it is possible to jump from an
INTERCAL program into the declared and
defined program. (If you need to write a prototype for
the function early, void identifier(void);
is perfectly acceptable, but an early prototype is not
required unless you call the function from earlier within
the C code.) Of course, you can substitute any identifier
that’s legal as a function name for
identifier
(as long as it doesn’t
start with ick_
or ICK_
). The
resulting function is a function (for instance, you can
take its address or call it in the usual ways); the only
differences are that it can be jumped into from
INTERCAL code and that it is
constrained to take no arguments and return no data. (It
can still access global and INTERCAL
variables.) If the function is jumped into from INTERCAL code, but then control flow reaches the end of the function, or the function executes return; without having been called from C, the resulting behaviour is undefined; C-INTERCAL will attempt to continue by some means at that point, but may fail. If a function is unsure whether it gained control from C or from INTERCAL code, it may use ick_return_or_resume (described below).
Because you are not allowed to declare two C functions
with the same name (even in different modules), all
functions declared with ICK_EC_FUNC_START
must have unique names across the entire compilation.
It is sometimes necessary for a C program to do its own
initialisation before the INTERCAL program starts running. To
do so, it can use the ick_startup
macro inside a
function declared with ICK_EC_FUNC_START
; the
syntax is ick_startup(block)
, where the argument
is an expression, statement, or compound statement to run.
The argument itself must not contain any ick_-prefixed macros
or functions except possibly ick_create, may have side
effects, and must fit the C preprocessor’s idea of what
a macro argument should look like (it’s more used to
parsing expressions than blocks; the general rule is to avoid
commas except when they’re directly or indirectly
inside parentheses or strings).
A line label is something that can be NEXTed to and COME FROM. Unlike an INTERCAL line label, it does not label a statement, and therefore attempts to ABSTAIN or REINSTATE it may be errors, or may be ignored (it’s unspecified which, which means that either may happen for any or no reason, but exactly one will happen in any given case, although the choice might not be consistent).
The macro ick_linelabel(expression);
may appear
anywhere a compound statement would normally be able to
appear. (That is, it looks like a function call being used as
a standalone expression, but in fact the places where it can
appear are more limited.) In contrast to ordinary line
labels, an expression can be used rather than just a
constant; however, the behaviour is undefined if the
expression has side-effects. Upon encountering the line
label, any COME FROM
s or NEXT FROM
s
aiming at the line label (including
ick_comefrom
s and ick_nextfrom
s)
will steal control from the program; RESUMING
after a NEXT FROM
will work, but suffers from
the same caveats as setjmp/longjmp do (any auto variables
that change their value between the NEXT FROM
and RESUME
will have their value clobbered (i.e.
their value is no longer reliable and should not be
accessed)). Note that the INTERCAL
variables are immune to this problem. You can also avoid the
problem by marking variables as volatile
in the
C program.
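The setjmp/longjmp caveat can be seen in plain standard C, without involving the external calls system at all. The sketch below (count_returns is a name invented for the example) changes an auto variable between setjmp and longjmp; because the variable is marked volatile, its value is still reliable after the jump, which is the same remedy suggested above for NEXT FROM and RESUME.

```c
#include <setjmp.h>

/* Plain standard C; nothing here comes from ick_ec.h.
   Jumps back to the setjmp point `limit` times and returns how many
   jumps were counted. */
int count_returns(int limit)
{
    jmp_buf env;
    volatile int seen = 0;  /* volatile keeps the value valid across longjmp */

    setjmp(env);            /* control also re-enters here after each longjmp */
    if (seen < limit) {
        seen++;             /* changed between setjmp and longjmp */
        longjmp(env, 1);
    }
    return seen;
}
```

If the volatile qualifier were removed, the C standard would leave the value of seen indeterminate after the longjmp, which is the clobbering described above.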
It is possible to NEXT or ick_next to an ick_linelabel, which has the same effect as saving the NEXT stack, calling the function containing the ick_linelabel and then immediately doing a C goto to an imaginary label preceding it. Due to this possibility, an ick_linelabel is only allowed within a function defined with ICK_EC_FUNC_START.
In INTERCAL programs, labels don’t
stand on their own, but instead label a statement. The
difference between a standalone line label and a line label
that labels a statement is that COME FROM
s will
come from the label itself (which is before the next
statement) when aiming at a standalone line label, but the
end of the statement when aiming at a labeled
statement. To achieve the same effect in C, the macro
ick_labeledblock
is available; it can be used as
ick_labeledblock(expression,expression)
or
ick_labeledblock(expression,statement)
; the
first argument is the label, and the second argument is an
expression or statement to label (if an expression is
labeled, it will be converted to a statement that evaluates
it for its side effects and discards the result). It is even
permitted to label a block statement in this way. Note,
however, that you have to contend with the C
preprocessor’s ideas of where macro arguments begin and
end when doing this. Other than the position of the
COME FROM
target created by the label, this
behaves the same way as ick_linelabel
(so for
instance, computed line labels are allowed, but the
expression that computes them must not have side effects, and
it is only allowed within a function defined with
ICK_EC_FUNC_START
).
The ick_comefrom
and ick_nextfrom
macros are, like the other INTERCAL flow
control macros (as opposed to functions), only allowed within
a function defined with ICK_EC_FUNC_START
. They
act almost exactly like the INTERCAL
statements of the same name (although note that C statements
cannot be ABSTAIN
ed FROM
even if
they act the same way as INTERCAL
statements); they are written as
ick_comefrom(expression);
and
ick_nextfrom(expression);
respectively (note
that they must be called as statements, and cannot be used as
part of an expression). Whenever a standalone line label is
encountered whose expression evaluates to the same number as
the expression inside the ick_comefrom
or
ick_nextfrom
, and that number is at most 65535,
then control will be transferred to the
ick_comefrom
or ick_nextfrom
,
leaving a NEXT
stack entry behind in the case of
ick_nextfrom
; likewise, if the end of a labeled
statement, expression or block is reached and the label has
the right number. Some caveats: the expression need not be
constant, but must not have side effects, must not be
negative, and must fit into the range of an unsigned
long
in the C program (and the statement will do
nothing if the expression evaluates to a value larger than
65535). In keeping with the best C traditions, these caveats
are not checked, but instead result in undefined behaviour if
breached.
There are also versions ick_comefromif
and
ick_nextfromif
, which take a second parameter,
which is a condition that specifies whether control is
actually stolen from the target. The condition may have side
effects, and is only run when the line numbers match; it
should return 0 or NULL to leave control flow alone, or
nonzero to steal control, and should be either an integral
type or a pointer type. Although side effects are allowed,
the condition must not look at or alter auto
or
register
variables in the enclosing function,
not even if they are also marked volatile
.
(Global and static
variables are fine, though.)
ick_next
is a macro that acts like the
INTERCAL statement NEXT
.
Contrary to the other INTERCAL-like
macros, it can be used in any function regardless of whether
it was defined with ICK_EC_FUNC_START
; however,
it must still be used as a statement by itself, and a call to
it looks like ick_next(expression);
. The
expression is the label to NEXT
to, and works
under the same rules as the expressions for
ick_comefrom
; it need not be constant (unlike in
C-INTERCAL!), but must not have side effects,
must not be negative, must fit into the range of an unsigned
long, and is ignored if it is over 65535. If there happen to
be multiple labels with the correct value at the time, the
compiler will NEXT
to one of them. Bear in mind
that there is a limit of 80 entries to the NEXT
stack, and that this limit is enforced.
If the resulting NEXT
stack entry is
RESUME
d to, the program will continue after the
ick_next
as if via setjmp
, with all
the usual restrictions that that entails; if the resulting
NEXT
stack entry is forgotten, then the
ick_next
call will never return. (Note that the
’as if via setjmp’ condition allows you to
preserve the values of auto
and
alloca
-allocated storage as long as its value
has not changed since the ick_next
was called,
which is a significantly more lenient condition than that
normally imposed on such variables (see External Calls and auto).)
ick_resume
is a macro, but there are few
restrictions on its use; it is permitted to use it inside an
expression (but it returns void, making this not particularly
useful), and acts like a function which takes an unsigned
short argument, returns void, and has a prototype (but you
cannot take its address; if you need to be able to do that,
write a wrapper function for it). It can be used within any
function regardless of how it was declared, and never
returns; instead, it pops the specified number of
NEXT
stack entries and resumes execution at the
last one popped, just as the INTERCAL
statement does. This causes the same errors as the
INTERCAL statement if the number of
entries popped is zero or larger than the NEXT
stack.
There is also a macro, invoked as ick_return_or_resume(); which can only be used inside a function defined with ICK_EC_FUNC_START, and is equivalent to return; if the function was called from C, or ick_resume(1); if the function was called from INTERCAL. It’s therefore a safe way to return from such a C function if you don’t know how control reached it in the first place.
The ick_forget
macro removes NEXT
stack entries, and the corresponding C stack entries. It must
be called as a statement by itself, and its invocation looks
like this: ick_forget(expr);
, where the
expression is the number of NEXT
stack entries
to forget (all of them will be forgotten if the number is
higher than the number of entries). The expression will be
cast to an unsigned short.
ick_forget
can only be used inside a function
declared with ICK_EC_FUNC_START
. As it is
removing stack entries both in INTERCAL
and in C, it will clobber the value of all auto
variables created since the highest remaining
NEXT
stack entry came into being (or since the
start of the program, if the NEXT
stack is
emptied by the command) and also deallocate all
alloca
storage allocated since then. It also
causes the return address of the current function to become
undefined, so that function must not return; control may
leave it via RESUME
, or via COME
FROM
, or via NEXT
or NEXT
FROM
followed by the relevant NEXT
stack
entry being forgotten (the function is still
’running’ but suspended while the
NEXT
stack entry still exists). (Note that these
restrictions are stronger than those on RESUME
;
this is because RESUME
preserves most of the
stack, but FORGET
destroys parts of the stack
and therefore cannot avoid destroying the data stored there.
It could be much worse; a previous (never released) version
of the code didn’t remove those parts of the stack in
many circumstances, leading to a stack leak that caused
programs to segfault after a while.)
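The NEXT stack bookkeeping described for ick_next, ick_resume and ick_forget can be modelled in a few lines of ordinary C. This is only a sketch of the counting rules: the type and function names are invented for the example, error conditions are reported as return codes rather than INTERCAL errors, and the C stack unwinding that the real runtime performs is ignored entirely. It also assumes that the limit of 80 means the stack can hold exactly 80 entries.

```c
#include <stddef.h>

#define NEXT_STACK_LIMIT 80   /* limit quoted in the text */

/* Hypothetical model; each entry stands in for a saved return point. */
struct next_stack {
    unsigned entries[NEXT_STACK_LIMIT];
    size_t depth;
};

/* NEXT: push a return point. Returns 0, or -1 if the stack is full. */
int model_next(struct next_stack *s, unsigned return_point)
{
    if (s->depth >= NEXT_STACK_LIMIT)
        return -1;
    s->entries[s->depth++] = return_point;
    return 0;
}

/* RESUME: pop n entries and continue at the last one popped.
   Returns 0 and stores that return point, or -1 if n is zero or
   larger than the current depth. */
int model_resume(struct next_stack *s, unsigned short n, unsigned *return_point)
{
    if (n == 0 || n > s->depth)
        return -1;
    s->depth -= n;
    *return_point = s->entries[s->depth];
    return 0;
}

/* FORGET: discard n entries, or all of them if n exceeds the depth. */
void model_forget(struct next_stack *s, unsigned short n)
{
    s->depth = (n > s->depth) ? 0 : s->depth - n;
}
```

The asymmetry between the last two functions mirrors the text: resuming past the bottom of the stack is an error, while forgetting too many entries simply empties the stack.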
These four functions make it possible to get and set INTERCAL scalar variables from C code. Their prototypes are as follows:
uint16_t ick_getonespot(unsigned short varnumber);
void ick_setonespot(unsigned short varnumber, uint16_t newvalue);
uint32_t ick_gettwospot(unsigned short varnumber);
void ick_settwospot(unsigned short varnumber, uint32_t newvalue);
The program will error out with a fatal error (see E200) if the variable you request is mentioned
nowhere in the INTERCAL program; if you
attempt to set an IGNORE
d variable, the attempt
will silently fail (just as if you assigned to it in an
INTERCAL program). The get functions are
safe to use in a computed line label, so you can use them to
produce computed line labels that depend on
INTERCAL variables. (uint16_t
and uint32_t
are standard C data types; if your
system doesn’t provide them, get better system header
files.)
If you care about speed, note that .1 is the fastest variable of all to access, and otherwise variables first mentioned near the top of the INTERCAL program will be faster to access than variables mentioned lower down.
The ick_create
function (prototype: void
ick_create(char*, unsigned long)
) allows the external
calls system to be used to create new
INTERCAL syntax; to do this, you give a
‘signature’ representing the syntax you want to
define and a line number to the function (which are its two
arguments, respectively). The signature defines the syntax
that you are defining; whenever that syntax is encountered
within the INTERCAL program, it will
NEXT
to the line number you specify, which can
do various clever things and then RESUME
back to
the INTERCAL program (or if you’re
defining a flow-control operation, you might want to leave
the NEXT
stack entry there and do other things).
However, note that the overloading of :1601
,
etc., will still take place as in the
INTERCAL version of CREATE
if
the -a option is used (see -a), so care is needed when writing flow
control statements that they work both with and without the
option and don’t cause STASH
leaks (which
means no FORGET
ting the relevant
NEXT
stack entry, and no looking at 1600-range
variables). This allows the external calls system to define
whole new INTERCAL commands, with the same
power as any other programming language.
There are various restrictions on what syntax you can
CREATE
with this method, which are best
explained by an explanation of the relevant
C-INTERCAL compiler internals. When an
INTERCAL program is compiled by
C-INTERCAL, any unrecognised statements it comes
across are compiled by a ‘just-in-case’ compiler
that attempts to compile them anyway with no knowledge of
their syntax, just in case the syntax becomes defined later.
(E000 (see E000) will be thrown when such
statements are encountered at runtime, unless the syntax has
been CREATE
d since to give a statement a
meaning.) For the just-in-case compiler to run, the resulting
statement must be completely unrecognised; this means that it
may contain no keywords (not even a sequence of letters that
forms a keyword, such as FROM
or
DO
), it must consist only of variable names,
expressions, and capital letters other than
‘V’ (because
‘V’ is a unary operator, so
otherwise there would be ambiguity), and in which any two
variable names or expressions are separated by at least one
capital letter. The compiler will produce a
‘signature’ for the unknown command that can be
defined.
A signature consists of a sequence of characters (and is
represented as a null-terminated string; the runtime makes a
shallow copy of the string and keeps it until the end of the
program, so arrangements must be made to ensure that the
storage in which the string is allocated stays around that
long, but this opens up interesting possibilities in which
the signature that was actually CREATE
d can be
modified retroactively); whitespace is not allowed in a
signature. Capital letters can be used (apart from
‘V’), and match the same capital
letters literally in the INTERCAL syntax
being created; also available are the special characters
‘.,;~’, which match respectively a
scalar variable (a onespot or twospot variable such as
:1
), an array variable (such as
;2
), an array element (such as ,3 SUB #4
#5
), and an expression that isn’t a variable
name and isn’t an array element (such as
.4$.5
). If you want to be able to match other
things (say, to be able to match all expressions), you will
need to submit multiple signatures using multiple calls to
ick_create
; maybe you could write a library to
do that automatically.
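To make the character rules for statement signatures concrete, here is a small checker written in standalone C. The function name is invented for this example, and the checker covers only the rules quoted above: capital letters other than V, the four special characters, no whitespace, and no two special characters in a row (the just-in-case compiler requires a capital letter between any two matched items, so such a signature could never match anything). Rejecting the empty string is an extra assumption, and embedded keywords such as DO or FROM are not detected.

```c
#include <string.h>

/* Hypothetical checker for the character rules of a statement signature.
   Returns 1 if the string passes those rules and 0 otherwise. */
int signature_looks_valid(const char *sig)
{
    int prev_was_match = 0;   /* was the previous character one of . , ; ~ ? */

    if (sig == NULL || *sig == '\0')
        return 0;             /* assumption: an empty signature is useless */
    for (; *sig != '\0'; sig++) {
        if (strchr("ABCDEFGHIJKLMNOPQRSTUWXYZ", *sig) != NULL) {
            prev_was_match = 0;          /* literal capital letter, no V */
        } else if (strchr(".,;~", *sig) != NULL) {
            if (prev_was_match)
                return 0;                /* two matched items in a row */
            prev_was_match = 1;
        } else {
            return 0;                    /* whitespace, V, or anything else */
        }
    }
    return 1;
}
```

Under these rules, the SWITCH .1 WITH .2 template used earlier would presumably correspond to the signature SWITCH.WITH. (with no whitespace), which the checker accepts.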
CREATEd operators also have signatures, but of quite a different form. The signature for a single-character operator is a lowercase u, followed by its character code in hexadecimal (no leading zeros, and in lowercase); the signature for an overstrike is a lowercase o, followed by the lower relevant character code in hexadecimal, followed by a lowercase x, followed by the higher relevant character code in hexadecimal.
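The operator signature format is easy to generate mechanically. The following sketch builds both kinds of signature with snprintf; the helper names are invented for this example, and character codes are passed in as numbers so that the caller decides which character set they come from.

```c
#include <stdio.h>

/* Hypothetical helpers. Each returns 0 on success, or -1 if the buffer
   is too small to hold the signature and its terminating null. */

/* Single-character operator: 'u' followed by the code in lowercase
   hexadecimal without leading zeros, e.g. code 0x78 gives "u78". */
int single_char_op_signature(char *buf, size_t size, unsigned char code)
{
    int n = snprintf(buf, size, "u%x", (unsigned)code);
    return (n > 0 && (size_t)n < size) ? 0 : -1;
}

/* Overstrike: 'o', the lower code, 'x', then the higher code. */
int overstrike_op_signature(char *buf, size_t size,
                            unsigned char a, unsigned char b)
{
    unsigned lo = (a < b) ? a : b;
    unsigned hi = (a < b) ? b : a;
    int n = snprintf(buf, size, "o%xx%x", lo, hi);
    return (n > 0 && (size_t)n < size) ? 0 : -1;
}
```

Assuming ASCII, the operator x (code 0x78) would get the signature u78, and an overstrike of = (0x3d) with / (0x2f) would get o2fx3d whichever order the two characters were typed in.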
The routine that is NEXT
ed to will presumably
want to be able to see what in the
INTERCAL program was matched by the
signature, so a range of function-like macros is provided to
access that. They must be run from within the invocation of
the function which was NEXT
ed into by the
created syntax (see External Calls and auto for
when a function invocation ends, which could be sooner than
you think when the C-INTERCAL external calls
system is used), and are undefined behaviour when that
invocation did not gain control from a CREATE
d
statement. Here are their effective prototypes:
int ick_c_width(int);
int ick_c_isarray(int);
unsigned short ick_c_varnumber(int);
uint32_t ick_c_value(int);
/* These require -a to work */
uint32_t ick_c_getvalue(int);
void ick_c_setvalue(int, uint32_t);
The first argument to all these macros is the position of the match in the signature (0 for the first non-capital-letter match in the signature, 1 for the second, and so on until no more items are left in the signature to match); specifying a position that isn’t in the signature is undefined behaviour.
ick_c_width
returns the data type, as a width in
bits, of the expression (or the width in bits of an element
of the passed in array), and ick_c_isarray
returns 1 if the argument was an array variable or 0 if it
was an expression (array elements and scalar variables are
expressions). ick_c_varnumber
returns the
variable’s number (for instance 123 for
.123
), or 0 if the corresponding argument was
not a variable; in the cases where the argument was a
variable, these three functions together provide enough
information to figure out which variable (which is useful if
you’re writing an extension which takes a variable name
as an argument).
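As a sketch of how those three results combine, the helper below turns a width, an array flag and a variable number into the usual INTERCAL spelling of the variable. It is standalone C with an invented name; in a real extension the three inputs would come from ick_c_width, ick_c_isarray and ick_c_varnumber for the same position. The mapping of 16-bit and 32-bit arrays to the tail and hybrid sigils follows the type correspondence given for ick_ec.h.

```c
#include <stdio.h>

/* Hypothetical helper: writes a name such as ".123" or ";2" into buf.
   Returns 0 on success, or -1 if the inputs do not describe a variable
   or the buffer is too small. */
int format_variable_name(char *buf, size_t size,
                         int width, int is_array, unsigned short number)
{
    char sigil;
    int n;

    if (number == 0)
        return -1;                      /* 0 means "not a variable" */
    if (width == 16)
        sigil = is_array ? ',' : '.';   /* tail array or onespot */
    else if (width == 32)
        sigil = is_array ? ';' : ':';   /* hybrid array or twospot */
    else
        return -1;                      /* no other widths are described */
    n = snprintf(buf, size, "%c%u", sigil, (unsigned)number);
    return (n > 0 && (size_t)n < size) ? 0 : -1;
}
```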
ick_c_value
returns the value of the
corresponding expression at the time the CREATE
d
command was called; ick_c_getvalue
is almost
equivalent, but only works if the -a option (see
-a) was used during compilation, and
returns the value of the corresponding expression now. (The
uint32_t return type is large enough to hold either a onespot
or twospot value, and will be zero-extended if the
corresponding expression had onespot type.)
ick_c_setvalue
also depends on -a,
and will assign to the corresponding expression (be careful
not to provide a value that is too large for it!). In the
case that the corresponding expression is not a variable,
this will attempt to perform a reverse assignment to the
expression, and can produce ordinary
INTERCAL errors if it fails. It is not
possible to redimension an array this way, as this is
assignment, not a calculate operation.
Because the external calls system merges the INTERCAL NEXT stack with the C return value and data storage stack (note for pedants: the C standards nowhere mandate the existence of such a stack, or even mention one, but the restrictions stated in them imply that implementations have to act as if such a stack existed, because of the way the scoping rules and recursion work), it has severe effects on data that happens to be stored there. (In INTERCAL terms, imagine what would happen if data could be stored on the NEXT stack; if C used the more sensible system of having a STASH for each variable, these problems would never occur in the first place, instead causing an entirely different set of problems.) Similar considerations apply to the common nonstandard C extension alloca, which dynamically alters the size of the stack; also, in what follows, register variables should be considered to be auto, because the compiler may choose to allocate them on the stack. Theoretical considerations would lead one to conclude that variable-length arrays should obey most of the same restrictions; in practice, though, it’s unwise to attempt to mix those with INTERCAL code at all, except by separating them into separate functions which aren’t flagged with ICK_EC_FUNC_START and use no ick_-prefixed identifiers, even indirectly. (They may cause a compile to fail completely because they don’t mix well with goto.)
In the description below, INTERCAL commands should be taken to include the equivalent C macros.
NEXT/NEXT FROM paired with RESUME have the least effect, and the most obvious effect, on auto variables in the function that was NEXTed from, which is the same effect that the standard C function longjmp has. That is, alloca storage stays intact, and auto variables have their values ‘clobbered’ (that is, their value is no longer reliable and should not be used) if they changed since the corresponding NEXT and are not marked as volatile. (This is a very easy restriction to get around, because changing the values of such variables is quite difficult without using statically-allocated pointers to point to them (a dubious practice in any case), and volatile is trivial to add to the declaration.)
COME FROM has more restrictions; it deallocates all alloca storage in the function that was COME FROM, and in functions that called it or that called functions that called it, etc., using C calls (as opposed to NEXT), and those invocations of the functions will cease to exist (thus destroying any auto variables in them), even in the case of COMING FROM a function into the same function. Any auto variables in the function that is come into will start uninitialised, even if initialisers are given in their declaration, and it will be a ‘new’ invocation of that function. (It is quite possible that the uninitialised values in the auto variables will happen by chance to have the values they had in some previous invocation of the function, though, because they are likely to be stored in much the same region of memory; but it is highly unwise to rely on this.) Note that volatile will not help here. Observant or source-code-reading readers may note that there is a mention of an ick_goto in the source code to C-INTERCAL; this is undocumented and this manual does not officially claim that such a macro exists (after all, if it did, what in INTERCAL could it possibly correspond to?), but if such a macro does exist it obeys the same restrictions as COME FROM.
FORGET is the worst of all in terms of preserving data on the stack; it deallocates alloca data and clobbers or deletes auto variables in all function invocations that have come into existence since the NEXT that created the topmost remaining NEXT stack entry was called, or since the start of the program if the NEXT stack is emptied, and the current function will continue in a new invocation. Marking variables as volatile is useless in preventing this, because the relevant parts of the stack where the data were stored are deleted by the command (that’s what FORGET does: it removes stack).
If any of these data are required, they have to be backed up into static storage (variables declared with static or global variables), or into heap storage (as with malloc), or other types of storage (such as temporary files) which are not on the stack. (Incidentally, suddenly deleting parts of the stack is excellent at confusing C debuggers; but even RESUME and COME FROM tend to be sufficient to confuse such debuggers. More worrying is probably the fact that the C standard provides a portable method for deleting the stack like that, and in fact the external calls runtime library is written in standard freestanding-legal C89 (with the exception of +printflow debug output which requires a hosted implementation), meaning that in theory it would be possible to split it out to create an implementation of a C-plus-COME-FROM-and-NEXT language, and doing so would not be particularly difficult.)
Note that INTERCAL variables are not stored on the C stack, nor are any of the metadata surrounding them, and so are not affected unduly by control flow operations.
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
no | version 0.29+ | no | no |
C-INTERCAL supports linking INTERCAL programs with Funge-98 programs (to be precise, only Befunge-98 programs are currently supported). However, it does not ship with a Funge-98 interpreter, and such an interpreter needs to be linked to the resulting program in order to run the Befunge program. Therefore, you need to convert a third-party Funge-98 interpreter to a library usable by C-INTERCAL before you can use this part of the external calls system (see Creating the Funge-98 Library); however, this only has to be done once.
Once the library has been created, you can link an INTERCAL program with a Befunge-98 program by invoking ick like this:
ick -e intercalprogram.i befungeprogram.b98
You can link no more than one Befunge-98 program at once (just like you can link no more than one INTERCAL program at once). Also, the INTERCAL program must come first on the command line.
It is legal to link INTERCAL, C, and Befunge-98 simultaneously; however, the identifiers used in the third-party Funge-98 interpreter have not been mangled to avoid collisions, and therefore problems may be caused if the C program uses the same identifiers as the Funge-98 interpreter.
Before external calls to Funge-98 can be used, the relevant library must be compiled. (After the library has been compiled, you will need to reinstall C-INTERCAL; however, you will not need to recompile C-INTERCAL.)
At present, only the cfunge Funge-98 interpreter (https://launchpad.net/cfunge/+index) can be converted into a library suitable for use by C-INTERCAL; also, doing this is only supported on POSIX systems (although if someone gets it to work on DOS/Windows, the author of this manual would be interested to hear about it). Also, a source-code distribution (rather than a binary distribution) is needed. One way to obtain the latest cfunge sources is via the bzr version-control system, using the following command (correct as of the time of writing, but as always, links can become dead):
bzr branch lp:cfunge
(As a licensing note, cfunge is licensed under the GNU General Public Licence version 3, whereas C-INTERCAL is licensed under version 2 and all later versions of that licence; although these terms are obviously compatible with each other, you must ensure for yourself that your program has appropriate licensing terms to allow a GPLv3 library to be linked to it.)
Once you have downloaded the cfunge sources, you need to
compile them into a library suitable for use with
C-INTERCAL (note that this is a somewhat
different process to compiling them into a standalone
Funge-98 interpreter). There is a script provided in the
C-INTERCAL distribution to do this,
etc/cftoec.sh. It must be run in the
etc subdirectory of the C-INTERCAL
distribution (i.e. the directory the script itself is in),
and must be given the path to the root directory of the
cfunge source distribution (that is, the directory that
contains the src, lib and
tools subdirectories of that distribution) as
its only argument. Note that it may give some compiler
warnings on compilation; my experience is that warnings about
C99 inlining can be safely ignored (they reflect a deficiency
in gcc
itself that luckily seems to be
irrelevant in this case), but other warnings may indicate
problems in the exact versions of the sources that you
downloaded (and errors definitely indicate such problems).
Once the library has been created, it will appear as the new file lib/libick_ecto_b98.a in the C-INTERCAL distribution (the cfunge distribution will be left unchanged); reinstalling C-INTERCAL will install this file to its proper location. (It is also in a valid location to be able to be run if you aren’t installing C-INTERCAL but instead just running it from the distribution’s directory.)
This section will not make much sense to a non-Funge programmer; therefore, if you are not used to Funge, you probably want to skip it.
To a Funge program, the external calls interface is accessed via a Funge-98 ‘fingerprint’ defined by the interpreter. The name of the fingerprint is 0x49464649, or as text, ‘IFFI’.
When a program formed by linking INTERCAL and Befunge-98 is run, the first thing that happens is some internal INTERCAL initialisation which is not visible to either program, and then initialisation routines specified in the Befunge-98 program run. (If an initialisation routine is also specified in a linked C program using ick_startup, it is unspecified whether the C or Befunge-98 initialisation happens first.) In the Befunge program, the initialisation routine consists of everything that happens until the ‘Y’ command in the ‘IFFI’ fingerprint is run; the author of the Funge-98 program must load the ‘IFFI’ fingerprint themselves during this initialisation to access that command. (This is so that the Befunge program ends up complying with the Funge-98 standard; commands not defined in that standard cannot be used until a fingerprint is loaded.) During initialisation, no commands from the ‘IFFI’ fingerprint may be used except ‘Y’ and ‘A’. (If a different command is used, ‘C’, ‘M’, and ‘X’ remove the arguments they would use from the stack (if any) but otherwise do nothing, and the other commands in the ‘IFFI’ fingerprint reflect.)
After the ‘Y’ command is called, the INTERCAL program starts running; in order for the Befunge program to regain control, it has to be NEXTed to from the INTERCAL program, or COME or NEXT FROM the INTERCAL program, or contain the line label to which syntax in the INTERCAL program was CREATEd. (In other words, the normal INTERCAL ways of transferring information between parts of a program.) In order to do this, therefore, line labels and INTERCAL control flow statements must be placed into the Befunge program.
Code like COME FROM (100) is a single statement in INTERCAL, but several statements in Funge-98; therefore, some method of telling the interpreter where to start executing to look for COME FROMs, NEXT FROMs, and line labels is needed. The method used by C-INTERCAL is that of the ‘marker’; a marker is represented by character 0xB7 (a mid-dot in Latin-1) in the input Funge-98 program, but is transformed to a capital ‘M’ by ick. (The reason for using a special character for a marker and transforming it rather than just using ‘M’ is to prevent occurrences of ‘M’ in comments and string literals, etc., having an effect on the control flow of the program.) Whenever a NEXT or line label is encountered (in the INTERCAL program, the Funge program or elsewhere), the Funge program is executed starting from each marker in each cardinal direction to look for line labels or COME/NEXT FROMs respectively. Therefore, COME FROM (100) is written in Funge-98 as Maa*C (where the M is a marker in the source code), and likewise the line label (100) would be written as Maa*L. (This code can be written in any cardinal direction, that is left to right, top to bottom, right to left, or bottom to top, but not diagonally or flying.) There are likely to be unused directions from markers, which will be evaluated too; you can (and must) close these off by reflecting code execution back into that marker, another marker, or a non-marker M. Note also that a marker remains in Funge-space even if the M on the same square is deleted (the marker itself is not visible to the g command, though).
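To see why Maa*C means COME FROM (100), it may help to trace what the code between the marker and the final command leaves on the stack: in Funge-98, a pushes 10 and * multiplies the top two stack values. The following Python sketch models only that tiny subset of Funge-98 (digits, the letters a to f, + and *); the function names are hypothetical, and it is not how ick or cfunge actually performs speculative execution.

```python
def run_digits(code):
    """Evaluate a string made of Funge-98 digit/arithmetic commands and
    return the resulting stack (bottom first). Only 0-9, a-f, + and *
    are modelled; a real interpreter would also treat an empty stack as
    holding zeros, which this sketch does not."""
    stack = []
    for ch in code:
        if ch in "0123456789abcdef":
            stack.append(int(ch, 16))      # 'a' pushes 10, ..., 'f' pushes 15
        elif ch == "*":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif ch == "+":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        else:
            raise ValueError("command not modelled: %r" % ch)
    return stack


def read_marker_code(snippet):
    """Split a left-to-right snippet such as 'Maa*C' into the final IFFI
    command letter and the line label it would pop from the stack."""
    if not snippet.startswith("M") or snippet[-1] not in "CLX":
        raise ValueError("expected a marker followed by code ending in C, L or X")
    return snippet[-1], run_digits(snippet[1:-1])[-1]
```

With this model, read_marker_code("Maa*C") gives the command C with label 100, and read_marker_code("Maa*L") gives L with the same label.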
Here are the commands in the ‘IFFI’ fingerprint:
A
This command pops a line number and then a 0"gnirts"-format string from the stack; they are used as the line number and signature to CREATE syntax in the INTERCAL program; for details of the format of the signature, see ick_create. Although using this command during speculative execution works, doing so is not recommended; if the target line number for CREATEd syntax is changed during speculative execution to find the line that that syntax corresponds to, its effect is delayed until after the original line is found and execution continues from that point. (Side effects during speculative execution are never recommended, because they might or might not be optimised away.)
C
During speculative execution to find COME FROMs and NEXT FROMs, pops a line label off the top of the stack and does a COME FROM that location. During speculative execution to find line labels, pops the top of the stack and ends that particular speculative execution as a failure. When not doing speculative execution, pops and discards the top element of the stack.
D
This command must only be used when the Funge program is executing a CREATEd command, and allows access to the arguments that command has. It pops an integer off the top of the stack, and treats it as an argument position (0-based, so 0 refers to the first argument, 1 refers to the second, and so on). Note that providing an invalid argument number, or running this command when not implementing a CREATEd command, leads to undefined behaviour (possibly a reflection, possibly a segfault, possibly worse).
The command pushes information about the argument chosen onto the stack; the following information is pushed from bottom to top:
.123 would push 123 here, but .1~.2 would push 0).
CREATEd instruction was called.
F
During speculative execution, this command reflects; otherwise, this command pops an integer from the top of stack, and FORGETs that many NEXT stack entries (or all of them if the argument given is negative).
G
This command pops an integer from the top of stack. If it is positive, the value of the onespot variable whose name is the popped integer is pushed onto the stack; if it is negative, the value of the twospot variable whose name is minus the popped integer is pushed onto the stack; and if it is zero, the command reflects. If the referenced variable is not in the INTERCAL program at all, this causes an INTERCAL error due to referencing a nonexistent variable.
L
During speculative execution to find COME FROMs and NEXT FROMs, this command pops and discards the top stack element, then ends that speculative execution. During speculative execution to find a line label, this command pops an integer from the top of stack and succeeds with that integer as the line label (that is, it is possible to NEXT to an L in the Funge program if a marker, followed by code to push the correct line number onto the stack, precedes that L). When not doing speculative execution, the integer on the top of the stack is used as a line label (assuming it is in the range 1–65535, otherwise it is popped and discarded), and a search is made for COME FROMs and NEXT FROMs aiming for that line label (including in the INTERCAL program and the Befunge program itself, as well as programs in any other language which may be linked in). Note that just as in INTERCAL, it is possible to NEXT to a line label which has a COME FROM aiming for it, in which case the COME FROM will come from that line label as soon as the NEXT transfers control to it.
M
Does nothing if not in speculative execution; otherwise, ends the current speculative execution with failure. (This is so that code like
v >M5C ^
does exactly the same thing as COME FROM (5), even when, for instance, it is entered from the left in the Funge program, rather than gaining control from the line label (5).)
N
During speculative execution, reflects. Otherwise, pops the top stack element, interprets it as a line label, and NEXTs to that line label (this may start speculative execution to look for line labels, but might not if it isn’t needed, for instance if the line label in question is in the INTERCAL program or in a C program linked to the Befunge program).
R
During speculative execution, reflects. Otherwise, pops the top stack element, removes that many items from the NEXT stack, and RESUMEs at the last item removed. (If the top stack element was zero, negative, or too large, this will cause a fatal error in the INTERCAL program.)
S
Pops a variable number (interpreted as onespot if positive, or minus the number of a twospot variable if negative) and an integer from the stack, and sets the referenced variable to the integer. This reflects if an attempt is made to set the nonexistent variable 0, causes a fatal error in the INTERCAL program if an attempt is made to set a variable that doesn’t exist there, and does not set read-only variables (but pops the stack anyway). If the integer is too high for the variable it is being stored in, only the least significant 16 or 32 bits from it will be used; and likewise, if it is negative, it will be treated as the two’s complement of the number given.
V
Pops a CREATEd argument index and an integer from the top of stack. (This is undefined behaviour if not in the implementation of a CREATEd statement, or if the referenced argument does not exist; as with the D instruction, 0 refers to the first argument, 1 to the second, and so on.) If the -a option is not used, this command does nothing; otherwise, the value of the argument will be set to the integer. (This involves doing a reverse assignment if the argument is a non-variable expression, as usual, and causes a fatal error in the INTERCAL program if the reverse assignment is impossible or an attempt is made to assign a scalar to an array.)
X
This is identical to C, except that it does a NEXT FROM rather than a COME FROM.
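The G and S commands described above share one convention for naming variables (positive for onespot, negative for twospot, zero for a reflection), and S additionally truncates values that do not fit. The Python sketch below models just those two rules; the function names are hypothetical, and it deliberately ignores read-only variables and the errors raised for variables that do not exist in the INTERCAL program.

```python
def decode_variable(ref):
    """Translate the integer popped by G or S into an INTERCAL variable
    name, or None when the command would reflect (ref == 0)."""
    if ref == 0:
        return None
    if ref > 0:
        return "." + str(ref)    # onespot variable
    return ":" + str(-ref)       # twospot variable


def stored_value(ref, value):
    """Value that S would store: only the least significant 16 bits
    (onespot) or 32 bits (twospot) are kept, so a negative integer
    ends up as its two's complement representation."""
    bits = 16 if ref > 0 else 32
    return value & ((1 << bits) - 1)
```

For instance, storing -1 into onespot variable 1 yields 65535, whereas storing it into twospot variable 1 (reference -1) yields 4294967295.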
As with external calls to C, terminating any program involved (whether the INTERCAL program with GIVE UP, the Befunge program with @ or q, or a C program with exit()) causes all programs involved to terminate, and likewise a fatal error will end all programs with an error.
One final point which is probably worth mentioning is that flow control instructions only record the IP’s position and direction, nothing else; so if the stack is modified in one part of the code, those modifications will remain even after a RESUME, for instance.
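The behaviour of F and R, and the point that only the IP’s position and direction are recorded, can be pictured with a toy model of the NEXT stack. This Python sketch is an illustration only: the class and method names are hypothetical, and the choice to let an over-large FORGET count simply empty the stack is an assumption carried over from ordinary INTERCAL rather than something stated for the F command.

```python
class NextStack:
    """Toy model of the NEXT stack as seen from the Funge program."""

    def __init__(self):
        self.entries = []  # each entry is (position, direction), nothing else

    def push(self, position, direction):
        # What a NEXT records: where the IP was and which way it was heading.
        self.entries.append((position, direction))

    def forget(self, count):
        # F: drop `count` entries, or all of them if `count` is negative.
        # Assumption: a count larger than the stack just empties it.
        if count < 0 or count > len(self.entries):
            count = len(self.entries)
        if count:
            del self.entries[-count:]

    def resume(self, count):
        # R: remove `count` entries and continue at the last one removed.
        if count <= 0 or count > len(self.entries):
            raise RuntimeError("fatal error in the INTERCAL program")
        removed = self.entries[-count:]
        del self.entries[-count:]
        return removed[0]


def demo():
    """The Funge data stack is separate, so changes to it survive a RESUME."""
    data_stack = [1, 2]
    nexts = NextStack()
    nexts.push((0, 0), (1, 0))
    data_stack.append(3)            # modified after the NEXT
    entry = nexts.resume(1)         # only position and direction come back
    return entry, data_stack
```

Calling demo() returns the saved position and direction together with the data stack still holding the value pushed after the NEXT.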
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
no | version 0.29+ | no | no |
It is possible to specify other information to the external calls system by using the filename list after all the options are given. To be precise, certain filename patterns are recognised and used to change the options that are used to compile the externally-called files.
The ‘.c99’ extension is treated identically to ‘.c’, except that it causes the file with that extension to be preprocessed as C99 (a more modern version of the C standard than C89, although C89 is more common), and that all C files involved will be compiled and linked as C99. (This corresponds to -std=c99 in gcc.) Likewise, the ‘.c11’ extension can be used to indicate C11.
The ‘.a’ extension indicates that an object-code library should be linked in to the final program. This is most commonly used to link in the maths library libm.a and other such system libraries. If the filename is of the form ‘lib*.a’, then the file will be searched for in the standard directories for libraries on your system, and also where the C-INTERCAL libraries are stored (which may be the same place); otherwise, the current directory will be searched. (Specifying libm.a on the command line corresponds to passing -lm to gcc.)
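The correspondences with gcc mentioned above can be summarised in a few lines of Python. This is only an illustration of the analogy, not a description of how ick parses its command line: the function name is hypothetical, and the -std=c11 mapping is assumed by analogy with the documented -std=c99 one.

```python
import os


def gcc_equivalent(filename):
    """Return the gcc option that a filename on the ick command line
    roughly corresponds to, or None if no analogy is drawn above."""
    name = os.path.basename(filename)
    if name.endswith(".c99"):
        return "-std=c99"
    if name.endswith(".c11"):
        return "-std=c11"            # assumption: by analogy with .c99
    if name.startswith("lib") and name.endswith(".a"):
        return "-l" + name[3:-2]     # libm.a corresponds to -lm
    return None                      # e.g. foo.a, looked up in the current directory
```

For example, gcc_equivalent("libm.a") gives -lm, while a library named foo.a has no -l form and is simply looked for in the current directory.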
Whatever language your source files are written in, when -e is used (see -e), the compiler will go through much the same steps.
First, the INTERCAL program specified is compiled into a C program that uses the INTERCAL external call conventions for its control flow operations. The resulting ‘.c’ file will be left behind in the same directory (even if -g isn’t used); if you look at it, you’ll see the #include <ick_ec.h> line, and the other hallmarks of an external call program (for instance, INTERCAL NEXTs will translate into slightly modified ick_nexts; the modification is simply to allow the correct line number to be displayed in case of error).
After that, the resulting files are preprocessed twice. First, the C preprocessor is run on the files; then, a special C-INTERCAL ‘preprocessor’ is run on the files. (‘Preprocessor’ is a bit of a misnomer here, as it’s near the end of the compilation process; ‘postprocessor’ would likely be more accurate, or maybe ‘interprocessor’.) Its job is to fix line labels between the gotos that are used to implement jumping into the middle of a C function, to assign unique numbers to things that need them, and to keep track of which functions need to be checked for line labels and for COME FROMs and NEXT FROMs. The resulting file will have the extension ‘.cio’; it is almost human-readable, especially if you run it through a C code indenter, and consists of C code (which might be a thin wrapper around some other language) and instructions to gcc. The ‘.cio’ file will be left behind for you to look at, if you like.
Once the ‘.cio’ files have been produced, gcc is used to compile all the ‘.cio’ files and link them together into an executable; the executable will have the same name as the INTERCAL source, minus any extension (and on DJGPP, assuming that its version of gcc could handle the resulting command line (not necessarily guaranteed), a ‘.exe’ extension is added), and will consist of all the C files linked together with the INTERCAL. Any functions named main in the C files will be deleted; likewise, if there is a name clash between any two functions, the one in the file named earlier on the command line will be used. There is presumably some use for this feature, although I haven’t figured out what it is yet.
Extending this to other compiled languages is mostly a problem
of determining how they fit into the
INTERCAL control structure, which is not a
trivial task, and of figuring out how to link them to C code,
which in some cases is trivial (especially if the language is
one that gcc
can compile!) and in other cases is
very difficult. If anyone has any ideas of new languages that
could be added to the external calls system, feel free to
contact the current C-INTERCAL maintainer with
suggestions or patches.
INTERCAL-72 | C-INTERCAL | CLC-INTERCAL | J-INTERCAL |
---|---|---|---|
no | version 0.28+ | no | no |
The C-INTERCAL distribution comes with libraries
that can be used to extend its capabilities; they are
implemented using the external call mechanism, and are in
effect standard files to include using that mechanism. To use
an expansion library, give the -e option to
ick
(note that this means you cannot use them with
the debugger or profiler, nor with multithreaded or
backtracking programs), and specify the expansion
library’s name at the end of the command line (or to be
precise, anywhere after the initial INTERCAL
file). The libraries themselves are written in C and have a
‘.c’ extension, and are
human-readable; C-INTERCAL will look for them in
the same places as it looks for the system library (including
in the current directory, so you can test your own expansion
libraries without having to install them).
Expansion libraries use C identifiers which start with the string ‘ick_my_’ (this is not used by the compiler, and is explicitly not affected by the prohibition on identifiers starting ‘ick_’ when writing an expansion library), and use line labels in the range (1600) to (1699). (Most programs will be avoiding this range anyway, because it’s inside the (1000) to (1999) range reserved for the system library, but the system library doesn’t use it, in much the same way that the identifiers used are inside the range reserved for the compiler, but the compiler doesn’t use them.)
Expansion libraries are available from C-INTERCAL version 0.28; CLC-INTERCAL has a similar concept (that of ‘preloads’), but implemented a completely different way.
syslibc is an implementation of the base-2 INTERCAL system library in C (see System Libraries); using it in programs running in other bases is accepted by the compiler, but likely to produce unpredictable results. When using this expansion library, you also need to give the -E option (see -E) so that the main system library is not included, or it will be used in preference to the expansion library. All documented features of the INTERCAL base-2 system library are implemented, but most undocumented features are not, so INTERCAL programs which relied on them (dubious behaviour in any case) will not work with syslibc. The main reason to use this library is to increase the speed of an INTERCAL program; however, note that the speed gains in arithmetic will be accompanied by the performance penalty of using the external calls infrastructure, unless you were already using it.
As an example of using ick_create, a very simple expansion library is provided to enable a computed NEXT capability, by defining a new command COMPUNEX. It is used as DO .1 COMPUNEX (allowing any expression in place of the .1), and is similar to an ordinary NEXT, but has two limitations: it takes up two NEXT stack entries, and the top one should not be RESUMEd past or forgotten (thus it isn’t a particularly useful command, except maybe to produce the equivalent of something like function pointers). By the way, note that C-INTERCAL avoids computed NEXT in the mainstream language for much the same reason that CLC-INTERCAL avoids NEXT altogether: it makes things too easy. This example is provided mostly just to demonstrate the syntax, and the care that needs to be taken with implementing flow control operators.
‘compunex’ is double-deprecated; an alternative is the following sequence of commands involving computed CREATE:
DO CREATE .1 ABC
DO ABC
This sequence emulates all features of NEXT (although it has different gerunds and is two statements, not one), making it much more useful for simulating computed NEXT than COMPUNEX is. (There’s no need to avoid forgetting the return value; although this skips the CREATE cleanup, none is required because the created statement ABC (any other statement would do just as well) takes no arguments.)
The C-INTERCAL compiler exists in a world of several other compilers.
The Princeton compiler was the first INTERCAL
compiler available, and compiled INTERCAL-72. Using
convickt
(see convickt) to
translate its programs from the original EBCDIC to Latin-1 or
Atari-syntax ASCII is required to run them under the
C-INTERCAL compiler, but apart from that there
should be no problems; everything that that compiler can do can
be reproduced by C-INTERCAL, even including some of
its bugs. The only potential problems may be where constructs
were nonportable or dubious to begin with (such as the
IGNORE
/RETRIEVE
interaction), or where
commands intended to be syntax errors were used in the program
but have a meaning in C-INTERCAL. For extra
portability, it’s possible to use the -t
compiler option to ick
(see -t) to tell it to interpret the program as
INTERCAL-72, but as C-INTERCAL’s
dialect of INTERCAL is basically
backward-compatible anyway this mostly serves to check newer
programs for compatibility with older compilers.
The Atari compiler was an uncompleted implementation of
INTERCAL-72, optimistically pre-described in some
1982 additions to the original INTERCAL-72 manual.
Despite the implementation’s never actually existing, the
documentation of the syntax provided a model for
C-INTERCAL. If any Atari 800 INTERCAL source code
actually existed, there would be no need to use
convickt
on it.
The J-INTERCAL compiler is an implementation of
INTERCAL written in Java that compiles
INTERCAL into Java (and so has a similar
relationship with Java to that of the C-INTERCAL
compiler (which is written in C and compiles into C) with C).
J-INTERCAL has much the same feature set as older
versions of C-INTERCAL, with a few changes (such as
the addition of Esperanto and error messages coming up in
different situations). J-INTERCAL programs should
run fine on C-INTERCAL without trouble (as it is
also an Atari syntax compiler), except in nonportable cases such
as IGNORE
/RETRIEVE
interaction.
The CLC-INTERCAL compiler is the most modern INTERCAL compiler apart from C-INTERCAL (both compilers are maintained and updated every now and then as of the time of writing, so which is more modern is normally a matter of when you happen to check). Unlike the other three compilers mentioned above, it has a quite significant feature set, including many features not implemented or only partially implemented in C-INTERCAL, and is responsible for the origin of many of the features added in more recent versions of C-INTERCAL. Generally speaking, a CLC-INTERCAL program that uses its advanced features is unlikely to run on C-INTERCAL, or vice versa, whatever you do (apart from completely rewriting the more advanced parts of the program).
However, there are certain steps that can be taken to transfer
less advanced programs from one compiler to the other. First,
translate the program to Latin-1 Princeton syntax (if translating
from CLC-INTERCAL to C-INTERCAL) or
Atari syntax (if translating from C-INTERCAL to
CLC-INTERCAL), maybe using convickt
, if
necessary. (Note that here the program is being translated to the
syntax that is not default for the target compiler.) Then use
command-line arguments to switch the compiler into the correct
emulation mode for the other compiler; C-INTERCAL
uses the options -xX, and on
CLC-INTERCAL this is done by selecting the
appropriate preloads, or by changing the program’s file
extension to ‘.ci’. In each case other
options may be needed to turn on various extensions (maybe
-m or -v if translating to
C-INTERCAL, maybe the preload for gerund-based
COME FROM
if translating to
CLC-INTERCAL), and if translating to
CLC-INTERCAL you need to append the system library
to your program yourself because CLC-INTERCAL
doesn’t load it automatically.
In the case of very simple programs, or if you want to spend the effort in translating compiler-specific code from one compiler to another, you may be able to work without emulation options. (This is a good target to aim for, in any case.) In such a case, you would do nothing other than possibly edit the program to be more portable and possibly change its character set and syntax using convickt. If you need compiler-specific code, you may be able to detect the compiler in the code itself and adapt accordingly; making use of the IGNORE/RETRIEVE interaction is one way to do this, as it differs between C-INTERCAL, J-INTERCAL, and CLC-INTERCAL. The other things to watch out for when doing this are that CLC-INTERCAL needs an explicit option to enable the use of NEXT, that CLC-INTERCAL doesn’t load the system library itself (you need to manually append it to the end of the program) and that you probably shouldn’t number a line (666) unless you know what you’re doing, because that line number has a special meaning in CLC-INTERCAL.
The following table explains the equivalences between the various character sets used for INTERCAL: 7-bit ASCII Atari syntax, 5-bit Baudot Princeton syntax, 8-bit EBCDIC Princeton syntax, and 8-bit Latin-1 Princeton syntax. (The Baudot and EBCDIC variants are the CLC-INTERCAL versions, which are used by INTERCAL compilers but basically nowhere else.) The characters themselves are not shown in the table below, because they would have to be shown in some syntax, which would be misleading. (Atari syntax is used throughout this manual; you could convert from that, assuming you have an ASCII table handy.) You can also use the convickt command-line tool to translate INTERCAL programs from one format to another (see convickt). Note that Baudot has more than one ‘shift state’; the shift state (1, 2, 3, or 4) is written before the hexadecimal code for each character, and * represents a character available in every shift state. To change from one shift state to another, use character 1f to change from shift states 3 or 4 to 1, or from 1 or 2 to 2, and character 1b to change from shift states 1 or 2 to 3, or from 3 or 4 to 4.
Atari | Baudot | EBCDIC | Latin-1 |
---|---|---|---|
09 | N/A | 09 | 09 |
0a | * 02 | 0a | 0a |
0d | * 08 | 0d | 0d |
20 | * 04 | 40 | 20 |
21 | 3 0d | 4f | 21 |
22 | 3 11 | 7f | 22 |
23 | 4 06 | 7b | 23 |
24 | 4 01 | 4a | a2 |
25 | 4 1c | 6c | 25 |
26 | 3 1a | 50 | 26 |
27 | 3 0b | 7d | 27 |
28 | 3 0f | 4d | 28 |
29 | 3 12 | 5d | 29 |
2a | 4 09 | 5c | 2a |
2b | 4 03 | 4e | 2b |
2c | 3 0c | 6b | 2c |
2d | 3 03 | 60 | 2d |
2e | 3 1c | 4b | 2e |
2f | 3 1d | 61 | 2f |
30 | 3 16 | f0 | 30 |
31 | 3 17 | f1 | 31 |
32 | 3 13 | f2 | 32 |
33 | 3 01 | f3 | 33 |
34 | 3 0a | f4 | 34 |
35 | 3 10 | f5 | 35 |
36 | 3 15 | f6 | 36 |
37 | 3 07 | f7 | 37 |
38 | 3 06 | f8 | 38 |
39 | 3 18 | f9 | 39 |
3a | 3 0e | 7a | 3a |
3b | 3 1e | 5e | 3b |
3c | 4 0f | 4c | 3c |
3d | 4 07 | 7e | 3d |
3e | 4 12 | 6e | 3e |
3f | 4 0c | 65 | a5 |
40 | 3 19 | 6f | 3f |
41 | 1 03 | c1 | 41 |
42 | 1 19 | c2 | 42 |
43 | 1 0e | c3 | 43 |
44 | 1 09 | c4 | 44 |
45 | 1 01 | c5 | 45 |
46 | 1 0d | c6 | 46 |
47 | 1 1a | c7 | 47 |
48 | 1 14 | c8 | 48 |
49 | 1 06 | c9 | 49 |
4a | 1 0b | d1 | 4a |
4b | 1 0f | d2 | 4b |
4c | 1 13 | d3 | 4c |
4d | 1 1c | d4 | 4d |
4e | 1 0c | d5 | 4e |
4f | 1 18 | d6 | 4f |
50 | 1 16 | d7 | 50 |
51 | 1 17 | d8 | 51 |
52 | 1 0a | d9 | 52 |
53 | 1 05 | e2 | 53 |
54 | 1 10 | e3 | 54 |
55 | 1 07 | e4 | 55 |
56 | 1 1e | e5 | 56 |
57 | 1 12 | e6 | 57 |
58 | 1 1d | e7 | 58 |
59 | 1 15 | e8 | 59 |
5a | 1 11 | e9 | 5a |
5b | 4 10 | 9e | 5b |
5c | 4 05 | N/A | 5c |
5d | 4 13 | 5a | 5d |
5e | 4 0d | 6a | 7c |
5f | 4 15 | 7c | 40 |
60 | N/A | N/A | 60 |
61 | 2 03 | 81 | 61 |
62 | 2 19 | 82 | 62 |
63 | 2 0e | 83 | 63 |
64 | 2 09 | 84 | 64 |
65 | 2 01 | 85 | 65 |
66 | 2 0d | 86 | 66 |
67 | 2 1a | 87 | 67 |
68 | 2 14 | 88 | 68 |
69 | 2 06 | 89 | 69 |
6a | 2 0b | 91 | 6a |
6b | 2 0f | 92 | 6b |
6c | 2 13 | 93 | 6c |
6d | 2 1c | 94 | 6d |
6e | 2 0c | 95 | 6e |
6f | 2 18 | 96 | 6f |
70 | 2 16 | 97 | 70 |
71 | 2 17 | 98 | 71 |
72 | 2 0a | 99 | 72 |
73 | 2 05 | a2 | 73 |
74 | 2 10 | a3 | 74 |
75 | 2 07 | a4 | 75 |
76 | 2 1e | a5 | 76 |
77 | 2 12 | a6 | 77 |
78 | 2 1d | a7 | 78 |
79 | 2 15 | a8 | 79 |
7a | 2 11 | a9 | 7a |
7b | 4 0a | 9c | 7b |
7c | 4 1e | fe | N/A |
7d | 4 11 | dc | 7d |
7e | 4 0b | a1 | 7e |
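The shift-state rules given before the table can be sketched as a small state machine. This is a minimal Python sketch, not code taken from convickt: the names `next_shift_state` and `states_after` are made up for illustration, and starting in shift state 1 is an assumption.

```python
# Hypothetical helpers (not part of convickt): they track the Baudot shift
# state using only the two rules stated in the text.
SHIFT_1F = 0x1F  # moves states 3/4 -> 1 and states 1/2 -> 2
SHIFT_1B = 0x1B  # moves states 1/2 -> 3 and states 3/4 -> 4


def next_shift_state(state, code):
    """Return the shift state in effect after reading the 5-bit `code`."""
    if state not in (1, 2, 3, 4):
        raise ValueError("shift state must be 1, 2, 3 or 4")
    if code == SHIFT_1F:
        return 1 if state in (3, 4) else 2
    if code == SHIFT_1B:
        return 3 if state in (1, 2) else 4
    return state  # every other character leaves the shift state alone


def states_after(codes, state=1):
    """Replay a sequence of codes and list the state after each one.

    The default starting state of 1 is an assumption of this sketch.
    """
    result = []
    for code in codes:
        state = next_shift_state(state, code)
        result.append(state)
    return result
```

Under these rules, getting from state 1 to state 4 takes two 1b characters in a row, and getting from state 3 to state 2 takes two 1f characters in a row.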
A variety of character sets have historically been used to
represent INTERCAL programs. Atari syntax was
designed specifically for use with ASCII-7, and all
Atari-syntax-based INTERCAL compilers accept
that character set as possible input. (C-INTERCAL
also accepts Latin-1 and UTF-8.) However, the story is more
complicated with Princeton syntax; the original Princeton
compiler was designed to work with EBCDIC, but because modern
computers are often not designed to work with this character set
other character sets are often used to represent it, particularly
Latin-1. The CLC-INTERCAL compiler accepts Latin-1,
a custom dialect of EBCDIC, Baudot, and a punched-card format as
input; C-INTERCAL can cope with Latin-1 Princeton
syntax, but for the other character sets, for other compilers, or
just for getting something human-readable, it’s useful to
have a conversion program. convickt is an INTERCAL character set conversion program designed with these needs in mind.
The syntax for using convickt is
convickt inputset outputset [padding]
(that is, the input and output character sets are compulsory, but the parameter specifying what sort of padding to use is optional).
The following values for inputset and outputset are permissible:
Latin-1, or to give it its official name ISO-8859-1, is the character set most commonly used for transmitting CLC-INTERCAL programs, and therefore nowadays the most popular character set for Princeton syntax programs. Because it is identical to ASCII-7 in all codepoints that don’t have the high bit set, most of the characters in it can be read by most modern editors and terminals. It is also far more likely to be supported by modern editors than EBCDIC, Baudot, or punched cards, all of which have fallen into relative disuse since 1972. It is also the only input character set that C-INTERCAL supports for Princeton syntax programs. It uses 8-bit characters.
EBCDIC is an 8-bit character set that was an alternative to ASCII in 1972, and is the character set used by the original Princeton compiler. Unfortunately, there is no single standard version; the version of EBCDIC used by convickt is the one that CLC-INTERCAL uses. It is the default input character set that CLC-INTERCAL uses (although more recent versions of CLC-INTERCAL instead try to guess the input character set based on the input program).
Baudot is a 5-bit character set with shift codes; therefore when storing it in a file on an 8-bit computer, padding is needed to fill in the remaining three bits. The standard Baudot character set does not contain all the characters needed by INTERCAL; therefore, CLC-INTERCAL uses repeated shift codes to add two more sets of characters. convickt uses the CLC-INTERCAL version of Baudot, so as to be able to translate programs designed for that compiler; however, standard Baudot is also accepted in input if it contains no redundant shift codes, and if the input contains no characters not in standard Baudot, the output will be written so that it is both correct standard Baudot and correct CLC-INTERCAL Baudot for those characters.
Selecting Atari syntax causes convickt to attempt a limited conversion to or from Atari syntax; this uses ASCII-7 as the character set, but also tries to translate between Atari and Princeton syntax at the character level, which is sometimes but not always effective. For instance, ? is translated from Atari to Princeton as a yen sign, and from Princeton to Atari as a whirlpool (@); this sort of behaviour is often capable of translating expressions automatically, but will fail when characters outside ASCII-7 (Atari) or Latin-1 (Princeton) are used, and will not, for instance, translate a Princeton V, backspace, - into Atari ?, but instead leave it untouched. ASCII-7 is a 7-bit character set, so on an 8-bit computer, there is one bit of padding that needs to be generated; note, however, that it is usual nowadays to clear the top bit when transmitting ASCII-7, which the ‘printable’ and ‘zero’ padding styles will do, but the ‘random’ style may not do.
When using a character set where not all bits in each byte are specified, a third argument can be given to specify what sort of padding to use for the top bits of each character. There are three options for this:
Option | Meaning |
---|---|
printable | Keep the output in the range 32-126 where possible |
zero | Zero the high bits in the output |
random | Pad with random bits (avoiding all-zero bytes) |
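The three padding styles in the table above can be sketched as follows. This is only an illustrative Python model under stated assumptions: `pad_code` is a made-up name, and the way it picks among several printable candidates or draws random bits is a guess, not convickt's actual behaviour.

```python
import random


def pad_code(value, bits, style, rng=random):
    """Store a `bits`-wide code in one byte, filling the unused top bits.

    Illustrative only: which printable candidate is chosen, and how random
    bits are drawn, are assumptions rather than convickt's real rules.
    """
    if not 0 < bits < 8 or not 0 <= value < (1 << bits):
        raise ValueError("value does not fit in the given width")
    # Every byte whose low `bits` bits equal `value`.
    candidates = [(high << bits) | value for high in range(1 << (8 - bits))]
    if style == "zero":
        return value
    if style == "printable":
        printable = [b for b in candidates if 32 <= b <= 126]
        # "Where possible": fall back to zero padding if nothing is printable.
        return printable[0] if printable else value
    if style == "random":
        # Avoid the all-zero byte, as the table requires.
        return rng.choice([b for b in candidates if b != 0])
    raise ValueError("unknown padding style: " + style)
```

For example, a 5-bit Baudot code has three spare bits, so the printable style has several candidate bytes to choose from, whereas a 7-bit ASCII code has only one spare bit.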
Note that not all conversions are possible. If a character cannot be converted, it will normally be converted to a NUL byte (which is invalid in every character set); note that this will prevent round-tripping, because NUL is interpreted as end-of-input if given in the input. There is one exception; if the character that could not be converted is a tab character, it will be converted to the other character set’s representation of a space character, if possible, because the two characters have the same meaning in INTERCAL (the only difference is if the command is a syntax error that’s printed as an error message). (The exception exists to make it possible to translate existing INTERCAL source code into Baudot.)
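The fallback rules just described (unconvertible characters become NUL, except that an unconvertible tab becomes a space) can be sketched like this. The mapping is a three-row excerpt of the Atari and Baudot columns of the character set table, restricted to rows marked `*` so that shift states can be ignored; `convert_byte` and `convert` are hypothetical names, not convickt functions.

```python
# Excerpt of the Atari -> Baudot columns of the character set table, using
# only rows marked '*' (valid in every shift state).
ATARI_TO_BAUDOT = {0x0A: 0x02, 0x0D: 0x08, 0x20: 0x04}

TAB, SPACE, NUL = 0x09, 0x20, 0x00


def convert_byte(byte, table):
    """Convert one source byte, applying the fallbacks described above."""
    if byte in table:
        return table[byte]
    if byte == TAB and SPACE in table:
        return table[SPACE]  # an unconvertible tab becomes a space
    return NUL  # anything else unconvertible becomes a NUL byte


def convert(data, table):
    """Convert a byte string; a NUL in the input is treated as end-of-input."""
    out = bytearray()
    for byte in data:
        if byte == NUL:
            break
        out.append(convert_byte(byte, table))
    return bytes(out)
```

Because a NUL in the output would end the input of a later conversion, a program containing an unconvertible character cannot be round-tripped.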
One file in the C-INTERCAL distribution (src/idiotism.oil) is written in Optimizer Idiom Language, a programming language designed especially for expressing optimizer idioms for INTERCAL in an easily editable form (well, at least it’s easier than the unmaintainable set of idioms hard-coded in C that were used in previous versions of the INTERCAL compiler).
The structure of an OIL file consists of a sequence of idioms. An optimizer idiom looks for a certain pattern in an expression (which could be an INTERCAL expression, or an expression that has already been partly optimized and therefore contains some non-INTERCAL operators), and replaces it with a replacement that’s ‘simpler’ in some sense (in the case of C-INTERCAL, ‘simpler’ is interpreted to mean ‘compiles into a faster or smaller executable when run through a C compiler’). When an OIL program acts on an input INTERCAL file, it keeps on matching idioms to simplify expressions, until none of the idioms act any more (and if a situation occurs where idioms can keep matching indefinitely, the compiler goes into an infinite loop; so don’t allow that to happen); at present, the idioms are tried from left to right, from the leaves of an expression to its root, and from the start of the OIL file to the end; but don’t rely on that, because it’s subject to change (and gets confusing when you think about what happens when the program actually does a replacement). Anyway, the point is that if an idiom can match an expression, and another idiom doesn’t change it first, then the idiom will be matched against that part of the expression eventually, and the program won’t end until there are no idioms that match the optimized expression.
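The matching process described above (leaves before root, idioms in file order, repeated until nothing matches) can be modelled in a few lines. This is a toy Python sketch, not the generated optimizer: expressions are nested tuples, only binary operators are handled, the two idioms are simplified stand-ins, and the `max_passes` guard is an addition of this sketch (the text says the real compiler simply loops forever).

```python
def null_shift(expr):
    """Toy idiom: a shift by the constant 0 can be dropped."""
    if expr[0] in (">>", "<<") and expr[2] == ("const", 0):
        return expr[1]
    return None


def fold_and(expr):
    """Toy idiom: the AND of two constants becomes one constant."""
    if expr[0] == "&" and expr[1][0] == "const" and expr[2][0] == "const":
        return ("const", expr[1][1] & expr[2][1])
    return None


def rewrite_once(expr, idioms):
    """One pass over the tree, leaves first; returns (new_expr, changed)."""
    changed = False
    if expr[0] not in ("const", "var"):
        left, left_changed = rewrite_once(expr[1], idioms)
        right, right_changed = rewrite_once(expr[2], idioms)
        expr = (expr[0], left, right)
        changed = left_changed or right_changed
    for idiom in idioms:  # tried in the order they are listed
        replacement = idiom(expr)
        if replacement is not None:
            return replacement, True
    return expr, changed


def optimize(expr, idioms, max_passes=100):
    """Keep rewriting until no idiom matches anywhere in the expression."""
    for _ in range(max_passes):
        expr, changed = rewrite_once(expr, idioms)
        if not changed:
            return expr
    raise RuntimeError("idioms did not reach a fixed point")
```

For instance, `optimize(("&", ("<<", ("var", "x"), ("const", 0)), ("const", 15)), [null_shift, fold_and])` removes the null shift and leaves the AND in place.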
At present, the only place that OIL is used in the C-INTERCAL compiler is when the -O option (see -O) is used in base 2. (Syntax is planned to extend OIL to higher bases, and some of this is documented and even implemented, but there’s no way to use it.) The idioms are read from the file src/idiotism.oil during the compilation of C-INTERCAL from sources; you can change the idioms, but you will then have to recompile the distribution (and if you are using the config.sh method, also reinstall, but that will be pretty fast).
An OIL file is encoded as an ASCII text file using no codepoints outside the range 0-127; using 10 for newline (as on a UNIX or Linux system) is always acceptable, but using 13 then 10 (as is common on Windows or DOS) for newline is acceptable only if your C compiler recognizes that as a newline. I have no idea what happens if you use just 13 on an Apple computer on which that is the common newline convention.
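The encoding constraints above are easy to check mechanically before rebuilding. The following is a small Python sketch; `check_oil_encoding` is a hypothetical helper, not something shipped with C-INTERCAL.

```python
def check_oil_encoding(data):
    """Return a list of complaints about the raw bytes of an OIL file."""
    problems = []
    if any(byte > 127 for byte in data):
        problems.append("codepoint outside the range 0-127")
    if b"\r\n" in data:
        problems.append("CR LF newlines: only safe if the C compiler accepts them")
    if b"\r" in data.replace(b"\r\n", b""):
        problems.append("bare CR newlines: behaviour unknown")
    return problems
```

A file that uses only codepoints 0-127 and plain 10 for newline produces an empty list.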
Comments can be given anywhere in the file by writing lines starting with semicolons (known as hybrids to INTERCAL programmers). It’s also possible to write a semicolon after part of a line to comment out the rest of the line. Inside braced C expressions, comments can be given anywhere whitespace would be allowed by placing them between /* and */ (in such cases, the comments will be copied verbatim to the C temporary files used when building the C-INTERCAL compiler, where your C compiler will ignore them). Whitespace is ignored nearly everywhere; the only places it isn’t ignored are in the middle of a decimal constant, inside square brackets, immediately after one of the characters ‘.:#_}’, and anywhere that C doesn’t allow it in quoted C code. (This means that you can even place it inside operators like && if you like, as long as they’re part of OIL code and not C code, although doing this is not recommended.) If you use whitespace in a situation where it isn’t ignored, that’s almost certainly an error.
Idioms are grouped into groups of idioms by placing an identifier in square brackets before the group; this follows the rules for C identifiers, except that there’s a maximum length of 30 characters. This identifier is the ‘name’ of the group, which has no effect except on optimizer debug output; for that matter, the only effect a group has is that all idioms in the group look the same in optimizer debug output, because they have the same name. It’s recommended that idioms only have the same name if they are the same idiom, possibly written in several ways. For example, a shift by 0 has no effect and may as well be removed from the output; the way to express this in OIL is:
[nullshift]
(_1 >> #0)->(_1)
(_1 << #0)->(_1)
Here, nullshift is the name of the group of idioms, and two idioms are given; one which removes a null rightshift, and one which removes a null leftshift.
As the example above shows, the syntax of an idiom itself is
(pattern)->(replacement)
The parentheses here are actually part of the pattern and/or replacement, and as such sparks (apostrophes) or rabbit-ears (double quotes) can be used instead; they’re shown in the syntax because the outer layer of parenthesising is always required. Both the pattern and replacement are OIL expressions, although they both have their own special syntax elements as well.
An OIL expression is built around subexpressions connected by infix binary operators and/or preceded by prefix unary operators, the same way as in C or INTERCAL (although unary operators must be entirely before their argument; the one character later position is not allowed.) As in INTERCAL, there is no operator precedence; expressions must be very fully bracketed to show unambiguously what the precedences must be, and then more so; for instance, bracketing marks must be placed around the argument of a unary operator in most circumstances. Bracketing of expressions can be done with parentheses, sparks (apostrophes) or rabbit-ears (double-quotes).
The following unary and binary operators are allowed in OIL expressions:
Operator | Meaning |
---|---|
$ | INTERCAL mingle |
~ | INTERCAL select |
&16 | INTERCAL unary AND (16-bit) |
V16 | INTERCAL unary OR (16-bit) |
?16 | INTERCAL unary XOR (16-bit) |
^16 | INTERCAL unary sharkfin (16-bit) |
@16 | INTERCAL unary whirlpool (16-bit) |
@216..@516 | INTERCAL unary generalised whirlpool (16-bit) |
&32 | INTERCAL unary AND (32-bit) |
V32 | INTERCAL unary OR (32-bit) |
?32 | INTERCAL unary XOR (32-bit) |
^32 | INTERCAL unary sharkfin (32-bit) |
@32 | INTERCAL unary whirlpool (32-bit) |
@232..@532 | INTERCAL unary generalised whirlpool (32-bit) |
& | C binary bitwise AND |
\| | C binary bitwise OR |
^ | C binary bitwise XOR |
+ | C addition |
- | C subtraction |
* | C multiplication |
/ | C integer division |
% | C modulus |
> | C greater than |
< | C less than |
~ | C unary bitwise complement |
!= | C not equals operator |
== | C equals operator |
&& | C logical AND |
\|\| | C logical OR |
>> | C bitwise rightshift |
<< | C bitwise leftshift |
! | C unary logical NOT |
(Note that in some cases two operators are expressed the same way, but that this doesn’t matter because one is unary and the other is binary so that there can’t be any ambiguity, only confusion. Also note that unlike INTERCAL unary logic operators, OIL unary logic operators must have a bitwidth stated.)
It hasn’t yet been explained what operands these operators have to operate on; the syntax for those depends on whether it’s a pattern or replacement that the expression is representing.
Patterns are simply OIL expressions; the expressions match either original INTERCAL input or expressions produced by earlier idioms. Each operator must match the same operator in the (possibly partially-optimised) input; the operands themselves are pattern templates specifying what operands in the input they can match.
One special simple form of match is possible: #NUMBER, where NUMBER is in decimal, matches a constant with that value. (Unlike in INTERCAL, this constant is not limited to being a onespot value; it is, however, limited to being at most twospot, as are all operands and intermediate values in OIL.)
Otherwise, an operand consists of the following parts, written in order:
A character specifying the data type to match. This is usually _ to specify that any data type can be matched. In a few cases, you may want to use . or : to specify that you only want to match a onespot or twospot value respectively (that is, 16- or 32-bit). You can also use #; this specifies a value that can be any width, but must be known to be a constant with a known value at optimize time (either because it was hardcoded as a constant originally or because a constant was produced there by the optimizer, for instance via a constant folding optimization).
Optionally, an expression in braces ({ and }). This expression is written in C, not OIL (as are all expressions in braces), and puts an extra condition on whether the pattern matches. The exact meaning of this will be explained later.
A reference number, such as the 1 in _1 or the 2 in #{1}2, by which the operand can be referred to elsewhere in the idiom.
Note that syntax like #2 is ambiguous given what comes so far: it could mean either a constant with the value 2, or a constant operand with reference number 2. The first interpretation is the one that is taken in practice, and if the second interpretation is wanted the operand should be expressed as #{1}2, using a no-op braced expression to tell them apart. This particular no-op is recommended because it’s detected and optimized by the OIL compiler.
Braced expressions, which must be written purely in C, add extra conditions; they must return nonzero to allow a possible match or zero to prevent one. They can reference the following variables and functions:
c
cNUMBER
This accesses a calculation made automatically by the compiled OIL program to identify which bits of the operand can possibly be set, and which ones cannot be. c by itself refers to the operand to which the braced expression is attached; if a number is given, it refers to another node (the number is interpreted as a reference number). The actual value of c is a 32-bit unsigned integer, each bit of which is true, or 1, if there is any chance that the corresponding bit of the operand might be 1, and false, or 0, if it’s known for certain that the corresponding bit of the operand is 0. For instance:
_{!(c&4294901760LU)}1
The constant given here is FFFF0000 when expressed in hexadecimal; the point is that the expression matches any operand that is known to have a value no greater than 65535. Unless the operand is the argument to a unary AND, this check generally makes more sense than explicitly specifying . rather than _, because it will identify both 16- and 32-bit values as long as they’re small enough to fit into a onespot variable. This code could, for instance, be used to check that an argument to a mingle is small enough before optimising it (this is important because an optimisation shouldn’t optimise an error, in this case an overflowing mingle, into a non-error).
x
xNUMBER
x is like c, and refers to operands in the same way, except that it can only refer to an operand marked with #. It holds the value of that constant (a 32-bit unsigned integer), which will be known to the optimizer at optimize time. One common use of this is to detect whether a constant happens to be a power of 2, although there are many other possibilities that may be useful.
r
When inside a loop, r is the value of the loop counter. (It’s almost certainly a mistake if you have a loop but don’t reference the loop counter at least once, and usually at least twice, within the loop.) See OIL Loops.
and16
and32
or16
or32
xor16
xor32
iselect
mingle
These are all functions with one argument (apart from iselect and mingle, which each take two arguments); they exist so that INTERCAL operators can be used by C expressions. They all take unsigned longs as input and output, even if they are onespot operators. Note that it’s entirely possible for these to cause a compile-time error if used on invalid arguments (such as mingling with an operand over 65535), or to silently truncate an invalid argument down to the right number of bits; both of these should be avoided if possible, so the optimiser should check first to make sure that it doesn’t use any of these functions on invalid arguments.
xselx
This function returns its argument selected with itself; so xselx(c) is shorthand for iselect(c,c). When the argument is very complicated, this can save a lot of space in the original OIL program.
setbitcount
This function simply returns the number of bits with value 1 in its argument. This is sometimes useful with respect to various select-related optimisations, and can be a useful alternative to having to take logarithms in various situations.
smudgeright
smudgeleft
The smudgeright function returns its argument but with all the bits less significant than the most significant bit with value 1 set to 1; likewise, smudgeleft returns its argument with all the bits more significant than the least significant bit with value 1 set to 1.
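To make the behaviour of these helper functions concrete, here is a Python model of them. It is a sketch written from the descriptions above, not a translation of the C code in C-INTERCAL; in particular, raising an exception on an oversized mingle operand is this sketch's choice (the text says the real functions may give an error or silently truncate), and `fits_onespot` and `is_power_of_two` are illustrative names rather than functions available in OIL.

```python
MASK32 = 0xFFFFFFFF


def mingle(a, b):
    """INTERCAL mingle: interleave two 16-bit values, `a` in the odd bits."""
    if a > 0xFFFF or b > 0xFFFF:
        raise ValueError("mingle operands must fit in 16 bits")
    result = 0
    for i in range(16):
        result |= ((a >> i) & 1) << (2 * i + 1)
        result |= ((b >> i) & 1) << (2 * i)
    return result


def iselect(a, b):
    """INTERCAL select: keep the bits of `a` where `b` is 1, packed right."""
    result, out = 0, 0
    for i in range(32):
        if (b >> i) & 1:
            result |= ((a >> i) & 1) << out
            out += 1
    return result


def xselx(x):
    """Shorthand for iselect(x, x)."""
    return iselect(x, x)


def setbitcount(x):
    """Number of bits with value 1."""
    return bin(x & MASK32).count("1")


def smudgeright(x):
    """Set every bit below the most significant 1 bit."""
    x &= MASK32
    for shift in (1, 2, 4, 8, 16):
        x |= x >> shift
    return x


def smudgeleft(x):
    """Set every bit above the least significant 1 bit."""
    x &= MASK32
    for shift in (1, 2, 4, 8, 16):
        x |= (x << shift) & MASK32
    return x


def fits_onespot(c):
    """The check from _{!(c&4294901760LU)}1: no possible bit above bit 15."""
    return not (c & 4294901760)


def is_power_of_two(x):
    """One way to test a constant x for being a power of 2."""
    return setbitcount(x) == 1
```

As a cross-check against the worked example at the end of this appendix, `mingle(0x7FFF, 0x1)` gives `0x2AAAAAAB`.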
Note that all OIL calculation is done internally using unsigned 32-bit numbers, and C expressions you write should do the same. The practical upshot of this is that you should write LU after any constant you write in C code; if you don’t do this, you are reasonably likely to get compiler warnings, and the resulting program may not work reliably, although the OIL compiler itself will not complain.
Here’s a more complicated example of an optimizer operand:
#{!(x&2863311530LU)&&iselect(x,1431655765LU)== xselx(iselect(x,1431655765LU))}3
It helps to understand this if you know that 2863311530 in hexadecimal is AAAAAAAA and 1431655765 in hexadecimal is 55555555. (It’s worth putting a comment with some frequently-used decimal constants in an OIL input file to help explain what these numbers mean and make the code more maintainable.) The operand matches any constant integer which has no bits in common with AAAAAAAA, and for which if any bit in common with 55555555 is set, all less significant bits in common with that number are also set.
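The condition in this operand can be tried out with a Python rendering of the braced C expression. This is a sketch: `iselect` and `xselx` are re-modelled here from their descriptions rather than taken from C-INTERCAL, and `operand_matches` is a made-up name.

```python
def iselect(a, b):
    """INTERCAL select: keep the bits of `a` where `b` is 1, packed right."""
    result, out = 0, 0
    for i in range(32):
        if (b >> i) & 1:
            result |= ((a >> i) & 1) << out
            out += 1
    return result


def xselx(x):
    """Shorthand for iselect(x, x)."""
    return iselect(x, x)


def operand_matches(x):
    """Python rendering of the braced condition in the example operand."""
    # The bits of x in common with 55555555, packed together.
    evens = iselect(x, 1431655765)
    # No bits in common with AAAAAAAA, and the packed bits form a solid
    # run of 1s starting at the least significant end.
    return not (x & 2863311530) and evens == xselx(evens)
```

Among the constants below 64, only 0, 1, 5 and 21 satisfy the condition.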
Replacements have much the same syntax as patterns. The expressions are parsed in much the same way; however, one peculiarity of replacements is that bitwidths must be specified. INTERCAL has a typecaster that figures out whether each expression is 16 bits or 32 bits wide, but it runs before the optimizer, and as the optimizer can produce expressions whose bitwidths don’t obey INTERCAL’s rules, this information needs to be inserted somehow in a replacement. In C-INTERCAL, it usually doesn’t matter what the bitwidth is, and in cases where it doesn’t matter the normal operators ($, ~, and so on) can be used. (The bitwidth of the entire replacement may be different from the bitwidth of the original, thus leading to, say, a 32-bit unary logical operation applied to a “16-bit” argument; but this is not a problem, as it just means that there’s an implied typecast in there somewhere.)
In cases where it does matter (due to C-INTERCAL’s lenient interpretation of bitwidth on mingle inputs, the only place it matters is in the input to INTERCAL unary logical operators), both the bitwidth of the operator and the argument on which it operates must be explicitly given, and given as the same value; to set the bitwidth of an operator’s result, simply write the bitwidth (16 or 32 for onespot and twospot respectively) immediately after the operator; for instance, !=32 will generate a not-equals operation with a 32-bit bitwidth. If an operator’s width is set to 16, and during the course of execution of the optimized program, a value that doesn’t fit into 16 bits is encountered, that’s undefined behaviour and anything might happen (most likely, though, the program will just act as though its width had been set to 32 bits instead); this error condition is not detected. Also note that operators like &32 already have a bitwidth specified, so specifying &3232 (or worse, &3216) is not allowed.
Replacement operands are simpler than pattern operands, because there are only a few forms they can take.
_NUMBER
.NUMBER
:NUMBER
This tells the optimiser to copy the operand or expression with reference number NUMBER to this point in the replacement used for the expression matched by the pattern. The three forms are identical; the last two are provided for aesthetic reasons (it can look better and be clearer to match .1 in the pattern with .1 in the replacement, for instance). You cannot use #NUMBER here to copy in a constant from the left-hand side, though, nor #{1}NUMBER, because the first means something else and the second is undefined behaviour (that is, no behaviour for the second case has been specifically implemented in the compiler and therefore its behaviour is unpredictable and subject to change in future versions); use _NUMBER to copy over a constant whose value is unknown at optimizer compile time (but known at optimize time) from the left-hand side, as you can do with any other operand being copied.
#NUMBER
Insert a constant with the literal value NUMBER here.
#{EXPRESSION}0
Calculate the value of EXPRESSION (a C expression, which can reference the same variables and functions as a C expression in a pattern can; see C functions in OIL) and insert a constant with the calculated value here. (That is, a value is calculated at optimise-time and the resulting value is therefore constant at runtime.)
As an example, here’s an idiom that moves C bitwise AND operations inside leftshifts. (This is useful because if the optimizer has generated a large sequence of mixed ANDs and bitshifts, moving all the ANDs to one end allows them to be clumped together and optimized down to one AND, whilst the shifts can all be combined into one large shift.)
((_1 << #{1}2) & #{1}3)->((_1 & #{x3>>x2}0) << _2)
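That this idiom preserves the value of the expression can be checked numerically. The Python sketch below models unsigned 32-bit arithmetic by masking, and assumes shift amounts in the range 0 to 31; the function names are invented for illustration.

```python
import random

MASK32 = 0xFFFFFFFF


def pattern_side(value, shift, mask):
    """Left-hand side of the idiom: (_1 << #2) & #3, in 32-bit arithmetic."""
    return ((value << shift) & MASK32) & mask


def replacement_side(value, shift, mask):
    """Right-hand side of the idiom: (_1 & #{x3>>x2}0) << _2."""
    folded = mask >> shift  # the constant computed once, at optimise time
    return ((value & folded) << shift) & MASK32


def sides_agree(trials=10000, seed=0):
    """Compare both sides on pseudo-random 32-bit inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        value = rng.getrandbits(32)
        mask = rng.getrandbits(32)
        shift = rng.randrange(32)  # assumption: shifts of 32 or more never occur
        if pattern_side(value, shift, mask) != replacement_side(value, shift, mask):
            return False
    return True
```

Both sides keep bit i of the result only where the mask has bit i set, which is why moving the AND inside the shift only requires shifting the mask the other way.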
When writing idioms, sometimes instead of using very complicated expressions to try to match multiple situations at once it’s easier to have a separate idiom for each possible situation; for instance, it’s easier to write idioms for right-shift by 1, right-shift by 2, right-shift by 3, etc., rather than a general idiom to rightshift by any amount. When the idioms follow a pattern, as they will do in basically every case of this sort, it’s possible to automatically generate them using a loop. For instance, idioms to optimize a one-bit rightshift and a two-bit rightshift are:
(_1~#{xselx(x)<<1==x&&x}2)->((_1&_2)>>#1)
(_1~#{xselx(x)<<2==x&&x}2)->((_1&_2)>>#2)
Adding a loop to automatically generate the idioms, and placing a name for the group of idioms at the start, produces the following code:
[rshift]
<#1-#31
(_1~#{xselx(x)<<r==x&&x}2)->((_1&_2)>>#{r}0)
>
That’s 31 different idioms, generated with a loop. As the above example shows, a loop starts with <#NUMBER-#NUMBER and ends with >; a different idiom is generated for each possible value of the loop counter r in the range given by the opening line of the loop. Loops must be placed around idioms, but inside a group of idioms. Note the use of #{r}0 to generate a constant whose value is equal to the value of the loop counter.
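The effect of the loop can be imitated in Python, both by expanding the loop textually and by checking that each generated idiom is sound. This is a sketch under assumptions: `iselect` and `xselx` are modelled from their descriptions, the expansion writes the loop counter as a literal constant, and none of the function names come from C-INTERCAL.

```python
MASK32 = 0xFFFFFFFF


def iselect(a, b):
    """INTERCAL select: keep the bits of `a` where `b` is 1, packed right."""
    result, out = 0, 0
    for i in range(32):
        if (b >> i) & 1:
            result |= ((a >> i) & 1) << out
            out += 1
    return result


def xselx(x):
    """Shorthand for iselect(x, x)."""
    return iselect(x, x)


def generate_rshift_idioms():
    """Expand the loop by hand: one idiom per value of r from 1 to 31."""
    template = "(_1~#{xselx(x)<<%d==x&&x}2)->((_1&_2)>>#%d)"
    return [template % (r, r) for r in range(1, 32)]


def rshift_condition(x, r):
    """The braced test xselx(x)<<r==x&&x for loop counter r."""
    return ((xselx(x) << r) & MASK32) == x and x != 0


def check_rshift_idioms(samples=(0, 1, 0x12345678, 0xDEADBEEF, 0xFFFFFFFF)):
    """Whenever the condition holds, select must equal AND then rightshift."""
    for r in range(1, 32):
        for width in range(1, 33 - r):
            x = ((1 << width) - 1) << r  # a solid run of 1 bits from bit r up
            if not rshift_condition(x, r):
                return False
            for a in samples:
                if iselect(a, x) != (a & x) >> r:
                    return False
    return True
```

The condition accepts exactly the nonzero constants whose 1 bits form one solid run starting at bit r, which is the case where a select degenerates into a mask followed by a shift.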
Here are some tips for the best use of OIL:
MAXTOFREE in oil.y; this isn’t a limit on the number of idioms but on the number of strings that are allocated internally to process the idioms), and lack of error checking (invalid OIL may produce errors in the OIL compiler, or cause the output C code to contain errors or warnings, or may even appear to work).
To finish off this appendix, here’s an example of the power of OIL; this is the optimization of an idiom from the INTERCAL-72 system library, as shown with -H; this should give a good idea of how OIL programs work. (All the relevant idioms are in idiotism.oil as of the time of writing.)
.3 <- ((((((((.3 $ 0x0) ~ (0x7fff $ 0x1)) $ 0x0) ~ (0x7fff $ 0x1)) $ 0x0) ~ (0x3fff $ 0x3)) $ 0x0) ~ (0xfff $ 0xf))
[minglefold]
.3 <- ((((((((.3 $ 0x0) ~ 0x2aaaaaab) $ 0x0) ~ (0x7fff $ 0x1)) $ 0x0) ~ (0x3fff $ 0x3)) $ 0x0) ~ (0xfff $ 0xf))
[lshift16]
.3 <- ((((((((((.3 >> 0x0) & 0x7fff) << 0x1) | 0x0) $ 0x0) ~ (0x7fff $ 0x1)) $ 0x0) ~ (0x3fff $ 0x3)) $ 0x0) ~ (0xfff $ 0xf))
[noopor]
.3 <- (((((((((.3 >> 0x0) & 0x7fff) << 0x1) $ 0x0) ~ (0x7fff $ 0x1)) $ 0x0) ~ (0x3fff $ 0x3)) $ 0x0) ~ (0xfff $ 0xf))
[minglefold]
.3 <- (((((((((.3 >> 0x0) & 0x7fff) << 0x1) $ 0x0) ~ 0x2aaaaaab) $ 0x0) ~ (0x3fff $ 0x3)) $ 0x0) ~ (0xfff $ 0xf))
[lshift16]
.3 <- (((((((((((.3 >> 0x0) & 0x7fff) << 0x1) >> 0x0) & 0x7fff) << 0x1) | 0x0) $ 0x0) ~ (0x3fff $ 0x3)) $ 0x0) ~ (0xfff $ 0xf))
[noopor]
.3 <- ((((((((((.3 >> 0x0) & 0x7fff) << 0x1) >> 0x0) & 0x7fff) << 0x1) $ 0x0) ~ (0x3fff $ 0x3)) $ 0x0) ~ (0xfff $ 0xf))
[minglefold]
.3 <- ((((((((((.3 >> 0x0) & 0x7fff) << 0x1) >> 0x0) & 0x7fff) << 0x1) $ 0x0) ~ 0xaaaaaaf) $ 0x0) ~ (0xfff $ 0xf))
[lshift16]
.3 <- ((((((((((((.3 >> 0x0) & 0x7fff) << 0x1) >> 0x0) & 0x7fff) << 0x1) >> 0x1) & 0x1fff) << 0x3) | 0x0) $ 0x0) ~ (0xfff $ 0xf))
[noopor]
.3 <- (((((((((((.3 >> 0x0) & 0x7fff) << 0x1) >> 0x0) & 0x7fff) << 0x1) >> 0x1) & 0x1fff) << 0x3) $ 0x0) ~ (0xfff $ 0xf))
[minglefold]
.3 <- (((((((((((.3 >> 0x0) & 0x7fff) << 0x1) >> 0x0) & 0x7fff) << 0x1) >> 0x1) & 0x1fff) << 0x3) $ 0x0) ~ 0xaaaaff)
[lshift16]
.3 <- (((((((((((((.3 >> 0x0) & 0x7fff) << 0x1) >> 0x0) & 0x7fff) << 0x1) >> 0x1) & 0x1fff) << 0x3) >> 0x3) & 0x1ff) << 0x7) | 0x0)
[noopor]
.3 <- ((((((((((((.3 >> 0x0) & 0x7fff) << 0x1) >> 0x0) & 0x7fff) << 0x1) >> 0x1) & 0x1fff) << 0x3) >> 0x3) & 0x1ff) << 0x7)
[nullshift]
.3 <- (((((((((((.3 & 0x7fff) << 0x1) >> 0x0) & 0x7fff) << 0x1) >> 0x1) & 0x1fff) << 0x3) >> 0x3) & 0x1ff) << 0x7)
[combinelrshift]
.3 <- ((((((((((.3 & 0x7fff) << 0x1) & 0x7fff) << 0x1) >> 0x1) & 0x1fff) << 0x3) >> 0x3) & 0x1ff) << 0x7)
[andintolshift]
.3 <- ((((((((((.3 & 0x7fff) & 0x3fff) << 0x1) << 0x1) >> 0x1) & 0x1fff) << 0x3) >> 0x3) & 0x1ff) << 0x7)
[combinellshift]
.3 <- (((((((((.3 & 0x7fff) & 0x3fff) << 0x2) >> 0x1) & 0x1fff) << 0x3) >> 0x3) & 0x1ff) << 0x7)
[combinelrshift]
.3 <- ((((((((.3 & 0x7fff) & 0x3fff) << 0x1) & 0x1fff) << 0x3) >> 0x3) & 0x1ff) << 0x7)
[andintolshift]
.3 <- ((((((((.3 & 0x7fff) & 0x3fff) & 0xfff) << 0x1) << 0x3) >> 0x3) & 0x1ff) << 0x7)
[combinellshift]
.3 <- (((((((.3 & 0x7fff) & 0x3fff) & 0xfff) << 0x4) >> 0x3) & 0x1ff) << 0x7)
[combinelrshift]
.3 <- ((((((.3 & 0x7fff) & 0x3fff) & 0xfff) << 0x1) & 0x1ff) << 0x7)
[andintolshift]
.3 <- ((((((.3 & 0x7fff) & 0x3fff) & 0xfff) & 0xff) << 0x1) << 0x7)
[combinellshift]
.3 <- (((((.3 & 0x7fff) & 0x3fff) & 0xfff) & 0xff) << 0x8)
[combineand]
.3 <- ((((.3 & 0x3fff) & 0xfff) & 0xff) << 0x8)
[combineand]
.3 <- (((.3 & 0xfff) & 0xff) << 0x8)
[combineand]
.3 <- ((.3 & 0xff) << 0x8)
Note how the expression is reduced one small step at a time; the smallness of the steps makes the optimizer more general, because if the original expression had been slightly different, the optimizer wouldn’t have come to the same result but could have optimized it quite a bit of the way, up to the point where the optimizations were no longer valid; in an older version of INTERCAL, this idiom was simply hardcoded as a special case and so slight variations of it weren’t optimized at all. If you look at the idioms themselves, it’ll also be apparent how c (the record of which bits of an expression can be 1 and which bits can’t be) is important information in being able to apply an optimization more aggressively.
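The first and last lines of the optimization trace in this example can be compared numerically, as a sanity check that the whole chain of idioms preserved the meaning of the expression. The Python sketch below models mingle and select from their usual INTERCAL definitions; the function names `original` and `optimized` are invented here.

```python
def mingle(a, b):
    """INTERCAL mingle: interleave two 16-bit values, `a` in the odd bits."""
    if a > 0xFFFF or b > 0xFFFF:
        raise ValueError("mingle operands must fit in 16 bits")
    result = 0
    for i in range(16):
        result |= ((a >> i) & 1) << (2 * i + 1)
        result |= ((b >> i) & 1) << (2 * i)
    return result


def iselect(a, b):
    """INTERCAL select: keep the bits of `a` where `b` is 1, packed right."""
    result, out = 0, 0
    for i in range(32):
        if (b >> i) & 1:
            result |= ((a >> i) & 1) << out
            out += 1
    return result


def original(v):
    """The unoptimized expression from the first line of the trace."""
    step = v
    for a, b in ((0x7FFF, 0x1), (0x7FFF, 0x1), (0x3FFF, 0x3), (0xFFF, 0xF)):
        step = iselect(mingle(step, 0x0), mingle(a, b))
    return step


def optimized(v):
    """The fully optimized expression from the last line of the trace."""
    return (v & 0xFF) << 0x8
```

Evaluating `original(v) == optimized(v)` over onespot values of `v` confirms that the mingle-and-select chain is just a leftshift by 8 that discards the top byte.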
The majority of the files in the C-INTERCAL
distribution are licensed under the GNU General Public License
(version 2 or later), but with some exceptions. The files
ick-wrap.c and pickwrap.c are licensed
under a license that allows them to be used for any purpose and
redistributed at will, and are explicitly not GPL’d. This
means that C source code generated by the compiler has the same
copyright conditions as the original INTERCAL
source. (Note that the libraries libick.a and libickmt.a are GPL, though, so you cannot redistribute an executable produced by ick or by linking a C file to either of those libraries unless the original INTERCAL source was GPL.) For similar reasons,
the expansion libraries syslibc.c and
compunex.c are explicitly public domain. Also, this
manual, and the files that are the source code for creating it,
are licensed under the GNU Free Documentation License rather than
the GPL, and the licenses themselves (fdl-1-2.texi
and COPYING.txt) are licensed under a license that
allows verbatim redistribution but not creation of derivative
works. All other files, though (including the man
pages, which are not part of this manual), are licensed
under the GPL. For the full text of the GPL, see the file
COPYING.txt in the distribution.
Copyright © 2000,2001,2002 Free Software Foundation, Inc. 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.
The purpose of this License is to make a manual, textbook, or other functional and useful document free in the sense of freedom: to assure everyone the effective freedom to copy and redistribute it, with or without modifying it, either commercially or noncommercially. Secondarily, this License preserves for the author and publisher a way to get credit for their work, while not being considered responsible for modifications made by others.
This License is a kind of “copyleft”, which means that derivative works of the document must themselves be free in the same sense. It complements the GNU General Public License, which is a copyleft license designed for free software.
We have designed this License in order to use it for manuals for free software, because free software needs free documentation: a free program should come with manuals providing the same freedoms that the software does. But this License is not limited to software manuals; it can be used for any textual work, regardless of subject matter or whether it is published as a printed book. We recommend this License principally for works whose purpose is instruction or reference.
This License applies to any manual or other work, in any medium, that contains a notice placed by the copyright holder saying it can be distributed under the terms of this License. Such a notice grants a world-wide, royalty-free license, unlimited in duration, to use that work under the conditions stated herein. The “Document”, below, refers to any such manual or work. Any member of the public is a licensee, and is addressed as “you”. You accept the license if you copy, modify or distribute the work in a way requiring permission under copyright law.
A “Modified Version” of the Document means any work containing the Document or a portion of it, either copied verbatim, or with modifications and/or translated into another language.
A “Secondary Section” is a named appendix or a front-matter section of the Document that deals exclusively with the relationship of the publishers or authors of the Document to the Document’s overall subject (or to related matters) and contains nothing that could fall directly within that overall subject. (Thus, if the Document is in part a textbook of mathematics, a Secondary Section may not explain any mathematics.) The relationship could be a matter of historical connection with the subject or with related matters, or of legal, commercial, philosophical, ethical or political position regarding them.
The “Invariant Sections” are certain Secondary Sections whose titles are designated, as being those of Invariant Sections, in the notice that says that the Document is released under this License. If a section does not fit the above definition of Secondary then it is not allowed to be designated as Invariant. The Document may contain zero Invariant Sections. If the Document does not identify any Invariant Sections then there are none.
The “Cover Texts” are certain short passages of text that are listed, as Front-Cover Texts or Back-Cover Texts, in the notice that says that the Document is released under this License. A Front-Cover Text may be at most 5 words, and a Back-Cover Text may be at most 25 words.
A “Transparent” copy of the Document means a machine-readable copy, represented in a format whose specification is available to the general public, that is suitable for revising the document straightforwardly with generic text editors or (for images composed of pixels) generic paint programs or (for drawings) some widely available drawing editor, and that is suitable for input to text formatters or for automatic translation to a variety of formats suitable for input to text formatters. A copy made in an otherwise Transparent file format whose markup, or absence of markup, has been arranged to thwart or discourage subsequent modification by readers is not Transparent. An image format is not Transparent if used for any substantial amount of text. A copy that is not “Transparent” is called “Opaque”.
Examples of suitable formats for Transparent copies include plain ASCII without markup, Texinfo input format, LaTeX input format, SGML or XML using a publicly available DTD, and standard-conforming simple HTML, PostScript or PDF designed for human modification. Examples of transparent image formats include PNG, XCF and JPG. Opaque formats include proprietary formats that can be read and edited only by proprietary word processors, SGML or XML for which the DTD and/or processing tools are not generally available, and the machine-generated HTML, PostScript or PDF produced by some word processors for output purposes only.
The “Title Page” means, for a printed book, the title page itself, plus such following pages as are needed to hold, legibly, the material this License requires to appear in the title page. For works in formats which do not have any title page as such, “Title Page” means the text near the most prominent appearance of the work’s title, preceding the beginning of the body of the text.
A section “Entitled XYZ” means a named subunit of the Document whose title either is precisely XYZ or contains XYZ in parentheses following text that translates XYZ in another language. (Here XYZ stands for a specific section name mentioned below, such as “Acknowledgements”, “Dedications”, “Endorsements”, or “History”.) To “Preserve the Title” of such a section when you modify the Document means that it remains a section “Entitled XYZ” according to this definition.
The Document may include Warranty Disclaimers next to the notice which states that this License applies to the Document. These Warranty Disclaimers are considered to be included by reference in this License, but only as regards disclaiming warranties: any other implication that these Warranty Disclaimers may have is void and has no effect on the meaning of this License.
You may copy and distribute the Document in any medium, either commercially or noncommercially, provided that this License, the copyright notices, and the license notice saying this License applies to the Document are reproduced in all copies, and that you add no other conditions whatsoever to those of this License. You may not use technical measures to obstruct or control the reading or further copying of the copies you make or distribute. However, you may accept compensation in exchange for copies. If you distribute a large enough number of copies you must also follow the conditions in section 3.
You may also lend copies, under the same conditions stated above, and you may publicly display copies.
If you publish printed copies (or copies in media that commonly have printed covers) of the Document, numbering more than 100, and the Document’s license notice requires Cover Texts, you must enclose the copies in covers that carry, clearly and legibly, all these Cover Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on the back cover. Both covers must also clearly and legibly identify you as the publisher of these copies. The front cover must present the full title with all words of the title equally prominent and visible. You may add other material on the covers in addition. Copying with changes limited to the covers, as long as they preserve the title of the Document and satisfy these conditions, can be treated as verbatim copying in other respects.
If the required texts for either cover are too voluminous to fit legibly, you should put the first ones listed (as many as fit reasonably) on the actual cover, and continue the rest onto adjacent pages.
If you publish or distribute Opaque copies of the Document numbering more than 100, you must either include a machine-readable Transparent copy along with each Opaque copy, or state in or with each Opaque copy a computer-network location from which the general network-using public has access to download using public-standard network protocols a complete Transparent copy of the Document, free of added material. If you use the latter option, you must take reasonably prudent steps, when you begin distribution of Opaque copies in quantity, to ensure that this Transparent copy will remain thus accessible at the stated location until at least one year after the last time you distribute an Opaque copy (directly or through your agents or retailers) of that edition to the public.
It is requested, but not required, that you contact the authors of the Document well before redistributing any large number of copies, to give them a chance to provide you with an updated version of the Document.
You may copy and distribute a Modified Version of the Document under the conditions of sections 2 and 3 above, provided that you release the Modified Version under precisely this License, with the Modified Version filling the role of the Document, thus licensing distribution and modification of the Modified Version to whoever possesses a copy of it. In addition, you must do these things in the Modified Version:
If the Modified Version includes new front-matter sections or appendices that qualify as Secondary Sections and contain no material copied from the Document, you may at your option designate some or all of these sections as invariant. To do this, add their titles to the list of Invariant Sections in the Modified Version’s license notice. These titles must be distinct from any other section titles.
You may add a section Entitled “Endorsements”, provided it contains nothing but endorsements of your Modified Version by various parties—for example, statements of peer review or that the text has been approved by an organization as the authoritative definition of a standard.
You may add a passage of up to five words as a Front-Cover Text, and a passage of up to 25 words as a Back-Cover Text, to the end of the list of Cover Texts in the Modified Version. Only one passage of Front-Cover Text and one of Back-Cover Text may be added by (or through arrangements made by) any one entity. If the Document already includes a cover text for the same cover, previously added by you or by arrangement made by the same entity you are acting on behalf of, you may not add another; but you may replace the old one, on explicit permission from the previous publisher that added the old one.
The author(s) and publisher(s) of the Document do not by this License give permission to use their names for publicity for or to assert or imply endorsement of any Modified Version.
You may combine the Document with other documents released under this License, under the terms defined in section 4 above for modified versions, provided that you include in the combination all of the Invariant Sections of all of the original documents, unmodified, and list them all as Invariant Sections of your combined work in its license notice, and that you preserve all their Warranty Disclaimers.
The combined work need only contain one copy of this License, and multiple identical Invariant Sections may be replaced with a single copy. If there are multiple Invariant Sections with the same name but different contents, make the title of each such section unique by adding at the end of it, in parentheses, the name of the original author or publisher of that section if known, or else a unique number. Make the same adjustment to the section titles in the list of Invariant Sections in the license notice of the combined work.
In the combination, you must combine any sections Entitled “History” in the various original documents, forming one section Entitled “History”; likewise combine any sections Entitled “Acknowledgements”, and any sections Entitled “Dedications”. You must delete all sections Entitled “Endorsements.”
You may make a collection consisting of the Document and other documents released under this License, and replace the individual copies of this License in the various documents with a single copy that is included in the collection, provided that you follow the rules of this License for verbatim copying of each of the documents in all other respects.
You may extract a single document from such a collection, and distribute it individually under this License, provided you insert a copy of this License into the extracted document, and follow this License in all other respects regarding verbatim copying of that document.
A compilation of the Document or its derivatives with other separate and independent documents or works, in or on a volume of a storage or distribution medium, is called an “aggregate” if the copyright resulting from the compilation is not used to limit the legal rights of the compilation’s users beyond what the individual works permit. When the Document is included in an aggregate, this License does not apply to the other works in the aggregate which are not themselves derivative works of the Document.
If the Cover Text requirement of section 3 is applicable to these copies of the Document, then if the Document is less than one half of the entire aggregate, the Document’s Cover Texts may be placed on covers that bracket the Document within the aggregate, or the electronic equivalent of covers if the Document is in electronic form. Otherwise they must appear on printed covers that bracket the whole aggregate.
Translation is considered a kind of modification, so you may distribute translations of the Document under the terms of section 4. Replacing Invariant Sections with translations requires special permission from their copyright holders, but you may include translations of some or all Invariant Sections in addition to the original versions of these Invariant Sections. You may include a translation of this License, and all the license notices in the Document, and any Warranty Disclaimers, provided that you also include the original English version of this License and the original versions of those notices and disclaimers. In case of a disagreement between the translation and the original version of this License or a notice or disclaimer, the original version will prevail.
If a section in the Document is Entitled “Acknowledgements”, “Dedications”, or “History”, the requirement (section 4) to Preserve its Title (section 1) will typically require changing the actual title.
You may not copy, modify, sublicense, or distribute the Document except as expressly provided for under this License. Any other attempt to copy, modify, sublicense or distribute the Document is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance.
The Free Software Foundation may publish new, revised versions of the GNU Free Documentation License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. See http://www.gnu.org/copyleft/.
Each version of the License is given a distinguishing version number. If the Document specifies that a particular numbered version of this License “or any later version” applies to it, you have the option of following the terms and conditions either of that specified version or of any later version that has been published (not as a draft) by the Free Software Foundation. If the Document does not specify a version number of this License, you may choose any version ever published (not as a draft) by the Free Software Foundation.
To use this License in a document you have written, include a copy of the License in the document and put the following copyright and license notices just after the title page:
Copyright (C) year your name. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled ``GNU Free Documentation License''.
If you have Invariant Sections, Front-Cover Texts and Back-Cover Texts, replace the “with…Texts.” line with this:
with the Invariant Sections being list their titles, with the Front-Cover Texts being list, and with the Back-Cover Texts being list.
If you have Invariant Sections without Cover Texts, or some other combination of the three, merge those two alternatives to suit the situation.
If your document contains nontrivial examples of program code, we recommend releasing these examples in parallel under your choice of free software license, such as the GNU General Public License, to permit their use in free software.
This is the index of everything in this manual. (Note that in some versions of the manual this is called ‘Main Index’ to prevent it transforming into a page called index.html in the HTML version of the manual. The complications that that caused were really odd.)