Are transpiling techniques different than compiling techniques?


Roger L Costello

Oct 11, 2021, 1:55:59 PM
Hi Folks,

Today I learned a new word: transpiling

I looked it up and learned that it means converting source code in one
language into source code in another. See below.

"Neat!" I thought. "I am converting a military air navigation data format to a
civilian air navigation data format, which is a kind of transpiling, I think.
I wonder if there are techniques specific to transpiling?"

Is there a book or tutorial on how to build a transpiler? Are there techniques
unique to transpilers?

/Roger

-----------------------------------------------

Compiler: an umbrella term for a program that takes source code
written in one language and produces one or more output files in some other
language. In practice we mostly use this term to describe a compiler such as
gcc which takes in C code as input and produces a binary executable (machine
code) as output.

Transpilers are also known as source-to-source compilers. So in essence they
are a subset of compilers which take in a source code file and convert it to
another source code file in some other language or a different version of the
same language. The output is generally understandable by a human. This output
still has to go through a compiler or interpreter to be able to run on the
machine.

Some examples of transpilers:
1. Emscripten <https://kripken.github.io/emscripten-site/>: Transpiles C/C++
to JavaScript
2. Babel <https://babeljs.io/>: Transpiles ES6+ code to ES5 (ES6 and ES5 are
different versions or generations of the JavaScript language)

https://stackoverflow.com/questions/44931479/compiling-vs-transpiling

[Back in the day, the term was "sift", from a translator from Fortran
II to Fortran IV written in 1962. In the late 1960s IBM had a Fortran
to PL/I translator which worked (I used it) but generated ugly code
due to all the places where the semantics of PL/I were almost but not
quite the same as similar looking Fortran constructs:

http://bitsavers.org/pdf/ibm/360/fortran/GC33-2002-2_FORTRAN_To_PL1_Translator_Jan73.pdf

I think you will find two approaches. There's the half-hearted one in which
it translates constructs into corresponding ones and hopes the differences
don't matter, and the full one that is a real compiler with all of the
usual analyses and a code generator that happens to generate another high
level language. The f2c Fortran to C translator is an example:

https://www.netlib.org/f2c/f2c.pdf

-John]

Kartik Agaram

Oct 12, 2021, 11:17:25 AM
On a slight tangent, I've never liked the term "compiler". I prefer
"translator". "Translator" maps well with "interpreter" when talking about
natural languages. That seems like a good reason to also use it for
computer languages.

Bringing it back to this thread, I think the difference between compilers
and transpilers is largely meaningless. They're both just translators.

[It is about 65 years too late to change "compiler". On the other
hand, approximately nobody uses "transpiler" and we can use something
less cute like translator, or the classic SIFT. -John]

Detlef Meyer-Eltz

Oct 12, 2021, 11:18:17 AM
I have been working for years on the Delphi to C++ translator "Delphi2Cpp",
without being aware that this kind of software is called a "transpiler".

https://www.texttransformer.com/Delphi2Cpp_en.html

What might come close to a special transpiler technique are "rewrite
rules" on syntax trees. But I use a naive approach with no mysterious
transpiler theory in the background. I will briefly describe the steps
that are carried out during conversion:

1. the Delphi source code is pre-processed according to the set conditions
2. the resulting reduced code is parsed to build a syntax tree
3. the syntax tree is pre-processed to calculate some information needed
for the output.
4. the syntax tree is output as C++ code

For the first two steps I use my own parser generator, "TextTransformer".
The first step can be regarded as a kind of
compilation/"transpilation" of its own. An example for the third step is
the calculation of the variables that have to be passed to
sub-functions, when nested functions are unbundled. A lot of manual work
has to be done for the fourth step. Numerous special cases have to be
hard-coded there, as there is no simple deduction relationship between
the source language and the target language. Some Delphi constructs
cannot be converted at all. But C++ is more powerful than Delphi, so
that many Delphi constructs can be reconstructed or simulated in C++. A
converter the other way round would be quite poor. The power of a
language could be part of a transpiler theory.
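
The four steps listed earlier in this post can be sketched roughly as follows. This is only a toy illustration, not how Delphi2Cpp or TextTransformer actually work: the function names, the line-based "parser", the restriction to simple assignments and the assumption that every variable is an int are all invented for the example. Only the {$IFDEF}/{$ENDIF} directive syntax is taken from Delphi.

```python
import re


def preprocess(source, defines):
    """Step 1: keep only the lines enabled by {$IFDEF NAME} ... {$ENDIF}."""
    out, keep = [], [True]
    for line in source.splitlines():
        stripped = line.strip()
        m = re.fullmatch(r"\{\$IFDEF (\w+)\}", stripped)
        if m:
            keep.append(keep[-1] and m.group(1) in defines)
        elif stripped == "{$ENDIF}":
            keep.pop()
        elif keep[-1]:
            out.append(line)
    return "\n".join(out)


def parse(source):
    """Step 2: build a (very flat) syntax tree of ('assign', name, expr) nodes."""
    tree = []
    for line in source.splitlines():
        m = re.fullmatch(r"\s*(\w+)\s*:=\s*(.+?);\s*", line)
        if m:
            tree.append(("assign", m.group(1), m.group(2)))
    return tree


def analyze(tree):
    """Step 3: compute information the output needs, here the variables to declare."""
    declared = []
    for _, name, _ in tree:
        if name not in declared:
            declared.append(name)
    return declared


def emit(tree, declared):
    """Step 4: write the tree out as C++-like text (all variables assumed int)."""
    lines = [f"int {name};" for name in declared]
    lines += [f"{name} = {expr};" for _, name, expr in tree]
    return "\n".join(lines)


def translate(source, defines):
    tree = parse(preprocess(source, defines))
    return emit(tree, analyze(tree))


SOURCE = "x := 1;\n{$IFDEF DEBUG}\ny := x + 1;\n{$ENDIF}\nx := 2;"
print(translate(SOURCE, {"DEBUG"}))
```

Note that the sketch copies expressions through verbatim, which only works while the two languages happen to agree on them. That is exactly where the hard-coded special cases of the fourth step come in.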

In contrast to a compiler, which has to be fast because it is used over
and over again in the development of software, the speed of the
transpiler does not matter: ideally, it only has to be used once to do
its job.


Detlef

jan van katwijk

Oct 12, 2021, 10:12:04 PM
A long time ago I wrote an Algol 60 to C translator.
Not one where "some intermediate machine" is defined and implemented in C,
but one where each Algol 60 construct is mapped onto a (hopefully)
semantically equivalent C construct.

Looking at the translation process, it is just a simplified compiler,
with a parser, a scan for name resolution, a scan to generate an
include file and a scan to map Algol procedures to C procedures.

Apart from handling by-name parameters (to be mapped onto almost
parameterless procedures), function parameters (in Algol one does
not specify the parameter profile of a formal procedure parameter) and,
to a certain extent, switches and labels as parameters, it is fairly
straightforward (an extensive description is available, see
https://github.com/JvanKatwijk/algol-60-compiler).
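
To illustrate the by-name mapping mentioned above: a by-name parameter can be passed as a small parameterless procedure (a "thunk") that re-evaluates the actual argument each time it is used, plus a setter when the callee assigns to it. The sketch below uses Python closures and the classic "Jensen's device" summation. It is a generic illustration with invented names, not code from the translator linked above, which generates C.

```python
def sum_by_name(set_i, lo, hi, term):
    """Models an Algol 60 procedure sum(i, lo, hi, term) with i and term by name.

    set_i: setter thunk for the by-name variable i
    term:  parameterless thunk that re-evaluates the actual expression
    """
    total = 0
    for k in range(lo, hi + 1):
        set_i(k)         # assignment to the by-name parameter i
        total += term()  # the expression is evaluated afresh on every use
    return total


def caller():
    # The caller's local variable lives in a small environment so that the
    # thunks can both read and update it.
    env = {"i": 0}

    def set_i(value):
        env["i"] = value

    # Corresponds to the Algol call sum(i, 1, 4, i * i)
    return sum_by_name(set_i, 1, 4, lambda: env["i"] * env["i"])


print(caller())  # 1 + 4 + 9 + 16 = 30
```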

I would not give it another name than translator or compiler.

Of course mapping any language to any other language may cause
problems. In the 1980s we made a subset A60 to Ada translator, and
direct mapping of by-name parameters and things like non-local gotos
is not really possible (but then, the programs that needed to be
translated were simply structured, so apart from a few gotos there were
no big problems).

jan



On Tue, 12 Oct 2021 at 17:18, Detlef Meyer-Eltz <
Meyer...@t-online.de> wrote:

> I have been working for years on the Delphi to C++ translator "Delphi2Cpp",
> without being aware that this kind of software is called a "transpiler".
> [ by some people ]

Hans-Peter Diettrich

Oct 12, 2021, 10:12:35 PM
On 10/12/21 11:34 AM, Detlef Meyer-Eltz wrote:
> I have been working for years on the Delphi to C++ translator "Delphi2Cpp",
> without being aware that this kind of software is called a "transpiler".
>
> https://www.texttransformer.com/Delphi2Cpp_en.html

Hi Detlef, I find your "TextTransformer" quite a good name :-)

> In contrast to a compiler, which has to be fast because it is used over
> and over again in the development of software, the speed of the
> transpiler does not matter: ideally, it only has to be used once to do
> its job.

Depending on the project type, all (daily...) updates of the original have
to be translated anew, with the risk of newly introduced bugs that require
verification of each translation.

DoDi

Hans-Peter Diettrich

Oct 12, 2021, 10:13:03 PM
On 10/11/21 8:23 PM, Kartik Agaram wrote:
> On a slight tangent, I've never liked the term "compiler". I prefer
> "translator". "Translator" maps well with "interpreter" when talking about
> natural languages. That seems like a good reason to also use it for
> computer languages.
>
> Bringing it back to this thread, I think the difference between compilers
> and transpilers is largely meaningless. They're both just translators.

I'd classify both by I/O type, as one does with lexers and parsers: a
compiler translates source text into *binary* code, a transpiler into
another source *text*.

The term "transpiler" is IMO a relic from the time when translation of human
speech was the domain of humans, used to deprecate the output of translation
programs. While automated translation really sucked for decades, in
recent years I have often found human translations and presentations less
precise or meaningful than automated translation.

DoDi

Christopher F Clark

Oct 13, 2021, 9:17:01 PM
In this interesting thread, Detlef wrote:
> In contrast to a compiler, which has to be fast because it is used over
> and over again in the development of software, the speed of the
> transpiler does not matter: ideally, it only has to be used once to do
> its job.

Sometimes, this is true, sometimes not.

Twice in my career I worked on projects which developed a transpiler.

In the first case, it was true.

My mentor on that project developed a Jovial to PL/I transpiler using PL/I's
macro facility. We only used it to bootstrap the "real" Jovial compiler
to Multics, which was also only a bootstrap to get the Interdata 8/32
Jovial compiler to work. And, the Jovial to PL/I transpiler didn't have to
be real accurate nor fast nor deal with the entire language, just good
enough to get the compiler bootstrapped. We may have done the
bootstrap several dozen times (but probably not several hundred)
during the development, but once the Multics Jovial compiler was
working, we never did it again. It was a throw away transpiler.

In the second case, it was not.

At Intel I was part of a CAD team for chip design tools. The tool we
built was called "VMOD" (I don't remember what that stood for). Anyway,
it had one part that allowed designers to draw gates and wire graphically.
That was the initial impetus to the project. However, it was known that
there were places where the design would be better expressed in Verilog
and we allowed the user to drop "combo" boxes (short for combinatorial
logic boxes) that looked vaguely like "chips" (i.e. they were rectangular)
and had "pins" around the edges for connecting to graphical wires.
But in those combo boxes you could use a text editor to input Verilog code,
which could refer to the pins to communicate with the graphical model.

Anyway, I did the Verilog compiler (and also the "compiler" for the graphical
gates--they were actually the same compiler and used the same IR).
But, the technology for both was transpilation. For simulation of the chip,
we transpiled to C++ code and used Visual C++ as our "backend". Our
runtime library was also written in C++. So, we got out a C++ version
of the "chip" that did a cycle accurate model of what the real chip would
do and it hooked back to the graphic model for certain aspects of debugging,
but it also generated log files and allowed debugging of the C++ code.

However, for that code, we didn't just translate once. The design
work was on-going (the tool was used to design about a dozen or two
chips over its useful life), and simulating the design so that the
developers could get the chip(s) right meant it needed to run at
something approximating C speed. We managed that, and it was thus about
1000x the performance of the previous simulators that Intel had been
using. In fact, it was fast enough that we were asked to do a pure
Verilog version (no graphic support) for teams that were coding in
Verilog and didn't buy into the development model that the tool was
designed to promote. So, it became a Verilog to C++ transpiler.

Of course, being for chip design, the other important aspect was synthesizing
real gates. For synthesis, we transpiled the graphic model into Verilog
and that included transpiling the Verilog combo box code into Verilog.
Now, that was mostly an identity transform except for hooking up the pins,
dealing with name collisions (e.g. renaming multiple copies of the same
box to unique names) and a few clock related portions. When it was
acting as a Verilog compiler, only the name collision and clock related
portions were relevant as there were no graphic gates and no pins.

And, by the way, there was no big secret to achieving approximate C speed.
We got it because we let Visual Studio do all the heavy lifting of optimization
and code generation. There was no way I was going to compete with
them on that aspect and no need to. Thus, transpiling gave us a reasonably
good compiler for a fraction of the effort.

And the main thing we had to do was deal with the fact that in Verilog
each "bit" has 4 states: 0, 1, x and z. And the x and z states of a bit
are used in stylized ways (x means invalid and z means don't care).
So, we did a small amount of analysis to detect if the gates and wires
under consideration could have x or z values and if so, used the more
complex logic that got those values correct. We mapped each bit to a
2-bit pair, so that we had 4 states to use, and we did it FORTRAN
"column major" style, so that the bits for 0,1 were in one contiguous
array for the width of the wire/bus they were representing and the
bits indicating that 0,1 was really x, z were in a parallel array. A
quick check of the 2nd array for all 0s often allowed us to not deal
with it at all for some wire/bus. And, if we statically determined
that none of the bits on that bus would ever be x or z, we didn't need
that 2nd array at all. So, things like adders could then use the
normal simple C/C++ logic for addition and not have to do a bit-by-bit
version of it.
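
A minimal sketch of that two-array layout, using Python integers as the bit arrays. The class and function names and the exact encoding (a set bit in the second plane turns 0 into x and 1 into z) are assumptions made for this illustration; the real tool generated C++ and may well have encoded things differently. The slow paths follow standard Verilog semantics, where arithmetic on any x or z bit yields an all-x result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bus:
    width: int
    val: int     # plane 1: the plain 0/1 bits
    xz: int = 0  # plane 2: a set bit means the val bit is really x (0) or z (1)

    @property
    def mask(self):
        return (1 << self.width) - 1


def add(a, b):
    """Addition with a fast path when no bit of either operand is x or z."""
    if a.xz == 0 and b.xz == 0:
        return Bus(a.width, (a.val + b.val) & a.mask)  # ordinary machine add
    return Bus(a.width, 0, a.mask)                     # otherwise all bits are x


def bit_and(a, b):
    """Bitwise AND: a known 0 on either side wins, 1 & 1 is 1, the rest is x."""
    if a.xz == 0 and b.xz == 0:
        return Bus(a.width, a.val & b.val)             # fast path
    mask = a.mask
    zero = ((~a.val & ~a.xz) | (~b.val & ~b.xz)) & mask
    one = (a.val & ~a.xz) & (b.val & ~b.xz) & mask
    return Bus(a.width, one, mask & ~(zero | one))


def to_str(bus):
    """Render the bus most significant bit first using the characters 0 1 x z."""
    chars = []
    for i in reversed(range(bus.width)):
        v, u = (bus.val >> i) & 1, (bus.xz >> i) & 1
        chars.append("01xz"[2 * u + v])
    return "".join(chars)


print(to_str(add(Bus(4, 0b0110), Bus(4, 0b1011))))
print(to_str(bit_and(Bus(4, 0b0011), Bus(4, 0b0100, 0b0101))))
```

When static analysis shows that a bus can never carry x or z, the xz plane and the slow paths can be dropped entirely, leaving only the plain integer operations.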

We also had special code for clocks, because they couldn't be x or z, but most
flip-flops are edge triggered, so you want to distinguish rising edge from high
(or low) and the same for falling edge. And all the gates were partitioned into
what edge or level they were sensitive to and we only ran the code when the
relevant clock was in that state.
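
The partitioning by clock sensitivity might look something like the sketch below: each piece of generated logic is registered under the edge or level it reacts to, and a clock change runs only the matching groups. The ClockDomain class and its event names are hypothetical and are not taken from VMOD.

```python
from collections import defaultdict


class ClockDomain:
    def __init__(self):
        self.level = 0
        # "posedge" / "negedge" / "high" / "low" -> list of callbacks
        self.groups = defaultdict(list)

    def on(self, event, fn):
        """Register generated logic under the edge or level it is sensitive to."""
        self.groups[event].append(fn)

    def drive(self, new_level):
        """Set the clock and run only the groups relevant to the new state."""
        events = []
        if self.level == 0 and new_level == 1:
            events.append("posedge")
        elif self.level == 1 and new_level == 0:
            events.append("negedge")
        events.append("high" if new_level else "low")
        self.level = new_level
        for event in events:
            for fn in self.groups[event]:
                fn()


clk = ClockDomain()
log = []
for name in ("posedge", "negedge", "high", "low"):
    clk.on(name, lambda name=name: log.append(name))
for level in (1, 1, 0, 0, 1):
    clk.drive(level)
print(log)
```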
--
******************************************************************************
Chris Clark email: christoph...@compiler-resources.com
Compiler Resources, Inc. Web Site: http://world.std.com/~compres
23 Bailey Rd voice: (508) 435-5016
Berlin, MA 01503 USA twitter: @intel_chris
------------------------------------------------------------------------------

Kaz Kylheku

Oct 16, 2021, 1:47:29 PM
On 2021-10-11, Kartik Agaram <a...@akkartik.com> wrote:
> On a slight tangent, I've never liked the term "compiler". I prefer
> "translator". "Translator" maps well with "interpreter" when talking about
> natural languages. That seems like a good reason to also use it for
> computer languages.

Back in the day of Grace Hopper working on Fortran, the terms were
different from today. The "tran" in Fortran of course stands for
translation.

Back then, the word "coding" stood for taking a program (e.g. written by
hand on paper in pseudo-code) and turning it into a machine-language
computer program: among the last steps of programming. Today, we have
"source code" and producing it is coding.

The term "automatic coding" denoted the situation when a computer
was programmed to do the coding: taking a higher level description of the
program and translating it to machine language.

"Compiling" existed; that referred to something that is more like
"linking" or "loading" today, or perhaps the preparation of an archive
containing object files. It had the obvious meaning: sticking together
routines to create a collection.

Somehow "compile" came to have the meaning to include the translation
step too. Perhaps because some of the steps came to be combined into one
tool invocation.

"To compile" is an attractive word in that it means putting stuff
together, but is only used in specialized circumstances. You don't
usually say that you compiled the clothes after taking them out of the
dryer, or that you compiled the toppings onto the sandwich, or that many
responsibilities have been compiled upon your shoulders. It's not a
commonly used word. It is mostly used in the context of combining
multiple published works, which is a very specific meaning.

That's the big reason why it was possible to give the word a technical
meaning that is clear to the point that we can use "compile" almost entirely
out of context (other than it being clear it's a computing context) and
we know what kind of activity it refers to.

"To translate" is not so: do you mean C++ to assembly, or English to
German? Translating what: people translating user interfaces or
documentation to another language? Or the machine translating something?
Translate is also a term in English-language mathematics: to displace
coordinates. This happens in computing: logical window-relative
coordinates get translated to a pixel coordinate in the display buffer.
In memory management, virtual addresses get translated to physical
addresses.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal

Kaz Kylheku

Oct 16, 2021, 1:48:14 PM
On 2021-10-12, Detlef Meyer-Eltz <Meyer...@t-online.de> wrote:
> I have been working for years on the Delphi to C++ translator "Delphi2Cpp",
> without being aware that this kind of software is called a "transpiler".

It isn't; that's just a word used by some web programming hipsters.
Transpilers are everywhere, because browsers are stuck with Javascript
as their lowest-level target language*, and it sucks so terribly that
people want to use almost anything else. The bar is quite low; it's easy
to write toy languages that spit out Javascript, so it has become a kind
of popular sport, and from there came "transpiling".

---
* I know what Webassembly is; it's a gadget for expressing lower-level
computations with machine-oriented types, to complement and accompany
Javascript; it is not a replacement for Javascript.

Thomas Koenig

Oct 16, 2021, 6:14:05 PM
Kaz Kylheku <480-99...@kylheku.com> schrieb:

> Back in the day of Grace Hopper working on Fortran, the terms were
> different from today. The "tran" in Fortran of course stands for
> translation.

Grace Hopper working on Fortran? Hardly, you probably meant John Backus.

Hans-Peter Diettrich

Oct 16, 2021, 6:15:57 PM
On 10/16/21 7:16 PM, Kaz Kylheku wrote:

> "To compile" is an attractive word in that it means putting stuff
> together

The same applies to "assemble" at machine level.

I could imagine that at that time the result was more important than
sophisticated handling of source code. A portable C compiler also is
assumed to output executable modules where other compilers rely on a linker.

DoDi
[The Bell Labs portable C compiler output assembler source code, although
most people didn't notice since it normally assembled it and threw the
assembler code away. Last time I checked gcc and clang do the same. -John]

Hans-Peter Diettrich

Oct 17, 2021, 2:27:23 PM
On 10/16/21 11:55 PM, Hans-Peter Diettrich wrote:

> [The Bell Labs portable C compiler output assembler source code, although
> most people didn't notice since it normally assembled it and threw the
> assembler code away.  Last time I checked gcc and clang do the same. -John]

I meant the final executable result is (can be) generated from source
code by a single C compiler invocation. How this result is obtained in
detail, in how many passes, by how many related tools, is not so obvious
and of less interest to the user.

Nowadays dedicated managing tools are available, starting with (batch)
Make and a number of (interactive) Integrated Development Environments.
Here the compiler can be recognized as a source code translation part of
the system, not as the all-embracing process.

DoDi

gah4

Oct 17, 2021, 2:38:50 PM
On Saturday, October 16, 2021 at 10:48:14 AM UTC-7, Kaz Kylheku wrote:

(snip on the word transpiler)

> It isn't; that's just a word used by some web programming hipsters.
> Transpilers are everywhere, because browsers are stuck with Javascript
> as their lowest-level target language*, and it sucks so terribly that
> people want to use almost anything else. The bar is quite low; it's easy
> to write toy languages that spit out Javascript, so it has become a kind
> of popular sport, and from there came "transpiling".

In the 1970's, programs to improve Fortran were common,
with Ratfor and Mortran as two examples.
(That is, Fortran IV or Fortran 66.)

The ones I know were written as macro processors, where macros
match some strings in the input data, along with arguments, and replace
them with new strings. At least for the Mortran processor, macros can
create or modify macros. A fairly simple processor, then, allows for a
somewhat complicated language.

One problem, though, is that such processors don't fully parse
the input. Syntax errors in the input produce some strange output,
and strange errors from the final compiler.
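
A rough sketch of that style of processor, and of the failure mode: each macro is a pattern plus a replacement template, applied line by line with no real parse. The two macros here ("+=" and "unless") are made up for the example and are not actual Ratfor or Mortran syntax, and unlike Mortran this toy cannot define new macros from within the input.

```python
import re

# (pattern, replacement template) pairs; both macros are hypothetical.
MACROS = [
    # name += expr           ->  name = name + (expr)
    (re.compile(r"^(\s*)(\w+)\s*\+=\s*(.+)$"), r"\1\2 = \2 + (\3)"),
    # unless (cond) stmt     ->  IF (.NOT. (cond)) stmt
    (re.compile(r"^(\s*)unless\s*\(([^()]*)\)\s*(.+)$"), r"\1IF (.NOT. (\2)) \3"),
]


def expand(source):
    """Apply the first matching macro to each line; other lines pass through."""
    out = []
    for line in source.splitlines():
        for pattern, template in MACROS:
            line, count = pattern.subn(template, line)
            if count:
                break
        out.append(line)
    return "\n".join(out)


print(expand("      total += x * 2"))
print(expand("      unless (n .gt. 0) n = 1"))
# A typo such as "+ =" matches no macro, so it is copied through unchanged
# and only the final Fortran compiler will complain about it.
print(expand("      total + = x"))
```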

It does seem that there are some macro processors for use with Javascript.
[Ratfor used a yacc grammar, which is why early versions of yacc could produce
ratfor output. As you note, it didn't understand all of Fortran so it let
syntax errors through, which is why I did my PDP-10 hack to put the source
line numbers in the Fortran output, to help figure out where the bug is.
I later wrote a full Fortran 77 parser which was awful. No wonder they
didn't try to do it in ratfor. -John]

gah4

Oct 17, 2021, 6:10:57 PM
On Sunday, October 17, 2021 at 11:27:23 AM UTC-7, Hans-Peter Diettrich wrote:

(snip on compilers generating assembly source code.)

> I meant the final executable result is (can be) generated from source
> code by a single C compiler invocation. How this result is obtained in
> detail, in how many passes, by how many related tools, is not so obvious
> and of less interest to the user.

Unix tradition, and still supported by gcc, is to stop after generating the
assembly source file, with the -S option.

Some compilers allow mixing assembly code in with the source language.
Seeing the combined result makes it easier to debug. (Or edit the
generated file before sending it to the assembler.)

Many compilers that don't write an assemblable output file, will generate
a pseudo-assembly listing. Enough to figure out what the generated code
does, but usually nowhere close to input to an assembler.

I mostly learned OS/360 assembly language reading the generated
code listings from the Fortran compilers.
[The code from Fortran G was putrid, but from Fortran H and its successors pretty impressive. -John]