How to Build a Compiler?

A generalization of compiling code involves understanding the programming language grammar, parsing it, organizing it into a syntax tree, and then evaluating expressions.

Today, in the Advanced JavaScript Developers group, someone asked the question:

Musing from a Noob: so if you were to write a translator from one language to another, say JS to Java or something arbitrary, would that translator be written almost entirely in like regular expressions or something? My google-fu is solid, but I'm trying to get some professional opinions first.... What y'all think?

I'm no professional in compiler development (I never got a chance to take Brock University's awesome compiler course), but I once wrote a LISP interpreter in Java (takes in some LISP code and executes it).

For this, you might be able to hack your way through it with regular expressions, but I think the proper way is to build an Abstract Syntax Tree.

In all programming languages, there's the concept of a grammar. The grammar is something that defines all legal expressions in a language (much like English does).

For example, in JavaScript, we're allowed to do:

const x = 12

But we can't do:

const x y z = 12

The grammar that formalizes this might look something like:

"[variable type] [variable name] = [value]"

And we didn't follow that rule in the second example.

If languages each have a set of grammar rules, and those rules are all the legal expressions, then there are primarily two parts to building a compiler.

  • Establish the language rules (grammar)
  • Be able to read a file, parse it, then build an validate an Abstract Syntax Tree from that grammar. If you can't build the syntax tree, it's because some grammar is wrong (this is the concept of a syntax error).

There's also the validation of the tree itself.

For example, in Java, for a single tree, where a tree is a file (a program, like myApp.java), you're only allowed to have the public static void main (String[] args) once in the program, because that's the entry point.

So I think in order to PROPERLY do this conversion, you need to:

  1. Build the grammar for JavaScript
  2. Build the grammar for Java
  3. Know what the conversion rules from one grammar to the other are
  4. Build the syntax tree for JS code
  5. For each node (or a group of nodes) in the tree, run it against the conversion rules for Java and build a new syntax tree (a Java one).
  6. Convert the syntax tree to Java source code!
  7. (optional) run it through a prettier thing to make it look clean :p

Definitely leave a comment below if you know a bit about the subject, you've built a compiler before, or you have some useful resources for others.

Resources

I found this page, Introduction to Programming Languages/Grammars. It seems really useful.

There's also Babel, a modern JavaScript compiler that you might have used. I suspect peeking at the source code here would be insightful :)

Other stuff:



Discussion

Liked this? Sing it loud and proud 👨‍🎤.


0 Comments

Be the first to leave a comment

Submit

Stay in touch!



About the author

Khalil Stemmler,
Developer Advocate @ Apollo GraphQL ⚡

Khalil is a software developer, writer, and musician. He frequently publishes articles about Domain-Driven Design, software design and Advanced TypeScript & Node.js best practices for large-scale applications.



View more in Software Engineering



You may also enjoy...

A few more related articles

Client-Side Architecture Basics [Guide]
Though the tools we use to build client-side web apps have changed substantially over the years, the fundamental principles behind...
Absolute and Relative Complexity
Determining if the complexity of a problem is related to the nature of the problem or related to the way that we approach solving ...
How I Write Testable Code | Khalil's Simple Methodology
The single biggest thing that improved the quality of my designs was understanding how dependencies influence my ability to write ...
GraphQL's Greatest Architectural Advantages
There's a lot of advantages to using GraphQL instead of REST on your server, and there's a lot of advantages to using Apollo Clien...

Want to be notified when new content comes out?

Join 8000+ other developers learning about Domain-Driven Design and Enterprise Node.js.

I won't spam ya. 🖖 Unsubscribe anytime.

Get updates