Show HN: My Turbo Pascal compiler in JavaScript (teamten.com)
101 points by lkesteloot on Oct 18, 2013 | 81 comments



Your project is great, Pascal is a really underrated language.

I looked at the parser (hand-written ones are best), and this part caught my eye: https://github.com/lkesteloot/turbopascal/blob/master/Parser... IMHO the order of checking whether the node is an identifier or not has to be reversed. Right now, this simple program:

    program Test;

    procedure foo;
    begin
    end;

    begin
      foo := 10;
    end.

produces error 'Error: can't cast from 70 to 76 ("10", line 8)' - whatever this means. :)


Well the error message is confusing (it's trying to cast from an integer (type 70) to a procedure (type 76)), but it's correct to fail to compile. But yeah I'm not sure that my logic there is altogether correct. I re-wrote it a few times as I found new cases that weren't correctly handled. Specifically, I'm not sure whether the type of the identifier should guide the parsing. It's similar to the "x*y" ambiguity in parsing C (multiplication or declaration of pointer variable y).
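To make the question concrete, here is a minimal sketch of a parser that looks up the identifier's kind before parsing the rest of the statement, so that assigning to a procedure gets a readable message instead of a numeric cast error. It is not the project's actual code: `Symbol`, `ParseError`, `parse_statement`, and the string kinds are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical symbol kinds; the real compiler uses numeric type codes (e.g. 70, 76).
VARIABLE = "variable"
PROCEDURE = "procedure"


@dataclass
class Symbol:
    name: str
    kind: str


class ParseError(Exception):
    pass


def parse_statement(tokens, symbols):
    """Parse `name := value` or a bare procedure call `name`.

    `tokens` is a list of strings; `symbols` maps lowercase names to Symbol.
    The identifier's kind is checked before the rest of the statement is parsed.
    """
    name = tokens[0]
    symbol = symbols.get(name.lower())
    if symbol is None:
        raise ParseError(f"unknown identifier '{name}'")

    is_assignment = len(tokens) >= 3 and tokens[1] == ":="
    if is_assignment:
        # Reject the assignment up front, with a message that names the problem.
        if symbol.kind != VARIABLE:
            raise ParseError(f"cannot assign to {symbol.kind} '{name}'")
        return ("assign", name, tokens[2])

    if symbol.kind == PROCEDURE:
        return ("call", name)
    raise ParseError(f"expected ':=' after variable '{name}'")
```

With `foo` registered as a procedure, `parse_statement(["foo", ":=", "10"], symbols)` raises `ParseError("cannot assign to procedure 'foo'")`. A real Pascal parser would also have to allow assigning to a function's own name inside its body, which this sketch ignores.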


Very nice indeed.

It was a pity that C took over the PC world as it did, Turbo Pascal could do everything better than it, including being safe by default, modules, OO.

Why Borland, why!


I was at Apple when they were transitioning from Pascal (which had driven much of the Lisa and Macintosh) to C (and some C++).

With the exception of a few curmudgeons, C was the way to go for most people. You lost nested procedures, but the stuff you got back was immense. Pascal's big problem was that the extensions required to make it a "real" systems programming language (arbitrary memory access, I/O that didn't suck, etc.) were not standardized; good luck porting anything. Pascal strings were effectively broken (with a proliferation of Str255 / Str64 / Str32 types that were effectively incompatible unless you cheated).

C, while it still lacked a standard, had everything you needed out of the box, and the direction for a standard was clear.

Pascal became a legacy language and died at Apple in the early 90s.


True, Pascal's main problem was that ISO Extended Pascal arrived too late and Turbo Pascal was a PC only thing.

I only moved into C land (actually C++), when I started developing for UNIX as well.

I still remember, after my first Xenix session, asking my teacher about using Turbo Pascal and being disappointed that it was not available.

> C, while it still lacked a standard, had everything you needed out of the box, and the direction for a standard was clear.

Except modules and namespaces, and it was unsafe by default.


I never liked Pascal verbosity back in school but I loved C right away... Maybe it is just a matter of taste?


IMHO it's a matter of experience. When I was young I believed that programming is about writing a lot of code that "just works".

Now, after many private and commercial projects, I know that it is about reading and understanding code. And about eliminating bugs. :)


Pascal is a safer language overall, which I would argue is a good thing. C eschewed many safety features for slightly (but not significantly) more power.


Having written TSRs in Pascal and C I have to disagree. C had significantly more power and was a lot simpler to get working for lower level stuff.


Only if you are comparing to standard Pascal.

Turbo Pascal had all the capabilities of C as well. I can produce a one to one table, if you wish.

Additionally after version 5.5, it also had support for OO.


I never understood the complaints about verbosity in Pascal languages and the escape to C, Perl and friends.

Why not use APL then?

Programs are to be read, not to be write-only.


Pascal verbosity does not add information, but noise. As humans, we recognize symbols and structures better than words. {} as delimiters make much more sense than BEGIN/END.


I always thought there was a use for something like RATFOR for Pascal; maybe Ratscal? It could have left most of the verbosity in place, but replaced the BEGINs and ENDs with { and }, and that would have made it an almost perfect language from my point of view.


Nimrod is a lot like a modernized Pascal with a Python-like syntax.


Let's all code in APL then.

> {} as delimiters make much more sense than BEGIN/END.

Sure {{{{{}}}}}}


BEGIN BEGIN BEGIN BEGIN BEGIN END END END END END

That's MUUUUCH better!


More readable.

Now we can discuss tab vs spaces.


It's not that readable if you didn't realize I actually fixed your extra }. I expected you to nitpick about it! :P

PS: spaces obviously.


> It's not that readable if you didn't realize I actually fixed your extra }. I expected you to nitpick about it! :P

Yes, I did it on purpose.

> PS: spaces obviously.

Well there we happen to agree. :)


Verbose programs are hard to read.


"begin" isn't harder to read than "{", with each on its own line (arguably a very readable bracing/indentation style).

Later Pascal/Delphi compilers had very useful syntax highlighting built in, so this is really not an issue - unless you prefer as much code as possible crammed onto a single "page" (IMO not really helpful for reading).

{} are also somewhat harder (more strenuous) to type than begin/end, especially on German keyboards (it's Alt Gr + 7 / Alt Gr + 0) and even more so on German Apple keyboards ({} are not even printed on any keys there and they are on different key combinations than on PC keyboards).


Maybe to comic book readers


> comic book readers

You say that like reading comic books is an anti-intellectual crime.

Getting to the point, look at these examples and tell me which one is better from the readability perspective.

> for(std::vector<std::shared_ptr<MyClass> >::const_iterator it = v.begin(); it != v.end(); ++it)

or

> for(const auto &value : v)

Even in the case when the "value" variable's inferred type isn't apparent from the context, we can use

> for(const std::shared_ptr<MyClass> &value : v)

which is still much more readable than the monstrosity in the first example.


To make your comparison more useful (and on-topic), you should compare this to a Pascal or an Ada program. You won't find such a syntax massacre there. In fact some verbosity is a design feature that actually helps programmers maintaining large, long-living systems.


> In fact some verbosity is a design feature that actually helps programmers maintaining large, long-living systems.

I completely agree with that. For example, static typing adds verbosity, but it also documents the program and makes figuring out a large codebase a lot easier. The "noise" can even be reduced in languages with type inference like D, Haskell or C++11.

The thing is, unnecessary verbosity only crowds the code and doesn't add any value. Most of the keywords in "if ... then begin ... end else begin ... end" serve no real purpose and could be replaced with a symbol, like a curly brace.


I have pretty much the exact opposite view: I find type signatures add noise I don't want, while I find {} to be suboptimal for large blocks. I prefer the Ruby approach, where what you described would usually be:

    if
       ...
    else
       ...
    end
So it cuts down on the clutter vs. Pascal, but sticks to words rather than single character symbols


What you call "noise" is the compiler catching a type-mismatch bug before it goes out into production.


Type errors are typically trivially caught with unit tests that are necessary to verify functionality anyway.

I used to be a strong believer in static typing and insist that dynamic typing would be extremely error prone, but when I actually tried it, I found that the additional testing required is pretty much non-existent - if you test properly, you tend to cover types as a side effect.

I'd still prefer static typing, but only if it doesn't get in the way. And that means minimal annotations, and minimal restrictions.


Working in the enterprise, with off-shored project modules, will restore your faith in static typing. :)


I suspect braces vs begin/end pair played a big role in this, other than that I think you are correct.


Very nice! Now I've got to see if my old floppies are still readable.

While in college, I got myself an oddball MS-DOS computer. My dad read an article in the business pages about this guy who had the audacity to market a complete software product for 39 bucks. Dad didn't know anything about computers, except that there was growing interest in programming, so he got Turbo Pascal version 1 for me as an Xmas present. That was the original version, and I stuck with Turbo Pascal through successive revisions until 1994 when I discovered HyperCard on an Apple Mac.

In my view, among the strengths of TP, its manuals shouldn't be overlooked. Each version came with a relatively slim yet complete manual that a person could actually read. Of course it helped that the target platform was relatively simple too.


I agree, the manuals were great. (Though I bought them again on eBay for this project and they actually contained a lot of errors!) But also the floppy came with sample programs, and I learned most of Pascal from those.


I forwarded this to Anders Hejlsberg [1]. It'd be fun if he commented on this here.

I cut my programming teeth on Pascal and Turbo Pascal. This is hugely nostalgic for me!

[1]http://en.wikipedia.org/wiki/Anders_Hejlsberg


Maybe he will rewrite it in TypeScript?


Pascal, my childhood love. I spent so many days in TP 7.1 coding games with the graph library!

At some point I made a GUI for DOS, similar to Windows Explorer (with a desktop, icons, start menu, context menus and obligatory clock! :) ).

I called it Winnie. Then my 386 died and I lost the code :/

Thanks for the memories!


OMG!!! That was my favorite TP. I learned to program using Turbo Pascal.

Congratulations on your work and thanks for the nostalgia!


Likewise!

I started in Basic then moved to Turbo Pascal so I could compile my code & distribute it - mostly on 5.25" floppies as cover disks on magazines.

Later by uploading them to BBSes via a 9600bps modem (yes 9.6Kbps was super quick in those days)!


Pretty much the same here. I purchased Borland TP5 in high school which was the first version with "OOP" but I think we were using versions 3 & 4 at school which were also nice.

The equivalent of WordPress back then was Telegard, which was written in Pascal. The source code got loose, which resulted in several spin-off "Telegard hacks", and many of the "elite" BBSes were essentially themed versions of that original Telegard.

Then I heard Pascal morphed into Delphi? I never made it to Delphi. By 1994 I was into Javascript, Java, Rexx and best of all the opposite sex ;-)


I also learned Basic (Spectrum, GW, Quick, Turbo), followed by Assembly (Z80, 68000, x86) and Turbo Pascal.

So when I got to learn C, my reaction was like "meh", given that I was already used to Turbo Pascal capabilities in type safety, modules, OO, low level programming (everything that C does), blazing compilation times.

So I quickly moved away from C into C++, which allowed me to use stuff I was already used to from Turbo Pascal. Parallel to that I used most of the Oberon family of languages.


Seriously cool. But I can't imagine ever using it ;-) Half the fun of TP was all the libraries, like the ones for interfacing with Sound Blaster cards, modems, etc. All that stuff is long gone.


Very slick. I'm impressed at the compilation speed, given that it's in JavaScript and I'm on a slow machine. Pascal always was good for that — and you've made me miss it.


Great work, well done! Pascal was the second language I learned, after Basic.

I'm interested in studying the code and possibly replicating it for the purpose of learning about how compilers work. Could you give a general overview of the components, or steps you took to implement them? Thanks!


Thanks! The code is on GitHub: https://github.com/lkesteloot/turbopascal

It's a two-pass compiler. The first pass runs the code through the Stream, Lexer, and Parser classes to generate a parse tree (with type information). The second pass generates bytecode using the Compiler class. The compiled bytecode is then handed to Machine to execute. Start in IDE.js in the _run() function. Also go into turbo.css and comment out the "display:none" line in the .debug-output block. That'll let you see the parse tree and generated bytecode.
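For readers who want to see the overall shape before diving into the repository, here is a toy sketch of the same two-pass layout for arithmetic expressions only. It is an illustration, not the project's code: the function names `lex`, `parse`, `compile_tree`, and `run`, the tuple-based tree, and the opcode names are all made up here, and the real parse tree also carries type information that this sketch omits.

```python
import re


def lex(source):
    """Split the source into number and operator tokens."""
    return re.findall(r"\d+|[-+*/()]", source)


def parse(tokens):
    """Pass 1: build a parse tree of nested tuples by recursive descent."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def factor():
        token = advance()
        if token == "(":
            node = expression()
            advance()  # skip the closing ")"
            return node
        return ("num", int(token))

    def term():
        node = factor()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, factor())
        return node

    def expression():
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    return expression()


OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}


def compile_tree(node, code=None):
    """Pass 2: walk the tree in post-order and emit stack-machine bytecode."""
    if code is None:
        code = []
    if node[0] == "num":
        code.append(("PUSH", node[1]))
    else:
        op, left, right = node
        compile_tree(left, code)
        compile_tree(right, code)
        code.append((OPCODES[op], None))
    return code


def run(code):
    """Execute the bytecode on a simple operand stack."""
    stack = []
    for opcode, arg in code:
        if opcode == "PUSH":
            stack.append(arg)
            continue
        right, left = stack.pop(), stack.pop()
        if opcode == "ADD":
            stack.append(left + right)
        elif opcode == "SUB":
            stack.append(left - right)
        elif opcode == "MUL":
            stack.append(left * right)
        elif opcode == "DIV":
            stack.append(left // right)  # integer division, like Pascal's `div`
    return stack.pop()
```

Calling `run(compile_tree(parse(lex("(1 + 2) * 3"))))` chains the stages in the same order as the description above: tokens, then tree, then bytecode, then execution. Keeping the tree as a separate value between the passes is what makes it easy to print for debugging.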


> see the parse tree and generated bytecode

Make that a feature!


Done! Try the hidden "X" command.


Nice, presumably you take the x86 emulator from yesterday, and boot it with a copy of the TurboPascal disk?

Took me a good 30 minutes to remember how to write hello world in PASCAL :-) And I admit I forgot about the period on the End statement.


I found more details here, including a link to the source code:

http://www.teamten.com/lawrence/projects/turbo_pascal_compil...

Well done!


That is awesome, somewhere I've got a pretty printer I developed for TurboPASCAL. I tried to sell all the rights to Borland in exchange for enough cash to buy a decent oscilloscope (I have a history of 'consulting for toys' :-) ) and they weren't going for it. In hindsight (much later) I realized I should have set the price really high, let them try a demo with a non-refundable deposit of 10% of the price, and then feigned sadness when they decided not to proceed :-)


Some interesting background of the author:

http://www.teamten.com/lawrence/oscar/


Boy, does that bring back memories. Turbo Pascal was just plain wonderful software. As for Pascal, I never got over the array length being part of the array's type.


I remember you could define an array to be indexed by an enumerated type which was nice. I haven't seen that done elsewhere.


At least in Ada, Modula-2, Modula-3 it is also possible.


Isn't that crazy? You can't write a generic integer array sort routine!


Actually you could, by using GetMem() to allocate the amount of space you wanted.


In other words, 'Write in C' :)


Nice! I still have an Atari Portfolio in my kit, running Turbo C 1.0 .. those were the days! Everything you needed to rule the world in a single .EXE. Or .. was it a .COM? 64k should've been enough for everyone .. ;)


/* set overflow to auto */ #screen { overflow: auto; }


:-)


The examples compile super fast. How long did it use to take?


Not much time...

Pascal (and most of Niklaus Wirth's languages) was designed to be easy to compile. The grammars are very small (typically less than a page of text; Turbo Pascal used to include the full BNF grammar in the manual), and his languages are designed to be possible to compile in one pass.

Also, "traditional" Wirth-style compilers usually output code directly rather than build an abstract syntax tree, and a lot of compilers of the era were heavily influenced by his compiler textbook, which included the full source of a compiler. They also tend to output object files or final executables directly rather than go via an assembler.

All of that combines to make it very easy to make the compilers very small, and very fast even on very slow machines. For an example of simplicity, take a look at the source code to an educational Oberon-0 (subset of his Oberon language) compiler, from one of his books: http://www.inf.ethz.ch/personal/wirth/books/CompilerConstruc... (it generates code for a virtual RISC machine, and one of the files includes an interpreter for the generated code)
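To illustrate what "output code directly" means, here is a rough sketch of a single-pass compiler for assignment statements. Each parsing routine emits instructions the moment it recognizes a construct, so no tree is ever built. The class name `SinglePassCompiler`, the instruction names, and the tiny statement grammar are assumptions made for this example; it is far smaller than the Oberon-0 compiler and is not taken from Wirth's code.

```python
import re


class SinglePassCompiler:
    """Recursive-descent parser that emits code while parsing; no tree is built."""

    def __init__(self, source):
        self.tokens = re.findall(r"\d+|[A-Za-z_]\w*|:=|[-+*/();]", source)
        self.pos = 0
        self.code = []  # emitted instructions for a hypothetical stack machine

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def expect(self, expected):
        token = self.advance()
        if token != expected:
            raise SyntaxError(f"expected {expected!r}, got {token!r}")

    def factor(self):
        token = self.advance()
        if token == "(":
            self.expression()
            self.expect(")")
        elif token.isdigit():
            self.code.append(f"PUSH {token}")
        else:
            self.code.append(f"LOAD {token}")

    def term(self):
        self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            self.factor()
            # Both operands are already emitted, so the operator goes out now.
            self.code.append("MUL" if op == "*" else "DIV")

    def expression(self):
        self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            self.term()
            self.code.append("ADD" if op == "+" else "SUB")

    def statement(self):
        target = self.advance()
        self.expect(":=")
        self.expression()
        self.code.append(f"STORE {target}")

    def compile(self):
        while self.peek() is not None:
            self.statement()
            if self.peek() == ";":
                self.advance()
        return self.code
```

For example, `SinglePassCompiler("x := 1 + 2 * y").compile()` yields `PUSH 1`, `PUSH 2`, `LOAD y`, `MUL`, `ADD`, `STORE x`. The only state kept is the token position and the output list, which is a large part of why compilers in this style can be so small and fast.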

The whole book is available as a PDF here: http://www.ethoberon.ethz.ch/WirthPubl/CBEAll.pdf ; Wirth's material on compiler construction evolved over the years from an early book about writing a PL/0 compiler, through Pascal and Modula (2) and finally Oberon, but always focusing on simplicity.

For a time I wanted to go to ETH Zurich to study computer science there because of Wirth. I'm a bit of a Wirth fanboy (which is ironic, since my favourite language these days is Ruby - a language that is pretty much the antithesis of Wirth's focus on simplicity of implementation).

As a final digression, let's play six (well, 3) degrees of Pascal to JavaScript: One of Wirth's PhD students was Michael Franz, who got his PhD with one of my favourite dissertations, on dynamic architecture-independent code generation for Oberon. Dr Michael Franz co-invented, with Andreas Gal, the trace-trees technique that went into Mozilla's TraceMonkey version of their JavaScript JIT. Andreas Gal now works at Mozilla with Brendan Eich, the creator of JavaScript.


Exactly right. Turbo Pascal compiled straight to machine language, which made it hard for them to do nice optimizations. But people like me didn't care. IBM's compiler couldn't blow off optimizations, though. Typical worse-is-better.

This compiler started out as a five-pass compiler, then I reduced it to two. I could have reduced it to one and simplified the code quite a bit, but having the intermediate parse tree made debugging easier. The parse and compile passes together take less than 20ms on the largest Pascal program I have (rose.pas).


One of the Go guys, Robert Griesemer, studied at ETHZ, http://www.ethoberon.ethz.ch/hall.html

He is most likely the reason why Go methods share Oberon-2's method declarations.


It's a small world. Griesemer finished his PhD the year before Franz - I remember seeing his thesis and some of his papers, but I don't think I read his thesis in full. I spent a lot of time going through stuff on ETHZ website and FTP servers back then; there were a ton of interesting papers coming out of there at the time. The current stuff on active object etc. doesn't resonate with me as much as the stuff they did in the early 90's though.

I keep meaning to go back and read more of their research output - it's amazing just how small and simple a lot of their systems were.


After being a Native Oberon user, I became convinced that writing an OS in a GC-enabled systems programming language is quite possible.

I keep looking forward to the day it becomes mainstream.

Sadly only people like us, that experimented with ETHZ's work seem to share such enlightenment.


..Turbo Pascal was sort of shocking, since it basically did everything that IBM Pascal did, only it ran in about 33K of memory including the text editor. This was nothing short of astonishing. Even more astonishing was the fact that you could compile a small program in less than one second. It's as if a company you had never heard of introduced a clone of the Buick LeSabre which could go 1,000,000 MPH and drive around the world on so little gasoline that an ant could drink it without getting sick.

http://www.joelonsoftware.com/articles/fog0000000023.html

I miss Joel's writing!


It was only shocking to people used to complicated behemoths of multi-pass compilers or complicated optimizing compilers with expensive internal representations.

Turbo Pascal really mostly "only" does things the way Wirth used to: As simple as possible. Wirth's languages are all designed for single pass compilation with direct code generation, and if you write your compilers that way, they will be fast. It is very, very hard not to make them fast that way.

His compiler textbooks, all the way back to "Compilerbay" in 1977 (using PL/0 as the language) included full source code for compilers following that style, and they're so small the entire compilers fit in a handful of pages.

(EDIT: just to make clear: I do admire Turbo Pascal - I used it quite a lot, but the main genius of Turbo Pascal was to pick the Wirth way of doing compilers and integrate it with a simple "IDE" - a large part of the amazing impression of Turbo Pascal was that you did not have to write your code to disk, exit the editor, run a compiler etc. but could do it all in one)


I remember that differently: on Z80 CPU machines, UCSD Pascal was slow. It compiled to the p-code (it was both an "operating system and compiler" and they wanted it portable). On the first PC's (8088 CPU), Microsoft's Pascal was slow. It produced obj files. Turbo Pascal was lightning fast on both. It had to be implemented contrary to beliefs common then: it was itself written in asm, it was never just a compiler but the editor and compiler at once, kept everything in memory if it could, produced the pure native code processor specific and never produced obj files.


UCSD Pascal was slow because the compiler was written in Pascal and compiled to p-code itself (the source is available: http://techtransfer.universityofcalifornia.edu/NCD/19327.htm... ).

It was not a good benchmark. It was prevalent because it was trivial to port, not because it was good.

(Note that UCSD Pascal was based on a part of Wirth's research group's "porting kit" - their intent was not for p-code to be used for production systems, but as an easy way of getting a Pascal compiler running on a new platform to be able to then use the target platform while retargeting the code generator)

> It had to be implemented contrary to beliefs common then: it was itself written in asm,

That was nothing out of the ordinary at the time, though not something Wirth himself did, as a large part of his mission was to teach the development of compilers.

But compilers for small systems were often written in asm at the time. It'd not be very viable to write compilers in high level languages for a C64 or other home computers, for example, yet there were plenty of compilers for small systems. When I wrote my first compilers on the C64 and a bit later on the Amiga, it never even occurred to me to write them in a high level language until a few years later (incidentally when I got HiSoft Pascal - an integrated Pascal environment for the Amiga very much in the spirit of Turbo Pascal)

> it was never just a compiler but the editor and compiler at once, kept everything in memory if it could, produced the pure native code processor specific and never produced obj files.

That's technically true, though the compiler that became Turbo Pascal was actually originally a separate, stand alone compiler.

Combining the compiler with the editor was fairly uncommon and certainly innovative. When Kahn at Borland decided he wanted an environment like that, though, he went out and bought an existing, standalone, commercial compiler, from Anders Hejlsberg, and brought him to Borland.

Producing pure native code on the other hand was not unusual - UCSD's p-System was an aberration in that respect, and its greatest lasting legacy was in inspiring VM based systems; e.g. Bill Joy has cited it as one of the inspirations behind the JVM. Not producing object files was also quite common - for many small systems there was no tradition of object files at all, as they were too small for separate compilation to make much sense.


You remember correctly (at least your memories jibe with mine). UCSD Pascal was, at best, 'functional'.


I need to stop writing late at night. That textbook is "Compilerbau" ("Compiler Construction").


This article from 2009 came up again recently explaining how Turbo Pascal was so fast back on an 8MHz 8088 http://prog21.dadgum.com/47.html


When I need some inspiration, I just reread "Things That Turbo Pascal is Smaller Than":

http://prog21.dadgum.com/116.html


I remember compilation being nearly instantaneous.


Nice, no Compile option though? Is it an interpreter? I tried a Hello, World, and the (Input,Output) after the program name choked it. Nice job!


The "Compile" option was, I think, for when you wanted to generate a .COM or .EXE. So I don't support that. The "Run" command does compile and run, though.


If I remember right it was COM only until TP4.


Yep!


Oh interesting, none of my programs had (input,output) so my parser doesn't support that. It was optional in TP and I don't think did anything.


This is cool. I remember having this on DOS in the late 80's. Never really got into Pascal though.


Yay, I don't have to use ctrl-K codes like the original TP3 editor!


severely cool.

-bowerbird



