I do wish we were more adventurous in how we present our code in editors. As the article says, we seem to be stuck in a rut. Typography in the main editing area is almost always based on a classic monospaced text file. Colours are almost always based on “syntax highlighting”, which in turn often involves distinguishing elements of the language we’re looking at whose differences aren’t very important.
Given the potential of modern graphical environments and typography, I’d like to see languages and editors adopt simple spacing improvements like elastic tab stops¹. I don’t want to choose between an aesthetically pleasing layout that draws attention to patterns and anomalies in my code and a layout with completely uniform spacing, because anything more carefully aligned is a pain to maintain in later edits and adds noise to diffs. We could shrink blank lines to half-height as well, which would keep the useful visual separation but waste less space.
I’d like to use a (somewhat) wider range of basic symbols than whatever we had in ASCII last century. Even allowing comparison operators like ≤, <, ≠ and = that align consistently would be a useful start.
I’d like simpler syntax highlighting, concentrating on distinguishing major language elements like literals, variables and types, and emphasising problems I probably care about like unclosed strings and misspelt keywords. I’d also like to see colour used more effectively for matching related code like opening/closing pairs or the definition of a value and later references to it.
I’d like to see a sidebar for documentation, not unlike the example for comments in the article, but allowing documentation to be attached at different levels from an individual function or variable up to an entire file or beyond, then having the editor automatically offer all documentation relevant to wherever I am in the code and whatever I’m typing there.
If we were designing a UI for another application, we’d always consider aspects like spacing, symbols and iconography, colour schemes and how to show and navigate related information. It’s strange that we give so little consideration to them in the tools we use ourselves!
(Join me next week for part two of the epic rant, provisionally entitled Why aren’t we all using semantic diff tools yet?)
With a bit of help on tex.stackexchange.com, I managed to put together a system which allows editing a .tex file without the docstrip-mandated "almost everything is a comment" greyness:
Yes, I wrote some literate Haskell professionally a few years back. I think it’s an interesting and useful concept. In my experience, it also fundamentally changed the way I read and wrote code, with both pros and cons.
What I’m advocating here is a little different, though. While it would certainly be useful to write documentation using more expressive formats than plain text, here I’m arguing for the ability to (a) attach that documentation to specific parts of the code at different scales and then (b) have all relevant information at all scales readily available depending on what I’m currently editing.
One of the reasons I would like to have this ability in my development environment is that I found the literate style to be very helpful for writing informative documentation but relatively unhelpful for finding it again later. By its nature, literate documentation is organised primarily for a linear reading flow, like the story in a book, but most of the time when I’m editing code, that’s not my typical searching behaviour.
Apart from Donald Knuth, you are the first person I've heard from who has used literate programming professionally. It would be great to learn more from your experiences, good and bad, if you ever fancy writing about it.
I've also used it fairly extensively, both in academia and in industry jobs (using Emacs org-mode, with several target languages, often combined). I found it very useful when thinking through a problem, and as documentation for myself later on. At some point it starts to become a bit unwieldy. Nowadays, most of my work is in Clojure, and I find that the literate workflow gets in the way of REPL-driven development, so I haven't been using it as much lately.
I continued to work on that project for a while afterwards, but I don’t think my overall views on the pros and cons of the literate style changed much from what I wrote there.
If there’s anything you’re particularly curious about, I’m happy to discuss further (though my professional experience was limited to that one project, so I’m hardly an expert).
One thing which might help with "scannability" is to have a toggle in the LaTeX code which switches it from displaying the document as a typeset whole to showing only the typeset code. A possible way to do this would be to change the width of the page so that the text lines become so wide that each paragraph fits on a single line; then you could set the PDF view to show only the width of the widest code line at the left edge.
I've found that a hyper-linked ToC and index, plus marginal callouts of module and variable names, help a great deal in finding code again.
It wasn’t so much the code that I found less discoverable with the literate style, more the relevant documentation when reading any given part of the code. I found the linear flow didn’t really lend itself to quickly finding all relevant commentary about a given piece of code: it tends to encourage building up a “big picture”, so there could easily be comments much earlier in a chapter/file that were relevant to later code, with no obvious way to know that or to find them short of reading the whole thing from front to back.
My issue with LP is that common tools don't support it directly.
I'm generally against metaprogramming and DSLs for that reason: the linter and the debugger don't understand them. Even template languages have this issue, which is a great case for logicless templates.
Seems like LP would be at its best if your audience is going to read and carefully study your code without a bunch of extra stuff added, as is probably more common in academia, rather than see it indirectly through an IDE.
> I’d like to use a (somewhat) wider range of basic symbols than whatever we had in ASCII last century. Even allowing comparison operators like ≤, <, ≠ and = that align consistently would be a useful start.
Visually, font ligatures do just that and combine your !=, >=, etc. into single elements.
But they don't actually work, because there are many ambiguities. Is <= "less than or equal" or an arrow pointing to the left? And in the other direction too: is ≠ "!=" or "/=" or something else? And as soon as you allow 3 or more characters to build ligatures, your number of ligatures explodes, but you still miss some.
For example, it doesn't matter whether <= is rendered as an arrow or as "less than or equal to", or whether => is rendered as an arrow: in a <=> both renderings are wrong, and the same goes for a >=> or a >>= or...
The main reason I stopped using programming ligatures that much is that many languages actually support Unicode characters in the source code, and this is actively used in some projects I work on.
It then becomes important to see the difference between ASCII approximations and real Unicode symbols, since this is relevant e.g. when searching for a symbol in your code base or to see at a glance whether usage of Unicode vs. ASCII etc. is consistent.
The only spot I still kinda use such a feature (prettify-symbols-mode specifically) is in LaTeX, where Unicode math is more rare and long equations become nearly unreadable without some kind of rendering in the source code.
I'm a bit confused. My only experience with ligatures is in VSCode. I don't use them personally, but I've seen that if you type >= it changes to look like ≥; if you hover over it, it shows >=; if you put your cursor to the right of it and hit delete, it deletes just the = and not the >. In other words, it is still >=; it just looks different.
So I understand how you could have confusion in your case if you have logical comparators for ≥ and also have strings with "≥" or even variable names, if your language is permissive enough to allow that. But if you search for "≥" it should not return the ">=" disguised as "≥", ...right?
> But if you search for "≥" it should not return the ">=" disguised as "≥", ...right?
That’s my issue :). Languages like Julia permit both ">=" and "≥" in the source code with the same meaning. But if you use, say, search-based navigation of your file, how do you know which one to type to get where you want?
Or in Python I’ve seen some people make “alpha” render as α, since it’s a common variable name in physics codes. But “α” itself is a valid Python variable name, and I personally use Greek letters in Python physics code and know others do as well. That can again be confusing, since in Python “alpha” and “α” are not the same variable, so it’s crucial to write it correctly.
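A tiny illustration of that last point (the variable names are made up, but this is just standard Python 3 identifier behaviour):

    # "alpha" and "α" are two distinct identifiers in Python, even if an
    # editor is configured to *display* the former as the latter.
    alpha = 0.5
    α = 0.7

    print(alpha == α)   # False: they are separate variables
    print(alpha, α)     # 0.5 0.7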
For vim/neovim users you can use digraphs, where you press `<C-k>` followed by a 2-character code to get a symbol (see `:help digraphs` and `:help digraph-table`).
For example,
<C-k> -> gives →
<C-k> => gives ⇒
<C-k> d* gives δ (greeks are generally all _letter_*)
<C-k> D* gives Δ
Some of the combinations are a little weird to remember, but if you use them regularly then it's easy enough (like the Greek letters or arrows).
In my editor I just press Ctrl+x, 8, Enter, and then I can look up Unicode codepoints based on either the hex code or the name. Alternatively you could do the whole Unix 'do one thing and do it well' thing and use a character selection program which allows you to look up a character (I've got KCharSelect, I believe Windows has a similar thing called Charmap or something).
- Like the sibling mentioned, you can use a “compose key” (it’s built in on Linux and can be installed on other platforms). There are other OS-wide options like the TeX input method for Gnome, or TextExpander snippets on Mac.
- In Vim, Ctrl-K in insert mode lets you enter “digraphs”. For instance, Ctrl-K *s will insert a Greek letter “sigma”.
- In Emacs, C-\ will activate an “input method” such as the “TeX” input method that can insert any math or Greek Unicode symbol using LaTeX notation. (I personally like Emacs’ “transient” input method that is activated for one symbol at a time, since it interferes less with coding.)
- In Sublime Text and VSCode there are third-party plugins that insert Unicode symbols via e.g. TeX notation.
- Every Julia editor extension, as well as its REPL, lets you press tab to convert a set of LaTeX-like notations into Unicode symbols in a common way.
- Most editors have support for snippets of some kind, and many have available snippet packs for common math and Greek symbols.
Depending on your operating system, you might have (the option to remap a key to) a compose key. Then you can press compose followed by, for example, <= to make ≤. This also works for letters with diacritics on them, like "o for ö or 'o for ó, currency like =C for € or =L for ₤, or other symbols like tm for ™.
You might also be able to set up your own compose codes in ~/.XCompose, but that didn't work for me. Could be another casualty of Wayland or just some quirk in my setup.
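For reference, the entries would look something like this (a hypothetical ~/.XCompose sketch following the X Compose file format; these particular sequences may already exist in your system defaults):

    # keep the system-wide default sequences, then add custom ones
    include "%L"

    <Multi_key> <less> <equal>    : "≤"   U2264   # LESS-THAN OR EQUAL TO
    <Multi_key> <greater> <equal> : "≥"   U2265   # GREATER-THAN OR EQUAL TO
    <Multi_key> <slash> <equal>   : "≠"   U2260   # NOT EQUAL TO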
You've highlighted an issue with mapping syntax to ligatures in a generic sense, i.e. one that will work for any programming language. But that's not an issue, because it's impossible, and we knew that from the start. When people make these stylized symbols, they have to write a parser for each individual language they're applying them to.
> Even allowing comparison operators like ≤, <, ≠ and = that align consistently would be a useful start.
Every text editor that understands Unicode already allows for that. If you want to use ≤ rather than <=, then you need to alter the language, not the editor.
You don't need to modify the language, and most popular editors already support this. You'll need to enable ligature support and install a font that contains the appropriate ligatures (as many coding-specific fonts already do).
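For example, in VS Code I believe it comes down to two settings plus a ligature-capable font (Fira Code here is just one common choice):

    {
      // hypothetical settings.json fragment (VS Code settings allow comments)
      "editor.fontFamily": "Fira Code",
      "editor.fontLigatures": true
    }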
That's not really a solution, is it? For example, I don't want <= to be confused with ≤ within a string. What you propose is just one more of the half-baked solutions the industry is full of.
That's why you need editor support for this; the lexer understands the difference between an operator and a string literal (that's how syntax highlighting works), and should be able to use ligatures in one context but not the other.
(BTW I'm not a fan of coding ligatures myself, but they seem to be popular.)
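A minimal sketch of what I mean, using Python's tokenize module as a stand-in for the editor's lexer (display-only substitution; the underlying source text is never changed):

    import io
    import tokenize

    # Hypothetical display mapping; the source file itself is never modified.
    PRETTY = {"<=": "\u2264", ">=": "\u2265", "!=": "\u2260"}

    def prettify_for_display(src):
        """Render comparison operators as Unicode symbols, but only where the
        lexer says they are operators; string literals are left alone."""
        lines = src.splitlines()
        replacements = []
        for tok in tokenize.generate_tokens(io.StringIO(src).readline):
            if tok.type == tokenize.OP and tok.string in PRETTY:
                row, col = tok.start          # 1-based row, 0-based column
                replacements.append((row - 1, col, tok.string))
        # Apply right-to-left so earlier columns stay valid after each edit.
        for row, col, op in sorted(replacements, reverse=True):
            line = lines[row]
            lines[row] = line[:col] + PRETTY[op] + line[col + len(op):]
        return "\n".join(lines)

    src = 'if a <= b:\n    print("a <= b")\n'
    print(prettify_for_display(src))
    # if a ≤ b:
    #     print("a <= b")    <- the operator changes, the string does not

An editor doing this for real would hook into its own syntax engine rather than re-tokenising the buffer, but the token-class distinction is the whole trick.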
Yes, if you do it on top of the lexer it starts to make sense to me. I still think the language should also support the corresponding "ligatures", so that no confusion can result if you happen to enter ≤ directly. But this is actually a nice solution to the problem of what to do in your language-aware editor when the user enters <=: nothing! You just assign different fonts to different token classes.
To clarify, I’m not talking about ligatures in programming fonts here. I’d like to use a slightly wider set of Unicode characters, for their designated purposes, as the main and only way to write those ideas. This could include comparison operators, arrows, maybe a few other common and easily recognisable operations.
I would like to use those same characters, and where applicable with the same sizes and alignments of related characters, everywhere — in my editor, my linters and formatters, my diff tools and version control system UIs, my greybeard CLI text file batch processing incantations, everywhere.
All that really needs is languages that use Unicode characters like ≤, fonts that include all the relevant glyphs, and tools that provide a simple way to enter those characters. But all of these are important, because right now we don’t have simple, standardised ways of doing these things, and so we mostly don’t do them at all.
It’s obviously not a big deal, just a small “quality of life” improvement, but it’s 2024 and I don’t see any good reason why we can’t have nice things. :-)
The long and the short of it is that most keyboards don't have dedicated keys for such characters. Ligatures are here, today, and everyone can make use of them. Rich Unicode characters require input method changes that are custom to every platform (and likely need to be customised for each international keyboard layout as well).
This is not a hypothetical, btw. One of the most persistent friction points for C++ adoption in Europe was the lack of a ~ (tilde) key, forcing folks to ALT-1-2-6 every time they wanted to implement a destructor...
I appreciate that there is a barrier to adoption here. On the other hand, we’ve had office software that autocorrects -- into dashes and straight quotes into curly quotes for a very long time. In many places, people use modifier keys much more than we do in the English-speaking world to type the wider set of characters used by other languages. Many platforms offer methods like compose keys or Unicode look-ups for entering special characters, and plenty of people use them. I’m fairly sure that if code editors started offering to convert >= into ≥ automatically then the developers using them would figure it out!
I guess I'm unclear why (contextually) auto-converting >= into ≥ is objectively better than just (contextually) displaying >= as ≥? It seems like it accomplishes exactly the same thing.
Just because using the real characters is unambiguous and has the same size/alignment everywhere. I’d prefer to use those real characters because they are more compact and consistent, but it’s a small thing, so IMHO it’s only worthwhile if the whole team sees the same thing and in all of the tools we use. Relying on ligatures in coding fonts doesn’t achieve the latter and often doesn’t achieve the original goal either, so while some might enjoy them, it’s not the solution I’m looking for personally.
The C committee has just deprecated trigraphs. It might be the right time to propose ≠ as alternative to != and then in 40 years we can deprecate != and only keep ≠
How would you type those characters? For me (on mobile) I have a few options: copy it from your message, or google "not equals sign" and copy from the browser, or add a new keyboard that contains the character. On my desktop, I could also memorize Windows' Alt+X codes... none of these are fantastic options.
Agreed. There's a great deal of improvement I'd like to see in text editors and IDEs.
Collapsible images as inline comments can be FAR more effective at embedding relevant information in certain sections of code.
As far as "related coloration" goes - JetBrains IDEs let you set up regexes around comments that will color them differently, which I use in collaborative environments so devs can quickly collate their TODOs, etc.
There's also an extension called Rainbow Braces that helps to highlight opening/closing pairs with different colors.
> I’d like to use a (somewhat) wider range of basic symbols than whatever we had in ASCII last century. Even allowing comparison operators like ≤, <, ≠ and = that align consistently would be a useful start.
Not necessarily the exact same symbol set, but yes. Nothing too fancy, just using widely recognisable symbols that have been supported by Unicode forever instead of half-baked, ASCII-centric alternatives that date from last century.
I've been thinking for quite some time about how to incorporate spacing properly into the syntax of a programming language, and I've now settled on a simple solution: hard-code blocks of text via spacing into the text format itself. I call the result Recursive teXt (RX, http://recursivetext.com), and I am using it as the basis for the rest of my work on Practal, instead of using just plain text. A lot of issues are resolved with this; for example, it is trivial to have different kinds of syntax live next to each other, as they can just occupy separate blocks.
You can customize the syntax highlighting with 15 minutes to 4 hours of work, depending on your IDE, languages, and particular preferences. If it's important enough to you to rant about it online, perhaps you should set a new standard to help others that also want syntax highlighting the way you do? And you'd also be helping yourself for the rest of time.
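For instance, in VS Code (to pick one editor) the 15-minute version is roughly a settings.json override per TextMate scope. Something like this hypothetical fragment tones keywords down and calls out strings and invalid code:

    {
      "editor.tokenColorCustomizations": {
        "textMateRules": [
          { "scope": "keyword", "settings": { "foreground": "#888888" } },
          { "scope": "string",  "settings": { "foreground": "#3a7d3a" } },
          { "scope": "invalid", "settings": { "foreground": "#ff5555", "fontStyle": "bold" } }
        ]
      }
    }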
"I’d like to use a (somewhat) wider range of basic symbols than whatever we had in ASCII last century. Even allowing comparison operators like ≤, <, ≠ and = that align consistently would be a useful start"
That's already doable in vim/nvim. BTW, I'm on macOS and install them via MacPorts. Or perhaps it's because I already installed some specific fonts...
I don't know what Notepad you use to write your code, but VSCode with font ligatures (for comparison operators) and a linter (which aligns the same as elastic tabstops) already does 80% of what you want. The remaining 20% – a sidebar for documentation – isn't implemented for a reason.
> The remaining 20% – a sidebar for documentation – isn't implemented for a reason.
The sidebar to show the documentation is just a presentation detail. What I’d really like as a developer is (a) the ability to attach documentation with a specific scope, ranging from small things like a function parameter or variable, through larger things like a function or a set of related functions, up to whole files or packages, and then (b) editors that recognise the scoped documentation and present it contextually.
I’d like all relevant comments to be easily visible when I might want to refer to them. Also, I’d like to know what documentation exists in the first place, which is not always obvious when we rely on things like big banner comments at the top of files/sections or having separate README files scattered through the tree.
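Python docstrings already give a rough flavour of the “different scales” part, even if editors rarely surface them contextually the way I’d like (the names below are invented for the example):

    """File-level documentation: broad context an editor could offer
    whenever I'm editing anything in this module."""

    def resample(signal, rate):
        """Function-level documentation for resample() (a made-up example).

        signal: per-parameter notes that could pop up while I'm typing an
                argument at a call site.
        rate:   target sample rate, documented right where it's declared.
        """
        ...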
> The remaining 20% – a sidebar for documentation – isn't implemented for a reason.
I mean, keeping an overview of the documentation open on the side is something I do every day in GNU Emacs, so I'm a bit curious what the reason is that it's not done in Microsoft VS Code.
You can do it in VSCode. It's running in a browser. There may be extensions to do that, or if not you could write one easily.
The reason it doesn't come with the editor is that every language has its own specific documentation format, though I do wish they could create a consistent set of keyboard shortcuts, search UI, etc. for documentation.
¹ https://nick-gravgaard.com/elastic-tabstops/