My jaw is on the floor. I would've been fully impressed with the unoptimized version, but pushing it to squeak out significant rendering performance improvements was a surprise treat!
sixel-tmux works literally anywhere you can use tmux: as long as you can display Unicode on your terminal, the sixels will be "captured" by sixel-tmux and converted into something you can see. Sixels are in-band, so ssh isn't a problem.
In a way, using sixel-tmux is like "giving magical goggles" to your terminal, to let it render sixels so you can see something (even if it isn't perfect), in the hope you'll be tempted to use a better terminal that will show you sixels in all their glory, with a pixel perfect quality.
Sixels enable all kinds of cool things, like gnuplot right in your terminal (cf https://github.com/csdvrx/sixel-gnuplot ): sometimes I even watch youtube on my terminal lol
sixel-tmux was made as a first step towards turning derasterize into a more general library: my plan was to add it to nnn but I got bored along the way and moved on to other stuff. I might still do that, as I love nnn as a file manager.
BTW, even if there has been quite a bit of interesting work by @hpa and others in the last 2 years, I think derasterize still has textmode supremacy. derasterize is a collab with @jart after I started adding features to her previous solution, which was based on half blocks like this one; she's also done further work based on this, like https://justine.lol/printimage.html and https://justine.lol/printvideo.html
On Linux, virtual terminals are currently limited to 512 glyphs and 16 colors. You can't do much graphics with that, unless you bypass the terminal layer and write directly to the framebuffer, like w3mimg does.
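For the curious, here is a minimal sketch of what that framebuffer bypass looks like (it assumes a 32-bits-per-pixel /dev/fb0 and a 1920-pixel-wide mode; real code would query the geometry with the FBIOGET_VSCREENINFO ioctl and use the reported line length):

use strict; use warnings;

# Plot one red pixel at (100, 100), bypassing the terminal layer entirely.
open my $fb, '+<:raw', '/dev/fb0' or die $!;
my ($x, $y, $width_px, $bytes_pp) = (100, 100, 1920, 4);   # assumed geometry
seek $fb, ($y * $width_px + $x) * $bytes_pp, 0;
print {$fb} pack('C4', 0, 0, 255, 0);                      # BGRX byte order
close $fb;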
VT's but not XTerms. Also, framebuffer based VT's allow 256 colors with ease.
On the font issue, yes, sadly you are right. It's absurd to have just 512 glyphs on a framebuffer, which isn't even a true tty. The framebuffer or KMS should support TTF fonts by default. A Unifont TTF would be a godsend here.
I should have been clearer: as long as you have a font which has entries for the glyphs used by derasterize and sixel-tmux - check the source and you'll see you don't need a lot of them.
Exactly what I was thinking. Brow.sh has been around for 5 years. I guess the difference is this one is a fork instead of a wrapper of a standard browser?
I remember trying brow.sh several years back, and thinking it was cool, and then not touching it since it was a bit too sluggish and "off" to be a practical tool for multi-media heavy sites. This might not be a fair evaluation, but that was my first impression.
This project, on the other hand, allows me to "watch" Youtube videos across a fairly slow SSH connection on a remote host on the other side of NA, and "smoothly" scroll around, looking through comments and the like.
Comparing them side by side, it seems like brow.sh is a bit more precise (re: layout), but carbonyl is much more responsive. Very impressive. Not sure if there's a use for watching Unicode youtube videos in the terminal, but impressive!
It's a fork in the sense that it patches Chromium. More accurately, the difference is that in this case the rendering happens by hooking a new capture device into the graphics library used by Chrome, whereas browsh uses a webext which communicates with a background process (thus it theoretically* could be browser-independent, but currently only supports Firefox).
*The issue is that Chrome doesn't support extensions in headless mode.
I immediately thought of SSO use cases that require the browser. For example, logging in to generate and save a kubeconfig for RBAC workflows.
This seems like an automation dream for all kinds of things when I want to see browser interactions and results - but all from a terminal! This is awesome.
> One of the unicode characters we can render is the lower half block element U+2584: ▄. Knowing that cells generally have an aspect ratio of 1:2, we can render perfectly square pixels by setting the background color to the top pixel color, and the foreground color to the bottom pixel color.
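A minimal sketch of that trick, assuming a truecolor terminal (the RGB values here are made up): the background carries the top pixel, the foreground carries the bottom pixel, and the glyph is U+2584.

use strict; use warnings;
binmode STDOUT, ':utf8';

my @top    = (220, 50, 50);   # top pixel color (example values)
my @bottom = ( 50, 50, 220);  # bottom pixel color (example values)
printf "\e[48;2;%d;%d;%dm\e[38;2;%d;%d;%dm\x{2584}\e[0m\n", @top, @bottom;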
I did write a very fast SIXEL back-end [0] with adaptive quantization and GPU acceleration, but most emulators have a very inefficient implementation unfortunately. For example iTerm converts the SIXEL data to PNG, writes it to tmpdir, and displays it as an image you can drag and drop.
I’m still exploring it though, but as a separate program to run any X apps through SSH.
What nonsense, it takes literally 15 lines of code without using anything beyond the standard library to write a client for the kitty graphics protocol. I challenge you to match that for sixel. https://sw.kovidgoyal.net/kitty/graphics-protocol/#a-minimal...
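For reference, roughly the same idea in Perl (this is not the code behind the link, which is Python; it's a sketch that assumes image.png is a hypothetical file small enough for its base64 payload to fit in a single 4096-byte chunk, so no chunking keys are needed):

use strict; use warnings;
use MIME::Base64 qw(encode_base64);   # core module

# f=100 means the payload is PNG data, a=T means transmit and display.
open my $fh, '<:raw', 'image.png' or die $!;
my $png = do { local $/; <$fh> };
printf "\e_Gf=100,a=T;%s\e\\", encode_base64($png, '');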
Sixels are pixels and enjoy a wide support due to how old this VT240 format is.
The Kitty protocol starts with multiple formats:
>> The terminal emulator must understand pixel data in three formats, 24-bit RGB, 32-bit RGBA and PNG.
Different tools for different needs, but if you are going for wide support you want something simple that doesn't also have four different transmission types you have to separately implement and test:
>> d: Direct (the data is transmitted within the escape code itself)
>> f: A simple file (regular files only, not named pipes or similar)
>> t: A temporary file, the terminal emulator will delete the file after reading the pixel data. For security reasons the terminal emulator should only delete the file if it is in a known temporary directory, such as /tmp, /dev/shm, TMPDIR env var if present and any platform specific temporary directories and the file has the string tty-graphics-protocol in its full file path.
>> s: A shared memory object, which on POSIX systems is a POSIX shared memory object and on Windows is a Named shared memory object. The terminal emulator must read the data from the memory object and then unlink and close it on POSIX and just close it on Windows.
> What nonsense, it takes literally 15 lines of code without using anything beyond the standard library to write a client
Conveniently taking a pre-encoded PNG and assuming away the queries needed to discover what the terminal actually supports:
>> Since a client has no a-priori knowledge of whether it shares a filesystem/shared memory with the terminal emulator, it can send an id with the control data, using the i key (which can be an arbitrary positive integer up to 4294967295, it must not be zero).
> I challenge you to match that for sixel
Challenge accepted. There are many libraries for many languages. Let's look at this one in perl:
use Image::LibSIXEL;

# Encode a JPEG into sixel output, scaled to 400 pixels wide with a 16-color palette.
my $encoder = Image::LibSIXEL::Encoder->new();
$encoder->setopt("w", 400);
$encoder->setopt("p", 16);
$encoder->encode("images/egret.jpg");
and some print($encoder) to print the output, and this works on all the terminals listed on https://www.arewesixelyet.com/
Compared to kitty: "As of April 2022, kitty and WezTerm are the only terminal emulators to support this graphics protocol completely, with Konsole and wayst having partial support"
If we want graphics in the terminal, first we need graphics in the terminal (using sixels is the simplest way, and you get access to many tools); then you can do more if you still want to - but you're likely to realize by then that all the extra things Kitty supports are YAGNI: https://en.wikipedia.org/wiki/You_aren%27t_gonna_need_it?use...
1) sixel has no querying. So if you want to compare apples to apples, you leave out the querying from the kitty protocol as well.
2) The code I linked to above uses only the standard library, which I emphasized. You are using some random sixel-specific library to add support for sixel to your Perl script.
So no, you don't meet the challenge. I can output images with the kitty protocol using pure BASH + base64, available everywhere.
Oh and incidentally the saitoha sixel library is both unmaintained and full of security issues, so save yourself some headaches and dump it.
Trying to claim the kitty protocol is harder to use than sixel is beyond disingenuous.
Either you have PNG data, or you have to support the various RGB encodings
> I can output images with the kitty protocol using pure BASH + base64, available everywhere.
This is disingenuous: so can I if instead of .PNG files, I have .SIX files.
I won't even need base64 - which you'd have to implement in shell if you want a truly portable script. And if what you want to show isn't a .png or .jpg, you'll need more...
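To make that concrete, a tiny sketch (image.six is a hypothetical pre-encoded file): since sixel data is nothing but escape sequences, "displaying" it is just writing the raw bytes to the terminal.

use strict; use warnings;

open my $fh, '<:raw', 'image.six' or die $!;
print do { local $/; <$fh> };   # the escape codes go straight to the tty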
> Trying to claim the kitty protocol is harder to use than sixel is beyond disingenuous.
Funny, because I think the same about you.
Given the tone of your previous post, and the tone of this post, I'm out.
Yes, and PNG is an actual file format, widely used and available. SIX isn't. And sure, if you want to invent mythical SIX file formats, I can just as well invent mythical KITTY_IMAGE file formats that you can just cat into a terminal, without needing even base64 or bash. The point is, as it stands, the kitty image protocol is FAR easier to USE to display images than sixels. Trying to claim otherwise is just ridiculous. Indeed, if you want to talk of ease of use, the iterm2 protocol is even easier, since iterm supports more image formats without requiring conversion to RGB.
Good that you are out, hopefully you are out of trying to shill sixels as well.
Oh I don't hide the fact that I love kitty. I have good reason to do so: it has almost single-handedly fixed so many problems that have plagued the terminal space for decades. And done so in a robust, thoughtful and well designed manner. If you don't like kitty that's fine. But PLEASE do not advocate broken, badly designed disasters like SIXEL and modifyOtherKeys. They will just doom this ecosystem for decades more.
Funny, I don't see vt52, vt100, vt102, vt200, xterm et al. on that list. Could this be why sixels aren't everywhere except terminal emulators that run on gui platforms with highly advanced graphics already?
xterm is on that list (and at least for me, the distro-provided version has sixels enabled; I've not dug into what build options can be switched off, but I wouldn't be surprised that lots of things people assume, such as colour depth, can be switched off). Given that the vt52, vt100, vt102 and vt200 predate the introduction of sixel (based on https://en.wikipedia.org/wiki/Sixel), I'm not surprised those 40 year old terminals don't support sixels ;)
It is crazy fast. Alacritty will merge it someday, but they have very, very high standards and allowing graphics is a major change. So it's taking a lot of time.
Mind blowing, that's the kind of project that makes me wonder what I am doing with my free time, before realizing I have not even 10% of the skills and smartness required to pull off something like this. Thank you for sharing, that's really incredible!
This is really neat. Did you consider using the full range of Unicode block glyphs to squeeze out a little bit more resolution from the terminal vs just using the half-block?
I've thought about that for a GB emulator that renders to the terminal, but how do you handle more than 2 colors? That was the issue that kept me from using it.
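One common answer, sketched below purely as an illustration of the general quadrant-block idea (not how derasterize or any specific project does it): quantize each 2x2 pixel cell down to two colors, then pick the quadrant glyph whose pattern matches which pixels got the foreground color.

use strict; use warnings;
use List::Util qw(sum);
binmode STDOUT, ':utf8';

# Quadrant glyphs indexed by a 4-bit pattern:
# bit 0 = upper-left, bit 1 = upper-right, bit 2 = lower-left, bit 3 = lower-right.
my @glyph = (' ',        "\x{2598}", "\x{259D}", "\x{2580}",
             "\x{2596}", "\x{258C}", "\x{259E}", "\x{259B}",
             "\x{2597}", "\x{259A}", "\x{2590}", "\x{259C}",
             "\x{2584}", "\x{2599}", "\x{259F}", "\x{2588}");

sub avg { my @p = @_; [ map { my $c = $_; int(sum(map { $_->[$c] } @p) / @p) } 0 .. 2 ] }

# One cell from four [r,g,b] pixels (UL, UR, LL, LR): split them into a bright
# and a dark group by luminance, average each group, and let the glyph pattern
# decide which quadrants get the foreground color.
sub cell {
    my @px  = @_;
    my @lum = map { 0.299 * $_->[0] + 0.587 * $_->[1] + 0.114 * $_->[2] } @px;
    my $mid = sum(@lum) / 4;
    my ($bits, @fg, @bg) = (0);
    for my $i (0 .. 3) {
        if ($lum[$i] >= $mid) { $bits |= 1 << $i; push @fg, $px[$i] }
        else                  { push @bg, $px[$i] }
    }
    my $fg = avg(@fg);
    my $bg = @bg ? avg(@bg) : $fg;   # uniform cell: full block in a single color
    return sprintf "\e[38;2;%d;%d;%dm\e[48;2;%d;%d;%dm%s\e[0m",
                   @$fg, @$bg, $glyph[$bits];
}

# Example: left half red, right half blue -> a left-half-block cell.
print cell([200, 30, 30], [30, 30, 200], [200, 30, 30], [30, 30, 200]), "\n";

So each cell still only carries two colors; the trick is picking the two that best represent its four pixels.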
This is incredible, think of the applications! Running a terminal inside a WASM VM inside a browser running from your VT100 terminal is now possible. I am literally floored, and I've been waiting for this moment for years. Thank you for coming up with this inventive solution, and I feel this will usher in a new era of "text" interfaces!
No, the opposite. This is half baked. Elinks having JS support would be much better. Videos? Just spawn mpv on audio/video links or send the links through xsel to mpv+yt-dlp.
Wow. Absolutely incredible work. Once in a while there's a rare demonstration of incredible software engineering on HN, and this is one of these times.
This is extremely cool. Do you plan to try to get the chromium patch upstreamed, or some subset of it?
Have you had problems with this triggering bot/scraper/abuse-detection?
You force-enable `kContentCapture`, `kContentCaptureTriggeringForExperiment`, and `kContentCaptureInWebLayer`, which make the browser look -- to the website -- a LOT like a bot/scraper and not at all like a normal user.
I plan on submitting some bug fixes, but I believe the rest will be a hard sell to the Chromium team. They also have very high standards, and the current implementation is not up to it as of today.
Something that could be cool is working with them / submitting patches to improve the DevTools protocol to be more efficient and suited to Carbonyl and html2svg's use-case. Combined with a virtual X server (in Rust of course!), this could provide very similar performance and remove the need to fork.
For the bot-detection, I only had Twitter telling me the browser is not compatible so far, it was caused by the "Carbonyl (version)" user-agent. I fixed it by switching to "Chromium (version) / Carbonyl".
Good point on kContentCapture, I didn't realize it could be visible to websites. I enabled it to experiment with new ways to get text data, but ended up falling back on Skia. I'm going to disable it now that it's unused!
Thanks for posting this link. I saw this submission and remembered that Chromium had a terminal rendering mode, but couldn't manage to find this page, since I wasn't searching Ozone specifically.
The fork here goes even further, which is really cool.
> The fork here goes even further, which is really cool.
The fork here goes incomparably further. Libcaca just produces an ‘ASCII art’ approximation of an image, and isn't capable of handling text. So, while the photo is too small to see it, the text on the terminal is gibberish. The libcaca backend was done as a backend interoperability exercise, and run on a real terminal just for fun.
Could probably zoom into the captcha with some elbow grease. The resolution is there if it filled the terminal.
This is actually the showstopper for me with any kind of terminal browser. I enjoyed playing with browsh (or what was it called?) a few years ago, but this is much better.
I wonder if my suggested workaround has any meat to it.
That is some really cool development!
I'm curious to see an implementation adding plugin support. And maybe ultimately bringing this all inside of emacs <3
AFAIK it used the same NeXTStep Text (NSText) API [0] that is still used on macOS today [1]; it didn't support rendering images at first:
> The inline images such as the world/book icon and the CERN icon, would have been displayed in separate windows, as it didn't at first do inline images.
Historically, DPS is one of the reasons Safari PDF exports look so good: Apple based CoreGraphics APIs on DPS to make the migration from NeXTStep to Mac OS easy. This makes the CoreGraphics<->PS/PDF conversion fairly straightforward.
Most emulate an xterm, which didn't have support for graphics. (And xterm itself is emulating an even older TTY … with features bolted on…)
There are some escape sequences you can feed to some terminals, though, that display graphics. "Sixels" are pretty well known for pushing graphics to a terminal, but they're not universally supported.
iTerm2, on macOS, has a means of escaping and then sending a PNG, which is then displayed, IIRC. (But it's unique to iTerm2, and IIRC iTerm2 identifies as an xterm so it's basically impossible to autodetect.)
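For reference, that iTerm2 escape is roughly this (a sketch assuming the inline-images feature is enabled; image.png is a hypothetical file, and the payload is just base64):

use strict; use warnings;
use MIME::Base64 qw(encode_base64);

# OSC 1337 with File=inline=1, terminated by BEL: iTerm2 decodes and shows the image.
open my $fh, '<:raw', 'image.png' or die $!;
my $png = do { local $/; <$fh> };
printf "\e]1337;File=inline=1:%s\a", encode_base64($png, '');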
Cool, ok, and certainly a surprising hack, but what's with the gushing comments? Who wants to run Chrome in a terminal? Almost nobody, because there are dedicated, lean alternatives for browsing since the start of the web, and almost nobody uses them.
Are you teasing us on purpose by not suggesting some well known lean alternatives? lynx comes to mind, doesn't show images, does work ok. But what are you thinking about?
Alas, in 2023 most of the modern web just won’t work with Lynx. Or w3m. Or Elinks. Or it will work but in a way that makes you suffer and miss on functionality you might need.
Just a consequence of turning a markup language into an OS.
But what's the point of using a terminal to render a website lynx and friends can't handle, when the browser is but a click away? I can imagine a scenario like this: you must use a website that can only be reached from within a certain network, and you can ssh in, but there's no X11 tunnel, and the website really requires rendering animated SVGs and Javascript. That would be a lousy design, but it might happen. But how often do you encounter that?
Personally, I'm pretty sick of the bloated, image heavy, ad infested nature of the modern web. Terminal browsers promise a lighter experience focused on content rather than style.
What we actually need would be a full-blown modern javascript-enabled browser written in simple and plain C (c89 with benign bits of c99/c11), namely with a compile-time/runtime object model suited for the "web".
A good start: re-use the parsers from netsurf, get Mr. Bellard and friends' quickJS... and then the ez part: get 748392749324732984 full-time devs for 748937489234 centuries in order to get such an engine working... BUT it is not finished since, once "working", you will have to play catch-up with blackrock/vanguard-financed web engines and everything they will do to break compat with your engine.
Some ppl are still trying with rust? servo something?
Been a user of it for a while now. But don't worry, wherever it used to work, well... it usually does not last very long: it feels like there is active work being done to keep any alternative to the vanguard/blackrock-financed javascript-able browsers from "working".
What do you think I have... Of course I run git from edbrowse, libpcre2, quickjs, tidy-html.
And it "works"... never for very long. It feels like once I manage to use edbrowse to interact with a "web app", it gets broken not long after.
I think edbrowse should use the (x)html parser from netsurf.
2 examples:
- I did download my motherboard bios update with edbrowse... did that once, then the site was changed and the new file-download javascript code does not run with edbrowse, and hasn't ever since.
- I used edbrowse to unlock the wifi in the starbucks in my city; a month later the javascript was changed and broke it. (I don't care anymore, I am going through 4G, IPv6, anywhere I want now.)
Well, gecko/webkit and blink are all financed by the big tech controlling funds. webkit (apple) and blink (google=alphabet) are owned and steered by those same ppl.
Oh, and those ppl own and steer microsoft too, and some wonder why those are using blink now. I do ask myself how long they will keep webkit and blink since they already have gecko to fend off the anti-trust laws.
But you are right, they made the "Standard" so insanely complex on top of the core noscript/basic (x)html, that it "works" only in their implementations with their armies of devs.
This is acutely toxic for humanity. The only "real" ways out of this: restore noscript/basic (x)html where it is still doing a good enough job, and if a "new" web has to be born, an implementation must be something reasonable in time/skills and effort for a small team of devs.
If you have a closer look, they did drop javascript a long time ago. duktape is just a remnant of this past. netsurf is mostly a noscript (x)HTML parser with a CSS renderer. I am not a user of netsurf, I use links2/lynx and edbrowse, but I was planning to write my own frontend to their CSS renderer (noscript). You can ask them how technically _insane_ the javascript-ed web is. They have a good idea, they tried and broke their jaws on it, and their work is rather clean.
Their code is plain and simple C; namely, their libs (because they did things cleanly) can be used as a stepping stone to write your own noscript browser, and if you want, with CSS rendering.
Regarding the javascript-ed web, if you have 7483947398 devs for 479837438 centuries, and are able to sustain the permanent planned obsolescence and breakage of the web "standard", and that forever, you may have a chance to provide a real alternative implementation, namely one not dependent on their SDK, so using a language simple enough like plain and simple C and certainly not c++/java and similar. Some are trying (still?) with rust (orders of magnitude cheaper than c++).
Don't forget, blink/geeko/webkit are financed by the blackrock/vanguard galaxy of owned companies ("big tech"), we are talking tens of thousands of billions of $, namely you must be "outside" the economy to have a chance, and don't worry, if you start to have something interesting (and that without using their SDK), they'll make your life rough (they will break their web sites/javascript on your implementation actively, "buy" your devs, try to torpedo your financing, etc).