Sixel support #371
Thank you for this issue and for putting this on our radar, I think this would be fantastic. |
We discussed this a couple of months back and would surely like to support something like this in order to enable video/image rendering in the terminal though I'm not sure this is going to take priority. IMO Zellij has some way to go in terms of compatibility, performance, and UX before we continue to venture into this (very exciting!) option. Other maintainers might have other thoughts in this regard |
Cool. I'm happy to confirm you are as excited about this as I am, and I also understand there are more urgent issues to tackle before working on this. I just wanted to make sure you knew about it. |
this is great, though if you want to do serious video and are willing to blaze a new trail, the sixel protocol is sadly terrible, and we could do much, much better. i'm the lead developer of Notcurses, pretty much the only game in town when it comes to mixing glyphs and bitmaps freely, and have some thoughts on the topic. in the meantime you might want to check out this essay. if you'd like to talk, mail me at [email protected], or respond here. if you want to stick to something that already exists, the kitty protocol is itself in many ways superior to sixel, and i heartily recommend implementing it (ideally in its 0.20 version, including the animation method) over sixel. sixel is kinda fundamentally underspecified and very difficult to cleanly mix with text (i've had to go through an obscene number of hoops to unify Sixel+Kitty). anyway, sorry to butt in. |
Some discussion of interoperability/terminal protocol here https://gitlab.freedesktop.org/terminal-wg/specifications/-/issues/12 iTerm2's protocol seems to have buy-in from several terminals too. I want to mention a use case: IPython REPL with inline plots and other image output, including latex math. This already works in terminals, outside of multiplexers, but would be very valuable inside. For me that's a more tangible usecase than just a hack to play videos in the terminal 🙂 . (Animations/videos are a common output in IPython too, btw). |
Jexer has supported sixel in a terminal multiplexer environment for a couple years now. XtermWM is an easy demo of those capabilities. Some notes from that time:
For zellij, you would be looking at both decoding and encoding:
|
Any updates? |
No one that I'm aware of is working on this. If anyone's interested and needs some guidance to get around the app - I'd be happy to help. |
FYI if you want a start on the decoding side, wezterm's sixel decoder is in terminalstate . |
I'm starting to look into this again. Been reading all the helpful content linked to in this issue (thanks all!) and doing some basic tinkering. Talking about Sixel, I think the major question I have for @dankamongmen, @AutumnMeowMeow and @joseluis is if the behaviour of the sixel images on screen is defined somewhere that I'm missing? Questions like:
The reason I'm asking these is that a way I'm interested in taking the Zellij implementation in is to automatically open any rendered sixel assets in a new floating pane. Logically this floating pane would "belong" to the terminal pane that opened it, any updates to that area would be happening in it, only it would be draggable and in a different place. This seems to me like a better UX than most of the implementations I've encountered - but I'm new to this protocol and am wondering if this would break other assumptions and expectations people usually have when developing/using Sixel? |
i don't believe there to be a single place where sixel behavior is canonically outlined in all its detail. there are a few places to look:
alacritty's ayosec/graphics branch is highly divergent from other implementations in terms of sixel removal (but has not yet been merged, and might never be). for other implementations, you've got the following basic semantics: (a) painting a sixel is done as a single logical event, distinct from events before and after to answer your questions:
|
Hi @imsnif, I have a DEC VT340+, which I believe was the ultimate hardware sixel terminal emulator Digital sold. I'm always happy to test programs to answer any questions which the documentation is vague about. Just file an issue at github.com/hackerb9/vt340test.
Typically sixels are mapped to specific cells, but it may be possible to abstract that away for a terminal multiplexer. Regardless, sixels never wrap.
Your video app/tool must handle everything. Sixel is great for command line use (the stereotypical glass TTY with past commands scrolling up as if on infinite fan-fold paper). But it is too low-level for the sort of full-screen interaction you are talking about. That's why @dankamongmen's notcurses is such a welcome library.
By "damaged", you mean the delta, right? You can re-render just the changed areas, if you want. Sixel supports 1-bit transparency, same as GIF, so it is possible for video to be optimized quite a bit further than current video players take advantage of. However, I wrote the simplest possible video player which splats every frame of sixels and was surprised that performance was more than sufficient when playing local files. (Someday, when I get around to it, I'll optimize it to work better over ssh.) If the user scrolls or adds line breaks, it would cause damage your application wouldn't know about. Since the application is responsible, you'll have to find another way around the problem. You can either prevent that from happening or you can simply send the whole I-frame every once in a while. I believe notcurses is the answer you're looking for as it should make it trivial to prevent the user from causing the damage in the first place.
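As a rough illustration of re-rendering only the changed area, here is a minimal Python sketch. It assumes frames are held as rows of palette indices; the function names are hypothetical and not taken from any player or library mentioned in this thread. The changed rectangle is snapped to 6-pixel rows because a sixel band is six pixels tall.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]  # rows of palette indices (assumed representation)
BBox = Tuple[int, int, int, int]  # left, top, right, bottom (right/bottom exclusive)


def changed_bbox(prev: Frame, curr: Frame) -> Optional[BBox]:
    """Bounding box of pixels that differ between two same-sized frames."""
    xs, ys = [], []
    for y, (row_a, row_b) in enumerate(zip(prev, curr)):
        for x, (a, b) in enumerate(zip(row_a, row_b)):
            if a != b:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # frames are identical, nothing to repaint
    return min(xs), min(ys), max(xs) + 1, max(ys) + 1


def snap_to_sixel_bands(bbox: BBox, band_height: int = 6) -> BBox:
    """Grow the box vertically so it starts and ends on a sixel band boundary."""
    left, top, right, bottom = bbox
    top = (top // band_height) * band_height
    bottom = -(-bottom // band_height) * band_height  # ceiling division
    return left, top, right, bottom
```

Only the snapped rectangle would then need to be re-encoded and sent for that frame; an occasional full frame still covers damage the application cannot see.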
I'm not sure I understand what you're saying. Unlike Kitty, sixel has no "assets". (Okay, technically, one can store sixel images in page memory, but that's probably not what you're talking about.) Are you thinking of something like XTerm's floating Tek4014 pane for drawing graphics? That's probably not what you mean, but if so, Tektronix graphics was a different protocol and treating sixels that way would certainly break my assumptions as a developer. To me, the biggest benefit of sixel is that it is integrated with the text. |
[blink] you know what, i don't think i'd considered this. maybe i had? i don't think my data model can take advantage of it immediately, but definitely something to keep in the back of my head. |
oh yeah; i had assumed that you couldn't use notcurses for some reason, but if you can, its entire reason for existence is abstracting all this crap away |
Hello. I'm no expert on this topic, but I came across this repo proposing a new terminal image protocol: https://github.com/contour-terminal/terminal-good-image-protocol And this article: The point is that sixel is neither the best nor the only image protocol. |
sure, you might enjoy reading https://nick-black.com/dankwiki/index.php?title=Theory_and_Practice_of_Sprixels, which discusses how these various backends can be united by a client library |
@dankamongmen thank you! |
i sat down to design my ideal protocol, and it ended up looking so much like kitty's that i just recommend using that. kitty's is far superior to sixel imho, save with regard to portability. of course if you use notcurses, it handles various backends for you =]. |
Thanks! Looks like my favorite terminal emulator, foot, doesn't support the kitty image protocol... btw, fzf is based on notcurses, right? Slightly off-topic, but do you know whether it supports image preview with any of the image protocols? |
foot absolutely supports sixel
it does not use notcurses. it uses both sixel and (iirc) kitty itself. |
@dankamongmen @hackerb9 - thanks for the explanations. I think I'm starting to get the hang of what's required of me. (also @hackerb9 - I didn't know about the Xterm floating pane, this looks really cool! I did mean something very similar, only rendered as a floating zellij pane - you can see something similar in the gif on this repo) A few more questions, if I may:
About using notcurses:
|
so i took a look through your code, and i think i agree with what you say here: given your current infrastructure, it makes sense for you to do it. what i think would make the most sense, though (and obviously once again i'm biased), is for you to undo some of your current infrastructure, and adopt notcurses for it. your entire system of planes and subwindows and z-axes are notcurses's bread and butter. as i wrote in chapter 8 of the book:
you could eliminate a ton of your code if you used it: all this sixel-atop-sixel annoyance would be handled for you, and you'd get kitty, etc. of course, if you're uninterested in doing that, i can understand. |
We support wide-glyphs (bugs still pop up here and there, but it's becoming less of an issue with time). I'd be curious to hear more about the concerns of RTL languages (I speak one and as far as I'm aware aside from the wide-glyph part, dealing with the directionality is mostly the app's responsibility?) but maybe we can do this in another issue. About the library itself: not using an external library for rendering is definitely a conscious decision I've made early on. Granted, as you say this causes a lot of grief (more than I was initially aware of), but since this is our bread and butter as well, I think it allowed us to add lots of specialized features and performance optimizations that I'm super happy about. I'm not saying we do them better, just that it's good to have this sort of freedom in our core offering. |
As a possibility, it may be worth considering offering alternative pixel support behind a non-default Ideally a future zero-dependencies 100% rust sixel (&kitty) solution for pixels in the terminal would be very nice to have, especially as a separate library, but that will probably require many more hours of work. |
Maybe I'm not understanding something, but wouldn't this also require us to do just as much work translating the notcurses state to our state? Or alternatively changing many parts of our architecture to use notcurses instead? |
It may very well be the case; I've not yet investigated how this library is built internally. If the work needed in both cases turns out to be similar while achieving similar compatibility/feature-set, it would probably mean it's worth it to go full rust. |
(I am generally away from F/OSS for the time being (all projects archived), so may be quite sporadic or late on responses. Trans stuff. Sorry. 🤷♀️) As @dankamongmen and I have both seen, sixel implementations are quite different by terminal. When you get to testing, I would recommend testing on xterm first, and then: mlterm, foot, mintty, wezterm. (Do it without manipulating DECSDM at all. Assume that the bottom text row is not available for sixel images.) If your solution works the same on all those, then you should be able to justify filing bugs on any others that look different. Two main issues are:
Point 2 above is also a serious challenge for terminals when you use lots of small images as jexer does now, and notcurses might in the future if it goes to a mosaics design: almost all of the terminals I tested had to fix crashes and bad screen artifacts, due to not being designed from the get-go for frequent cell-sized overwrites of larger images. Which answers the animation question: for sixel, you must replace the changed area for each frame. The 1-bit transparency for GIF-like animations would likely work for the majority of terminals, but not wezterm. I would be unsurprised if it also exposed memory pressure/bugs on others. Raster attributes: @j4james would probably know better than anyone, but so far I have not used or seen any cases (on current terminals) for a non-1:1 pixel aspect ratio. The image width/height are used though, for defining rectangles of current background color. Though some of the DEC standards suggest staying within raster attributes, in practice you can always go outside them: they are a minimum size of the image pixel data, not a maximum. Within the pty: implement the stuff on the bottom of this and the applications inside will be good to go. On the encoder to the terminal side:
Have fun! 🐱 💗 |
Thanks for all the details @AutumnMeowMeow ! I'm going to start hacking on this and will totally hit you (and the others on this thread) up with questions as they come. Hope you are well. |
More progress! Just merged #1316 which adds support for XTWINOPS 14 + 16 (we already supported 18). This both queries the terminal emulator in which Zellij runs for this pixel data on startup and SIGWINCH - so that we can know how many pixels fit inside a character cell when rendering Sixels, and also responds to similar queries from terminals running inside Zellij panes (adjusted for their size ofc). |
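To make the XTWINOPS exchange concrete, here is a small Python sketch of parsing the replies and deriving the cell size in pixels. The reply formats follow the xterm control sequence documentation (`CSI 4 ; height ; width t`, `CSI 6 ; height ; width t`, `CSI 8 ; rows ; columns t`); the function names are hypothetical and this is not Zellij's actual code.

```python
import re
from typing import Optional, Tuple

# Queries a multiplexer might write to its host terminal.
QUERY_TEXT_AREA_PIXELS = "\x1b[14t"  # reply: CSI 4 ; height ; width t
QUERY_CELL_PIXELS = "\x1b[16t"       # reply: CSI 6 ; height ; width t
QUERY_TEXT_AREA_CELLS = "\x1b[18t"   # reply: CSI 8 ; rows ; columns t

_REPLY = re.compile(r"\x1b\[(4|6|8);(\d+);(\d+)t")


def cell_size_px(replies: str) -> Optional[Tuple[int, int]]:
    """Return (cell_height, cell_width) in pixels, or None if unknown.

    Prefers the direct answer to query 16; otherwise divides the text-area
    pixel size (query 14) by the text-area size in cells (query 18).
    """
    found = {int(kind): (int(a), int(b)) for kind, a, b in _REPLY.findall(replies)}
    if 6 in found:
        return found[6]
    if 4 in found and 8 in found:
        (px_h, px_w), (rows, cols) = found[4], found[8]
        if rows and cols:
            return px_h // rows, px_w // cols
    return None


def pane_text_area_reply(pane_rows: int, pane_cols: int,
                         cell_h: int, cell_w: int) -> str:
    """Hypothetical answer to query 14 from a program inside a pane:
    the pane's size in pixels rather than the whole terminal's."""
    return f"\x1b[4;{pane_rows * cell_h};{pane_cols * cell_w}t"
```

Falling back to 14 and 18 is useful because not every terminal answers query 16.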
Making good progress - and unless there are surprises, I think the bulk of the work is behind me at this point. I created: https://github.com/zellij-org/sixel-image - a sixel serializer / deserializer which (if I dare say so myself) is pretty fast. Takes me less than 300ms to serialize and deserialize the infamous lady-of-shalott image from here: https://jexer.sourceforge.io/sixel.html (including writing it to the HD). Haven't done any thorough benchmarking though, so just a first impression. I'm hoping that having this as an external pure-rust crate with very few dependencies will encourage use outside of Zellij as well in the future. I have a local branch using this to successfully render images into Zellij panes, but there are naturally some adjustments to work out. I'll keep this thread posted. |
I do have a question for @j4james meanwhile: I implemented raster attributes so that the image is padded with current background color their full width/height. I thought this was what was suggested in this thread, but both implementations I've seen (xterm with the flag and wezterm) don't do this. What do you think is the most common and expected approach? |
The expected approach would be for you to match the behavior of the VT340 as demonstrated by the test cases here: https://github.com/hackerb9/vt340test/blob/main/j4james/raster_dimensions.sh The most common approach, though, is to studiously ignore the specifications and do whatever is most convenient. This need not be the same as anyone else, since they'll all be doing something different anyway. |
That's what I like hearing. :) Seriously though - since Zellij is an intermediary here, I want to try and make our behaviour in this regard as predictable as possible. I'd even be happy to go with "what most people do". That being said, I'll throw some stuff together and release experimental support ASAP and might ping you to give it a try and let me know what you think if you're willing. |
As an intermediary, the ideal approach would be for you to interpret the protocol correctly when parsing images into your internal buffer, but use a dumbed-down version of sixel when forwarding that content to the downstream clients. On the parsing side, though, most people probably won't care as long as you can handle a basic sixel image without exploding. |
Yeah, that's kind of the approach I've been following so far. For example, when we interpret a raster attribute with padding, we pad the internal buffer, but when we serialize it outside we add the padding explicitly rather than use a raster attribute. |
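A sketch of what a "dumbed-down" serializer could look like, in Python. It assumes the internal buffer is a grid of palette indices that has already been padded, so the raster attributes it emits match the pixel data exactly. `encode_sixel` is a hypothetical name, the output is deliberately unoptimized (no `!` repeat counts), and this is not how the sixel-image crate is implemented.

```python
from typing import Dict, List, Tuple


def encode_sixel(pixels: List[List[int]],
                 palette: Dict[int, Tuple[int, int, int]]) -> str:
    """Serialize a grid of palette indices as a plain sixel sequence.

    Palette components are percentages (0..100), as in the sixel RGB
    colour definition `#index;2;r;g;b`.
    """
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    out = ["\x1bPq"]                           # DCS ... q starts sixel data
    out.append(f'"1;1;{width};{height}')       # raster attributes: 1:1 aspect, exact size
    for index, (r, g, b) in sorted(palette.items()):
        out.append(f"#{index};2;{r};{g};{b}")  # define each colour register

    bands = []
    for band_top in range(0, height, 6):       # one band = 6 pixel rows
        band = pixels[band_top:band_top + 6]
        passes = []
        for color in sorted({c for row in band for c in row}):
            chars = []
            for x in range(width):
                bits = 0
                for dy, row in enumerate(band):
                    if row[x] == color:
                        bits |= 1 << dy        # least significant bit is the top pixel
                chars.append(chr(63 + bits))
            passes.append(f"#{color}" + "".join(chars))
        bands.append("$".join(passes))         # `$` returns to the start of the band
    out.append("-".join(bands))                # `-` moves to the next band
    out.append("\x1b\\")                       # ST ends the sequence
    return "".join(out)
```

Because the declared size and the data size agree, a downstream terminal that mishandles raster attributes has nothing to get wrong.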
An interesting approach on deserialize/serialize, in that you aren't actually quantizing a 24-bit image into sixel. That drastically reduces the work for the terminal multiplexing case. (FYI hanging on to the palette/registers is also what xterm does internally -- which is great for sixel, but a pain for adding support for 24/32-bit images.) It makes layered transparency of the text-window-over-image variety really straightforward too: just alpha-blend the background color of a floating window against the palette, and otherwise serialize as normal. Another door opens on new things to do. :-) Maybe a command-line tool to fast-crop a sixel image read from stdin, that could be plugged into a larger program...🤔 |
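The palette-blending idea can be sketched in a few lines of Python. This assumes the palette is kept as colour registers with sixel-style percentage components (0..100) and that the whole image sits under the floating window; `blend_palette` is a hypothetical helper, not an existing API.

```python
from typing import Dict, Tuple

Rgb = Tuple[int, int, int]  # components in percent, 0..100


def blend_palette(palette: Dict[int, Rgb], overlay: Rgb, alpha: float) -> Dict[int, Rgb]:
    """Blend every palette entry toward `overlay`.

    alpha = 0.0 leaves the image unchanged; alpha = 1.0 yields the overlay
    colour everywhere. The pixel data itself is untouched, so the image can
    be serialized as usual with the blended registers.
    """
    return {
        index: tuple(round(alpha * o + (1.0 - alpha) * c)
                     for c, o in zip(rgb, overlay))
        for index, rgb in palette.items()
    }
```

If only part of the image is covered, the covered region would need its own copy of the registers, which this sketch does not handle.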
Good to know xterm had the same idea. It shouldn't be too much of a problem to adjust the library's internal representation, keep the speed and allow other paths for deserializing these images. I don't have plans for supporting 24/32-bit images in the near future (who knows though, if someone else wants to do it).
Might take me a little bit because I'm plugging holes in other parts of Zellij (Sixels are my fun side project), but I hope to get at least 1 out of the way in the near future.
Yeah, that would be cool! My idea is providing a filter method and letting people do image thumbnails (or just general resizing). |
I implemented this by having an "anchor cell" (the top-left cell of an image) and then serializing the relevant parts of the image on render providing it intersects the pane in one way or another. The cropping of the sixel-image lib allowed me to also only render the changed part of the image rather than the whole image every time. The anchor cell is really only used for stuff like SIGWINCH (font size change) and scrollback overflow. Otherwise we know where the image is. This made things tremendously easier (I find) than marking each cell as being potentially under one or more images. |
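The anchor-cell approach can be illustrated with a small Python sketch that works out which part of an image intersects a pane. The names, and the convention that the anchor is given relative to the pane's top-left cell (negative once the image has partly scrolled out), are assumptions made for the example rather than Zellij's actual data model.

```python
from typing import NamedTuple, Optional


class Rect(NamedTuple):
    x: int       # left edge, in image pixel coordinates
    y: int       # top edge, in image pixel coordinates
    width: int
    height: int


def visible_image_region(anchor_col: int, anchor_row: int,
                         image_w_px: int, image_h_px: int,
                         pane_cols: int, pane_rows: int,
                         cell_w_px: int, cell_h_px: int) -> Optional[Rect]:
    """Return the part of the image that falls inside the pane, or None."""
    img_left = anchor_col * cell_w_px
    img_top = anchor_row * cell_h_px
    # Intersect the image rectangle with the pane rectangle, in pane pixels.
    left = max(img_left, 0)
    top = max(img_top, 0)
    right = min(img_left + image_w_px, pane_cols * cell_w_px)
    bottom = min(img_top + image_h_px, pane_rows * cell_h_px)
    if right <= left or bottom <= top:
        return None  # the image does not intersect the pane at all
    # Translate back into image coordinates so the image can be cropped.
    return Rect(left - img_left, top - img_top, right - left, bottom - top)
```

The returned rectangle is what would be passed to a crop step before serializing, so only the visible part of the image is sent on each render.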
More progress! New things that work:
What's left is handling Will keep this thread updated. |
Hey, question to the thread denizens regarding raster attributes padding once more (specifically for @j4james and @hackerb9 ) The behavior I'm seeing with some sixel assets (as well as Is my interpretation correct? Or am I running into a different issue here? I'm asking because this is different than what I understood from the discussion above. |
I'm not entirely clear what you're asking, but if your raster attributes are declared as say 120x120, but the sixel data for the image only takes up 60x60, then it should be padded out to 120x120 (assuming the background mode hasn't been set to transparent). Again, you can look at the test cases I linked above to see how this is meant to work. It's been a while since I've done any sixel testing, so things may have changed, but last I checked, none of XTerm, WezTerm, or Contour was handling raster attributes correctly. |
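The 120x120 versus 60x60 case can be written out as a short Python sketch. It assumes an opaque background mode and a grid-of-palette-indices buffer, and it treats the declared size as a minimum rather than a maximum, as noted earlier in the thread. `pad_to_raster` is a hypothetical helper.

```python
from typing import List


def pad_to_raster(pixels: List[List[int]], declared_w: int, declared_h: int,
                  background: int = 0) -> List[List[int]]:
    """Pad pixel data with the background colour up to the declared size.

    Data that extends beyond the declared size is kept, not clipped.
    """
    data_h = len(pixels)
    data_w = max((len(row) for row in pixels), default=0)
    width = max(declared_w, data_w)
    height = max(declared_h, data_h)
    # Extend each existing row to the full width, then add missing rows.
    padded = [list(row) + [background] * (width - len(row)) for row in pixels]
    padded += [[background] * width for _ in range(height - data_h)]
    return padded
```

With a transparent background mode the padding would simply be skipped.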
Yeah, that's what I'm asking. I guess in this case I'll kind of be forced to implement it in the same not-correct way, because our users use those terminals and apps developed for those terminals rather than the VT340. |
With |
I thought what you were going to do was make sure the attributes you set exactly matched the dimensions of the sixel data, then it shouldn't matter if the terminal emulator gets it wrong. |
For sure - the problem now is:
|
Ah, what you're probably running into there is the clipping issue. When you're padding an image, that padding can only extend as far as the boundaries of the screen (or the margins if there are any). The padding won't force the viewport to scroll if it extends past the bottom. So in your example, the image you produce may need to be anything between 400 and 500 in height depending on where on the screen it was rendered. I'm surprised lsix is doing that though. |
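A simplified Python sketch of the clipping rule described above. It assumes an image with 400 rows of pixel data and a declared height of 500 (illustrative numbers), and that `top_px` is the image's top edge after any scrolling caused by the data itself; the function name is hypothetical.

```python
def padded_height(data_h: int, declared_h: int,
                  top_px: int, screen_h_px: int) -> int:
    """Height the image ends up occupying once padding is clipped.

    Padding stops at the bottom of the screen (or bottom margin) and never
    forces a scroll, so it covers at most the space below the image's top.
    The pixel data itself is always kept.
    """
    available = screen_h_px - top_px
    return max(data_h, min(declared_h, available))
```

For a 1000-pixel-tall screen, the same image would come out 500 pixels tall when drawn at the top, 450 when drawn at pixel row 550, and 400 when drawn at row 600.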
I see! Right. I'll give it a try when I get a chance and report the results. I'm totally open to this being just a misunderstanding or misuse on my part. But just to make sure I ran |
I was right. This was my bug - I switched horizontal/vertical padding around. The gap was the horizontal padding which I mistakenly made vertical. It works now as expected. |
PR is up: #1557 |
At long last - this has been implemented! I just merged Sixel support into main and it will be available in the next release. Much thanks to all thread denizens for your help and support. It was really fun to see us terminal geeks coming together to improve the ecosystem. This was quite a task and would have been considerably harder without you. |
This is interesting. So width/height from raster attributes should certainly be working in contour. If anything (or anything else) is broken, we'd have to see if they're relevant at all to modern applications (eying at aspect ratio here), but nevertheless, getting to know that there is interest on a broken implementation to be fixed, I'm very open to that. (sorry for being late and I also don't plan to hijack that ticket nor forum. It's meant more like a general statement :-) ) |
When I last tested (which was a long time ago), the problems I saw were with images being clipped when the raster attribute dimensions were smaller than the actual image content, incorrect handling of sizes that didn't align with cell boundaries, and incorrect handling of zero/default values.
If you want to see all the ways in which your terminal doesn't match the original DEC implementation, there's a whole bunch of test cases here: https://github.com/hackerb9/vt340test/tree/main/j4james Whether you think any of that is relevant is another matter. But this is a good lesson to bear in mind when you're designing your new image protocol. Future terminal devs will likely be picking and choosing which bits of your spec they feel like following, and implementing it just as badly as current devs have done with sixel. |
A good point, James. Personally, I think current devs have done the best they could with the limited information available (a situation which @j4james has done quite a bit to rectify). Also, as much as I love that my VT340 is still useful, picking and choosing (and extending) is what has allowed sixel to evolve as a living protocol. And that's another thing to remember when designing the GIP: don't let perfect be the enemy of the good enough for right now. |
Several terminals already support Sixels, like xterm (in vt340 mode), wezterm, mlterm, and foot (kitty has its own thing), while others, like Alacritty and Microsoft Terminal, are working on it.
The thing is, right now there isn't yet a terminal multiplexer that really supports sixels. Screen and tmux don't and probably never will. (Although there has been a series of tmux forks trying to fix that (latest), they're unofficial and unstable, and personally I've never been able to make them work.)
It would be beyond fantastic if Zellij were the first one to support this feature, and helped bridge the gap on the road to a future where CLI apps can display pixel graphics inside terminal emulators. Soon more interesting apps will be created that make use of it.
Disclaimer: I'm currently developing the Rust bindings for notcurses which will feature pixel support in the next version 2.3.0.