
jacquesnadeau · 2026-01-13T21:34:56 1768340096

Apple container is more akin to a replacement for docker or colima (although patterned more like Kata containers where each container is a separate vm as opposed to a bunch of containers in a single vm). It's a promising project (and nice to see Apple employees work to improve containers on macOS).

Hopefully, they can work towards (1) being more docker api compatible and (2) being more composable. I wrote up https://github.com/apple/container/discussions/323 for more details on the limitations therein.

Originally, I planned to build shai to work really well on top of apple container but ultimately gave up because of the packaging issues.

jacquesnadeau · 2026-01-13T21:29:21 1768339761

cool to see ctenv. definitely a similar vibe. thanks for sharing! will look at more closely.

Interesting to see how you incorporated some dockerfile patterns. devcontainer feature-esque.

I'm curious to know if you are using it for the isolation concepts I call "cellular development": https://shai.run/docs/concepts/cellular-development/

jacquesnadeau · 2026-01-12T21:29:35 1768253375

I'm one of the creators of shai. Thanks for the callout!

Interesting to see the work on Yolobox and in this space generally.

The pattern we've seen as agent use grows is being thoughtful about what different agents get access to. One needs to start setting guardrails. Agents will break all kinds of normal boundaries to try to satisfy the user. Sometimes that is useful. Sometimes it's problematic. (For example, most devs have a bunch of credentials in their local env. One wants to be careful about which of those agents can use to do things.)

For rw of the current directory, shai allows that via `shai -rw .`; for starting as an alternative user, use `shai -u root`.

Shai definitely does have the attitude that you have to opt into access as opposed to allowing by default. One of the things we try to focus on is composability: different contexts likely need different resources, and shai's config reflects that. The expectation is that .shai/config.yaml is something committed to the repo and shared across developers.
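
To make the opt-in idea concrete, here is a minimal Python sketch of an environment allowlist. It is not how shai works internally and it does not use shai's config format; `allowlisted_env` and `run_agent` are hypothetical names made up for illustration. The only point is that the child process sees the variables you named and nothing else.

```python
import os
import subprocess


def allowlisted_env(allowed, source=None):
    """Return only the variables named in `allowed`; everything else is dropped.

    Hypothetical helper for illustration, not part of shai.
    """
    source = os.environ if source is None else source
    return {name: source[name] for name in allowed if name in source}


def run_agent(cmd, allowed):
    # The child process receives only the opted-in variables,
    # so credentials sitting in the local env are not inherited by default.
    return subprocess.run(
        cmd,
        env=allowlisted_env(allowed),
        capture_output=True,
        text=True,
    )
```

Something like `run_agent(["some-agent"], ["PATH", "HOME"])` would then start a (hypothetical) agent binary without any of the other variables in your shell.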

jacquesnadeau · on Nov 6, 2017

Arrow supports a union type for heterogeneous columns (we use it for random json in Dremio) and a 128-bit decimal.

jacquesnadeau · on Oct 31, 2017

Arrow is all about in-memory, not long-term persistence. Systems can write it to disk but it is the one in-memory representation to rule them all, not storage/disk. Disk has its own requirements and challenges outside the scope of Arrow.

jacquesnadeau · on Oct 31, 2017

It's more than an implementation detail because we're also targeting interoperability between multiple separate technologies. One of the key things that the article didn't fully cover is that Arrow serves two purposes: high performance processing and interoperability.

A key part of the vision is: two systems can share a representation of data to avoid serialization/deserialization overhead (and potentially copying in a shared memory environment). This is only possible if the in-memory format is also highly efficient for processing. This allows the processing systems (say Pandas and Dremio) to share a representation, both process against it, and then move the data between each other with zero overhead.

If you shared the data representation on the wire and then each application had to transform it to a better structure for processing, you're still paying for a form of ser/deser. By using Arrow for both processing and interoperation you benefit from near-no-cost movement between systems and also a highly efficient representation to process data (including some tools to get you started in the form of the Arrow libraries).
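
A rough way to see the "share a representation" point, using only the Python standard library as a stand-in (this is not the Arrow API, just the same idea in miniature): one side holds a contiguous column of values, the other side gets a view over the same bytes and processes it in place, with no encode/decode step and no copy.

```python
from array import array

# "Producer": a column of 64-bit floats stored contiguously, columnar style.
column = array("d", [1.0, 2.0, 3.0, 4.0])

# "Consumer": a memoryview references the same bytes; nothing is copied or re-encoded.
view = memoryview(column)

# The consumer processes the data directly against the shared buffer.
total = sum(view)

# A write through the view is visible to the producer, which shows the memory is shared.
view[0] = 10.0
shared = column[0] == 10.0
```

Arrow's contribution is standardizing that layout so that separate systems and languages can agree on it, rather than one process agreeing with itself as in this sketch.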

jacquesnadeau · on Oct 31, 2017

People are already using Arrow for high performance GPU processing. FPGA is possible but I'm not sure if anyone is actually doing this today.

jacquesnadeau · on Oct 31, 2017

One quick note to make on this. Kudu is a storage implementation (similar to Parquet in some ways). Arrow isn't about persistence and is actually built to be complementary to both Kudu and Parquet.

Also note: Kudu is a distributed process. Arrow and Parquet are libraries that can be embedded into your existing applications.

jacquesnadeau · on Oct 31, 2017

You nailed it. The most exciting part about all of this is being able to move between a "data science" context and a "database" context (and back again) without pain or penalty.

jacquesnadeau · on Oct 31, 2017

Possible, yes. Performant or pleasant? Maybe not :)