shaz0x's comments

shaz0x · 2026-04-15T22:28:41 1776292121

On mobile, the Q4 vs Q6 tradeoff flips. Gemma 4 E2B at Q4_K_M barely fits in RAM on a 6GB Android, so Q6 isn't on the table. In practice, the Q4 hit shows up in tool-call reliability more than in general reasoning, which is usually fine for a constrained skill surface.
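
For anyone who wants to sanity-check the "barely fits" math, here is a rough back-of-envelope sketch. The bits-per-weight values are approximate averages for llama.cpp k-quants, and the parameter count, app RAM budget, and runtime overhead are hypothetical placeholders, not measured numbers for Gemma 4 E2B or any specific phone.

```typescript
// Rough weight-memory estimate for a quantized GGUF model.
// Bits-per-weight values are approximate averages, not exact file sizes.
const BITS_PER_WEIGHT = {
  Q4_K_M: 4.85, // approximate
  Q6_K: 6.56, // approximate
} as const;

type Quant = keyof typeof BITS_PER_WEIGHT;

// Size of the weights alone, in GiB.
function weightGiB(paramCount: number, quant: Quant): number {
  const bytes = (paramCount * BITS_PER_WEIGHT[quant]) / 8;
  return bytes / 1024 ** 3;
}

// True if weights plus a fixed overhead (KV cache, runtime) fit the budget.
function fitsInBudget(
  paramCount: number,
  quant: Quant,
  budgetGiB: number,
  overheadGiB: number
): boolean {
  return weightGiB(paramCount, quant) + overheadGiB <= budgetGiB;
}

// Hypothetical inputs: swap in real numbers for your model and device.
const params = 5e9; // assumed stored parameter count
const budgetGiB = 3.5; // assumed RAM the OS leaves one app on a 6GB phone
const overheadGiB = 0.5; // assumed KV cache + runtime overhead

for (const quant of Object.keys(BITS_PER_WEIGHT) as Quant[]) {
  const size = weightGiB(params, quant).toFixed(2);
  const fits = fitsInBudget(params, quant, budgetGiB, overheadGiB);
  console.log(`${quant}: ~${size} GiB weights, fits=${fits}`);
}
```

With these placeholder inputs, Q4_K_M squeezes under the budget and Q6_K does not, which is the shape of the tradeoff described above. The real cutoff depends on the actual file size and on how much memory Android lets the app keep.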

shaz0x · 2026-04-15T22:25:17 1776291917

Went through the SDK docs before asking. On RN/Expo specifically, does Fabric run inside a Bare worklet with IPC back to Hermes, or drop into a native module the way llama.rn does via JNI and llama.cpp? Perf and memory footprint would look very different between the two, so I'm curious which path you landed on.

elchiapp · 2026-04-16T13:25:59 1776345959

Bare worklet with IPC, exactly. Let me know if there's anything I can help with.

shaz0x · 2026-04-14T08:31:21 1776155481

Even Gemma 4 E2B is more useful than you'd think if you give it the right harness. I've been running it on Android via llama.rn and it handles function calling natively: the model outputs structured tool calls without any prompt engineering. It won't replace Opus for hard reasoning, but for a mobile app that needs to pick a tool and run it, the cost math is hard to argue with. $0/query forever.
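
To make "pick a tool and run it" concrete, here is a minimal sketch of the dispatch side of such a harness. It assumes the model's tool call arrives as a JSON string shaped like `{"name": ..., "arguments": {...}}`; that shape, the tool names, and `dispatchToolCall` are all hypothetical, and this is not llama.rn's API. It only covers what happens after the runtime hands back a structured call.

```typescript
type ToolArgs = Record<string, unknown>;

type Tool = {
  required: string[]; // argument names that must be present
  run: (args: ToolArgs) => string;
};

// Hypothetical tool registry for illustration only.
const tools: Record<string, Tool> = {
  get_battery_level: { required: [], run: () => "87%" },
  set_timer: {
    required: ["minutes"],
    run: (args) => `timer set for ${Number(args.minutes)} min`,
  },
};

type DispatchResult =
  | { ok: true; output: string }
  | { ok: false; error: string };

// Validate the model's call before running anything; small quantized
// models can emit malformed or unknown calls, so fail closed.
function dispatchToolCall(raw: string): DispatchResult {
  let call: unknown;
  try {
    call = JSON.parse(raw);
  } catch {
    return { ok: false, error: "not valid JSON" };
  }
  if (typeof call !== "object" || call === null) {
    return { ok: false, error: "not an object" };
  }
  const { name, arguments: args } = call as {
    name?: unknown;
    arguments?: unknown;
  };
  if (
    typeof name !== "string" ||
    !Object.prototype.hasOwnProperty.call(tools, name)
  ) {
    return { ok: false, error: `unknown tool: ${String(name)}` };
  }
  const tool = tools[name];
  const argObj = (typeof args === "object" && args !== null ? args : {}) as ToolArgs;
  const missing = tool.required.filter((key) => !(key in argObj));
  if (missing.length > 0) {
    return { ok: false, error: `missing arguments: ${missing.join(", ")}` };
  }
  return { ok: true, output: tool.run(argObj) };
}

console.log(dispatchToolCall('{"name":"set_timer","arguments":{"minutes":5}}'));
console.log(dispatchToolCall('{"name":"open_pod_bay_doors","arguments":{}}'));
```

Keeping the registry small and rejecting anything that doesn't validate is what makes a constrained skill surface workable: a bad call becomes an error you can retry on instead of a wrong action.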