Early-stage ARK Cloud API: stateful LLM sessions without repeated input-token costs. Conversation history is kept server-side, so you only send new messages.
Demo and docs: https://ark-labs.cloud/documentation
Looking for feedback on use cases and session controls (machine2machine).
On cookies: we use an HTTP cookie (ark_session_id) purely as an opaque session identifier. The cookie is how the client ties subsequent requests to the same pinned session/worker/GPUs on the provider side, so the provider can keep the model activations/state in GPU memory between calls. It's not magic for the model; it's a routing key that enables true session affinity.
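To make the "opaque routing key" idea concrete, here's a minimal sketch of the client side: the server sets ark_session_id on the first response, and the client just echoes it back on every later request so the load balancer can pin it to the same worker. The cookie value and Set-Cookie attributes below are made-up examples; only the cookie name comes from the post.

```python
from http.cookies import SimpleCookie

# First response from the provider sets the session cookie
# (value "abc123" is a hypothetical example).
set_cookie = "ark_session_id=abc123; Path=/; HttpOnly"
jar = SimpleCookie()
jar.load(set_cookie)

# Every subsequent request echoes the cookie back unchanged.
# The model never sees it; it's only a routing key for session affinity.
cookie_header = f"ark_session_id={jar['ark_session_id'].value}"
print(cookie_header)  # → ark_session_id=abc123
```

In practice an HTTP client with a cookie jar (e.g. a requests.Session in Python) handles this automatically, which is exactly why a plain cookie is a convenient affinity mechanism.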
On “thinking steps” and contamination: good point - naively persisting raw chain-of-thought tokens can degrade outputs. The ARK Labs stateful approach is not a blanket “store everything” policy.
And my criticism targets higher-level provider practices: things like response caching, aggressive prompt-matching / deduplication heuristics, or systems that return previously generated outputs when a new prompt is “similar enough.” Those high-level caches absolutely can produce the behaviour I described: a subtle prompt change that nevertheless gets routed to a cached reply.
The platform is live and we’re collecting data, but early results are very promising: input cost grows linearly with conversation length (instead of quadratically, as with resending full history), latency is lower, and we’re seeing ~80% input-token savings. At the same time we’d love to hear more feedback on whether this approach could be useful in real-world projects.
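A back-of-envelope sketch of where a figure like ~80% can come from (my illustration, not ARK's measurement methodology): a stateless API resends the whole history every turn, so cumulative input tokens grow quadratically, while a stateful session only pays for each new message once.

```python
def stateless_input_tokens(turn_tokens):
    # Each call resends the full conversation so far:
    # cumulative cost is quadratic in the number of turns.
    total, history = 0, 0
    for t in turn_tokens:
        history += t
        total += history
    return total

def stateful_input_tokens(turn_tokens):
    # Only the new message is sent each turn: linear growth.
    return sum(turn_tokens)

turns = [100] * 10          # 10 turns of ~100 input tokens each
a = stateless_input_tokens(turns)  # 100*(1+2+...+10) = 5500
b = stateful_input_tokens(turns)   # 1000
print(1 - b / a)            # → ~0.818, i.e. ~80% input-token savings
```

The exact savings depend on conversation length and message sizes; longer conversations push the ratio even higher.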
And about going against the grain, as you mentioned at the end… well — if startups didn’t think differently from everyone else, what would be the point of being a startup?
I’ve been leaning more toward open-source LLMs lately. They’re not as hyper-optimized for performance, which actually makes them feel more like the old-school OpenAI chats: you could just talk to them. Now you barely finish typing and the model is already force-feeding you an answer. These newer models feel over-tuned and have kind of lost that conversational flow.