It is a precarious time to look back at how the roles of an IC and an engineering manager have been defined and extrapolate what those roles will look like in the future.
In my own team, I have seen ICs increasingly function like engineering managers, and even suffer some of the pitfalls of the role switch, as they change from reasoning about creating code to delegating to teams of software agents.
Increasingly, ICs need to understand the product roadmap more deeply, figure out how to spec a problem and the constraints on a solution so that their subordinates produce reasonable output, and serve as the communication bridge between other job functions and the entities actually producing the code.
I've also heard concerns about skill atrophy, as these team members spend less mental energy on language syntax, low-level logic, and the like, and more on interpreting abstract strategies for solving a problem and pattern-matching those strategies against their software engineering wisdom.
If anything, ICs should consider that the skills that will make them successful managing agents might be the ones that have made first-level engineering managers successful: the ability to coordinate with other job functions, map implementation strategies to product and organizational needs, and deliberately and carefully delegate and coordinate the work of others writing the actual code.
Just keep in mind that there are many highly motivated people directly working on this problem.
It's hard to predict how quickly it will be solved, and by whom first, but this appears to be a software engineering problem solvable with effort, resources, and time, not a fundamental physical law that must be circumvented, as in a physical sciences problem. Betting that it won't be solved well enough to affect today's work relatively soon is betting against substantial resources and investment.
Why do you think it's not a physical sciences problem? It could be the case that current technologies simply cannot scale due to fundamental physical issues. It could even be a fundamental rule of intelligent life, that one cannot create intelligence that surpasses its own.
Plenty of things get substantial resources and investment and go nowhere.
Of course I could be totally wrong and it's solved in the next couple years, it's almost impossible to make these predictions either way. But I get the feeling people are underestimating what it takes to be truly intelligent, especially when efficiency is important.
That's not what I mean; rather, that humans cannot create a type of intelligence that supersedes roughly what human intelligence is capable of, because doing so would essentially require us to be smarter.
Not to say we can't create machines that far surpass our abilities on a single axis or small set of axes.
Think hard about this. Does that seem to you like it's likely to be a physical law?
First of all, it's not necessary for one person to build that super-intelligence all by themselves, or to understand it fully. It can be developed by a team, each of whom understands only a small part of the whole.
Secondly, it doesn't necessarily even require anybody to understand it. The way AI models are built today is by pressing "go" on a giant optimizer. We understand the inputs (data) and the optimizer machine (very expensive linear algebra) and the connective structure of the solution (transformer) but nobody fully understands the loss-minimizing solution that emerges from this process. We study these solutions empirically and are surprised by how they succeed and fail.
We may find we can keep improving the optimization machine, and tweaking the architecture, and eventually hit something with the capacity to grow beyond our own intelligence, and it's not a requirement that anyone understands how the resulting model works.
We also have many instances in nature and history of processes that follow this pattern, where one might expect to find a similar "law". Mammals can give birth to children that grow bigger than their parents. We can make metals purer than the crucible we melted them in. We can make machines more precise than the machines that made their parts. Evolution itself created human intelligence from the repeated application of very simple rules.
On the contrary, nearly every machine we've created is capable of things that we are not capable of ourselves. Cars travel more than twice as fast as the swiftest human. Airplanes fly. Calculators do math in an instant that would take a human months. Lightbulbs emit light. Cranes lift many tons. And so on and so forth.
So to create something that exceeds our capabilities is not a matter of hubris (as if physical laws cared about hubris anyway), it's an unambiguously ordinary occurrence.
Well your great great ... great grandfather was a fish. So the universe is clearly fine with gradual intelligence increments. So all we need to do is make an AI that itself can make an AI that is slightly smarter than itself.
I'll believe that claim when a SOTA model can autonomously create content that matches the quality and length of any average PhD dissertation. As of right now, we're nowhere near that and don't know how we could possibly get there.
SOTA models are superhuman in a narrow sense, in that they have solid background knowledge of pretty much any subject they've been trained on. That's great. But no, it doesn't turn your AI datacenter into "a country of geniuses".
Are humans just PhD students in a vat? Can a SOTA model walk? Humans in general find that task, along with a trillion other tasks that SOTA models cannot do, absolutely trivial.
Seems like if evolution managed to create intelligence from slime I wouldn't bet on there being some fundamental limit that prevents us from making something smarter than us.
Many highly motivated people with substantial resources and investment have worked on a lot of things and then failed at them with nothing to show for it.
The implication of your assertion is pretty much a digital singularity. You’re implying that there will be no need for humans to interact with the digital world at all, because any work in the digital world will be achievable by AI.
Wonder what that means for meatspace.
Edit: Would also disagree this isn’t a physics problem. Pretty sure power required scales according to problem complexity. At a certain level of problem complexity we’re pretty much required to put enough carbon in the atmosphere to cook everyone to a crisp.
Edit 2: illustrative example, an Epic in Jira: “Design fusion reactor”
Kinelo | kinelo.com | Full-Stack Engineer (AI + Backend) | San Francisco | Onsite
Kinelo manages the new type of hybrid team that will emerge in the future: humans+AI working shoulder-to-“shoulder”.
Our first product helps human software engineering teams deliver more efficiently and more reliably by assisting in coordination and engineering management. But it’s built on a foundational layer that ingests data, knowledge, and process across communications and systems, exposing that to humans and AI alike.
Eventually, Kinelo will become a proper human+AI orchestration layer, managing entire companies, and our goal is to have Kinelo run Kinelo (we already dogfood it today).
We’re looking for high ownership individuals to join our small team and shape its direction. We also just opened a new SF office and new hires will have the ability to help shape company culture, engineering practices, architecture, and roadmap.
We’re well-funded (just raised another round), founded by a serial founder, anti-bureaucratic, and highly technical.
We’re looking for:
- Those who want to go 0 to 1 with high ownership: moving features and products from idea on a napkin, to prototype, to engineering UI, to shipped and polished product
- ~8 years of professional experience, including the pre-LLM days, with strong CS fundamentals
- A balance of wisdom from experience and optimism about the future
- Strong TypeScript and Postgres (or other RDBMS) experience at scale and with high reliability (SaaS apps, APIs, etc)
- Experience with a variety of technologies, including containerization, distributed systems, and at least one strongly typed programming language (Go, C++, Java, etc)
- Ideally experience with ML and AI systems (though expertise in ML, data science, etc is not needed for this role)
We can’t sponsor visas but relocation assistance to SF is possible.
Avy | Staff Engineer / Startup Polyglot Engineer | San Francisco | Onsite (Hybrid) | Full-Time | https://www.avy.app
We are an early-stage, well-funded startup making humans and computers work together more efficiently. Experienced team from Apple AIML and other great companies.
We're looking for a polyglot "startup engineer" at the Staff+ level who can help with the end-to-end implementation of new features and capabilities and who can move between different components, be creative, be flexible, take ownership, and get things done. We have a complex, challenging product, but one that's amazing to work on and that will soon change how people think about working with each other and with AI.
You'll enjoy this role if:
- You like going 0 to 1 with high ownership: moving features and products from idea on a napkin, to prototype, to engineering UI, to shipped and polished product
- You are innovative, get satisfaction out of learning new things, and thrive in an environment of uncertainty and opportunity
To succeed, you should:
- Have the wisdom to know when to take the fast route vs when to step back and clean up your mess
- Be capable and excel on your own, but be humble and communicate well with others as projects require more collaboration
- Be versed across many programming languages (e.g. from C++/Java/Swift to Go to JavaScript or Python)
- Have worked on many types of projects and software -- have you built servers, written games, shipped native apps, and vibe-coded (kidding) web apps? If so, you're a fit.
- Be versatile and work both high level and low level, on old tech and new. Have you both prompt-engineered an LLM AND tweaked the code that decodes tokens in your own local LLM runner? If so, you're a fit.
Avy | Staff Engineer / Startup Polyglot Engineer | San Francisco or Salt Lake City | Onsite (Hybrid) | Full-Time | https://www.avy.ai
We are an early-stage, well-funded startup making humans and computers work together more efficiently. Experienced team from Apple AIML and other great companies.
We will be opening several roles soon, but are previewing this one early: Staff Engineer / Startup Polyglot Engineer.
You'll enjoy this role if:
- You like going 0 to 1 with high ownership: moving features and products from idea on a napkin, to prototype, to engineering UI, to shipped and polished product
- You are innovative, get satisfaction out of learning new things, and thrive in an environment of uncertainty and opportunity
To succeed, you should:
- Have the wisdom to know when to take the fast route vs when to step back and clean up your mess
- Be capable and excel on your own, but be humble and communicate well with others as projects require more collaboration
- Be versed across many programming languages (e.g. from C++/Java/Swift to Go to JavaScript or Python)
- Have worked on many types of projects and software -- have you built servers, written games, shipped native apps, and vibe-coded (kidding) web apps? If so, you're a fit.
- Be versatile and work both high level and low level, on old tech and new. Have you both prompt-engineered an LLM AND tweaked the code that decodes tokens in your own local LLM runner? If so, you're a fit.
All said, we're looking for the stereotypical "startup engineer" who can move between different components, be creative, be flexible, take ownership, and get things done. We have a complex, challenging product, but one that's amazing to work on and that will soon change how people think about working with each other and with AI.
1. I stumbled for a while looking for the license on your website before finding the Apache 2.0 mark on the Hugging Face model. That's big! Advertising it on your website and the GitHub repo would be nice. What's the business model, though?
2. Given the Llama 3 backbone, what's the lift to make this runnable in other languages and inference frameworks? (Specifically asking about MLX, but also llama.cpp, Ollama, etc.)
Google Colab is quite easy to use and has the benefit of not making your local computer feel sluggish while you run the training. The linked Unsloth post provides a notebook that can be launched there, and I've had pretty good luck adapting their other notebooks to different foundation models. As a sibling noted, if you're using LoRA instead of a full fine-tune, you can create adapters for fairly large models within the VRAM available in Colab, especially on the paid plans.
If you have a Mac, you can also do pretty well training LoRA adapters using something like LLaMA-Factory and letting it run overnight. It's slower than an NVIDIA GPU, but the larger effective memory (if you have, say, 128GB) can give you more flexibility.
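For context on what kicking off such a run looks like, LLaMA-Factory is driven by a YAML config passed to its CLI. A rough sketch is below; the field names are from memory of the project's example SFT configs, and the model path, dataset name, and hyperparameters are placeholders to check against the repo's examples before using:

```yaml
### model
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct

### method
stage: sft
do_train: true
finetuning_type: lora
lora_target: all
lora_rank: 8

### dataset (must be registered in data/dataset_info.json)
dataset: my_dataset
template: llama3
cutoff_len: 2048

### output
output_dir: saves/llama3-8b/lora/sft

### train
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
```

You would then run something like `llamafactory-cli train my_config.yaml` and let it churn overnight.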
Vision LLMs are definitely an interesting application.
At Avy.ai we're running small (2B-7B, quantized) vision models as part of a Mac desktop application for understanding what someone is working on in the moment, to offer them related information and actions.
We found that the raw image-understanding results with a light LoRA fine-tune are not substantially different. But the ease of getting a small model to follow instructions, outputting structured data in response to the image at the level of verbosity and detail we need, is greatly enhanced by fine-tuning. Without it, the models on the smaller end of that scale would be much more difficult to use, since they don't reliably produce output matching what the consuming application expects.
Using a grammar to force decoding of, say, valid JSON would work, but that hasn't always been available in the implementations we've been using (like MLX). It's solvable by adding grammar support to the decoders in those frameworks, but fine-tuning has been effective without that work.
The bigger win, though, was getting the models to produce the appropriate level of verbosity and detail in their output, which fine-tuning made more consistent.
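As a rough illustration of the fallback when neither grammar-constrained decoding nor fine-tuning is available, the consuming application can validate the model's raw text and re-prompt on failure. This is a minimal sketch, not our production code; `regenerate` is a hypothetical callback that asks the model for another attempt:

```python
import json


def parse_structured_output(raw_text, required_keys, max_attempts=3, regenerate=None):
    """Try to parse model output as JSON containing the expected keys.

    regenerate: optional callback that re-prompts the model and returns new text.
    Returns the parsed dict, or None if no attempt yields usable output.
    """
    text = raw_text
    for _ in range(max_attempts):
        # Small models often wrap JSON in prose or code fences; trim to the
        # outermost braces before parsing.
        start, end = text.find("{"), text.rfind("}")
        if start != -1 and end > start:
            try:
                obj = json.loads(text[start:end + 1])
                if all(key in obj for key in required_keys):
                    return obj
            except json.JSONDecodeError:
                pass  # fall through and re-prompt if we can
        if regenerate is None:
            break
        text = regenerate()  # hypothetical: ask the model again
    return None
```

The retry loop trades latency for reliability, which is part of why a fine-tune that gets the format right on the first pass is so valuable in an interactive desktop app.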
It certainly would be nice if Apple explicitly supported replacing Apple Intelligence with a third party option by exposing the same privileged APIs they use to others.
What exactly would you like a third party system to do? Mac or iOS?
I'd argue that there's no way that Apple would want to offer access to something like that to any third party, and that it'd be better to start implementing it ourselves from scratch in any other operating system.