For those not aware, Shift Left[1] is (at this point) an old term that was coined for a specific use case, but now refers to a general concept. The concept is that, if you do needed things earlier in a product cycle, it will end up reducing your expense and time in the long run, even if it seems like it's taking longer for you to "get somewhere" earlier on. I think this[2] article is a good no-nonsense explainer for "Why Shift Left?".
That's an orthogonal concept. The idea of "shift left" is that you have work that needs to be done regardless (it's not an analysis of "bare minimum"), but tends strongly to be serialized for organizational reasons. And you find ways to parallelize the tasks.
The classic example is driver development. No one, even today, sits down and writes e.g. a Linux driver until after first silicon has reached the software developers. Early software is all test stuff, and tends to be written to hardware specs and doesn't understand the way it'll be integrated. So inevitably your driver team can't deliver on time because they need to work around nonsense.
And it feeds back. The nonsense currently being discovered by the driver folks doesn't get back to the hardware team in time to fix it, because they're already taping out the next version. So nonsense persists in version after version.
Shift left is about trying to fix that by bringing the later stages of development on board earlier.
Sounds like a similar/parallel thought to the project management waterfall paradigm, whereby the earlier you get things correct (left), the less costly it is in the long run; conversely, if you have to go and redo things later on (right), you’re in for a shock (in cost, time or quality).
Funny how correct is not associated with right. The normal association is reflected in the Haskell Either datatype for describing computations that either run into some error (Left error) or run successfully producing a value (Right value).
That's because the association isn't right or wrong, it's progressing or not progressing, deriving from reading order. A Left value typically halts computation there and returns something to the user, whereas a Right value allows you to move on.
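To make the "halt versus move on" behaviour concrete, here is a minimal Python sketch that mimics Haskell's Either. It is not Haskell's actual implementation; `Left`, `Right`, `bind`, `parse_int`, `reciprocal` and `parse_then_invert` are names made up for this illustration.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

E = TypeVar("E")
A = TypeVar("A")


@dataclass(frozen=True)
class Left(Generic[E]):
    """Halts the computation and carries an error back to the caller."""
    error: E


@dataclass(frozen=True)
class Right(Generic[A]):
    """Carries a value forward to the next step."""
    value: A


Either = Union[Left[E], Right[A]]


def bind(result, step):
    """Run `step` only on a Right; a Left passes through untouched."""
    if isinstance(result, Left):
        return result
    return step(result.value)


def parse_int(text):
    # Hypothetical first step: may fail with a Left.
    try:
        return Right(int(text))
    except ValueError:
        return Left(f"not an integer: {text!r}")


def reciprocal(n):
    # Hypothetical second step: only reached if parsing produced a Right.
    if n == 0:
        return Left("division by zero")
    return Right(1 / n)


def parse_then_invert(text):
    return bind(parse_int(text), reciprocal)
```

With these definitions, `parse_then_invert("abc")` stops at the first step and returns that step's `Left`, while `parse_then_invert("4")` flows through both steps and ends in a `Right`.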
One nice side effect of tying the mnemonic to reading direction rather than homonyms is that it carries over across languages better (though still imperfectly).
I’ve come to the conclusion that Kent Beck got just about everything right with Extreme Programming. There’s just not a formal training industry out there pushing it so many more people are selling and invested in other approaches with a lot of warts.
XP is almost the opposite, or shift right, similar to how every other “agile” development form pushes the necessary work forward. I think it’s telling that none of the modern software design systems seem to have given us a world where our industry is capable of delivering software on time and within budget. They’ve been around for 20 years, places follow them religiously and still fail.
Maybe everything about those concepts is just wrong? I mean, you can have people like Uncle Bob who’ll tell you that “you just got it wrong”. He’s also always correct in this, but if so many teams “get it wrong” then maybe things like clean, XP and so on simply suck?
> I think it’s telling that none of the modern software design systems seem to have given us a world where our industry is capable of delivering software on time and within budget.
It's worth noting that left-leaning systems were widely used once too, and they had about half of the overall success rate of the right-leaning ones on those criteria.
I’ve never seen any actual metrics on it that weren’t pseudoscience at best. If you have some I would genuinely like to see them.
From my anecdotal experience YAGNI and making sure it’s easy to delete everything is the only way to build lasting maintainability in software. All those fancy methods like SOLID, DRY, XP are basically just invitations for building complexity you’ll never actually need. Not that you can really say that something like XP is all wrong, nothing about it is bad. It’s just that nothing about it is good without extreme moderation either.
I guess it depends on where you work. If it’s for customers or a business, then I think you should just get things out there. Running into scalability issues is good, it means you’ve made it further than 95% of all software projects.
No evidence most of the activities actually save money with modern ways of delivering software (or even ancient ways of delivering software; I looked back and the IBM study showing increasing costs for finding bugs later in the pipeline was actually made up data!)
To be more specific, let's say I can write an e2e test on an actual pre-prod environment, or I can invest much development and ongoing maintenance to develop stub responses so that the test can run before submit in a partial system. How much is "shifting left" worth versus investing in speeding up the deployment pipeline and fast flag rollout and monitoring?
Nobody I've worked with can ever quantify the ROI for elaborate fake test environments, but somebody made an OKR so there you go. Far be it from us to follow actual research done on modern software... http://dora.dev
In fact, in my experience, these elaborate test environments and procedures cripple products.
I'm firmly of the opinion that if a test can't be run completely locally then it shouldn't be run. These test environments can be super fragile. They often rely on a symphony of teams ensuring everything is in a good state all the time. But, what happens more often than not, is one team somewhere deploys a broken version of their software to the test environment (because, of course they do) in order to run their fleet of e2e tests. That invariably ends up blowing up the rest of the org depending on that broken software and heaven help you if the person that deployed did it at 5pm and is gone on vacation.
This rippling failure mode happens because it's easier to write e2e tests which depend on a functional environment than it is to write and maintain mock services and mock data. Yet the mock services and data are precisely what you need to ensure someone doesn't screw up the test environment in the first place.
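As a rough sketch of what a locally runnable test against a mock service can look like, consider the Python below. Everything in it is hypothetical: `InventoryService`, `StubInventoryService` and `can_fulfil` are invented for illustration and do not describe any real system.

```python
from typing import Dict, Protocol


class InventoryService(Protocol):
    """Hypothetical interface that the real remote service would implement."""

    def stock_for(self, sku: str) -> int: ...


class StubInventoryService:
    """In-memory stand-in so the test never touches a shared environment."""

    def __init__(self, stock: Dict[str, int]) -> None:
        self._stock = dict(stock)

    def stock_for(self, sku: str) -> int:
        # Unknown SKUs are treated as out of stock (an assumption of this stub).
        return self._stock.get(sku, 0)


def can_fulfil(order: Dict[str, int], inventory: InventoryService) -> bool:
    """Code under test: True if every line of the order is in stock."""
    return all(inventory.stock_for(sku) >= qty for sku, qty in order.items())
```

A test built on the stub passes or fails on its own inputs, regardless of what another team deployed to a shared environment at 5pm. The cost is that someone has to keep the stub's behaviour in line with the real service.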
You are not wrong but I have had many experiences where mock services resulted in totally broken systems since they were incorrectly mocked. In complex systems it is very hard to accurately mock interactions.
Personally I think the real issue is not the testing strategy but the system itself. Many organizations make systems overly complex. A well structured monolith with a few supporting services is usually easy to test while micro service/SOA hell is not.
There are many reasons you want to be able to turn up your whole stack quickly; disaster recovery is just one of them. And if you can turn up your environment quickly then why not have multiple staging environments? You start with the most recent of yours and everyone else's prod version, then vary combinations from there.
Obviously this is for large-scale systems and not small teams.
No argument, but there can be limitations on how much you can speed up the deployment pipeline. In particular, the article is about integrated circuit development (actually about systems made of many ICs), where a “production deployment” takes months and many, many millions of dollars, and there’s not much you can do about it.
I heard a story decades ago about a software team that got a new member transferred in from the IC design department. The new engineer checked in essentially zero bugs. The manager asked what the secret was, and the new engineer said “wait, we’re allowed to have bugs?”
ICs are about the only exception, but semiconductor engineers are no dummies and have been building the appropriate simulation tools for a long time now.
If you could spin a chip every day that would mostly be a huge waste of time.
Shift-left comes from the world before everyone was selling services. In that world shipping service packs was the way to fix problems discovered after a release, and releases were years apart.
From a QA perspective, I greatly regret that the world of infrequent releases is mostly gone. There are few kinds of products that still hold onto the old strategy, but this is a dying art.
I see the world of services with DevOps, push on green etc. strategies as a kind of fast-food of software development. A kind of way of doing things that allows one to borrow from the future self by promising to improve the quality eventually, but charging for that future improved quality today.
There are products where speeding the rollout is a bad idea. Anything that requires high reliability is in that category. And the reason is that highly reliable systems need to collect mileage before being released. For example, in storage products, it's typical to have systems run for a few months before they are "cleared for release". Of course, it's possible to continue development during this time, but it's a dangerous time for development because at any moment the system can be sent back to developers, and they would have to incorporate their more recent changes into the patched system when they restart the development process. I.e. a lot of development effort can be potentially wasted before the system is sent out to QA and the actual release. And, to amortize this waste, it's better to release less frequently. It's also better to approach the time when the system is handed to QA with a system already well-tested, as this will minimize the back-and-forth between the QA and the development -- and that's the problem shift-left was intended to solve.
NB. Here's another, perhaps novel thought for the "push on green" people. Once it was considered a bad idea for the QA to be aware of implementation details. Testing was seen as an experiment, where the QA were the test subjects. This also meant that exposing the QA to the internal details of the systems or the rationale that went into building it would "spoil" the results. In such a world, allowing the QA to see a half-baked system would be equivalent to exposing them to the details of the system's implementation, thus undermining their testing effort. The QA were supposed to receive the technical documentation for the system and work from it, trying to use the system as documented.
Companies won't pay enough for quality QA people, so you can't get good people to do it instead of a more lucrative dev position. So now everybody has to do testing, except they only want to do the bare minimum and they aren't as practiced at it so they aren't as good as someone with equivalent talent but more experience writing comprehensive tests.
Start paying "QA" more than their dev partners consistently and with better promotion opportunities and you can get better testing, but everybody seems to be making plenty of money without it.
Last rant: everybody is testing in production, but only some people are looking at the results. If you aren't then there's better ROI to be found than "shifting left" your e2e tests.
Are you referring to the IBM Systems Science claims (likely apocryphal) in the Pressman paper, or Barry Boehm's figure in "Software Engineering" 1976 paper which did include some IBM sourcing (literally drawn on) but was primarily based on survey data from within TRW?
It baffles me that anyone would continue to promulgate the Pressman numbers (which claim ~exponential growth in cost) based on... it's not entirely clear what data, as opposed to Boehm's paper which only claims a linear relative cost increase, but is far more credible.
In a waterfall, single-deliverable model it wouldn't surprise me there is some increase in costs the later a bug is discovered, but if you're in that world you have more obvious problems to tackle.
People still use the Pressman numbers. So much for "data-driven decision making"...
> The concept is that, if you do needed things earlier in a product cycle, it will end up reducing your expense and time in the long run, even if it seems like it's taking longer for you to "get somewhere" earlier on.
Isn't ignoring the early steps that could save time later also known as false economy?
I'm familiar with both concepts and they don't seem related imo. Maybe if you applied bottleneck theory to process design in the abstract? Sorry I can't evaluate this comparison further, I must have run out of my quota for Abstract Reasoning Tokens today.
- implementing security features earlier (DevSecOps)
- implementing tracing and metrics/analysis tools earlier, and using them to test and debug apps earlier (as opposed to laptop-based solutions)
- building the reliable production model earlier (don't start with a toy model on your laptop if you're gonna end up with an RDS instance in AWS; build the big production thing first, and use it early on)
- adding synthetic end-to-end tests early on
The linked article is talking about Shift Left in the context of developing semiconductors, so you can see how it can be applied to anything. Just do the needed thing earlier, in order to iterate faster, improve quality, reduce cost, ship faster.
Yes, it can be applied to anything, but might not pay off with software like it does with semiconductors? Better tests and more static analysis are useful, but teams will often work on making their releases faster and easier, iterating more frequently, and then testing some things in production. We don’t want our software to be too brittle, so it should tolerate more variation in performance.
But with chip design, they can’t iterate that fast and performance is more important, so they are doing more design and testing before the expensive part, using increasingly elaborate simulations.
The knowledge needed to do the 'shift-left' tasks. The downside of making your developers take on more tasks that used to be handled by someone further down the chain is they need to know how to do those tasks.
This is interesting. Seems like it would mean something like having QA or infosec pair program with app devs. I'm sure somebody does this and am curious how it works out in practice.
More broadly it includes moving activities that are normally performed at a later stage so that they are earlier in the process. The idea is that defects found later in the process are more costly.
It can also mean manual testing earlier. Valve credits playtesting early and playtesting often for their success with Half-Life, Half-Life 2, and other games.
No, not really. It's a concept from the bygone era, before X-as-a-service took over a lot of software categories. It was intended to minimize friction between the QA and the development teams at the time the product was handed to QA. If the product had multiple defects, incrementally discovered, it would have to travel back and forth between the QA and the developers, stalling the development of the next version and generally creating a lot of noise and inconvenience due to the surrounding bureaucracy.
There's no need to write tests upfront for you to shift left. All shift left means is that testing happens during development. Whether you start by writing tests and then write the actual program or the other way around -- doesn't matter.
I had management who were so enthused about "shift left" that we shifted far left and broke the shifter, or perhaps, mis-shifted. Now we spend so much time developing test plans, testing, and arguing about the PRD that we actually deliver much more slowly than our competitors.
Another flavor of this I've encountered is people in the management chain who can only think about faults per time interval and not about faults per interaction.
So you make a tool that prevents errors and speeds up a process, now people use it four times as much, and now they wonder why they're only seeing half as many faults instead of an order of magnitude less.
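The arithmetic behind that can be sketched in a few lines of Python. The absolute counts below (100 faults, 1,000 interactions) are made-up placeholders; only the ratios, half as many faults at four times the usage, come from the comment above.

```python
from fractions import Fraction


def faults_per_interaction(faults: int, interactions: int) -> Fraction:
    """Fault rate normalised by usage rather than by calendar time."""
    return Fraction(faults, interactions)


# Illustrative placeholder numbers for one time interval before the tool.
before = faults_per_interaction(faults=100, interactions=1_000)

# After the tool: half as many faults observed, four times as much usage.
after = faults_per_interaction(faults=50, interactions=4_000)

# How many times lower the per-interaction fault rate is now.
improvement = before / after
```

Measured per time interval, faults only halved. Measured per interaction, the rate dropped by a factor of eight, which is close to the order of magnitude management was expecting.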
We are humans. We cannot eliminate errors, we can only replace them with a smaller number of different errors. Or as in your case, a larger number of them.
What helped me finally remember "left of what, exactly" is to imagine a timeline, left to right. Shifting left is moving things earlier in that timeline.
In my org [1], "shift left" means developers do more work sooner.
So before we clarify product requirements, we start the build early with assumptions that can change. Sometimes it works, sometimes it does not, which just means we end up net neutral.
But an executive somewhere up the management chain can claim more productivity. Lame.
I think that's a myopic view. Getting something, anything, in the hands of your potential users that's even vaguely in the ballpark of a solution-shaped thing gives you extremely valuable information both on what is actually needed and, to me more importantly, what you don't have to build at all.
I consider it a success when an entire line of work is scrapped because after initial testing the users say they wouldn't use it. Depending on the project scope that could be 6-7 figures of dev time not wasted right there.
I’ve never been in banking, but I’ve been in games (AAA) and biotech most of my career.
I agree with the previous commenter. It isn’t about breaking things or even moving fast (in my experience), but about getting “the juices flowing”.
I’ve found that when there’s a request for some new tool or thing, people have a very difficult time defining [all] the real world needs. They know they need X, but don’t remember - or even realize - they also need Y, Z, etc. Or worse, many people have been conditioned through corporate culture that they can’t even ask for things that would make their lives easier.
Getting something basic in someone’s hands quickly with the desire of creating a loop of feedback, feature, feedback, … gets the end user more focused and engaged with the final product, which gives the engineers more focused requirements. I’ve found it also gives the engineers the opportunity to make suggestions and get more invested.
I'm in fintech right now, experiencing that pain. We do a sort of "reverse waterfall".
We respond to spikes in exceptions, then read logs to try to figure out why something "went wrong". If we can change some code to fix it, we do so. But since we don't have confidence in our fixes, we have to move further back to the design and guess how the system is supposed to work. I'm hoping next month we start a dialogue with upstream and ask how we're supposed to use their API.
We seem to be hell bent on ignoring age old common sense repeatedly in our work while simultaneously inventing new names for them. Wonder why? Money? Survival? Personal ambition?
These concepts are not worth the paper you wipe your backside with unless you have a manager and a team that cares.
In software shifting left often also means putting more tasks into the hands of the person with the most context, aka, moving deployment from ops to developers.
The main benefits you get from this are reduced context switching, increased information retention, and increased ownership.
But it has to go with tooling and process that enable that. If you have a really long deployment process where devs can get distracted then they will lose context switching between tasks. If you make every dev a one-man army that has to do everything on their own you won't be able to ship anything and your infra will be a disjointed mess.
The key thing is reducing toil, and increasing decision making power within a single team/person.
From the sound of the article they might be just squishing two teams together? What's the advancement that made the two steps be able to happen at the same time?
I'm in chip design and our VPs and C-level execs have been saying "shift left" for the last year. Every time, someone in the Q&A asks "What does shift left mean?" No one talks this way except for executives. We just joke "The left shift key isn't working on my keyboard" or some other nonsense.
Bingo. I started reading the article and found it to be packed with jargon, buzz words, and the language of breathless prophecy and faddish hype. I couldn't hear the point of the article over the blasting sirens of my BS detectors, so I just stopped reading. Seriously, is there anything in the article worth the effort of slogging through all of that business-school/project-management shorthand/nonsense?
> Optimization strategies have shifted from simple power, performance, and area (PPA) metrics to system-level metrics, such as performance per watt. “If you go back into the 1990s, 2000s, the road map was very clear,”
Tell me you work for Intel without telling me you work for Intel.
> says Chris Auth, director of advanced technology programs at Intel Foundry.
Yeah that’s what I thought. The breathlessness of Intel figuring out things that everyone else figured out twenty years ago doesn’t bode well for their future recovery. They will continue to be the laughing stock of the industry if they can’t find more self reflection than this.
Whether this is their public facing or internal philosophy hardly matters. Over this sort of time frame most companies come to believe their own PR.
Intel has had a few bad years, but frankly I feel like they could fall a lot lower. They aren't as down bad as AMD was during the Bulldozer years, or Apple during the PowerPC years, or even Samsung's early Exynos chipsets. The absolute worst thing they've done in the past 5 years was fab on TSMC silicon, which half the industry is guilty of at this point.
You can absolutely shoot your feet off trying to modernize too quickly. Intel will be the laughingstock if 18A never makes it to market and their CPU designs start losing in earnest to their competitors. But right now, in a relative sense, Intel isn't even down for the count.
Intel has failed pretty badly IMO. Fabbing at TSMC might actually have been a good idea, except that every other component of Arrow Lake is problematic. Huge tile-to-tile latencies, special chiplets that are not reusable in any other design, removal of hyperthreading, etc etc. Intel’s last gen CPU is in general faster than the new gen due to all the various issues.
And that’s just the current product! The last two gens are unreliable, quickly killing themselves with too high voltage and causing endless BSODs.
The culture and methods of ex-Intel people at the management level are telling as well, from my experiences at my last job at least.
(My opinions are my own, not my current employer's & a lot of ex-Intel people are awesome!)
We'll see, I mostly object to the "vultures circling" narrative that HN seems to be attached to. Intel's current position is not unprecedented, and people have been speculating Intel would have a rough catchup since the "14nm+++" memes were in vogue. But they still have their fabs (and wisely spun them out into their own business) and their chip designs, while pretty faulty, successfully brought x86 to the big.LITTLE core arrangement. They've beaten AMD to the punch on a number of key technologies, and while I still think AMD has the better mobile hardware it still feels like the desktop stuff is a toss-up. Server stuff... glances at Gaudi and Xeon, then at Nvidia ...let's not talk about server stuff.
A lot of hopes and promises are riding on 18A being the savior for both Intel Foundry Services and the Intel chips wholesale. If we get official confirmation that it's been cancelled so Intel can focus on something else then it will signal the end of Intel as we know it.
>> successfully brought x86 to the big.LITTLE core arrangement.
Really? I thought they said using e-cores would be better than hyper threading. AMD has doubled down on hyper threading - putting a second decoder in each core that doesn't directly benefit single thread perf. So Intel's 24 cores are now competitive with (actually losing to) 16 Zen 5 cores. And that's without using AVX512 which Arrow Lake doesn't even support.
I was never a fan of big.little for desktop or even laptops.
In nearly every generation of Intel chip where I needed to care about whether hyperthreading was a net positive, it was either proven to be a net reduction in throughput, or a single digit improvement but greatly increased jitter. Even if you manage to get more instructions per cycle with it on, the variability causes grief for systems you have or want telemetry on. I kind of wonder why they keep trying.
I don’t know AMD well enough to say whether it works better for them.
Whether SMT a.k.a. hyperthreading is useful or not depends greatly on the application.
There is one important application for which SMT is almost always beneficial: the compilation of a big software project, where hundreds or thousands of files are compiled concurrently. Depending on the CPU, the project building time without SMT is usually at least 20% greater than with SMT, for some CPUs even up to 30% greater.
For applications that spend much time executing carefully optimized loops, SMT is usually detrimental. For instance, on a Zen 3 CPU, running multithreaded Geekbench 6 with SMT disabled improves the benchmark results by a few percent.
It's working well for Mac laptops, although I'd rather people call it "asymmetric multiprocessing" than "big.LITTLE". Why is it written like that anyway?
(Wikipedia seems to want me to call it "heterogeneous computing", but that doesn't make sense - surely that term should mean running on CPU+GPU at the same time, or multiple different ISAs.)
Of course, it might've worked fine if they used symmetric CPU cores as well. Hard to tell.
> (Wikipedia seems to want me to call it "heterogeneous computing", but that doesn't make sense - surely that term should mean running on CPU+GPU at the same time, or multiple different ISAs.)
According to Wikipedia, it means running with different architectures, which doesn't necessarily mean instruction set architectures.
They do actually have different ISAs though. On both Apple Silicon and x86, some vector instructions are only available on the performance cores, so some tasks can only run on the performance cores. The issue is alluded to on Wikipedia:
> In practice, a big.LITTLE system can be surprisingly inflexible. [...] Another is that the CPUs no longer have equivalent abilities, and matching the right software task to the right CPU becomes more difficult. Most of these problems are being solved by making the electronics and software more flexible.
> They do actually have different ISAs though. On both Apple Silicon and x86, some vector instructions are only available on the performance cores, so some tasks can only run on the performance cores.
No, there's no difference on Apple Silicon. You'll never need to know which kind of core you're running on. (Except of course that some of them are slower.)
Their last 14 nm launch was Rocket Lake, the desktop CPU of the year 2020/2021.
The next 3 years, 2021/2022, 2022/2023 and 2023/2024, were dominated by "10 nm" rebranded as "Intel 7" products, which used the Golden Cove/Raptor Cove and Gracemont CPU core micro-architectures.
Now their recent products are split between those made internally with the Intel 4/Intel 3 CMOS processes (which use rather obsolete CPU cores, which are very similar to the cores made with Intel 7, except that the new manufacturing process provides more cores per package and a lower power consumption per core) and those made at TSMC with up-to-date CPU cores, where the latter include all the higher end models for laptop and desktop CPUs (the cheapest models for the year 2024/2025 remain some rebranded older CPU models based on refreshes of Meteor Lake, Raptor Lake and Alder Lake N).
Fabbing at TSMC is an embarrassing but great decision. The design side of Intel is the revenue/profit generating side. They can't let the failures of their foundries hold back their design side and leave them hopelessly behind AMD and ARM. Once they've regained their shares of the server market or at least stabilized it, they can start shifting some of their fabbing to Intel Foundry Services, who are going to really suck at the beginning. But no one else is going to take that chance on those foundries if not Intel's design side. The foundry side will need that stream of business while they work out their processes.
The reason why Cadence and Synopsys do not have competition is that there are no open PDKs for modern technologies.
The existing open PDKs are for technologies from 20 years ago.
If TSMC, Intel, Samsung, or even Global Foundries or another second-tier foundry would publish the design rules for their manufacturing processes and specifications of the formats for the product documentation that they must receive from a customer, it would be easy to make something better than the overpriced commercial tools.
I have used for many years the tools provided by Cadence, Mentor and Synopsys, so I know with certainty that it would be easy to improve on them if only one would have access to the documents kept secret by TSMC and the like.
This collusion of the foundries with the EDA vendors that eliminates the competition from the EDA market does not make much sense. Normally better and cheaper design tools should only bring more customers to the foundries, so I do not understand what they gain from the status quo. (Unless they fear that public documentation could make them vulnerable to litigation from patent trolls.)
Fear of competition in the foundry market cannot be the reason for secret design rules. There is negligible competition in the foundry market and the most dangerous competitor of TSMC is Intel. Yet TSMC now makes chips for Intel, so the Intel designers have seen much of the TSMC design rules and they could chat about them with their buddies from the Intel foundry division. So TSMC must not attach great importance to keeping the design rules secret from direct competitors. Therefore they must keep the rules secret mainly from other parties, but I cannot guess who those are.
Having talked to folks at two foundries, they were terrified of pissing off Cadnopsys. They wanted help with software w/o getting screwed over.
I am not sure that they have more business than they know what to do with, esp at larger node sizes. Most foundries are forever stuck at what their current node is. 40nm is still plenty good for the majority of long tail ICs.
The problem is that we don't really need more digital. Digital is covered--there is a 99% probability that you can abuse somebody's microcontroller to do what you need if what you need is digital.
That isn't true if your need is analog or RF. If the thing you need is even slightly off the beaten path, you're simply out of luck.
The big problem with those right now is DRC (design rule checking), LVS (layout versus schematic), and parasitic extraction. Magic wasn't even good back in the 90s, and technology moving forwards has simply made it even less good.
The issue is that funding for the open source EDA stuff mostly comes from the US DoD, and the US DoD is only funding all this because they want to somehow magically make digital hardware design a commodity so that their projects somehow magically get cheaper.
The problem, of course, is that the primary reason US DoD chip runs are so damn expensive is that they are stupidly low volume so all the NRE costs dominate. Even worse, everybody in government contracting knows that your funding is whimsical, so you frontload the hell out of everything to minimize your funding risk. So, you have big upfront costs multiplied by a big derisking factor.
The real solution would be for the DoD to pull this stuff in house somewhere--software, design and manufacturing. But that goes against the whole "Outsource All The Things(tm)!" mantra of modern US government.
Can we please stop naming things after directions? I don't want to shift left/right/up/down and my data center has no north/south/east/west. Just say what you actually want to say without obfuscating.
I guarantee your data center has a north, south, east and west. I'm willing to bet that the north, south, east and west of your data center are the same as the north, south, east and west of the office I'm sitting in!
[1] https://en.wikipedia.org/wiki/Shift-left_testing
[2] https://www.dynatrace.com/news/blog/what-is-shift-left-and-w...