You don’t need to be “enterprise-ready” or “scalable” (gorelay.co)
168 points by gmays on May 31, 2022 | 183 comments


I was interviewing for a platform engineer role at a company whose application, from the outside, doesn't seem too complex. Through the interview process, I found out that you cannot run much of their app locally to develop against, and their staging environment does not reflect their production environment well. I asked them how they develop new features if they cannot run the whole app locally, and whether they introduce many bugs without testing locally. Their response was along the lines that they hired good engineers who could understand all the complexity before deploying new features.

I believe having a solid development environment where you can develop and test the whole stack is key to making reliable software. Is this common at other companies that you cannot run the whole app locally to develop and test against?


> Is this common at other companies that you cannot run the whole app locally to develop and test against?

Yes. I work in B2B banking software, so having an "authentic" environment to test against is extraordinarily complex. Even the big vendors in the space seem unable to maintain accurate facsimiles of production with their own decades-old core software, so smaller vendors stand no chance unless they adjust their entire strategy.

What we ultimately had to do was negotiate a way to test in the production environment such that all the activity could be tracked and backed out afterwards. A small group of important customer stakeholders were given access to this pre-production environment for purposes of validating things.

This is something that has never sat well with me. But, the way the business works, the alternatives are usually worse (if you want to survive long-term, that is). We wasted 18 months waiting for a customer to have a test environment installed before we learned our lesson about how bad those environments are in reality.


My gf works in UX for a major bank and they simply issued her a bank account in her name (with her SSN etc. -- it's a real account, with ATM and debit cards and the like). If she and some co-workers want to test something out relating to accounts, they actually use these accounts to minimize this kind of problem.

They must have some special code that says that these accounts can have their fees zeroed and probably are subject to some special out of band audits. I don't really know the details of course but I was actually kind of impressed.

(no she is not actually a customer of her employer)


Previously led data teams in banks and you’re right. There are dozens of flags for test accounts. When auditing, we screen out these accounts. Of course, with a proper data cleaning process, analytics gets a lot easier


That reminds me of "Mostly Harmless" from Douglas Adams where Ford Prefect breaks into The Guide's offices and gets himself an infinite expense account from the computer system. That's a dangerous thing to have.


Problem is, anyone with a good story to tell should really keep their mouth shut.


I worked at banks too but mostly segregated from the core business. Today I see observability tools as the solution for problems like that. Specifically tools like Lightrun where you can observe production, mitigate and fix directly from the IDE.

I wonder if other developers are familiar with this approach?


We are working on Videobug - it records code execution in production and plays it back in the developer's IDE. Our Show HN is here: https://hackernews.hn/item?id=31286126


Same here in e-commerce. We end up testing in production environments with an element somewhere that's in a development environment that can be tracked and actively watched. Amazon notoriously does not have a production environment, and eBay's is just unusable.


I'm pretty sure Amazon has a production environment... it's just their staging that's problematic. It's incredibly expensive to run a copy of prod and keep it up. You essentially have to get paged for it, doubling your burden.


Yep, I mistyped. Amazon MWS doesn't have a staging/sandbox environment that's available for third party developers to test on.


Worked with banking software before; it can be incredibly complicated. I worked on proof-of-concept infrastructure for a WebSphere monolith that ran on AIX servers. It had at least 10-15 services it connected to, and each one of those was a huge Java monolith, too. Some of those apps had 8+ copies of /each/ environment (8x dev, 8x test, 4-8x stage/UAT, 1-2x prod) so you could do integrated tests without stepping on other people or other software (software B might have a dedicated test environment of software A).


I haven't found a holistic integration-testing framework where you can effortlessly mock out upstreams as running services, so that you don't have to ham-fist integration-testing code into your app or rely on actual, live upstreams, which are probably prone to breakage anyway.


One way to balance this is contract testing, where you run a test suite against your mock/stub and the same test suite against the real thing. Your mock/stub would usually just have hardcoded data that aligns with the test suite, so it passes.
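
A minimal sketch of what that can look like (all names are invented, not from this thread; pytest's fixture params run the same tests against both implementations, so the fake can't silently drift from the contract):

    # Contract-testing sketch: one suite, two implementations.
    import os
    import pytest
    import requests

    class FakeUserService:
        # Hardcoded data that aligns with the shared test suite.
        def get_user(self, user_id):
            return {"id": 1, "name": "Alice"} if user_id == 1 else None

    class RealUserService:
        def __init__(self, base_url):
            self.base_url = base_url
        def get_user(self, user_id):
            resp = requests.get(f"{self.base_url}/users/{user_id}")
            return resp.json() if resp.ok else None

    @pytest.fixture(params=["fake", "real"])
    def users(request):
        if request.param == "fake":
            return FakeUserService()
        if not os.environ.get("UPSTREAM_URL"):
            pytest.skip("real upstream not configured")
        return RealUserService(os.environ["UPSTREAM_URL"])

    def test_known_user_has_a_name(users):
        assert users.get_user(1)["name"]

    def test_unknown_user_is_none(users):
        assert users.get_user(999_999) is None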

Another way is creating separate code libraries to abstract away external services. For instance, you might write an ObjectStorage library that has real integration tests using S3, but in your application you just stub out the ObjectStorage library. A lot of frameworks have verifying doubles or "magic mocks" or similar that will stub the method signatures, which can help make sure you're calling methods correctly (but won't help catch data/logic issues).

Live upstreams are also prone to inconsistent state and race conditions--especially when you try to run more than 1 instance of the test suite at once.

For web services, I also like using "fixtures". You just manually run against a real service and copy the requests to a file (removing any sensitive data), then have your stub load the data from the file and return it. If you hit any bugs/regressions, you can just copy the results that caused them into a second file and add another test case.
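
A minimal sketch of that record-then-replay flow (paths and shapes are hypothetical):

    # Record a real response once by hand, scrub secrets, commit the
    # file, and let the stub serve it from disk from then on.
    import json
    import pathlib

    FIXTURES = pathlib.Path("tests/fixtures")

    def record(name, response_json):
        # Run manually against the real service; remove sensitive data first.
        (FIXTURES / f"{name}.json").write_text(json.dumps(response_json, indent=2))

    class StubClient:
        # Drop-in replacement for the HTTP client: serves canned responses.
        def __init__(self, name):
            self._data = json.loads((FIXTURES / f"{name}.json").read_text())
        def get(self, path):
            return self._data

    # Hit a regression? Copy the offending response into a second fixture
    # file and add one more test case that loads it.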


Shameless plug: this is exactly the spirit behind what we're building at Kurtosis! (https://www.kurtosistech.com/) You get an isolated environment running real services connected to each other, so if you have a container with your mock then from your app's perspective it will look just like the actual upstream.


I have looked at your docs and don't quite get it. Is the idea that I have, say, 10 microservices that the app-in-test will/may use, and then I set up the 10 instances (somehow all in one container?) and can then test against that?

I think that seems sensible, but how is that different from 10 containers? Is the issue something about simpler network management inside a container?


I've done things like this in ad-hoc setups with Docker Compose and environment variables, managed with shell scripts. I would be very interested in a more thoughtful alternative that was integrated with testing.
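
One ad-hoc shape for folding such a setup into a test run, as a pytest session fixture driving the Docker CLI (the compose file name is made up; --wait assumes Docker Compose v2):

    import subprocess
    import pytest

    COMPOSE = ["docker", "compose", "-f", "docker-compose.test.yml"]

    @pytest.fixture(scope="session")
    def stack():
        # --wait blocks until healthchecks pass, so tests start against a live stack
        subprocess.run(COMPOSE + ["up", "-d", "--wait"], check=True)
        try:
            yield  # tests talk to the ports the compose file publishes
        finally:
            subprocess.run(COMPOSE + ["down", "-v"], check=True)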


I've done a thing automating whole-environment setups with VMware cloning (an undocumented feature, because there is its own product on top of it) and Docker via CI, in a bank. Basically we made golden images of VMs with databases holding all the accounts for the test cases, and prepared images for the services to be deployed, while getting all the necessary variables from CI and vSphere to write into configs and Consul/Vault via scripts. Initial version control for the env could be done via the same prefixes in the Git branches of services; after deployment, each env had a way to deploy whatever via the CI's web UI or API.

Running automated integration tests on the whole env is just an added step in the pipeline.


It's funny, I worked in banking about 15 years ago and the technology stack was the same: Java monoliths, WebSphere on AIX (plus mainframes, naturally).


This works well. I've seen deployments where you have test1, 2, 3 etc, and you can "lock" the environment to yourself for a few hours.


Yep, work in B2B banking as well, and there is a huge issue with banking "test" environments. Our partners (100s of companies, mostly banks/brokers/insurers/etc.) mostly don't even have test envs, and if they do, they don't represent production at all (sometimes the API is a completely different version, etc.). Our software allows you to spin up a test/dev env locally or wherever, but not much works without the partners, and to dev/test those, you have to dev/test against their production systems with real money.

In the beginning (more than 20 years ago), I insisted that we implement test systems ourselves so we didn't have to test in production. That was naive; it was expensive, and there is so much complexity that it does not work well.


I'm in that boat. We have a "microservice" code base, and spinning up the whole "app" is a Herculean task.

Even if you do manage to get it all running, getting the multitude of databases filled with usable data adds an odyssey to your day.

We have some parts automated, but ops cannot keep up with all the changes in the micro apps.

Back when I worked on a mono-Rails app with a React front end, getting things up and running was as simple as running the JS server, rake db:seed and then rails s. We used Puppeteer to test the FE, and RSpec for the BE. Two test suites, but we knew if something was wrong.

That same place sometimes had a network outage, and I wouldn't even notice because I was running everything locally.

Now I can only get everything running locally with confidence if I block out an hour or so.

Honestly... microservices are great in theory, but make dev a huge pain. Docker does not mitigate the pain.


That's the reason why I walk away as soon as I hear "microservices". It's a cargo-cult mentality where complexity and the issues you describe above are considered a feature.


Yep, I've had the same thing happen - a company I worked for went from having a monolith that could run locally, then everyone drank the microservice Kool-Aid and we ended up with no way to run the microservices locally in a way that integrated with the monolith. There were staging sites with everything, but only one staging site per team.


The worst cases are startups adopting microservices. I get that they are a solution for a large company with multiple teams hitting the pain of monolithic development, but a team of half a dozen adopting microservices is a big red flag.


Microservices are not even great in theory if you spend just a bit of time deeply thinking about the complexities that microservices introduce compared with simply using libraries/modules to organise your system.


Quite common.

Usually someone develops a remote or mixed local/remote development environment which is easier to use than configuring the full stack to run locally, and then the ability to ever do so again atrophies until nobody can figure out how anymore.

In some companies I've worked at this is fine and not a major setback. In others, it's been a mess. I don't think that "can run the stack locally" is a good litmus test for developer experience - the overall development experience is a lot more nuanced than that. "How long does it take to get going again if you replace your computer" and "what's the turnaround time to alter a single line of code" are usually better questions.


I work at a VOIP services provider, and we operate multiple brands, which are companies we've acquired over several years. One thing that has rung true of every company we've acquired is that there is typically barely even a way to run one piece locally, let alone the whole thing.

Some of these, when I took over, didn't even have version control, and they were being developed by FTPing files to the production server or even editing them live with vim or something. Certainly no reproducible containerization, or even VMs, to speak of.

I agree completely with you about having a solid development environment. I've spent the better part of a year creating such a thing for some of the brands, and it has probably cut our time to ship features and fixes ten-fold.


> they were being developed by FTPing files to the production server or even editing them live with vim or something.

Many editors / IDEs (e.g. Notepad++) can connect directly to the FTP server, so working on an FTP server looks almost the same as working on a local folder. You just make changes to the files and then refresh the web browser to see the changes.


lol, yes, sure. But why don't you sound horrified?

Version control at least is essential for something of any size with any number of people working on it. You must be able to "revert" a set of changes quickly and reliably if the application has any importance at all.


> You must

Don't want to be rude here, but version control is clearly not "required" in the strict sense. Plenty of software has been developed without it. Now I wouldn't want to work at a place without version control either, but it is as required for developing software as seat belts are required for driving a car. Less even, since at least for seat belts they are required by law.


People manually control versions all the time: Ctrl+C, Ctrl+V on a project folder to make a version. I doubt anyone has developed any nontrivial software without at least that approach.


I work at a company that has similar practices to what you describe (we don’t run our services locally). While it’s hard to get used to at first, I think the silver lining is that it forces everyone to write good unit tests, since it’s the only efficient way to run their code before publishing a PR. Having worked on the opposite end (at a team where we mostly tested locally and had almost no automated tests), I much prefer this environment.


My last job was at a company where you could not bring up the app locally. It didn't start that way. When the company was smaller, I could develop a whole feature from the backend to the frontend and see the whole flow. As we grew, it became increasingly difficult to run the app locally. Then it became impossible. Bugs grew, and development time honestly increased 4x for the same feature I could have done 3 years ago. I now think a good, clean developer experience is key to a stable app.


I'm at a company that works like this, and it's awful. There are also anti-patterns around git branching and container deployment. The staging environment is broken every other day and production deployment requires the full time effort of a very senior engineer who cherry-picks through git commits from one branch to the production branch (there are thousands of differences between these branches). I'm honestly done with them. They don't even know what sensible dev/test/deploy looks like.


I'm sure that paradoxically your company must have sky high hiring standards, with interviewers particularly grilling candidates about best practices?

This is not even sarcasm.


I saw a project get themselves into that position by using a lot of AWS services. At some point it seems they had a dev environment, but it quickly faded.

Their solution was to have an anonymized copy of prod in a few environments (like 3 for 20 engineers) where features could be tested. Engineers usually paired in order to use them concurrently for unrelated features, and they were very expensive.

I didn't stay long enough to see the on-demand environments arrive - devops was always too busy with operations tickets to continue building them.


A lot of it depends on the complexity and the technology stack.

With a microservices-based architecture with a massive number of microservices and other service dependencies, it's a pain to do more than local unit testing and mocks.

Are you using FaaS or a lot of other cloud features? Also difficult to fully test locally. Extract as much logic as you can out of the cloud-dependent code.

Often there will be sub-sections of large apps that can be run locally.

And if the only way to test is in production, hopefully they practice some form of limited rollout so you can control who adopts the changes, or similarly some form of optional feature flags that are used to enable the new behavior until it can become the new default. If their release model is "We are so good it goes live for everyone all at once", that tends to make for stressful release days.


> With a microservices-based architecture with a massive number of microservices and other service dependencies, it's a pain to do more than local unit testing and mocks.

For microservices, if you use something like Consul for service discovery, you can hit them with, e.g., Postman and change your local Consul registrations so everything goes to a main environment except the handful of services you're working on - which you redirect to your local machine for debugging.
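
Roughly, that override looks like this (service name and port are invented; the endpoint is Consul's standard local-agent registration API):

    import requests

    AGENT = "http://127.0.0.1:8500"

    def route_to_localhost(service, port):
        # Re-register the service in the local agent so discovery resolves
        # to your machine; everything else keeps using the shared env.
        resp = requests.put(f"{AGENT}/v1/agent/service/register", json={
            "Name": service,
            "ID": f"{service}-local-override",
            "Address": "127.0.0.1",
            "Port": port,
        })
        resp.raise_for_status()

    route_to_localhost("billing-api", 8081)  # debug just this one service locally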


Extremely common. For some reason it is cool to put stuff in lambda or azure functions or whatever. Or to use 11 different stacks.


It is the case in my company. Production and staging are a bunch of docker-compose files scattered around several servers. We had to create another staging environment; I think it took several days to configure. People run services locally, but not the entire stack; that takes too much effort.

Right now I'm trying to understand and deploy Kubernetes. Scalability and availability are good, but one of the reasons is that I want to make it possible to roll out the entire stack locally.

Basically it's a lack of devops. We don't have a separate role, and developers do the minimal job of keeping production running well enough; they have other tasks to do. There should be someone working on improving those things full time.


This: "Basically it's lack of devops". Companies think they can avoid hirig any infra/devops people and make all of that stuff the job of junior developers. They should contract someone it to set things up, at the very least.


You can find developer-first roles where the company puts a lot of time and energy into making developers the most efficient workers, but most companies are a cluster-fuck of resource management, and you'll be lucky to find a healthy company that also puts developers first in this way.

In my experience, recreating staging as "production junior with caveats" comes down to cost and living with bad architecture/design. I'm not defending the practice, but I do think it's incredibly common.


At Google, it is very team-specific, but it is generally not possible to run servers locally for testing, and there generally is no staging environment, though it might be possible to bring up some part of the stack on Google's clusters. Those servers may read production data or some test data. Other than that, there are static checks, unit tests, code reviews and the release process to ensure reliability.


How do you develop features if you can't run the server locally? Do you just write code, get it approved in a code review and push it to production?


Chipping in as another Googler - agree that it is team-specific, but you generally can bring up development environments (in terms of what traffic is served, closed to the public, etc.), but these aren't actually run locally.


Production is not the only alternative to local. Companies can set up servers, either in the cloud or on premise, specifically for development and/or staging purposes.

This is very useful if the production environment is hard to replicate on each developer's local machine. For example, a team where some people prefer the MacBook Air and others prefer beefy Windows desktops might set up a cluster of Linux servers for shared development and testing. A common alternative is to tell everyone to run a bunch of VMs/containers locally, but this isn't always feasible depending on the size and complexity of the production cluster.


Automated tests where you're just running a tiny subset of the code you're working on


I've been at a place where the whole stack couldn't be run locally at all but everything is separated by well-defined API boundaries and there are mature mock servers where required.

The architecture was probably more complicated than it needed to be but it was also decades old.


Yeah, once you have more than a small handful of dependencies (SaaS, databases, microservices). Some of the stuff I worked on has built-in stubbing/mocking so you can run parts of the stack locally, and the test suite also uses the mocks/stubs.


In my experience, good local development environments are relatively rare. We had them at a couple of small (<10 person) B2B SaaS startups I was involved with. In larger companies, there were just too many pieces to run everything locally. And in companies that develop software but aren't focused on software, local environments seemed more of an afterthought. With increased dependencies on cloud services, "serverless" environments, etc., it can also be painful to run stuff locally. Obviously you can build around this (and should...), but it requires thinking ahead a bit, not just uploading your lambdas to AWS and editing them there...


Yaknow I actually transitioned from doing dev stuff to ops at a smallish startup. I was brought in to help with a green field rewrite. Eventually it became clear just how much of a mess the deployment environments were so I cleaned everything up and parameterized things so that when we went to stand up the new production environment it was quick and painless.

I got hired on at megacorp to do pretty much that with a suite of green field apps. And so I did, across a few apps and a zillion different environments because all of a sudden spinning up a new integrated environment was easy. The devs ran a lot of stuff locally but still had cloud based playgrounds to use before they hit the staging environments. Perhaps the best part though was having the CI stuff hooked into appropriate integrations, so even if you couldn't run it locally you sure as shit tested it before it hit staging. SOC2 compliance was painful for the company but not so much for us.

If you're sensing a theme with "green field" you'd be on to something. Even in a largeish company it's not so much the complexity as it is the legacy stuff that'll get you. Some legacy stuff will just never be flexible (e.g. the 32-bit Windows crap we begged Amazon to let us keep running) and even the most flexible software will have enough inertia to make change painful.

Tangentially this is also why I get apprehensive when I see stuff like that linux-only redis competitor.


Yea, it’s common that one doesn’t exist. I work as a test engineer and am usually the one to build and support those environments, I’ve yet to find a company that has it all working well before I join. I actually made a killing building integration testing environments of blockchains and sticking them in CI too.


Can you elaborate on the blockchain testing environments/CIs thing please?


Sure! So when testing you want a deterministic environment - starting from genesis, couple accounts, the ability to build or wipe state, etc… sometimes projects use multiple chains, so maybe Ethereum and Polkadot along with a bridge between them. Building this out in Docker, managing it in CI, and building a test framework to allow devs to easily complete actions like sending a tx or waiting for an event to occur in a block has been my bread and butter for three years (and taken me from a median salary to a top 1% rate). I’ve recently moved to another industry but it was good fun.


Wow, that's fantastic.

So basically: unit test fixture setup spanning multiple external services ("Chains?") in a scriptable build, with various common/comprehensive/programmable interaction flows, and then integrating with CI? I'm making some guesses here as I'm not very familiar with the space.

I think it's amazing that you have (had) made this into a career. I am a dev and spent a couple years as a QA developer. Do you think this is a space I could break into? (Decade plus of development experience plus the good fortune of growing up with a mother who was a software developer. I learned to code (badly) when I was 13. I'm more than double that age now :)

If so, I'm wondering if you'd be up for chatting some more like on IM (Google Chat or Whatsapp or whatever) to elaborate further if you think this is still a viable proposition for someone new. I'd be happy to pay you for your time if you think this could make me money. I know how to set up CI, I'm vaguely familiar with bitcoin (I used to own some), and I don't care for cryptocurrency but I'm happy to meet market needs with a smile to pay the bills and then some.


And here's another thought. I know very little of the space, again, but I've seen some headlines about smart contracts going wrong.

Do you think there's any space for low hanging fruit simple auditing of smart contracts? Maybe a linter?

EDIT: Here's what Ethereum suggests: https://consensys.github.io/smart-contract-best-practices/se...

In particular,

https://github.com/protofire/solhint/blob/master/docs/rules....

https://github.com/protofire/solhint/blob/master/docs/rules....

https://github.com/prettier-solidity/prettier-plugin-solidit...


Local dev is great until production gets complex enough that you lose parity. That'll happen even earlier now that we run x86 on the server but ARM locally.

At my company we don't support local dev nor have any sort of staging environment. We have development linux servers connected to prod services and dbs. There are no pragmas or build flags to perform things differently in "dev". Things are gated with feature flags.

It scared me at first but now I think it makes sense: staging is always a pain and doing it this way we avoid parity problems between staging and prod. Local development would be impossible for a system at our scale but I think even a staging setup would result in more defects—not fewer.


Local dev isn't the end, it is the beginning. It allows you to get further than you would otherwise - and most importantly, when (not if) you discover a problem in a lower environment (or prod), it gives you a place to try to replicate it. You run everything locally, so hopefully you can make a change to mirror the problem... Not a guarantee, but at least you have a good starting point.

Also your local dev should try and mirror prod as much as possible.


Yep. This is one reason I've always run Linux on my dev machine anywhere that would let me - sure, there shouldn't be any difference between OSes, but otherwise it's one more place for differences between dev and prod to creep in.


Heck, at my previous $WORK, I saw differences between local (running Debian) and production running CentOS. Luckily one cluster of Jenkins build nodes was also running CentOS, and those caught a subtle incompatibility between system libraries that would have blown up if we had merged and deployed to production. My local Debian testing env did not catch it.


As a Python developer, yes, there are differences - and those differences are sometimes a huge PITA.

The only reason I don't run Linux is due to other MS tech - Teams, Outlook, and primarily office products.


In my experience, yes.

Often it's due to complexity. Systems accrete stuff (features, subsystems, dependencies, whatever) over time due to changing business requirements. The more this happens, the harder it becomes to integrate things properly without breaking other parts of the system. And the system is running the business, often 24 hours a day, which makes migrations hard to do. So you end up with something that's basically too complex to run locally.

And particularly in regulated industries like banking you have the security teams locking everything down hard. Problems accessing systems and getting realistic test data, etc.


>Is this common at other companies that you cannot run the whole app locally to develop and test against?

It depends what you mean by the whole app. Reflecting 100% of the conditions of the big microservice app I am working on locally is very hard, as I would have to replicate multiple Kubernetes environments, Kibana, Datadog, Elasticsearch, Consul and several databases and data stores.

What I can do is run one or more microservices locally, connected to remote development databases and speaking to other microservices and apps in some development Kubernetes environments.


I'm sad to say this even happens at FAANG. So many engineers seem to either not want to or not know how to optimize their own workflow as they're building it. It's really frustrating when you want to contribute something to their project and the development workflow for verifying the application still starts after making any changes is "upload the 5GB deployment bundle to AWS". Like come on, you can't seriously expect people to be productive that way.


> Is this common at other companies that you cannot run the whole app locally to develop and test against?

Extremely. We can't replicate our entire infrastructure in development - it's simply too complex and expensive. We use a combination of dev, integration, staging, and canary environments plus pilot environments to gauge quality before a full worldwide rollout.


lol. It sounds like they hire smart people. But smart people =/= good engineers. Good engineers keep things as simple as they can.


Simple is easy on a small product. Your simple SQL setup doesn't work so well when you have terabytes of data and need to run tens of millions of jobs per day.


Why is it a big deal to run your app locally? You just give each developer access to a cloud account (either one account or one per developer) with wide enough guard rails.

Heck, when I am in a place with bad internet, I spin up a Cloud 9 (Linux) or Amazon Workspace account (Windows) and do everything remotely.

I’m not wasting my time trying to use LocalStack or SAM local.


It would be quicker for me to run things locally - in my team we all have access to an AWS account each, but to test my code remotely I need to compile the code, build my container, push it to ECR, then deploy it to ECS - that last step takes minutes.


The microservice I was testing locally connected to remote services either via an HTTPS endpoint or via service discovery. It didn't have to run in a container at all or be run remotely. If it communicated via messages, it simply read from or wrote to the queue using local IAM credentials.
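
For example, the locally running service just uses the normal SDK with your local credential chain (queue name invented):

    import boto3

    sqs = boto3.client("sqs")  # picks up env vars / ~/.aws just like in the cloud
    queue_url = sqs.get_queue_url(QueueName="orders-dev")["QueueUrl"]

    sqs.send_message(QueueUrl=queue_url, MessageBody='{"order_id": 42}')

    msgs = sqs.receive_message(QueueUrl=queue_url, WaitTimeSeconds=5)
    for m in msgs.get("Messages", []):
        print(m["Body"])
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=m["ReceiptHandle"])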

If you’re developing locally to none public endpoints, use a VPN or develop within an IDE hosted on AWS behind a VPC.

Why does the app you’re testing run differently in a container than outside a container?

Standard disclaimer to expose my biases, not to do the whole "appeal to authority" thing: I work in ProServe at AWS (cloud app development consultant). But I did the same thing when I was working in the real world. My specific job looks just like a senior enterprise dev + a shit ton of YAML, HCL, draw.io diagrams and PowerPoint slides.

The other bias I have is that I work for the only company where I never have to worry about an AWS bill and I can spin up ridiculous remote EC2 based development environments when needed.

But I also didn’t have many constraints when I was the de facto “cloud architect” at a 60 person startup.


Unsurprising reasons for everything - security department would have been reluctant to allow development from inside AWS, and running in Docker was to ensure everyone (i.e. all developers, testers and TeamCity agents) were running on the same versions of runtimes etc (Docker was chosen as the easiest solution for this problem at the time).


Damn, I first got into AWS not just to avoid administering systems. But to also avoid system administrators.

That being said, how do you end up on different versions if everyone uses the same package manager config file (requirements.txt / package.json / the wonderful system that Go uses / whatever NuGet/C# uses)?

I’m also spoiled because I’ve never been in a position where I wasn’t the developer and leading the “DevOps” initiatives since working with the cloud - not bragging, I only opened the AWS console four years ago for the first time.

HN/Reddit is a good place to find out about other people's experiences.


It's more to make sure everyone is using Java 11.x, Java 17.x, Python 3.x, Node x, etc. - we're on Macs but the TeamCity agents are Linux, and Docker is an easy way to pin the runtime on both operating systems.

Docker also fixes the "I forgot to brew install a specific version" in this case.


> deploy the container to ECS - that last step takes minutes

Either you're winding up an instance for each container, the app takes too long to start with fewer resources, or the image size is in the dozens of GB - all of which could be avoided.


> "So unless your product legitimately only sells to the likes of Google, Facebook, and Apple, you probably don’t need to be 'enterprise ready'"

What in the actual fsck? Check out this list of companies ranked by the number of employees they have - https://companiesmarketcap.com/largest-companies-by-number-o.... Okay, Amazon is #2 - but notice the top 100 doesn't include Apple. There's a lot more to business than FAANG.


If you're selling enterprise software then the important number is not "number of employees" but "number of employees who would use your software". In large retail, manufacturing, oil etc companies they have a lot of software of course, but each person is typically accessing 4 or 5 apps at most (email, office suite, HR, etc). In FAANG companies each employee involved in software development is often using closer to 30 apps (project management, comms, source control, bug tracking, telemetry, server management, i18n, etc). It makes sense to focus on FAANG companies if you're producing enterprise software simply because they buy more of it. That obviously doesn't mean there isn't a huge market outside of that as well.


You’re better off focusing on banks, oil and gas, etc. Anything that uses a lot of technology to get its job done and operates at a massive scale. FAANG are a very small part of the ecosystem and tend to build vs buy.

A major bank employs between 10,000 and 50,000 software engineers, within an overall range of 50,000-250,000 employees. Netflix has only 11,000 employees of all types.


Task workers use 5 apps, but if yours is one of them, that might be 10,000 people.

I knew a guy who sold a $2M solution to track MSDS sheets in a prison system. Stupid simple application.


>The buyer is typically a VP, a director, possibly C suite, depending on what you’re selling

If a company has VPs buying things, surely it has some sort of internal vendor management that usually involves accounting/legal/compliance reviews. Being enterprise-ready means you have the roles/documentation/processes/procedures to answer those questions.


Although each of those companies is made up of 5 to 50 mini-companies, each with more business units/products/services. You have to understand which of them you're selling to, what they need, how they will use your product. Are two completely different tiny departments gonna hem and haw over trying your product, or will the whole org use your software? You might end up supporting a dozen users or 500,000. Will you still have to support the same features, or can the tiny teams get away with less? Is it even worth supporting the enterprise features if your product fit is smaller teams and you wouldn't get a large-scale contract anyway?


I agree. The last three companies I worked for were large health care companies. They spend millions on SaaS products.


Right, like any software deployed by Zebra is probably going to get used by every retail company and warehouse in the USA.

Now I don't know how much of the software on my TC26 actually comes from Zebra but, pretty sure, it's not zero.

ASSA ABLOY software is probably everywhere there's a door these days.

I don't know who the players in HVAC are but considering that a lot of HVAC/lighting control for like Target/Walmart are remotely controlled there's definitely somebody with tens of thousands of deployments.

Not to mention EdTech shit like BlackBoard, that gets deployed to every god damned school in the country


And good luck selling enterprise software to Apple, Google and Facebook; they all already have custom internal software for everything.


No. They use ADP, Concur, WordPress (check out AWS official blogs), Salesforce and third party SaaS products just like everyone else.


Yeah, they know the "keep everything default as much as you can" rule better than anyone else with SaaS providers. What a load of gobshite.


Are you saying they should keep everything in house? I can tell you from personal experience that the external-facing services from one of the major cloud providers handle spikes in traffic and have much better uptime than many internal systems.


Nope. Microsoft (for example) use SAP. It would be crazy for them to spend millions building something similar to SAP.


I'm working on a project where the backend devs decided to use Cassandra to manage a few hundred entities. Due to different access modes they had to denormalize the data into multiple tables and we're constantly running into consistency issues. A fucking sqlite db with a few tables and indexes would do just fine. Also, the tool has maybe a dozen DAUs. But it's web scale, so, yeah…


That's not a scale problem. That's a problem of database choice. Cassandra is not good at that sort of query pattern at any scale.
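
By way of illustration, Cassandra's table-per-query modelling is what forces the denormalization the grandparent describes: every access pattern gets its own table, and every write must hit all of them. A minimal sketch (keyspace, tables and the DataStax Python driver usage are assumptions, not from the thread); a logged batch is the usual way to keep the copies from drifting:

    from cassandra.cluster import Cluster
    from cassandra.query import BatchStatement

    session = Cluster(["127.0.0.1"]).connect("app")

    insert_by_id = session.prepare(
        "INSERT INTO users_by_id (id, email, name) VALUES (?, ?, ?)")
    insert_by_email = session.prepare(
        "INSERT INTO users_by_email (email, id, name) VALUES (?, ?, ?)")

    def create_user(user_id, email, name):
        batch = BatchStatement()  # logged batch: all statements or none
        batch.add(insert_by_id, (user_id, email, name))
        batch.add(insert_by_email, (email, user_id, name))
        session.execute(batch)
        # Forget one table in one code path and the copies drift - the
        # consistency issue described above.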


Why would a team decide to use NoSQL to manage "a few hundred" entities?

NoSQL databases excel at simple lookups and gets by ID, rather than sorting and joining pretty much ad hoc.


Resume driven design


At a few hundred entities, it doesn't really matter what tech you choose, so you should probably choose whatever is closest to hand.


NoSQL might not be the right term here. Maybe you mean "columnar databases"?

E.g. Mongo sorts just fine, and can optimize sorting with indexes.


Why would you ever use SQL? The SQL transaction model doesn't make a lot of sense for most use cases (including essentially anything that uses the internet), and trivial usability things like being able to have collection columns without having to define a separate table all add up.


That's roughly the same argument as choosing a strongly typed vs a weakly typed language. I much prefer the strong typing personally. Yes, it's more work up front. But every other person that works with the data knows what they're getting.


> SQL transaction model doesn't make a lot of sense .... anything that uses the internet

So decades of PHP+MySQL and Rails+PostgreSQL were simply misguided?


> So decades of PHP+MySQL and Rails+PostgreSQL were simply misguided?

To the extent that transactions were the point, yes (though there may not have been many alternatives for a datastore when Rails was getting started).

In the case of PHP+MySQL, of course, MySQL transactions were disabled by default in that era (up until 2010, in fact).


I'd be interested to hear why you think this, or if you can link to something that explains this perspective.


There's a fundamental mismatch between the HTTP request-response model and the SQL "open transaction" model - you can't open a transaction, serve a form, and wait for a user to submit it and commit that in the same transaction, because what if the user walks away? But if a transaction doesn't extend to the actual user action then it's basically useless; you can't do transactional read-modify-write when the one doing the modifying is the user.

After all, what do you do when a transaction fails? Most webapps either retry or fail (if they even bother to handle that case at all), but you don't need transactions to do either of those. In an actual database application you can tell the user their transaction didn't commit and ask them to deal with it, but again you can't really do that over the HTTP request/response cycle (because what if the user submits the form and walks away? They expect their input to be saved, whatever happens on your backend).
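
For concreteness, the usual workaround in the request/response world is optimistic concurrency rather than an open transaction - a minimal sketch (schema, names, and the sqlite3 stand-in are invented, not from the comment). The form carries the row version it was rendered from, and the write succeeds only if nothing changed in between:

    import sqlite3

    class StaleEdit(Exception):
        pass

    def save_profile(conn, profile_id, new_name, version_seen):
        with conn:  # short transaction around the write only
            cur = conn.execute(
                "UPDATE profiles SET name = ?, version = version + 1 "
                "WHERE id = ? AND version = ?",
                (new_name, profile_id, version_seen))
        if cur.rowcount == 0:
            # Someone saved first; re-render the form and ask the user to retry.
            raise StaleEdit("profile changed since the form was loaded")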


What is the alternative? Use distributed databases that don't guarantee atomicity at all, and write up complicated solutions to work around it? Or are there some non-transactional, atomic databases that more nicely fit the web model that I don't know about?


You have to write the "complicated solutions" anyway. SQL just lets you paper over the problem for longer. 99% of webapps lose data and don't notice or care, ACID is just a stick to beat other datastores with. IME something like Cassandra offers far more actually useful advantages (collection columns, true master-master HA out of the box) than an ACID/SQL database ever can.


That must have been what they were thinking when they made the decision to go with Cassandra.


I'm working on a system that's using Cassandra on a small scale (like, probably less than a million entities) now, it's great. Haven't had any consistency issues because we've designed our data flow properly (SQL will help you paper over a bad data flow for longer but that's a double-edged sword), and getting true just-works HA (master-master where a failover doesn't interrupt your writes) out of the box is a huge advantage over something like Postgres.


Oh, I don't dispute that there are valid use cases.

But the above instance sounds like a variant of the "because, bwah, SQL is hard" line of thought that seems to have powered much of the (now historical) NoSQL boom.


This is completely tangential, but any time I see someone talking about normalizing a database, I briefly think they are talking about doing something like:

x/|x|

Ya know, like normalizing a vector. Of course this brief confusion doesn't matter; I sort it out quickly, and it is totally clear language to other database people -- it is just a funny quirk of overloaded lingo.

But I wonder if big data and machine learning folks (since there's some overlap with linear algebra there) ever get their lingo mixed up.

"To predict where the ridesharing customers will want to go, we will apply the taxicab norm to our database."


While the overloading is annoying, database normalization actually has a strict definition that is not unlike the standard mathematical use; that is, to imply consistency (to make normal): https://en.wikipedia.org/wiki/Database_normalization

I would also point out that normalization within mathematics does not have a single concrete definition either, and it depends on your domain and desired property to stabilize: https://en.wikipedia.org/wiki/Normalization_(statistics)


Just to be clear -- I'm not annoyed really, I actually find these little momentary confusions to be a charming little quirk of our language. And I think it is not a big deal anyway, 99% of the time we're talking inside our fields where it is obvious what we mean.


You are absolutely right. 95% of problems in software development are created by the software developers themselves. The fastest/smartest developers I have worked with simply make fewer dumb mistakes. They keep it simple. That's all they have to do to be 10x more productive.


You actually do need to have Enterprise security and authentication features if you intend to sell to Enterprise customers. (SOC2, OAuth, SAML etc)


I am on calls all week with our banking clients about getting SAML/SSO squared away once and for all.

We are having a lot more nervous takes around legacy LDAP/AD auth mechanisms these days.


I developed SAML/SSO integration for an application used by the Swiss UBS which initially relied on individual accounts, not even AD. Tricky stuff.


Small plug for http://WorkOS.com (where I work).

It’s an API for easily adding SAML, SCIM, and more. Currently being used in production by Vercel, Planetscale, Webflow, and +200 other apps.


I can vouch for this - I've just rolled out WorkOS with three enterprise customers in the last 6 months. Super simple, great docs. I had been putting off SSO for years hoping something would come along and solve it for me.

E.g. I know nothing about Azure AD, but I just emailed the WorkOS setup link to our customer's IT team, they followed the instructions, and it just worked. No back and forth.

Can’t recommend enough.


I did SAML with Barclays years ago. The hardest part was getting someone there to click enable for our app. They’d never reply to my emails so I called them until they picked up. Then it was done in 30 minutes.


It's true. These types of security certifications like SOC2 are quickly becoming a minimum requirement.


You actually need OAuth / SAML if you give 2 shits about your customers.


You need OAuth, and not to use SAML. SAML is insecure by design: https://joonas.fi/2021/08/saml-is-insecure-by-design/


I agree with the article, because you don't need to have it to "start talking" with Enterprise customers.

You probably have to prove that you have these on your roadmap, that you have an engineering team that can implement them in the not-so-distant future, and that you know these things exist and have the will to invest in them.


These capabilities are increasingly required just to get in the door, depending on your target customer.

The alternative is that your product becomes a security exception, which infosec teams are more and more unwilling to grant, for pretty good reasons.

The good news for product teams is that there are more and more off-the-shelf options that can be quickly integrated vs. having to start from scratch.

But the main point of this comment is: I think you’re right that “near term roadmap” was good enough a few years ago, but less so now, and continuing to trend towards “not acceptable” if selling to enterprise customers.

(Observations as a recent/former auth PM at a SaaS/PaaS that sold primarily to enterprise customers).


Yep, I'm working at a Series B startup and I can tell you, you absolutely need SOC 2 or ISO and/or GDPR compliance based on your market. It's not optional, you will lose business if you don't.


Hard agree.

History is littered with dead startups that designed for scale before they had enough usage to justify it. Within reason, having users knock your site over resulting in failures like the Twitter Fail Whale is a good problem to have.

With that said, you need to be prepared to scale up quickly once you have this problem. There's a reason Facebook counts its users in the billions, and Friendster is a footnote in internet history.


Yes, but.. I am a software architect. ~90% of the job offers I get are from CEOs/CTOs who want me to join their small company to help them refactor because they can’t grow anymore. Tech debt kills as well.


Is that a "but" or is that how things should work? You solve a problem when you have it or can reasonably predict it (in technical/engineering terms). I also think it's sensible to look for experts when you need them right?


I don't know, I've been places where the difficulty scaling and the failure to address it promptly posed an existential threat. As in "this system is poorly designed and needs to be rewritten, so we're going to pause new features for six months to do it". You can't always just magically find an expert to solve your problems in a timely manner, and feeding the technical-debt monster can easily kill startups.


For every company that has this problem, another several never made it far enough for it to matter.


If they have the money to hire you and they have pent-up growth that refactoring will unlock, that really sounds like close to an ideal path for a startup to take. They're literally paying down their tech debt by paying to hire you. Maybe ideal would be a little earlier so it doesn't block growth, but yeah. In my experience way more startups have died from premature optimization


If this pattern is successful, it's not killing them.


So how do you square the circle of "don't design to scale, but scale up quickly when you need to"? You can't rewrite your stack just because you took on a new customer that is suddenly killing you.

I don't think it's stupidly hard to get basic things set up for scaling at the start. These don't apply to everyone, but:

- use a cloud provider, they let you grow

- write infrastructure as code (terraform, pulumi, cdk), it's not that hard and it makes building your 2nd, 3rd data center easier

- for the love of God, think for 10 minutes about a scalable naming convention for data centers, infrastructure components, subdomains, IP ranges etc. This type of stuff causes amazing headaches if you get it wrong, and all you have to do is consider it at the start. Many AWS resources can't be renamed non-destructively, and are globally namespaced. Be aware of this. Don't paint yourself into a really stupid corner. (A tiny naming helper is sketched at the end of this comment.)

- even if you're not doing microservices, consider containers and an orchestrator that has scalability baked in (like Kubernetes). If you use a managed Kubernetes with managed node groups or fargate, you can basically forget about compute infrastructure from this point on

- deploy Prometheus, Grafana, Loki etc day one and build basic dashboards. With Kubernetes, you can get this installed quickly and within a day you'll at least be able to graph request/error count and parse logs.

- deploy an environment for developers to deploy to and use the product (even load test it)

The development cycle is now set up for building what you need to build. You can easily find developers at any level who have experience with containers, Kubernetes, Grafana so they can onboard quickly, be productive and troubleshoot their own stuff. They can refactor the monolith without having to invent service discovery, load balancing, ingress etc.

My personal experience with Kubernetes began at a start-up with a tiny team, none of whom had prior experience with it. We had started building a single monolithic app, but within a short time we figured out how to make a Helm chart, add ingress and boom, we were "online". Every time someone was about to spend time and effort inventing something, we would quickly scan through the Kubernetes docs and CKAD material to see if we could avoid having to build it at all. We leveraged cloud stuff too, where it was cheap and easy and scalable (like AWS SQS/SES/SNS/CodeArtifact/ECR/Parameter Store).
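
And as a tiny illustration of the naming-convention point above (the convention itself is invented, not a recommendation from the thread): funnel every resource name through one function, so names sort, filter, and grow predictably.

    def resource_name(org, env, region, component, index=None):
        # e.g. acme-prod-euw1-api-gw-003; many cloud names can't be changed later.
        parts = [org, env, region, component]
        if index is not None:
            parts.append(f"{index:03d}")
        name = "-".join(parts).lower()
        # Stay under common limits (S3 bucket names and DNS labels cap at 63 chars).
        assert len(name) <= 63, f"name too long: {name}"
        return name

    print(resource_name("acme", "prod", "euw1", "api-gw", 3))  # acme-prod-euw1-api-gw-003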


I didn't see the words "it depends" anywhere in this.

That's the whole point of engineering: picking/building the right solutions for the problems and context.

Some stuff just needs a VPS, a PHP file and an SQLite db, and it will scale.

Other stuff might need container orchestration, or anything in between (or beyond). It should depend on the context, though, and we should be prepared to change strategy as needed.


I did say "These don't apply to everyone but:" and then gave my generic list of stuff that works for a lot of people.

I get that some people can just run a single Go binary or whatever. The ones that don't need to set themselves up to grow easily when they need to. Having some sort of basic plan or framework that doesn't take more than a week to set up is due diligence, and hiring (when you need to) will be much easier.


>With that said, you need to be prepared to scale up quickly once you have this problem.

See I interpret that to mean that you should design for scale, but not implement or invest in scale until needed.


Is one of the problems, though, that the early audiences appreciate this kind of over-engineering? If your investors/buy-in folks are being shown early demos and you mention "and this thing is gonna scale seamlessly up to 1,000,000 users because of tech words A, B, and C" then they get excited. They don't understand A, B, or C. But what they just heard is that you're planning for Big. And that stokes their aspirations.


Facebook is a great example. They did tons of things not traditionally known as highly scalable, like one repo with the project coded in PHP, which was largely a monolith.


And an imperative codebase which looked like something out of Matt's Script Archive. I often wonder whether a startup today could pull off something similar.


Absolutely they could. And they should, because doing so will get you launched quickly.

Instead, today's shops all start out with JavaScript front ends and 40 layers of backend complexity to accomplish the same thing.


I'm not sure. The bar for UI has been raised significantly since Facebook launched and mobile has changed the game to such an extent that you need a presence on 3 platforms out of the door (iOS, Android & web) to even stand a chance.


Not entirely true for many apps. The app I work on does not need a working mobile site, because we don't think our consumers are going to use it on mobile.

However, we do very much care about our UI - and I think React at this point helps with that. But I don't think React complicates the stack too much. It may even make things easier, tbh - I don't have to test pages on backend tech, just endpoints.


React easier compared with what - imperative PHP4 at Facebook in 2004? I'd have to disagree. React state management is much heavier than returning HTML from an embedded PHP template.


I believe so. But, most startups like to use insane tech for fun, or something - though in my experience all it does is kill productivity.


Well it depends on what you're selling! Design for what you truly want to sell, and that will end up however it needs to end up, whether that's enterprise or not. This random blogger, and our random comments, can't tell you what you need. It has to emerge from the product development process like a sculpture from a block of marble.


I know, this is overly-general advice to the point of silliness.

It's a little like telling a construction worker "You don't need a super-powerful 'jackhammer' to do your work. A simple claw hammer is all you actually need."

Well... sure, that's definitely true if you're building a house.

But you might have a problem ripping up a sidewalk...

There are plenty of companies building houses, and plenty who are working with concrete! Neither one of them are "the big kids."

They're not even necessarily different-size markets, either -- you can make a lot of money building houses or making sidewalks -- it's just that different problems require different solutions, and different software companies are going to have different challenges to the problem of "get bigger."


Exactly this. It sounds easy, but the type of creator who can do this is very rare, and usually has a high success rate. This should be the definition of the "10x dev".


I've delivered custom backend business servers to various businesses of healthy size. Almost all were deployed on rented multicore servers with gobs of RAM from Hetzner / OVH / etc. Some have been running for many years already. Not a single one required being "enterprise scaled".

Granted, all the servers are C++ talking to a local DB. No problems processing thousands of requests per second without breaking much of a sweat.


Sounds interesting. What's the DB? What framework/type of API are you exposing on the C++ side?


PostgreSQL. No framework (well, there is a rather low-level one of my own, but it is fluid). I use 4 libs, for JSON, Postgres, HTTP and logging. I expose basically JSON-based RPC: peers/clients post JSON commands and receive JSON replies.

Upon startup, all data (except a couple of giant tables) is sucked from the DB into RAM into highly efficient data structures, hence most read requests are handled in microseconds. Those 2 giant tables are loaded into the same structs, but only partially. This is usually enough, as RAM keeps the last few years' worth of data, and requests that need something older come just a couple of times per month. Only writes touch the database, and they are immediately reflected in RAM as well.

This is just an example. I write game-like servers as well, and those use a different approach. Generally I treat each project / product individually.
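
The commenter's servers are C++, but the rough shape of the pattern is something like this (a Python stand-in with invented names, not the actual code): reads come from in-memory structures, and only writes touch the DB, updating RAM in the same step.

    class AccountRepo:
        def __init__(self, db):
            self.db = db
            self.by_id = {}  # filled once at startup

        def load_all(self):
            for acc_id, balance in self.db.execute("SELECT id, balance FROM accounts"):
                self.by_id[acc_id] = {"id": acc_id, "balance": balance}

        def get(self, account_id):
            # Microsecond read; never hits the database.
            return self.by_id.get(account_id)

        def set_balance(self, account_id, balance):
            with self.db:  # write-through: commit to the DB first...
                self.db.execute("UPDATE accounts SET balance = ? WHERE id = ?",
                                (balance, account_id))
            self.by_id[account_id]["balance"] = balance  # ...then mirror in RAM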


Yep, that's the way to do it. One improvement IMHO would be to have an event log as the source of truth and everything else (in-memory/databases/...) derived from that event log. It is an architecture that scales from tiny to Google scale. You don't even need something like Kafka. You can literally start with a single disk file as the event log and read it into memory at server startup. And then "scale up" by keeping the event log in a single table in a DB (SQLite/Postgres etc.).
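
A literal minimal sketch of the single-file event log idea (file name and event shape are made up):

    import json

    LOG = "events.jsonl"

    def append(event):
        # Append-only: the log is the only thing ever written.
        with open(LOG, "a") as f:
            f.write(json.dumps(event) + "\n")

    def replay():
        # Derived state, rebuilt from scratch at server startup.
        state = {}  # e.g. account id -> balance
        try:
            with open(LOG) as f:
                for line in f:
                    e = json.loads(line)
                    state[e["account"]] = state.get(e["account"], 0) + e["amount"]
        except FileNotFoundError:
            pass  # empty log, empty state
        return state

    append({"account": "a1", "amount": 100})
    print(replay())  # {'a1': 100}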


For business it is better to have "properly" designed database with the clean structure looking at which it is easy to understand how everything is organized and works. It can be used / interrogated by other software (writes are of course prohibited).

Sure, it will not do Google scale, and I do not lose sleep over it. For most "normal" businesses what I have is way more than enough.

Besides, as I've already said I treat each product individually so my architecture reflects the actual needs and changes accordingly.


Not my experience at all. The problem with a “clean” “properly” designed database schema is handling migration and schema changes. Also, in an Enterprise environment, you end up having external systems relying on your schema, making it nearly impossible to upgrade the schema without breaking systems you don’t even know about until somebody shouts at you.


Nice! A lot of game servers are similar to this too, which you of course know. In my case I usually just "snapshot" the in-memory structures to disk...
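For anyone wanting to try it, the snapshot part can be tiny - a hedged Python sketch (names mine), using a temp-file-plus-rename so a crash mid-write can't leave a corrupt snapshot:

    import os
    import pickle
    import tempfile

    def snapshot(state, path="state.snapshot"):
        # Write to a temp file, then atomically rename over the old
        # snapshot; readers only ever see a complete file.
        fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
        with os.fdopen(fd, "wb") as f:
            pickle.dump(state, f)
        os.replace(tmp, path)

    def restore(path="state.snapshot"):
        with open(path, "rb") as f:
            return pickle.load(f)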

BTW, for others considering this approach: the DB will do similar things to what FpUser is doing in the app layer - keeping hot/recent stuff in memory. But when tail latency and cost really matter, this is a good technique.


>"DB will do similar things"

Not really, as the storage data layout and the application data layout are different. If I were to use the database only, relying on its caching abilities, it would lead to complex queries that are definitely way less performant than getting data from in-memory structures optimized for this particular application / usage patterns.

On top of that some requests require some relatively complex calculations that again are executed against optimized in memory layout and would suffer greatly if done against the database.

I did some experiments and the difference was very significant.


Sure, I definitely don't know what you're doing. Usually I would reach for preaggregation here, computing the values on write, but I don't know anything about the use case so... :)
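For readers unfamiliar with the term: preaggregation just means maintaining the computed answer incrementally on each write, so reads are O(1) lookups instead of scans. A toy sketch (illustrative, not tied to the use case above):

    class RunningStats:
        # Maintain the aggregate on every write so reads never
        # have to scan the raw rows.
        def __init__(self):
            self.count = 0
            self.total = 0.0

        def record(self, value):      # called on each write
            self.count += 1
            self.total += value

        def mean(self):               # read: constant time
            return self.total / self.count if self.count else 0.0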


I like the article, but overall I think it's the same advice as "start small, limit scope, be ready to pivot" with a dash of "don't make enterprises an early customer". I obviously paraphrased quite a bit.

What is glaringly obvious from this article is that "enterprise" in this industry has lost much of its meaning. It's become almost synonymous with "medium+", which I think is a self-defeating definition that starts to make sense only when you look at what features are commonly gatekept behind enterprise distributions of software.


Though it helps if you design your software in a manner which allows for drop-in replacements to non-scalable parts of your design.
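One common way to do that is to hide the non-scalable part behind an interface, so a bigger implementation can drop in later. A minimal Python sketch (all names hypothetical):

    from abc import ABC, abstractmethod

    class KVStore(ABC):
        # Callers code against this interface; swap implementations
        # when (if) the simple one stops being enough.
        @abstractmethod
        def get(self, key): ...

        @abstractmethod
        def put(self, key, value): ...

    class DictStore(KVStore):
        # Day-one implementation: a plain dict. Not scalable; fine for v1.
        def __init__(self):
            self._data = {}

        def get(self, key):
            return self._data.get(key)

        def put(self, key, value):
            self._data[key] = value

    # Later, a RedisStore or DynamoStore implementing KVStore can be
    # dropped in without touching callers (hypothetical classes).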


Great use-case for GraphQL, unironically.


Or just a classic database, backend, frontend three tier application?


This is absolutely it.


> ‘Enterprise,’ technically, should mean extremely large companies such as Meta

Meta? This person wrote a whole article about it without knowing what enterprise software actually is.


Maybe not quite enterprise-ready, but if you have any tabular data anywhere, it's helpful to have an export-to-CSV option. Business folk do love their Excel :)
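The basic case is a few lines of stdlib code - a hypothetical Python sketch, assuming your rows come back as dicts:

    import csv

    def export_csv(rows, path):
        # rows: a list of dicts with identical keys, e.g. query results.
        with open(path, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)

    export_csv([{"name": "ACME", "mrr": 499}], "accounts.csv")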


Horizontally scalable data storage is generally available these days (CockroachDB, TiDB, Vitess, etc.)

Rearchitecting from unscalable to scalable data storage is notoriously difficult and expensive. Even the most famously competent companies and teams have struggled with that transition and invested millions and millions of dollars on it. Building on unscalable data storage when scalable data storage is readily available feels like planning for failure and making a huge bet that your product or system will never have widespread use.


I am surprised at how niche these NewSQL databases are. Is it because they are expensive? Why wouldn't everyone use them?


Scalable relational storage is relatively new to industry consciousness (outside of Google, where Spanner is used for nearly everything).

Spanner and Vitess existed at Google in the early 2010s. It was mid-to-late 2010s before CockroachDB, TiDB, and Vitess were available as open source.

Battle hardened database administrators who've been working with traditional unscalable databases for decades often aren't up-to-date with what's possible today unless they frequently talk shop with peers at other companies or go to conferences.

Personally I hope old databases like MySQL and PostgreSQL die out within the next decade. They served us well, but they're 25+ years old and the weight of legacy code plus the need for backwards compatibility is a huge drag on their ability to evolve.


I've experimented with migrating from PostgreSQL to CockroachDB and YugabyteDB. It makes sense that a NewSQL DB is slower, but compared to a local PostgreSQL cluster it was unacceptably slow. You also lose features and are now dependent on a heavily VC-funded startup that may not make it.


You don’t need 90% of what you think you need before launching a product.

We didn’t have a “change email address” flow for two years. Instead we focused on the product. That’s what buyers really care about.

You need this stuff eventually - SAML included - but not before you launch and refine your product.


I wonder if a recession or bear market environment will quickly make this type of advice – that worked well for the 2010s boom – obsolete in a harsher climate. Might be hard to scale your go-to-market by depending on startups who don't want 'enterprise-ready' features to spend money on your product.


I think enterprise ready here refers more to scale than to features. People try and go the microservices, distributed DBs, hyper scale route way too early rather than focussing on their product, possibly building it as a monolith with simpler SQL DBs before going all out.


I think the opposite. Prematurely building things you don't need costs $$$$, and in a tight funding environment you don't want to burn cash you don't need to.

Also compared to the 00s, the 2010s were a decade of gold plating and planning for scales you don't need.


That already happened. A large part of why Fast collapsed so fast was their overly-high infrastructure and engineering spend


The title made me expect some argument about endlessly spinning up cloud resources, which I was ready to pull apart.

Instead, I found an argument about not building v1 with unnecessary optimizations.

Which I agree with but I didn't find anything new being said here.

What I've found _is_ worth investing in early on is measurement and metrics. No more blind changes, go look at the data and draw testable theories out of it. Huge quality of life improvement.
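A concrete starting point, if it helps: a metrics endpoint can be a few lines with the prometheus_client Python package (the metric names here are made up):

    import time
    from prometheus_client import Counter, Histogram, start_http_server

    # Illustrative metric names; define whatever your product needs.
    REQUESTS = Counter("app_requests_total", "Requests handled")
    LATENCY = Histogram("app_request_seconds", "Request latency")

    def handle_request():
        REQUESTS.inc()
        with LATENCY.time():
            time.sleep(0.01)  # stand-in for real work

    if __name__ == "__main__":
        start_http_server(8000)  # serves /metrics for scraping
        while True:
            handle_request()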


I wrote a simple YAML-configured command-line utility that coordinates Vagrant LXC and Ansible so you can deploy any number of microservices locally.

I wrote a cut-down, open-sourced version, which is far less advanced than the one I introduced to the DevOps team I was on.

https://github.com/samsquire/platform-up

Paired with keepass2dir and secrets-login / secrets-logout, you can handle secrets too.

https://github.com/samsquire/keepasscsv2dir


I think the author is drawing parallels between enterprise and the data volumes that need to be managed. While that is one way to look at it, in my pre-sales experience it revolves around the complexity of the system functionality expected and the ecosystem of applications it needs to play along with. I would also add the importance of the different ratings and certifications a system needs to be enterprise-ready, along with its ability to meet various legal requirements in each country it sells in. Example: GDPR, data residency laws, and tokenisation for payments, to name a few that a lot of tools still struggle with.

The expectation behind the word scalability changes based on who you talk to: concurrent users, throughput of data managed, data stored, processing time, system maintenance, feature extension, you name it. No system checks all the boxes 100% while still meeting the budget defined by a company.


There's this great quote at the beginning of the chapter I'm currently reading in the book _Designing Data-Intensive Applications_:

> "A complex system that works is invariably found to have evolved from a simple system that works. The inverse proposition also appears to be true: A complex system designed from scratch never works and cannot be made to work."

--John Gall, Systemantics (1975)


As always, you cannot make blanket statements with software such as "you don't need to worry about scale".

Perhaps your application does need to worry about scale, and if you don't, you're a fool. Then you launch, succeed, and collapse (or get beaten by the competition) because the effort to scale was beyond you in the time available.

The blanket statement that is true about software is "be wary of blanket statements about software development".


The advice "you don't need scalable architecture" comes from the same place as "performance isn't important, code is all about readability and maintainability"

It's just insecurity and pure coping. They're not dealing with complex performance bottlenecks, so they say these efforts are wasted.


If you want to be web scale, the easiest thing is to use /dev/null as a data store.


Yes, to a point. But you can easily paint yourself into a corner where the refactoring necessary to scale is so expensive it might as well be impossible.

Living that nightmare right now.


Healthcare.gov also took this advice!


Indeed, but all those technologies look so nice on the CV.


I want a scalable platform that can scale to zero and scale upwards.

Like Kubernetes cluster autoscaler and scaling individual deployments.

I want to be capable of running the same code on a 100 KiB file as on a 10,000-petabyte datastore.

As it stands you need to write your application differently.


What an aesthetically pleasing website!


TL;DR: use a frequent iteration / feedback loop as the alternative to enterprise advantages for building trust. Ahh, but people really do like fashion - peer testimonials, network effects, etc. - involving kinda semi-real value constructs, just like the enterprise does (they just spend more money because they can afford to).


For a great UX, great performance is actually important so that your UI does not block too long on requests that users notice. To get great performance, you need to do things right in your backend. It's not optional. If you do things right, it's going to be easier to scale what you are doing when the time comes. Those two are connected.

Usually where SAAS startups get into trouble is actually the UX and product market fit. There are a lot of successful SAAS companies with absolutely terrible UX. The reason they are successful is that the UX does not matter if what the software does is so valuable that companies want to buy it anyway. So, obsessing about a UX for customers that don't actually care too much may be a mistake. E.g. investing in native apps and tripling the number of frontend engineers you need for that is usually a mistake. Get the software in the hands of your customers quickly and learn whether it lives up to your sales pitch fast. Simple web UI and a simple backend with some conservative tech choices goes a long way. And that stuff is easy to scale.

A lot of SAAS software is bought by people who don't actually use the software themselves. These people buy based on a story and a promise; not the actual UX. There's this curious dynamic where you e.g. sell to the VP of sales who then makes his sales team use the software. The higher the price, the less the UX matters. That's why a lot of SAAS software is absolutely miserable to use: it's designed to make the boss happy, not users. This is also what represents a big opportunity: a lot of that software is not very well defended and can be disrupted with better products. And this happens all the time. But then those companies also end up selling to the same C-level executives and that software becomes less usable over time. Salesforce is the classic example. I've met sales people that hate that stuff with a passion but have to use it anyway. But they started out by disrupting that space.

Ironically, UX matters more if you don't have a remarkable product, i.e. if what you have is easy to copy. These products depend on easy purchase decisions by people lower in the org chart. Slack is the classic example that showed up in many companies because they had a freemium tier. And then those companies decided to pay up because they had a growing number of people using it.

So, that kind of product actually needs to scale early because it depends on a lot of users not paying anything before enough accounts start converting to paid accounts. Slack only managed that because they had lots of funding. Many companies would have run out of steam long before they could have succeeded.

Those users are going to be a lot more critical of the UX and its responsiveness, and if the experience sucks, they'll be gone in no time and use something shinier. So, such startups will never break even unless they learn to scale before their runway runs out. For every cute little SAAS app that made it, there are hundreds of me-too apps that never came even close to breaking even. Fail fast causes lots of companies to fail. Fast. Unless they can scale and grow. This is why investors are wary of B2C as well. Same scaling issues, and it's a lot harder to get consumers to pay. And the UX is critical.

More traditional SAAS companies can afford to close a few big deals with a mediocre product before they have to scale. Mostly these companies keep on selling to bigger and bigger companies without ever needing to address their UX issues.


You actually do need to be scalable if you intend to serve more than a few thousand concurrent customers.



