If you're like me, every once in a while your CI process crosses a magic threshold and you decide you have to carve it back down to size. I ran just over 1000 TeamCity CI builds on a less-than-beefy VM with a basic React/TypeScript/ASP.NET Core project, trying different combinations of npm calls, flags, and alternatives.
Here are the options that performed the best.
If you do not clean the workspace on every build:
- yarn install or pnpm install: 88% and 80% faster than npm install, respectively
- npm install --prefer-offline --no-audit: 15% faster than npm install
- npm ci: see note below

npm install at 20 seconds, vs yarn and pnpm at under 4 seconds
If you clean the workspace on every build (or use a build service that doesn't cache environments):
- yarn install or pnpm install: 77% and 63% faster than npm install, respectively
- npm ci --prefer-offline --no-audit: 53% faster than npm install

Vanilla npm install at 270 seconds, vs yarn and pnpm at under half that
The biggest gain on the npm calls is due to --prefer-offline, which tells npm to use locally cached packages when available, only calling the registry if a package isn't already available locally.
For better performance than the per-machine cache, try using a private registry or caching proxy. This way, once a developer or another build agent has downloaded a package once, it will be available closer to your other systems.
Example: Verdaccio (Note: I haven't personally tested this system yet)
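If you do try it, pointing npm at a proxy like Verdaccio is a one-line registry setting; the host below is a placeholder for wherever you run it (4873 is Verdaccio's default port):

```ini
# .npmrc (project root or build agent) — the URL is a placeholder.
# Verdaccio proxies the public registry and caches every package it serves.
registry=http://your-verdaccio-host:4873/
```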
Use yarn
or pnpm
(kidding, not kidding)
Another option I've seen is to archive node_modules to a common directory at the end of your CI build, then add a first step that restores it if the archive exists. This moves you from the "clean" timings to the "dirty" timings, roughly a 10x speedup (then switch to yarn for another 8x).
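A minimal sketch of those two steps, assuming plain tar; the CACHE path is a placeholder you'd point at storage that survives between builds (or is shared between agents):

```shell
# Sketch of the archive/restore steps; CACHE is a placeholder path.
CACHE="${CACHE:-/tmp/ci-cache/node_modules.tar.gz}"

# First build step: restore node_modules from the previous build, if any.
restore_node_modules() {
  [ -f "$CACHE" ] && tar -xzf "$CACHE"
  return 0
}

# Last build step: archive node_modules for the next build.
archive_node_modules() {
  mkdir -p "$(dirname "$CACHE")"
  tar -czf "$CACHE" node_modules
}
```

Run the restore before the package install step and the archive only after the build succeeds, so a broken install doesn't poison the cache.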
The test data was produced using the following versions:
And the following project:
One result that really stood out during my tests was how poorly npm ci performed on non-clean builds. TeamCity can nuke and recreate the workspace in seconds, while running npm ci on a dirty folder added 50% (a full minute on this system) over running it on an empty folder (clean build).
Clean `npm ci` at 128 seconds, dirty `npm ci` at 185 seconds?
One other interesting item was that the release notes for npm ci came with some incredible performance results over yarn and pnpm. I don't know whether those results were a fluke, ci has gotten slower, everything else has gotten faster, or what. npm ci was visibly slower than both of those tools under the same circumstances and latest versions.
Here are the details on the options I used:
npm --prefer-offline option

By default, npm verifies packages against the registry after a minimum cache time to see if the package is still good. The release notes say this is just a 304 check, but that still includes the network latency and lookup time in the registry. That minimum cache time is 10 seconds. 🙄

prefer-offline tells npm to ignore the cache minimum time and just go ahead and use the locally cached package if it's already been downloaded, without verifying it against the registry.

The official docs are light on this option now, but some details are available here, as well as in the release notes here.
npm --no-audit option

The new feature to check packages for known vulnerabilities is great! However, the CI run may not be the best place to run that (is anyone looking at those results?). Skipping the audit step doesn't cut a lot off, but it does help.

I legitimately like the audit feature and think it should be run regularly; more on that in a future post.
npm --progress=false option

Older versions of npm produced a progress bar as packages installed, slowing down some systems (Windows) with the console interaction. I did not see a statistically significant impact from this flag on up-to-date npm, and assume either one of the environment variables set on the build server turns this off, or the performance hit has become less of a blocker.
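If you'd rather not repeat these flags on every call, the same options can be pinned in an .npmrc (per project or per build agent), which npm reads automatically; a sketch:

```ini
# .npmrc — equivalent to passing the flags on each npm invocation
prefer-offline=true
audit=false
progress=false
```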
npm install

Installs all packages specified in the package-lock.json file (or, if that does not exist, package.json). By default it does not clean out existing packages in node_modules, and it performs a number of operations sequentially (like building the dependency tree of your millions of packages).
npm ci

"Continuous Integration" or "Clean Install"? We may never know (both are used in the docs). npm ci performs a clean install from your package-lock.json, with the goal of sort-of reproducing a deterministic result. It will install exactly the versions you specify in your package-lock, but the dependencies of those packages may be updated.
yarn

Yarn is an alternative to npm. It takes better advantage of caching, parallelizes operations (npm does a number of things sequentially), and produces more repeatable, deterministic results than npm install.
pnpm
pnpm aims to be a faster, more efficient package management client. The main advantages are that it links files from its cache rather than copying them, and that it does not use the flat node_modules structure npm has moved to in recent years.
pnpm supports the same commands as npm. Operations not directly implemented by pnpm are passed through to npm to execute.
From those options, I ran a minimum of 20 builds for each test case on "clean" and "dirty" working folders. "Clean" folders were cleared by TeamCity before the build started, "Dirty" folders came after at least one prior untracked run with the same package command.
24 Test Cases of npm install, npm ci, yarn, and pnpm
Test Case Labels:
Tests were run over the course of a week and a half. The build server was stable and running on a VM on a dedicated server, with no other active VMs or workload. Network performance was consistent over this time and some cases (especially that long ci) were run over more than one interval or for much longer sampling times. Time was measured both for the total run time as well as specifically just the npm step. In total, 1120 TeamCity builds were run.
Keep in mind, these numbers won't be exactly the same in your environment, but they should be directionally correct enough to give you a solid start.
At Novu, we use a monorepo to manage our 24 libraries and apps. There are many debates over whether you should use a monorepo or a poly-repo. For us, visibility, code sharing, standardization, easier refactoring, and a few other reasons were the critical factors for choosing this approach for our open-source notification infrastructure project.
We migrated from yarn workspaces & lerna to PNPM and nx.dev
With all the advantages, there are a few drawbacks to using monorepos. We noticed a particular drawback when scaling the number of packages and the amount of code in each one: the time it takes to bootstrap the project and then build any packages within it. A typical GitHub action for a service would run anywhere between 11 and 30 minutes, and that's every time a PR was created or code was pushed to the remote.
More than that, installing a package locally with yarn install
could take around 5 minutes to install and build all the dependencies.
This amount of time spent bootstrapping and building degraded the developer experience and collectively wasted a great deal of talented people's time. For an open-source project with a growing number of contributors, this was unacceptable.
Inspecting a typical 12-minute GitHub action, it was clear that two specific steps took almost 70-80% of the overall time:
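Those two steps were the dependency bootstrap and the package build. A simplified sketch of such a workflow (job and step names, action versions, and commands are illustrative, not our actual file):

```yaml
# .github/workflows/ci.yml (simplified sketch)
name: CI
on: [pull_request, push]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-node@v2
      # Step 1: bootstrap all workspace dependencies (the first big time sink)
      - run: yarn install
      # Step 2: build the service and the packages it depends on (the second)
      - run: yarn build
```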
PNPM is a fast, disk space-efficient package manager (as stated on their website), and in some of the benchmarks there was a massive improvement in install time against yarn workspaces.
Moving from a yarn install that took around 6 minutes, the migration to pnpm was effortless: just add a pnpm-workspace.yaml to the project's root and run pnpm install, that's all. The symlinks and dependencies for each package were efficiently installed in... wait for it... just 1.5 minutes! And that's without any cache at all! After pnpm caches the majority of the dependencies, it takes less than 40 seconds to build and install the dependencies from the cached store.
Reducing ~4 minutes from the bootstrap time for every CI run and locally for first-time contributors is a HUGE win. But wait, we can do even better.
After seeing the Turborepo demo by Vercel, I was intrigued by their distributed caching mechanism. With such a mechanism, we can reuse packages already built by other maintainers and download the dist assets instead of rebuilding them each time.
Turborepo vs nx.dev?
After brief research, we decided to go with nx.dev for multiple reasons:
Our community member nishit-g took over the open GitHub issue, and soon after we had a PR open. The results astonished us: the build step took just 30 seconds (instead of the previous 3-6 minutes to build a specific set of packages)!
After implementing nx.cloud for distributed caching, building all 24 packages takes less than 5 seconds when fully cached. And even without a full cache, thanks to the intelligent parallelism nx performs, it builds our target package in less than 30 seconds.
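As a sketch of what turns that on (not our exact configuration), the Nx task runner in nx.json is pointed at nx.cloud and told which targets are cacheable; the token value is a placeholder:

```json
{
  "tasksRunnerOptions": {
    "default": {
      "runner": "@nrwl/nx-cloud",
      "options": {
        "cacheableOperations": ["build", "lint", "test"],
        "accessToken": "<nx-cloud-access-token>"
      }
    }
  }
}
```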
Reducing our build times from 12+ minutes to around 3 minutes significantly impacts the developer experience of our maintainers. It also reduces the feedback loop from creating a PR to running our test suite to merging the feature.
HUGE Kudos to nishit-g for migrating us from Lerna to NX. Check him out on his Twitter as well!
You can check the final configuration on our GitHub repository.