How to reduce npm build time?

12 Apr.,2024

 

If you're like me, every once in a while your CI process crosses a magic threshold and you decide you have to carve it back down to size. I ran just over 1000 TeamCity CI builds on a less-than-beefy VM with a basic React/TypeScript/ASP.NET Core project, trying different combinations of npm calls, flags, and alternatives.

Here are the options that performed the best.

Suggested Options

If you do not clean the workspace on every build:

  • Best: Use yarn install or pnpm install - 88% and 80% faster than npm install, respectively
  • 2nd Best: Use npm install --prefer-offline --no-audit - 15% faster than npm install
  • Avoid: Do not use npm ci on a dirty workspace, see the warning below

npm install at 20 seconds vs. yarn and pnpm at under 4 seconds


If you clean the workspace on every build (or use a build service that doesn't cache environments):

  • Best: Use yarn install or pnpm install - 77% and 63% faster than npm install, respectively
  • 2nd Best: Use npm ci --prefer-offline --no-audit - 53% faster than npm install

Vanilla npm install at 270 seconds vs. yarn and pnpm at under half of that


The biggest gain on the npm calls comes from --prefer-offline, which tells npm to use locally cached packages when available and to call the registry only for packages that are not already in the cache.
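The suggestions above can be folded into a small build-step helper. This is a minimal Python sketch, not something used in the original tests: the function name `pick_npm_command` and the rule that a workspace counts as "dirty" when it already contains node_modules are assumptions made for illustration.

```python
from pathlib import Path
from typing import List

# Flags suggested above: prefer the local cache, skip the audit step.
FAST_FLAGS = ["--prefer-offline", "--no-audit"]


def pick_npm_command(workspace: Path) -> List[str]:
    """Return the npm call suggested above for a clean or dirty workspace.

    Assumption: a workspace is "dirty" when node_modules already exists.
    """
    if (workspace / "node_modules").is_dir():
        # Dirty workspace: npm ci was slow here, so stay with npm install.
        return ["npm", "install", *FAST_FLAGS]
    # Clean workspace: npm ci with the same flags was the fastest npm option.
    return ["npm", "ci", *FAST_FLAGS]
```

The returned list can be handed to `subprocess.run(cmd, cwd=workspace, check=True)` from a build script.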

If the machine cache performs poorly for you, try using a private registry or caching proxy. That way, once a developer or another build agent has downloaded a package, it will be available closer to your other systems.

Example: Verdaccio (Note: I haven't personally tested this system yet)

Can I get the faster build times if I have to clean my build every time?

Use yarn or pnpm (kidding, not kidding)

Another option I've seen is to archive node_modules to a common directory at the end of your CI build, then add a first step that restores it if the archive exists. This moves you from the "clean" timings to the "dirty" timings, which is roughly a 10x speedup (then switch to yarn for another 8x).
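The archive-and-restore idea could look something like the following. It is a rough Python sketch under stated assumptions, not a tested CI recipe: the function names, the archive location, and the choice of a gzip-compressed tarball are all hypothetical.

```python
import tarfile
from pathlib import Path


def archive_node_modules(workspace: Path, archive: Path) -> bool:
    """Last build step: save node_modules to a shared archive.

    Returns False when there is nothing to archive.
    """
    node_modules = workspace / "node_modules"
    if not node_modules.is_dir():
        return False
    archive.parent.mkdir(parents=True, exist_ok=True)
    with tarfile.open(str(archive), "w:gz") as tar:
        tar.add(str(node_modules), arcname="node_modules")
    return True


def restore_node_modules(workspace: Path, archive: Path) -> bool:
    """First build step: restore node_modules if an archive exists."""
    if not archive.is_file():
        return False
    workspace.mkdir(parents=True, exist_ok=True)
    # Newer Python versions support extraction filters; use the safer
    # "data" filter when this interpreter provides it.
    extra = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(str(archive), "r:gz") as tar:
        tar.extractall(str(workspace), **extra)
    return True
```

Compression trades CPU time for disk space and transfer time, so it is worth measuring whether the restore step actually beats a fresh install on your agents.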

Environment + Versions

The test data was produced using the following versions:

  • node.js: 10.15.1
  • npm: 6.4.1
  • yarn: 1.13.0
  • pnpm: 2.25.6
  • TeamCity: 2018.2.2 (build 61245)

And the following project:

  • github/BlogExample.Web/ClientApp: React 16.2 with TypeScript 3.3.3, Redux, Thunk, etc

Warning: npm ci performance

One result that really stood out during my tests was how poorly npm ci performed on non-clean builds. TeamCity can nuke and recreate the workspace in seconds, while running npm ci on a dirty folder was adding 50% (1 minute on this system) over running it on an empty folder (clean build).


Clean `npm ci` at 128 seconds, dirty `npm ci` at 185 seconds?


One other interesting item was that the release notes for npm ci came with some incredible performance results over yarn and pnpm. I don't know whether those results were a fluke, ci has gotten slower, everything else has gotten faster, or what. npm ci was visibly slower than both of those tools under the same circumstances and latest versions.

Tested Options

Here are the details on the options I used:

npm --prefer-offline option

By default, npm verifies cached packages against the registry after a minimum cache time to see if they are still good. The release notes say this is just a 304 check, but it still incurs the network latency and lookup time in the registry. That minimum cache time is 10 seconds. 🙄

prefer-offline tells npm to ignore the cache minimum time and just go ahead and use the locally cached package if it's already been downloaded, without verifying it against the registry.

The official docs are light on this option now, but some details are available here, as well as in the release notes here.

npm --no-audit option

The new feature to check packages for known vulnerabilities is great! However, the CI run may not be the best place to run that (is anyone looking at those results?). Skipping the audit step doesn't cut a lot off, but it does help.

I legitimately like the audit feature and think it should be checked regularly. More on that in a future post.

npm --progress=false option

Older versions of npm produced a progress bar as it installed packages, slowing down some systems (Windows) with the console interaction.

I did not see a statistically significant impact from this flag in up-to-date npm. I assume that either one of the environment variables that turns the progress bar off is set on my build server, or the performance cost has become less of a blocker.

npm install

Installs all packages specified in the package-lock.json file (or if that does not exist, package.json). By default it does not clean out existing packages in node_modules and performs a number of operations sequentially (like building the dependency tree of your millions of packages).

npm ci

"Continuous Integration" or "Clean Install", we may never know (both are used in the docs)

npm ci performs a clean install from your package-lock.json, with the goal of producing a reproducible result. It deletes any existing node_modules folder, installs exactly the versions recorded in the lock file, and fails if package.json and package-lock.json are out of sync.

yarn

Yarn is an alternative to npm. It takes better advantage of caching, parallelizes operations (npm does a number of things sequentially), and produces more repeatable, deterministic results than npm install.

pnpm

pnpm aims to be a faster, more efficient package management client. The main advantages are that it links files from its cache rather than copying them, and that it does not use the flat node_modules structure that npm has moved to in recent years.

pnpm supports the same commands as npm. Operations not directly implemented by pnpm are passed through to npm to execute.
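To make "links files from its cache rather than copying them" concrete, here is a toy Python sketch of the idea. It is not pnpm's implementation (its real store layout is more involved); the function name and the copy fallback are assumptions for illustration.

```python
import os
import shutil
from pathlib import Path


def link_from_store(store_file: Path, target: Path) -> str:
    """Place store_file at target using a hard link, falling back to a copy.

    Returns "linked" or "copied" so the caller can see which path was taken.
    """
    target.parent.mkdir(parents=True, exist_ok=True)
    if target.exists() or target.is_symlink():
        target.unlink()  # replace whatever was installed before
    try:
        # A hard link is a second name for the same data on disk,
        # so nothing is duplicated and no bytes are copied.
        os.link(store_file, target)
        return "linked"
    except OSError:
        # Hard links fail across filesystems or where unsupported.
        shutil.copy2(store_file, target)
        return "copied"
```

Because a hard link shares the file's data with the store, every project that uses the same package version reuses one copy on disk.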

Detailed Test Results

From those options, I ran a minimum of 20 builds for each test case on "clean" and "dirty" working folders. "Clean" folders were cleared by TeamCity before the build started; "dirty" folders came after at least one prior untracked run with the same package command.


24 Test Cases of npm install, npm ci, yarn, and pnpm


Test Case Labels:

  • po: --prefer-offline
  • na: --no-audit
  • pf: --progress=false

Tests were run over the course of a week and a half. The build server was stable and running on a VM on a dedicated server, with no other active VMs or workload. Network performance was consistent over this time and some cases (especially that long ci) were run over more than one interval or for much longer sampling times. Time was measured both for the total run time as well as specifically just the npm step. In total, 1120 TeamCity builds were run.
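If you want to repeat this kind of measurement outside a CI server, a small timing harness is enough to get started. The sketch below is an assumption-laden local approximation in Python, not the method used for the results above (those came from TeamCity build timings); the function names are hypothetical.

```python
import statistics
import subprocess
import time
from typing import Dict, List, Sequence


def time_command(cmd: Sequence[str], runs: int) -> List[float]:
    """Run cmd `runs` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True)  # raises if the command fails
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations: Sequence[float]) -> Dict[str, float]:
    """Summarize a test case; the median resists a single slow outlier."""
    return {
        "runs": len(durations),
        "median": statistics.median(durations),
        "mean": statistics.mean(durations),
    }
```

For example, `summarize(time_command(["npm", "install", "--prefer-offline", "--no-audit"], 20))` would time the "dirty" case; a "clean" case would also need the workspace wiped before each run.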

Keep in mind that these numbers won't be exactly the same in your environment, but they should be directionally correct enough to give you a solid start on your own measurements.

At Novu, we use a monorepo to manage our 24 libraries and apps. There are many debates over whether you should use a monorepo or a poly-repo. For us, visibility, code sharing, standardization, easier refactoring, and a few other reasons were the critical factors for choosing this approach for our open-source notification infrastructure project.

We migrated from yarn workspaces & lerna to PNPM and nx.dev

The bigger, the slower

With all the advantages, there are a few drawbacks to using monorepos. We noticed one in particular as the number of packages and the amount of code in each one grew: the time it takes to bootstrap the project and then build any of the packages within it. A typical GitHub action for a service would run anywhere from 11 to 30 minutes, and that's for every time a PR was created or code was pushed to the remote.

More than that, running yarn install locally could take around 5 minutes to install and build all the dependencies.

This amount of time spent bootstrapping and building hurt the developer experience and collectively wasted a lot of talented people's time. For an open-source project with a growing number of contributors, this was unacceptable.

Debugging the slowest tasks

Inspecting a typical 12-minute GitHub action, it was clear that two specific steps took roughly 70-80% of the overall time:

  • yarn install - takes 5-6 minutes
  • yarn build:{package} - could take from 3-6 minutes to build the selected package and its dependencies.

Migrating from yarn workspaces to PNPM

PNPM is a fast, disk space-efficient package manager (as stated on their website), and some of the benchmarks showed a massive improvement in install time over yarn workspaces.

Coming from a yarn install that took around 6 minutes, the migration to pnpm was effortless: we just added a pnpm-workspace.yaml to the project's root and ran pnpm install, that's all. The symlinks and dependencies for each package were efficiently installed in... wait for it... just 1.5 minutes! And that's without any cache at all! After PNPM caches the majority of the dependencies, it takes less than 40 seconds to build and install the dependencies from the cached store.
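For reference, a pnpm-workspace.yaml is a short file listing the globs that contain the workspace's packages. The fragment below is only a sketch: the directory names are hypothetical and are not taken from the Novu repository.

```yaml
# pnpm-workspace.yaml (at the repository root)
packages:
  # Hypothetical layout: adjust the globs to where your packages live.
  - "apps/*"
  - "libs/*"
```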

Reducing ~4 minutes from the bootstrap time for every CI run and locally for first-time contributors is a HUGE win. But wait, we can do even better.

From Lerna to NX.dev

After seeing the Turborepo demo by Vercel, I was intrigued by their distributed caching mechanism. With such a mechanism, we can reuse packages that other maintainers have already built and download the dist assets instead of rebuilding them each time.

turborepo vs nx.dev?
After brief research, we decided to go with nx.dev for multiple reasons:

  • Maturity - nx has been on the market for a while now, and it has a pretty big community around it.
  • Performance - Seeing some of the benchmarks nx looks like a faster build system overall.

Our community member nishit-g took over the open GitHub issue, and shortly after we had a PR open. The results astonished us: 30 seconds for the build step! (Instead of the previous 3-6 minutes for building a specific set of packages.)

After implementing nx.cloud for distributed caching, building all 24 packages takes less than 5 seconds when fully cached. Even without a full cache, the intelligent parallelism nx performs builds our target package in less than 30 seconds.
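The general idea behind this kind of build cache can be sketched in a few lines of Python. This is a toy model and not how nx or nx.cloud is implemented: the function names are hypothetical and a plain dict stands in for the shared remote cache.

```python
import hashlib
from typing import Callable, Dict, List


def input_hash(sources: List[bytes]) -> str:
    """Hash everything that can affect the build output."""
    digest = hashlib.sha256()
    for blob in sources:
        # Length-prefix each input so [b"ab", b"c"] != [b"a", b"bc"].
        digest.update(len(blob).to_bytes(8, "big"))
        digest.update(blob)
    return digest.hexdigest()


def cached_build(
    sources: List[bytes],
    build: Callable[[List[bytes]], bytes],
    cache: Dict[str, bytes],
) -> bytes:
    """Reuse a stored build output when the inputs are unchanged."""
    key = input_hash(sources)
    if key in cache:
        return cache[key]  # cache hit: skip the build entirely
    output = build(sources)  # cache miss: build once, then share the result
    cache[key] = output
    return output
```

When the cache is shared between maintainers and CI, only the first build of a given set of inputs pays the full cost.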

Summary

Reducing our build times from 12+ minutes to around 3 minutes significantly impacts the developer experience of our maintainers. It also reduces the feedback loop from creating a PR to running our test suite to merging the feature.
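As a quick sanity check on the figures quoted in this post, the reductions can be worked out directly. The Python snippet below is illustrative arithmetic only; the inputs are the rounded numbers from the text ("around 6 minutes", "12+ minutes", "around 3 minutes"), so the results are approximate.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is smaller than `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100


# Figures quoted in the text, in minutes.
install = percent_reduction(6, 1.5)  # yarn install -> pnpm install
overall = percent_reduction(12, 3)   # whole CI run, before -> after
print(f"install: {install:.0f}%, overall: {overall:.0f}%")
```

With these rounded inputs both reductions come out at about three quarters.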

HUGE Kudos to nishit-g for migrating us from Lerna to NX. Check him out on his Twitter as well!

You can check the final configuration on our GitHub repository.

How we reduced our nodejs monorepo build time by 70%