
Parallelism in Node.js: 0 to 1

```mermaid
graph TD
    style A fill:#f9d71b,stroke:#333,stroke-width:2px;
    A[Node.js Parallelism]
    B[Cluster Module]
    C[Event Loop]
    D[Worker Threads]
    E[Callbacks, Promises, Async/Await]
    F[Threads in a Single Process]
    G[Child Processes]
    A --> B
    A --> C
    A --> D
    B -->|Forks Multiple Processes| G
    C -->|Handles Async Operations| E
    D -->|Spawns Additional Threads| F
```

The Nature of JavaScript & Node.js

JavaScript was designed as a single-threaded language: it executes one operation at a time, in a specific order, from top to bottom. Code runs line by line, and if one line takes a long time (e.g., a complex computation or a synchronous API call), it blocks the subsequent lines from executing.

By default, everything runs on the main thread, so at any given moment the program can only be doing one specific thing.
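A small illustrative sketch of this (the function and numbers are made up for the example): a long synchronous loop keeps the main thread busy, so even a timer scheduled with a 0 ms delay cannot fire until the loop returns.

```typescript
// A long synchronous computation occupies the single main thread.
function busySum(n: number): number {
    let sum = 0;
    for (let i = 0; i < n; i++) {
        sum += i; // while this loop runs, timers and I/O callbacks must wait
    }
    return sum;
}

const start = Date.now();
setTimeout(() => {
    // Scheduled with a 0 ms delay, but it can only fire once busySum()
    // has returned and released the main thread.
    console.log(`timer fired after ~${Date.now() - start} ms`);
}, 0);

console.log(`sum = ${busySum(1e8)}`);
```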

Why was JavaScript single-threaded in the first place?

  • JavaScript's primary role was to add interactivity to web pages: form validation, animations, and handling user interactions. This didn't necessitate the complexity of a multi-threaded architecture.
  • Being single-threaded also made JavaScript easier to use, especially for beginners. Multi-threaded environments introduce complexities like race conditions, deadlocks, and thread synchronisation issues. By staying single-threaded, JavaScript avoided these problems and offered a more straightforward execution model.

So, typically, most applications perform three types of operations:

  • Blocking operations
  • Non-blocking I/O operations
    • Database / file manipulation operations
  • CPU-intensive operations
    • Image processing
    • Crunching large amounts of data

One by one, we will look at how Node.js effectively handles parallelism for I/O operations, and at how we can write code to achieve the same for CPU-intensive operations.

1) Parallelism in I/O Operations: The Event Loop

Now imagine we made an API call that takes 3 seconds to complete and blocks the main thread. Since JavaScript is single-threaded, that means for 3 seconds we would not be able to do anything else.

```mermaid
sequenceDiagram
    participant MainThread as Main Thread
    participant API as API
    MainThread->>API: Initiates an API Call
    Note over MainThread,API: Main thread is blocked
    API-->>MainThread: Responds after 3 seconds
    Note over MainThread: Main thread resumes
```

JavaScript is smart in that regard: it follows an event-driven architecture. As soon as the I/O starts, the main thread is freed to do other things, and it is notified once the I/O has completed.
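A minimal sketch of this behaviour using the fs module (the temp file name and contents are invented for the example): the synchronous log line runs before the read callback, because the read is handed off and the main thread keeps going.

```typescript
import { writeFileSync, readFile } from 'fs';
import { tmpdir } from 'os';
import { join } from 'path';

// Prepare a file to read (synchronously, just for setup).
const filePath = join(tmpdir(), 'event-loop-demo.txt');
writeFileSync(filePath, 'hello from the event loop');

const order: string[] = [];

// Non-blocking read: the work is handed off, and the callback runs later.
readFile(filePath, 'utf8', (err, data) => {
    if (err) throw err;
    order.push('io-callback');
    console.log(`read finished: "${data}"`);
});

order.push('sync-code');
console.log('main thread is free while the file is being read');
```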

The event loop can be thought of as the orchestrator of the Node.js runtime. It continuously checks whether there are tasks that need to be executed and ensures they are handled in the correct order.
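To make "the correct order" concrete, here is a small ordering experiment (an illustrative sketch): synchronous code runs first, then promise callbacks (microtasks), then timer callbacks (macrotasks), even when the timer delay is 0 ms.

```typescript
const phases: string[] = [];

setTimeout(() => phases.push('timer callback'), 0);            // macrotask
Promise.resolve().then(() => phases.push('promise callback')); // microtask
phases.push('synchronous code');                               // runs immediately

// By the time this timer fires, all three entries have been recorded.
setTimeout(() => console.log(phases), 10);
```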

```mermaid
sequenceDiagram
    participant MainThread as Main Thread
    participant EventLoop as Event Loop
    participant API as API
    MainThread->>EventLoop: Initiates an API Call
    EventLoop->>API: Starts API Call
    Note over MainThread,API: While API processes, main thread is free for other tasks
    API->>EventLoop: API completes
    Note over EventLoop,MainThread: Event Loop notifies Main Thread to process API response
```

Behind the scenes: libuv

Node.js uses a library called libuv to handle asynchronous operations. libuv has a pool of worker threads that perform the actual I/O operations in the background. Once an I/O operation is initiated, it's handed off to one of these threads, leaving the main thread free to continue executing other tasks.
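We can make the pool visible with crypto.pbkdf2, whose hashing runs on libuv's worker threads (the pool size defaults to 4 and can be tuned via the UV_THREADPOOL_SIZE environment variable). This is an illustrative sketch; the password, salt, and iteration count are arbitrary.

```typescript
import { pbkdf2 } from 'crypto';

let completed = 0;
const start = Date.now();

// Each call is dispatched to libuv's thread pool, so up to four
// hashes (the default pool size) can run at the same time.
for (let i = 1; i <= 4; i++) {
    pbkdf2('password', 'salt', 100_000, 64, 'sha512', (err) => {
        if (err) throw err;
        completed++;
        console.log(`hash ${i} done after ${Date.now() - start} ms`);
    });
}

console.log('main thread is free while the pool does the hashing');
```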

To observe these asynchronous resources in action, Node.js provides an API, async_hooks, that tracks them throughout their lifecycle.

```typescript
import * as async_hooks from 'async_hooks';

let timeout_async_id: number | null = null;

function initHook(asyncId: number, type: string, triggerAsyncId: number, resource: Object) {
    // Only log when the type is 'Timeout'
    if (type === 'Timeout') {
        timeout_async_id = asyncId;
        console.log(`Init: asyncId=${asyncId}, type=${type}, triggerAsyncId=${triggerAsyncId}`);
    }
}

function beforeHook(asyncId: number) {
    if (asyncId === timeout_async_id) {
        console.log(`Before: asyncId=${asyncId}`);
    }
}

function afterHook(asyncId: number) {
    if (asyncId === timeout_async_id) {
        console.log(`After: asyncId=${asyncId}`);
    }
}

function destroyHook(asyncId: number) {
    if (asyncId === timeout_async_id) {
        console.log(`Destroy: asyncId=${asyncId}`);
    }
}

const asyncHook = async_hooks.createHook({ init: initHook, before: beforeHook, after: afterHook, destroy: destroyHook });
asyncHook.enable();

function demonstrateTimeout() {
    console.log('Starting setTimeout...');
    setTimeout(() => {
        console.log('Inside setTimeout callback.');
    }, 1000);
}

demonstrateTimeout();
```
Output for the above code:

```
Starting setTimeout...
Init: asyncId=5, type=Timeout, triggerAsyncId=1
Before: asyncId=5
Inside setTimeout callback.
After: asyncId=5
```

Libuv is very good at handling I/O operations, which lets us run several of them concurrently, like this:

```typescript
// Simulated asynchronous database write; in a real application this
// I/O would be handled by libuv in the background.
function saveValueInDatabase(value: string): Promise<string> {
    return new Promise((resolve) => setTimeout(() => resolve(value), 100));
}

async function main() {
    // All three writes are in flight at the same time;
    // await resolves once every one of them has finished.
    const results = await Promise.all([
        saveValueInDatabase("Task 1"),
        saveValueInDatabase("Task 2"),
        saveValueInDatabase("Task 3")
    ]);
    console.log(results);
}

main();
```

But libuv is not good at handling CPU-intensive tasks. When a CPU-intensive task runs in Node.js, the main thread gets blocked, because JavaScript execution is single-threaded by design. Node.js can't handle other incoming requests or events until that CPU-bound task completes.


Worker Threads

Node.js introduced the worker_threads module as a core module to tackle the limitations associated with running CPU-intensive tasks. This module enables threads that execute JavaScript in parallel.

  • Each worker thread runs in its own execution context. This means it has its own memory, variables, and call stack.
  • The main thread and worker threads can send and receive messages using the postMessage() method and by listening to the message event, respectively.
  • It's common to match the number of worker threads with the number of CPU cores for CPU-bound tasks, but this is not a hard and fast rule.
  • There's no strict minimum CPU requirement for the effectiveness of worker threads. Even a single-core CPU can run multiple threads, but they won't run in true parallel; instead, they'll be time-sliced.

```mermaid
sequenceDiagram
    participant MT as Main Thread
    participant WT1 as Worker Thread 1
    participant WT2 as Worker Thread 2
    participant IO as I/O Operations
    participant CPU as CPU Intensive Task
    MT->>IO: Handle I/O Operation
    Note over MT: Doesn't wait, remains free
    MT->>WT1: Delegate CPU Intensive Task 1
    MT->>WT2: Delegate CPU Intensive Task 2
    WT1->>CPU: Execute Task 1
    WT2->>CPU: Execute Task 2
    Note over WT1,WT2: Execute in parallel
    WT1->>MT: Return Result of Task 1
    WT2->>MT: Return Result of Task 2
    Note over MT: Can continue processing other tasks
```
```typescript
import { isMainThread, Worker, parentPort } from 'worker_threads';

async function cpuIntensiveTask() {
    let sum = 0;
    for (let i = 0; i < 1e9; i++) {
        sum += i;
    }
    return sum;
}

if (isMainThread) {
    console.log('Main thread starting...');

    // Create two worker threads, each re-running this same file
    const worker1 = new Worker(__filename);
    const worker2 = new Worker(__filename);

    worker1.on('message', (result) => {
        console.log(`Worker 1 finished with result: ${result}`);
    });

    worker2.on('message', (result) => {
        console.log(`Worker 2 finished with result: ${result}`);
    });
} else {
    // This is the worker thread
    cpuIntensiveTask().then((result) => {
        if (parentPort) parentPort.postMessage(result);
    });
}
```

Worker threads let us launch separate threads while the root process stays the same. But how can we offload our operations to an entirely new process?

Cluster Module


When do we want the root process to stay the same, and why does it matter?

  • Launching a new process has significantly more overhead in terms of memory and initialisation than starting a new thread.
  • One of the primary advantages of worker threads over separate processes (like those created with the child_process module in Node.js) is the ability to share memory.
  • On the other hand, if a part of your application is CPU-bound and can run independently, moving it to a separate process keeps the main application responsive. This is especially relevant if the computation doesn't need frequent access to shared resources.
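The shared-memory point can be sketched with a SharedArrayBuffer (a hypothetical example; the inline eval-mode worker is used only to keep it self-contained): both threads operate on the very same bytes, with no copying or serialisation between them.

```typescript
import { Worker } from 'worker_threads';

// 4 bytes of memory shared between the main thread and a worker.
const shared = new SharedArrayBuffer(4);
const counter = new Int32Array(shared);

// Inline worker script (eval mode): it increments the shared counter and exits.
const workerSource = `
    const { workerData } = require('worker_threads');
    const counter = new Int32Array(workerData);
    Atomics.add(counter, 0, 1);
`;

const worker = new Worker(workerSource, { eval: true, workerData: shared });
worker.on('exit', () => {
    // The worker mutated the very same memory; nothing was copied back.
    console.log(`counter after worker exit: ${Atomics.load(counter, 0)}`);
});
```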

Sample code using the cluster module:

```typescript
// Filename: worker-demo.ts

import * as cluster from 'cluster';
import * as os from 'os';

async function cpuIntensiveTask() {
    let sum = 0;
    for (let i = 0; i < 1e9; i++) {
        sum += i;
    }
    return sum;
}

const numCPUs = os.cpus().length;  // Number of CPU cores

if (cluster.isMaster) {
    console.log(`Master ${process.pid} is running ${numCPUs} workers`);

    // Fork a worker process for each CPU core
    for (let i = 0; i < numCPUs; i++) {
        cluster.fork();
    }

    cluster.on('exit', (worker, code, signal) => {
        console.log(`Worker ${worker.process.pid} died`);
    });

} else {
    // Each worker is a separate Node.js process with its own event loop
    console.log(`Worker ${process.pid} started`);
    cpuIntensiveTask().then((result) => {
        console.log(`Worker ${process.pid} finished: ${result}`);
    });
}
```

We can even inspect whether our multiple processes are running by using the ps aux | grep node command. We can see our master process spun off 8 child processes.

```
worker_demo@worker_demos-iMac worker_demo % ps aux | grep node
worker_demo          5623  93.8  1.6  9098580 131328 s002  R+    8:53PM   0:00.87 /Users/worker_demo/.nvm/versions/node/v14.19.3/bin/node /Users/worker_demo/worker_demo/worker_demo.js
worker_demo          5621  90.4  1.6  8967028 131152 s002  R+    8:53PM   0:00.85 /Users/worker_demo/.nvm/versions/node/v14.19.3/bin/node /Users/worker_demo/worker_demo/worker_demo.js
worker_demo          5624  88.9  1.6  9098612 131184 s002  R+    8:53PM   0:00.83 /Users/worker_demo/.nvm/versions/node/v14.19.3/bin/node /Users/worker_demo/worker_demo/worker_demo.js
worker_demo          5622  88.8  1.6  8836212 130624 s002  R+    8:53PM   0:00.83 /Users/worker_demo/.nvm/versions/node/v14.19.3/bin/node /Users/worker_demo/worker_demo/worker_demo.js
worker_demo          5625  88.7  1.6  9098324 130592 s002  R+    8:53PM   0:00.81 /Users/worker_demo/.nvm/versions/node/v14.19.3/bin/node /Users/worker_demo/worker_demo/worker_demo.js
worker_demo          5620  88.0  1.6  8836212 130560 s002  R+    8:53PM   0:00.82 /Users/worker_demo/.nvm/versions/node/v14.19.3/bin/node /Users/worker_demo/worker_demo/worker_demo.js
worker_demo          5626  87.7  1.6  9098612 130480 s002  R+    8:53PM   0:00.81 /Users/worker_demo/.nvm/versions/node/v14.19.3/bin/node /Users/worker_demo/worker_demo/worker_demo.js
worker_demo          5627  87.5  1.6  9098356 130256 s002  R+    8:53PM   0:00.80 /Users/worker_demo/.nvm/versions/node/v14.19.3/bin/node /Users/worker_demo/worker_demo/worker_demo.js
worker_demo          5619   1.5  1.2  9358188 104256 s002  S+    8:53PM   0:00.12 node worker_demo.js
```

Concluding thoughts

We learned how Node.js is smart enough to handle non-blocking I/O operations on its own, and that for CPU-intensive operations we can reach for worker threads or the cluster module.

In my quest to explore parallelism in different languages, I am going to look at how parallelism works in Python and Go as well. Stay tuned and subscribe!