Game of Life looks simple, but rendering millions of cells in real time inside a browser is not. This post is based on my seminar talk “Efficient rendering in HTML Canvas for cellular automaton simulations” and walks through several CPU and GPU techniques, comparing their trade‑offs and performance.
TL;DR: Separating computation from rendering, avoiding unnecessary drawing, and moving work to multiple threads or the GPU are key to smooth large‑grid simulations.
My thesis project Fuzzy Life extends Conway’s Game of Life with fuzzy cell values, which immediately amplifies both computation and rendering costs. When simulating grids with hundreds of thousands or millions of cells, the bottleneck is no longer just the rules – it is how often and how efficiently the world is drawn.
Typical pain points:
The goal is to find rendering architectures that:
HTML <canvas> provides a bitmap surface that JavaScript can draw into with a 2D rendering context.
Drawing is done through calls such as clearRect, fillRect, drawImage, stroke, or fillText.
For Game of Life this means the naive approach is to loop over all cells and call fillRect for each live cell on every frame.
All demos are small HTML files that share a common JavaScript core and differ in how they compute and draw frames.
The demos are served through a minimal Express server with the necessary COOP/COEP headers:
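// server.js
const express = require('express');

const app = express();

// COOP/COEP headers make the page cross-origin isolated (required for SharedArrayBuffer)
app.use((req, res, next) => {
  res.setHeader('Cross-Origin-Opener-Policy', 'same-origin');
  res.setHeader('Cross-Origin-Embedder-Policy', 'require-corp');
  next();
});
app.use(express.static(__dirname));
app.get('/', (req, res) => {
  res.sendFile('index1-full-redrawn.html', { root: __dirname });
});

// Port 8080 matches the URL in the launch instructions
app.listen(8080, () => console.log('Listening on http://localhost:8080/'));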
Launch with:
npm init -y
npm i express
node server.js
# open http://localhost:8080/
Available demo pages:
- index1-full-redrawn.html – full redraw baseline.
- index2-dirty-rectangles.html – dirty rectangles.
- index3-vis-region-rendering.html – visible region rendering.
- index4-static-web-workers.html – single web worker.
- index5-static-web-workers-n.html – multi‑worker with copying.
- index6-sharedarray-multiworker.html – SharedArrayBuffer multi‑worker.
- index7-static-image-data.html – ImageData push rendering.
- index8-sharedarray-multiworker-imagedata.html – SAB + ImageData hybrid.
- index9-gpu-webgl.html – WebGL2 GPU version.

The simplest model is to redraw the entire world every step, ignoring camera or visibility.
Idea
- Iterate over all ROWS * COLS cells.

function drawAll(ctx) {
  // Clear the visible area to white
  ctx.fillStyle = '#fff';
  ctx.fillRect(camX, camY, visW, visH);
  ctx.fillStyle = '#000';
  // Full redraw of every cell
  for (let y = 0; y < ROWS; y++) {
    const off = y * COLS;
    for (let x = 0; x < COLS; x++) {
      if (filled[off + x]) {
        ctx.fillRect(x, y, 1, 1);
      }
    }
  }
}
Pros
Cons
- Up to ROWS * COLS fillRect calls per frame, including off‑screen cells.

Dirty rectangles track which cells changed between frames and redraw just those cells over the old frame.
Idea
- Keep two buffers: filled (current) and next (next generation).
- Redraw only the cells where filled[i] !== next[i]:

function drawDirtyGlobal() {
  for (let y = 0; y < ROWS; y++) {
    const off = y * COLS;
    for (let x = 0; x < COLS; x++) {
      const i = off + x;
      // Only cells whose state changed are repainted
      if (filled[i] !== next[i]) {
        ctx.fillStyle = next[i] ? '#000' : '#fff';
        ctx.fillRect(x, y, 1, 1);
      }
    }
  }
}
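The scan above still visits every cell on each frame. One possible refinement, sketched here under the assumption that the simulation step can report what it changed, is to collect the indices of changed cells while computing the next generation and hand only that list to the renderer. The names `countNeighbors`, `stepWithDirtyList`, and `drawDirtyList` are hypothetical, cells are assumed to hold 0 or 1, and the border is treated as dead, which may differ from the demos.

```javascript
// Count live neighbors of (x, y); cells outside the grid count as dead.
function countNeighbors(filled, cols, rows, x, y) {
  let n = 0;
  for (let dy = -1; dy <= 1; dy++) {
    for (let dx = -1; dx <= 1; dx++) {
      if (dx === 0 && dy === 0) continue;
      const nx = x + dx;
      const ny = y + dy;
      if (nx >= 0 && nx < cols && ny >= 0 && ny < rows) {
        n += filled[ny * cols + nx];
      }
    }
  }
  return n;
}

// Compute one Conway generation into `next` and return the indices that changed.
function stepWithDirtyList(filled, next, cols, rows) {
  const dirty = [];
  for (let y = 0; y < rows; y++) {
    for (let x = 0; x < cols; x++) {
      const i = y * cols + x;
      const n = countNeighbors(filled, cols, rows, x, y);
      next[i] = n === 3 || (filled[i] === 1 && n === 2) ? 1 : 0;
      if (next[i] !== filled[i]) dirty.push(i);
    }
  }
  return dirty;
}

// Redraw only the listed cells; `ctx` is any object with fillStyle and fillRect.
function drawDirtyList(ctx, next, cols, dirty) {
  for (const i of dirty) {
    ctx.fillStyle = next[i] ? '#000' : '#fff';
    ctx.fillRect(i % cols, Math.floor(i / cols), 1, 1);
  }
}
```

With this split the draw pass costs time proportional to the number of changes rather than to the world size, although off‑screen changes are still drawn.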
Characteristics
Visible region rendering clips the world to what the camera sees and draws only the currently visible part.
Idea
const W = canvas.width;
const H = canvas.height;
const left = camX;
const top = camY;
const right = camX + W * S;
const bottom = camY + H * S;
const cs = Math.max(0, Math.floor(left / CELL));
const ce = Math.min(COLS - 1, Math.ceil(right / CELL));
const rs = Math.max(0, Math.floor(top / CELL));
const re = Math.min(ROWS - 1, Math.ceil(bottom / CELL));
ctx.save();
// S is world units per screen pixel (see right/bottom above), so scale by its inverse
ctx.scale(1 / S, 1 / S);
ctx.translate(-camX, -camY);
ctx.fillStyle = '#fff';
ctx.fillRect(left, top, right - left, bottom - top);
ctx.fillStyle = '#000';
for (let y = rs; y <= re; y++) {
  const off = y * COLS;
  for (let x = cs; x <= ce; x++) {
    if (filled[off + x]) {
      ctx.fillRect(x, y, 1, 1);
    }
  }
}
ctx.restore();
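To get a feel for how much work the clipping saves, the bounds computation can be pulled into a small pure function and evaluated with concrete numbers. This is only a sketch: `visibleCellRange` and its parameter object are hypothetical, and `S` is assumed to mean world units per screen pixel, as the `right` and `bottom` lines suggest.

```javascript
// Inclusive range of cell columns (cs..ce) and rows (rs..re) covered by the camera.
function visibleCellRange({ camX, camY, W, H, S, CELL, COLS, ROWS }) {
  const right = camX + W * S;
  const bottom = camY + H * S;
  return {
    cs: Math.max(0, Math.floor(camX / CELL)),
    ce: Math.min(COLS - 1, Math.ceil(right / CELL)),
    rs: Math.max(0, Math.floor(camY / CELL)),
    re: Math.min(ROWS - 1, Math.ceil(bottom / CELL)),
  };
}

// Example: an 800x600 canvas over a 4000x4000 world at S = 1
const range = visibleCellRange({
  camX: 100, camY: 50, W: 800, H: 600, S: 1, CELL: 1, COLS: 4000, ROWS: 4000,
});
const visited = (range.ce - range.cs + 1) * (range.re - range.rs + 1);
```

In this example the loop visits 481,401 cells instead of 16,000,000, roughly 3% of the world.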
Why it matters
Game of Life’s neighbor updates are local and parallelizable, so moving the simulation step to a Web Worker keeps the UI thread responsive.
Architecture
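Main thread:
const worker = new Worker('worker.js');
worker.postMessage({
  init: true,
  COLS,
  ROWS
});
worker.onmessage = (e) => {
  const filled = new Uint8Array(e.data.buffer); // received world state
  draw(filled); // UI thread only draws
  // A transferred buffer is detached in the worker, so hand it back
  // together with the request for the next generation
  worker.postMessage({ buffer: e.data.buffer }, [e.data.buffer]);
};
// Request the first generation (the step field is illustrative;
// the worker treats any non-init message as a step request)
worker.postMessage({ step: true });
Worker:
onmessage = (e) => {
  if (e.data.init) {
    initWorld(e.data.COLS, e.data.ROWS);
    return;
  }
  // Re-wrap the buffer returned by the main thread (absent on the first step)
  if (e.data.buffer) filled = new Uint8Array(e.data.buffer);
  // Compute next generation
  for (let y = 0; y < ROWS; y++) {
    for (let x = 0; x < COLS; x++) {
      const i = y * COLS + x;
      next[i] = rule(filled, x, y);
    }
  }
  // Swap buffers and transfer the new current generation
  [filled, next] = [next, filled];
  postMessage({
    buffer: filled.buffer
  }, [filled.buffer]);
};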
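Transferring has one consequence worth seeing in isolation: the sender's buffer is detached afterwards, so a worker that transfers its current generation cannot read it again unless it gets the buffer back or keeps a copy. The snippet below mimics the transfer with `structuredClone`, which accepts a transfer list much like `postMessage` does. It is an illustration only, not code from the demos.

```javascript
const world = new Uint8Array([0, 1, 1, 0]);

// Move the buffer instead of copying it, as postMessage(msg, [buffer]) would.
const received = structuredClone(world.buffer, { transfer: [world.buffer] });

// The sender's view is now detached and reports zero bytes.
const senderBytes = world.byteLength;

// The receiver owns the original four bytes.
const receiverView = new Uint8Array(received);
```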
Pros
- Passing the ArrayBuffer as a transferable avoids extra copies on send.

Cons
To leverage multiple cores, the world can be split into horizontal strips, each handled by its own worker.
However:
- Each strip needs a copied slice of the world (filled.buffer.slice(...)) per step.
- Sending those copies through postMessage each frame costs hundreds of milliseconds.

Result:
This motivates removing data copies entirely.
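A minimal sketch of the strip partitioning and the per-step copy it implies. `splitIntoStrips` and `copyStripWithHalo` are hypothetical helpers, not the demo code. The one-row halo is an assumption based on how neighbor counting works: a strip needs the rows directly above and below it to evaluate its edge cells, so the copied range is slightly larger than the strip itself.

```javascript
// Split `rows` into `n` nearly equal horizontal strips [y0, y1).
function splitIntoStrips(rows, n) {
  const strips = [];
  const base = Math.floor(rows / n);
  const extra = rows % n;
  let y0 = 0;
  for (let k = 0; k < n; k++) {
    const height = base + (k < extra ? 1 : 0);
    strips.push({ y0, y1: y0 + height });
    y0 += height;
  }
  return strips;
}

// Copy one strip plus a one-row halo on each side (clamped to the world).
function copyStripWithHalo(filled, cols, rows, { y0, y1 }) {
  const from = Math.max(0, y0 - 1);
  const to = Math.min(rows, y1 + 1);
  return filled.slice(from * cols, to * cols); // this copy is paid on every step
}
```

Summed over all strips, the copies cover the whole world (plus halos) every generation, which is the overhead the shared-memory variant avoids.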
SharedArrayBuffer allows several workers and the main thread to share the same underlying memory. No copies, no transfer list, just shared typed arrays with proper synchronization when needed.
Initialization
Main thread:
const sabA = new SharedArrayBuffer(COLS * ROWS);
const sabB = new SharedArrayBuffer(COLS * ROWS);
const worldA = new Uint8Array(sabA);
const worldB = new Uint8Array(sabB);
// spawn workers and send SABs
for (const worker of workers) {
  worker.postMessage({
    init: true,
    COLS,
    ROWS,
    sabA,
    sabB
  });
}
Workers use new Uint8Array(sabA) and new Uint8Array(sabB) to read from one buffer and write the next generation into the other.
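The read-one-write-the-other scheme can be sketched on a single thread as follows. In a real setup each worker would build its own views from the buffers it received and process only its own strip; here the grid size, the `worlds` array, and `stepStrip` are assumptions made for illustration, with 0/1 cells and dead borders.

```javascript
const COLS = 5;
const ROWS = 5;
const sabA = new SharedArrayBuffer(COLS * ROWS);
const sabB = new SharedArrayBuffer(COLS * ROWS);
const worlds = [new Uint8Array(sabA), new Uint8Array(sabB)];

// Compute rows [y0, y1) of generation gen + 1; even generations live in buffer A.
function stepStrip(gen, y0, y1) {
  const cur = worlds[gen % 2];
  const nxt = worlds[(gen + 1) % 2];
  for (let y = y0; y < y1; y++) {
    for (let x = 0; x < COLS; x++) {
      let n = 0;
      for (let dy = -1; dy <= 1; dy++) {
        for (let dx = -1; dx <= 1; dx++) {
          const nx = x + dx;
          const ny = y + dy;
          if ((dx || dy) && nx >= 0 && nx < COLS && ny >= 0 && ny < ROWS) {
            n += cur[ny * COLS + nx];
          }
        }
      }
      const i = y * COLS + x;
      nxt[i] = n === 3 || (cur[i] === 1 && n === 2) ? 1 : 0;
    }
  }
}
```

Two workers could run `stepStrip(gen, 0, 3)` and `stepStrip(gen, 3, 5)` concurrently: both only read the current buffer and write disjoint rows of the next one, so no locking is needed within a step. What is still required is a barrier between steps, so that no worker starts generation gen + 1 while another is still reading generation gen.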
Pattern
- Each step swaps the roles: current = B; next = A (ping‑pong).
- No postMessage data payloads are necessary; messages only signal “step done”.

Performance
Requirements
To use SharedArrayBuffer in the browser, the page must be cross‑origin isolated, e.g. via:
- Cross-Origin-Opener-Policy: same-origin
- Cross-Origin-Embedder-Policy: require-corp

Instead of calling fillRect thousands of times, ImageData rendering builds a pixel buffer in memory and sends it to the canvas in a single putImageData call.
Idea
- Call ctx.createImageData(W, H) to get an ImageData object for the viewport size.
- Fill its Uint8ClampedArray with grayscale or RGB values based on cell state and zoom.
- Call ctx.putImageData(img, 0, 0) once per frame.

const img = ctx.createImageData(W, H);
const data = img.data;
for (let py = 0; py < H; py++) {
  // Map the screen row to a world row
  const wy = top + Math.floor(py * S);
  if (wy >= bottom) break;
  for (let px = 0; px < W; px++) {
    // Map the screen column to a world column
    const wx = left + Math.floor(px * S);
    if (wx >= right) break;
    const alive = filled[wy * COLS + wx];
    const c = alive ? 0 : 255; // live cells are black, dead cells white
    const i = (py * W + px) * 4;
    data[i] = c; // R
    data[i + 1] = c; // G
    data[i + 2] = c; // B
    data[i + 3] = 255; // A
  }
}
ctx.putImageData(img, 0, 0);
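A common variation on the fill loop, shown here as a sketch rather than as the demo's code, is to view the same pixel buffer through a `Uint32Array` and write each pixel with one 32-bit store instead of four byte stores. The function name `fillPixels32` is hypothetical, the viewport is assumed to lie entirely inside the world (no bounds checks), and the byte order assumes a little-endian machine, which covers mainstream desktop and mobile hardware.

```javascript
// Fill an RGBA pixel buffer from cell states, one 32-bit store per pixel.
// `data` is a Uint8ClampedArray such as ImageData.data.
function fillPixels32(data, W, H, filled, COLS, left, top, S) {
  const px32 = new Uint32Array(data.buffer, data.byteOffset, data.length / 4);
  const BLACK = 0xff000000; // little-endian bytes: R=0, G=0, B=0, A=255
  const WHITE = 0xffffffff; // R=255, G=255, B=255, A=255
  for (let py = 0; py < H; py++) {
    const rowOff = (top + Math.floor(py * S)) * COLS;
    for (let px = 0; px < W; px++) {
      const wx = left + Math.floor(px * S);
      px32[py * W + px] = filled[rowOff + wx] ? BLACK : WHITE;
    }
  }
}
```

In the browser this would be called as `fillPixels32(img.data, W, H, filled, COLS, left, top, S)` before `ctx.putImageData(img, 0, 0)`.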
Pros
Cons
- Copying the ImageData block from JS to the native context still costs several milliseconds.

The hybrid approach uses:
This combines:
In practice:
The GPU variant moves the entire Game of Life step onto the graphics card using WebGL2.
Key ideas
This means:
In measurements, the WebGL2 implementation handled millions of cells per frame in about 1 ms, significantly faster than even the SAB + ImageData combination.
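For orientation, here is a generic sketch of how a Life step is commonly written as a WebGL2 fragment shader, not necessarily the demo's exact code. The usual setup keeps the current generation in a texture, renders the next generation into a second texture attached to a framebuffer, and swaps the two each step. The constant name, the uniform `uState`, the wrap-around edges, and storing the cell state in the red channel are all assumptions.

```javascript
// GLSL ES 3.00 source; the #version line must be the very first line of the string.
const LIFE_STEP_FRAGMENT_SHADER = `#version 300 es
precision highp float;
precision highp int;
uniform sampler2D uState;   // current generation, alive = red channel > 0.5
out vec4 outColor;

void main() {
  ivec2 size = textureSize(uState, 0);
  ivec2 p = ivec2(gl_FragCoord.xy);
  int n = 0;
  for (int dy = -1; dy <= 1; dy++) {
    for (int dx = -1; dx <= 1; dx++) {
      if (dx == 0 && dy == 0) continue;
      ivec2 q = (p + ivec2(dx, dy) + size) % size;   // wrap at the edges
      n += texelFetch(uState, q, 0).r > 0.5 ? 1 : 0;
    }
  }
  bool alive = texelFetch(uState, p, 0).r > 0.5;
  bool nextAlive = n == 3 || (alive && n == 2);
  outColor = vec4(vec3(nextAlive ? 1.0 : 0.0), 1.0);
}`;
```

Every output pixel is one cell, so the GPU evaluates all cells in parallel and the world never has to be copied back to JavaScript for display.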
The table below roughly summarizes measured times and qualitative notes from the seminar (times depend on hardware but show relative ordering).
| Technique | Time (ms) | Notes |
|---|---|---|
| Canvas Full redraw | 100–300 | Very simple, but scales with world size |
| Dirty rectangles | 70–200 | Draws only changes, still off‑screen work |
| Visible region | 10–250 | Only visible area; depends on zoom |
| Web Worker – 1 thread | ~300 | Separates compute from UI |
| Web Worker – 4 threads | 500–1000 | Parallel compute, but costly buffer copies |
| SAB – 4 workers | 80–120 | Zero‑copy shared memory, smooth UI |
| ImageData | 10–100 | One big putImageData, good at large zoom |
| SAB + ImageData | 40–70 | Best CPU‑side implementation |
| WebGL2 GPU | ~1 | Millions of cells in real time |
Observation: performance systematically improves as more work is parallelized and moved closer to the GPU, especially when avoiding redundant drawing and memory transfers.
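Anyone who wants to reproduce this kind of comparison on their own hardware could start with a small timing helper like the one below. It is a sketch and not necessarily how the numbers in the table were obtained; `medianMs` is a hypothetical name.

```javascript
// Median wall-clock time of `fn` over `runs` calls, in milliseconds.
function medianMs(fn, runs = 20) {
  const samples = [];
  for (let r = 0; r < runs; r++) {
    const t0 = performance.now();
    fn();
    samples.push(performance.now() - t0);
  }
  samples.sort((a, b) => a - b);
  return samples[Math.floor(samples.length / 2)];
}
```

Note that this only captures time spent in JavaScript. Work that canvas or WebGL calls hand off to the GPU may finish later, so frame-level measurements are a useful cross-check.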
Efficient Game of Life rendering in the browser is less about the Life rules and more about data movement and drawing strategy. Starting from a naive full redraw, progressively introducing dirty regions, camera clipping, multi‑threaded computation, shared memory, and GPU offloading leads to orders‑of‑magnitude speedups for large grids.
For practical browser simulations, the SAB + ImageData approach provides an excellent balance between simplicity, performance, and debuggability on the CPU, while a WebGL2 implementation remains the ultimate choice if a GPU is available and slightly higher complexity is acceptable.
The complete source code for all demos is available on GitHub: